* [PATCH 5/7] hwrng: core: Move hwrng miscdev minor number to include/linux/miscdevice.h
From: Corentin Labbe @ 2016-12-09 14:21 UTC
To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
In-Reply-To: <1481293299-21697-1-git-send-email-clabbe.montjoie@gmail.com>
This patch moves the define for hwrng's miscdev minor number to
include/linux/miscdevice.h.
It is better to keep all minor numbers in the same place.
Rename it to HWRNG_MINOR (from RNG_MISCDEV_MINOR) in the process, since
no other miscdev define has MISCDEV in its name.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
drivers/char/hw_random/core.c | 3 +--
include/linux/miscdevice.h | 1 +
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 7a2e496..1e1e385 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -26,7 +26,6 @@
#define RNG_MODULE_NAME "hw_random"
#define PFX RNG_MODULE_NAME ": "
-#define RNG_MISCDEV_MINOR 183 /* official */
static struct hwrng *current_rng;
static struct task_struct *hwrng_fill;
@@ -283,7 +282,7 @@ static const struct file_operations rng_chrdev_ops = {
static const struct attribute_group *rng_dev_groups[];
static struct miscdevice rng_miscdev = {
- .minor = RNG_MISCDEV_MINOR,
+ .minor = HWRNG_MINOR,
.name = RNG_MODULE_NAME,
.nodename = "hwrng",
.fops = &rng_chrdev_ops,
diff --git a/include/linux/miscdevice.h b/include/linux/miscdevice.h
index 722698a..659f586 100644
--- a/include/linux/miscdevice.h
+++ b/include/linux/miscdevice.h
@@ -31,6 +31,7 @@
#define SGI_MMTIMER 153
#define STORE_QUEUE_MINOR 155 /* unused */
#define I2O_MINOR 166
+#define HWRNG_MINOR 183
#define MICROCODE_MINOR 184
#define VFIO_MINOR 196
#define TUN_MINOR 200
--
2.7.3
* [PATCH 6/7] hwrng: core: remove unused PFX macro
From: Corentin Labbe @ 2016-12-09 14:21 UTC
To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
In-Reply-To: <1481293299-21697-1-git-send-email-clabbe.montjoie@gmail.com>
This patch removes the unused PFX macro.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
drivers/char/hw_random/core.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 1e1e385..5c654b5 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -25,7 +25,6 @@
#include <linux/uaccess.h>
#define RNG_MODULE_NAME "hw_random"
-#define PFX RNG_MODULE_NAME ": "
static struct hwrng *current_rng;
static struct task_struct *hwrng_fill;
--
2.7.3
* [PATCH 7/7] hwrng: core: Remove two unused include
From: Corentin Labbe @ 2016-12-09 14:21 UTC
To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
In-Reply-To: <1481293299-21697-1-git-send-email-clabbe.montjoie@gmail.com>
linux/fs.h and linux/sched.h are not needed by hw_random/core.c.
This patch removes them.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
drivers/char/hw_random/core.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 5c654b5..85c9ab3 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -13,14 +13,12 @@
#include <linux/delay.h>
#include <linux/device.h>
#include <linux/err.h>
-#include <linux/fs.h>
#include <linux/hw_random.h>
#include <linux/kernel.h>
#include <linux/kthread.h>
#include <linux/miscdevice.h>
#include <linux/module.h>
#include <linux/random.h>
-#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/uaccess.h>
--
2.7.3
* [PATCH 1/7] hwrng: core: do not use multiple blank lines
From: Corentin Labbe @ 2016-12-09 14:21 UTC
To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
This patch fixes the checkpatch warning "Please don't use multiple blank lines".
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
drivers/char/hw_random/core.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index d2d2c89..00cbb81 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -30,7 +30,6 @@
*/
-
#include <linux/device.h>
#include <linux/hw_random.h>
#include <linux/module.h>
@@ -45,12 +44,10 @@
#include <linux/err.h>
#include <asm/uaccess.h>
-
#define RNG_MODULE_NAME "hw_random"
#define PFX RNG_MODULE_NAME ": "
#define RNG_MISCDEV_MINOR 183 /* official */
-
static struct hwrng *current_rng;
static struct task_struct *hwrng_fill;
static LIST_HEAD(rng_list);
@@ -296,7 +293,6 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
goto out;
}
-
static const struct file_operations rng_chrdev_ops = {
.owner = THIS_MODULE,
.open = rng_dev_open,
@@ -314,7 +310,6 @@ static struct miscdevice rng_miscdev = {
.groups = rng_dev_groups,
};
-
static ssize_t hwrng_attr_current_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t len)
--
2.7.3
* [PATCH 2/7] hwrng: core: rewrite better comparison to NULL
From: Corentin Labbe @ 2016-12-09 14:21 UTC
To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
In-Reply-To: <1481293299-21697-1-git-send-email-clabbe.montjoie@gmail.com>
This patch fixes the checkpatch warning 'Comparison to NULL could be written "!ptr"'.
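As an illustration only (not part of the patch), the userspace sketch below
checks that the two spellings of the condition agree for every NULL/non-NULL
combination of the three pointers. struct rng_sketch and all helper names are
hypothetical stand-ins for the struct hwrng fields read in hwrng_register().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the three struct hwrng fields the check reads. */
struct rng_sketch {
	const char *name;
	int (*data_read)(void);
	int (*read)(void);
};

/* Form flagged by checkpatch: explicit comparisons against NULL. */
static bool invalid_explicit(const struct rng_sketch *rng)
{
	assert(rng != NULL);
	return rng->name == NULL ||
	       (rng->data_read == NULL && rng->read == NULL);
}

/* Form used after the patch: logical negation of the pointers. */
static bool invalid_negated(const struct rng_sketch *rng)
{
	assert(rng != NULL);
	return !rng->name || (!rng->data_read && !rng->read);
}

static int dummy_read(void)
{
	return 0;
}

/* Count the NULL/non-NULL combinations on which the two forms disagree. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int mask = 0; mask < 8; mask++) {
		struct rng_sketch rng = {
			.name = (mask & 1) ? "sketch" : NULL,
			.data_read = (mask & 2) ? dummy_read : NULL,
			.read = (mask & 4) ? dummy_read : NULL,
		};

		if (invalid_explicit(&rng) != invalid_negated(&rng))
			mismatches++;
	}
	return mismatches;
}
```

In C, !ptr is defined as (ptr == 0), so count_mismatches() is expected to
return 0; the change is purely stylistic.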
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
drivers/char/hw_random/core.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 00cbb81..7029246 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -439,8 +439,7 @@ int hwrng_register(struct hwrng *rng)
int err = -EINVAL;
struct hwrng *old_rng, *tmp;
- if (rng->name == NULL ||
- (rng->data_read == NULL && rng->read == NULL))
+ if (!rng->name || (!rng->data_read && !rng->read))
goto out;
mutex_lock(&rng_mutex);
--
2.7.3
* [PATCH 4/7] hwrng: core: Replace asm/uaccess.h by linux/uaccess.h
From: Corentin Labbe @ 2016-12-09 14:21 UTC
To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
In-Reply-To: <1481293299-21697-1-git-send-email-clabbe.montjoie@gmail.com>
This patch fixes the checkpatch warning about asm/uaccess.h.
At the same time, sort the headers in alphabetical order.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
drivers/char/hw_random/core.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index a8e63ae..7a2e496 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -10,19 +10,19 @@
* of the GNU General Public License, incorporated herein by reference.
*/
+#include <linux/delay.h>
#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/fs.h>
#include <linux/hw_random.h>
-#include <linux/module.h>
#include <linux/kernel.h>
-#include <linux/fs.h>
-#include <linux/sched.h>
-#include <linux/miscdevice.h>
#include <linux/kthread.h>
-#include <linux/delay.h>
-#include <linux/slab.h>
+#include <linux/miscdevice.h>
+#include <linux/module.h>
#include <linux/random.h>
-#include <linux/err.h>
-#include <asm/uaccess.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
#define RNG_MODULE_NAME "hw_random"
#define PFX RNG_MODULE_NAME ": "
--
2.7.3
* [PATCH v2 0/3] crypto: arm64/ARM: NEON accelerated ChaCha20 *skcipher*
From: Ard Biesheuvel @ 2016-12-09 14:33 UTC
To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
Another port of existing x86 SSE code to NEON, again both for arm64 and ARM.
ChaCha20 is a stream cipher described in RFC 7539, intended as an
efficient 'standby cipher' that is easy to implement in software, for
cases where AES cannot be used.
This NEON implementation is almost 2x as fast as the generic C code
(measured on Cortex-A57 using the arm64 version).
Changes in v2:
- add a patch that first converts the generic and x86 versions to skciphers
- tweaked the arm64 version for some additional performance
- use chunksize == 4x blocksize for optimal speed
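To illustrate the last point: the arm64 glue code in patch 2/3 hands data to
a 4-block routine first, then to a 1-block routine, and finally handles a
partial tail. A rough userspace sketch of that split follows. The names
split_work and struct work_split are hypothetical, and the 64-byte block size
is the ChaCha20 block size.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 64u	/* ChaCha20 block size in bytes */

struct work_split {
	size_t four_block_calls;	/* calls covering 4 blocks (256 bytes) each */
	size_t single_block_calls;	/* calls covering 1 block (64 bytes) each */
	size_t tail_bytes;		/* final partial block, 0..63 bytes */
};

/* Split a request the way the 4-block / 1-block / tail loops would. */
static struct work_split split_work(size_t bytes)
{
	const size_t total = bytes;
	struct work_split w;

	w.four_block_calls = bytes / (4 * BLOCK_SIZE);
	bytes %= 4 * BLOCK_SIZE;
	w.single_block_calls = bytes / BLOCK_SIZE;
	w.tail_bytes = bytes % BLOCK_SIZE;

	/* The three pieces must add up to the original length. */
	assert(w.four_block_calls * 4 * BLOCK_SIZE +
	       w.single_block_calls * BLOCK_SIZE + w.tail_bytes == total);
	return w;
}
```

Under this model, a walk step that is a multiple of 256 bytes only ever uses
the 4-block routine, so with a chunk size of 4 blocks only the final step of a
request can fall through to the 1-block and tail paths.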
Ard Biesheuvel (3):
crypto: chacha20 - convert generic and x86 versions to skcipher
crypto: arm64/chacha20 - implement NEON version based on SSE3 code
crypto: arm/chacha20 - implement NEON version based on SSE3 code
arch/arm/crypto/Kconfig | 6 +
arch/arm/crypto/Makefile | 2 +
arch/arm/crypto/chacha20-neon-core.S | 524 ++++++++++++++++++++
arch/arm/crypto/chacha20-neon-glue.c | 127 +++++
arch/arm64/crypto/Kconfig | 6 +
arch/arm64/crypto/Makefile | 3 +
arch/arm64/crypto/chacha20-neon-core.S | 450 +++++++++++++++++
arch/arm64/crypto/chacha20-neon-glue.c | 122 +++++
arch/x86/crypto/chacha20_glue.c | 69 ++-
crypto/chacha20_generic.c | 73 ++-
include/crypto/chacha20.h | 6 +-
11 files changed, 1304 insertions(+), 84 deletions(-)
create mode 100644 arch/arm/crypto/chacha20-neon-core.S
create mode 100644 arch/arm/crypto/chacha20-neon-glue.c
create mode 100644 arch/arm64/crypto/chacha20-neon-core.S
create mode 100644 arch/arm64/crypto/chacha20-neon-glue.c
--
2.7.4
* [PATCH v2 1/3] crypto: chacha20 - convert generic and x86 versions to skcipher
From: Ard Biesheuvel @ 2016-12-09 14:33 UTC
To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1481294033-23508-1-git-send-email-ard.biesheuvel@linaro.org>
This converts the ChaCha20 code from a blkcipher to a skcipher, which
is now the preferred way to implement symmetric block and stream ciphers.
This ports the generic and x86 versions at the same time because the
latter reuses routines of the former.
Note that the skcipher_walk() API guarantees that every step it presents,
except the final one, is a multiple of the chunk size in length, so we can
simplify the encrypt() routine somewhat.
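Why the guarantee matters can be shown with a toy userspace model (a sketch,
not the kernel API). It uses a made-up counter-based keystream rather than
ChaCha20, and the names toy_crypt, toy_walk and walk_matches_oneshot are
hypothetical. As in the simplified loop, each call consumes one counter value
per block, including a partial one, so a non-final step that ends mid-block
would desynchronise the keystream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 64u

/* Toy keystream: NOT ChaCha20, just a counter-dependent byte pattern. */
static uint8_t toy_keystream_byte(uint32_t counter, size_t i)
{
	return (uint8_t)(counter * 131u + i * 7u + 1u);
}

/* XOR len bytes, consuming one counter value per (possibly partial) block. */
static void toy_crypt(uint32_t *counter, uint8_t *dst, const uint8_t *src,
		      size_t len)
{
	while (len > 0) {
		size_t n = len < BLOCK ? len : BLOCK;

		for (size_t i = 0; i < n; i++)
			dst[i] = src[i] ^ toy_keystream_byte(*counter, i);
		(*counter)++;
		dst += n;
		src += n;
		len -= n;
	}
}

/* Process the buffer in steps of `step` bytes, standing in for the walk. */
static void toy_walk(uint8_t *dst, const uint8_t *src, size_t len, size_t step)
{
	uint32_t counter = 0;

	assert(step > 0);
	while (len > 0) {
		size_t n = len < step ? len : step;

		toy_crypt(&counter, dst, src, n);
		dst += n;
		src += n;
		len -= n;
	}
}

/* Return 1 if walking in `step`-byte pieces matches a single one-shot pass. */
static int walk_matches_oneshot(size_t len, size_t step)
{
	uint8_t src[512], oneshot[512], walked[512];

	assert(len <= sizeof(src));
	for (size_t i = 0; i < len; i++)
		src[i] = (uint8_t)i;
	toy_walk(oneshot, src, len, len ? len : 1);
	toy_walk(walked, src, len, step);
	return memcmp(oneshot, walked, len) == 0;
}
```

In this model, steps of 64 or 128 bytes reproduce the one-shot result for a
300-byte buffer, while steps of 100 bytes do not, which is why the loop can
only drop its rounddown()/remainder handling when the walk makes this promise.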
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/x86/crypto/chacha20_glue.c | 69 +++++++++---------
crypto/chacha20_generic.c | 73 ++++++++------------
include/crypto/chacha20.h | 6 +-
3 files changed, 64 insertions(+), 84 deletions(-)
diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c
index f910d1d449f0..78f75b07dc25 100644
--- a/arch/x86/crypto/chacha20_glue.c
+++ b/arch/x86/crypto/chacha20_glue.c
@@ -11,7 +11,7 @@
#include <crypto/algapi.h>
#include <crypto/chacha20.h>
-#include <linux/crypto.h>
+#include <crypto/internal/skcipher.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <asm/fpu/api.h>
@@ -63,36 +63,34 @@ static void chacha20_dosimd(u32 *state, u8 *dst, const u8 *src,
}
}
-static int chacha20_simd(struct blkcipher_desc *desc, struct scatterlist *dst,
- struct scatterlist *src, unsigned int nbytes)
+static int chacha20_simd(struct skcipher_request *req)
{
- u32 *state, state_buf[16 + (CHACHA20_STATE_ALIGN / sizeof(u32)) - 1];
- struct blkcipher_walk walk;
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+ u32 state[16] __aligned(CHACHA20_STATE_ALIGN);
+ struct skcipher_walk walk;
int err;
- if (nbytes <= CHACHA20_BLOCK_SIZE || !may_use_simd())
- return crypto_chacha20_crypt(desc, dst, src, nbytes);
+ if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
+ return crypto_chacha20_crypt(req);
- state = (u32 *)roundup((uintptr_t)state_buf, CHACHA20_STATE_ALIGN);
+ err = skcipher_walk_virt(&walk, req, true);
- blkcipher_walk_init(&walk, dst, src, nbytes);
- err = blkcipher_walk_virt_block(desc, &walk, CHACHA20_BLOCK_SIZE);
-
- crypto_chacha20_init(state, crypto_blkcipher_ctx(desc->tfm), walk.iv);
+ crypto_chacha20_init(state, ctx, walk.iv);
kernel_fpu_begin();
while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE));
- err = blkcipher_walk_done(desc, &walk,
- walk.nbytes % CHACHA20_BLOCK_SIZE);
+ err = skcipher_walk_done(&walk,
+ walk.nbytes % CHACHA20_BLOCK_SIZE);
}
if (walk.nbytes) {
chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
walk.nbytes);
- err = blkcipher_walk_done(desc, &walk, 0);
+ err = skcipher_walk_done(&walk, 0);
}
kernel_fpu_end();
@@ -100,27 +98,22 @@ static int chacha20_simd(struct blkcipher_desc *desc, struct scatterlist *dst,
return err;
}
-static struct crypto_alg alg = {
- .cra_name = "chacha20",
- .cra_driver_name = "chacha20-simd",
- .cra_priority = 300,
- .cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
- .cra_blocksize = 1,
- .cra_type = &crypto_blkcipher_type,
- .cra_ctxsize = sizeof(struct chacha20_ctx),
- .cra_alignmask = sizeof(u32) - 1,
- .cra_module = THIS_MODULE,
- .cra_u = {
- .blkcipher = {
- .min_keysize = CHACHA20_KEY_SIZE,
- .max_keysize = CHACHA20_KEY_SIZE,
- .ivsize = CHACHA20_IV_SIZE,
- .geniv = "seqiv",
- .setkey = crypto_chacha20_setkey,
- .encrypt = chacha20_simd,
- .decrypt = chacha20_simd,
- },
- },
+static struct skcipher_alg alg = {
+ .base.cra_name = "chacha20",
+ .base.cra_driver_name = "chacha20-simd",
+ .base.cra_priority = 300,
+ .base.cra_blocksize = 1,
+ .base.cra_ctxsize = sizeof(struct chacha20_ctx),
+ .base.cra_alignmask = sizeof(u32) - 1,
+ .base.cra_module = THIS_MODULE,
+
+ .min_keysize = CHACHA20_KEY_SIZE,
+ .max_keysize = CHACHA20_KEY_SIZE,
+ .ivsize = CHACHA20_IV_SIZE,
+ .chunksize = CHACHA20_BLOCK_SIZE,
+ .setkey = crypto_chacha20_setkey,
+ .encrypt = chacha20_simd,
+ .decrypt = chacha20_simd,
};
static int __init chacha20_simd_mod_init(void)
@@ -133,12 +126,12 @@ static int __init chacha20_simd_mod_init(void)
boot_cpu_has(X86_FEATURE_AVX2) &&
cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL);
#endif
- return crypto_register_alg(&alg);
+ return crypto_register_skcipher(&alg);
}
static void __exit chacha20_simd_mod_fini(void)
{
- crypto_unregister_alg(&alg);
+ crypto_unregister_skcipher(&alg);
}
module_init(chacha20_simd_mod_init);
diff --git a/crypto/chacha20_generic.c b/crypto/chacha20_generic.c
index 1cab83146e33..8b3c04d625c3 100644
--- a/crypto/chacha20_generic.c
+++ b/crypto/chacha20_generic.c
@@ -10,10 +10,9 @@
*/
#include <crypto/algapi.h>
-#include <linux/crypto.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
#include <crypto/chacha20.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/module.h>
static inline u32 le32_to_cpuvp(const void *p)
{
@@ -63,10 +62,10 @@ void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv)
}
EXPORT_SYMBOL_GPL(crypto_chacha20_init);
-int crypto_chacha20_setkey(struct crypto_tfm *tfm, const u8 *key,
+int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keysize)
{
- struct chacha20_ctx *ctx = crypto_tfm_ctx(tfm);
+ struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
int i;
if (keysize != CHACHA20_KEY_SIZE)
@@ -79,66 +78,54 @@ int crypto_chacha20_setkey(struct crypto_tfm *tfm, const u8 *key,
}
EXPORT_SYMBOL_GPL(crypto_chacha20_setkey);
-int crypto_chacha20_crypt(struct blkcipher_desc *desc, struct scatterlist *dst,
- struct scatterlist *src, unsigned int nbytes)
+int crypto_chacha20_crypt(struct skcipher_request *req)
{
- struct blkcipher_walk walk;
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
u32 state[16];
int err;
- blkcipher_walk_init(&walk, dst, src, nbytes);
- err = blkcipher_walk_virt_block(desc, &walk, CHACHA20_BLOCK_SIZE);
-
- crypto_chacha20_init(state, crypto_blkcipher_ctx(desc->tfm), walk.iv);
+ err = skcipher_walk_virt(&walk, req, true);
- while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
- chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
- rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE));
- err = blkcipher_walk_done(desc, &walk,
- walk.nbytes % CHACHA20_BLOCK_SIZE);
- }
+ crypto_chacha20_init(state, ctx, walk.iv);
- if (walk.nbytes) {
+ while (walk.nbytes > 0) {
chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
walk.nbytes);
- err = blkcipher_walk_done(desc, &walk, 0);
+ err = skcipher_walk_done(&walk, 0);
}
return err;
}
EXPORT_SYMBOL_GPL(crypto_chacha20_crypt);
-static struct crypto_alg alg = {
- .cra_name = "chacha20",
- .cra_driver_name = "chacha20-generic",
- .cra_priority = 100,
- .cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
- .cra_blocksize = 1,
- .cra_type = &crypto_blkcipher_type,
- .cra_ctxsize = sizeof(struct chacha20_ctx),
- .cra_alignmask = sizeof(u32) - 1,
- .cra_module = THIS_MODULE,
- .cra_u = {
- .blkcipher = {
- .min_keysize = CHACHA20_KEY_SIZE,
- .max_keysize = CHACHA20_KEY_SIZE,
- .ivsize = CHACHA20_IV_SIZE,
- .geniv = "seqiv",
- .setkey = crypto_chacha20_setkey,
- .encrypt = crypto_chacha20_crypt,
- .decrypt = crypto_chacha20_crypt,
- },
- },
+static struct skcipher_alg alg = {
+ .base.cra_name = "chacha20",
+ .base.cra_driver_name = "chacha20-generic",
+ .base.cra_priority = 100,
+ .base.cra_blocksize = 1,
+ .base.cra_ctxsize = sizeof(struct chacha20_ctx),
+ .base.cra_alignmask = sizeof(u32) - 1,
+ .base.cra_module = THIS_MODULE,
+
+ .min_keysize = CHACHA20_KEY_SIZE,
+ .max_keysize = CHACHA20_KEY_SIZE,
+ .ivsize = CHACHA20_IV_SIZE,
+ .chunksize = CHACHA20_BLOCK_SIZE,
+ .setkey = crypto_chacha20_setkey,
+ .encrypt = crypto_chacha20_crypt,
+ .decrypt = crypto_chacha20_crypt,
};
static int __init chacha20_generic_mod_init(void)
{
- return crypto_register_alg(&alg);
+ return crypto_register_skcipher(&alg);
}
static void __exit chacha20_generic_mod_fini(void)
{
- crypto_unregister_alg(&alg);
+ crypto_unregister_skcipher(&alg);
}
module_init(chacha20_generic_mod_init);
diff --git a/include/crypto/chacha20.h b/include/crypto/chacha20.h
index 20d20f681a72..445fc45f4b5b 100644
--- a/include/crypto/chacha20.h
+++ b/include/crypto/chacha20.h
@@ -5,6 +5,7 @@
#ifndef _CRYPTO_CHACHA20_H
#define _CRYPTO_CHACHA20_H
+#include <crypto/skcipher.h>
#include <linux/types.h>
#include <linux/crypto.h>
@@ -18,9 +19,8 @@ struct chacha20_ctx {
void chacha20_block(u32 *state, void *stream);
void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv);
-int crypto_chacha20_setkey(struct crypto_tfm *tfm, const u8 *key,
+int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keysize);
-int crypto_chacha20_crypt(struct blkcipher_desc *desc, struct scatterlist *dst,
- struct scatterlist *src, unsigned int nbytes);
+int crypto_chacha20_crypt(struct skcipher_request *req);
#endif
--
2.7.4
* [PATCH v2 2/3] crypto: arm64/chacha20 - implement NEON version based on SSE3 code
From: Ard Biesheuvel @ 2016-12-09 14:33 UTC
To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1481294033-23508-1-git-send-email-ard.biesheuvel@linaro.org>
This is a straight port to arm64/NEON of the x86 SSSE3 implementation
of the ChaCha20 stream cipher.
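For readers following the assembly, here is a scalar C sketch of one ChaCha20
quarter round in which each rotation is written the way the NEON code appears
to realise it: rotate by 16 as a halfword swap (rev32 on .8h), rotate by 8 as
a byte permutation (tbl with the ROT8 indices), and rotate by 12 and 7 as a
shift left followed by a shift-right-insert (shl + sri). This is an
illustrative model with hypothetical function names, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* rotl32 by 16: swap the two 16-bit halves of the word. */
static uint32_t rotl16_swap(uint32_t x)
{
	uint32_t lo = x & 0xffffu;
	uint32_t hi = x >> 16;

	return (lo << 16) | hi;
}

/*
 * rotl32 by 8 as a byte permutation. Result byte k (k = 0 is the least
 * significant) takes source byte idx[k]; { 3, 0, 1, 2 } is the first ROT8
 * word 0x02010003 read in little-endian byte order.
 */
static uint32_t rotl8_tbl(uint32_t x)
{
	static const unsigned int idx[4] = { 3, 0, 1, 2 };
	uint32_t out = 0;

	for (unsigned int k = 0; k < 4; k++) {
		uint32_t byte = (x >> (8 * idx[k])) & 0xffu;

		out |= byte << (8 * k);
	}
	return out;
}

/* rotl32 by n: shift left, then insert x >> (32 - n) into the low n bits. */
static uint32_t rotl_shl_sri(uint32_t x, unsigned int n)
{
	uint32_t low_mask;
	uint32_t r;

	assert(n > 0 && n < 32);
	low_mask = UINT32_C(0xffffffff) >> (32 - n);
	r = x << n;					/* shl */
	return (r & ~low_mask) | (x >> (32 - n));	/* sri keeps the top bits */
}

/* One ChaCha20 quarter round on four 32-bit words. */
static void quarter_round(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
	*a += *b; *d = rotl16_swap(*d ^ *a);
	*c += *d; *b = rotl_shl_sri(*b ^ *c, 12);
	*a += *b; *d = rotl8_tbl(*d ^ *a);
	*c += *d; *b = rotl_shl_sri(*b ^ *c, 7);
}
```

Feeding the quarter-round test vector from RFC 7539, section 2.1.1
(a = 0x11111111, b = 0x01020304, c = 0x9b8d6f43, d = 0x01234567) through
quarter_round() should give a = 0xea2a92f4, b = 0xcb1cf8ce, c = 0x4581472e,
d = 0x5881c4bb.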
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/crypto/Kconfig | 6 +
arch/arm64/crypto/Makefile | 3 +
arch/arm64/crypto/chacha20-neon-core.S | 450 ++++++++++++++++++++
arch/arm64/crypto/chacha20-neon-glue.c | 122 ++++++
4 files changed, 581 insertions(+)
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 450a85df041a..0bf0f531f539 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -72,4 +72,10 @@ config CRYPTO_CRC32_ARM64
depends on ARM64
select CRYPTO_HASH
+config CRYPTO_CHACHA20_NEON
+ tristate "NEON accelerated ChaCha20 symmetric cipher"
+ depends on KERNEL_MODE_NEON
+ select CRYPTO_BLKCIPHER
+ select CRYPTO_CHACHA20
+
endif
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index aa8888d7b744..9d2826c5fccf 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -41,6 +41,9 @@ sha256-arm64-y := sha256-glue.o sha256-core.o
obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o
sha512-arm64-y := sha512-glue.o sha512-core.o
+obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
+chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
+
AFLAGS_aes-ce.o := -DINTERLEAVE=4
AFLAGS_aes-neon.o := -DINTERLEAVE=4
diff --git a/arch/arm64/crypto/chacha20-neon-core.S b/arch/arm64/crypto/chacha20-neon-core.S
new file mode 100644
index 000000000000..3cbbf23dc4d2
--- /dev/null
+++ b/arch/arm64/crypto/chacha20-neon-core.S
@@ -0,0 +1,450 @@
+/*
+ * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions
+ *
+ * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Based on:
+ * ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSSE3 functions
+ *
+ * Copyright (C) 2015 Martin Willi
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/linkage.h>
+
+ .text
+ .align 6
+
+ENTRY(chacha20_block_xor_neon)
+ // x0: Input state matrix, s
+ // x1: 1 data block output, o
+ // x2: 1 data block input, i
+
+ //
+ // This function encrypts one ChaCha20 block by loading the state matrix
+ // in four NEON registers. It performs matrix operation on four words in
+ // parallel, but requires shuffling to rearrange the words after each
+ // round.
+ //
+
+ // x0..3 = s0..3
+ adr x3, ROT8
+ ld1 {v0.4s-v3.4s}, [x0]
+ ld1 {v8.4s-v11.4s}, [x0]
+ ld1 {v12.4s}, [x3]
+
+ mov x3, #10
+
+.Ldoubleround:
+ // x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+ add v0.4s, v0.4s, v1.4s
+ eor v3.16b, v3.16b, v0.16b
+ rev32 v3.8h, v3.8h
+
+ // x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+ add v2.4s, v2.4s, v3.4s
+ eor v4.16b, v1.16b, v2.16b
+ shl v1.4s, v4.4s, #12
+ sri v1.4s, v4.4s, #20
+
+ // x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+ add v0.4s, v0.4s, v1.4s
+ eor v3.16b, v3.16b, v0.16b
+ tbl v3.16b, {v3.16b}, v12.16b
+
+ // x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+ add v2.4s, v2.4s, v3.4s
+ eor v4.16b, v1.16b, v2.16b
+ shl v1.4s, v4.4s, #7
+ sri v1.4s, v4.4s, #25
+
+ // x1 = shuffle32(x1, MASK(0, 3, 2, 1))
+ ext v1.16b, v1.16b, v1.16b, #4
+ // x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+ ext v2.16b, v2.16b, v2.16b, #8
+ // x3 = shuffle32(x3, MASK(2, 1, 0, 3))
+ ext v3.16b, v3.16b, v3.16b, #12
+
+ // x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+ add v0.4s, v0.4s, v1.4s
+ eor v3.16b, v3.16b, v0.16b
+ rev32 v3.8h, v3.8h
+
+ // x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+ add v2.4s, v2.4s, v3.4s
+ eor v4.16b, v1.16b, v2.16b
+ shl v1.4s, v4.4s, #12
+ sri v1.4s, v4.4s, #20
+
+ // x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+ add v0.4s, v0.4s, v1.4s
+ eor v3.16b, v3.16b, v0.16b
+ tbl v3.16b, {v3.16b}, v12.16b
+
+ // x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+ add v2.4s, v2.4s, v3.4s
+ eor v4.16b, v1.16b, v2.16b
+ shl v1.4s, v4.4s, #7
+ sri v1.4s, v4.4s, #25
+
+ // x1 = shuffle32(x1, MASK(2, 1, 0, 3))
+ ext v1.16b, v1.16b, v1.16b, #12
+ // x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+ ext v2.16b, v2.16b, v2.16b, #8
+ // x3 = shuffle32(x3, MASK(0, 3, 2, 1))
+ ext v3.16b, v3.16b, v3.16b, #4
+
+ subs x3, x3, #1
+ b.ne .Ldoubleround
+
+ ld1 {v4.16b-v7.16b}, [x2]
+
+ // o0 = i0 ^ (x0 + s0)
+ add v0.4s, v0.4s, v8.4s
+ eor v0.16b, v0.16b, v4.16b
+
+ // o1 = i1 ^ (x1 + s1)
+ add v1.4s, v1.4s, v9.4s
+ eor v1.16b, v1.16b, v5.16b
+
+ // o2 = i2 ^ (x2 + s2)
+ add v2.4s, v2.4s, v10.4s
+ eor v2.16b, v2.16b, v6.16b
+
+ // o3 = i3 ^ (x3 + s3)
+ add v3.4s, v3.4s, v11.4s
+ eor v3.16b, v3.16b, v7.16b
+
+ st1 {v0.16b-v3.16b}, [x1]
+
+ ret
+ENDPROC(chacha20_block_xor_neon)
+
+ .align 6
+ENTRY(chacha20_4block_xor_neon)
+ // x0: Input state matrix, s
+ // x1: 4 data blocks output, o
+ // x2: 4 data blocks input, i
+
+ //
+ // This function encrypts four consecutive ChaCha20 blocks by loading
+ // the state matrix in NEON registers four times. The algorithm performs
+ // each operation on the corresponding word of each state matrix, hence
+ // requires no word shuffling. For final XORing step we transpose the
+ // matrix by interleaving 32- and then 64-bit words, which allows us to
+ // do XOR in NEON registers.
+ //
+ adr x3, CTRINC // ... and ROT8
+ ld1 {v30.4s-v31.4s}, [x3]
+
+ // x0..15[0-3] = s0..3[0..3]
+ mov x4, x0
+ ld4r { v0.4s- v3.4s}, [x4], #16
+ ld4r { v4.4s- v7.4s}, [x4], #16
+ ld4r { v8.4s-v11.4s}, [x4], #16
+ ld4r {v12.4s-v15.4s}, [x4]
+
+ // x12 += counter values 0-3
+ add v12.4s, v12.4s, v30.4s
+
+ mov x3, #10
+
+.Ldoubleround4:
+ // x0 += x4, x12 = rotl32(x12 ^ x0, 16)
+ // x1 += x5, x13 = rotl32(x13 ^ x1, 16)
+ // x2 += x6, x14 = rotl32(x14 ^ x2, 16)
+ // x3 += x7, x15 = rotl32(x15 ^ x3, 16)
+ add v0.4s, v0.4s, v4.4s
+ add v1.4s, v1.4s, v5.4s
+ add v2.4s, v2.4s, v6.4s
+ add v3.4s, v3.4s, v7.4s
+
+ eor v12.16b, v12.16b, v0.16b
+ eor v13.16b, v13.16b, v1.16b
+ eor v14.16b, v14.16b, v2.16b
+ eor v15.16b, v15.16b, v3.16b
+
+ rev32 v12.8h, v12.8h
+ rev32 v13.8h, v13.8h
+ rev32 v14.8h, v14.8h
+ rev32 v15.8h, v15.8h
+
+ // x8 += x12, x4 = rotl32(x4 ^ x8, 12)
+ // x9 += x13, x5 = rotl32(x5 ^ x9, 12)
+ // x10 += x14, x6 = rotl32(x6 ^ x10, 12)
+ // x11 += x15, x7 = rotl32(x7 ^ x11, 12)
+ add v8.4s, v8.4s, v12.4s
+ add v9.4s, v9.4s, v13.4s
+ add v10.4s, v10.4s, v14.4s
+ add v11.4s, v11.4s, v15.4s
+
+ eor v18.16b, v4.16b, v8.16b
+ eor v19.16b, v5.16b, v9.16b
+ eor v20.16b, v6.16b, v10.16b
+ eor v21.16b, v7.16b, v11.16b
+
+ shl v4.4s, v18.4s, #12
+ shl v5.4s, v19.4s, #12
+ shl v6.4s, v20.4s, #12
+ shl v7.4s, v21.4s, #12
+
+ sri v4.4s, v18.4s, #20
+ sri v5.4s, v19.4s, #20
+ sri v6.4s, v20.4s, #20
+ sri v7.4s, v21.4s, #20
+
+ // x0 += x4, x12 = rotl32(x12 ^ x0, 8)
+ // x1 += x5, x13 = rotl32(x13 ^ x1, 8)
+ // x2 += x6, x14 = rotl32(x14 ^ x2, 8)
+ // x3 += x7, x15 = rotl32(x15 ^ x3, 8)
+ add v0.4s, v0.4s, v4.4s
+ add v1.4s, v1.4s, v5.4s
+ add v2.4s, v2.4s, v6.4s
+ add v3.4s, v3.4s, v7.4s
+
+ eor v12.16b, v12.16b, v0.16b
+ eor v13.16b, v13.16b, v1.16b
+ eor v14.16b, v14.16b, v2.16b
+ eor v15.16b, v15.16b, v3.16b
+
+ tbl v12.16b, {v12.16b}, v31.16b
+ tbl v13.16b, {v13.16b}, v31.16b
+ tbl v14.16b, {v14.16b}, v31.16b
+ tbl v15.16b, {v15.16b}, v31.16b
+
+ // x8 += x12, x4 = rotl32(x4 ^ x8, 7)
+ // x9 += x13, x5 = rotl32(x5 ^ x9, 7)
+ // x10 += x14, x6 = rotl32(x6 ^ x10, 7)
+ // x11 += x15, x7 = rotl32(x7 ^ x11, 7)
+ add v8.4s, v8.4s, v12.4s
+ add v9.4s, v9.4s, v13.4s
+ add v10.4s, v10.4s, v14.4s
+ add v11.4s, v11.4s, v15.4s
+
+ eor v18.16b, v4.16b, v8.16b
+ eor v19.16b, v5.16b, v9.16b
+ eor v20.16b, v6.16b, v10.16b
+ eor v21.16b, v7.16b, v11.16b
+
+ shl v4.4s, v18.4s, #7
+ shl v5.4s, v19.4s, #7
+ shl v6.4s, v20.4s, #7
+ shl v7.4s, v21.4s, #7
+
+ sri v4.4s, v18.4s, #25
+ sri v5.4s, v19.4s, #25
+ sri v6.4s, v20.4s, #25
+ sri v7.4s, v21.4s, #25
+
+ // x0 += x5, x15 = rotl32(x15 ^ x0, 16)
+ // x1 += x6, x12 = rotl32(x12 ^ x1, 16)
+ // x2 += x7, x13 = rotl32(x13 ^ x2, 16)
+ // x3 += x4, x14 = rotl32(x14 ^ x3, 16)
+ add v0.4s, v0.4s, v5.4s
+ add v1.4s, v1.4s, v6.4s
+ add v2.4s, v2.4s, v7.4s
+ add v3.4s, v3.4s, v4.4s
+
+ eor v15.16b, v15.16b, v0.16b
+ eor v12.16b, v12.16b, v1.16b
+ eor v13.16b, v13.16b, v2.16b
+ eor v14.16b, v14.16b, v3.16b
+
+ rev32 v15.8h, v15.8h
+ rev32 v12.8h, v12.8h
+ rev32 v13.8h, v13.8h
+ rev32 v14.8h, v14.8h
+
+ // x10 += x15, x5 = rotl32(x5 ^ x10, 12)
+ // x11 += x12, x6 = rotl32(x6 ^ x11, 12)
+ // x8 += x13, x7 = rotl32(x7 ^ x8, 12)
+ // x9 += x14, x4 = rotl32(x4 ^ x9, 12)
+ add v10.4s, v10.4s, v15.4s
+ add v11.4s, v11.4s, v12.4s
+ add v8.4s, v8.4s, v13.4s
+ add v9.4s, v9.4s, v14.4s
+
+ eor v18.16b, v5.16b, v10.16b
+ eor v19.16b, v6.16b, v11.16b
+ eor v20.16b, v7.16b, v8.16b
+ eor v21.16b, v4.16b, v9.16b
+
+ shl v5.4s, v18.4s, #12
+ shl v6.4s, v19.4s, #12
+ shl v7.4s, v20.4s, #12
+ shl v4.4s, v21.4s, #12
+
+ sri v5.4s, v18.4s, #20
+ sri v6.4s, v19.4s, #20
+ sri v7.4s, v20.4s, #20
+ sri v4.4s, v21.4s, #20
+
+ // x0 += x5, x15 = rotl32(x15 ^ x0, 8)
+ // x1 += x6, x12 = rotl32(x12 ^ x1, 8)
+ // x2 += x7, x13 = rotl32(x13 ^ x2, 8)
+ // x3 += x4, x14 = rotl32(x14 ^ x3, 8)
+ add v0.4s, v0.4s, v5.4s
+ add v1.4s, v1.4s, v6.4s
+ add v2.4s, v2.4s, v7.4s
+ add v3.4s, v3.4s, v4.4s
+
+ eor v15.16b, v15.16b, v0.16b
+ eor v12.16b, v12.16b, v1.16b
+ eor v13.16b, v13.16b, v2.16b
+ eor v14.16b, v14.16b, v3.16b
+
+ tbl v15.16b, {v15.16b}, v31.16b
+ tbl v12.16b, {v12.16b}, v31.16b
+ tbl v13.16b, {v13.16b}, v31.16b
+ tbl v14.16b, {v14.16b}, v31.16b
+
+ // x10 += x15, x5 = rotl32(x5 ^ x10, 7)
+ // x11 += x12, x6 = rotl32(x6 ^ x11, 7)
+ // x8 += x13, x7 = rotl32(x7 ^ x8, 7)
+ // x9 += x14, x4 = rotl32(x4 ^ x9, 7)
+ add v10.4s, v10.4s, v15.4s
+ add v11.4s, v11.4s, v12.4s
+ add v8.4s, v8.4s, v13.4s
+ add v9.4s, v9.4s, v14.4s
+
+ eor v18.16b, v5.16b, v10.16b
+ eor v19.16b, v6.16b, v11.16b
+ eor v20.16b, v7.16b, v8.16b
+ eor v21.16b, v4.16b, v9.16b
+
+ shl v5.4s, v18.4s, #7
+ shl v6.4s, v19.4s, #7
+ shl v7.4s, v20.4s, #7
+ shl v4.4s, v21.4s, #7
+
+ sri v5.4s, v18.4s, #25
+ sri v6.4s, v19.4s, #25
+ sri v7.4s, v20.4s, #25
+ sri v4.4s, v21.4s, #25
+
+ subs x3, x3, #1
+ b.ne .Ldoubleround4
+
+ ld4r {v16.4s-v19.4s}, [x0], #16
+ ld4r {v20.4s-v23.4s}, [x0], #16
+
+ // x12 += counter values 0-3
+ add v12.4s, v12.4s, v30.4s
+
+ // x0[0-3] += s0[0]
+ // x1[0-3] += s0[1]
+ // x2[0-3] += s0[2]
+ // x3[0-3] += s0[3]
+ add v0.4s, v0.4s, v16.4s
+ add v1.4s, v1.4s, v17.4s
+ add v2.4s, v2.4s, v18.4s
+ add v3.4s, v3.4s, v19.4s
+
+ ld4r {v24.4s-v27.4s}, [x0], #16
+ ld4r {v28.4s-v31.4s}, [x0]
+
+ // x4[0-3] += s1[0]
+ // x5[0-3] += s1[1]
+ // x6[0-3] += s1[2]
+ // x7[0-3] += s1[3]
+ add v4.4s, v4.4s, v20.4s
+ add v5.4s, v5.4s, v21.4s
+ add v6.4s, v6.4s, v22.4s
+ add v7.4s, v7.4s, v23.4s
+
+ // x8[0-3] += s2[0]
+ // x9[0-3] += s2[1]
+ // x10[0-3] += s2[2]
+ // x11[0-3] += s2[3]
+ add v8.4s, v8.4s, v24.4s
+ add v9.4s, v9.4s, v25.4s
+ add v10.4s, v10.4s, v26.4s
+ add v11.4s, v11.4s, v27.4s
+
+ // x12[0-3] += s3[0]
+ // x13[0-3] += s3[1]
+ // x14[0-3] += s3[2]
+ // x15[0-3] += s3[3]
+ add v12.4s, v12.4s, v28.4s
+ add v13.4s, v13.4s, v29.4s
+ add v14.4s, v14.4s, v30.4s
+ add v15.4s, v15.4s, v31.4s
+
+ // interleave 32-bit words in state n, n+1
+ zip1 v16.4s, v0.4s, v1.4s
+ zip2 v17.4s, v0.4s, v1.4s
+ zip1 v18.4s, v2.4s, v3.4s
+ zip2 v19.4s, v2.4s, v3.4s
+ zip1 v20.4s, v4.4s, v5.4s
+ zip2 v21.4s, v4.4s, v5.4s
+ zip1 v22.4s, v6.4s, v7.4s
+ zip2 v23.4s, v6.4s, v7.4s
+ zip1 v24.4s, v8.4s, v9.4s
+ zip2 v25.4s, v8.4s, v9.4s
+ zip1 v26.4s, v10.4s, v11.4s
+ zip2 v27.4s, v10.4s, v11.4s
+ zip1 v28.4s, v12.4s, v13.4s
+ zip2 v29.4s, v12.4s, v13.4s
+ zip1 v30.4s, v14.4s, v15.4s
+ zip2 v31.4s, v14.4s, v15.4s
+
+ // interleave 64-bit words in state n, n+2
+ zip1 v0.2d, v16.2d, v18.2d
+ zip2 v4.2d, v16.2d, v18.2d
+ zip1 v8.2d, v17.2d, v19.2d
+ zip2 v12.2d, v17.2d, v19.2d
+ ld1 {v16.16b-v19.16b}, [x2], #64
+
+ zip1 v1.2d, v20.2d, v22.2d
+ zip2 v5.2d, v20.2d, v22.2d
+ zip1 v9.2d, v21.2d, v23.2d
+ zip2 v13.2d, v21.2d, v23.2d
+ ld1 {v20.16b-v23.16b}, [x2], #64
+
+ zip1 v2.2d, v24.2d, v26.2d
+ zip2 v6.2d, v24.2d, v26.2d
+ zip1 v10.2d, v25.2d, v27.2d
+ zip2 v14.2d, v25.2d, v27.2d
+ ld1 {v24.16b-v27.16b}, [x2], #64
+
+ zip1 v3.2d, v28.2d, v30.2d
+ zip2 v7.2d, v28.2d, v30.2d
+ zip1 v11.2d, v29.2d, v31.2d
+ zip2 v15.2d, v29.2d, v31.2d
+ ld1 {v28.16b-v31.16b}, [x2]
+
+ // xor with corresponding input, write to output
+ eor v16.16b, v16.16b, v0.16b
+ eor v17.16b, v17.16b, v1.16b
+ eor v18.16b, v18.16b, v2.16b
+ eor v19.16b, v19.16b, v3.16b
+ eor v20.16b, v20.16b, v4.16b
+ eor v21.16b, v21.16b, v5.16b
+ st1 {v16.16b-v19.16b}, [x1], #64
+ eor v22.16b, v22.16b, v6.16b
+ eor v23.16b, v23.16b, v7.16b
+ eor v24.16b, v24.16b, v8.16b
+ eor v25.16b, v25.16b, v9.16b
+ st1 {v20.16b-v23.16b}, [x1], #64
+ eor v26.16b, v26.16b, v10.16b
+ eor v27.16b, v27.16b, v11.16b
+ eor v28.16b, v28.16b, v12.16b
+ st1 {v24.16b-v27.16b}, [x1], #64
+ eor v29.16b, v29.16b, v13.16b
+ eor v30.16b, v30.16b, v14.16b
+ eor v31.16b, v31.16b, v15.16b
+ st1 {v28.16b-v31.16b}, [x1]
+
+ ret
+ENDPROC(chacha20_4block_xor_neon)
+
+CTRINC: .word 0, 1, 2, 3
+ROT8: .word 0x02010003, 0x06050407, 0x0a09080b, 0x0e0d0c0f
diff --git a/arch/arm64/crypto/chacha20-neon-glue.c b/arch/arm64/crypto/chacha20-neon-glue.c
new file mode 100644
index 000000000000..06fdb8468ac6
--- /dev/null
+++ b/arch/arm64/crypto/chacha20-neon-glue.c
@@ -0,0 +1,122 @@
+/*
+ * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions
+ *
+ * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Based on:
+ * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
+ *
+ * Copyright (C) 2015 Martin Willi
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <crypto/algapi.h>
+#include <crypto/chacha20.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+#include <asm/neon.h>
+
+asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
+asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
+
+static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
+ unsigned int bytes)
+{
+ u8 buf[CHACHA20_BLOCK_SIZE];
+
+ while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
+ chacha20_4block_xor_neon(state, dst, src);
+ bytes -= CHACHA20_BLOCK_SIZE * 4;
+ src += CHACHA20_BLOCK_SIZE * 4;
+ dst += CHACHA20_BLOCK_SIZE * 4;
+ state[12] += 4;
+ }
+ while (bytes >= CHACHA20_BLOCK_SIZE) {
+ chacha20_block_xor_neon(state, dst, src);
+ bytes -= CHACHA20_BLOCK_SIZE;
+ src += CHACHA20_BLOCK_SIZE;
+ dst += CHACHA20_BLOCK_SIZE;
+ state[12]++;
+ }
+ if (bytes) {
+ memcpy(buf, src, bytes);
+ chacha20_block_xor_neon(state, buf, buf);
+ memcpy(dst, buf, bytes);
+ }
+}
+
+static int chacha20_neon(struct skcipher_request *req)
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ u32 state[16];
+ int err;
+
+ if (req->cryptlen <= CHACHA20_BLOCK_SIZE)
+ return crypto_chacha20_crypt(req);
+
+ err = skcipher_walk_virt(&walk, req, true);
+
+ crypto_chacha20_init(state, ctx, walk.iv);
+
+ kernel_neon_begin();
+ while (walk.nbytes > 0) {
+ unsigned int nbytes = walk.nbytes;
+
+ if (nbytes < walk.total)
+ nbytes = round_down(nbytes, walk.chunksize);
+
+ chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
+ nbytes);
+ err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+ }
+ kernel_neon_end();
+
+ return err;
+}
+
+static struct skcipher_alg alg = {
+ .base.cra_name = "chacha20",
+ .base.cra_driver_name = "chacha20-neon",
+ .base.cra_priority = 300,
+ .base.cra_blocksize = 1,
+ .base.cra_ctxsize = sizeof(struct chacha20_ctx),
+ .base.cra_alignmask = 1,
+ .base.cra_module = THIS_MODULE,
+
+ .min_keysize = CHACHA20_KEY_SIZE,
+ .max_keysize = CHACHA20_KEY_SIZE,
+ .ivsize = CHACHA20_IV_SIZE,
+ .chunksize = 4 * CHACHA20_BLOCK_SIZE,
+ .setkey = crypto_chacha20_setkey,
+ .encrypt = chacha20_neon,
+ .decrypt = chacha20_neon,
+};
+
+static int __init chacha20_simd_mod_init(void)
+{
+ return crypto_register_skcipher(&alg);
+}
+
+static void __exit chacha20_simd_mod_fini(void)
+{
+ crypto_unregister_skcipher(&alg);
+}
+
+module_init(chacha20_simd_mod_init);
+module_exit(chacha20_simd_mod_fini);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("chacha20");
--
2.7.4
^ permalink raw reply related
* [PATCH v2 3/3] crypto: arm/chacha20 - implement NEON version based on SSE3 code
From: Ard Biesheuvel @ 2016-12-09 14:33 UTC (permalink / raw)
To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1481294033-23508-1-git-send-email-ard.biesheuvel@linaro.org>
This is a straight port to ARM/NEON of the x86 SSE3 implementation
of the ChaCha20 stream cipher.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
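For readers who want a scalar reference while reviewing the assembly, the
round structure annotated in the NEON code can be sketched in plain C. This
is only an illustration, not part of the patch: the helper names rotl32,
chacha20_quarter_round and chacha20_double_round are made up for this note.
It assumes the usual RFC 7539 quarter round, and the diagonal indices are
taken from the comments in the 4-block routine.

```c
#include <stdint.h>

/*
 * Rotate a 32-bit word left by n bits (0 < n < 32). The NEON code gets the
 * same result from a vshl.u32 by n followed by a vsri.u32 by (32 - n).
 */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha20 quarter round on four words of the state. */
static void chacha20_quarter_round(uint32_t *a, uint32_t *b,
				   uint32_t *c, uint32_t *d)
{
	*a += *b; *d = rotl32(*d ^ *a, 16);
	*c += *d; *b = rotl32(*b ^ *c, 12);
	*a += *b; *d = rotl32(*d ^ *a, 8);
	*c += *d; *b = rotl32(*b ^ *c, 7);
}

/*
 * Hypothetical helper: one double round over a 16-word state, i.e. four
 * column quarter rounds followed by four diagonal quarter rounds.
 */
static void chacha20_double_round(uint32_t x[16])
{
	/* column round */
	chacha20_quarter_round(&x[0], &x[4], &x[8],  &x[12]);
	chacha20_quarter_round(&x[1], &x[5], &x[9],  &x[13]);
	chacha20_quarter_round(&x[2], &x[6], &x[10], &x[14]);
	chacha20_quarter_round(&x[3], &x[7], &x[11], &x[15]);

	/* diagonal round */
	chacha20_quarter_round(&x[0], &x[5], &x[10], &x[15]);
	chacha20_quarter_round(&x[1], &x[6], &x[11], &x[12]);
	chacha20_quarter_round(&x[2], &x[7], &x[8],  &x[13]);
	chacha20_quarter_round(&x[3], &x[4], &x[9],  &x[14]);
}
```

The single-block routine keeps one row of the state per NEON register, so it
has to rotate rows with vext.8 between the column and diagonal halves. The
4-block routine keeps one state word per register across four blocks, so the
diagonal step is just a different choice of registers.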
arch/arm/crypto/Kconfig | 6 +
arch/arm/crypto/Makefile | 2 +
arch/arm/crypto/chacha20-neon-core.S | 524 ++++++++++++++++++++
arch/arm/crypto/chacha20-neon-glue.c | 127 +++++
4 files changed, 659 insertions(+)
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 13f1b4c289d4..2f3339f015d3 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -130,4 +130,10 @@ config CRYPTO_CRC32_ARM_CE
depends on KERNEL_MODE_NEON && CRC32
select CRYPTO_HASH
+config CRYPTO_CHACHA20_NEON
+ tristate "NEON accelerated ChaCha20 symmetric cipher"
+ depends on KERNEL_MODE_NEON
+ select CRYPTO_BLKCIPHER
+ select CRYPTO_CHACHA20
+
endif
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index b578a1820ab1..8d74e55eacd4 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
+obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
@@ -40,6 +41,7 @@ aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o
ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
+chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
quiet_cmd_perl = PERL $@
cmd_perl = $(PERL) $(<) > $(@)
diff --git a/arch/arm/crypto/chacha20-neon-core.S b/arch/arm/crypto/chacha20-neon-core.S
new file mode 100644
index 000000000000..ff1d337bdb4a
--- /dev/null
+++ b/arch/arm/crypto/chacha20-neon-core.S
@@ -0,0 +1,524 @@
+/*
+ * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
+ *
+ * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Based on:
+ * ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSE3 functions
+ *
+ * Copyright (C) 2015 Martin Willi
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/linkage.h>
+
+ .text
+ .fpu neon
+ .align 5
+
+ENTRY(chacha20_block_xor_neon)
+ // r0: Input state matrix, s
+ // r1: 1 data block output, o
+ // r2: 1 data block input, i
+
+ //
+ // This function encrypts one ChaCha20 block by loading the state matrix
+ // in four NEON registers. It performs matrix operations on four words in
+ // parallel, but requires shuffling to rearrange the words after each
+ // round.
+ //
+
+ // x0..3 = s0..3
+ add ip, r0, #0x20
+ vld1.32 {q0-q1}, [r0]
+ vld1.32 {q2-q3}, [ip]
+
+ vmov q8, q0
+ vmov q9, q1
+ vmov q10, q2
+ vmov q11, q3
+
+ mov r3, #10
+
+.Ldoubleround:
+ // x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+ vadd.i32 q0, q0, q1
+ veor q4, q3, q0
+ vshl.u32 q3, q4, #16
+ vsri.u32 q3, q4, #16
+
+ // x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+ vadd.i32 q2, q2, q3
+ veor q4, q1, q2
+ vshl.u32 q1, q4, #12
+ vsri.u32 q1, q4, #20
+
+ // x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+ vadd.i32 q0, q0, q1
+ veor q4, q3, q0
+ vshl.u32 q3, q4, #8
+ vsri.u32 q3, q4, #24
+
+ // x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+ vadd.i32 q2, q2, q3
+ veor q4, q1, q2
+ vshl.u32 q1, q4, #7
+ vsri.u32 q1, q4, #25
+
+ // x1 = shuffle32(x1, MASK(0, 3, 2, 1))
+ vext.8 q1, q1, q1, #4
+ // x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+ vext.8 q2, q2, q2, #8
+ // x3 = shuffle32(x3, MASK(2, 1, 0, 3))
+ vext.8 q3, q3, q3, #12
+
+ // x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+ vadd.i32 q0, q0, q1
+ veor q4, q3, q0
+ vshl.u32 q3, q4, #16
+ vsri.u32 q3, q4, #16
+
+ // x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+ vadd.i32 q2, q2, q3
+ veor q4, q1, q2
+ vshl.u32 q1, q4, #12
+ vsri.u32 q1, q4, #20
+
+ // x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+ vadd.i32 q0, q0, q1
+ veor q4, q3, q0
+ vshl.u32 q3, q4, #8
+ vsri.u32 q3, q4, #24
+
+ // x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+ vadd.i32 q2, q2, q3
+ veor q4, q1, q2
+ vshl.u32 q1, q4, #7
+ vsri.u32 q1, q4, #25
+
+ // x1 = shuffle32(x1, MASK(2, 1, 0, 3))
+ vext.8 q1, q1, q1, #12
+ // x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+ vext.8 q2, q2, q2, #8
+ // x3 = shuffle32(x3, MASK(0, 3, 2, 1))
+ vext.8 q3, q3, q3, #4
+
+ subs r3, r3, #1
+ bne .Ldoubleround
+
+ add ip, r2, #0x20
+ vld1.8 {q4-q5}, [r2]
+ vld1.8 {q6-q7}, [ip]
+
+ // o0 = i0 ^ (x0 + s0)
+ vadd.i32 q0, q0, q8
+ veor q0, q0, q4
+
+ // o1 = i1 ^ (x1 + s1)
+ vadd.i32 q1, q1, q9
+ veor q1, q1, q5
+
+ // o2 = i2 ^ (x2 + s2)
+ vadd.i32 q2, q2, q10
+ veor q2, q2, q6
+
+ // o3 = i3 ^ (x3 + s3)
+ vadd.i32 q3, q3, q11
+ veor q3, q3, q7
+
+ add ip, r1, #0x20
+ vst1.8 {q0-q1}, [r1]
+ vst1.8 {q2-q3}, [ip]
+
+ bx lr
+ENDPROC(chacha20_block_xor_neon)
+
+ .align 5
+ENTRY(chacha20_4block_xor_neon)
+ push {r4-r6, lr}
+ mov ip, sp // preserve the stack pointer
+ sub r3, sp, #0x20 // allocate a 32 byte buffer
+ bic r3, r3, #0x1f // aligned to 32 bytes
+ mov sp, r3
+
+ // r0: Input state matrix, s
+ // r1: 4 data blocks output, o
+ // r2: 4 data blocks input, i
+
+ //
+ // This function encrypts four consecutive ChaCha20 blocks by loading
+ // the state matrix in NEON registers four times. The algorithm performs
+ // each operation on the corresponding word of each state matrix, hence
+ // requires no word shuffling. For the final XORing step we transpose the
+ // matrix by interleaving 32- and then 64-bit words, which allows us to
+ // do XOR in NEON registers.
+ //
+
+ // x0..15[0-3] = s0..3[0..3]
+ add r3, r0, #0x20
+ vld1.32 {q0-q1}, [r0]
+ vld1.32 {q2-q3}, [r3]
+
+ adr r3, CTRINC
+ vdup.32 q15, d7[1]
+ vdup.32 q14, d7[0]
+ vld1.32 {q11}, [r3, :128]
+ vdup.32 q13, d6[1]
+ vdup.32 q12, d6[0]
+ vadd.i32 q12, q12, q11 // x12 += counter values 0-3
+ vdup.32 q11, d5[1]
+ vdup.32 q10, d5[0]
+ vdup.32 q9, d4[1]
+ vdup.32 q8, d4[0]
+ vdup.32 q7, d3[1]
+ vdup.32 q6, d3[0]
+ vdup.32 q5, d2[1]
+ vdup.32 q4, d2[0]
+ vdup.32 q3, d1[1]
+ vdup.32 q2, d1[0]
+ vdup.32 q1, d0[1]
+ vdup.32 q0, d0[0]
+
+ mov r3, #10
+
+.Ldoubleround4:
+ // x0 += x4, x12 = rotl32(x12 ^ x0, 16)
+ // x1 += x5, x13 = rotl32(x13 ^ x1, 16)
+ // x2 += x6, x14 = rotl32(x14 ^ x2, 16)
+ // x3 += x7, x15 = rotl32(x15 ^ x3, 16)
+ vadd.i32 q0, q0, q4
+ vadd.i32 q1, q1, q5
+ vadd.i32 q2, q2, q6
+ vadd.i32 q3, q3, q7
+
+ veor q12, q12, q0
+ veor q13, q13, q1
+ veor q14, q14, q2
+ veor q15, q15, q3
+
+ vrev32.16 q12, q12
+ vrev32.16 q13, q13
+ vrev32.16 q14, q14
+ vrev32.16 q15, q15
+
+ // x8 += x12, x4 = rotl32(x4 ^ x8, 12)
+ // x9 += x13, x5 = rotl32(x5 ^ x9, 12)
+ // x10 += x14, x6 = rotl32(x6 ^ x10, 12)
+ // x11 += x15, x7 = rotl32(x7 ^ x11, 12)
+ vadd.i32 q8, q8, q12
+ vadd.i32 q9, q9, q13
+ vadd.i32 q10, q10, q14
+ vadd.i32 q11, q11, q15
+
+ vst1.32 {q8-q9}, [sp, :256]
+
+ veor q8, q4, q8
+ veor q9, q5, q9
+ vshl.u32 q4, q8, #12
+ vshl.u32 q5, q9, #12
+ vsri.u32 q4, q8, #20
+ vsri.u32 q5, q9, #20
+
+ veor q8, q6, q10
+ veor q9, q7, q11
+ vshl.u32 q6, q8, #12
+ vshl.u32 q7, q9, #12
+ vsri.u32 q6, q8, #20
+ vsri.u32 q7, q9, #20
+
+ // x0 += x4, x12 = rotl32(x12 ^ x0, 8)
+ // x1 += x5, x13 = rotl32(x13 ^ x1, 8)
+ // x2 += x6, x14 = rotl32(x14 ^ x2, 8)
+ // x3 += x7, x15 = rotl32(x15 ^ x3, 8)
+ vadd.i32 q0, q0, q4
+ vadd.i32 q1, q1, q5
+ vadd.i32 q2, q2, q6
+ vadd.i32 q3, q3, q7
+
+ veor q8, q12, q0
+ veor q9, q13, q1
+ vshl.u32 q12, q8, #8
+ vshl.u32 q13, q9, #8
+ vsri.u32 q12, q8, #24
+ vsri.u32 q13, q9, #24
+
+ veor q8, q14, q2
+ veor q9, q15, q3
+ vshl.u32 q14, q8, #8
+ vshl.u32 q15, q9, #8
+ vsri.u32 q14, q8, #24
+ vsri.u32 q15, q9, #24
+
+ vld1.32 {q8-q9}, [sp, :256]
+
+ // x8 += x12, x4 = rotl32(x4 ^ x8, 7)
+ // x9 += x13, x5 = rotl32(x5 ^ x9, 7)
+ // x10 += x14, x6 = rotl32(x6 ^ x10, 7)
+ // x11 += x15, x7 = rotl32(x7 ^ x11, 7)
+ vadd.i32 q8, q8, q12
+ vadd.i32 q9, q9, q13
+ vadd.i32 q10, q10, q14
+ vadd.i32 q11, q11, q15
+
+ vst1.32 {q8-q9}, [sp, :256]
+
+ veor q8, q4, q8
+ veor q9, q5, q9
+ vshl.u32 q4, q8, #7
+ vshl.u32 q5, q9, #7
+ vsri.u32 q4, q8, #25
+ vsri.u32 q5, q9, #25
+
+ veor q8, q6, q10
+ veor q9, q7, q11
+ vshl.u32 q6, q8, #7
+ vshl.u32 q7, q9, #7
+ vsri.u32 q6, q8, #25
+ vsri.u32 q7, q9, #25
+
+ vld1.32 {q8-q9}, [sp, :256]
+
+ // x0 += x5, x15 = rotl32(x15 ^ x0, 16)
+ // x1 += x6, x12 = rotl32(x12 ^ x1, 16)
+ // x2 += x7, x13 = rotl32(x13 ^ x2, 16)
+ // x3 += x4, x14 = rotl32(x14 ^ x3, 16)
+ vadd.i32 q0, q0, q5
+ vadd.i32 q1, q1, q6
+ vadd.i32 q2, q2, q7
+ vadd.i32 q3, q3, q4
+
+ veor q15, q15, q0
+ veor q12, q12, q1
+ veor q13, q13, q2
+ veor q14, q14, q3
+
+ vrev32.16 q15, q15
+ vrev32.16 q12, q12
+ vrev32.16 q13, q13
+ vrev32.16 q14, q14
+
+ // x10 += x15, x5 = rotl32(x5 ^ x10, 12)
+ // x11 += x12, x6 = rotl32(x6 ^ x11, 12)
+ // x8 += x13, x7 = rotl32(x7 ^ x8, 12)
+ // x9 += x14, x4 = rotl32(x4 ^ x9, 12)
+ vadd.i32 q10, q10, q15
+ vadd.i32 q11, q11, q12
+ vadd.i32 q8, q8, q13
+ vadd.i32 q9, q9, q14
+
+ vst1.32 {q8-q9}, [sp, :256]
+
+ veor q8, q7, q8
+ veor q9, q4, q9
+ vshl.u32 q7, q8, #12
+ vshl.u32 q4, q9, #12
+ vsri.u32 q7, q8, #20
+ vsri.u32 q4, q9, #20
+
+ veor q8, q5, q10
+ veor q9, q6, q11
+ vshl.u32 q5, q8, #12
+ vshl.u32 q6, q9, #12
+ vsri.u32 q5, q8, #20
+ vsri.u32 q6, q9, #20
+
+ // x0 += x5, x15 = rotl32(x15 ^ x0, 8)
+ // x1 += x6, x12 = rotl32(x12 ^ x1, 8)
+ // x2 += x7, x13 = rotl32(x13 ^ x2, 8)
+ // x3 += x4, x14 = rotl32(x14 ^ x3, 8)
+ vadd.i32 q0, q0, q5
+ vadd.i32 q1, q1, q6
+ vadd.i32 q2, q2, q7
+ vadd.i32 q3, q3, q4
+
+ veor q8, q15, q0
+ veor q9, q12, q1
+ vshl.u32 q15, q8, #8
+ vshl.u32 q12, q9, #8
+ vsri.u32 q15, q8, #24
+ vsri.u32 q12, q9, #24
+
+ veor q8, q13, q2
+ veor q9, q14, q3
+ vshl.u32 q13, q8, #8
+ vshl.u32 q14, q9, #8
+ vsri.u32 q13, q8, #24
+ vsri.u32 q14, q9, #24
+
+ vld1.32 {q8-q9}, [sp, :256]
+
+ // x10 += x15, x5 = rotl32(x5 ^ x10, 7)
+ // x11 += x12, x6 = rotl32(x6 ^ x11, 7)
+ // x8 += x13, x7 = rotl32(x7 ^ x8, 7)
+ // x9 += x14, x4 = rotl32(x4 ^ x9, 7)
+ vadd.i32 q10, q10, q15
+ vadd.i32 q11, q11, q12
+ vadd.i32 q8, q8, q13
+ vadd.i32 q9, q9, q14
+
+ vst1.32 {q8-q9}, [sp, :256]
+
+ veor q8, q7, q8
+ veor q9, q4, q9
+ vshl.u32 q7, q8, #7
+ vshl.u32 q4, q9, #7
+ vsri.u32 q7, q8, #25
+ vsri.u32 q4, q9, #25
+
+ veor q8, q5, q10
+ veor q9, q6, q11
+ vshl.u32 q5, q8, #7
+ vshl.u32 q6, q9, #7
+ vsri.u32 q5, q8, #25
+ vsri.u32 q6, q9, #25
+
+ subs r3, r3, #1
+ beq 0f
+
+ vld1.32 {q8-q9}, [sp, :256]
+ b .Ldoubleround4
+
+ // x0[0-3] += s0[0]
+ // x1[0-3] += s0[1]
+ // x2[0-3] += s0[2]
+ // x3[0-3] += s0[3]
+0: ldmia r0!, {r3-r6}
+ vdup.32 q8, r3
+ vdup.32 q9, r4
+ vadd.i32 q0, q0, q8
+ vadd.i32 q1, q1, q9
+ vdup.32 q8, r5
+ vdup.32 q9, r6
+ vadd.i32 q2, q2, q8
+ vadd.i32 q3, q3, q9
+
+ // x4[0-3] += s1[0]
+ // x5[0-3] += s1[1]
+ // x6[0-3] += s1[2]
+ // x7[0-3] += s1[3]
+ ldmia r0!, {r3-r6}
+ vdup.32 q8, r3
+ vdup.32 q9, r4
+ vadd.i32 q4, q4, q8
+ vadd.i32 q5, q5, q9
+ vdup.32 q8, r5
+ vdup.32 q9, r6
+ vadd.i32 q6, q6, q8
+ vadd.i32 q7, q7, q9
+
+ // interleave 32-bit words in state n, n+1
+ vzip.32 q0, q1
+ vzip.32 q2, q3
+ vzip.32 q4, q5
+ vzip.32 q6, q7
+
+ // interleave 64-bit words in state n, n+2
+ vswp d1, d4
+ vswp d3, d6
+ vswp d9, d12
+ vswp d11, d14
+
+ // xor with corresponding input, write to output
+ vld1.8 {q8-q9}, [r2]!
+ veor q8, q8, q0
+ veor q9, q9, q4
+ vst1.8 {q8-q9}, [r1]!
+
+ vld1.32 {q8-q9}, [sp, :256]
+
+ // x8[0-3] += s2[0]
+ // x9[0-3] += s2[1]
+ // x10[0-3] += s2[2]
+ // x11[0-3] += s2[3]
+ ldmia r0!, {r3-r6}
+ vdup.32 q0, r3
+ vdup.32 q4, r4
+ vadd.i32 q8, q8, q0
+ vadd.i32 q9, q9, q4
+ vdup.32 q0, r5
+ vdup.32 q4, r6
+ vadd.i32 q10, q10, q0
+ vadd.i32 q11, q11, q4
+
+ // x12[0-3] += s3[0]
+ // x13[0-3] += s3[1]
+ // x14[0-3] += s3[2]
+ // x15[0-3] += s3[3]
+ ldmia r0!, {r3-r6}
+ vdup.32 q0, r3
+ vdup.32 q4, r4
+ adr r3, CTRINC
+ vadd.i32 q12, q12, q0
+ vld1.32 {q0}, [r3, :128]
+ vadd.i32 q13, q13, q4
+ vadd.i32 q12, q12, q0 // x12 += counter values 0-3
+
+ vdup.32 q0, r5
+ vdup.32 q4, r6
+ vadd.i32 q14, q14, q0
+ vadd.i32 q15, q15, q4
+
+ // interleave 32-bit words in state n, n+1
+ vzip.32 q8, q9
+ vzip.32 q10, q11
+ vzip.32 q12, q13
+ vzip.32 q14, q15
+
+ // interleave 64-bit words in state n, n+2
+ vswp d17, d20
+ vswp d19, d22
+ vswp d25, d28
+ vswp d27, d30
+
+ vmov q4, q1
+
+ vld1.8 {q0-q1}, [r2]!
+ veor q0, q0, q8
+ veor q1, q1, q12
+ vst1.8 {q0-q1}, [r1]!
+
+ vld1.8 {q0-q1}, [r2]!
+ veor q0, q0, q2
+ veor q1, q1, q6
+ vst1.8 {q0-q1}, [r1]!
+
+ vld1.8 {q0-q1}, [r2]!
+ veor q0, q0, q10
+ veor q1, q1, q14
+ vst1.8 {q0-q1}, [r1]!
+
+ vld1.8 {q0-q1}, [r2]!
+ veor q0, q0, q4
+ veor q1, q1, q5
+ vst1.8 {q0-q1}, [r1]!
+
+ vld1.8 {q0-q1}, [r2]!
+ veor q0, q0, q9
+ veor q1, q1, q13
+ vst1.8 {q0-q1}, [r1]!
+
+ vld1.8 {q0-q1}, [r2]!
+ veor q0, q0, q3
+ veor q1, q1, q7
+ vst1.8 {q0-q1}, [r1]!
+
+ vld1.8 {q0-q1}, [r2]
+ veor q0, q0, q11
+ veor q1, q1, q15
+ vst1.8 {q0-q1}, [r1]
+
+ mov sp, ip
+ pop {r4-r6, pc}
+ENDPROC(chacha20_4block_xor_neon)
+
+ .align 4
+CTRINC: .word 0, 1, 2, 3
+
diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha20-neon-glue.c
new file mode 100644
index 000000000000..20623237e2bc
--- /dev/null
+++ b/arch/arm/crypto/chacha20-neon-glue.c
@@ -0,0 +1,127 @@
+/*
+ * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
+ *
+ * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Based on:
+ * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
+ *
+ * Copyright (C) 2015 Martin Willi
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <crypto/algapi.h>
+#include <crypto/chacha20.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/simd.h>
+
+asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
+asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
+
+static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
+ unsigned int bytes)
+{
+ u8 buf[CHACHA20_BLOCK_SIZE];
+
+ while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
+ chacha20_4block_xor_neon(state, dst, src);
+ bytes -= CHACHA20_BLOCK_SIZE * 4;
+ src += CHACHA20_BLOCK_SIZE * 4;
+ dst += CHACHA20_BLOCK_SIZE * 4;
+ state[12] += 4;
+ }
+ while (bytes >= CHACHA20_BLOCK_SIZE) {
+ chacha20_block_xor_neon(state, dst, src);
+ bytes -= CHACHA20_BLOCK_SIZE;
+ src += CHACHA20_BLOCK_SIZE;
+ dst += CHACHA20_BLOCK_SIZE;
+ state[12]++;
+ }
+ if (bytes) {
+ memcpy(buf, src, bytes);
+ chacha20_block_xor_neon(state, buf, buf);
+ memcpy(dst, buf, bytes);
+ }
+}
+
+static int chacha20_neon(struct skcipher_request *req)
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ u32 state[16];
+ int err;
+
+ if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
+ return crypto_chacha20_crypt(req);
+
+ err = skcipher_walk_virt(&walk, req, true);
+
+ crypto_chacha20_init(state, ctx, walk.iv);
+
+ kernel_neon_begin();
+ while (walk.nbytes > 0) {
+ unsigned int nbytes = walk.nbytes;
+
+ if (nbytes < walk.total)
+ nbytes = round_down(nbytes, walk.chunksize);
+
+ chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
+ nbytes);
+ err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+ }
+ kernel_neon_end();
+
+ return err;
+}
+
+static struct skcipher_alg alg = {
+ .base.cra_name = "chacha20",
+ .base.cra_driver_name = "chacha20-neon",
+ .base.cra_priority = 300,
+ .base.cra_blocksize = 1,
+ .base.cra_ctxsize = sizeof(struct chacha20_ctx),
+ .base.cra_alignmask = 1,
+ .base.cra_module = THIS_MODULE,
+
+ .min_keysize = CHACHA20_KEY_SIZE,
+ .max_keysize = CHACHA20_KEY_SIZE,
+ .ivsize = CHACHA20_IV_SIZE,
+ .chunksize = 4 * CHACHA20_BLOCK_SIZE,
+ .setkey = crypto_chacha20_setkey,
+ .encrypt = chacha20_neon,
+ .decrypt = chacha20_neon,
+};
+
+static int __init chacha20_simd_mod_init(void)
+{
+ if (!(elf_hwcap & HWCAP_NEON))
+ return -ENODEV;
+
+ return crypto_register_skcipher(&alg);
+}
+
+static void __exit chacha20_simd_mod_fini(void)
+{
+ crypto_unregister_skcipher(&alg);
+}
+
+module_init(chacha20_simd_mod_init);
+module_exit(chacha20_simd_mod_fini);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("chacha20");
--
2.7.4
^ permalink raw reply related
* Re: [PATCH] linux/types.h: enable endian checks for all sparse builds
From: Bart Van Assche @ 2016-12-09 15:18 UTC (permalink / raw)
To: Madhani, Himanshu, Michael S. Tsirkin
Cc: kvm@vger.kernel.org, Neil Armstrong, David Airlie,
linux-remoteproc@vger.kernel.org, dri-devel@lists.freedesktop.org,
virtualization@lists.linux-foundation.org,
linux-s390@vger.kernel.org, James E.J. Bottomley, Herbert Xu,
linux-scsi@vger.kernel.org, Christoph Hellwig,
v9fs-developer@lists.sourceforge.net, Asias He, Arnd Bergmann,
linux-kbuild@vger.kernel.org, Jens Axboe, Michal Marek,
Stefan Hajnoczi <stef
In-Reply-To: <6199215E-2AA4-4705-9552-5D61FE03F866@cavium.com>
On 12/08/16 22:40, Madhani, Himanshu wrote:
> We’ll take a look and send patches to resolve these warnings.
Thanks!
Bart.
^ permalink raw reply
* Re: [PATCH 7/7] hwrng: core: Remove two unused include
From: Corentin Labbe @ 2016-12-09 18:24 UTC (permalink / raw)
To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel
In-Reply-To: <1481293299-21697-7-git-send-email-clabbe.montjoie@gmail.com>
On Fri, Dec 09, 2016 at 03:21:39PM +0100, Corentin Labbe wrote:
> linux/fs.h and linux/sched.h are useless for hw_random/core.c.
> This patch remove them.
>
> Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
> ---
> drivers/char/hw_random/core.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index 5c654b5..85c9ab3 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -13,14 +13,12 @@
> #include <linux/delay.h>
> #include <linux/device.h>
> #include <linux/err.h>
> -#include <linux/fs.h>
> #include <linux/hw_random.h>
> #include <linux/kernel.h>
> #include <linux/kthread.h>
> #include <linux/miscdevice.h>
> #include <linux/module.h>
> #include <linux/random.h>
> -#include <linux/sched.h>
> #include <linux/slab.h>
> #include <linux/uaccess.h>
>
> --
> 2.7.3
>
Sorry, please disregard this patch; it is buggy.
linux/fs.h is needed.
Regards
^ permalink raw reply
* [PATCH] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-09 18:36 UTC (permalink / raw)
To: linux-kernel, kernel-hardening, linux-crypto, rusty, torvalds
Cc: Jason A. Donenfeld, Jean-Philippe Aumasson, Daniel J . Bernstein
SipHash is a keyed hash function with a 64-bit output that is actually a
cryptographically secure PRF, like HMAC, except that SipHash is very fast
and is meant to be used as a keyed hashtable lookup function.
SipHash isn't just some new trendy hash function. It's been around for a
while, and there really isn't anything that comes remotely close to
being useful in the way SipHash is. With that said, why do we need this?
There are a variety of attacks known as "hashtable poisoning" in which an
attacker crafts many inputs that all hash to the same value, and then
proceeds to fill up a single hash bucket with them. This is a realistic
and well-known denial-of-service vector.
Linux developers already seem to be aware that this is an issue, and
various places that use hash tables in, say, a network context, use a
non-cryptographically secure function (usually jhash) and then try to
twiddle with the key on a time basis (or in many cases just do nothing
and hope that nobody notices). While this is an admirable attempt at
solving the problem, it doesn't actually fix it. SipHash fixes it.
(It fixes it in such a sound way that you could even build a stream
cipher out of SipHash that would resist modern cryptanalysis.)
There are a number of places in the kernel that are vulnerable to
hashtable poisoning attacks, either via userspace vectors or network
vectors, and there's not a reliable mechanism inside the kernel at the
moment to fix it. The first step toward fixing these issues is actually
getting a secure primitive into the kernel for developers to use. Then
we can, bit by bit, port things over to it as deemed appropriate.
Dozens of languages are already using this internally for their hash
tables. Some of the BSDs already use this in their kernels. SipHash is
a widely known high-speed solution to a widely known problem, and it's
time we catch up.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Daniel J. Bernstein <djb@cr.yp.to>
---
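To make the intended use concrete, here is a minimal userspace sketch of
SipHash-2-4 feeding a bucket index. It is an illustration under stated
assumptions, not the code in this patch: siphash24_sketch, load_le64 and
bucket_index are names made up for this note, input is loaded byte by byte
so alignment does not matter, the result is returned as a native integer
rather than a little-endian byte string, and bucket_index assumes the table
size is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

#define SIPHASH_KEY_LEN 16

static inline uint64_t rotl64(uint64_t x, unsigned int b)
{
	return (x << b) | (x >> (64 - b));
}

/* Load a little-endian 64-bit word one byte at a time. */
static inline uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

#define SIPROUND \
	do { \
	v0 += v1; v1 = rotl64(v1, 13); v1 ^= v0; v0 = rotl64(v0, 32); \
	v2 += v3; v3 = rotl64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = rotl64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = rotl64(v1, 17); v1 ^= v2; v2 = rotl64(v2, 32); \
	} while (0)

/* SipHash-2-4: two rounds per 8-byte message word, four final rounds. */
uint64_t siphash24_sketch(const uint8_t *data, size_t len,
			  const uint8_t key[SIPHASH_KEY_LEN])
{
	uint64_t k0 = load_le64(key);
	uint64_t k1 = load_le64(key + 8);
	uint64_t v0 = k0 ^ 0x736f6d6570736575ULL;
	uint64_t v1 = k1 ^ 0x646f72616e646f6dULL;
	uint64_t v2 = k0 ^ 0x6c7967656e657261ULL;
	uint64_t v3 = k1 ^ 0x7465646279746573ULL;
	uint64_t b = (uint64_t)len << 56;	/* length goes in the top byte */
	size_t full = len - (len % 8);
	size_t i;

	for (i = 0; i < full; i += 8) {
		uint64_t m = load_le64(data + i);

		v3 ^= m;
		SIPROUND;
		SIPROUND;
		v0 ^= m;
	}
	/* fold the 0..7 trailing bytes into the final word */
	for (; i < len; i++)
		b |= (uint64_t)data[i] << (8 * (i % 8));

	v3 ^= b;
	SIPROUND;
	SIPROUND;
	v0 ^= b;
	v2 ^= 0xff;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	return v0 ^ v1 ^ v2 ^ v3;
}

/* Hypothetical helper: pick a bucket; nbuckets must be a power of two. */
size_t bucket_index(const uint8_t *data, size_t len,
		    const uint8_t key[SIPHASH_KEY_LEN], size_t nbuckets)
{
	return (size_t)(siphash24_sketch(data, len, key) & (nbuckets - 1));
}
```

The point of the design is that the bucket depends on a secret per-table
key, so an attacker who cannot learn the key cannot precompute a set of
inputs that all land in the same bucket.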
include/linux/siphash.h | 18 ++++++
lib/Makefile | 3 +-
lib/siphash.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 183 insertions(+), 1 deletion(-)
create mode 100644 include/linux/siphash.h
create mode 100644 lib/siphash.c
diff --git a/include/linux/siphash.h b/include/linux/siphash.h
new file mode 100644
index 000000000000..485c2101cc7d
--- /dev/null
+++ b/include/linux/siphash.h
@@ -0,0 +1,18 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#ifndef _LINUX_SIPHASH_H
+#define _LINUX_SIPHASH_H
+
+#include <linux/types.h>
+
+enum siphash24_lengths {
+ SIPHASH24_KEY_LEN = 16
+};
+
+uint64_t siphash24(const uint8_t *data, size_t len, const uint8_t key[SIPHASH24_KEY_LEN]);
+
+#endif /* _LINUX_SIPHASH_H */
diff --git a/lib/Makefile b/lib/Makefile
index 50144a3aeebd..d224337b0d01 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
sha1.o chacha20.o md5.o irq_regs.o argv_split.o \
flex_proportions.o ratelimit.o show_mem.o \
is_single_threaded.o plist.o decompress.o kobject_uevent.o \
- earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o
+ earlycpio.o seq_buf.o siphash.o \
+ nmi_backtrace.o nodemask.o win_minmax.o
lib-$(CONFIG_MMU) += ioremap.o
lib-$(CONFIG_SMP) += cpumask.o
diff --git a/lib/siphash.c b/lib/siphash.c
new file mode 100644
index 000000000000..022d86f04b9b
--- /dev/null
+++ b/lib/siphash.c
@@ -0,0 +1,163 @@
+/* Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com>
+ * Copyright (C) 2012-2014 Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
+ * Copyright (C) 2012-2014 Daniel J. Bernstein <djb@cr.yp.to>
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#include <linux/siphash.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+
+#define ROTL(x,b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))
+#define U8TO64(p) le64_to_cpu(*(__le64 *)(p))
+
+#define SIPROUND \
+ do { \
+ v0 += v1; v1 = ROTL(v1, 13); v1 ^= v0; v0 = ROTL(v0, 32); \
+ v2 += v3; v3 = ROTL(v3, 16); v3 ^= v2; \
+ v0 += v3; v3 = ROTL(v3, 21); v3 ^= v0; \
+ v2 += v1; v1 = ROTL(v1, 17); v1 ^= v2; v2 = ROTL(v2, 32); \
+ } while(0)
+
+__attribute__((optimize("unroll-loops")))
+uint64_t siphash24(const uint8_t *data, size_t len, const uint8_t key[SIPHASH24_KEY_LEN])
+{
+ uint64_t v0 = 0x736f6d6570736575ULL;
+ uint64_t v1 = 0x646f72616e646f6dULL;
+ uint64_t v2 = 0x6c7967656e657261ULL;
+ uint64_t v3 = 0x7465646279746573ULL;
+ uint64_t b;
+ uint64_t k0 = U8TO64(key);
+ uint64_t k1 = U8TO64(key + sizeof(uint64_t));
+ uint64_t m;
+ const uint8_t *end = data + len - (len % sizeof(uint64_t));
+ const uint8_t left = len & (sizeof(uint64_t) - 1);
+ b = ((uint64_t)len) << 56;
+ v3 ^= k1;
+ v2 ^= k0;
+ v1 ^= k1;
+ v0 ^= k0;
+ for (; data != end; data += sizeof(uint64_t)) {
+ m = U8TO64(data);
+ v3 ^= m;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= m;
+ }
+ switch (left) {
+ case 7: b |= ((uint64_t)data[6]) << 48;
+ case 6: b |= ((uint64_t)data[5]) << 40;
+ case 5: b |= ((uint64_t)data[4]) << 32;
+ case 4: b |= ((uint64_t)data[3]) << 24;
+ case 3: b |= ((uint64_t)data[2]) << 16;
+ case 2: b |= ((uint64_t)data[1]) << 8;
+ case 1: b |= ((uint64_t)data[0]); break;
+ case 0: break;
+ }
+ v3 ^= b;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= b;
+ v2 ^= 0xff;
+ SIPROUND;
+ SIPROUND;
+ SIPROUND;
+ SIPROUND;
+ b = (v0 ^ v1) ^ (v2 ^ v3);
+ return (__force uint64_t)cpu_to_le64(b);
+}
+EXPORT_SYMBOL(siphash24);
+
+#ifdef DEBUG
+static const uint8_t test_vectors[64][8] = {
+ { 0x31, 0x0e, 0x0e, 0xdd, 0x47, 0xdb, 0x6f, 0x72 },
+ { 0xfd, 0x67, 0xdc, 0x93, 0xc5, 0x39, 0xf8, 0x74 },
+ { 0x5a, 0x4f, 0xa9, 0xd9, 0x09, 0x80, 0x6c, 0x0d },
+ { 0x2d, 0x7e, 0xfb, 0xd7, 0x96, 0x66, 0x67, 0x85 },
+ { 0xb7, 0x87, 0x71, 0x27, 0xe0, 0x94, 0x27, 0xcf },
+ { 0x8d, 0xa6, 0x99, 0xcd, 0x64, 0x55, 0x76, 0x18 },
+ { 0xce, 0xe3, 0xfe, 0x58, 0x6e, 0x46, 0xc9, 0xcb },
+ { 0x37, 0xd1, 0x01, 0x8b, 0xf5, 0x00, 0x02, 0xab },
+ { 0x62, 0x24, 0x93, 0x9a, 0x79, 0xf5, 0xf5, 0x93 },
+ { 0xb0, 0xe4, 0xa9, 0x0b, 0xdf, 0x82, 0x00, 0x9e },
+ { 0xf3, 0xb9, 0xdd, 0x94, 0xc5, 0xbb, 0x5d, 0x7a },
+ { 0xa7, 0xad, 0x6b, 0x22, 0x46, 0x2f, 0xb3, 0xf4 },
+ { 0xfb, 0xe5, 0x0e, 0x86, 0xbc, 0x8f, 0x1e, 0x75 },
+ { 0x90, 0x3d, 0x84, 0xc0, 0x27, 0x56, 0xea, 0x14 },
+ { 0xee, 0xf2, 0x7a, 0x8e, 0x90, 0xca, 0x23, 0xf7 },
+ { 0xe5, 0x45, 0xbe, 0x49, 0x61, 0xca, 0x29, 0xa1 },
+ { 0xdb, 0x9b, 0xc2, 0x57, 0x7f, 0xcc, 0x2a, 0x3f },
+ { 0x94, 0x47, 0xbe, 0x2c, 0xf5, 0xe9, 0x9a, 0x69 },
+ { 0x9c, 0xd3, 0x8d, 0x96, 0xf0, 0xb3, 0xc1, 0x4b },
+ { 0xbd, 0x61, 0x79, 0xa7, 0x1d, 0xc9, 0x6d, 0xbb },
+ { 0x98, 0xee, 0xa2, 0x1a, 0xf2, 0x5c, 0xd6, 0xbe },
+ { 0xc7, 0x67, 0x3b, 0x2e, 0xb0, 0xcb, 0xf2, 0xd0 },
+ { 0x88, 0x3e, 0xa3, 0xe3, 0x95, 0x67, 0x53, 0x93 },
+ { 0xc8, 0xce, 0x5c, 0xcd, 0x8c, 0x03, 0x0c, 0xa8 },
+ { 0x94, 0xaf, 0x49, 0xf6, 0xc6, 0x50, 0xad, 0xb8 },
+ { 0xea, 0xb8, 0x85, 0x8a, 0xde, 0x92, 0xe1, 0xbc },
+ { 0xf3, 0x15, 0xbb, 0x5b, 0xb8, 0x35, 0xd8, 0x17 },
+ { 0xad, 0xcf, 0x6b, 0x07, 0x63, 0x61, 0x2e, 0x2f },
+ { 0xa5, 0xc9, 0x1d, 0xa7, 0xac, 0xaa, 0x4d, 0xde },
+ { 0x71, 0x65, 0x95, 0x87, 0x66, 0x50, 0xa2, 0xa6 },
+ { 0x28, 0xef, 0x49, 0x5c, 0x53, 0xa3, 0x87, 0xad },
+ { 0x42, 0xc3, 0x41, 0xd8, 0xfa, 0x92, 0xd8, 0x32 },
+ { 0xce, 0x7c, 0xf2, 0x72, 0x2f, 0x51, 0x27, 0x71 },
+ { 0xe3, 0x78, 0x59, 0xf9, 0x46, 0x23, 0xf3, 0xa7 },
+ { 0x38, 0x12, 0x05, 0xbb, 0x1a, 0xb0, 0xe0, 0x12 },
+ { 0xae, 0x97, 0xa1, 0x0f, 0xd4, 0x34, 0xe0, 0x15 },
+ { 0xb4, 0xa3, 0x15, 0x08, 0xbe, 0xff, 0x4d, 0x31 },
+ { 0x81, 0x39, 0x62, 0x29, 0xf0, 0x90, 0x79, 0x02 },
+ { 0x4d, 0x0c, 0xf4, 0x9e, 0xe5, 0xd4, 0xdc, 0xca },
+ { 0x5c, 0x73, 0x33, 0x6a, 0x76, 0xd8, 0xbf, 0x9a },
+ { 0xd0, 0xa7, 0x04, 0x53, 0x6b, 0xa9, 0x3e, 0x0e },
+ { 0x92, 0x59, 0x58, 0xfc, 0xd6, 0x42, 0x0c, 0xad },
+ { 0xa9, 0x15, 0xc2, 0x9b, 0xc8, 0x06, 0x73, 0x18 },
+ { 0x95, 0x2b, 0x79, 0xf3, 0xbc, 0x0a, 0xa6, 0xd4 },
+ { 0xf2, 0x1d, 0xf2, 0xe4, 0x1d, 0x45, 0x35, 0xf9 },
+ { 0x87, 0x57, 0x75, 0x19, 0x04, 0x8f, 0x53, 0xa9 },
+ { 0x10, 0xa5, 0x6c, 0xf5, 0xdf, 0xcd, 0x9a, 0xdb },
+ { 0xeb, 0x75, 0x09, 0x5c, 0xcd, 0x98, 0x6c, 0xd0 },
+ { 0x51, 0xa9, 0xcb, 0x9e, 0xcb, 0xa3, 0x12, 0xe6 },
+ { 0x96, 0xaf, 0xad, 0xfc, 0x2c, 0xe6, 0x66, 0xc7 },
+ { 0x72, 0xfe, 0x52, 0x97, 0x5a, 0x43, 0x64, 0xee },
+ { 0x5a, 0x16, 0x45, 0xb2, 0x76, 0xd5, 0x92, 0xa1 },
+ { 0xb2, 0x74, 0xcb, 0x8e, 0xbf, 0x87, 0x87, 0x0a },
+ { 0x6f, 0x9b, 0xb4, 0x20, 0x3d, 0xe7, 0xb3, 0x81 },
+ { 0xea, 0xec, 0xb2, 0xa3, 0x0b, 0x22, 0xa8, 0x7f },
+ { 0x99, 0x24, 0xa4, 0x3c, 0xc1, 0x31, 0x57, 0x24 },
+ { 0xbd, 0x83, 0x8d, 0x3a, 0xaf, 0xbf, 0x8d, 0xb7 },
+ { 0x0b, 0x1a, 0x2a, 0x32, 0x65, 0xd5, 0x1a, 0xea },
+ { 0x13, 0x50, 0x79, 0xa3, 0x23, 0x1c, 0xe6, 0x60 },
+ { 0x93, 0x2b, 0x28, 0x46, 0xe4, 0xd7, 0x06, 0x66 },
+ { 0xe1, 0x91, 0x5f, 0x5c, 0xb1, 0xec, 0xa4, 0x6c },
+ { 0xf3, 0x25, 0x96, 0x5c, 0xa1, 0x6d, 0x62, 0x9f },
+ { 0x57, 0x5f, 0xf2, 0x8e, 0x60, 0x38, 0x1b, 0xe5 },
+ { 0x72, 0x45, 0x06, 0xeb, 0x4c, 0x32, 0x8a, 0x95 }
+};
+
+static int siphash24_selftest(void)
+{
+ uint8_t in[64], k[16], i;
+ uint64_t out;
+ int ret = 0;
+
+ for (i = 0; i < 16; ++i)
+ k[i] = i;
+
+ for (i = 0; i < 64; ++i) {
+ in[i] = i;
+ out = siphash24(in, i, k);
+ if (memcmp(&out, test_vectors[i], 8)) {
+ printk(KERN_INFO "siphash24: self-test %u: FAIL\n", i + 1);
+ ret = -1;
+ }
+ }
+ if (!ret)
+ printk(KERN_INFO "siphash24: self-tests: pass\n");
+ return ret;
+}
+__initcall(siphash24_selftest);
+#endif
--
2.11.0
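
Editorial sketch, not part of the patch: the self-test above can be
cross-checked from userspace with a small SipHash-2-4 reference. The names
siphash24_ref(), load_le64() and siphash24_ref_vector15_ok() are hypothetical
and exist only for this illustration; the kernel's siphash24() is not shown
here. The expected bytes are copied from the table row that, assuming the
table follows the reference vector order, corresponds to the 15-byte input
00..0e under key 00..0f. The sketch compares decoded values rather than using
memcmp() on a uint64_t, since the latter presumably only matches on
little-endian hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ROTL64(x, b) (((x) << (b)) | ((x) >> (64 - (b))))

/* One SipRound over the four 64-bit state words. */
#define SIPROUND(v0, v1, v2, v3) do { \
	v0 += v1; v1 = ROTL64(v1, 13); v1 ^= v0; v0 = ROTL64(v0, 32); \
	v2 += v3; v3 = ROTL64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = ROTL64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = ROTL64(v1, 17); v1 ^= v2; v2 = ROTL64(v2, 32); \
} while (0)

/* Decode 8 bytes as a little-endian word, independent of host byte order. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Hypothetical userspace reference for SipHash-2-4 with a 64-bit result. */
static uint64_t siphash24_ref(const uint8_t *in, size_t len,
			      const uint8_t key[16])
{
	uint64_t k0 = load_le64(key), k1 = load_le64(key + 8);
	uint64_t v0 = k0 ^ 0x736f6d6570736575ULL;
	uint64_t v1 = k1 ^ 0x646f72616e646f6dULL;
	uint64_t v2 = k0 ^ 0x6c7967656e657261ULL;
	uint64_t v3 = k1 ^ 0x7465646279746573ULL;
	uint64_t b = (uint64_t)len << 56;	/* length goes in the top byte */
	size_t full = len & ~(size_t)7;
	size_t i;

	assert(in != NULL || len == 0);

	/* Compression: two rounds per full 8-byte word. */
	for (i = 0; i < full; i += 8) {
		uint64_t m = load_le64(in + i);

		v3 ^= m;
		SIPROUND(v0, v1, v2, v3);
		SIPROUND(v0, v1, v2, v3);
		v0 ^= m;
	}

	/* The 0..7 trailing bytes fill the low bytes of the final word. */
	for (i = full; i < len; i++)
		b |= (uint64_t)in[i] << (8 * (i - full));

	v3 ^= b;
	SIPROUND(v0, v1, v2, v3);
	SIPROUND(v0, v1, v2, v3);
	v0 ^= b;

	/* Finalization: four rounds. */
	v2 ^= 0xff;
	SIPROUND(v0, v1, v2, v3);
	SIPROUND(v0, v1, v2, v3);
	SIPROUND(v0, v1, v2, v3);
	SIPROUND(v0, v1, v2, v3);

	return v0 ^ v1 ^ v2 ^ v3;
}

/* Returns 1 if the 15-byte input matches the corresponding table row. */
static int siphash24_ref_vector15_ok(void)
{
	static const uint8_t expect[8] = {
		0xe5, 0x45, 0xbe, 0x49, 0x61, 0xca, 0x29, 0xa1
	};
	uint8_t key[16], in[15];
	size_t i;

	for (i = 0; i < 16; i++)
		key[i] = (uint8_t)i;
	for (i = 0; i < 15; i++)
		in[i] = (uint8_t)i;

	return siphash24_ref(in, 15, key) == load_le64(expect);
}
```

Calling siphash24_ref_vector15_ok() from a small main() is enough to confirm
that the table and the reference agree on that row.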
^ permalink raw reply related
* Re: [PATCH] linux/types.h: enable endian checks for all sparse builds
From: Michael S. Tsirkin @ 2016-12-09 20:45 UTC (permalink / raw)
To: Bart Van Assche
Cc: Madhani, Himanshu, kvm@vger.kernel.org, Neil Armstrong,
David Airlie, linux-remoteproc@vger.kernel.org,
dri-devel@lists.freedesktop.org,
virtualization@lists.linux-foundation.org,
linux-s390@vger.kernel.org, James E.J. Bottomley, Herbert Xu,
linux-scsi@vger.kernel.org, Christoph Hellwig,
v9fs-developer@lists.sourceforge.net, Asias He,
Arnd Bergmann <ar
In-Reply-To: <BLUPR02MB168371E06EFA9AB34AA2C58181870@BLUPR02MB1683.namprd02.prod.outlook.com>
On Fri, Dec 09, 2016 at 03:18:02PM +0000, Bart Van Assche wrote:
> On 12/08/16 22:40, Madhani, Himanshu wrote:
> > We’ll take a look and send patches to resolve these warnings.
>
> Thanks!
>
> Bart.
>
Sounds good. I posted what I have so far so that you can
start from that.
--
MST
^ permalink raw reply
* Re: [PATCH v2 1/3] crypto: brcm: DT documentation for Broadcom SPU driver
From: Rob Herring @ 2016-12-09 21:26 UTC (permalink / raw)
To: Rob Rice
Cc: Herbert Xu, David S. Miller, Mark Rutland,
linux-crypto-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Ray Jui, Scott Branden,
Jon Mason, bcm-kernel-feedback-list-dY08KVG/lbpWk0Htik3J/w,
Catalin Marinas, Will Deacon,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Steve Lin
In-Reply-To: <1480714499-1476-2-git-send-email-rob.rice-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
On Fri, Dec 02, 2016 at 04:34:57PM -0500, Rob Rice wrote:
> Device tree documentation for Broadcom Secure Processing Unit
> (SPU) crypto driver.
>
> Signed-off-by: Steve Lin <steven.lin1-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Rob Rice <rob.rice-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> ---
> .../devicetree/bindings/crypto/brcm,spu-crypto.txt | 25 ++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt
>
> diff --git a/Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt b/Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt
> new file mode 100644
> index 0000000..e5fe942
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt
> @@ -0,0 +1,25 @@
> +The Broadcom Secure Processing Unit (SPU) driver supports symmetric
Bindings describe h/w, not drivers.
> +cryptographic offload for Broadcom SoCs with SPU hardware. A SoC may have
> +multiple SPU hardware blocks.
> +
> +Required properties:
> +- compatible : Should be "brcm,spum-crypto" for devices with SPU-M hardware
Additionally, you should have SoC specific compatible here.
> + (e.g., Northstar2) or "brcm,spum-nsp-crypto" for the Northstar Plus variant
> + of the SPU-M hardware.
> +
> +- reg: Should contain SPU registers location and length.
> +- mboxes: A list of mailbox channels to be used by the kernel driver. Mailbox
> +channels correspond to DMA rings on the device.
What determines the mbox assignment?
Needs to specify how many.
> +
> +Example:
> + spu-crypto@612d0000 {
Just crypto@...
Rob
^ permalink raw reply
* Re: [PATCH v6 1/2] sparc: fix a building error reported by kbuild
From: Sam Ravnborg @ 2016-12-09 21:58 UTC (permalink / raw)
To: Gonglei
Cc: linux-kernel, qemu-devel, virtio-dev, virtualization,
linux-crypto, luonengjun, mst, stefanha, weidong.huang, wu.wubin,
xin.zeng, claudio.fontana, herbert, pasic, davem, jianjay.zhou,
hanweidong, arei.gonglei, cornelia.huck, xuquan8, longpeng2,
wanzongshun, sparclinux
In-Reply-To: <1481171829-116496-2-git-send-email-arei.gonglei@huawei.com>
Hi Gonglei.
On Thu, Dec 08, 2016 at 12:37:08PM +0800, Gonglei wrote:
> >> arch/sparc/include/asm/topology_64.h:44:44:
> error: implicit declaration of function 'cpu_data'
> [-Werror=implicit-function-declaration]
>
> #define topology_physical_package_id(cpu) (cpu_data(cpu).proc_id)
> ^
> Let's include cpudata.h in topology_64.h.
>
> Cc: Sam Ravnborg <sam@ravnborg.org>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: sparclinux@vger.kernel.org
> Suggested-by: Sam Ravnborg <sam@ravnborg.org>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
> ---
> arch/sparc/include/asm/topology_64.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/sparc/include/asm/topology_64.h b/arch/sparc/include/asm/topology_64.h
> index 7b4898a..2255430 100644
> --- a/arch/sparc/include/asm/topology_64.h
> +++ b/arch/sparc/include/asm/topology_64.h
> @@ -4,6 +4,7 @@
> #ifdef CONFIG_NUMA
>
> #include <asm/mmzone.h>
> +#include <asm/cpudata.h>
Nitpick - if you are going to resend this patch, then please
order the two includes in alphabetic order.
For two includes this looks like bikeshedding, but when we add
more having them in a defined order prevents merge conflicts.
And makes it readable too.
We also sometimes order the includes with the longest lines topmost,
and lines with the same length are ordered alphabetically.
But this is not seen so often.
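
Editorial sketch: the two orderings described above can be written as qsort()
comparators. This is only an illustration in userspace C; sort_includes() and
the comparator names are hypothetical helpers, not an existing kernel tool.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Plain alphabetical order of the include lines. */
static int cmp_alpha(const void *a, const void *b)
{
	const char *sa = *(const char *const *)a;
	const char *sb = *(const char *const *)b;

	return strcmp(sa, sb);
}

/* Longest line first; lines of equal length fall back to alphabetical. */
static int cmp_longest_first(const void *a, const void *b)
{
	const char *sa = *(const char *const *)a;
	const char *sb = *(const char *const *)b;
	size_t la = strlen(sa), lb = strlen(sb);

	if (la != lb)
		return la > lb ? -1 : 1;
	return strcmp(sa, sb);
}

/* Sort n include lines in place using one of the two conventions. */
static void sort_includes(const char **lines, size_t n, int longest_first)
{
	assert(lines != NULL || n == 0);
	if (n == 0)
		return;
	qsort(lines, n, sizeof(*lines),
	      longest_first ? cmp_longest_first : cmp_alpha);
}
```

With the two includes from the patch, the alphabetical variant puts
<asm/cpudata.h> before <asm/mmzone.h>.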
Sam
^ permalink raw reply
* Remaining crypto API regressions with CONFIG_VMAP_STACK
From: Eric Biggers @ 2016-12-09 23:08 UTC (permalink / raw)
To: linux-crypto
Cc: linux-kernel, linux-mm, kernel-hardening, Herbert Xu,
Andrew Lutomirski, Stephan Mueller
In the 4.9 kernel, virtually-mapped stacks will be supported and enabled by
default on x86_64. This has been exposing a number of problems in which
on-stack buffers are being passed into the crypto API, which to support crypto
accelerators operates on 'struct page' rather than on virtual memory.
Some of these problems have already been fixed, but I was wondering how many
problems remain, so I briefly looked through all the callers of sg_set_buf() and
sg_init_one(). Overall I found quite a few remaining problems, detailed below.
The following crypto drivers initialize a scatterlist to point into an
ahash_request, which may have been allocated on the stack with
AHASH_REQUEST_ON_STACK():
drivers/crypto/bfin_crc.c:351
drivers/crypto/qce/sha.c:299
drivers/crypto/sahara.c:973,988
drivers/crypto/talitos.c:1910
drivers/crypto/ccp/ccp-crypto-aes-cmac.c:105,119,142
drivers/crypto/ccp/ccp-crypto-sha.c:95,109,124
drivers/crypto/qce/sha.c:325
The following crypto drivers initialize a scatterlist to point into an
ablkcipher_request, which may have been allocated on the stack with
SKCIPHER_REQUEST_ON_STACK():
drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
drivers/crypto/ccp/ccp-crypto-aes.c:94
And these other places do crypto operations on buffers clearly on the stack:
drivers/net/wireless/intersil/orinoco/mic.c:72
drivers/usb/wusbcore/crypto.c:264
net/ceph/crypto.c:182
net/rxrpc/rxkad.c:737,1000
security/keys/encrypted-keys/encrypted.c:500
fs/cifs/smbencrypt.c:96
Note: I almost certainly missed some, since I excluded places where the use of a
stack buffer was not obvious to me. I also excluded AEAD algorithms since there
isn't an AEAD_REQUEST_ON_STACK() macro (yet).
The "good" news with these bugs is that on x86_64 without CONFIG_DEBUG_SG=y or
CONFIG_DEBUG_VIRTUAL=y, you can still do virt_to_page() and then page_address()
on a vmalloc address and get back the same address, even though you aren't
*supposed* to be able to do this. This will make things still work for most
people. The bad news is that if you happen to have consumed just about 1 page
(or N pages) of your stack at the time you call the crypto API, your stack
buffer may actually span physically non-contiguous pages, so the crypto
algorithm will scribble over some unrelated page. Also, hardware crypto drivers
which actually do operate on physical memory will break too.
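
Editorial sketch: the failure case described above depends on whether a
buffer straddles a page boundary. The userspace helpers below only do the
address arithmetic; pages_spanned(), crosses_page() and SKETCH_PAGE_SIZE are
hypothetical names for this illustration, and the 4 KiB page size is an
assumption that matches x86_64 but is not taken from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86_64. */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)

/* Number of pages touched by the byte range [addr, addr + len). */
static size_t pages_spanned(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return 0;
	/* The range must not wrap around the address space. */
	assert(len - 1 <= UINTPTR_MAX - addr);

	first = addr / SKETCH_PAGE_SIZE;
	last = (addr + (len - 1)) / SKETCH_PAGE_SIZE;
	return (size_t)(last - first) + 1;
}

/*
 * Nonzero if the range touches more than one page. For a vmalloc-backed
 * stack, such a range is not guaranteed to be physically contiguous, so
 * describing it by its first page plus a length can reach unrelated memory.
 */
static int crosses_page(uintptr_t addr, size_t len)
{
	return pages_spanned(addr, len) > 1;
}
```

For example, a 16-byte buffer starting 8 bytes before a page boundary spans
two pages, while the same buffer at the start of a page spans one.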
So I am wondering: is the best solution really to make all these crypto API
algorithms and users use heap buffers, as opposed to something like maintaining
a lowmem alias for the stack, or introducing a more general function to convert
buffers (possibly in the vmalloc space) into scatterlists? And if the current
solution is desired, who is going to fix all of these bugs and when?
Eric
^ permalink raw reply
* Re: [PATCH 7/7] hwrng: core: Remove two unused include
From: kbuild test robot @ 2016-12-10 1:27 UTC (permalink / raw)
To: Corentin Labbe
Cc: kbuild-all, mpm, herbert, arnd, gregkh, linux-crypto,
linux-kernel, Corentin Labbe
In-Reply-To: <1481293299-21697-7-git-send-email-clabbe.montjoie@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 23999 bytes --]
Hi Corentin,
[auto build test ERROR on char-misc/char-misc-testing]
[also build test ERROR on v4.9-rc8 next-20161209]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Corentin-Labbe/hwrng-core-do-not-use-multiple-blank-lines/20161210-072632
config: i386-randconfig-x007-201649 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All error/warnings (new ones prefixed by >>):
In file included from include/linux/linkage.h:4:0,
from include/linux/kernel.h:6,
from include/linux/delay.h:10,
from drivers/char/hw_random/core.c:13:
drivers/char/hw_random/core.c: In function 'rng_dev_open':
>> drivers/char/hw_random/core.c:169:11: error: dereferencing pointer to incomplete type 'struct file'
if ((filp->f_mode & FMODE_READ) == 0)
^
include/linux/compiler.h:149:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> drivers/char/hw_random/core.c:169:2: note: in expansion of macro 'if'
if ((filp->f_mode & FMODE_READ) == 0)
^~
>> drivers/char/hw_random/core.c:169:22: error: 'FMODE_READ' undeclared (first use in this function)
if ((filp->f_mode & FMODE_READ) == 0)
^
include/linux/compiler.h:149:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> drivers/char/hw_random/core.c:169:2: note: in expansion of macro 'if'
if ((filp->f_mode & FMODE_READ) == 0)
^~
drivers/char/hw_random/core.c:169:22: note: each undeclared identifier is reported only once for each function it appears in
if ((filp->f_mode & FMODE_READ) == 0)
^
include/linux/compiler.h:149:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> drivers/char/hw_random/core.c:169:2: note: in expansion of macro 'if'
if ((filp->f_mode & FMODE_READ) == 0)
^~
>> drivers/char/hw_random/core.c:171:21: error: 'FMODE_WRITE' undeclared (first use in this function)
if (filp->f_mode & FMODE_WRITE)
^
include/linux/compiler.h:149:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
drivers/char/hw_random/core.c:171:2: note: in expansion of macro 'if'
if (filp->f_mode & FMODE_WRITE)
^~
drivers/char/hw_random/core.c: In function 'rng_dev_read':
>> drivers/char/hw_random/core.c:221:23: error: 'O_NONBLOCK' undeclared (first use in this function)
!(filp->f_flags & O_NONBLOCK));
^~~~~~~~~~
drivers/char/hw_random/core.c: At top level:
>> drivers/char/hw_random/core.c:272:21: error: variable 'rng_chrdev_ops' has initializer but incomplete type
static const struct file_operations rng_chrdev_ops = {
^~~~~~~~~~~~~~~
>> drivers/char/hw_random/core.c:273:2: error: unknown field 'owner' specified in initializer
.owner = THIS_MODULE,
^
In file included from include/linux/linkage.h:6:0,
from include/linux/kernel.h:6,
from include/linux/delay.h:10,
from drivers/char/hw_random/core.c:13:
include/linux/export.h:37:21: warning: excess elements in struct initializer
#define THIS_MODULE ((struct module *)0)
^
>> drivers/char/hw_random/core.c:273:12: note: in expansion of macro 'THIS_MODULE'
.owner = THIS_MODULE,
^~~~~~~~~~~
include/linux/export.h:37:21: note: (near initialization for 'rng_chrdev_ops')
#define THIS_MODULE ((struct module *)0)
^
>> drivers/char/hw_random/core.c:273:12: note: in expansion of macro 'THIS_MODULE'
.owner = THIS_MODULE,
^~~~~~~~~~~
>> drivers/char/hw_random/core.c:274:2: error: unknown field 'open' specified in initializer
.open = rng_dev_open,
^
>> drivers/char/hw_random/core.c:274:11: warning: excess elements in struct initializer
.open = rng_dev_open,
^~~~~~~~~~~~
drivers/char/hw_random/core.c:274:11: note: (near initialization for 'rng_chrdev_ops')
>> drivers/char/hw_random/core.c:275:2: error: unknown field 'read' specified in initializer
.read = rng_dev_read,
^
drivers/char/hw_random/core.c:275:11: warning: excess elements in struct initializer
.read = rng_dev_read,
^~~~~~~~~~~~
drivers/char/hw_random/core.c:275:11: note: (near initialization for 'rng_chrdev_ops')
>> drivers/char/hw_random/core.c:276:2: error: unknown field 'llseek' specified in initializer
.llseek = noop_llseek,
^
>> drivers/char/hw_random/core.c:276:13: error: 'noop_llseek' undeclared here (not in a function)
.llseek = noop_llseek,
^~~~~~~~~~~
drivers/char/hw_random/core.c:276:13: warning: excess elements in struct initializer
drivers/char/hw_random/core.c:276:13: note: (near initialization for 'rng_chrdev_ops')
>> drivers/char/hw_random/core.c:272:37: error: storage size of 'rng_chrdev_ops' isn't known
static const struct file_operations rng_chrdev_ops = {
^~~~~~~~~~~~~~
vim +169 drivers/char/hw_random/core.c
92c7987e7 Corentin Labbe 2016-12-09 7 * Please read Documentation/hw_random.txt for details on use.
92c7987e7 Corentin Labbe 2016-12-09 8 *
92c7987e7 Corentin Labbe 2016-12-09 9 * This software may be used and distributed according to the terms
92c7987e7 Corentin Labbe 2016-12-09 10 * of the GNU General Public License, incorporated herein by reference.
844dd05fe Michael Buesch 2006-06-26 11 */
844dd05fe Michael Buesch 2006-06-26 12
b70f09c75 Corentin Labbe 2016-12-09 @13 #include <linux/delay.h>
844dd05fe Michael Buesch 2006-06-26 14 #include <linux/device.h>
b70f09c75 Corentin Labbe 2016-12-09 15 #include <linux/err.h>
844dd05fe Michael Buesch 2006-06-26 16 #include <linux/hw_random.h>
844dd05fe Michael Buesch 2006-06-26 17 #include <linux/kernel.h>
be4000bc4 Torsten Duwe 2014-06-14 18 #include <linux/kthread.h>
b70f09c75 Corentin Labbe 2016-12-09 19 #include <linux/miscdevice.h>
b70f09c75 Corentin Labbe 2016-12-09 20 #include <linux/module.h>
d9e797261 Kees Cook 2014-03-03 21 #include <linux/random.h>
b70f09c75 Corentin Labbe 2016-12-09 22 #include <linux/slab.h>
b70f09c75 Corentin Labbe 2016-12-09 23 #include <linux/uaccess.h>
844dd05fe Michael Buesch 2006-06-26 24
844dd05fe Michael Buesch 2006-06-26 25 #define RNG_MODULE_NAME "hw_random"
844dd05fe Michael Buesch 2006-06-26 26
844dd05fe Michael Buesch 2006-06-26 27 static struct hwrng *current_rng;
be4000bc4 Torsten Duwe 2014-06-14 28 static struct task_struct *hwrng_fill;
844dd05fe Michael Buesch 2006-06-26 29 static LIST_HEAD(rng_list);
9372b35e1 Rusty Russell 2014-12-08 30 /* Protects rng_list and current_rng */
844dd05fe Michael Buesch 2006-06-26 31 static DEFINE_MUTEX(rng_mutex);
9372b35e1 Rusty Russell 2014-12-08 32 /* Protects rng read functions, data_avail, rng_buffer and rng_fillbuf */
9372b35e1 Rusty Russell 2014-12-08 33 static DEFINE_MUTEX(reading_mutex);
9996508b3 Ian Molton 2009-12-01 34 static int data_avail;
be4000bc4 Torsten Duwe 2014-06-14 35 static u8 *rng_buffer, *rng_fillbuf;
0f734e6e7 Torsten Duwe 2014-06-14 36 static unsigned short current_quality;
0f734e6e7 Torsten Duwe 2014-06-14 37 static unsigned short default_quality; /* = 0; default to "off" */
be4000bc4 Torsten Duwe 2014-06-14 38
be4000bc4 Torsten Duwe 2014-06-14 39 module_param(current_quality, ushort, 0644);
be4000bc4 Torsten Duwe 2014-06-14 40 MODULE_PARM_DESC(current_quality,
be4000bc4 Torsten Duwe 2014-06-14 41 "current hwrng entropy estimation per mill");
0f734e6e7 Torsten Duwe 2014-06-14 42 module_param(default_quality, ushort, 0644);
0f734e6e7 Torsten Duwe 2014-06-14 43 MODULE_PARM_DESC(default_quality,
0f734e6e7 Torsten Duwe 2014-06-14 44 "default entropy content of hwrng per mill");
be4000bc4 Torsten Duwe 2014-06-14 45
ff77c150f Herbert Xu 2014-12-23 46 static void drop_current_rng(void);
90ac41bd4 Herbert Xu 2014-12-23 47 static int hwrng_init(struct hwrng *rng);
be4000bc4 Torsten Duwe 2014-06-14 48 static void start_khwrngd(void);
f7f154f12 Rusty Russell 2013-03-05 49
d3cc79964 Amit Shah 2014-07-10 50 static inline int rng_get_data(struct hwrng *rng, u8 *buffer, size_t size,
d3cc79964 Amit Shah 2014-07-10 51 int wait);
d3cc79964 Amit Shah 2014-07-10 52
f7f154f12 Rusty Russell 2013-03-05 53 static size_t rng_buffer_size(void)
f7f154f12 Rusty Russell 2013-03-05 54 {
f7f154f12 Rusty Russell 2013-03-05 55 return SMP_CACHE_BYTES < 32 ? 32 : SMP_CACHE_BYTES;
f7f154f12 Rusty Russell 2013-03-05 56 }
844dd05fe Michael Buesch 2006-06-26 57
d3cc79964 Amit Shah 2014-07-10 58 static void add_early_randomness(struct hwrng *rng)
d3cc79964 Amit Shah 2014-07-10 59 {
d3cc79964 Amit Shah 2014-07-10 60 int bytes_read;
6d4952d9d Andrew Lutomirski 2016-10-17 61 size_t size = min_t(size_t, 16, rng_buffer_size());
d3cc79964 Amit Shah 2014-07-10 62
9372b35e1 Rusty Russell 2014-12-08 63 mutex_lock(&reading_mutex);
6d4952d9d Andrew Lutomirski 2016-10-17 64 bytes_read = rng_get_data(rng, rng_buffer, size, 1);
9372b35e1 Rusty Russell 2014-12-08 65 mutex_unlock(&reading_mutex);
d3cc79964 Amit Shah 2014-07-10 66 if (bytes_read > 0)
6d4952d9d Andrew Lutomirski 2016-10-17 67 add_device_randomness(rng_buffer, bytes_read);
d3cc79964 Amit Shah 2014-07-10 68 }
d3cc79964 Amit Shah 2014-07-10 69
3a2c0ba5a Rusty Russell 2014-12-08 70 static inline void cleanup_rng(struct kref *kref)
3a2c0ba5a Rusty Russell 2014-12-08 71 {
3a2c0ba5a Rusty Russell 2014-12-08 72 struct hwrng *rng = container_of(kref, struct hwrng, ref);
3a2c0ba5a Rusty Russell 2014-12-08 73
3a2c0ba5a Rusty Russell 2014-12-08 74 if (rng->cleanup)
3a2c0ba5a Rusty Russell 2014-12-08 75 rng->cleanup(rng);
a027f30d7 Rusty Russell 2014-12-08 76
77584ee57 Herbert Xu 2014-12-23 77 complete(&rng->cleanup_done);
3a2c0ba5a Rusty Russell 2014-12-08 78 }
3a2c0ba5a Rusty Russell 2014-12-08 79
90ac41bd4 Herbert Xu 2014-12-23 80 static int set_current_rng(struct hwrng *rng)
3a2c0ba5a Rusty Russell 2014-12-08 81 {
90ac41bd4 Herbert Xu 2014-12-23 82 int err;
90ac41bd4 Herbert Xu 2014-12-23 83
3a2c0ba5a Rusty Russell 2014-12-08 84 BUG_ON(!mutex_is_locked(&rng_mutex));
90ac41bd4 Herbert Xu 2014-12-23 85
90ac41bd4 Herbert Xu 2014-12-23 86 err = hwrng_init(rng);
90ac41bd4 Herbert Xu 2014-12-23 87 if (err)
90ac41bd4 Herbert Xu 2014-12-23 88 return err;
90ac41bd4 Herbert Xu 2014-12-23 89
ff77c150f Herbert Xu 2014-12-23 90 drop_current_rng();
3a2c0ba5a Rusty Russell 2014-12-08 91 current_rng = rng;
90ac41bd4 Herbert Xu 2014-12-23 92
90ac41bd4 Herbert Xu 2014-12-23 93 return 0;
3a2c0ba5a Rusty Russell 2014-12-08 94 }
3a2c0ba5a Rusty Russell 2014-12-08 95
3a2c0ba5a Rusty Russell 2014-12-08 96 static void drop_current_rng(void)
3a2c0ba5a Rusty Russell 2014-12-08 97 {
3a2c0ba5a Rusty Russell 2014-12-08 98 BUG_ON(!mutex_is_locked(&rng_mutex));
3a2c0ba5a Rusty Russell 2014-12-08 99 if (!current_rng)
3a2c0ba5a Rusty Russell 2014-12-08 100 return;
3a2c0ba5a Rusty Russell 2014-12-08 101
3a2c0ba5a Rusty Russell 2014-12-08 102 /* decrease last reference for triggering the cleanup */
3a2c0ba5a Rusty Russell 2014-12-08 103 kref_put(&current_rng->ref, cleanup_rng);
3a2c0ba5a Rusty Russell 2014-12-08 104 current_rng = NULL;
3a2c0ba5a Rusty Russell 2014-12-08 105 }
3a2c0ba5a Rusty Russell 2014-12-08 106
3a2c0ba5a Rusty Russell 2014-12-08 107 /* Returns ERR_PTR(), NULL or refcounted hwrng */
3a2c0ba5a Rusty Russell 2014-12-08 108 static struct hwrng *get_current_rng(void)
3a2c0ba5a Rusty Russell 2014-12-08 109 {
3a2c0ba5a Rusty Russell 2014-12-08 110 struct hwrng *rng;
3a2c0ba5a Rusty Russell 2014-12-08 111
3a2c0ba5a Rusty Russell 2014-12-08 112 if (mutex_lock_interruptible(&rng_mutex))
3a2c0ba5a Rusty Russell 2014-12-08 113 return ERR_PTR(-ERESTARTSYS);
3a2c0ba5a Rusty Russell 2014-12-08 114
3a2c0ba5a Rusty Russell 2014-12-08 115 rng = current_rng;
3a2c0ba5a Rusty Russell 2014-12-08 116 if (rng)
3a2c0ba5a Rusty Russell 2014-12-08 117 kref_get(&rng->ref);
3a2c0ba5a Rusty Russell 2014-12-08 118
3a2c0ba5a Rusty Russell 2014-12-08 119 mutex_unlock(&rng_mutex);
3a2c0ba5a Rusty Russell 2014-12-08 120 return rng;
3a2c0ba5a Rusty Russell 2014-12-08 121 }
3a2c0ba5a Rusty Russell 2014-12-08 122
3a2c0ba5a Rusty Russell 2014-12-08 123 static void put_rng(struct hwrng *rng)
3a2c0ba5a Rusty Russell 2014-12-08 124 {
3a2c0ba5a Rusty Russell 2014-12-08 125 /*
3a2c0ba5a Rusty Russell 2014-12-08 126 * Hold rng_mutex here so we serialize in case they set_current_rng
3a2c0ba5a Rusty Russell 2014-12-08 127 * on rng again immediately.
3a2c0ba5a Rusty Russell 2014-12-08 128 */
3a2c0ba5a Rusty Russell 2014-12-08 129 mutex_lock(&rng_mutex);
3a2c0ba5a Rusty Russell 2014-12-08 130 if (rng)
3a2c0ba5a Rusty Russell 2014-12-08 131 kref_put(&rng->ref, cleanup_rng);
3a2c0ba5a Rusty Russell 2014-12-08 132 mutex_unlock(&rng_mutex);
3a2c0ba5a Rusty Russell 2014-12-08 133 }
3a2c0ba5a Rusty Russell 2014-12-08 134
90ac41bd4 Herbert Xu 2014-12-23 135 static int hwrng_init(struct hwrng *rng)
844dd05fe Michael Buesch 2006-06-26 136 {
15b66cd54 Herbert Xu 2014-12-23 137 if (kref_get_unless_zero(&rng->ref))
15b66cd54 Herbert Xu 2014-12-23 138 goto skip_init;
15b66cd54 Herbert Xu 2014-12-23 139
d3cc79964 Amit Shah 2014-07-10 140 if (rng->init) {
d3cc79964 Amit Shah 2014-07-10 141 int ret;
d3cc79964 Amit Shah 2014-07-10 142
d3cc79964 Amit Shah 2014-07-10 143 ret = rng->init(rng);
d3cc79964 Amit Shah 2014-07-10 144 if (ret)
d3cc79964 Amit Shah 2014-07-10 145 return ret;
d3cc79964 Amit Shah 2014-07-10 146 }
15b66cd54 Herbert Xu 2014-12-23 147
15b66cd54 Herbert Xu 2014-12-23 148 kref_init(&rng->ref);
15b66cd54 Herbert Xu 2014-12-23 149 reinit_completion(&rng->cleanup_done);
15b66cd54 Herbert Xu 2014-12-23 150
15b66cd54 Herbert Xu 2014-12-23 151 skip_init:
d3cc79964 Amit Shah 2014-07-10 152 add_early_randomness(rng);
be4000bc4 Torsten Duwe 2014-06-14 153
0f734e6e7 Torsten Duwe 2014-06-14 154 current_quality = rng->quality ? : default_quality;
506bf0c04 Keith Packard 2015-03-18 155 if (current_quality > 1024)
506bf0c04 Keith Packard 2015-03-18 156 current_quality = 1024;
0f734e6e7 Torsten Duwe 2014-06-14 157
0f734e6e7 Torsten Duwe 2014-06-14 158 if (current_quality == 0 && hwrng_fill)
0f734e6e7 Torsten Duwe 2014-06-14 159 kthread_stop(hwrng_fill);
be4000bc4 Torsten Duwe 2014-06-14 160 if (current_quality > 0 && !hwrng_fill)
be4000bc4 Torsten Duwe 2014-06-14 161 start_khwrngd();
be4000bc4 Torsten Duwe 2014-06-14 162
844dd05fe Michael Buesch 2006-06-26 163 return 0;
844dd05fe Michael Buesch 2006-06-26 164 }
844dd05fe Michael Buesch 2006-06-26 165
844dd05fe Michael Buesch 2006-06-26 166 static int rng_dev_open(struct inode *inode, struct file *filp)
844dd05fe Michael Buesch 2006-06-26 167 {
844dd05fe Michael Buesch 2006-06-26 168 /* enforce read-only access to this chrdev */
844dd05fe Michael Buesch 2006-06-26 @169 if ((filp->f_mode & FMODE_READ) == 0)
844dd05fe Michael Buesch 2006-06-26 170 return -EINVAL;
844dd05fe Michael Buesch 2006-06-26 @171 if (filp->f_mode & FMODE_WRITE)
844dd05fe Michael Buesch 2006-06-26 172 return -EINVAL;
844dd05fe Michael Buesch 2006-06-26 173 return 0;
844dd05fe Michael Buesch 2006-06-26 174 }
844dd05fe Michael Buesch 2006-06-26 175
9996508b3 Ian Molton 2009-12-01 176 static inline int rng_get_data(struct hwrng *rng, u8 *buffer, size_t size,
9996508b3 Ian Molton 2009-12-01 177 int wait) {
9996508b3 Ian Molton 2009-12-01 178 int present;
9996508b3 Ian Molton 2009-12-01 179
9372b35e1 Rusty Russell 2014-12-08 180 BUG_ON(!mutex_is_locked(&reading_mutex));
9996508b3 Ian Molton 2009-12-01 181 if (rng->read)
9996508b3 Ian Molton 2009-12-01 182 return rng->read(rng, (void *)buffer, size, wait);
9996508b3 Ian Molton 2009-12-01 183
9996508b3 Ian Molton 2009-12-01 184 if (rng->data_present)
9996508b3 Ian Molton 2009-12-01 185 present = rng->data_present(rng, wait);
9996508b3 Ian Molton 2009-12-01 186 else
9996508b3 Ian Molton 2009-12-01 187 present = 1;
9996508b3 Ian Molton 2009-12-01 188
9996508b3 Ian Molton 2009-12-01 189 if (present)
9996508b3 Ian Molton 2009-12-01 190 return rng->data_read(rng, (u32 *)buffer);
9996508b3 Ian Molton 2009-12-01 191
9996508b3 Ian Molton 2009-12-01 192 return 0;
9996508b3 Ian Molton 2009-12-01 193 }
9996508b3 Ian Molton 2009-12-01 194
844dd05fe Michael Buesch 2006-06-26 195 static ssize_t rng_dev_read(struct file *filp, char __user *buf,
844dd05fe Michael Buesch 2006-06-26 196 size_t size, loff_t *offp)
844dd05fe Michael Buesch 2006-06-26 197 {
844dd05fe Michael Buesch 2006-06-26 198 ssize_t ret = 0;
984e976f5 Patrick McHardy 2007-11-21 199 int err = 0;
9996508b3 Ian Molton 2009-12-01 200 int bytes_read, len;
3a2c0ba5a Rusty Russell 2014-12-08 201 struct hwrng *rng;
844dd05fe Michael Buesch 2006-06-26 202
844dd05fe Michael Buesch 2006-06-26 203 while (size) {
3a2c0ba5a Rusty Russell 2014-12-08 204 rng = get_current_rng();
3a2c0ba5a Rusty Russell 2014-12-08 205 if (IS_ERR(rng)) {
3a2c0ba5a Rusty Russell 2014-12-08 206 err = PTR_ERR(rng);
844dd05fe Michael Buesch 2006-06-26 207 goto out;
9996508b3 Ian Molton 2009-12-01 208 }
3a2c0ba5a Rusty Russell 2014-12-08 209 if (!rng) {
844dd05fe Michael Buesch 2006-06-26 210 err = -ENODEV;
3a2c0ba5a Rusty Russell 2014-12-08 211 goto out;
844dd05fe Michael Buesch 2006-06-26 212 }
984e976f5 Patrick McHardy 2007-11-21 213
1ab87298c Jiri Slaby 2015-11-27 214 if (mutex_lock_interruptible(&reading_mutex)) {
1ab87298c Jiri Slaby 2015-11-27 215 err = -ERESTARTSYS;
1ab87298c Jiri Slaby 2015-11-27 216 goto out_put;
1ab87298c Jiri Slaby 2015-11-27 217 }
9996508b3 Ian Molton 2009-12-01 218 if (!data_avail) {
3a2c0ba5a Rusty Russell 2014-12-08 219 bytes_read = rng_get_data(rng, rng_buffer,
f7f154f12 Rusty Russell 2013-03-05 220 rng_buffer_size(),
9996508b3 Ian Molton 2009-12-01 @221 !(filp->f_flags & O_NONBLOCK));
893f11286 Ralph Wuerthner 2008-04-17 222 if (bytes_read < 0) {
893f11286 Ralph Wuerthner 2008-04-17 223 err = bytes_read;
9372b35e1 Rusty Russell 2014-12-08 224 goto out_unlock_reading;
9996508b3 Ian Molton 2009-12-01 225 }
9996508b3 Ian Molton 2009-12-01 226 data_avail = bytes_read;
893f11286 Ralph Wuerthner 2008-04-17 227 }
844dd05fe Michael Buesch 2006-06-26 228
9996508b3 Ian Molton 2009-12-01 229 if (!data_avail) {
9996508b3 Ian Molton 2009-12-01 230 if (filp->f_flags & O_NONBLOCK) {
9996508b3 Ian Molton 2009-12-01 231 err = -EAGAIN;
9372b35e1 Rusty Russell 2014-12-08 232 goto out_unlock_reading;
9996508b3 Ian Molton 2009-12-01 233 }
9996508b3 Ian Molton 2009-12-01 234 } else {
9996508b3 Ian Molton 2009-12-01 235 len = data_avail;
9996508b3 Ian Molton 2009-12-01 236 if (len > size)
9996508b3 Ian Molton 2009-12-01 237 len = size;
9996508b3 Ian Molton 2009-12-01 238
9996508b3 Ian Molton 2009-12-01 239 data_avail -= len;
9996508b3 Ian Molton 2009-12-01 240
9996508b3 Ian Molton 2009-12-01 241 if (copy_to_user(buf + ret, rng_buffer + data_avail,
9996508b3 Ian Molton 2009-12-01 242 len)) {
844dd05fe Michael Buesch 2006-06-26 243 err = -EFAULT;
9372b35e1 Rusty Russell 2014-12-08 244 goto out_unlock_reading;
9996508b3 Ian Molton 2009-12-01 245 }
9996508b3 Ian Molton 2009-12-01 246
9996508b3 Ian Molton 2009-12-01 247 size -= len;
9996508b3 Ian Molton 2009-12-01 248 ret += len;
844dd05fe Michael Buesch 2006-06-26 249 }
844dd05fe Michael Buesch 2006-06-26 250
9372b35e1 Rusty Russell 2014-12-08 251 mutex_unlock(&reading_mutex);
3a2c0ba5a Rusty Russell 2014-12-08 252 put_rng(rng);
9996508b3 Ian Molton 2009-12-01 253
844dd05fe Michael Buesch 2006-06-26 254 if (need_resched())
844dd05fe Michael Buesch 2006-06-26 255 schedule_timeout_interruptible(1);
9996508b3 Ian Molton 2009-12-01 256
9996508b3 Ian Molton 2009-12-01 257 if (signal_pending(current)) {
844dd05fe Michael Buesch 2006-06-26 258 err = -ERESTARTSYS;
844dd05fe Michael Buesch 2006-06-26 259 goto out;
844dd05fe Michael Buesch 2006-06-26 260 }
9996508b3 Ian Molton 2009-12-01 261 }
844dd05fe Michael Buesch 2006-06-26 262 out:
844dd05fe Michael Buesch 2006-06-26 263 return ret ? : err;
3a2c0ba5a Rusty Russell 2014-12-08 264
9372b35e1 Rusty Russell 2014-12-08 265 out_unlock_reading:
9372b35e1 Rusty Russell 2014-12-08 266 mutex_unlock(&reading_mutex);
1ab87298c Jiri Slaby 2015-11-27 267 out_put:
3a2c0ba5a Rusty Russell 2014-12-08 268 put_rng(rng);
3a2c0ba5a Rusty Russell 2014-12-08 269 goto out;
844dd05fe Michael Buesch 2006-06-26 270 }
844dd05fe Michael Buesch 2006-06-26 271
62322d255 Arjan van de Ven 2006-07-03 @272 static const struct file_operations rng_chrdev_ops = {
844dd05fe Michael Buesch 2006-06-26 @273 .owner = THIS_MODULE,
844dd05fe Michael Buesch 2006-06-26 @274 .open = rng_dev_open,
844dd05fe Michael Buesch 2006-06-26 @275 .read = rng_dev_read,
6038f373a Arnd Bergmann 2010-08-15 @276 .llseek = noop_llseek,
844dd05fe Michael Buesch 2006-06-26 277 };
844dd05fe Michael Buesch 2006-06-26 278
0daa7a0af Takashi Iwai 2015-02-02 279 static const struct attribute_group *rng_dev_groups[];
:::::: The code at line 169 was first introduced by commit
:::::: 844dd05fec172d98b0dacecd9b9e9f6595204c13 [PATCH] Add new generic HW RNG core
:::::: TO: Michael Buesch <mb@bu3sch.de>
:::::: CC: Linus Torvalds <torvalds@g5.osdl.org>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 17496 bytes --]
^ permalink raw reply
* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
From: Andy Lutomirski @ 2016-12-10 5:25 UTC (permalink / raw)
To: Eric Biggers
Cc: linux-crypto, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kernel-hardening@lists.openwall.com, Herbert Xu,
Andrew Lutomirski, Stephan Mueller
In-Reply-To: <20161209230851.GB64048@google.com>
On Fri, Dec 9, 2016 at 3:08 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> In the 4.9 kernel, virtually-mapped stacks will be supported and enabled by
> default on x86_64. This has been exposing a number of problems in which
> on-stack buffers are being passed into the crypto API, which to support crypto
> accelerators operates on 'struct page' rather than on virtual memory.
>
> Some of these problems have already been fixed, but I was wondering how many
> problems remain, so I briefly looked through all the callers of sg_set_buf() and
> sg_init_one(). Overall I found quite a few remaining problems, detailed below.
>
> The following crypto drivers initialize a scatterlist to point into an
> ahash_request, which may have been allocated on the stack with
> AHASH_REQUEST_ON_STACK():
>
> drivers/crypto/bfin_crc.c:351
> drivers/crypto/qce/sha.c:299
> drivers/crypto/sahara.c:973,988
> drivers/crypto/talitos.c:1910
These are impossible or highly unlikely on x86.
> drivers/crypto/ccp/ccp-crypto-aes-cmac.c:105,119,142
> drivers/crypto/ccp/ccp-crypto-sha.c:95,109,124
These
> drivers/crypto/qce/sha.c:325
This is impossible on x86.
>
> The following crypto drivers initialize a scatterlist to point into an
> ablkcipher_request, which may have been allocated on the stack with
> SKCIPHER_REQUEST_ON_STACK():
>
> drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
> drivers/crypto/ccp/ccp-crypto-aes.c:94
These are real, and I wish I'd known about them sooner.
>
> And these other places do crypto operations on buffers clearly on the stack:
>
> drivers/net/wireless/intersil/orinoco/mic.c:72
Ick.
> drivers/usb/wusbcore/crypto.c:264
Well, crud. I thought I had fixed this driver but I missed one case.
Will send a fix tomorrow. But I'm still unconvinced that this
hardware ever shipped.
> net/ceph/crypto.c:182
Ick.
> net/rxrpc/rxkad.c:737,1000
Well, crud. This was supposed to have been fixed in:
commit a263629da519b2064588377416e067727e2cbdf9
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sun Jun 26 14:55:24 2016 -0700
rxrpc: Avoid using stack memory in SG lists in rxkad
> security/keys/encrypted-keys/encrypted.c:500
That's a trivial one-liner. Patch coming tomorrow.
> fs/cifs/smbencrypt.c:96
Ick.
>
> Note: I almost certainly missed some, since I excluded places where the use of a
> stack buffer was not obvious to me. I also excluded AEAD algorithms since there
> isn't an AEAD_REQUEST_ON_STACK() macro (yet).
>
> The "good" news with these bugs is that on x86_64 without CONFIG_DEBUG_SG=y or
> CONFIG_DEBUG_VIRTUAL=y, you can still do virt_to_page() and then page_address()
> on a vmalloc address and get back the same address, even though you aren't
> *supposed* to be able to do this. This will make things still work for most
> people. The bad news is that if you happen to have consumed just about 1 page
> (or N pages) of your stack at the time you call the crypto API, your stack
> buffer may actually span physically non-contiguous pages, so the crypto
> algorithm will scribble over some unrelated page.
Are you sure? If it round-trips to the same virtual address, it
doesn't matter if the buffer is contiguous.
> Also, hardware crypto drivers
> which actually do operate on physical memory will break too.
Those were already broken. DMA has been illegal on the stack for
years and DMA debugging would have caught it.
>
> So I am wondering: is the best solution really to make all these crypto API
> algorithms and users use heap buffers, as opposed to something like maintaining
> a lowmem alias for the stack, or introducing a more general function to convert
> buffers (possibly in the vmalloc space) into scatterlists? And if the current
> solution is desired, who is going to fix all of these bugs and when?
The *right* solution IMO is to fix crypto to stop using scatterlists.
Scatterlists are for DMA using physical addresses, and they're
inappropriate for almost every user of them that's using them for crypto.
kiov would be much better -- it would make sense and it would be
faster.
I have a hack to make scatterlists pointing to the stack work (as long
as they're only one element), but that's seriously gross.
Herbert, how hard would it be to teach the crypto code to use a more
sensible data structure than scatterlist and to use coccinelle to fix
this stuff for real?
In the meantime, we should patch the handful of drivers that matter.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply
* Re: [PATCH 7/7] hwrng: core: Remove two unused include
From: kbuild test robot @ 2016-12-10 5:31 UTC (permalink / raw)
To: Corentin Labbe
Cc: kbuild-all, mpm, herbert, arnd, gregkh, linux-crypto,
linux-kernel, Corentin Labbe
In-Reply-To: <1481293299-21697-7-git-send-email-clabbe.montjoie@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 5431 bytes --]
Hi Corentin,
[auto build test ERROR on char-misc/char-misc-testing]
[also build test ERROR on v4.9-rc8 next-20161209]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Corentin-Labbe/hwrng-core-do-not-use-multiple-blank-lines/20161210-072632
config: i386-randconfig-i0-201649 (attached as .config)
compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
drivers/char/hw_random/core.c: In function 'rng_dev_open':
>> drivers/char/hw_random/core.c:169:11: error: dereferencing pointer to incomplete type
if ((filp->f_mode & FMODE_READ) == 0)
^
drivers/char/hw_random/core.c:169:22: error: 'FMODE_READ' undeclared (first use in this function)
if ((filp->f_mode & FMODE_READ) == 0)
^
drivers/char/hw_random/core.c:169:22: note: each undeclared identifier is reported only once for each function it appears in
drivers/char/hw_random/core.c:171:10: error: dereferencing pointer to incomplete type
if (filp->f_mode & FMODE_WRITE)
^
drivers/char/hw_random/core.c:171:21: error: 'FMODE_WRITE' undeclared (first use in this function)
if (filp->f_mode & FMODE_WRITE)
^
drivers/char/hw_random/core.c: In function 'rng_dev_read':
drivers/char/hw_random/core.c:221:11: error: dereferencing pointer to incomplete type
!(filp->f_flags & O_NONBLOCK));
^
drivers/char/hw_random/core.c:221:23: error: 'O_NONBLOCK' undeclared (first use in this function)
!(filp->f_flags & O_NONBLOCK));
^
drivers/char/hw_random/core.c:230:12: error: dereferencing pointer to incomplete type
if (filp->f_flags & O_NONBLOCK) {
^
drivers/char/hw_random/core.c: At top level:
drivers/char/hw_random/core.c:272:21: error: variable 'rng_chrdev_ops' has initializer but incomplete type
static const struct file_operations rng_chrdev_ops = {
^
drivers/char/hw_random/core.c:273:2: error: unknown field 'owner' specified in initializer
.owner = THIS_MODULE,
^
In file included from include/linux/linkage.h:6:0,
from include/linux/kernel.h:6,
from include/linux/delay.h:10,
from drivers/char/hw_random/core.c:13:
include/linux/export.h:37:30: warning: excess elements in struct initializer [enabled by default]
#define THIS_MODULE ((struct module *)0)
^
drivers/char/hw_random/core.c:273:12: note: in expansion of macro 'THIS_MODULE'
.owner = THIS_MODULE,
^
include/linux/export.h:37:30: warning: (near initialization for 'rng_chrdev_ops') [enabled by default]
#define THIS_MODULE ((struct module *)0)
^
drivers/char/hw_random/core.c:273:12: note: in expansion of macro 'THIS_MODULE'
.owner = THIS_MODULE,
^
drivers/char/hw_random/core.c:274:2: error: unknown field 'open' specified in initializer
.open = rng_dev_open,
^
drivers/char/hw_random/core.c:274:2: warning: excess elements in struct initializer [enabled by default]
drivers/char/hw_random/core.c:274:2: warning: (near initialization for 'rng_chrdev_ops') [enabled by default]
drivers/char/hw_random/core.c:275:2: error: unknown field 'read' specified in initializer
.read = rng_dev_read,
^
drivers/char/hw_random/core.c:275:2: warning: excess elements in struct initializer [enabled by default]
drivers/char/hw_random/core.c:275:2: warning: (near initialization for 'rng_chrdev_ops') [enabled by default]
drivers/char/hw_random/core.c:276:2: error: unknown field 'llseek' specified in initializer
.llseek = noop_llseek,
^
drivers/char/hw_random/core.c:276:13: error: 'noop_llseek' undeclared here (not in a function)
.llseek = noop_llseek,
^
drivers/char/hw_random/core.c:276:2: warning: excess elements in struct initializer [enabled by default]
.llseek = noop_llseek,
^
drivers/char/hw_random/core.c:276:2: warning: (near initialization for 'rng_chrdev_ops') [enabled by default]
vim +169 drivers/char/hw_random/core.c
844dd05f Michael Buesch 2006-06-26 163 return 0;
844dd05f Michael Buesch 2006-06-26 164 }
844dd05f Michael Buesch 2006-06-26 165
844dd05f Michael Buesch 2006-06-26 166 static int rng_dev_open(struct inode *inode, struct file *filp)
844dd05f Michael Buesch 2006-06-26 167 {
844dd05f Michael Buesch 2006-06-26 168 /* enforce read-only access to this chrdev */
844dd05f Michael Buesch 2006-06-26 @169 if ((filp->f_mode & FMODE_READ) == 0)
844dd05f Michael Buesch 2006-06-26 170 return -EINVAL;
844dd05f Michael Buesch 2006-06-26 171 if (filp->f_mode & FMODE_WRITE)
844dd05f Michael Buesch 2006-06-26 172 return -EINVAL;
:::::: The code at line 169 was first introduced by commit
:::::: 844dd05fec172d98b0dacecd9b9e9f6595204c13 [PATCH] Add new generic HW RNG core
:::::: TO: Michael Buesch <mb@bu3sch.de>
:::::: CC: Linus Torvalds <torvalds@g5.osdl.org>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26603 bytes --]
^ permalink raw reply
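The "dereferencing pointer to incomplete type" errors in the report above are what a C compiler emits when only a forward declaration of a struct is visible at the point of a member access. In the kernel, struct file, struct file_operations, FMODE_READ/FMODE_WRITE and noop_llseek are provided by <linux/fs.h>, so the report suggests that core.c no longer pulls that header in, directly or indirectly, once the patch is applied. The following is a minimal user-space sketch of the same situation, not kernel code: struct sketch_file, the SKETCH_MODE_* flags and sketch_open_check() are hypothetical stand-ins that mirror the read-only check in rng_dev_open().

```c
#include <assert.h>
#include <errno.h>

/* A forward declaration alone lets code pass pointers around, but the
 * compiler rejects f->mode until the full definition is visible. */
struct sketch_file;

/* Full definition. In the kernel the equivalent for struct file comes
 * from <linux/fs.h>; removing this definition reproduces the
 * "incomplete type" error for the function below. */
struct sketch_file {
	unsigned int mode;
};

/* Hypothetical flag values, standing in for FMODE_READ and FMODE_WRITE. */
#define SKETCH_MODE_READ  0x1u
#define SKETCH_MODE_WRITE 0x2u

/* Mirrors the check in rng_dev_open(): only read-only opens are allowed. */
static int sketch_open_check(const struct sketch_file *f)
{
	if ((f->mode & SKETCH_MODE_READ) == 0)
		return -EINVAL;
	if (f->mode & SKETCH_MODE_WRITE)
		return -EINVAL;
	return 0;
}
```

If the struct definition block is deleted and only the forward declaration kept, the function stops compiling in the same way as the lines flagged at core.c:169 and core.c:171.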
* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
From: Herbert Xu @ 2016-12-10 5:32 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Eric Biggers, linux-crypto, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, kernel-hardening@lists.openwall.com,
Andrew Lutomirski, Stephan Mueller
In-Reply-To: <CALCETrW=+3u3P8Xva+0ck9=fr-mD6azPtTkOQ3uQO+GoOA6FcQ@mail.gmail.com>
On Fri, Dec 09, 2016 at 09:25:38PM -0800, Andy Lutomirski wrote:
>
> > The following crypto drivers initialize a scatterlist to point into an
> > ablkcipher_request, which may have been allocated on the stack with
> > SKCIPHER_REQUEST_ON_STACK():
> >
> > drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
> > drivers/crypto/ccp/ccp-crypto-aes.c:94
>
> These are real, and I wish I'd known about them sooner.
Are you sure? Any instance of *_ON_STACK must only be used with
sync algorithms and most drivers under drivers/crypto declare
themselves as async.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
From: Herbert Xu @ 2016-12-10 5:37 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Eric Biggers, linux-crypto, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, kernel-hardening@lists.openwall.com,
Andrew Lutomirski, Stephan Mueller
In-Reply-To: <CALCETrW=+3u3P8Xva+0ck9=fr-mD6azPtTkOQ3uQO+GoOA6FcQ@mail.gmail.com>
On Fri, Dec 09, 2016 at 09:25:38PM -0800, Andy Lutomirski wrote:
>
> Herbert, how hard would it be to teach the crypto code to use a more
> sensible data structure than scatterlist and to use coccinelle to fix
> this stuff for real?
First of all we already have a sync non-SG hash interface, it's
called shash.
If we had enough sync-only users of skcipher then I would consider
adding an interface for it. However, at this point in time it
appears to make more sense to convert such users over to the async
interface rather than the other way around.
As for AEAD we never had a sync interface to begin with and I
don't think I'm going to add one.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
From: Eric Biggers @ 2016-12-10 5:55 UTC (permalink / raw)
To: Andy Lutomirski
Cc: linux-crypto, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kernel-hardening@lists.openwall.com, Herbert Xu,
Andrew Lutomirski, Stephan Mueller
In-Reply-To: <CALCETrW=+3u3P8Xva+0ck9=fr-mD6azPtTkOQ3uQO+GoOA6FcQ@mail.gmail.com>
On Fri, Dec 09, 2016 at 09:25:38PM -0800, Andy Lutomirski wrote:
> > The following crypto drivers initialize a scatterlist to point into an
> > ahash_request, which may have been allocated on the stack with
> > AHASH_REQUEST_ON_STACK():
> >
> > drivers/crypto/bfin_crc.c:351
> > drivers/crypto/qce/sha.c:299
> > drivers/crypto/sahara.c:973,988
> > drivers/crypto/talitos.c:1910
>
> These are impossible or highly unlikely on x86.
>
> > drivers/crypto/ccp/ccp-crypto-aes-cmac.c:105,119,142
> > drivers/crypto/ccp/ccp-crypto-sha.c:95,109,124
>
> These
>
> > drivers/crypto/qce/sha.c:325
>
> This is impossible on x86.
>
Thanks for looking into these. I didn't investigate who/what is likely to be
using each driver.
Of course I would not be surprised to see people want to start supporting
virtually mapped stacks on other architectures too.
> >
> > The "good" news with these bugs is that on x86_64 without CONFIG_DEBUG_SG=y or
> > CONFIG_DEBUG_VIRTUAL=y, you can still do virt_to_page() and then page_address()
> > on a vmalloc address and get back the same address, even though you aren't
> > *supposed* to be able to do this. This will make things still work for most
> > people. The bad news is that if you happen to have consumed just about 1 page
> > (or N pages) of your stack at the time you call the crypto API, your stack
> > buffer may actually span physically non-contiguous pages, so the crypto
> > algorithm will scribble over some unrelated page.
>
> Are you sure? If it round-trips to the same virtual address, it
> doesn't matter if the buffer is contiguous.
You may be right, I didn't test this. The hash_walk and blkcipher_walk code do
go page by page, but I suppose on x86_64 it would just step from one bogus
"struct page" to the adjacent one and still map it to the original virtual
address.
Eric
^ permalink raw reply
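The exchange above turns on the fact that the walk code proceeds page by page, and that a buffer on a virtually mapped stack may straddle pages that are not physically adjacent. A helper that converts an arbitrary virtual buffer into per-page pieces, as floated earlier in the thread, would first have to split the buffer at page boundaries. The following is a user-space sketch of only that arithmetic, under stated assumptions: a 4 KiB page size, and hypothetical names (struct sketch_chunk, sketch_split_by_page) that do not exist in the kernel. It does not look up struct page or build a real scatterlist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* One page-bounded piece of a buffer (hypothetical, not a kernel type). */
struct sketch_chunk {
	uintptr_t page_base;	/* address of the page holding the piece */
	size_t offset;		/* offset of the piece within that page */
	size_t len;		/* number of buffer bytes in that page */
};

/*
 * Split [addr, addr + len) into pieces that never cross a page boundary.
 * Returns the number of pieces required; at most max_chunks entries of
 * out[] are written, so a caller can size the array from a first call.
 */
static size_t sketch_split_by_page(uintptr_t addr, size_t len,
				   struct sketch_chunk *out, size_t max_chunks)
{
	size_t n = 0;

	assert(out != NULL || max_chunks == 0);

	while (len > 0) {
		size_t offset = addr % SKETCH_PAGE_SIZE;
		size_t room = SKETCH_PAGE_SIZE - offset;
		size_t take = len < room ? len : room;

		if (n < max_chunks) {
			out[n].page_base = addr - offset;
			out[n].offset = offset;
			out[n].len = take;
		}
		n++;
		addr += take;
		len -= take;
	}
	return n;
}
```

For example, a 200-byte buffer starting 4000 bytes into a page yields two pieces, 96 bytes at the end of the first page and 104 bytes at the start of the next. Each piece would then need its own page lookup, which is the step that a single-entry scatterlist built with sg_init_one() on a stack address cannot express.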
* Crypto Fixes for 4.9
From: Herbert Xu @ 2016-12-10 6:01 UTC (permalink / raw)
To: Linus Torvalds, David S. Miller, Linux Kernel Mailing List,
Linux Crypto Mailing List
In-Reply-To: <20161205063754.GA9408@gondor.apana.org.au>
Hi Linus:
This push fixes the following issues:
- Fix pointer size when caam is used with AArch64 boot loader on
AArch32 kernel.
- Fix ahash state corruption in marvell driver.
- Fix buggy algif_aead tag handling.
- Prevent mcryptd from being used with incompatible algorithms
which can cause crashes.
Please pull from
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git linus
Horia Geantă (1):
crypto: caam - fix pointer size for AArch64 boot loader, AArch32 kernel
Romain Perier (2):
crypto: marvell - Don't copy hash operation twice into the SRAM
crypto: marvell - Don't corrupt state of an STD req for re-stepped ahash
Stephan Mueller (2):
crypto: algif_aead - fix AEAD tag memory handling
crypto: algif_aead - fix uninitialized variable warning
tim (1):
crypto: mcryptd - Check mcryptd algorithm compatibility
crypto/algif_aead.c | 59 ++++++++++++++++++++++++++---------------
crypto/mcryptd.c | 19 ++++++++-----
drivers/crypto/caam/ctrl.c | 5 ++--
drivers/crypto/marvell/hash.c | 11 ++++----
4 files changed, 57 insertions(+), 37 deletions(-)
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply