* Re: [PATCH RESEND v2 1/1] crypto: atmel-sha204a - fix heap info leak on I2C transfer failure
From: Herbert Xu @ 2026-06-13 12:28 UTC
To: Lothar Rubusch
Cc: thorsten.blum, davem, nicolas.ferre, alexandre.belloni,
claudiu.beznea, ardb, krzk+dt, linux-crypto, linux-arm-kernel,
linux-kernel
In-Reply-To: <CAFXKEHYcp-0+uCA47mtDe_+LUAZucEPbDJzoh5+e3Q3R20mN9Q@mail.gmail.com>
On Sat, Jun 13, 2026 at 10:52:25AM +0200, Lothar Rubusch wrote:
> On Thu, Jun 11, 2026 at 6:59 AM Herbert Xu <herbert@gondor.apana.org.au> wrote:
> >
> > On Tue, Jun 09, 2026 at 09:47:23AM +0000, Lothar Rubusch wrote:
> > >
> > > diff --git a/drivers/crypto/atmel-sha204a.c b/drivers/crypto/atmel-sha204a.c
> > > index 4c9af737b33a..20cd915ea8a3 100644
> > > --- a/drivers/crypto/atmel-sha204a.c
> > > +++ b/drivers/crypto/atmel-sha204a.c
> > > @@ -31,10 +31,15 @@ static void atmel_sha204a_rng_done(struct atmel_i2c_work_data *work_data,
> > > struct atmel_i2c_client_priv *i2c_priv = work_data->ctx;
> > > struct hwrng *rng = areq;
> > >
> > > - if (status)
> > > + if (status) {
> > > dev_warn_ratelimited(&i2c_priv->client->dev,
> > > "i2c transaction failed (%d)\n",
> > > status);
> > > + kfree(work_data);
> > > + rng->priv = 0;
> >
> > Why is this necessary? It appears that rng_read_nonblocking already
> > zeroes rng->priv.
> >
>
> IMHO this is not the same. The patch targets the error path. If the
> `status` in `atmel_sha204a_rng_done()` is failed, then failed `work_data` is
> still assigned and `rng->priv` is not zeroed at the moment. Only a
> subsequent call to `rng_read_nonblocking()` will set `rng->priv = 0;`
Right, the rng->priv gets set on the error path prior to your patch.
But with your patch, there is no need to clear rng->priv because it
never gets set on the error path.
All I'm asking for is to remove the rng->priv = 0 because it only
causes confusion.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* i.MX95: EdgeLock Enclave secure storage
From: Fabio Estevam @ 2026-06-13 13:58 UTC
To: Pankaj Gupta
Cc: Schrempf Frieder,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Peng Fan,
Stefano Babic, Frank Li
Hi Pankaj,
First of all, thank you for your work on upstreaming the
EdgeLock Enclave (ELE) support. It is great to finally see the
ELE framework landing upstream after a long development effort.
I am currently evaluating the state of i.MX95 secure-boot and
storage-security support based on current linux-next, with the
goal of understanding what can already be achieved using
upstream software and what pieces are still under development.
From my review, it appears that the following infrastructure is
already available upstream:
- ELE/V2X mailbox support for i.MX95.
- OCOTP/ELE nvmem support for fuse access.
- Secure-enclave bindings documenting the i.MX95 ELE HSM.
However, I could not find upstream support for several
capabilities that would be useful for secure storage
deployments on i.MX95, including:
- An ELE-backed trusted-key provider for the Linux trusted key
framework.
- Integration allowing Linux to use ELE as a key-sealing/
unsealing backend.
- i.MX95-specific crypto acceleration exposed through the Linux
crypto API for dm-crypt use cases.
Are you aware of any ongoing upstream or planned development
activities in these areas, particularly for i.MX95?
Any information about the upstream roadmap, ongoing
development, or expected direction for these features would be
greatly appreciated.
Thanks again for your work and for any insights you can share.
Regards,
Fabio Estevam
* Re: [PATCH] crypto: atmel-ecc - reject hardware ECDH without a public key
From: Thorsten Blum @ 2026-06-13 14:21 UTC
To: Herbert Xu, David S. Miller, Nicolas Ferre, Alexandre Belloni,
Claudiu Beznea, Tudor Ambarus
Cc: linux-crypto, linux-arm-kernel, linux-kernel
In-Reply-To: <20260611213617.463552-2-thorsten.blum@linux.dev>
On Thu, Jun 11, 2026 at 11:36:17PM +0200, Thorsten Blum wrote:
> The hardware ECDH path in atmel_ecdh_compute_shared_secret() uses the
> private key stored in the device. However, the public key is cached only
> after atmel_ecdh_set_secret() successfully generated that private key
> for the current tfm.
>
> atmel_ecdh_generate_public_key() already rejects requests when no public
> key is cached. Add the same check to atmel_ecdh_compute_shared_secret()
> to prevent the device from using a private key that was not generated
> for the current tfm.
>
> Fixes: 11105693fa05 ("crypto: atmel-ecc - introduce Microchip / Atmel ECC driver")
> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
> ---
> drivers/crypto/atmel-ecc.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/crypto/atmel-ecc.c b/drivers/crypto/atmel-ecc.c
> index 93f219558c2f..542c8cc13a0f 100644
> --- a/drivers/crypto/atmel-ecc.c
> +++ b/drivers/crypto/atmel-ecc.c
> @@ -173,6 +173,9 @@ static int atmel_ecdh_compute_shared_secret(struct kpp_request *req)
> return crypto_kpp_compute_shared_secret(req);
> }
>
> + if (!ctx->public_key)
> + return -EINVAL;
> +
> /* must have exactly two points to be on the curve */
> if (req->src_len != ATMEL_ECC_PUBKEY_SIZE)
> return -EINVAL;
I'll need to rebase and resend this assuming [1] is applied first, as it
currently doesn't apply cleanly.
[1] https://lore.kernel.org/lkml/20260609100552.233494-3-thorsten.blum@linux.dev/
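Editor's sketch (not part of the thread): the guard discussed in the patch above can be modelled in plain C. Everything here is hypothetical and for illustration only: struct demo_ecdh_ctx, demo_compute_shared_secret(), and the DEMO_PUBKEY_SIZE value are assumed names and values, not the driver's actual definitions. The sketch assumes that a cached public key is the only evidence that the device's private key was generated for the current context.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size of the peer's public key (two curve coordinates). */
#define DEMO_PUBKEY_SIZE 64

/* Hypothetical stand-in for the per-transform context. */
struct demo_ecdh_ctx {
	const uint8_t *public_key;	/* cached only after a successful set_secret */
	int do_fallback;		/* nonzero: use the software implementation */
};

/*
 * Returns 0 when the hardware path may proceed, or -EINVAL when the
 * request must be rejected. The software fallback is not modelled; it
 * is treated as always acceptable here.
 */
static int demo_compute_shared_secret(const struct demo_ecdh_ctx *ctx,
				      size_t src_len)
{
	if (ctx->do_fallback)
		return 0;

	/* No cached public key: the device key was not generated for this ctx. */
	if (!ctx->public_key)
		return -EINVAL;

	/* The peer's public key must have exactly the expected length. */
	if (src_len != DEMO_PUBKEY_SIZE)
		return -EINVAL;

	return 0;
}
```

Placing the public-key check before the length check means a context without a key is rejected regardless of what the caller passes in.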
* Re: [PATCH] crypto: atmel-ecc - drop unused curve id from atmel_ecdh_ctx
From: Thorsten Blum @ 2026-06-13 14:23 UTC
To: Herbert Xu, David S. Miller, Nicolas Ferre, Alexandre Belloni,
Claudiu Beznea
Cc: linux-crypto, linux-arm-kernel, linux-kernel
In-Reply-To: <20260611105159.460794-3-thorsten.blum@linux.dev>
On Thu, Jun 11, 2026 at 12:52:01PM +0200, Thorsten Blum wrote:
> ->curve_id is only set once, but never used - remove it.
>
> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
> ---
> drivers/crypto/atmel-ecc.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/drivers/crypto/atmel-ecc.c b/drivers/crypto/atmel-ecc.c
> index 9da9dd6585df..93f219558c2f 100644
> --- a/drivers/crypto/atmel-ecc.c
> +++ b/drivers/crypto/atmel-ecc.c
> @@ -33,7 +33,6 @@ static struct atmel_ecc_driver_data driver_data;
> * @public_key : generated when calling set_secret(). It's the responsibility
> * of the user to not call set_secret() while
> * generate_public_key() or compute_shared_secret() are in flight.
> - * @curve_id : elliptic curve id
> * @do_fallback: true when the device doesn't support the curve or when the user
> * wants to use its own private key.
> */
> @@ -41,7 +40,6 @@ struct atmel_ecdh_ctx {
> struct i2c_client *client;
> struct crypto_kpp *fallback;
> const u8 *public_key;
> - unsigned int curve_id;
> bool do_fallback;
> };
>
> @@ -250,7 +248,6 @@ static int atmel_ecdh_init_tfm(struct crypto_kpp *tfm)
> struct crypto_kpp *fallback;
> struct atmel_ecdh_ctx *ctx = kpp_tfm_ctx(tfm);
>
> - ctx->curve_id = ECC_CURVE_NIST_P256;
> ctx->client = atmel_ecc_i2c_client_alloc();
> if (IS_ERR(ctx->client)) {
> pr_err("tfm - i2c_client binding failed\n");
I'll need to rebase and resend this assuming [1] is applied first, as it
currently doesn't apply cleanly.
[1] https://lore.kernel.org/lkml/20260609100552.233494-3-thorsten.blum@linux.dev/
* Re: [PATCH] crypto: ti - Use list_first_entry_or_null() in dthe_get_dev()
From: T Pratham @ 2026-06-13 19:30 UTC
To: Mert Seftali, Herbert Xu
Cc: David S . Miller, Dan Carpenter, linux-crypto, linux-kernel,
kernel test robot
In-Reply-To: <20260613085858.32580-1-mertsftl@gmail.com>
On 13-06-2026 14:28, Mert Seftali wrote:
> dthe_get_dev() fetches a device from the global device list with
> list_first_entry() and then checks the result for NULL. However,
> list_first_entry() never returns NULL: on an empty list it returns a
> bogus pointer computed from the list head. The NULL check is therefore
> dead code, and an empty list would be treated as a valid entry and
> moved around as if it were a real device.
>
> Use list_first_entry_or_null() so the existing NULL check works as
> intended and an empty list is handled gracefully.
>
> Fixes: 52f641bc63a4 ("crypto: ti - Add driver for DTHE V2 AES Engine (ECB, CBC)")
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Dan Carpenter <error27@gmail.com>
> Closes: https://lore.kernel.org/r/202606111933.69GGTKxr-lkp@intel.com/
> Signed-off-by: Mert Seftali <mertsftl@gmail.com>
> ---
> drivers/crypto/ti/dthev2-common.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/crypto/ti/dthev2-common.c b/drivers/crypto/ti/dthev2-common.c
> index a2ad79bec105..cc0244938267 100644
> --- a/drivers/crypto/ti/dthev2-common.c
> +++ b/drivers/crypto/ti/dthev2-common.c
> @@ -40,7 +40,7 @@ struct dthe_data *dthe_get_dev(struct dthe_tfm_ctx *ctx)
> return ctx->dev_data;
>
> spin_lock_bh(&dthe_dev_list.lock);
> - dev_data = list_first_entry(&dthe_dev_list.dev_list, struct dthe_data, list);
> + dev_data = list_first_entry_or_null(&dthe_dev_list.dev_list, struct dthe_data, list);
> if (dev_data)
> list_move_tail(&dev_data->list, &dthe_dev_list.dev_list);
> spin_unlock_bh(&dthe_dev_list.lock);
LGTM.
Reviewed-by: T Pratham <t-pratham@ti.com>
--
Regards
T Pratham <t-pratham@ti.com>
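Editor's sketch (not part of the thread): the behaviour described in the patch above can be reproduced in userspace. The list helpers below are simplified stand-ins for the kernel's <linux/list.h> macros, and struct demo_dev, struct demo_registry, and the demo_get_*() functions are hypothetical names used only for illustration. Assuming the stand-ins behave like the kernel versions, an empty list makes list_first_entry() return a non-NULL pointer computed from the list head, while list_first_entry_or_null() returns NULL.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's list helpers. */
struct list_head {
	struct list_head *next, *prev;
};

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Never NULL: on an empty list, (head)->next is the head itself. */
#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)

/* NULL on an empty list, otherwise the first entry. */
#define list_first_entry_or_null(head, type, member) \
	(list_empty(head) ? NULL : list_first_entry(head, type, member))

/* Hypothetical device and registry types, for illustration only. */
struct demo_dev {
	int id;
	struct list_head list;
};

struct demo_registry {
	long lock_placeholder;		/* stands in for a spinlock */
	struct list_head dev_list;
};

static void demo_registry_init(struct demo_registry *reg)
{
	reg->dev_list.next = &reg->dev_list;
	reg->dev_list.prev = &reg->dev_list;
}

/* Pre-patch pattern: the caller's NULL check can never trigger. */
static struct demo_dev *demo_get_unchecked(struct demo_registry *reg)
{
	return list_first_entry(&reg->dev_list, struct demo_dev, list);
}

/* Post-patch pattern: an empty list yields NULL. */
static struct demo_dev *demo_get_checked(struct demo_registry *reg)
{
	return list_first_entry_or_null(&reg->dev_list, struct demo_dev, list);
}
```

With an empty registry, demo_get_unchecked() returns a pointer derived from the registry's own list head rather than from any device, so it must not be dereferenced; demo_get_checked() returns NULL and lets the existing check do its job.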
* [PATCH v3 1/1] crypto: atmel-sha204a - fix heap info leak on I2C transfer failure
From: Lothar Rubusch @ 2026-06-13 20:20 UTC
To: thorsten.blum, herbert, davem, nicolas.ferre, alexandre.belloni,
claudiu.beznea, ardb, krzk+dt
Cc: linux-crypto, linux-arm-kernel, linux-kernel, l.rubusch
The nonblocking RNG path allocates a work_data structure to track the
state of an in-flight asynchronous I2C request. This pointer is stored
in rng->priv and later consumed by the read path once the transaction
completes.
If the underlying I2C transfer fails, the completion callback is invoked
with a non-zero status. In this case, the allocated work_data is not
usable for producing RNG output and must not remain associated with the
hwrng state.
Previously, the failure path only logged a warning but left the pointer
state uncleared, which can result in subsequent read attempts observing
stale state and interpreting it as valid completion data.
Fix this by freeing the pending work_data and returning early when the
I2C transaction reports an error. This ensures that failed requests do
not leave residual state behind that could be interpreted as valid RNG
data on later reads. rng->priv is not assigned on the error path, so no
explicit clearing is needed there; the nonblocking read path clears it
on its next call.
Fixes: da001fb651b0 ("crypto: atmel-i2c - add support for SHA204A random number generator")
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Assisted-by: Gemini:1.5 Pro [google]
Reviewed-by: Thorsten Blum <thorsten.blum@linux.dev>
---
v2 -> v3:
- remove existing error-path cleanup behavior [`rng->priv = 0;`],
update commit msg
- rebased
v1 -> v2:
- reword commit message for clarity and precision
- keep existing error-path cleanup behavior unchanged, update commit msg
drivers/crypto/atmel-sha204a.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/crypto/atmel-sha204a.c b/drivers/crypto/atmel-sha204a.c
index 4c9af737b33a..5eb76245347d 100644
--- a/drivers/crypto/atmel-sha204a.c
+++ b/drivers/crypto/atmel-sha204a.c
@@ -31,10 +31,14 @@ static void atmel_sha204a_rng_done(struct atmel_i2c_work_data *work_data,
struct atmel_i2c_client_priv *i2c_priv = work_data->ctx;
struct hwrng *rng = areq;
- if (status)
+ if (status) {
dev_warn_ratelimited(&i2c_priv->client->dev,
"i2c transaction failed (%d)\n",
status);
+ kfree(work_data);
+ atomic_dec(&i2c_priv->tfm_count);
+ return;
+ }
rng->priv = (unsigned long)work_data;
atomic_dec(&i2c_priv->tfm_count);
base-commit: 6ea0ce3a19f9c37a014099e2b0a46b27fa164564
--
2.53.0
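Editor's sketch (not part of the thread): the control flow of the v3 fix can be modelled in userspace. struct demo_work, struct demo_rng, demo_inflight, and demo_rng_done() are hypothetical stand-ins for the driver's work data, the hwrng private field, the in-flight counter, and the completion callback. The sketch assumes, as the review discussion above states, that the private pointer is only ever assigned on the success path.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-request state for an asynchronous transfer. */
struct demo_work {
	unsigned char data[32];
};

/* Hypothetical RNG handle; priv holds a completed request or 0. */
struct demo_rng {
	uintptr_t priv;
};

/* Hypothetical counter of requests still in flight. */
static int demo_inflight;

/*
 * Completion callback. On failure the request is freed and the function
 * returns before priv is assigned, so priv never points at a failed
 * request and needs no explicit reset on this path.
 */
static void demo_rng_done(struct demo_rng *rng, struct demo_work *work,
			  int status)
{
	if (status) {
		free(work);
		demo_inflight--;
		return;
	}

	rng->priv = (uintptr_t)work;
	demo_inflight--;
}
```

Because the early return skips the assignment, a later reader that finds priv equal to zero can treat it as "no completed data", which is why clearing it again in the error branch would be redundant.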
* [PATCH] crypto: s5p-sss - correct CONFIG_CRYPTO_DEV_EXYNOS_RNG macro name in comment
From: Ethan Nelson-Moore @ 2026-06-13 22:36 UTC
To: linux-crypto, linux-samsung-soc
Cc: Ethan Nelson-Moore, Krzysztof Kozlowski, Vladimir Zapolskiy,
Herbert Xu, David S. Miller
A comment in drivers/crypto/s5p-sss.c incorrectly refers to
CONFIG_EXYNOS_RNG instead of CONFIG_CRYPTO_DEV_EXYNOS_RNG. Correct it.
Discovered while searching for CONFIG_* symbols referenced in code but
not defined in any Kconfig file.
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
---
drivers/crypto/s5p-sss.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/crypto/s5p-sss.c b/drivers/crypto/s5p-sss.c
index bdda7b39af85..9bb1b1661174 100644
--- a/drivers/crypto/s5p-sss.c
+++ b/drivers/crypto/s5p-sss.c
@@ -2151,8 +2151,8 @@ static int s5p_aes_probe(struct platform_device *pdev)
/*
* Note: HASH and PRNG uses the same registers in secss, avoid
- * overwrite each other. This will drop HASH when CONFIG_EXYNOS_RNG
- * is enabled in config. We need larger size for HASH registers in
+ * overwrite each other. This will drop HASH when CONFIG_CRYPTO_DEV_EXYNOS_RNG
+ * is enabled. We need larger size for HASH registers in
* secss, current describe only AES/DES
*/
if (IS_ENABLED(CONFIG_CRYPTO_DEV_EXYNOS_HASH)) {
--
2.43.0
* [PATCH] crypto: amcc - embed pdr_uinfo as flexible array in crypto4xx_device
From: Rosen Penev @ 2026-06-13 23:45 UTC
To: linux-crypto; +Cc: Herbert Xu, David S. Miller, open list
No need to allocate and free separately.
This keeps crypto4xx_destroy_pdr dedicated to dma freeing only.
Assisted-by: opencode:big-pickle
Signed-off-by: Rosen Penev <rosenp@gmail.com>
---
drivers/crypto/amcc/crypto4xx_core.c | 12 +-----------
drivers/crypto/amcc/crypto4xx_core.h | 3 ++-
2 files changed, 3 insertions(+), 12 deletions(-)
diff --git a/drivers/crypto/amcc/crypto4xx_core.c b/drivers/crypto/amcc/crypto4xx_core.c
index 001da785af07..ea1e40b3184b 100644
--- a/drivers/crypto/amcc/crypto4xx_core.c
+++ b/drivers/crypto/amcc/crypto4xx_core.c
@@ -171,14 +171,6 @@ static u32 crypto4xx_build_pdr(struct crypto4xx_device *dev)
if (!dev->pdr)
return -ENOMEM;
- dev->pdr_uinfo = kzalloc_objs(struct pd_uinfo, PPC4XX_NUM_PD);
- if (!dev->pdr_uinfo) {
- dma_free_coherent(dev->core_dev->device,
- sizeof(struct ce_pd) * PPC4XX_NUM_PD,
- dev->pdr,
- dev->pdr_pa);
- return -ENOMEM;
- }
dev->shadow_sa_pool = dma_alloc_coherent(dev->core_dev->device,
sizeof(union shadow_sa_buf) * PPC4XX_NUM_PD,
&dev->shadow_sa_pool_pa,
@@ -226,8 +218,6 @@ static void crypto4xx_destroy_pdr(struct crypto4xx_device *dev)
dma_free_coherent(dev->core_dev->device,
sizeof(struct sa_state_record) * PPC4XX_NUM_PD,
dev->shadow_sr_pool, dev->shadow_sr_pool_pa);
-
- kfree(dev->pdr_uinfo);
}
static u32 crypto4xx_get_pd_from_pdr_nolock(struct crypto4xx_device *dev)
@@ -1247,7 +1237,7 @@ static int crypto4xx_probe(struct platform_device *ofdev)
dev_set_drvdata(dev, core_dev);
core_dev->ofdev = ofdev;
core_dev->dev = devm_kzalloc(
- &ofdev->dev, sizeof(struct crypto4xx_device), GFP_KERNEL);
+ &ofdev->dev, struct_size(core_dev->dev, pdr_uinfo, PPC4XX_NUM_PD), GFP_KERNEL);
if (!core_dev->dev)
return -ENOMEM;
diff --git a/drivers/crypto/amcc/crypto4xx_core.h b/drivers/crypto/amcc/crypto4xx_core.h
index 66a95733c86d..bd4a286514a4 100644
--- a/drivers/crypto/amcc/crypto4xx_core.h
+++ b/drivers/crypto/amcc/crypto4xx_core.h
@@ -93,11 +93,12 @@ struct crypto4xx_device {
u32 gdr_head;
u32 sdr_tail;
u32 sdr_head;
- struct pd_uinfo *pdr_uinfo;
struct list_head alg_list; /* List of algorithm supported
by this device */
struct ratelimit_state aead_ratelimit;
bool is_revb;
+
+ struct pd_uinfo pdr_uinfo[];
};
struct crypto4xx_core_device {
--
2.54.0
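Editor's sketch (not part of the thread): the single-allocation idea behind the patch above can be shown with a C99 flexible array member in userspace. struct demo_device, struct demo_uinfo, DEMO_NUM_PD, and the helper names are hypothetical; demo_device_size() is a hand-written approximation of what the kernel's struct_size() helper computes, including saturation on overflow, and is not the kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical number of descriptors, for illustration only. */
#define DEMO_NUM_PD 8

/* Hypothetical per-descriptor bookkeeping. */
struct demo_uinfo {
	uint32_t state;
	void *cookie;
};

/* The trailing array is allocated together with the device itself. */
struct demo_device {
	uint32_t head;
	uint32_t tail;
	struct demo_uinfo uinfo[];	/* flexible array member, must be last */
};

/* Header plus count elements, saturating to SIZE_MAX on overflow. */
static size_t demo_device_size(size_t count)
{
	if (count > (SIZE_MAX - sizeof(struct demo_device)) /
		    sizeof(struct demo_uinfo))
		return SIZE_MAX;

	return sizeof(struct demo_device) + count * sizeof(struct demo_uinfo);
}

/* One zeroed allocation; a single free() later releases everything. */
static struct demo_device *demo_device_alloc(size_t count)
{
	size_t size = demo_device_size(count);

	if (size == SIZE_MAX)
		return NULL;

	return calloc(1, size);
}
```

The design trade-off is that the element count is fixed at allocation time; in exchange there is one allocation to check and no separate pointer to free on the teardown path.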
* Re: [PATCH] crypto: s5p-sss - correct CONFIG_CRYPTO_DEV_EXYNOS_RNG macro name in comment
From: Eric Biggers @ 2026-06-14 0:50 UTC
To: Ethan Nelson-Moore
Cc: linux-crypto, linux-samsung-soc, Krzysztof Kozlowski,
Vladimir Zapolskiy, Herbert Xu, David S. Miller
In-Reply-To: <20260613223648.119694-1-enelsonmoore@gmail.com>
On Sat, Jun 13, 2026 at 03:36:47PM -0700, Ethan Nelson-Moore wrote:
> A comment in drivers/crypto/s5p-sss.c incorrectly refers to
> CONFIG_EXYNOS_RNG instead of CONFIG_CRYPTO_DEV_EXYNOS_RNG. Correct it.
>
> Discovered while searching for CONFIG_* symbols referenced in code but
> not defined in any Kconfig file.
>
> Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
> ---
> drivers/crypto/s5p-sss.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/crypto/s5p-sss.c b/drivers/crypto/s5p-sss.c
> index bdda7b39af85..9bb1b1661174 100644
> --- a/drivers/crypto/s5p-sss.c
> +++ b/drivers/crypto/s5p-sss.c
> @@ -2151,8 +2151,8 @@ static int s5p_aes_probe(struct platform_device *pdev)
>
> /*
> * Note: HASH and PRNG uses the same registers in secss, avoid
> - * overwrite each other. This will drop HASH when CONFIG_EXYNOS_RNG
> - * is enabled in config. We need larger size for HASH registers in
> + * overwrite each other. This will drop HASH when CONFIG_CRYPTO_DEV_EXYNOS_RNG
> + * is enabled. We need larger size for HASH registers in
> * secss, current describe only AES/DES
> */
> if (IS_ENABLED(CONFIG_CRYPTO_DEV_EXYNOS_HASH)) {
CONFIG_CRYPTO_DEV_EXYNOS_RNG was already removed by
https://lore.kernel.org/linux-crypto/20260531175932.32171-1-ebiggers@kernel.org/
I didn't want to touch this comment which is nonsense anyway. But if
you're going to try to update it, it should be updated to correctly
explain that the driver is working around broken devicetree bindings.
- Eric
* [PATCH v2] lib/raid/xor: x86: Add AVX-512 optimized xor_gen()
From: Eric Biggers @ 2026-06-14 1:03 UTC
To: Andrew Morton, linux-kernel
Cc: Christoph Hellwig, linux-crypto, x86, Eric Biggers, David Laight,
linux-raid
Add an implementation of xor_gen() using AVX-512.
It uses 512-bit vectors, i.e. ZMM registers. It also uses the
vpternlogq instruction to do three-input XORs when applicable.
It's enabled on x86_64 CPUs that have AVX512F && !PREFER_YMM. In
practice that means:
- AMD Zen 4 and later (client and server)
- Intel Sapphire Rapids and later (server)
- Intel Rocket Lake (client)
- Intel Nova Lake and later (client)
The !PREFER_YMM condition excludes the older AVX-512 implementations in
Intel Skylake Server and Intel Ice Lake. They could run this code, but
they're known to have overly-eager downclocking when ZMM registers are
used. This is the same policy that the crypto and CRC code uses.
Benchmark on AMD Ryzen 9 9950X (Zen 5):
    src_cnt    avx           avx512        Improvement
    =======    ==========    ==========    ===========
    1          56353 MB/s    75388 MB/s    33%
    2          54274 MB/s    68409 MB/s    26%
    3          44649 MB/s    64042 MB/s    43%
    4          41315 MB/s    55002 MB/s    33%
Note: for now I omitted the cpu_has_xfeatures() check that the AVX-512
optimized crypto and CRC code does, since it's not implemented on
User-Mode Linux and it's never been present in the RAID6 code either.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
Changed in v2:
- Fixed build on UML
- Reworked the implementation
lib/raid/xor/Makefile | 2 +-
lib/raid/xor/x86/xor-avx512.c | 121 ++++++++++++++++++++++++++++++++++
lib/raid/xor/x86/xor_arch.h | 26 ++++----
3 files changed, 137 insertions(+), 12 deletions(-)
create mode 100644 lib/raid/xor/x86/xor-avx512.c
diff --git a/lib/raid/xor/Makefile b/lib/raid/xor/Makefile
index 4d633dfd5b90..4af945861a51 100644
--- a/lib/raid/xor/Makefile
+++ b/lib/raid/xor/Makefile
@@ -26,11 +26,11 @@ xor-$(CONFIG_ALTIVEC) += powerpc/xor_vmx.o powerpc/xor_vmx_glue.o
xor-$(CONFIG_RISCV_ISA_V) += riscv/xor.o riscv/xor-glue.o
xor-$(CONFIG_SPARC32) += sparc/xor-sparc32.o
xor-$(CONFIG_SPARC64) += sparc/xor-sparc64.o sparc/xor-sparc64-glue.o
xor-$(CONFIG_S390) += s390/xor.o
xor-$(CONFIG_X86_32) += x86/xor-avx.o x86/xor-sse.o x86/xor-mmx.o
-xor-$(CONFIG_X86_64) += x86/xor-avx.o x86/xor-sse.o
+xor-$(CONFIG_X86_64) += x86/xor-avx.o x86/xor-sse.o x86/xor-avx512.o
obj-y += tests/
CFLAGS_arm/xor-neon.o += $(CC_FLAGS_FPU)
CFLAGS_REMOVE_arm/xor-neon.o += $(CC_FLAGS_NO_FPU)
diff --git a/lib/raid/xor/x86/xor-avx512.c b/lib/raid/xor/x86/xor-avx512.c
new file mode 100644
index 000000000000..87b981d74c90
--- /dev/null
+++ b/lib/raid/xor/x86/xor-avx512.c
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * AVX-512 optimized implementation of xor_gen()
+ *
+ * Copyright 2026 Google LLC
+ */
+
+#include <linux/types.h>
+#include <asm/fpu/api.h>
+#include "xor_impl.h"
+#include "xor_arch.h"
+
+/*
+ * Implementation notes:
+ *
+ * Unrolling by the number of buffers (2-5) is very important.
+ *
+ * Unrolling by length is less important, especially when using register-indexed
+ * addressing with negative indices from the end of the buffers. That approach
+ * results in just two loop control instructions being needed per iteration,
+ * regardless of the number of buffers.
+ *
+ * In fact, benchmarks showed that the 2 and 3 buffer cases require only 2x
+ * unrolling by length, while the 4 and 5 buffer cases don't require any
+ * unrolling by length. Benchmarks also showed that the register-indexed
+ * addressing isn't a bottleneck either; i.e., we can't do any better by
+ * incrementing the pointers as we go along, even with more unrolling.
+ */
+
+static void xor_avx512_2(long bytes, u8 *p0, const u8 *p1)
+{
+ long i = -bytes;
+
+ asm volatile("1: vmovdqa64 (%0,%1), %%zmm0\n"
+ "vmovdqa64 64(%0,%1), %%zmm1\n"
+ "vpxorq (%0,%2), %%zmm0, %%zmm0\n"
+ "vpxorq 64(%0,%2), %%zmm1, %%zmm1\n"
+ "vmovdqa64 %%zmm0, (%0,%1)\n"
+ "vmovdqa64 %%zmm1, 64(%0,%1)\n"
+ "add $128, %0\n"
+ "jnz 1b\n"
+ : "+&r"(i)
+ : "r"(p0 + bytes), "r"(p1 + bytes)
+ : "memory", "cc");
+}
+
+static void xor_avx512_3(long bytes, u8 *p0, const u8 *p1, const u8 *p2)
+{
+ long i = -bytes;
+
+ asm volatile("1: vmovdqa64 (%0,%1), %%zmm0\n"
+ "vmovdqa64 64(%0,%1), %%zmm1\n"
+ "vmovdqa64 (%0,%2), %%zmm2\n"
+ "vmovdqa64 64(%0,%2), %%zmm3\n"
+ "vpternlogq $0x96, (%0,%3), %%zmm2, %%zmm0\n"
+ "vpternlogq $0x96, 64(%0,%3), %%zmm3, %%zmm1\n"
+ "vmovdqa64 %%zmm0, (%0,%1)\n"
+ "vmovdqa64 %%zmm1, 64(%0,%1)\n"
+ "add $128, %0\n"
+ "jnz 1b\n"
+ : "+&r"(i)
+ : "r"(p0 + bytes), "r"(p1 + bytes), "r"(p2 + bytes)
+ : "memory", "cc");
+}
+
+static void xor_avx512_4(long bytes, u8 *p0, const u8 *p1, const u8 *p2,
+ const u8 *p3)
+{
+ long i = -bytes;
+
+ asm volatile("1: vmovdqa64 (%0,%1), %%zmm0\n"
+ "vmovdqa64 (%0,%2), %%zmm1\n"
+ "vpxorq (%0,%3), %%zmm0, %%zmm0\n"
+ "vpternlogq $0x96, (%0,%4), %%zmm1, %%zmm0\n"
+ "vmovdqa64 %%zmm0, (%0,%1)\n"
+ "add $64, %0\n"
+ "jnz 1b\n"
+ : "+&r"(i)
+ : "r"(p0 + bytes), "r"(p1 + bytes), "r"(p2 + bytes),
+ "r"(p3 + bytes)
+ : "memory", "cc");
+}
+
+static void xor_avx512_5(long bytes, u8 *p0, const u8 *p1, const u8 *p2,
+ const u8 *p3, const u8 *p4)
+{
+ long i = -bytes;
+
+ asm volatile("1: vmovdqa64 (%0,%1), %%zmm0\n"
+ "vmovdqa64 (%0,%2), %%zmm1\n"
+ "vpternlogq $0x96, (%0,%3), %%zmm1, %%zmm0\n"
+ "vmovdqa64 (%0,%4), %%zmm1\n"
+ "vpternlogq $0x96, (%0,%5), %%zmm1, %%zmm0\n"
+ "vmovdqa64 %%zmm0, (%0,%1)\n"
+ "add $64, %0\n"
+ "jnz 1b\n"
+ : "+&r"(i)
+ : "r"(p0 + bytes), "r"(p1 + bytes), "r"(p2 + bytes),
+ "r"(p3 + bytes), "r"(p4 + bytes)
+ : "memory", "cc");
+}
+
+DO_XOR_BLOCKS(avx512_inner, xor_avx512_2, xor_avx512_3, xor_avx512_4,
+ xor_avx512_5);
+
+/*
+ * Preconditions: bytes is a nonzero multiple of 512, and all buffers are
+ * 64-byte aligned.
+ */
+static void xor_gen_avx512(void *dest, void **srcs, unsigned int src_cnt,
+ unsigned int bytes)
+{
+ kernel_fpu_begin();
+ xor_gen_avx512_inner(dest, srcs, src_cnt, bytes);
+ kernel_fpu_end();
+}
+
+struct xor_block_template xor_block_avx512 = {
+ .name = "avx512",
+ .xor_gen = xor_gen_avx512,
+};
diff --git a/lib/raid/xor/x86/xor_arch.h b/lib/raid/xor/x86/xor_arch.h
index 99fe85a213c6..b5d49376fc97 100644
--- a/lib/raid/xor/x86/xor_arch.h
+++ b/lib/raid/xor/x86/xor_arch.h
@@ -4,26 +4,30 @@
extern struct xor_block_template xor_block_pII_mmx;
extern struct xor_block_template xor_block_p5_mmx;
extern struct xor_block_template xor_block_sse;
extern struct xor_block_template xor_block_sse_pf64;
extern struct xor_block_template xor_block_avx;
+extern struct xor_block_template xor_block_avx512;
-/*
- * When SSE is available, use it as it can write around L2. We may also be able
- * to load into the L1 only depending on how the cpu deals with a load to a line
- * that is being prefetched.
- *
- * When AVX2 is available, force using it as it is better by all measures.
- *
- * 32-bit without MMX can fall back to the generic routines.
- */
static __always_inline void __init arch_xor_init(void)
{
- if (boot_cpu_has(X86_FEATURE_AVX) &&
- boot_cpu_has(X86_FEATURE_OSXSAVE)) {
+ if (IS_ENABLED(CONFIG_X86_64) && boot_cpu_has(X86_FEATURE_AVX512F) &&
+ boot_cpu_has(X86_FEATURE_OSXSAVE) &&
+ !boot_cpu_has(X86_FEATURE_PREFER_YMM)) {
+ /* AVX-512 will be the best; no need to try others. */
+ /* !PREFER_YMM excludes CPUs with overly-eager downclocking. */
+ xor_force(&xor_block_avx512);
+ } else if (boot_cpu_has(X86_FEATURE_AVX) &&
+ boot_cpu_has(X86_FEATURE_OSXSAVE)) {
+ /* AVX will be the best; no need to try others. */
xor_force(&xor_block_avx);
} else if (IS_ENABLED(CONFIG_X86_64) || boot_cpu_has(X86_FEATURE_XMM)) {
+ /*
+ * When SSE is available, use it as it can write around L2. We
+ * may also be able to load into the L1 only depending on how
+ * the cpu deals with a load to a line that is being prefetched.
+ */
xor_register(&xor_block_sse);
xor_register(&xor_block_sse_pf64);
} else if (boot_cpu_has(X86_FEATURE_MMX)) {
xor_register(&xor_block_pII_mmx);
xor_register(&xor_block_p5_mmx);
base-commit: 2b07ea76fd28989bde5993532d7a943a6f90e246
--
2.54.0
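Editor's sketch (not part of the thread): two ideas from the patch above can be illustrated in portable C without any AVX-512 hardware. First, the immediate 0x96 used with vpternlogq is the truth table of a three-input XOR. Second, indexing from the end of each buffer with a negative counter lets a single increment serve as both the address update and the loop condition. ternlog64() and xor3_negidx() are hypothetical helper names; the bit-by-bit emulation only models the instruction's per-bit lookup on one 64-bit lane and is far slower than the real instruction.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Emulate a ternary-logic lookup on one 64-bit lane: for every bit
 * position, the bits of a, b and c form a 3-bit index (a is the most
 * significant) that selects one bit of the 8-bit truth table imm.
 */
static uint64_t ternlog64(uint8_t imm, uint64_t a, uint64_t b, uint64_t c)
{
	uint64_t out = 0;

	for (int bit = 0; bit < 64; bit++) {
		unsigned int idx = (unsigned int)((((a >> bit) & 1) << 2) |
						  (((b >> bit) & 1) << 1) |
						  ((c >> bit) & 1));

		out |= (uint64_t)((imm >> idx) & 1) << bit;
	}
	return out;
}

/*
 * p0[k] ^= p1[k] ^ p2[k] for k in [0, words), written with pointers to
 * the end of each buffer and a counter that runs from -words up to 0.
 * Precondition: words is nonzero.
 */
static void xor3_negidx(size_t words, uint64_t *p0, const uint64_t *p1,
			const uint64_t *p2)
{
	uint64_t *e0 = p0 + words;
	const uint64_t *e1 = p1 + words;
	const uint64_t *e2 = p2 + words;
	ptrdiff_t i = -(ptrdiff_t)words;

	do {
		/* 0x96 has bits 1, 2, 4 and 7 set: the odd-parity indices. */
		e0[i] = ternlog64(0x96, e0[i], e1[i], e2[i]);
	} while (++i != 0);
}
```

Feeding the canonical patterns 0xF0, 0xCC and 0xAA through ternlog64() reproduces the immediate itself, which is the usual way to derive such a constant: 0xF0 ^ 0xCC ^ 0xAA equals 0x96.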
* Re: [PATCH] crypto: amcc - embed pdr_uinfo as flexible array in crypto4xx_device
From: Rosen Penev @ 2026-06-14 1:27 UTC
To: linux-crypto; +Cc: Herbert Xu, David S. Miller, open list
In-Reply-To: <20260613234559.20934-1-rosenp@gmail.com>
On Sat, Jun 13, 2026 at 4:46 PM Rosen Penev <rosenp@gmail.com> wrote:
>
> No need to allocate and free separately.
On further review, it makes more sense to change to a static array.
>
> This keeps crypto4xx_destroy_pdr dedicated to dma freeing only.
>
> Assisted-by: opencode:big-pickle
> Signed-off-by: Rosen Penev <rosenp@gmail.com>
> ---
> drivers/crypto/amcc/crypto4xx_core.c | 12 +-----------
> drivers/crypto/amcc/crypto4xx_core.h | 3 ++-
> 2 files changed, 3 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/crypto/amcc/crypto4xx_core.c b/drivers/crypto/amcc/crypto4xx_core.c
> index 001da785af07..ea1e40b3184b 100644
> --- a/drivers/crypto/amcc/crypto4xx_core.c
> +++ b/drivers/crypto/amcc/crypto4xx_core.c
> @@ -171,14 +171,6 @@ static u32 crypto4xx_build_pdr(struct crypto4xx_device *dev)
> if (!dev->pdr)
> return -ENOMEM;
>
> - dev->pdr_uinfo = kzalloc_objs(struct pd_uinfo, PPC4XX_NUM_PD);
> - if (!dev->pdr_uinfo) {
> - dma_free_coherent(dev->core_dev->device,
> - sizeof(struct ce_pd) * PPC4XX_NUM_PD,
> - dev->pdr,
> - dev->pdr_pa);
> - return -ENOMEM;
> - }
> dev->shadow_sa_pool = dma_alloc_coherent(dev->core_dev->device,
> sizeof(union shadow_sa_buf) * PPC4XX_NUM_PD,
> &dev->shadow_sa_pool_pa,
> @@ -226,8 +218,6 @@ static void crypto4xx_destroy_pdr(struct crypto4xx_device *dev)
> dma_free_coherent(dev->core_dev->device,
> sizeof(struct sa_state_record) * PPC4XX_NUM_PD,
> dev->shadow_sr_pool, dev->shadow_sr_pool_pa);
> -
> - kfree(dev->pdr_uinfo);
> }
>
> static u32 crypto4xx_get_pd_from_pdr_nolock(struct crypto4xx_device *dev)
> @@ -1247,7 +1237,7 @@ static int crypto4xx_probe(struct platform_device *ofdev)
> dev_set_drvdata(dev, core_dev);
> core_dev->ofdev = ofdev;
> core_dev->dev = devm_kzalloc(
> - &ofdev->dev, sizeof(struct crypto4xx_device), GFP_KERNEL);
> + &ofdev->dev, struct_size(core_dev->dev, pdr_uinfo, PPC4XX_NUM_PD), GFP_KERNEL);
> if (!core_dev->dev)
> return -ENOMEM;
>
> diff --git a/drivers/crypto/amcc/crypto4xx_core.h b/drivers/crypto/amcc/crypto4xx_core.h
> index 66a95733c86d..bd4a286514a4 100644
> --- a/drivers/crypto/amcc/crypto4xx_core.h
> +++ b/drivers/crypto/amcc/crypto4xx_core.h
> @@ -93,11 +93,12 @@ struct crypto4xx_device {
> u32 gdr_head;
> u32 sdr_tail;
> u32 sdr_head;
> - struct pd_uinfo *pdr_uinfo;
> struct list_head alg_list; /* List of algorithm supported
> by this device */
> struct ratelimit_state aead_ratelimit;
> bool is_revb;
> +
> + struct pd_uinfo pdr_uinfo[];
> };
>
> struct crypto4xx_core_device {
> --
> 2.54.0
>
* [PATCH] crypto: amcc: move ioremapping up
From: Rosen Penev @ 2026-06-14 1:29 UTC
To: linux-crypto; +Cc: Herbert Xu, David S. Miller, open list
There's no need for devm_platform_ioremap_resource() to be called so
late in probe. Moving it up allows a direct return on failure instead
of a goto to the cleanup labels. Also, remove the error message, as the
function already reports failures itself. No need to duplicate.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
---
drivers/crypto/amcc/crypto4xx_core.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/crypto/amcc/crypto4xx_core.c b/drivers/crypto/amcc/crypto4xx_core.c
index 001da785af07..0271b5e4d923 100644
--- a/drivers/crypto/amcc/crypto4xx_core.c
+++ b/drivers/crypto/amcc/crypto4xx_core.c
@@ -1251,6 +1251,10 @@ static int crypto4xx_probe(struct platform_device *ofdev)
if (!core_dev->dev)
return -ENOMEM;
+ core_dev->dev->ce_base = devm_platform_ioremap_resource(ofdev, 0);
+ if (IS_ERR(core_dev->dev->ce_base))
+ return PTR_ERR(core_dev->dev->ce_base);
+
/*
* Older version of 460EX/GT have a hardware bug.
* Hence they do not support H/W based security intr coalescing
@@ -1286,13 +1290,6 @@ static int crypto4xx_probe(struct platform_device *ofdev)
tasklet_init(&core_dev->tasklet, crypto4xx_bh_tasklet_cb,
(unsigned long) dev);
- core_dev->dev->ce_base = devm_platform_ioremap_resource(ofdev, 0);
- if (IS_ERR(core_dev->dev->ce_base)) {
- dev_err(&ofdev->dev, "failed to ioremap resource");
- rc = PTR_ERR(core_dev->dev->ce_base);
- goto err_build_sdr;
- }
-
/* Register for Crypto isr, Crypto Engine IRQ */
core_dev->irq = platform_get_irq(ofdev, 0);
if (core_dev->irq < 0) {
--
2.54.0
* Re: [PATCH] crypto: s5p-sss - correct CONFIG_CRYPTO_DEV_EXYNOS_RNG macro name in comment
From: Ethan Nelson-Moore @ 2026-06-14 1:36 UTC
To: Eric Biggers
Cc: linux-crypto, linux-samsung-soc, Krzysztof Kozlowski,
Vladimir Zapolskiy, Herbert Xu, David S. Miller
In-Reply-To: <20260614005044.GA1808@sol>
Hi, Eric,
On Sat, Jun 13, 2026 at 5:52 PM Eric Biggers <ebiggers@kernel.org> wrote:
> CONFIG_CRYPTO_DEV_EXYNOS_RNG was already removed by
> https://lore.kernel.org/linux-crypto/20260531175932.32171-1-ebiggers@kernel.org/
Thanks for letting me know.
> I didn't want to touch this comment which is nonsense anyway. But if
> you're going to try to update it, it should be updated to correctly
> explain that the driver is working around broken devicetree bindings.
Yes, that comment definitely needs rewriting - I had no idea that is
what it is referring to.
Ethan
* Re: [RFC] ML-KEM (FIPS 203) implementation with reusable decapsulation pool
From: kstzavertaylo @ 2026-06-14 7:50 UTC
To: Eric Biggers; +Cc: linux-crypto, herbert
In-Reply-To: <20260612183240.GA2157807@google.com>
Thank you for the detailed feedback and for outlining the historical
context regarding pools in the crypto subsystem.
I understand your point of view and the preference for keeping the
core implementation simple with per-operation allocations (or
caller-provided workspaces), especially given the lack of precedent
for pool-based designs in lib/crypto. My approach with the reusable
decapsulation pool was driven by a focus on constrained environments
where minimizing stack usage and relying on reusable preallocated
working memory during the hot path can be particularly valuable.
However, I fully agree that concrete data is needed to properly
evaluate the trade-offs.
I see your point regarding preallocated workspaces and caller-managed
caching. One of the goals of my prototype was to explore a design
where decapsulation operates on reusable preallocated contexts rather
than per-call working memory, primarily to reduce stack requirements
and move memory management into an initialization phase. I need to
analyze more carefully how much of this can already be achieved
through a caller-provided workspace model and whether the additional
complexity of a dedicated pool is actually justified.
I am currently working on benchmarks that compare stack consumption,
allocation behavior, memory footprint, and performance between the
different approaches. Once I have solid numbers, I will share the
results and my conclusions.
I also appreciate the clarification regarding KPP. My original
prototype used KPP because it appeared to be the closest existing
interface for key establishment, but I am not specifically attached to
that approach and will spend some time evaluating how the same ideas
could fit into the lib/crypto model as well. In the meantime, I will
also look into how the pre-allocated workspace support you suggested
could be integrated.
Best regards,
K. Zavertailo
On Fri, Jun 12, 2026 at 9:32 PM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Fri, Jun 12, 2026 at 05:14:54PM +0300, kstzavertaylo wrote:
> > Thank you for the detailed reply and for pointing me to the existing
> > ML-KEM/X-Wing patchset. I spent some time reviewing the implementation
> > to better understand the design choices and how they compare to the
> > approach I took in my own work.
> >
> > After reviewing the patchset, I can see several strengths in the
> > implementation. It integrates cleanly into the existing lib/crypto
> > infrastructure, reuses kernel cryptographic primitives, avoids large
> > stack allocations, and includes KUnit-based validation. The
> > implementation also appears intentionally compact and well aligned
> > with existing kernel conventions.
> >
> > While reviewing the implementation, I noticed that decapsulation
> > allocates a temporary workspace for each operation. This is one of the
> > areas where my design diverged, which is what originally motivated the
> > reusable pool approach.
> >
> > My implementation was developed with a somewhat different goal in
> > mind. I experimented with a reusable decapsulation workspace model
> > where memory is allocated during key initialization and then reused
> > across subsequent decapsulation operations. The main motivation was
> > reducing allocation frequency and minimizing both stack usage and
> > repeated memory management during decapsulation.
> >
> > As a result, the implementation avoids allocations during
> > decapsulation entirely by reusing preallocated workspaces associated
> > with the key context. My original hypothesis was that moving memory
> > allocation to key initialization, thereby eliminating allocations from
> > the decapsulation path, could reduce allocation overhead during
> > repeated decapsulation operations and be beneficial in environments
> > where allocation activity is considered undesirable.
>
> In my ML-KEM code, all the decapsulation memory is consolidated into
> struct mlkem_decap_workspace. It would be straightforward to support
> the caller providing a pre-allocated workspace.
>
> In the case of X-Wing, we could also support pre-expanding the
> decapsulation key.
>
> It just depends on what is actually going to be needed by the kernel
> feature(s) that are going to use this. Which we don't really know yet.
>
> We do know that it hasn't been found to be useful for the crypto
> subsystem to provide pools for any other algorithm in the kernel, for a
> variety of reasons. Usually callers can just allocate per-operation, or
> they have some sort of object (inode, block device, socket, etc.) that's
> a natural place for them to cache whatever they need anyway. In the
> rare cases where some sort of pool is needed it's implemented in the
> caller, optimized for the particular use case. So I think there's a
> good chance your pool idea is going off on the wrong track.
>
> > Another difference is the integration level. My prototype explored
> > direct integration through the KPP interface, whereas the patchset
> > focuses on providing a reusable cryptographic library component within
> > lib/crypto. These approaches address somewhat different layers of the
> > kernel crypto stack.
>
> We don't need crypto_kpp support, as it's much more complex and harder
> to use than the crypto library
> (https://docs.kernel.org/crypto/libcrypto.html). Also it seems it's not
> really possible anyway, since crypto_kpp is an old design that works for
> Diffie-Hellman but not KEMs.
>
> - Eric
^ permalink raw reply
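The trade-off discussed in this thread (per-operation allocation versus a caller-provided workspace) can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions: `demo_workspace`, `demo_decap_ws()` and `demo_decap()` are hypothetical names, the "computation" is a placeholder XOR fold rather than ML-KEM, and nothing here reflects the actual layout of struct mlkem_decap_workspace.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scratch area; a real workspace would hold the temporaries. */
struct demo_workspace {
	uint8_t scratch[1024];
};

/*
 * Core routine: works only on caller-provided scratch memory and never
 * allocates, so a caller can preallocate one workspace and reuse it.
 */
static int demo_decap_ws(struct demo_workspace *ws, const uint8_t *ct,
			 size_t ct_len, uint8_t out[32])
{
	size_t i;

	if (ct_len > sizeof(ws->scratch))
		return -1;
	memcpy(ws->scratch, ct, ct_len);

	/* Placeholder computation (NOT ML-KEM): XOR-fold the input. */
	memset(out, 0, 32);
	for (i = 0; i < ct_len; i++)
		out[i % 32] ^= ws->scratch[i];

	/* Clear the scratch; real code would use a wipe that can't be elided. */
	memset(ws->scratch, 0, sizeof(ws->scratch));
	return 0;
}

/* Convenience wrapper: allocate a workspace per operation, then free it. */
static int demo_decap(const uint8_t *ct, size_t ct_len, uint8_t out[32])
{
	struct demo_workspace *ws = malloc(sizeof(*ws));
	int ret;

	if (!ws)
		return -1;
	ret = demo_decap_ws(ws, ct, ct_len, out);
	free(ws);
	return ret;
}
```

With this split, a caller that already owns a long-lived object can cache a workspace there and call the `_ws` variant, while everyone else uses the allocating wrapper; no pool is needed in the library itself.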
* Re: [PATCH v2] lib/raid/xor: x86: Add AVX-512 optimized xor_gen()
From: David Laight @ 2026-06-14 10:16 UTC (permalink / raw)
To: Eric Biggers
Cc: Andrew Morton, linux-kernel, Christoph Hellwig, linux-crypto, x86,
linux-raid
In-Reply-To: <20260614010357.69416-1-ebiggers@kernel.org>
On Sat, 13 Jun 2026 18:03:57 -0700
Eric Biggers <ebiggers@kernel.org> wrote:
> Add an implementation of xor_gen() using AVX-512.
>
> It uses 512-bit vectors, i.e. ZMM registers. It also uses the
> vpternlogq instruction to do three-input XORs when applicable.
>
> It's enabled on x86_64 CPUs that have AVX512F && !PREFER_YMM. In
> practice that means:
>
> - AMD Zen 4 and later (client and server)
Doesn't zen4 only have a 256bit bus between the cpu and cache?
So avx512 reads take two clocks.
Since this is memory limited it is unlikely to run faster than the
avx256 version.
OTOH if it doesn't cause down-clocking as well then it won't be slower.
> - Intel Sapphire Rapids and later (server)
> - Intel Rocket Lake (client)
> - Intel Nova Lake and later (client)
>
> The !PREFER_YMM condition excludes the older AVX-512 implementations in
> Intel Skylake Server and Intel Ice Lake. They could run this code, but
> they're known to have overly-eager downclocking when ZMM registers are
> used. This is the same policy that the crypto and CRC code uses.
>
> Benchmark on AMD Ryzen 9 9950X (Zen 5):
>
> src_cnt avx avx512 Improvement
> ======= ========== ========== ===========
> 1 56353 MB/s 75388 MB/s 33%
> 2 54274 MB/s 68409 MB/s 26%
> 3 44649 MB/s 64042 MB/s 43%
> 4 41315 MB/s 55002 MB/s 33%
>
> Note: for now I omitted the cpu_has_xfeatures() check that the AVX-512
> optimized crypto and CRC code does, since it's not implemented on
> User-Mode Linux and it's never been present in the RAID6 code either.
>
> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since I suggested it :-)
Reviewed-By: David Laight <david.laight.linux@gmail.com>
Some 'not very important' comments:
I did wonder whether moving the loop into the asm() would help.
gcc has a nasty habit of pessimising loops when you try to be clever.
It is certainly safer for tight loops like these.
That does have the side effect of making p0 be %1 which doesn't improve
readability. Either use named parameters or possibly just change p0 to p1 (etc)
so they match.
The code should be limited by the memory reads, so the 3-argument xor and
the interleave of the unroll may make no difference.
Some cpu do have constraints on the cache alignment in order to do two
reads per clock, but I've forgotten them and they got better before AVX-512.
If that were affecting this code (on the tested cpu) then I'd expect the
interleaved unroll would improve the _4 and _5 functions.
So it probably doesn't affect this code.
Using the same loop for the avx-256 and sse (and even smaller) functions could
well generate code that runs 'pretty much as fast as possible' on older cpu.
Intel cpu (going back to Sandy bridge) are likely to execute the loop in the
same number of clocks - but clearly copying half or a quarter of the data.
But I've no experience of zen1.
Might be worth doing for avx-256, does anyone care about anything older :-)
David
> ---
>
> Changed in v2:
> - Fixed build on UML
> - Reworked the implementation
>
> lib/raid/xor/Makefile | 2 +-
> lib/raid/xor/x86/xor-avx512.c | 121 ++++++++++++++++++++++++++++++++++
> lib/raid/xor/x86/xor_arch.h | 26 ++++----
> 3 files changed, 137 insertions(+), 12 deletions(-)
> create mode 100644 lib/raid/xor/x86/xor-avx512.c
>
> diff --git a/lib/raid/xor/Makefile b/lib/raid/xor/Makefile
> index 4d633dfd5b90..4af945861a51 100644
> --- a/lib/raid/xor/Makefile
> +++ b/lib/raid/xor/Makefile
> @@ -26,11 +26,11 @@ xor-$(CONFIG_ALTIVEC) += powerpc/xor_vmx.o powerpc/xor_vmx_glue.o
> xor-$(CONFIG_RISCV_ISA_V) += riscv/xor.o riscv/xor-glue.o
> xor-$(CONFIG_SPARC32) += sparc/xor-sparc32.o
> xor-$(CONFIG_SPARC64) += sparc/xor-sparc64.o sparc/xor-sparc64-glue.o
> xor-$(CONFIG_S390) += s390/xor.o
> xor-$(CONFIG_X86_32) += x86/xor-avx.o x86/xor-sse.o x86/xor-mmx.o
> -xor-$(CONFIG_X86_64) += x86/xor-avx.o x86/xor-sse.o
> +xor-$(CONFIG_X86_64) += x86/xor-avx.o x86/xor-sse.o x86/xor-avx512.o
> obj-y += tests/
>
> CFLAGS_arm/xor-neon.o += $(CC_FLAGS_FPU)
> CFLAGS_REMOVE_arm/xor-neon.o += $(CC_FLAGS_NO_FPU)
>
> diff --git a/lib/raid/xor/x86/xor-avx512.c b/lib/raid/xor/x86/xor-avx512.c
> new file mode 100644
> index 000000000000..87b981d74c90
> --- /dev/null
> +++ b/lib/raid/xor/x86/xor-avx512.c
> @@ -0,0 +1,121 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * AVX-512 optimized implementation of xor_gen()
> + *
> + * Copyright 2026 Google LLC
> + */
> +
> +#include <linux/types.h>
> +#include <asm/fpu/api.h>
> +#include "xor_impl.h"
> +#include "xor_arch.h"
> +
> +/*
> + * Implementation notes:
> + *
> + * Unrolling by the number of buffers (2-5) is very important.
> + *
> + * Unrolling by length is less important, especially when using register-indexed
> + * addressing with negative indices from the end of the buffers. That approach
> + * results in just two loop control instructions being needed per iteration,
> + * regardless of the number of buffers.
> + *
> + * In fact, benchmarks showed that the 2 and 3 buffer cases require only 2x
> + * unrolling by length, while the 4 and 5 buffer cases don't require any
> + * unrolling by length. Benchmarks also showed that the register-indexed
> + * addressing isn't a bottleneck either; i.e., we can't do any better by
> + * incrementing the pointers as we go along, even with more unrolling.
> + */
> +
> +static void xor_avx512_2(long bytes, u8 *p0, const u8 *p1)
> +{
> + long i = -bytes;
> +
> + asm volatile("1: vmovdqa64 (%0,%1), %%zmm0\n"
> + "vmovdqa64 64(%0,%1), %%zmm1\n"
> + "vpxorq (%0,%2), %%zmm0, %%zmm0\n"
> + "vpxorq 64(%0,%2), %%zmm1, %%zmm1\n"
> + "vmovdqa64 %%zmm0, (%0,%1)\n"
> + "vmovdqa64 %%zmm1, 64(%0,%1)\n"
> + "add $128, %0\n"
> + "jnz 1b\n"
> + : "+&r"(i)
> + : "r"(p0 + bytes), "r"(p1 + bytes)
> + : "memory", "cc");
> +}
> +
> +static void xor_avx512_3(long bytes, u8 *p0, const u8 *p1, const u8 *p2)
> +{
> + long i = -bytes;
> +
> + asm volatile("1: vmovdqa64 (%0,%1), %%zmm0\n"
> + "vmovdqa64 64(%0,%1), %%zmm1\n"
> + "vmovdqa64 (%0,%2), %%zmm2\n"
> + "vmovdqa64 64(%0,%2), %%zmm3\n"
> + "vpternlogq $0x96, (%0,%3), %%zmm2, %%zmm0\n"
> + "vpternlogq $0x96, 64(%0,%3), %%zmm3, %%zmm1\n"
> + "vmovdqa64 %%zmm0, (%0,%1)\n"
> + "vmovdqa64 %%zmm1, 64(%0,%1)\n"
> + "add $128, %0\n"
> + "jnz 1b\n"
> + : "+&r"(i)
> + : "r"(p0 + bytes), "r"(p1 + bytes), "r"(p2 + bytes)
> + : "memory", "cc");
> +}
> +
> +static void xor_avx512_4(long bytes, u8 *p0, const u8 *p1, const u8 *p2,
> + const u8 *p3)
> +{
> + long i = -bytes;
> +
> + asm volatile("1: vmovdqa64 (%0,%1), %%zmm0\n"
> + "vmovdqa64 (%0,%2), %%zmm1\n"
> + "vpxorq (%0,%3), %%zmm0, %%zmm0\n"
> + "vpternlogq $0x96, (%0,%4), %%zmm1, %%zmm0\n"
> + "vmovdqa64 %%zmm0, (%0,%1)\n"
> + "add $64, %0\n"
> + "jnz 1b\n"
> + : "+&r"(i)
> + : "r"(p0 + bytes), "r"(p1 + bytes), "r"(p2 + bytes),
> + "r"(p3 + bytes)
> + : "memory", "cc");
> +}
> +
> +static void xor_avx512_5(long bytes, u8 *p0, const u8 *p1, const u8 *p2,
> + const u8 *p3, const u8 *p4)
> +{
> + long i = -bytes;
> +
> + asm volatile("1: vmovdqa64 (%0,%1), %%zmm0\n"
> + "vmovdqa64 (%0,%2), %%zmm1\n"
> + "vpternlogq $0x96, (%0,%3), %%zmm1, %%zmm0\n"
> + "vmovdqa64 (%0,%4), %%zmm1\n"
> + "vpternlogq $0x96, (%0,%5), %%zmm1, %%zmm0\n"
> + "vmovdqa64 %%zmm0, (%0,%1)\n"
> + "add $64, %0\n"
> + "jnz 1b\n"
> + : "+&r"(i)
> + : "r"(p0 + bytes), "r"(p1 + bytes), "r"(p2 + bytes),
> + "r"(p3 + bytes), "r"(p4 + bytes)
> + : "memory", "cc");
> +}
> +
> +DO_XOR_BLOCKS(avx512_inner, xor_avx512_2, xor_avx512_3, xor_avx512_4,
> + xor_avx512_5);
> +
> +/*
> + * Preconditions: bytes is a nonzero multiple of 512, and all buffers are
> + * 64-byte aligned.
> + */
> +static void xor_gen_avx512(void *dest, void **srcs, unsigned int src_cnt,
> + unsigned int bytes)
> +{
> + kernel_fpu_begin();
> + xor_gen_avx512_inner(dest, srcs, src_cnt, bytes);
> + kernel_fpu_end();
> +}
> +
> +struct xor_block_template xor_block_avx512 = {
> + .name = "avx512",
> + .xor_gen = xor_gen_avx512,
> +};
> diff --git a/lib/raid/xor/x86/xor_arch.h b/lib/raid/xor/x86/xor_arch.h
> index 99fe85a213c6..b5d49376fc97 100644
> --- a/lib/raid/xor/x86/xor_arch.h
> +++ b/lib/raid/xor/x86/xor_arch.h
> @@ -4,26 +4,30 @@
> extern struct xor_block_template xor_block_pII_mmx;
> extern struct xor_block_template xor_block_p5_mmx;
> extern struct xor_block_template xor_block_sse;
> extern struct xor_block_template xor_block_sse_pf64;
> extern struct xor_block_template xor_block_avx;
> +extern struct xor_block_template xor_block_avx512;
>
> -/*
> - * When SSE is available, use it as it can write around L2. We may also be able
> - * to load into the L1 only depending on how the cpu deals with a load to a line
> - * that is being prefetched.
> - *
> - * When AVX2 is available, force using it as it is better by all measures.
> - *
> - * 32-bit without MMX can fall back to the generic routines.
> - */
> static __always_inline void __init arch_xor_init(void)
> {
> - if (boot_cpu_has(X86_FEATURE_AVX) &&
> - boot_cpu_has(X86_FEATURE_OSXSAVE)) {
> + if (IS_ENABLED(CONFIG_X86_64) && boot_cpu_has(X86_FEATURE_AVX512F) &&
> + boot_cpu_has(X86_FEATURE_OSXSAVE) &&
> + !boot_cpu_has(X86_FEATURE_PREFER_YMM)) {
> + /* AVX-512 will be the best; no need to try others. */
> + /* !PREFER_YMM excludes CPUs with overly-eager downclocking. */
> + xor_force(&xor_block_avx512);
> + } else if (boot_cpu_has(X86_FEATURE_AVX) &&
> + boot_cpu_has(X86_FEATURE_OSXSAVE)) {
> + /* AVX will be the best; no need to try others. */
> xor_force(&xor_block_avx);
> } else if (IS_ENABLED(CONFIG_X86_64) || boot_cpu_has(X86_FEATURE_XMM)) {
> + /*
> + * When SSE is available, use it as it can write around L2. We
> + * may also be able to load into the L1 only depending on how
> + * the cpu deals with a load to a line that is being prefetched.
> + */
> xor_register(&xor_block_sse);
> xor_register(&xor_block_sse_pf64);
> } else if (boot_cpu_has(X86_FEATURE_MMX)) {
> xor_register(&xor_block_pII_mmx);
> xor_register(&xor_block_p5_mmx);
>
> base-commit: 2b07ea76fd28989bde5993532d7a943a6f90e246
^ permalink raw reply
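Two techniques from the patch quoted above can be illustrated in portable C. The first function models a vpternlogq-style ternary logic operation bit by bit, to show why the immediate 0x96 yields a three-input XOR: the immediate is an 8-entry truth table indexed by the three input bits, and 0x96 has bits 1, 2, 4 and 7 set, i.e. exactly the indices with odd parity. The second function shows the negative-index loop shape described in the implementation notes, where one counter runs from -bytes up to zero and indexes from the end of each buffer. The `demo_` names are hypothetical, and this is a scalar sketch, not the patch's code.

```c
#include <stdint.h>

/*
 * Bitwise model of a three-input ternary logic op: for every bit position,
 * the three input bits form a 3-bit index (a is the most significant bit),
 * and the result bit is that entry of the 8-bit truth table 'imm'.
 */
static uint64_t demo_ternlog64(uint64_t a, uint64_t b, uint64_t c, uint8_t imm)
{
	uint64_t r = 0;
	int bit;

	for (bit = 0; bit < 64; bit++) {
		unsigned int idx = (unsigned int)((((a >> bit) & 1) << 2) |
						  (((b >> bit) & 1) << 1) |
						  ((c >> bit) & 1));

		r |= (uint64_t)((imm >> idx) & 1) << bit;
	}
	return r;
}

/*
 * p0 ^= p1 using a single negative index counted up to zero.
 * Precondition (as in the patch): bytes is nonzero.
 */
static void demo_xor_2(long bytes, uint8_t *p0, const uint8_t *p1)
{
	uint8_t *e0 = p0 + bytes;	/* one past the end of each buffer */
	const uint8_t *e1 = p1 + bytes;
	long i = -bytes;

	do {
		e0[i] ^= e1[i];
	} while (++i != 0);		/* increment and zero test only */
}
```

Because every buffer is addressed relative to its end with the same index, adding more source buffers does not add loop-control work: the counter increment and the zero test stay the same.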
* [PATCH 0/2] crypto: qat - bound the live migration import parser
From: Michael Bommarito @ 2026-06-14 13:06 UTC (permalink / raw)
To: Giovanni Cabiddu, Herbert Xu
Cc: David S . Miller, Kees Cook, qat-linux, linux-crypto,
linux-kernel
adf_mstate_mgr_init_from_remote() sets the section-walk cursor to
mgr->buf + preh_len from a remote-supplied preh_len, and the default
preamble checker only rejects preh_len > mgr->size. A remote preamble
with preh_len == mgr->size moves the cursor one region past the
allocation while n_sects is still honoured, so adf_mstate_sect_validate()
reads sect->size before the section header is proven in bounds. The
remote stream reaches this parser from the destination-host VFIO
migration path (qat_vf_resume_write), so a malformed import reads out of
bounds in the destination host kernel (fatal under KASAN / panic_on_warn).
Patch 1 rejects section headers not fully contained in the state buffer.
Patch 2 adds KUnit coverage and is offered separately so it can be taken
or dropped on its own. The parser was driven on QEMU x86_64 under KASAN
via the patch 2 suite (Level-2: buggy code unchanged, surrounding VFIO/PF
environment synthesized); the boundary trigger reports the out-of-bounds
read on the unfixed parser and is gone after patch 1, with two benign
controls passing on both trees.
Michael Bommarito (2):
crypto: qat - validate migration section header is in bounds
crypto: qat - add KUnit coverage for the migration import parser
drivers/crypto/intel/qat/Kconfig | 16 ++++
.../intel/qat/qat_common/adf_mstate_mgr.c | 18 ++++-
.../qat/qat_common/adf_mstate_mgr_test.c | 81 +++++++++++++++++++
3 files changed, 113 insertions(+), 2 deletions(-)
create mode 100644 drivers/crypto/intel/qat/qat_common/adf_mstate_mgr_test.c
--
2.53.0
^ permalink raw reply
* [PATCH 1/2] crypto: qat - validate migration section header is in bounds
From: Michael Bommarito @ 2026-06-14 13:06 UTC (permalink / raw)
To: Giovanni Cabiddu, Herbert Xu
Cc: David S . Miller, Kees Cook, qat-linux, linux-crypto,
linux-kernel
In-Reply-To: <20260614130619.2519534-1-michael.bommarito@gmail.com>
adf_mstate_mgr_init_from_remote() sets the section-walk cursor to
mgr->buf + preh_len from the remote migration preamble. The default
preamble checker only rejects preh_len > mgr->size, so preh_len ==
mgr->size (the 4096-byte QAT VF state buffer) puts mgr->state one
region past the allocation while n_sects is still honoured.
adf_mstate_sect_validate() then reads sect->size from that cursor
before proving the section header is in the buffer. The remote stream
reaches this parser from the destination-host VFIO migration path
(qat_vf_resume_write), so a malformed import reads out of bounds.
Reject section headers not fully contained in the state buffer before
dereferencing any of their fields.
Reproduced under KASAN on QEMU via the KUnit case in patch 2; the
slab-out-of-bounds read is gone after this change.
Fixes: f0bbfc391aa7 ("crypto: qat - implement interface for live migration")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
The patch 2 KUnit case drives the real parser on a kmalloc(4096) buffer
under KASAN on QEMU x86_64. Trigger {preh_len=4096, n_sects=1}: stock
tree reports
BUG: KASAN: slab-out-of-bounds in qat_mstate_remote_run
reading sect->size 8 bytes past the allocation; patched it returns
-EINVAL and KASAN is silent. Two benign controls (empty preamble,
in-bounds section header) drive the same path with no OOB and pass on
both trees. No in-tree selftest exercises adf_mstate_mgr.c; patch 2 is
the coverage offered. KASAN build of the touched object is warning clean.
.../crypto/intel/qat/qat_common/adf_mstate_mgr.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/crypto/intel/qat/qat_common/adf_mstate_mgr.c b/drivers/crypto/intel/qat/qat_common/adf_mstate_mgr.c
index f9017e03ec0f2..7370b87f72a2f 100644
--- a/drivers/crypto/intel/qat/qat_common/adf_mstate_mgr.c
+++ b/drivers/crypto/intel/qat/qat_common/adf_mstate_mgr.c
@@ -231,8 +231,18 @@ static int adf_mstate_sect_validate(struct adf_mstate_mgr *mgr)
end = (uintptr_t)mgr->buf + mgr->size;
for (i = 0; i < mgr->n_sects; i++) {
- uintptr_t s_start = (uintptr_t)sect->state;
- uintptr_t s_end = s_start + sect->size;
+ uintptr_t s_start, s_end;
+
+ /* The section header must be in the buffer before it is read. */
+ if ((uintptr_t)sect < (uintptr_t)mgr->buf ||
+ (uintptr_t)sect > end - sizeof(*sect)) {
+ pr_debug("QAT: LM - Section header out of bounds (index=%u) in state_mgr (size=%u, secs=%u)\n",
+ i, mgr->size, mgr->n_sects);
+ return -EINVAL;
+ }
+
+ s_start = (uintptr_t)sect->state;
+ s_end = s_start + sect->size;
if (s_end < s_start || s_end > end) {
pr_debug("QAT: LM - Corrupted state section (index=%u, size=%u) in state_mgr (size=%u, secs=%u)\n",
--
2.53.0
^ permalink raw reply related
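The rule this patch enforces (prove that a header lies inside the buffer before reading any of its fields) can be sketched in userspace C as below. This is an illustration only: `demo_sect_h` and `demo_validate()` are hypothetical, the header layout is invented, and the sketch deliberately swaps the patch's pointer comparisons for offset arithmetic so that no subtraction can wrap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical section header; not the QAT driver's layout. */
struct demo_sect_h {
	uint32_t id;
	uint32_t size;	/* payload bytes that follow the header */
};

/*
 * Walk n_sects sections starting at first_off. Each header is checked to be
 * fully inside the buffer BEFORE it is read, then its payload is checked.
 * Returns 0 if everything is in bounds, -1 otherwise.
 */
static int demo_validate(const uint8_t *buf, size_t buf_size,
			 size_t first_off, unsigned int n_sects)
{
	size_t off = first_off;
	unsigned int i;

	for (i = 0; i < n_sects; i++) {
		struct demo_sect_h h;

		/* Header bounds check: no field is touched until this passes. */
		if (off > buf_size || buf_size - off < sizeof(h))
			return -1;
		memcpy(&h, buf + off, sizeof(h));
		off += sizeof(h);

		/* Payload bounds check, written so it cannot overflow. */
		if (h.size > buf_size - off)
			return -1;
		off += h.size;
	}
	return 0;
}
```

With a 4096-byte buffer, a starting offset of 4096 and one claimed section, the header check fails immediately, which mirrors the {preh_len=4096, n_sects=1} trigger described above.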
* [PATCH 2/2] crypto: qat - add KUnit coverage for the migration import parser
From: Michael Bommarito @ 2026-06-14 13:06 UTC (permalink / raw)
To: Giovanni Cabiddu, Herbert Xu
Cc: David S . Miller, Kees Cook, qat-linux, linux-crypto,
linux-kernel
In-Reply-To: <20260614130619.2519534-1-michael.bommarito@gmail.com>
Add KUnit coverage for the remote (migration import) path of the QAT
live migration state manager. The cases drive the real
adf_mstate_mgr_init_from_remote() -> adf_mstate_sect_validate() parser
on a buffer sized as the GEN4 VFIO migration backend allocates it, and
include the preh_len == buffer-size boundary case that is the
regression oracle for the preceding fix. The test file is included from
adf_mstate_mgr.c so it can reach the file-local preamble and section
types, gated by CONFIG_CRYPTO_DEV_QAT_KUNIT_TEST. It is offered as a
separate patch so it can be taken or dropped independently of the fix.
Reproduced under KASAN on QEMU; the boundary case reports the
slab-out-of-bounds read on an unfixed tree and passes once the fix is
in place.
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
Three cases under KASAN on QEMU x86_64: a valid empty preamble, a valid
in-bounds section header, and the preh_len == buffer-size (4096) boundary
case. The boundary case is the regression oracle for patch 1: it reports
the slab-out-of-bounds read on an unfixed tree and returns -EINVAL with
the fix in place. The two valid cases drive the same parser path with no
out-of-bounds access and pass on both trees.
drivers/crypto/intel/qat/Kconfig | 16 ++++
.../intel/qat/qat_common/adf_mstate_mgr.c | 4 +
.../qat/qat_common/adf_mstate_mgr_test.c | 81 +++++++++++++++++++
3 files changed, 101 insertions(+)
create mode 100644 drivers/crypto/intel/qat/qat_common/adf_mstate_mgr_test.c
diff --git a/drivers/crypto/intel/qat/Kconfig b/drivers/crypto/intel/qat/Kconfig
index 9d6e6f52d2dcb..116f7f94c9a64 100644
--- a/drivers/crypto/intel/qat/Kconfig
+++ b/drivers/crypto/intel/qat/Kconfig
@@ -133,3 +133,19 @@ config CRYPTO_DEV_QAT_ERROR_INJECTION
This functionality is available via debugfs entry of the Intel(R)
QuickAssist device
+
+config CRYPTO_DEV_QAT_KUNIT_TEST
+ bool "KUnit tests for Intel(R) QAT live migration state manager" if !KUNIT_ALL_TESTS
+ depends on CRYPTO_DEV_QAT && KUNIT=y
+ default KUNIT_ALL_TESTS
+ help
+ Build KUnit tests for the Intel(R) QAT live migration state
+ manager remote-import parser (adf_mstate_mgr.c). The tests drive
+ the migration setup parser on a buffer sized as the QAT GEN4
+ VFIO migration backend allocates it and check that a malformed
+ remote preamble is rejected before any out-of-bounds access.
+
+ For more information on KUnit and unit tests in general, please
+ refer to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+ If unsure, say N.
diff --git a/drivers/crypto/intel/qat/qat_common/adf_mstate_mgr.c b/drivers/crypto/intel/qat/qat_common/adf_mstate_mgr.c
index 7370b87f72a2f..701279409c32c 100644
--- a/drivers/crypto/intel/qat/qat_common/adf_mstate_mgr.c
+++ b/drivers/crypto/intel/qat/qat_common/adf_mstate_mgr.c
@@ -326,3 +326,7 @@ found:
return sect;
}
+
+#if IS_ENABLED(CONFIG_CRYPTO_DEV_QAT_KUNIT_TEST)
+#include "adf_mstate_mgr_test.c"
+#endif
diff --git a/drivers/crypto/intel/qat/qat_common/adf_mstate_mgr_test.c b/drivers/crypto/intel/qat/qat_common/adf_mstate_mgr_test.c
new file mode 100644
index 0000000000000..e067c13f4a9dd
--- /dev/null
+++ b/drivers/crypto/intel/qat/qat_common/adf_mstate_mgr_test.c
@@ -0,0 +1,81 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2026 Intel Corporation */
+
+/*
+ * KUnit coverage for the QAT live migration remote-import parser. The cases
+ * drive the real adf_mstate_mgr_init_from_remote() on a buffer sized as the
+ * GEN4 VFIO migration backend allocates it (4096 bytes), including the
+ * preh_len == buffer-size boundary case. Included from adf_mstate_mgr.c to
+ * reach the file-local preamble and section types.
+ */
+
+#include <kunit/test.h>
+
+#define ADF_MSTATE_TEST_BUF_SIZE 4096
+
+static void qat_mstate_remote_run(struct kunit *test, u16 preh_len,
+ u16 n_sects, u32 sect0_size, int expect)
+{
+ struct adf_mstate_mgr mgr;
+ struct adf_mstate_preh *pre;
+ u8 *buf;
+ int ret;
+
+ buf = kzalloc(ADF_MSTATE_TEST_BUF_SIZE, GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, buf);
+
+ pre = (struct adf_mstate_preh *)buf;
+ pre->magic = ADF_MSTATE_MAGIC;
+ pre->version = ADF_MSTATE_VERSION;
+ pre->preh_len = preh_len;
+ pre->n_sects = n_sects;
+ pre->size = 0;
+
+ /* Place an in-bounds section header when there is room for one. */
+ if (n_sects &&
+ (u32)preh_len + sizeof(struct adf_mstate_sect_h) <= ADF_MSTATE_TEST_BUF_SIZE) {
+ struct adf_mstate_sect_h *s =
+ (struct adf_mstate_sect_h *)(buf + preh_len);
+
+ s->size = sect0_size;
+ s->sub_sects = 0;
+ }
+
+ ret = adf_mstate_mgr_init_from_remote(&mgr, buf, ADF_MSTATE_TEST_BUF_SIZE,
+ NULL, NULL);
+ KUNIT_EXPECT_EQ(test, ret, expect);
+
+ kfree(buf);
+}
+
+/* Valid empty preamble: the validation loop never runs. */
+static void qat_mstate_remote_empty(struct kunit *test)
+{
+ qat_mstate_remote_run(test, sizeof(struct adf_mstate_preh), 0, 0, 0);
+}
+
+/* Valid in-bounds section header: same parser path, no out-of-bounds read. */
+static void qat_mstate_remote_inbounds_sect(struct kunit *test)
+{
+ qat_mstate_remote_run(test, sizeof(struct adf_mstate_preh), 1, 0, 0);
+}
+
+/* preh_len == buffer size puts the cursor past the allocation; expect -EINVAL. */
+static void qat_mstate_remote_oob_header(struct kunit *test)
+{
+ qat_mstate_remote_run(test, ADF_MSTATE_TEST_BUF_SIZE, 1, 0, -EINVAL);
+}
+
+static struct kunit_case qat_mstate_remote_cases[] = {
+ KUNIT_CASE(qat_mstate_remote_empty),
+ KUNIT_CASE(qat_mstate_remote_inbounds_sect),
+ KUNIT_CASE(qat_mstate_remote_oob_header),
+ {}
+};
+
+static struct kunit_suite qat_mstate_remote_suite = {
+ .name = "qat_mstate_remote",
+ .test_cases = qat_mstate_remote_cases,
+};
+
+kunit_test_suite(qat_mstate_remote_suite);
--
2.53.0
^ permalink raw reply related
* [PATCH] crypto: qce - drop unused scatterlist traversal in qce_ahash_update
From: Thorsten Blum @ 2026-06-14 15:26 UTC (permalink / raw)
To: Bartosz Golaszewski, Herbert Xu, David S. Miller
Cc: Thorsten Blum, linux-crypto, linux-arm-msm, linux-kernel
Commit df12ef60c87b ("crypto: qce/sha - Do not modify scatterlist passed
along with request") removed the only use of sg_last, rendering the
scatterlist traversal useless. Remove it and its local variables.
Also remove the redundant hash_later check, inline the source offset,
and assign the number of complete blocks directly to req->nbytes.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
---
drivers/crypto/qce/sha.c | 31 +++++--------------------------
1 file changed, 5 insertions(+), 26 deletions(-)
diff --git a/drivers/crypto/qce/sha.c b/drivers/crypto/qce/sha.c
index 1b37121cbcdc..13a1174d2175 100644
--- a/drivers/crypto/qce/sha.c
+++ b/drivers/crypto/qce/sha.c
@@ -187,10 +187,8 @@ static int qce_ahash_update(struct ahash_request *req)
struct qce_sha_reqctx *rctx = ahash_request_ctx_dma(req);
struct qce_alg_template *tmpl = to_ahash_tmpl(req->base.tfm);
struct qce_device *qce = tmpl->qce;
- struct scatterlist *sg_last, *sg;
- unsigned int total, len;
+ unsigned int total;
unsigned int hash_later;
- unsigned int nbytes;
unsigned int blocksize;
blocksize = crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
@@ -238,28 +236,8 @@ static int qce_ahash_update(struct ahash_request *req)
if (!hash_later)
hash_later = blocksize;
- if (hash_later) {
- unsigned int src_offset = req->nbytes - hash_later;
- scatterwalk_map_and_copy(rctx->buf, req->src, src_offset,
- hash_later, 0);
- }
-
- /* here nbytes is multiple of blocksize */
- nbytes = total - hash_later;
-
- len = rctx->buflen;
- sg = sg_last = req->src;
-
- while (len < nbytes && sg) {
- if (len + sg_dma_len(sg) > nbytes)
- break;
- len += sg_dma_len(sg);
- sg_last = sg;
- sg = sg_next(sg);
- }
-
- if (!sg_last)
- return -EINVAL;
+ scatterwalk_map_and_copy(rctx->buf, req->src, req->nbytes - hash_later,
+ hash_later, 0);
if (rctx->buflen) {
sg_init_table(rctx->sg, 2);
@@ -268,7 +246,8 @@ static int qce_ahash_update(struct ahash_request *req)
req->src = rctx->sg;
}
- req->nbytes = nbytes;
+ /* hash only complete blocks */
+ req->nbytes = total - hash_later;
rctx->buflen = hash_later;
return qce->async_req_enqueue(tmpl->qce, &req->base);
^ permalink raw reply related
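The length arithmetic that remains after this cleanup can be checked in isolation. The sketch below is a userspace illustration with hypothetical names (`demo_split`, `demo_split_update()`); it assumes that `total` is the buffered length plus the new request length and that `hash_later` starts out as `total % blocksize`, neither of which is visible in the hunks above, and it assumes the caller has already handled the case where `total` does not exceed one block.

```c
/* How many bytes are hashed now and how many are held back for later. */
struct demo_split {
	unsigned int hash_now;		/* always a multiple of blocksize */
	unsigned int hash_later;	/* 1..blocksize bytes kept buffered */
};

/* Assumes buflen + nbytes > blocksize (smaller updates are only buffered). */
static struct demo_split demo_split_update(unsigned int buflen,
					   unsigned int nbytes,
					   unsigned int blocksize)
{
	struct demo_split s;
	unsigned int total = buflen + nbytes;

	s.hash_later = total % blocksize;
	/* Always hold back data, so the final block is never hashed early. */
	if (!s.hash_later)
		s.hash_later = blocksize;
	s.hash_now = total - s.hash_later;	/* only complete blocks */
	return s;
}
```

Since `hash_later` is forced to be nonzero, a later `if (hash_later)` test can never be false, which is why the patch can drop that check.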
* Re: [PATCH RESEND 1/6] sock: add sock_kzalloc helper
From: Thorsten Blum @ 2026-06-14 15:32 UTC (permalink / raw)
To: Herbert Xu, David S. Miller, Eric Dumazet, Kuniyuki Iwashima,
Paolo Abeni, Willem de Bruijn, Jakub Kicinski, Simon Horman
Cc: linux-crypto, linux-kernel, netdev
In-Reply-To: <20260527082509.1133816-8-thorsten.blum@linux.dev>
On Wed, May 27, 2026 at 10:25:11AM +0200, Thorsten Blum wrote:
> Add sock_kzalloc() helper - the sock equivalent to kzalloc().
>
> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
> ---
> Patch 1/6 needs an Acked-by: from netdev maintainers for the series to
> go through Herbert's crypto tree:
> https://lore.kernel.org/lkml/ahVkZOxZtFes6Huf@gondor.apana.org.au/
> ---
> include/net/sock.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 76bfd3e56d63..b521bd34ac9f 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1913,6 +1913,11 @@ void sock_kfree_s(struct sock *sk, void *mem, int size);
> void sock_kzfree_s(struct sock *sk, void *mem, int size);
> void sk_send_sigurg(struct sock *sk);
>
> +static inline void *sock_kzalloc(struct sock *sk, int size, gfp_t priority)
> +{
> + return sock_kmalloc(sk, size, priority | __GFP_ZERO);
> +}
> +
> static inline void sock_replace_proto(struct sock *sk, struct proto *proto)
> {
> if (sk->sk_socket)
Gentle ping? Patch 1/6 still needs an ack from netdev maintainers.
Thanks,
Thorsten
^ permalink raw reply
* Re: [PATCH v3] hwrng: virtio: clamp device-reported used.len at copy_data()
From: Michael Bommarito @ 2026-06-14 16:14 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Herbert Xu, Olivia Mackall, linux-crypto, Jason Wang, Kees Cook,
Christian Borntraeger, virtualization, linux-kernel, Dan Williams,
Ingo Molnar, H. Peter Anvin, torvalds, alan, tglx
In-Reply-To: <20260611064040-mutt-send-email-mst@kernel.org>
On Thu, Jun 11, 2026 at 6:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> AKA defence is depth programming)
> Alright we can drop this. No biggie.
Sorry for the delay. I'll ship a v4 without the nospec.
Thanks,
Mike
^ permalink raw reply
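The clamp this thread is about (bounding a device-reported length against the size of the receiving buffer before copying out of it) can be sketched in userspace C. This is a simplified model, not the driver's code: `demo_rng` and `demo_copy_data()` are hypothetical, the buffer size is fixed at 64 for illustration, and `data_avail` is treated here as the total number of valid bytes rather than a count that shrinks as data is consumed.

```c
#include <string.h>

#define DEMO_DATA_SIZE 64	/* illustrative; not the driver's constant */

struct demo_rng {
	unsigned char data[DEMO_DATA_SIZE];
	unsigned int data_avail;	/* device-reported, untrusted */
	unsigned int data_idx;		/* maintained by the driver side */
};

/* Copy up to 'size' bytes out of vi->data without trusting data_avail. */
static unsigned int demo_copy_data(struct demo_rng *vi, void *buf,
				   unsigned int size)
{
	unsigned int avail = vi->data_avail;

	/* Clamp the untrusted length to the real buffer size at point of use. */
	if (avail > sizeof(vi->data))
		avail = sizeof(vi->data);

	/* Bail if the running index has already reached the clamped bound. */
	if (vi->data_idx >= avail)
		return 0;

	if (size > avail - vi->data_idx)
		size = avail - vi->data_idx;

	memcpy(buf, vi->data + vi->data_idx, size);
	vi->data_idx += size;
	return size;
}
```

Clamping rather than rejecting means an over-reporting device still yields the bytes that really are in the buffer, while the copy itself can no longer run past the end of `data`.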
* [PATCH v4] hwrng: virtio: clamp device-reported used.len at copy_data()
From: Michael Bommarito @ 2026-06-14 16:40 UTC (permalink / raw)
To: Olivia Mackall, Herbert Xu, linux-crypto
Cc: Michael S . Tsirkin, Jason Wang, Kees Cook, Christian Borntraeger,
virtualization, linux-kernel
copy_data() trusts the device-reported used.len stored in vi->data_avail
and memcpy()s that many bytes out of the inline vi->data buffer without
bounding it against sizeof(vi->data) (SMP_CACHE_BYTES, typically 32 or
64). A malicious or buggy virtio-rng backend can report a used.len past
the buffer and steer the memcpy() into adjacent slab memory;
hwrng_fillfn() then mixes those bytes into the guest RNG and guest root
can read them back via /dev/hwrng. No guest userspace action is required
to first trigger the read.
Clamp data_avail to sizeof(vi->data) at point of use and bail if the
running index has already reached the clamped bound. Same class as
commit c04db81cd028 ("net/9p: Fix buffer overflow in USB transport
layer").
Fixes: f7f510ec1957 ("virtio: An entropy device, as suggested by hpa.")
Cc: stable@vger.kernel.org
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-8
---
KASAN on a v7.1-rc4 guest whose backend reports used.len = 0x10000:
BUG: KASAN: slab-out-of-bounds in virtio_read+0x394/0x5d0
Read of size 64 at addr ffff88800ae0ba20 by task hwrng/52
__asan_memcpy+0x23/0x60
virtio_read+0x394/0x5d0
hwrng_fillfn+0xb2/0x470
located 0 bytes to the right of the 544-byte kmalloc-1k region.
With the clamp the same harness boots clean: copy_data() returns 0 for
the bogus report and the driver reissues the request.
Confidential-compute angle: a malicious hypervisor plus compromised guest
root could use /dev/hwrng as a guest-kernel heap leak channel, though
SEV-SNP/TDX guests usually disable virtio-rng. The memory-safety fix is
worth carrying regardless.
Changes in v4:
- Drop array_index_nospec() on vi->data_idx (and linux/nospec.h) per
Herbert Xu and Michael S. Tsirkin: data_idx is driver-maintained and
already bounded by the check above, with no demonstrated speculation
gadget. Clamp unchanged; KASAN repro re-run (stock splats, patched
clean).
Changes in v3: repost of v2 after the thread went quiet, rebased onto
v7.1-rc4.
Changes in v2 (Michael S. Tsirkin): move the check into copy_data() next
to the memcpy(); clamp to sizeof(vi->data) instead of forcing len = 0 so
an occasionally-over-reporting device does not start returning
zero-length reads.
v1: https://lore.kernel.org/all/20260418000020.1847122-1-michael.bommarito@gmail.com/
v2: https://lore.kernel.org/all/20260418150613.3522589-1-michael.bommarito@gmail.com/
v3: https://lore.kernel.org/all/20260531142251.2792061-1-michael.bommarito@gmail.com/
---
drivers/char/hw_random/virtio-rng.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index 0ce02d7e5048e..7413d24a67a9d 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -69,7 +69,22 @@ static void request_entropy(struct virtrng_info *vi)
 static unsigned int copy_data(struct virtrng_info *vi, void *buf,
 			      unsigned int size)
 {
-	size = min_t(unsigned int, size, vi->data_avail);
+	unsigned int avail;
+
+	/*
+	 * vi->data_avail was set from the device-reported used.len and
+	 * vi->data_idx was advanced by previous copy_data() calls. A
+	 * malicious or buggy virtio-rng backend can drive data_avail past
+	 * sizeof(vi->data), so clamp it at point of use before the memcpy()
+	 * below can be steered into adjacent slab memory.
+	 */
+	avail = min_t(unsigned int, vi->data_avail, sizeof(vi->data));
+	if (vi->data_idx >= avail) {
+		vi->data_avail = 0;
+		request_entropy(vi);
+		return 0;
+	}
+	size = min_t(unsigned int, size, avail - vi->data_idx);
 	memcpy(buf, vi->data + vi->data_idx, size);
 	vi->data_idx += size;
 	vi->data_avail -= size;
base-commit: a1f173eb51db0dc78536334729ef832c62d6c65a
--
2.53.0
^ permalink raw reply related
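For readers following the virtio-rng thread above, the general idea of bounding an untrusted, device-reported length before a memcpy() can be modelled in user space. The sketch below is illustrative only and is not the driver code or the patch: struct rng_model, DATA_SIZE, model_request_entropy() and model_copy_data() are hypothetical names, DATA_SIZE stands in for sizeof(vi->data), and the model bounds each copy by the space left in the buffer, which is a simplification rather than the exact arithmetic used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for sizeof(vi->data) (SMP_CACHE_BYTES in the driver). */
#define DATA_SIZE 64

/* Hypothetical user-space model of the relevant driver state. */
struct rng_model {
	unsigned int data_avail;	/* device-reported length, untrusted */
	unsigned int data_idx;		/* driver-maintained read offset */
	unsigned int requests;		/* number of simulated refill requests */
	unsigned char data[DATA_SIZE];
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Simulated refill: only resets the offset and counts the request. */
static void model_request_entropy(struct rng_model *m)
{
	m->data_idx = 0;
	m->requests++;
}

/*
 * Copy at most 'size' bytes out of m->data. The copy is bounded by both
 * the reported length and the space left in the buffer, so a bogus
 * data_avail cannot push the read past the end of m->data.
 */
static unsigned int model_copy_data(struct rng_model *m, void *buf,
				    unsigned int size)
{
	unsigned int room;

	assert(m != NULL && buf != NULL);

	/* Nothing valid left: drop the report and ask for fresh data. */
	if (m->data_idx >= DATA_SIZE || m->data_avail == 0) {
		m->data_avail = 0;
		model_request_entropy(m);
		return 0;
	}

	room = DATA_SIZE - m->data_idx;
	size = min_uint(size, min_uint(m->data_avail, room));
	memcpy(buf, m->data + m->data_idx, size);
	m->data_idx += size;
	m->data_avail -= size;
	return size;
}
```

With a reported length of 0x10000, as in the KASAN harness described in the patch notes, this model hands out at most DATA_SIZE bytes and then returns 0 while requesting a refill, instead of reading past the buffer.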
* [BUG] crypto: virtio - KASAN slab-use-after-free in virtio_crypto_skcipher_encrypt
From: Shuangpeng Bai @ 2026-06-15 2:10 UTC (permalink / raw)
To: arei.gonglei, mst, jasowang, xuanzhuo, eperezma, herbert, davem,
virtualization, linux-crypto, linux-kernel
Hi,
I hit the following KASAN report while testing the current upstream kernel.
The issue was reproduced by queuing an AF_ALG skcipher request backed by
virtio-crypto, unbinding virtio0 from the virtio_crypto driver, and then
receiving from the old AF_ALG op fd.
KASAN: slab-use-after-free in virtio_crypto_skcipher_encrypt
I reproduced this on commit: e8c2f9fdadee7cbc75134dc463c1e0d856d6e5c7 (May 25 2026)
The reproducer and .config files are available here:
https://gist.github.com/shuangpengbai/f6117a0883dd574f02288ca812bb7d65
I'm happy to test debug patches or provide additional information.
Reported-by: Shuangpeng Bai <shuangpeng.kernel@gmail.com>
[ 54.367992][ T8332] BUG: KASAN: slab-use-after-free in virtio_crypto_skcipher_encrypt (drivers/crypto/virtio/virtio_crypto_skcipher_algs.c:473)
[ 54.369596][ T8332] Read of size 8 at addr ffff888124a47010 by task virtio_crypto_a/8332
[ 54.370922][ T8332]
[ 54.371171][ T8332] Tainted: [W]=WARN
[ 54.371172][ T8332] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 54.371175][ T8332] Call Trace:
[ 54.371179][ T8332] <TASK>
[ 54.371181][ T8332] dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
[ 54.371188][ T8332] print_report (mm/kasan/report.c:378 mm/kasan/report.c:482)
[ 54.371202][ T8332] kasan_report (mm/kasan/report.c:595)
[ 54.371213][ T8332] virtio_crypto_skcipher_encrypt (drivers/crypto/virtio/virtio_crypto_skcipher_algs.c:473)
[ 54.371216][ T8332] skcipher_recvmsg (crypto/algif_skcipher.c:203 crypto/algif_skcipher.c:226)
[ 54.371249][ T8332] sock_recvmsg (net/socket.c:1137 net/socket.c:1159)
[ 54.371253][ T8332] __sys_recvfrom (net/socket.c:2315)
[ 54.371273][ T8332] __x64_sys_recvfrom (net/socket.c:2330 net/socket.c:2326 net/socket.c:2326)
[ 54.371277][ T8332] do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
[ 54.371281][ T8332] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
[ 54.371285][ T8332] RIP: 0033:0x7f3c6caaac2c
[ 54.371289][ T8332] Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 19 45 31 c9 45 31 c0 b8 2d 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 64 c3 0f 1f 00 55 48 83 ec 20 48 89 54 24 10
[ 54.371292][ T8332] RSP: 002b:00007ffed3785308 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
[ 54.371297][ T8332] RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 00007f3c6caaac2c
[ 54.371299][ T8332] RDX: 0000000000000040 RSI: 00007ffed37853a0 RDI: 0000000000000004
[ 54.371301][ T8332] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
[ 54.371303][ T8332] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
[ 54.371305][ T8332] R13: 00007ffed37853a0 R14: 0000558cc9904118 R15: 0000000000000000
[ 54.371309][ T8332] </TASK>
[ 54.371311][ T8332]
[ 54.394932][ T8332] Freed by task 8332 on cpu 0 at 54.364772s:
[ 54.395528][ T8332] kasan_save_track (mm/kasan/common.c:57 mm/kasan/common.c:78)
[ 54.395997][ T8332] kasan_save_free_info (mm/kasan/generic.c:584)
[ 54.396501][ T8332] __kasan_slab_free (mm/kasan/common.c:253 mm/kasan/common.c:285)
[ 54.396983][ T8332] kfree (include/linux/kasan.h:235 mm/slub.c:2689 mm/slub.c:6251 mm/slub.c:6566)
[ 54.397378][ T8332] virtio_dev_remove (drivers/virtio/virtio.c:375)
[ 54.397869][ T8332] device_release_driver_internal (drivers/base/dd.c:619 drivers/base/dd.c:1352 drivers/base/dd.c:1375)
[ 54.398475][ T8332] unbind_store (drivers/base/bus.c:244)
[ 54.398944][ T8332] kernfs_fop_write_iter (fs/kernfs/file.c:352)
[ 54.399476][ T8332] vfs_write (fs/read_write.c:595 fs/read_write.c:688)
[ 54.399915][ T8332] ksys_write (fs/read_write.c:740)
[ 54.400349][ T8332] do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
[ 54.400818][ T8332] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
[ 54.401406][ T8332]
[ 54.401650][ T8332] The buggy address belongs to the object at ffff888124a47000
[ 54.401650][ T8332] which belongs to the cache kmalloc-192 of size 192
[ 54.403038][ T8332] The buggy address is located 16 bytes inside of
[ 54.403038][ T8332] freed 192-byte region [ffff888124a47000, ffff888124a470c0)
[ 54.404385][ T8332]
Best,
Shuangpeng
^ permalink raw reply
* [GIT PULL] Crypto Update for 7.2
From: Herbert Xu @ 2026-06-15 3:37 UTC (permalink / raw)
To: Linus Torvalds, David S. Miller, Linux Kernel Mailing List,
Linux Crypto Mailing List
Hi Linus:
The following changes since commit d1fa83ecac31093a550534a79a33bc7f4ba8fc10:
rhashtable: Add bucket_table_free_atomic() helper (2026-05-05 16:12:07 +0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 tags/v7.2-p1
for you to fetch changes up to 6ea0ce3a19f9c37a014099e2b0a46b27fa164564:
crypto: tegra - fix refcount leak in tegra_se_host1x_submit() (2026-06-12 09:56:45 +0800)
----------------------------------------------------------------
This update includes the following changes:
API:
- Drop support for off-CPU cryptography in af_alg.
- Document that af_alg is *always* slower.
- Document the deprecation of af_alg.
- Remove zero-copy support from skcipher and aead in af_alg.
- Cap AEAD AD length to 0x80000000 in af_alg.
- Free default RNG on module exit.
Algorithms:
- Fix vli multiplication carry overflow in ecc.
- Drop unused cipher_null crypto_alg.
- Remove unused variants of drbg.
- Use lib/crypto in drbg.
- Use memcpy_from/to_sglist in authencesn.
- Allow authenc(hmac(sha{256,384}),cts(cbc(aes))) in FIPS mode.
- Disallow RSA PKCS#1 SHA-1 sig algs in FIPS mode.
- Filter out async aead implementations at alloc in krb5.
- Fix non-parallel fallback by restoring callback in pcrypt.
- Validate poly1305 template argument in chacha20poly1305.
Drivers:
- Add sysfs PCI reset support to qat.
- Add KPT support for GEN6 devices to qat.
- Remove unused character device and ioctls from qat.
- Add support for hw access via SMCC to mtk.
- Remove prng support from crypto4xx.
- Remove prng support from hisi-trng.
- Remove prng support from sun4i-ss.
- Remove prng support from xilinx-trng.
- Remove loongson-rng.
- Remove exynos-rng.
Others:
- Remove support for AIO on sockets.
----------------------------------------------------------------
Abdun Nihaal (1):
crypto: safexcel - Fix potential memory leak in safexcel_pci_probe()
Ahsan Atta (9):
crypto: qat - keep VFs enabled during reset
crypto: qat - notify fatal error before AER reset preparation
crypto: qat - centralize bus master enable
crypto: qat - skip restart for down devices
crypto: qat - factor out AER reset helpers
crypto: qat - handle sysfs-triggered reset callbacks
crypto: qat - fix restarting state leak on allocation failure
crypto: qat - protect service table iterations with service_lock
crypto: qat - use pci logging variants for PCI-specific messages
Aleksander Jan Bajkowski (4):
crypto: safexcel - Remove repeated plus
crypto: eip93 - fix reset ring register definition
crypto: inside-secure/eip93 - Drop superfluous blank line
crypto: inside-secure/eip93 - Add check for devm_request_threaded_irq
Alexander Koskovich (1):
dt-bindings: crypto: qcom-qce: Document the Milos crypto engine
Alexey Kardashevskiy (1):
crypto: ccp/tsm - Enable the root port after the endpoint
Anastasia Tishchenko (1):
crypto: ecc - Fix carry overflow in vli multiplication
Ard Biesheuvel (1):
crypto: crypto_null - Drop unused cipher_null crypto_alg
Arnd Bergmann (1):
crypto: sun8i-ss - avoid hash and rng references
Bartosz Golaszewski (2):
dt-bindings: crypto: qcom-qce: document the Nord crypto engine
MAINTAINERS: make myself the maintainer of the Qualcomm QCE driver
Costa Shulyupin (1):
include: Remove unused crypto-ux500.h
Damian Muszynski (1):
crypto: qat - fix heartbeat error injection
Daniel Golle (3):
dt-bindings: rng: mtk-rng: fix style problems in example
dt-bindings: rng: mtk-rng: add SMC-based TRNG variants
hwrng: mtk - add support for hw access via SMCC
Dawei Feng (1):
crypto: amlogic - avoid double cleanup in meson_crypto_probe()
Deepti Jaggi (1):
dt-bindings: crypto: qcom,prng: Document TRNG on Nord SoC
Demi Marie Obenour (3):
net: Remove support for AIO on sockets
crypto: af_alg - Drop support for off-CPU cryptography
crypto: af_alg - Document that it is *always* slower
Eric Biggers (53):
crypto: drbg - Fix returning success on failure in CTR_DRBG
crypto: drbg - Fix misaligned writes in CTR_DRBG and HASH_DRBG
crypto: drbg - Fix ineffective sanity check
crypto: drbg - Fix drbg_max_addtl() on 64-bit kernels
crypto: drbg - Fix the fips_enabled priority boost
crypto: drbg - Remove always-enabled symbol CRYPTO_DRBG_HMAC
crypto: drbg - Remove broken commented-out code
crypto: drbg - Remove unhelpful helper functions
crypto: drbg - Remove obsolete FIPS 140-2 continuous test
crypto: drbg - Fold include/crypto/drbg.h into crypto/drbg.c
crypto: drbg - Remove import of crypto_cipher functions
crypto: drbg - Remove support for CTR_DRBG
crypto: drbg - Remove support for HASH_DRBG
crypto: drbg - Flatten the DRBG menu
crypto: testmgr - Add test for drbg_pr_hmac_sha512
crypto: testmgr - Update test for drbg_nopr_hmac_sha512
crypto: drbg - Remove support for HMAC-SHA256 and HMAC-SHA384
crypto: drbg - Simplify algorithm registration
crypto: drbg - De-virtualize drbg_state_ops
crypto: drbg - Move fixed values into constants
crypto: drbg - Embed V and C into struct drbg_state
crypto: drbg - Use HMAC-SHA512 library API
crypto: drbg - Remove drbg_core
crypto: drbg - Install separate seed functions for pr and nopr
crypto: drbg - Move module aliases to end of file
crypto: drbg - Consolidate "instantiate" logic and remove drbg_state::C
crypto: drbg - Eliminate use of 'drbg_string' and lists
crypto: drbg - Simplify drbg_generate_long() and fold into caller
crypto: drbg - Put rng_alg methods in logical order
crypto: drbg - Fold drbg_instantiate() into drbg_kcapi_seed()
crypto: drbg - Separate "reseed" case in drbg_kcapi_seed()
crypto: drbg - Fold drbg_prepare_hrng() into drbg_kcapi_seed()
crypto: drbg - Simplify "uninstantiate" logic
crypto: drbg - Include get_random_bytes() output in additional input
crypto: drbg - Change DRBG_MAX_REQUESTS to 4096
crypto: drbg - Remove redundant reseeding based on random.c state
crypto: drbg - Clean up generation code
crypto: drbg - Clean up loop in drbg_hmac_update()
crypto: af_alg - Document the deprecation of AF_ALG
crypto: af_alg - Remove zero-copy support from skcipher and aead
crypto: drbg - Rename MAX_ADDTL => MAX_ADDTL_BYTES
crypto: drbg - Remove support for "prediction resistance"
crypto: loongson - Select CRYPTO_RNG
crypto: crypto4xx - Remove insecure and unused rng_alg
crypto: loongson - Remove broken and unused loongson-rng
crypto: hisi-trng - Remove crypto_rng interface
hwrng: hisi-trng - Move hisi-trng into drivers/char/hw_random/
crypto: exynos-rng - Remove exynos-rng driver
crypto: xilinx-trng - Remove crypto_rng interface
crypto: xilinx-trng - Fix return value of xtrng_hwrng_trng_read()
crypto: xilinx-trng - Replace crypto_drbg_ctr_df() with HMAC-SHA512
hwrng: xilinx - Move xilinx-rng into drivers/char/hw_random/
crypto: sun4i-ss - Remove insecure and unused rng_alg
Ethan Nelson-Moore (2):
LoongArch: Remove unused arch/loongarch/crypto directory
MIPS: Remove unused arch/mips/crypto directory
Felix Gu (2):
crypto: marvell/octeontx - fix DMA cleanup using wrong loop index
crypto: cavium/cpt - fix DMA cleanup using wrong loop index
Fiona Trahe (1):
Documentation: qat_rl: make rate limiting wording clearer
Giovanni Cabiddu (5):
crypto: qat - remove unused character device and IOCTLs
crypto: qat - rename adf_ctl_drv.c to adf_module.c
crypto: qat - remove MODULE_VERSION
crypto: qat - fix VF2PF work teardown race in adf_disable_sriov()
crypto: qat - validate RSA CRT component lengths
Harshal Dev (2):
dt-bindings: crypto: qcom-qce: Document the Glymur crypto engine
dt-bindings: crypto: qcom,prng: Document Glymur TRNG
Herbert Xu (5):
crypto: authencesn - Use memcpy_from/to_sglist
crypto: af_alg - Cap AEAD AD length to 0x80000000
crypto: tegra - Fix dma_free_coherent size error
crypto: tegra - Return ENOMEM when input buffer allocation fails for ccm
crypto: rng - Free default RNG on module exit
Ilya Dryomov (1):
crypto: testmgr - allow authenc(hmac(sha{256,384}),cts(cbc(aes))) in FIPS mode
Jeff Barnes (1):
crypto: testmgr - disallow RSA PKCS#1 SHA-1 sig algs in FIPS mode
Julian Braha (1):
keys: cleanup dead code in Kconfig for FIPS_SIGNATURE_SELFTEST
Junyuan Wang (1):
crypto: qat - add KPT support for GEN6 devices
Krzysztof Kozlowski (2):
dt-bindings: crypto: qcom-qce: Add Qualcomm Eliza QCE
crypto: drivers - Move MODULE_DEVICE_TABLE next to the table itself
Lothar Rubusch (1):
crypto: atmel-sha204a - fix blocking and non-blocking rng logic
Lukas Wunner (2):
crypto: ecc - Unbreak the build on arm with CONFIG_KASAN_STACK=y
X.509: Fix validation of ASN.1 certificate header
Manivannan Sadhasivam (2):
dt-bindings: crypto: qcom,prng: Document Hawi TRNG
dt-bindings: crypto: qcom,inline-crypto-engine: Document Hawi ICE
Michael Bommarito (1):
crypto: krb5 - filter out async aead implementations at alloc
Mikko Perttunen (1):
crypto: tegra - Don't touch bo refcount in host1x bo pin/unpin
Paul Louvel (11):
crypto: talitos - use dma_sync_single_for_cpu() before reading descriptor header
crypto: talitos - add chaining of arbitrary number of descriptor for the SEC1
crypto: talitos - move dma unmapping code in flush_channel() into a standalone dma_unmap_request() function
crypto: talitos - move dma mapping code in talitos_submit() into a standalone dma_map_request() function
crypto: talitos - move code in current_desc_hdr() into a standalone function
crypto: talitos/hash - prepare SEC1 descriptor chaining, remove additional descriptor
crypto: talitos/hash - use descriptor chaining for SEC1 instead of workqueue
crypto: talitos/hash - drop workqueue mechanism for SEC1
crypto: talitos/hash - rename first_desc/last_desc to first_request/last_request
crypto: talitos/hash - remove useless wrapper
crypto: talitos/hash - fix SEC2 64k - 1 ahash request limitation
Rosen Penev (4):
crypto: cesa - allocate engines with main struct
crypto: talitos - allocate channels with main struct
crypto: talitos - use devm_platform_ioremap_resource()
crypto: amcc - convert irq_of_parse_and_map to platform_get_irq
Ruijie Li (1):
crypto: pcrypt - restore callback for non-parallel fallback
Ruoyu Wang (1):
crypto: ixp4xx - fix buffer chain unwind on allocation failure
Sam James (1):
crypto: nx - fix nx_crypto_ctx_exit argument
Sean Christopherson (1):
crypto: ccp - Treat zero-length cert chain as query for blob lengths
Stepan Ionichev (1):
crypto: ccp/sev-dev-tsm - bail out early when pdev->bus is NULL
Thorsten Blum (37):
crypto: atmel-ecc - add support for atecc608b
dt-bindings: trivial-devices: add atmel,atecc608b
crypto: caam - use print_hex_dump_devel to guard key hex dumps
crypto: caam - use print_hex_dump_devel to guard key hex dumps
crypto: omap - add omap_aes_unregister_algs helper
crypto: omap - add omap_des_unregister_algs helper
crypto: omap - add omap_sham_unregister_algs helper
crypto: starfive - use list_first_entry_or_null to simplify cryp_find_dev
crypto: atmel-sha204a - drop hwrng quality reduction for ATSHA204A
crypto: ecrdsa - fix unknown OID check in ecrdsa_param_curve
crypto: jitterentropy - drop redundant delta check in jent_entropy_init
crypto: jitterentropy - fix URL
hwrng: core - drop unnecessary forward declarations
hwrng: core - use bool for wait parameter in rng_get_data
hwrng: core - use MAX to simplify RNG_BUFFER_SIZE
hwrng: core - use sysfs_emit_at in rng_available_show
crypto: artpec6 - refactor crypto_setup_out_descr for readability
crypto: ccree - replace snprintf("%s") with strscpy
crypto: atmel-ecc - replace min_t with min
crypto: api - use designated initializers for report structs
crypto: atmel-sha204a - drop __maybe_unused and of_match_ptr
crypto: atmel-ecc - drop CONFIG_OF guard and of_match_ptr
crypto: cesa - use max to simplify mv_cesa_probe
crypto: atmel - use min3 to simplify atmel_sha_append_sg
crypto: riscv/aes - replace min_t with min in riscv64_aes_ctr_crypt
crypto: atmel-i2c - drop redundant void * callback cast in enqueue
crypto: drivers - remove of_match_ptr from OF match tables
crypto: atmel-sha - use memcpy_and_pad to simplify hmac_setup
crypto: omap-des - add COMPILE_TEST and fix CONFIG_OF=n build
crypto: omap-des - drop of_match_ptr from OF match table
crypto: atmel-sha204a - remove sysfs group before hwrng
crypto: atmel-sha204a - fail on hwrng registration error in probe path
crypto: ecrdsa - remove empty sig_alg exit callback
crypto: octeontx - use strscpy_pad in ucode_load_store
crypto: powerpc/aes - use min in ppc_{ecb,cbc,ctr,xts}_crypt
crypto: qat - simplify adf_service_mask_to_string helper
crypto: atmel-ecc - drop dead code in atmel_ecdh_max_size
Tycho Andersen (AMD) (8):
crypto: ccp - Reverse the cleanup order in psp_dev_destroy()
crypto: ccp - Fix snp_filter_reserved_mem_regions() off-by-one
crypto: ccp - Check for page allocation failure correctly in TIO
crypto: ccp - Initialize data during __sev_snp_init_locked()
crypto: ccp - Do not initialize SNP for SEV ioctls
crypto: ccp - Do not initialize SNP for ioctl(SNP_COMMIT)
crypto: ccp - Do not initialize SNP for ioctl(SNP_VLEK_LOAD)
crypto: ccp - Do not initialize SNP for ioctl(SNP_CONFIG)
Uwe Kleine-König (The Capable Hub) (6):
hwrng: drivers - Drop unused assignment to pci driver_data
crypto: ccp - Define pci_device_ids using named initializers
crypto: drivers - Drop explicit assigment of 0 in pci_device_id array
crypto: atmel-sha204a - Drop of_device_id data
crypto: atmel-sha204a - Use named initializers for struct i2c_device_id
crypto: atmel-ecc - Use named initializers for struct i2c_device_id
Weili Qian (2):
crypto: hisilicon/qm - disable error report before flr
crypto: hisilicon - mask all error type when removing driver
Weiming Shi (1):
crypto: asymmetric_keys - fix OOB read in pefile_digest_pe_contents
Wentao Liang (2):
hwrng: jh7110 - fix refcount leak in starfive_trng_read()
crypto: tegra - fix refcount leak in tegra_se_host1x_submit()
Xiaonan Zhao (1):
crypto: chacha20poly1305 - validate poly1305 template argument
Zhushuai Yin (3):
crypto: hisilicon/qm - allow VF devices to query hardware isolation status
crypto: hisilicon/qm - place the interrupt status interface after the PM usage counter
crypto: hisilicon/qm - support function-level error reset
Zongyu Wu (1):
crypto: hisilicon/qm - support doorbell enable control
lizhi (1):
crypto: hisilicon/sec2 - lower priority for hisilicon crypto implementations
Documentation/ABI/testing/sysfs-driver-qat_kpt | 97 ++
Documentation/ABI/testing/sysfs-driver-qat_rl | 2 +-
Documentation/crypto/api-samples.rst | 2 +-
Documentation/crypto/userspace-if.rst | 127 +-
.../bindings/crypto/qcom,inline-crypto-engine.yaml | 1 +
.../devicetree/bindings/crypto/qcom,prng.yaml | 3 +
.../devicetree/bindings/crypto/qcom-qce.yaml | 4 +
Documentation/devicetree/bindings/rng/mtk-rng.yaml | 43 +-
.../devicetree/bindings/trivial-devices.yaml | 4 +-
Documentation/userspace-api/ioctl/ioctl-number.rst | 1 -
MAINTAINERS | 17 +-
arch/arm/configs/exynos_defconfig | 1 -
arch/arm/configs/multi_v7_defconfig | 1 -
arch/arm/configs/sunxi_defconfig | 1 -
arch/arm64/configs/defconfig | 4 +-
arch/loongarch/Makefile | 2 -
arch/loongarch/configs/loongson32_defconfig | 1 -
arch/loongarch/configs/loongson64_defconfig | 1 -
arch/loongarch/crypto/Kconfig | 5 -
arch/loongarch/crypto/Makefile | 4 -
arch/m68k/configs/amiga_defconfig | 2 -
arch/m68k/configs/apollo_defconfig | 2 -
arch/m68k/configs/atari_defconfig | 2 -
arch/m68k/configs/bvme6000_defconfig | 2 -
arch/m68k/configs/hp300_defconfig | 2 -
arch/m68k/configs/mac_defconfig | 2 -
arch/m68k/configs/multi_defconfig | 2 -
arch/m68k/configs/mvme147_defconfig | 2 -
arch/m68k/configs/mvme16x_defconfig | 2 -
arch/m68k/configs/q40_defconfig | 2 -
arch/m68k/configs/sun3_defconfig | 2 -
arch/m68k/configs/sun3x_defconfig | 2 -
arch/mips/Makefile | 2 -
arch/mips/configs/decstation_64_defconfig | 2 -
arch/mips/configs/decstation_defconfig | 2 -
arch/mips/configs/decstation_r4k_defconfig | 2 -
arch/mips/crypto/.gitignore | 2 -
arch/mips/crypto/Kconfig | 5 -
arch/mips/crypto/Makefile | 5 -
arch/powerpc/crypto/aes-spe-glue.c | 9 +-
arch/riscv/crypto/aes-riscv64-glue.c | 4 +-
crypto/Kconfig | 120 +-
crypto/Makefile | 7 +-
crypto/acompress.c | 8 +-
crypto/aead.c | 10 +-
crypto/af_alg.c | 106 +-
crypto/ahash.c | 8 +-
crypto/akcipher.c | 8 +-
crypto/algif_aead.c | 51 +-
crypto/algif_hash.c | 4 +-
crypto/algif_rng.c | 4 +-
crypto/algif_skcipher.c | 66 +-
crypto/asymmetric_keys/Kconfig | 1 -
crypto/asymmetric_keys/verify_pefile.c | 2 +
crypto/asymmetric_keys/x509_loader.c | 2 +-
crypto/authencesn.c | 33 +-
crypto/chacha20poly1305.c | 11 +-
crypto/crypto_null.c | 35 +-
crypto/crypto_user.c | 14 +-
crypto/df_sp80090a.c | 222 ---
crypto/drbg.c | 1819 +++-----------------
crypto/ecc.c | 31 +-
crypto/ecrdsa.c | 7 +-
crypto/jitterentropy.c | 6 +-
crypto/kpp.c | 8 +-
crypto/krb5/krb5_api.c | 2 +-
crypto/lskcipher.c | 10 +-
crypto/pcrypt.c | 4 +
crypto/rng.c | 19 +-
crypto/scompress.c | 8 +-
crypto/shash.c | 8 +-
crypto/sig.c | 6 +-
crypto/skcipher.c | 10 +-
crypto/testmgr.c | 162 +-
crypto/testmgr.h | 1080 +++---------
drivers/char/hw_random/Kconfig | 21 +
drivers/char/hw_random/Makefile | 2 +
drivers/char/hw_random/amd-rng.c | 6 +-
drivers/char/hw_random/cavium-rng.c | 4 +-
drivers/char/hw_random/core.c | 30 +-
drivers/char/hw_random/geode-rng.c | 4 +-
drivers/char/hw_random/hisi-trng-v2.c | 98 ++
drivers/char/hw_random/jh7110-trng.c | 13 +-
drivers/char/hw_random/mtk-rng.c | 125 +-
.../xilinx => char/hw_random}/xilinx-trng.c | 135 +-
drivers/crypto/Kconfig | 37 +-
drivers/crypto/Makefile | 2 -
drivers/crypto/allwinner/Kconfig | 10 -
drivers/crypto/allwinner/sun4i-ss/Makefile | 1 -
drivers/crypto/allwinner/sun4i-ss/sun4i-ss-core.c | 36 -
drivers/crypto/allwinner/sun4i-ss/sun4i-ss-prng.c | 69 -
drivers/crypto/allwinner/sun4i-ss/sun4i-ss.h | 20 -
drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c | 12 +
drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c | 12 +
drivers/crypto/amcc/crypto4xx_core.c | 94 +-
drivers/crypto/amcc/crypto4xx_core.h | 6 +-
drivers/crypto/amcc/crypto4xx_reg_def.h | 11 -
drivers/crypto/amlogic/amlogic-gxl-core.c | 2 +-
drivers/crypto/atmel-ecc.c | 29 +-
drivers/crypto/atmel-i2c.c | 2 +-
drivers/crypto/atmel-sha.c | 11 +-
drivers/crypto/atmel-sha204a.c | 50 +-
drivers/crypto/axis/artpec6_crypto.c | 21 +-
drivers/crypto/bcm/cipher.c | 6 +-
drivers/crypto/caam/caamalg.c | 12 +-
drivers/crypto/caam/caamalg_qi.c | 12 +-
drivers/crypto/caam/caamalg_qi2.c | 12 +-
drivers/crypto/caam/caamhash.c | 4 +-
drivers/crypto/caam/key_gen.c | 4 +-
drivers/crypto/cavium/cpt/cptpf_main.c | 2 +-
drivers/crypto/cavium/cpt/cptvf_main.c | 6 +-
drivers/crypto/cavium/cpt/cptvf_reqmanager.c | 4 +-
drivers/crypto/cavium/nitrox/nitrox_main.c | 4 +-
drivers/crypto/ccp/psp-dev.c | 12 +-
drivers/crypto/ccp/sev-dev-tsm.c | 23 +-
drivers/crypto/ccp/sev-dev.c | 96 +-
drivers/crypto/ccp/sp-pci.c | 28 +-
drivers/crypto/ccree/cc_aead.c | 9 +-
drivers/crypto/ccree/cc_cipher.c | 7 +-
drivers/crypto/ccree/cc_hash.c | 13 +-
drivers/crypto/exynos-rng.c | 399 -----
drivers/crypto/hisilicon/Kconfig | 8 -
drivers/crypto/hisilicon/Makefile | 1 -
drivers/crypto/hisilicon/hpre/hpre_main.c | 19 +-
drivers/crypto/hisilicon/qm.c | 334 +++-
drivers/crypto/hisilicon/sec2/sec_crypto.c | 2 +-
drivers/crypto/hisilicon/sec2/sec_main.c | 13 +-
drivers/crypto/hisilicon/trng/Makefile | 2 -
drivers/crypto/hisilicon/trng/trng.c | 390 -----
drivers/crypto/hisilicon/zip/zip_main.c | 20 +-
drivers/crypto/inside-secure/eip93/eip93-main.c | 3 +-
drivers/crypto/inside-secure/eip93/eip93-regs.h | 2 +-
drivers/crypto/inside-secure/safexcel.c | 4 +-
drivers/crypto/intel/ixp4xx/ixp4xx_crypto.c | 25 +-
drivers/crypto/intel/qat/qat_420xx/adf_drv.c | 11 +-
drivers/crypto/intel/qat/qat_4xxx/adf_drv.c | 11 +-
.../crypto/intel/qat/qat_6xxx/adf_6xxx_hw_data.c | 21 +-
.../crypto/intel/qat/qat_6xxx/adf_6xxx_hw_data.h | 9 +
drivers/crypto/intel/qat/qat_6xxx/adf_drv.c | 8 +-
drivers/crypto/intel/qat/qat_c3xxx/adf_drv.c | 6 +-
drivers/crypto/intel/qat/qat_c3xxxvf/adf_drv.c | 4 +-
drivers/crypto/intel/qat/qat_c62x/adf_drv.c | 6 +-
drivers/crypto/intel/qat/qat_c62xvf/adf_drv.c | 4 +-
drivers/crypto/intel/qat/qat_common/Makefile | 4 +-
.../intel/qat/qat_common/adf_accel_devices.h | 4 +
drivers/crypto/intel/qat/qat_common/adf_admin.c | 39 +
drivers/crypto/intel/qat/qat_common/adf_admin.h | 2 +
drivers/crypto/intel/qat/qat_common/adf_aer.c | 122 +-
drivers/crypto/intel/qat/qat_common/adf_cfg.c | 10 -
drivers/crypto/intel/qat/qat_common/adf_cfg.h | 1 -
.../crypto/intel/qat/qat_common/adf_cfg_common.h | 32 -
.../crypto/intel/qat/qat_common/adf_cfg_services.c | 7 +-
drivers/crypto/intel/qat/qat_common/adf_cfg_user.h | 38 -
.../crypto/intel/qat/qat_common/adf_common_drv.h | 14 +-
drivers/crypto/intel/qat/qat_common/adf_ctl_drv.c | 466 -----
drivers/crypto/intel/qat/qat_common/adf_dev_mgr.c | 70 -
.../intel/qat/qat_common/adf_heartbeat_inject.c | 6 +-
.../crypto/intel/qat/qat_common/adf_hw_arbiter.c | 25 -
drivers/crypto/intel/qat/qat_common/adf_init.c | 26 +
drivers/crypto/intel/qat/qat_common/adf_isr.c | 39 +
drivers/crypto/intel/qat/qat_common/adf_kpt.c | 56 +
drivers/crypto/intel/qat/qat_common/adf_kpt.h | 29 +
drivers/crypto/intel/qat/qat_common/adf_module.c | 63 +
drivers/crypto/intel/qat/qat_common/adf_sriov.c | 34 +-
.../crypto/intel/qat/qat_common/adf_sysfs_kpt.c | 296 ++++
.../crypto/intel/qat/qat_common/adf_sysfs_kpt.h | 10 +
.../intel/qat/qat_common/icp_qat_fw_init_admin.h | 8 +
drivers/crypto/intel/qat/qat_common/icp_qat_hw.h | 3 +-
.../crypto/intel/qat/qat_common/qat_asym_algs.c | 10 +-
drivers/crypto/intel/qat/qat_dh895xcc/adf_drv.c | 6 +-
drivers/crypto/intel/qat/qat_dh895xccvf/adf_drv.c | 4 +-
drivers/crypto/loongson/Kconfig | 5 -
drivers/crypto/loongson/Makefile | 1 -
drivers/crypto/loongson/loongson-rng.c | 209 ---
drivers/crypto/marvell/cesa/cesa.c | 16 +-
drivers/crypto/marvell/cesa/cesa.h | 42 +-
drivers/crypto/marvell/octeontx/otx_cptpf_main.c | 2 +-
drivers/crypto/marvell/octeontx/otx_cptpf_ucode.c | 5 +-
drivers/crypto/marvell/octeontx/otx_cptvf_main.c | 6 +-
drivers/crypto/marvell/octeontx/otx_cptvf_reqmgr.c | 4 +-
drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c | 2 +-
drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c | 8 +-
drivers/crypto/nx/nx.c | 6 +-
drivers/crypto/nx/nx.h | 2 +-
drivers/crypto/omap-aes.c | 43 +-
drivers/crypto/omap-des.c | 32 +-
drivers/crypto/omap-sham.c | 27 +-
drivers/crypto/qcom-rng.c | 2 +-
drivers/crypto/starfive/jh7110-cryp.c | 17 +-
drivers/crypto/talitos.c | 592 ++++---
drivers/crypto/talitos.h | 17 +-
drivers/crypto/tegra/tegra-se-aes.c | 33 +-
drivers/crypto/tegra/tegra-se-main.c | 5 +-
drivers/crypto/xilinx/Makefile | 1 -
include/crypto/df_sp80090a.h | 28 -
include/crypto/drbg.h | 263 ---
include/crypto/if_alg.h | 19 +-
include/crypto/internal/drbg.h | 54 -
include/linux/hisi_acc_qm.h | 15 +-
include/linux/platform_data/crypto-ux500.h | 22 -
include/linux/socket.h | 1 -
io_uring/net.c | 1 -
net/compat.c | 1 -
net/socket.c | 7 +-
tools/perf/trace/beauty/include/linux/socket.h | 1 -
tools/testing/crypto/chacha20-s390/test-cipher.c | 1 -
206 files changed, 2971 insertions(+), 6632 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-driver-qat_kpt
delete mode 100644 arch/loongarch/crypto/Kconfig
delete mode 100644 arch/loongarch/crypto/Makefile
delete mode 100644 arch/mips/crypto/.gitignore
delete mode 100644 arch/mips/crypto/Kconfig
delete mode 100644 arch/mips/crypto/Makefile
delete mode 100644 crypto/df_sp80090a.c
create mode 100644 drivers/char/hw_random/hisi-trng-v2.c
rename drivers/{crypto/xilinx => char/hw_random}/xilinx-trng.c (75%)
delete mode 100644 drivers/crypto/allwinner/sun4i-ss/sun4i-ss-prng.c
delete mode 100644 drivers/crypto/exynos-rng.c
delete mode 100644 drivers/crypto/hisilicon/trng/Makefile
delete mode 100644 drivers/crypto/hisilicon/trng/trng.c
delete mode 100644 drivers/crypto/intel/qat/qat_common/adf_cfg_user.h
delete mode 100644 drivers/crypto/intel/qat/qat_common/adf_ctl_drv.c
create mode 100644 drivers/crypto/intel/qat/qat_common/adf_kpt.c
create mode 100644 drivers/crypto/intel/qat/qat_common/adf_kpt.h
create mode 100644 drivers/crypto/intel/qat/qat_common/adf_module.c
create mode 100644 drivers/crypto/intel/qat/qat_common/adf_sysfs_kpt.c
create mode 100644 drivers/crypto/intel/qat/qat_common/adf_sysfs_kpt.h
delete mode 100644 drivers/crypto/loongson/Kconfig
delete mode 100644 drivers/crypto/loongson/Makefile
delete mode 100644 drivers/crypto/loongson/loongson-rng.c
delete mode 100644 include/crypto/df_sp80090a.h
delete mode 100644 include/crypto/drbg.h
delete mode 100644 include/crypto/internal/drbg.h
delete mode 100644 include/linux/platform_data/crypto-ux500.h
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 2/2] dt-bindings: crypto: qcom,inline-crypto-engine: Document Maili ICE
From: Jingyi Wang @ 2026-06-15 6:12 UTC (permalink / raw)
To: Krzysztof Kozlowski
Cc: Herbert Xu, David S. Miller, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Vinod Koul, Bjorn Andersson, aiqun.yu,
tingwei.zhang, trilok.soni, yijie.yang, linux-arm-msm,
linux-crypto, devicetree, linux-kernel
In-Reply-To: <20260610-mighty-dalmatian-of-piety-2fa184@quoll>
On 6/10/2026 4:55 PM, Krzysztof Kozlowski wrote:
> On Tue, Jun 09, 2026 at 02:08:57AM -0700, Jingyi Wang wrote:
>> The Inline Crypto Engine found on Maili SoC is compatible with the common
>> baseline IP 'qcom,inline-crypto-engine'. Hence, document the compatible as
>> such.
>>
>> Signed-off-by: Jingyi Wang <jingyi.wang@oss.qualcomm.com>
>> ---
>> Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml b/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml
>> index db895c50e2d2..c9489f6b8081 100644
>> --- a/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml
>> +++ b/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml
>> @@ -16,6 +16,7 @@ properties:
>> - qcom,eliza-inline-crypto-engine
>> - qcom,hawi-inline-crypto-engine
>> - qcom,kaanapali-inline-crypto-engine
>> + - qcom,maili-inline-crypto-engine
>
> Why clocks are flexible?
I have just noticed that this patch has been merged:
https://lore.kernel.org/all/20260416-qcom_ice_power_and_clk_vote-v5-1-5ccf5d7e2846@oss.qualcomm.com/
I will add qcom,maili-inline-crypto-engine to the eliza/milos list in the next version.
(Maybe hawi should be added there as well?)
Thanks,
Jingyi
>
> Best regards,
> Krzysztof
>