Linux cryptographic layer development
* Re: [PATCH v2] lib/raid/xor: x86: Add AVX-512 optimized xor_gen()
From: Eric Biggers @ 2026-06-15 18:44 UTC (permalink / raw)
  To: David Laight
  Cc: Andrew Morton, linux-kernel, Christoph Hellwig, linux-crypto, x86,
	linux-raid
In-Reply-To: <20260614111628.00af46b9@pumpkin>

On Sun, Jun 14, 2026 at 11:16:28AM +0100, David Laight wrote:
> On Sat, 13 Jun 2026 18:03:57 -0700
> Eric Biggers <ebiggers@kernel.org> wrote:
> 
> > Add an implementation of xor_gen() using AVX-512.
> > 
> > It uses 512-bit vectors, i.e. ZMM registers.  It also uses the
> > vpternlogq instruction to do three-input XORs when applicable.
> > 
> > It's enabled on x86_64 CPUs that have AVX512F && !PREFER_YMM.  In
> > practice that means:
> > 
> >     - AMD Zen 4 and later (client and server)
> 
> Doesn't zen4 only have a 256bit bus between the cpu and cache?
> So avx512 reads take two clocks.
> Since this is memory limited it is unlikely to run faster than the
> avx256 version.

On AMD Genoa (Zen 4 server processor), the AVX-512 code added by this
patch is indeed about the same speed as the existing AVX-2 code.

> OTOH if it doesn't cause down-clocking as well then it won't be slower.

Yes, as far as I know that's not an issue on AMD processors, even Zen 4.
The "avoid AVX-512 due to downclocking" rule is historical guidance for
Intel processors that had a bad implementation of AVX-512.  There's no
reason to exclude Zen 4 from executing AVX-512 optimized code.  At worst
it will just be the same, as we're seeing here.

> Since I suggested it :-)
> 
> Reviewed-By: David Laight <david.laight.linux@gmail.com>
> 
> Some 'not very important' comments:
> 
> I did wonder whether moving the loop into the asm() would help.
> gcc has a nasty habit of pessimising loops when you try to be clever.
> It is certainly safer for tight loops like these.

I originally tried leaving the loops to the compiler, but gcc unrolled
the 1x ones by 2x, despite it having no visibility into the asm block.
That broke the intent with the indexed addressing, since to achieve the
unrolling it generated code that incremented the pointers.

So I just ended up moving the loop to the asm, which reliably gives us
the code we want.

> That does have the side effect of making p0 be %1 which doesn't improve
> readability. Either use named parameters or possibly just change p0 to p1 (etc)
> so they match.
> 
> The code should be limited by the memory reads, so the 3-argument xor and
> the interleave of the unroll may make no difference.

The unroll by 2x in the 2 and 3-buffer cases helped a little bit on
Sapphire Rapids.  I don't know exactly why, but it makes sense that
those cases are where the loop overhead is most likely to matter.

> Some cpu do have constraints on the cache alignment in order to do two
> reads per clock, but I've forgotten them and they got better before AVX-512.
> If that were affecting this code (on the tested cpu) then I'd expect the
> interleaved unroll would improve the _4 and _5 functions.
> So it probably doesn't affect this code.

The buffers are always 64-byte aligned here, as documented.

> Using the same loop for the avx-256 and sse (and even smaller) functions could
> well generate code that runs 'pretty much as fast as possible' on older cpu.
> Intel cpu (going back to Sandy bridge) are likely to execute the loop in the
> same number of clocks - but clearly copying half or a quarter of the data.
> But I've no experience of zen1.
> 
> Might be worth doing for avx-256, does anyone care about anything older :-)

Yes, the existing AVX code is probably excessively unrolled.  It
generates almost 4 KiB of code.

- Eric
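
For readers following along, the three-input XOR mentioned in the quoted
patch description can be modeled in portable C.  This is only a sketch:
ternlog64() and xor3() are hypothetical names, and 0x96 is the standard
truth table for a three-way XOR; the thread does not show which immediate
the patch actually passes to vpternlogq.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Portable model of a 64-bit ternary-logic operation.  For each bit
 * position, the three input bits form the index (a << 2 | b << 1 | c)
 * into the 8-entry truth table held in imm8.
 */
static uint64_t ternlog64(uint64_t a, uint64_t b, uint64_t c, uint8_t imm8)
{
	uint64_t r = 0;

	for (int i = 0; i < 64; i++) {
		unsigned int idx = (unsigned int)(((a >> i) & 1) << 2 |
						  ((b >> i) & 1) << 1 |
						  ((c >> i) & 1));

		r |= (uint64_t)((imm8 >> idx) & 1) << i;
	}
	return r;
}

/* Truth table 0x96 has a 1 exactly where an odd number of inputs are 1. */
static uint64_t xor3(uint64_t a, uint64_t b, uint64_t c)
{
	uint64_t r = ternlog64(a, b, c, 0x96);

	assert(r == (a ^ b ^ c));	/* sanity check of the model */
	return r;
}
```

With a single instruction covering three inputs, the 3-buffer case needs
one logic operation per vector where a two-input XOR would need two.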

^ permalink raw reply

* [GIT PULL] Crypto library updates for 7.2
From: Eric Biggers @ 2026-06-15 17:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-crypto, linux-kernel, Ard Biesheuvel, Jason A. Donenfeld,
	Herbert Xu, Arnd Bergmann, Christophe Leroy

The following changes since commit 7fd2df204f342fc17d1a0bfcd474b24232fb0f32:

  Linux 7.1-rc2 (2026-05-03 14:21:25 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git tags/libcrypto-for-linus

for you to fetch changes up to 065f978a0e015c4dd9f536f5c08078a37f5509c1:

  lib/crypto: gf128hash: mark clmul32() as noinline_for_stack (2026-06-11 12:57:49 -0700)

----------------------------------------------------------------

- Drop the last architecture-specific implementation of MD5

- Mark clmul32() as noinline_for_stack to improve codegen in some cases

----------------------------------------------------------------
Arnd Bergmann (1):
      lib/crypto: gf128hash: mark clmul32() as noinline_for_stack

Eric Biggers (1):
      lib/crypto: powerpc/md5: Drop powerpc optimized MD5 code

 lib/crypto/Kconfig           |   5 -
 lib/crypto/Makefile          |   4 -
 lib/crypto/gf128hash.c       |   2 +-
 lib/crypto/md5.c             |  20 ++--
 lib/crypto/powerpc/md5-asm.S | 235 -------------------------------------------
 lib/crypto/powerpc/md5.h     |  12 ---
 6 files changed, 8 insertions(+), 270 deletions(-)
 delete mode 100644 lib/crypto/powerpc/md5-asm.S
 delete mode 100644 lib/crypto/powerpc/md5.h

^ permalink raw reply

* [GIT PULL] CRC updates for 7.2
From: Eric Biggers @ 2026-06-15 17:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-crypto, linux-arm-kernel, linux-kernel, Ard Biesheuvel,
	Christoph Hellwig

The following changes since commit e7ae89a0c97ce2b68b0983cd01eda67cf373517d:

  Linux 7.1-rc5 (2026-05-24 13:48:06 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git tags/crc-for-linus

for you to fetch changes up to cbe44c389ae80362e72696ac08f7c55a83f2a050:

  crypto: aegis128 - Use neon-intrinsics.h on ARM too (2026-05-28 13:14:25 -0700)

----------------------------------------------------------------

Accelerate CRC64-NVME for 32-bit ARM by refactoring the arm64 NEON
intrinsics implementation to be shared by 32-bit and 64-bit.

Also apply a similar cleanup to the 32-bit ARM NEON implementation of
xor_gen(), where it now reuses code from the 64-bit implementation.

----------------------------------------------------------------
Ard Biesheuvel (6):
      ARM: Add a neon-intrinsics.h header like on arm64
      xor/arm: Replace vectorized implementation with arm64's intrinsics
      xor/arm64: Use shared NEON intrinsics implementation from 32-bit ARM
      lib/crc: Turn NEON intrinsics crc64 implementation into common code
      lib/crc: arm: Enable arm64's NEON intrinsics implementation of crc64
      crypto: aegis128 - Use neon-intrinsics.h on ARM too

 Documentation/arch/arm/kernel_mode_neon.rst        |   4 +-
 arch/arm/include/asm/neon-intrinsics.h             |  60 +++++++++
 crypto/Makefile                                    |  10 +-
 crypto/aegis128-neon-inner.c                       |   4 +-
 lib/crc/Kconfig                                    |   1 +
 lib/crc/Makefile                                   |   9 +-
 lib/crc/arm/crc64-neon.h                           |  34 +++++
 lib/crc/arm/crc64.h                                |  36 +++++
 lib/crc/arm64/crc64-neon.h                         |  21 +++
 lib/crc/arm64/crc64.h                              |   4 +-
 lib/crc/{arm64/crc64-neon-inner.c => crc64-neon.c} |  26 +---
 lib/raid/xor/Makefile                              |  13 +-
 lib/raid/xor/arm/xor-neon.c                        |  26 ----
 lib/raid/xor/arm/xor-neon.h                        |   7 +
 lib/raid/xor/arm/xor_arch.h                        |   7 +-
 lib/raid/xor/arm64/xor-eor3.c                      | 146 +++++++++++++++++++++
 lib/raid/xor/xor-8regs.c                           |   2 -
 lib/raid/xor/{arm64 => }/xor-neon.c                | 143 +-------------------
 18 files changed, 338 insertions(+), 215 deletions(-)
 create mode 100644 arch/arm/include/asm/neon-intrinsics.h
 create mode 100644 lib/crc/arm/crc64-neon.h
 create mode 100644 lib/crc/arm/crc64.h
 create mode 100644 lib/crc/arm64/crc64-neon.h
 rename lib/crc/{arm64/crc64-neon-inner.c => crc64-neon.c} (62%)
 delete mode 100644 lib/raid/xor/arm/xor-neon.c
 create mode 100644 lib/raid/xor/arm/xor-neon.h
 create mode 100644 lib/raid/xor/arm64/xor-eor3.c
 rename lib/raid/xor/{arm64 => }/xor-neon.c (56%)

^ permalink raw reply

* [PATCH] hw_random: timeriomem-rng: add configurable read width and data mask
From: Jad Keskes @ 2026-06-15 17:09 UTC (permalink / raw)
  To: Olivia Mackall, Herbert Xu
  Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Alexander Clouter,
	linux-crypto, devicetree, linux-kernel, Jad Keskes

The TODO for supporting read sizes other than 32 bits and masking has
been sitting in this driver since 2009.  Implement it.

Add width (8, 16, or 32 bits) and mask properties to the platform data
and device tree bindings.  The read loop dispatches on width using
readb/readw/readl so a configured 8-bit access doesn't trigger a bus
error on hardware that rejects 32-bit reads to that address.  The mask
is ANDed with the value before storing.

These are platform properties, not runtime policy -- width depends on
SoC integration, mask reflects which output bits carry entropy.

The alignment check in probe is updated to verify the resource is
aligned to the configured width instead of hardcoding 4-byte alignment.
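
As a rough userspace illustration of the behavior described above (not
the driver code itself), the width dispatch, mask, and alignment check
could look like the sketch below.  read_sample() and region_ok() are
hypothetical names, and plain volatile loads stand in for
readb()/readw()/readl().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read one sample of the configured width, AND it with the mask, and
 * store it to out.  Returns the number of bytes stored, or -1 for an
 * unsupported width.
 */
static int read_sample(const volatile void *reg, unsigned int width,
		       uint32_t mask, void *out)
{
	switch (width) {
	case 8: {
		uint8_t v = *(const volatile uint8_t *)reg & mask;

		memcpy(out, &v, sizeof(v));
		return (int)sizeof(v);
	}
	case 16: {
		uint16_t v = *(const volatile uint16_t *)reg & mask;

		memcpy(out, &v, sizeof(v));
		return (int)sizeof(v);
	}
	case 32: {
		uint32_t v = *(const volatile uint32_t *)reg & mask;

		memcpy(out, &v, sizeof(v));
		return (int)sizeof(v);
	}
	default:
		return -1;
	}
}

/* The region must be aligned to, and at least as large as, one access. */
static int region_ok(uint64_t start, uint64_t size, unsigned int width)
{
	unsigned int bytes = width / 8;

	assert(width == 8 || width == 16 || width == 32);
	return start % bytes == 0 && size >= bytes;
}
```

For example, a one-byte region at an odd address passes region_ok() for
width 8 but not for width 16 or 32.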

Signed-off-by: Jad Keskes <inasj268@gmail.com>
---
 .../bindings/rng/timeriomem-rng.yaml          | 76 ++++++++++++++++++
 drivers/char/hw_random/timeriomem-rng.c       | 78 +++++++++++++++----
 include/linux/timeriomem-rng.h                | 12 +++
 3 files changed, 153 insertions(+), 13 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/rng/timeriomem-rng.yaml

diff --git a/Documentation/devicetree/bindings/rng/timeriomem-rng.yaml b/Documentation/devicetree/bindings/rng/timeriomem-rng.yaml
new file mode 100644
index 000000000000..0d8460e9f916
--- /dev/null
+++ b/Documentation/devicetree/bindings/rng/timeriomem-rng.yaml
@@ -0,0 +1,76 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/rng/timeriomem-rng.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Timer IOMEM Hardware Random Number Generator
+
+description: |
+  This binding covers platforms that have a single IO memory address which
+  provides periodic random data.  The driver reads from the address at a
+  fixed interval, returning a configurable-width value masked to the desired
+  bits.
+
+maintainers:
+  - Alexander Clouter <alex@digriz.org.uk>
+
+properties:
+  compatible:
+    enum:
+      - timeriomem_rng
+
+  reg:
+    maxItems: 1
+
+  period:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description:
+      Interval in microseconds between reads.  New random data is expected to
+      be available at this rate.
+
+  quality:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    default: 0
+    description:
+      Estimated entropy per 1024 bits of data, in the same scale as the
+      kernel's hwrng core (0-1024).
+
+  width:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    default: 32
+    enum: [8, 16, 32]
+    description:
+      Access width in bits.  Determines whether the read is performed as
+      an 8-bit, 16-bit, or 32-bit bus access.
+
+  mask:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    default: 0xFFFFFFFF
+    description:
+      Mask applied to the value read from the register.  Bits set to 0 in
+      the mask are cleared in the output data.  Default (no mask) passes
+      all bits through.
+
+required:
+  - compatible
+  - reg
+  - period
+
+additionalProperties: false
+
+examples:
+  - |
+    rng@f0001000 {
+        compatible = "timeriomem_rng";
+        reg = <0xf0001000 0x4>;
+        period = <100000>;
+    };
+
+    rng@f0002000 {
+        compatible = "timeriomem_rng";
+        reg = <0xf0002000 0x1>;
+        period = <50000>;
+        width = <8>;
+        mask = <0xFF>;
+    };
diff --git a/drivers/char/hw_random/timeriomem-rng.c b/drivers/char/hw_random/timeriomem-rng.c
index e61f06393209..930898852147 100644
--- a/drivers/char/hw_random/timeriomem-rng.c
+++ b/drivers/char/hw_random/timeriomem-rng.c
@@ -14,7 +14,9 @@
  *   has to do is provide the address and 'wait time' that new data becomes
  *   available.
  *
- * TODO: add support for reading sizes other than 32bits and masking
+ * The read width (8, 16, or 32 bits) and an optional data mask can be
+ * configured through platform data or device tree properties.  Default is
+ * 32-bit reads with no mask.
  */
 
 #include <linux/completion.h>
@@ -34,6 +36,8 @@ struct timeriomem_rng_private {
 	void __iomem		*io_base;
 	ktime_t			period;
 	unsigned int		present:1;
+	unsigned int		width;
+	u32			mask;
 
 	struct hrtimer		timer;
 	struct completion	completion;
@@ -48,6 +52,7 @@ static int timeriomem_rng_read(struct hwrng *hwrng, void *data,
 		container_of(hwrng, struct timeriomem_rng_private, rng_ops);
 	int retval = 0;
 	int period_us = ktime_to_us(priv->period);
+	int chunk = priv->width / 8;
 
 	/*
 	 * There may not have been enough time for new data to be generated
@@ -71,11 +76,28 @@ static int timeriomem_rng_read(struct hwrng *hwrng, void *data,
 			usleep_range(period_us,
 					period_us + max(1, period_us / 100));
 
-		*(u32 *)data = readl(priv->io_base);
-		retval += sizeof(u32);
-		data += sizeof(u32);
-		max -= sizeof(u32);
-	} while (wait && max > sizeof(u32));
+		switch (priv->width) {
+		case 8: {
+			u8 val = readb(priv->io_base) & priv->mask;
+			*(u8 *)data = val;
+			break;
+		}
+		case 16: {
+			u16 val = readw(priv->io_base) & priv->mask;
+			*(u16 *)data = val;
+			break;
+		}
+		case 32: {
+			u32 val = readl(priv->io_base) & priv->mask;
+			*(u32 *)data = val;
+			break;
+		}
+		}
+
+		retval += chunk;
+		data += chunk;
+		max -= chunk;
+	} while (wait && max > chunk);
 
 	/*
 	 * Block any new callers until the RNG has had time to generate new
@@ -125,11 +147,8 @@ static int timeriomem_rng_probe(struct platform_device *pdev)
 	if (IS_ERR(priv->io_base))
 		return PTR_ERR(priv->io_base);
 
-	if (res->start % 4 != 0 || resource_size(res) < 4) {
-		dev_err(&pdev->dev,
-			"address must be at least four bytes wide and 32-bit aligned\n");
-		return -EINVAL;
-	}
+	priv->width = 32;
+	priv->mask = 0xFFFFFFFF;
 
 	if (pdev->dev.of_node) {
 		int i;
@@ -145,9 +164,42 @@ static int timeriomem_rng_probe(struct platform_device *pdev)
 		if (!of_property_read_u32(pdev->dev.of_node,
 						"quality", &i))
 			priv->rng_ops.quality = i;
+
+		of_property_read_u32(pdev->dev.of_node,
+				     "width", &priv->width);
+		of_property_read_u32(pdev->dev.of_node,
+				     "mask", &priv->mask);
 	} else {
 		period = pdata->period;
 		priv->rng_ops.quality = pdata->quality;
+
+		if (pdata->width_set)
+			priv->width = pdata->width;
+		if (pdata->mask_set)
+			priv->mask = pdata->mask;
+	}
+
+	if (priv->width == 0)
+		priv->width = 32;
+
+	switch (priv->width) {
+	case 8:
+	case 16:
+	case 32:
+		break;
+	default:
+		dev_err(&pdev->dev, "invalid width %u, must be 8, 16, or 32\n",
+			priv->width);
+		return -EINVAL;
+	}
+
+	if (res->start % (priv->width / 8) != 0 ||
+	    resource_size(res) < priv->width / 8) {
+		dev_err(&pdev->dev,
+			"address must be at least %u-bit aligned (%u byte%s)\n",
+			priv->width, priv->width / 8,
+			priv->width / 8 > 1 ? "s" : "");
+		return -EINVAL;
 	}
 
 	priv->period = us_to_ktime(period);
@@ -167,8 +219,8 @@ static int timeriomem_rng_probe(struct platform_device *pdev)
 		return err;
 	}
 
-	dev_info(&pdev->dev, "32bits from 0x%p @ %dus\n",
-			priv->io_base, period);
+	dev_info(&pdev->dev, "%ubit from %p @ %dus\n",
+		 priv->width, priv->io_base, period);
 
 	return 0;
 }
diff --git a/include/linux/timeriomem-rng.h b/include/linux/timeriomem-rng.h
index 672df7fbf6c1..b4202ad2f507 100644
--- a/include/linux/timeriomem-rng.h
+++ b/include/linux/timeriomem-rng.h
@@ -16,6 +16,18 @@ struct timeriomem_rng_data {
 
 	/* bits of entropy per 1024 bits read */
 	unsigned int		quality;
+
+	/* read width (8, 16, or 32), 0 means 32 */
+	unsigned int		width;
+
+	/* set to true if width is explicitly provided */
+	bool			width_set;
+
+	/* mask applied to raw read value */
+	u32			mask;
+
+	/* set to true if mask is explicitly provided */
+	bool			mask_set;
 };
 
 #endif /* _LINUX_TIMERIOMEM_RNG_H */
-- 
2.54.0


^ permalink raw reply related

* Re: [PATCH RESEND 1/6] sock: add sock_kzalloc helper
From: Jakub Kicinski @ 2026-06-15 16:15 UTC (permalink / raw)
  To: Thorsten Blum
  Cc: Herbert Xu, David S. Miller, Eric Dumazet, Kuniyuki Iwashima,
	Paolo Abeni, Willem de Bruijn, Simon Horman, linux-crypto,
	linux-kernel, netdev
In-Reply-To: <ai7JfHTFgFt6YN_K@linux.dev>

On Sun, 14 Jun 2026 17:32:12 +0200 Thorsten Blum wrote:
> Gentle ping? Patch 1/6 still needs an ack from netdev maintainers.

Perhaps other maintainers shared my feeling that this is a waste of
time.

^ permalink raw reply

* [PATCH v2 8/8] crypto: qce - Use fallback for CCM with a fragmented payload
From: Bartosz Golaszewski @ 2026-06-15 15:49 UTC (permalink / raw)
  To: Thara Gopinath, Herbert Xu, David S. Miller, Stanimir Varbanov,
	Eneas U de Queiroz, Kuldeep Singh, Eric Biggers
  Cc: linux-crypto, linux-arm-msm, linux-kernel, brgl,
	Bartosz Golaszewski, stable
In-Reply-To: <20260615-qce-fix-self-tests-v2-0-dc911f1aad42@oss.qualcomm.com>

The crypto engine reliably processes CCM only when the message payload
is a single contiguous buffer. The associated data is already linearized
into a bounce buffer before being submitted, but when the payload itself
is split across multiple scatterlist entries the engine stalls waiting
for input and the request fails with a hardware operation error. This
was uncovered by the crypto self-tests, which feed the algorithms
randomly fragmented buffers.

Detect a payload that spans more than one scatterlist entry (in either
the source or the destination, skipping past the associated data) and
route the request to the software fallback.
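
The check can be pictured with a simplified userspace model in which a
scatterlist is just an array of segment lengths.  This is a sketch
under that assumption; nents_after_skip() and payload_is_fragmented()
are hypothetical names, not kernel helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Skip the first 'skip' bytes (the associated data), then count how
 * many segments are needed to cover 'len' payload bytes.  Returns -1
 * if the list is too short.
 */
static int nents_after_skip(const size_t *seg_len, size_t nsegs,
			    size_t skip, size_t len)
{
	size_t i = 0;
	int nents = 0;

	assert(seg_len != NULL || nsegs == 0);

	/* Step over segments that lie entirely inside the skipped area. */
	while (i < nsegs && skip >= seg_len[i]) {
		skip -= seg_len[i];
		i++;
	}

	for (; i < nsegs && len > 0; i++) {
		size_t avail = seg_len[i] - skip;

		skip = 0;
		nents++;
		len = len > avail ? len - avail : 0;
	}
	return len ? -1 : nents;
}

/* A non-empty payload spanning more than one segment needs the fallback. */
static bool payload_is_fragmented(const size_t *seg_len, size_t nsegs,
				  size_t assoclen, size_t cryptlen)
{
	return cryptlen && nents_after_skip(seg_len, nsegs, assoclen, cryptlen) > 1;
}
```

In this model a payload that shares a segment with the associated data
still counts as contiguous, as long as it does not spill into a second
segment.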

Cc: stable@vger.kernel.org
Fixes: 9363efb4181c ("crypto: qce - Add support for AEAD algorithms")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/aead.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/qce/aead.c b/drivers/crypto/qce/aead.c
index 46d3e3eb53b271e2ce755847bbcc83f81b9bda7e..2f4bb4fcd8265a0f7c2f568dfb8915b8c6b8ecee 100644
--- a/drivers/crypto/qce/aead.c
+++ b/drivers/crypto/qce/aead.c
@@ -500,7 +500,8 @@ static int qce_aead_crypt(struct aead_request *req, int encrypt)
 	struct qce_aead_reqctx *rctx = aead_request_ctx_dma(req);
 	struct qce_aead_ctx *ctx = crypto_aead_ctx(tfm);
 	struct qce_alg_template *tmpl = to_aead_tmpl(tfm);
-	unsigned int blocksize = crypto_aead_blocksize(tfm);
+	unsigned int blocksize = crypto_aead_blocksize(tfm), authsize;
+	struct scatterlist __sg[2], *msg_sg;
 
 	rctx->flags  = tmpl->alg_flags;
 	rctx->flags |= encrypt ? QCE_ENCRYPT : QCE_DECRYPT;
@@ -524,6 +525,27 @@ static int qce_aead_crypt(struct aead_request *req, int encrypt)
 	if (IS_CCM(rctx->flags) && !IS_ALIGNED(rctx->cryptlen, AES_BLOCK_SIZE))
 		ctx->need_fallback = true;
 
+	/*
+	 * The CE reliably processes CCM only when the message payload is a
+	 * single contiguous buffer. The associated data is linearized into a
+	 * bounce buffer before being handed to the engine, but a fragmented
+	 * payload makes the engine stall waiting for input, so route those
+	 * requests to the fallback.
+	 */
+	if (IS_CCM(rctx->flags) && rctx->cryptlen) {
+		authsize = ctx->authsize;
+
+		msg_sg = scatterwalk_ffwd(__sg, req->src, req->assoclen);
+		if (sg_nents_for_len(msg_sg, rctx->cryptlen +
+				     (encrypt ? 0 : authsize)) > 1)
+			ctx->need_fallback = true;
+
+		msg_sg = scatterwalk_ffwd(__sg, req->dst, req->assoclen);
+		if (sg_nents_for_len(msg_sg, rctx->cryptlen +
+				     (encrypt ? authsize : 0)) > 1)
+			ctx->need_fallback = true;
+	}
+
 	/* If fallback is needed, schedule and exit */
 	if (ctx->need_fallback) {
 		/* Reset need_fallback in case the same ctx is used for another transaction */

-- 
2.47.3


^ permalink raw reply related

* [PATCH v2 7/8] crypto: qce - Use a fallback for CCM with a partial final block
From: Bartosz Golaszewski @ 2026-06-15 15:49 UTC (permalink / raw)
  To: Thara Gopinath, Herbert Xu, David S. Miller, Stanimir Varbanov,
	Eneas U de Queiroz, Kuldeep Singh, Eric Biggers
  Cc: linux-crypto, linux-arm-msm, linux-kernel, brgl,
	Bartosz Golaszewski, stable
In-Reply-To: <20260615-qce-fix-self-tests-v2-0-dc911f1aad42@oss.qualcomm.com>

CCM builds on AES-CTR for encryption, and the crypto engine stalls on a
partial final block just as it does for plain ctr(aes): a payload whose
length is not a multiple of the AES block size leaves the operation
incomplete and fails with a hardware operation error. This was caught by
the ccm(aes) crypto self-tests.

Force the software fallback for CCM requests whose message length is not
block aligned, reusing the driver's existing need_fallback mechanism.

Cc: stable@vger.kernel.org
Fixes: 9363efb4181c ("crypto: qce - Add support for AEAD algorithms")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/aead.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/crypto/qce/aead.c b/drivers/crypto/qce/aead.c
index 6a511e5d7f6290a1df0093e463f39f5f2db25f88..46d3e3eb53b271e2ce755847bbcc83f81b9bda7e 100644
--- a/drivers/crypto/qce/aead.c
+++ b/drivers/crypto/qce/aead.c
@@ -516,6 +516,14 @@ static int qce_aead_crypt(struct aead_request *req, int encrypt)
 			ctx->need_fallback = true;
 	}
 
+	/*
+	 * CCM uses AES-CTR internally and the CE stalls on a partial final
+	 * block, so a payload that is not a multiple of the block size has to
+	 * be handled by the fallback.
+	 */
+	if (IS_CCM(rctx->flags) && !IS_ALIGNED(rctx->cryptlen, AES_BLOCK_SIZE))
+		ctx->need_fallback = true;
+
 	/* If fallback is needed, schedule and exit */
 	if (ctx->need_fallback) {
 		/* Reset need_fallback in case the same ctx is used for another transaction */

-- 
2.47.3


^ permalink raw reply related

* [PATCH v2 6/8] crypto: qce - Fix xts-aes-qce for weak keys
From: Bartosz Golaszewski @ 2026-06-15 15:49 UTC (permalink / raw)
  To: Thara Gopinath, Herbert Xu, David S. Miller, Stanimir Varbanov,
	Eneas U de Queiroz, Kuldeep Singh, Eric Biggers
  Cc: linux-crypto, linux-arm-msm, linux-kernel, brgl,
	Bartosz Golaszewski, stable
In-Reply-To: <20260615-qce-fix-self-tests-v2-0-dc911f1aad42@oss.qualcomm.com>

From: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>

The QCE hardware does not support AES XTS mode when key1 and key2 are
equal. The driver was handling this by unconditionally rejecting the
keys with -ENOKEY (-126), regardless of whether FIPS mode is active or
the FORBID_WEAK_KEYS flag is set:

[    5.599170] alg: skcipher: xts-aes-qce setkey failed on test vector 0; expected_error=0, actual_error=-126, flags=0x1
[    5.599184] alg: self-tests for xts(aes) using xts-aes-qce failed (rc=-126)

In general, for weak keys:
- If FIPS mode is active or FORBID_WEAK_KEYS is set: return -EINVAL.
- In non-FIPS mode: accept the key and encrypt successfully.

Since QCE was returning -ENOKEY for non-FIPS mode whereas the
expectation is to encrypt content and return success, the selftest saw a
mismatch and failed.

There are two problems in the QCE behavior:
  * -ENOKEY is returned instead of -EINVAL for the FIPS/weak-key
    rejection case.
  * key1 == key2 is rejected even in non-FIPS mode.

Fix the xts-aes-qce behavior by using the generic helper xts_verify_key()
to reject such keys early with -EINVAL when FIPS mode is active (or
FORBID_WEAK_KEYS is set). In non-FIPS mode, since the QCE hardware cannot
accept the keys, use the software fallback to encrypt the data.
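
The resulting policy can be summarized with a small userspace sketch.
The enum and xts_key_policy() are hypothetical names used only for
illustration, and forbid_weak stands for "FIPS mode active or
FORBID_WEAK_KEYS set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum xts_action {
	XTS_USE_HW,		/* distinct key halves: engine can handle it */
	XTS_USE_FALLBACK,	/* key1 == key2, weak keys allowed */
	XTS_REJECT,		/* maps to -EINVAL in the description above */
};

static enum xts_action xts_key_policy(const uint8_t *key, size_t keylen,
				      bool forbid_weak)
{
	size_t half = keylen / 2;
	bool weak;

	assert(key != NULL && keylen != 0);

	/* An XTS key is two equal-sized halves, so the length must be even. */
	if (keylen % 2)
		return XTS_REJECT;

	/*
	 * memcmp() is fine for a model; the patch uses crypto_memneq(),
	 * which compares in constant time.
	 */
	weak = memcmp(key, key + half, half) == 0;

	if (weak && forbid_weak)
		return XTS_REJECT;

	return weak ? XTS_USE_FALLBACK : XTS_USE_HW;
}
```

Only the weak-key, non-FIPS combination reaches the fallback; every
other accepted key still goes to the hardware.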

Cc: stable@vger.kernel.org
Fixes: f0d078dd6c49 ("crypto: qce - Return unsupported if key1 and key 2 are same for AES XTS algorithm")
Signed-off-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/cipher.h   |  1 +
 drivers/crypto/qce/skcipher.c | 20 +++++++++++++-------
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/crypto/qce/cipher.h b/drivers/crypto/qce/cipher.h
index 850f257d00f3aca0397adc1f703aea690c754d60..daea07551118d444d2f749588bdfe2ae2c6c553f 100644
--- a/drivers/crypto/qce/cipher.h
+++ b/drivers/crypto/qce/cipher.h
@@ -14,6 +14,7 @@
 struct qce_cipher_ctx {
 	u8 enc_key[QCE_MAX_KEY_SIZE];
 	unsigned int enc_keylen;
+	bool use_fallback;
 	struct crypto_skcipher *fallback;
 };
 
diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index cf34278da30b1ffccf230ed194faae2352cb8550..e152a5b559c373b1bd6730a019bbd55609bc45d1 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -14,6 +14,7 @@
 #include <crypto/algapi.h>
 #include <crypto/internal/des.h>
 #include <crypto/internal/skcipher.h>
+#include <crypto/xts.h>
 
 #include "cipher.h"
 
@@ -196,14 +197,17 @@ static int qce_skcipher_setkey(struct crypto_skcipher *ablk, const u8 *key,
 	if (!key || !keylen)
 		return -EINVAL;
 
-	/*
-	 * AES XTS key1 = key2 not supported by crypto engine.
-	 * Revisit to request a fallback cipher in this case.
-	 */
 	if (IS_XTS(flags)) {
+		ret = xts_verify_key(ablk, key, keylen);
+		if (ret)
+			return ret;
 		__keylen = keylen >> 1;
-		if (!memcmp(key, key + __keylen, __keylen))
-			return -ENOKEY;
+		/*
+		 * QCE does not support key1 == key2 for XTS.
+		 * Use fallback cipher in this case.
+		 */
+		ctx->use_fallback = !crypto_memneq(key, key + __keylen,
+						       __keylen);
 	} else {
 		__keylen = keylen;
 	}
@@ -279,13 +283,15 @@ static int qce_skcipher_crypt(struct skcipher_request *req, int encrypt)
 	 * needed in all versions of CE)
 	 * AES-CTR with a partial final block (the CE stalls waiting for a full
 	 * block of input).
+	 * AES-XTS with key1 == key2 (not supported by the CE).
 	 */
 	if (IS_AES(rctx->flags) &&
 	    ((keylen != AES_KEYSIZE_128 && keylen != AES_KEYSIZE_256) ||
 	    (IS_CTR(rctx->flags) && !IS_ALIGNED(req->cryptlen, AES_BLOCK_SIZE)) ||
 	    (IS_XTS(rctx->flags) && ((req->cryptlen <= aes_sw_max_len) ||
 	    (req->cryptlen > QCE_SECTOR_SIZE &&
-	    req->cryptlen % QCE_SECTOR_SIZE))))) {
+	    req->cryptlen % QCE_SECTOR_SIZE))) ||
+	    (IS_XTS(rctx->flags) && ctx->use_fallback))) {
 		skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
 		skcipher_request_set_callback(&rctx->fallback_req,
 					      req->base.flags,

-- 
2.47.3


^ permalink raw reply related

* [PATCH v2 5/8] crypto: qce - Use a fallback for AES-CTR with a partial final block
From: Bartosz Golaszewski @ 2026-06-15 15:49 UTC (permalink / raw)
  To: Thara Gopinath, Herbert Xu, David S. Miller, Stanimir Varbanov,
	Eneas U de Queiroz, Kuldeep Singh, Eric Biggers
  Cc: linux-crypto, linux-arm-msm, linux-kernel, brgl,
	Bartosz Golaszewski, stable
In-Reply-To: <20260615-qce-fix-self-tests-v2-0-dc911f1aad42@oss.qualcomm.com>

ctr(aes) is registered with a block size of 1, so the crypto API hands
the driver requests whose length is not a multiple of the AES block
size. The crypto engine, however, stalls waiting for a full block of
input in that case, leaving the operation incomplete and failing the
request (and the crypto self-tests) with a hardware operation error.

Route AES-CTR requests with a partial final block to the software
fallback, which already handles the other cases the engine cannot.

Cc: stable@vger.kernel.org
Fixes: bb5c863b3d3c ("crypto: qce - fix ctr-aes-qce block, chunk sizes")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/skcipher.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 379b45d2cd952a39c387e84af71238b53f7737e9..cf34278da30b1ffccf230ed194faae2352cb8550 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -277,9 +277,12 @@ static int qce_skcipher_crypt(struct skcipher_request *req, int encrypt)
 	 * AES-XTS request with len > QCE_SECTOR_SIZE and
 	 * is not a multiple of it.(Revisit this condition to check if it is
 	 * needed in all versions of CE)
+	 * AES-CTR with a partial final block (the CE stalls waiting for a full
+	 * block of input).
 	 */
 	if (IS_AES(rctx->flags) &&
 	    ((keylen != AES_KEYSIZE_128 && keylen != AES_KEYSIZE_256) ||
+	    (IS_CTR(rctx->flags) && !IS_ALIGNED(req->cryptlen, AES_BLOCK_SIZE)) ||
 	    (IS_XTS(rctx->flags) && ((req->cryptlen <= aes_sw_max_len) ||
 	    (req->cryptlen > QCE_SECTOR_SIZE &&
 	    req->cryptlen % QCE_SECTOR_SIZE))))) {

-- 
2.47.3


^ permalink raw reply related

* [PATCH v2 4/8] crypto: qce - Fix CTR-AES for partial block requests
From: Bartosz Golaszewski @ 2026-06-15 15:49 UTC (permalink / raw)
  To: Thara Gopinath, Herbert Xu, David S. Miller, Stanimir Varbanov,
	Eneas U de Queiroz, Kuldeep Singh, Eric Biggers
  Cc: linux-crypto, linux-arm-msm, linux-kernel, brgl,
	Bartosz Golaszewski, stable
In-Reply-To: <20260615-qce-fix-self-tests-v2-0-dc911f1aad42@oss.qualcomm.com>

From: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>

In CTR mode, the IV acts as the initial counter block.
Per NIST SP 800-38A, after a CTR mode operation the next unused counter
value is:

IV_next = IV_in + ceil(cryptlen / AES_BLOCK_SIZE)

The skcipher requires req->iv to hold this updated counter on
completion, ensuring chained requests produce correct results.

Referring to Crypto6.0 documentation, Section 2.2.5 says:
"The count value increments automatically once per block of data (in
AES, a block is 16 bytes) based on the value in the
CRYPTO_ENCR_CNTR_MASK registers."

QCE increments its internal counter register once for each full 16-byte
block (for ctr-aes) that it processes. When the request ends in a partial
block, the hardware uses the current counter to generate the keystream
but does not increment the counter register afterwards. So the counter
value that software later reads back from CRYPTO_ENCR_CNTRn_IVn is one
less than the expected value.

The crypto self-test framework captures this scenario with test vector 4,
which comprises a 499-byte payload (31 full blocks + 3 partial bytes).
Error:
[    5.606169] alg: skcipher: ctr-aes-qce encryption test failed (wrong output IV) on test vector 4, cfg="in-place (one sglist)"
[    5.606176] 00000000: e7 82 1d b8 53 11 ac 47 e2 7d 18 d6 71 0c a7 61
[    5.606192] alg: self-tests for ctr(aes) using ctr-aes-qce failed (rc=-22)
Expected iv_out: 0x62 (iv_in + 32)
Obtained iv_out: 0x61 (iv_in + 31, partial block not counted)

To fix this, increment the counter by 1 for requests that end in a
partial block. For requests that are a whole number of blocks, no action
is needed, as the hardware already returns the expected value.
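
The expected output IV can be reproduced in userspace with the sketch
below, which applies IV_next = IV_in + ceil(cryptlen / AES_BLOCK_SIZE)
to a big-endian counter.  ctr_inc_be() and ctr_advance() are
hypothetical names, and the IV bytes used in the example are made up
rather than taken from the failing test vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AES_BLOCK_SIZE 16

/* Add one to a big-endian counter of 'size' bytes, with carry. */
static void ctr_inc_be(uint8_t *ctr, size_t size)
{
	assert(ctr != NULL);

	while (size--) {
		if (++ctr[size] != 0)
			break;	/* no carry into the next byte */
	}
}

/* Advance the counter by one step per block, rounding a partial block up. */
static void ctr_advance(uint8_t iv[AES_BLOCK_SIZE], size_t cryptlen)
{
	size_t blocks = (cryptlen + AES_BLOCK_SIZE - 1) / AES_BLOCK_SIZE;

	while (blocks--)
		ctr_inc_be(iv, AES_BLOCK_SIZE);
}
```

For a 499-byte request this advances the counter by 32, while a 496-byte
request advances it by 31, which matches the one-block difference seen
in the log above.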

Cc: stable@vger.kernel.org
Fixes: 3e806a12d10a ("crypto: qce - update the skcipher IV")
Signed-off-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/skcipher.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 68b83e3ae088ae42a7fb2a2f0c2e132acf62e849..379b45d2cd952a39c387e84af71238b53f7737e9 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -11,6 +11,7 @@
 #include <linux/types.h>
 #include <linux/errno.h>
 #include <crypto/aes.h>
+#include <crypto/algapi.h>
 #include <crypto/internal/des.h>
 #include <crypto/internal/skcipher.h>
 
@@ -34,6 +35,7 @@ static void qce_skcipher_done(void *data)
 	struct qce_device *qce = tmpl->qce;
 	struct qce_result_dump *result_buf = qce->dma.result_buf;
 	enum dma_data_direction dir_src, dir_dst;
+	unsigned int blocks;
 	u32 status;
 	int error;
 	bool diff_dst;
@@ -57,7 +59,21 @@ static void qce_skcipher_done(void *data)
 	if (error < 0)
 		dev_dbg(qce->dev, "skcipher operation error (%x)\n", status);
 
-	memcpy(rctx->iv, result_buf->encr_cntr_iv, rctx->ivsize);
+	if (IS_CTR(rctx->flags)) {
+		/*
+		 * QCE hardware does not increment the counter for a partial
+		 * final block. Increment it in software so that iv_out
+		 * reflects the correct next counter value expected by the CTR
+		 * mode.
+		 */
+		blocks = DIV_ROUND_UP(rctx->cryptlen, AES_BLOCK_SIZE);
+
+		while (blocks--)
+			crypto_inc(rctx->iv, rctx->ivsize);
+	} else {
+		memcpy(rctx->iv, result_buf->encr_cntr_iv, rctx->ivsize);
+	}
+
 	qce->async_req_done(tmpl->qce, error);
 }
 

-- 
2.47.3



* [PATCH v2 3/8] crypto: qce - Reject empty messages for AES-XTS
From: Bartosz Golaszewski @ 2026-06-15 15:49 UTC (permalink / raw)
  To: Thara Gopinath, Herbert Xu, David S. Miller, Stanimir Varbanov,
	Eneas U de Queiroz, Kuldeep Singh, Eric Biggers
  Cc: linux-crypto, linux-arm-msm, linux-kernel, brgl,
	Bartosz Golaszewski, stable
In-Reply-To: <20260615-qce-fix-self-tests-v2-0-dc911f1aad42@oss.qualcomm.com>

XTS is not defined for an empty plaintext: it requires at least one full
block of data. The driver treated a zero-length request as a successful
no-op, so the crypto self-tests "unexpectedly succeeded" when -EINVAL
was expected.

Return -EINVAL for empty XTS requests while keeping the no-op behavior
for the other ciphers, which the crypto engine simply cannot process due
to its DMA not supporting zero-length transfers.

Cc: stable@vger.kernel.org
Fixes: f08789462255 ("crypto: qce - Return error for zero length messages")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/skcipher.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index eff80ad5cb943c5b2e1e293c723bb1b31102b006..68b83e3ae088ae42a7fb2a2f0c2e132acf62e849 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -239,8 +239,12 @@ static int qce_skcipher_crypt(struct skcipher_request *req, int encrypt)
 	keylen = IS_XTS(rctx->flags) ? ctx->enc_keylen >> 1 : ctx->enc_keylen;
 
 	/* CE does not handle 0 length messages */
-	if (!req->cryptlen)
+	if (!req->cryptlen) {
+		/* XTS requires at least one full block of data */
+		if (IS_XTS(rctx->flags))
+			return -EINVAL;
 		return 0;
+	}
 
 	/*
 	 * ECB and CBC algorithms require message lengths to be

-- 
2.47.3



* [PATCH v2 2/8] crypto: qce - Fix HMAC self-test failures for empty messages
From: Bartosz Golaszewski @ 2026-06-15 15:49 UTC (permalink / raw)
  To: Thara Gopinath, Herbert Xu, David S. Miller, Stanimir Varbanov,
	Eneas U de Queiroz, Kuldeep Singh, Eric Biggers
  Cc: linux-crypto, linux-arm-msm, linux-kernel, brgl,
	Bartosz Golaszewski, stable
In-Reply-To: <20260615-qce-fix-self-tests-v2-0-dc911f1aad42@oss.qualcomm.com>

BAM DMA cannot process zero-length transfers. For plain hashes this is
handled by returning the precomputed hash of the empty message
(tmpl->hash_zero), but for keyed HMAC the result depends on the key and
cannot be a constant. As a result, hmac(sha256) produced an incorrect
digest for an empty message and the crypto self-tests failed.

Allocate a software fallback ahash for the HMAC transforms and use it to
compute the digest whenever the message is empty (in both the .final()
and .digest() paths). The fallback is allocated in a dedicated cra_init
for the HMAC algorithms and is excluded from matching the crypto engine's
own algorithm to avoid recursion. It is kept keyed in sync with the
hardware transform in .setkey().
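
The empty-message decision described above can be modelled outside the
kernel. This is a minimal sketch under stated assumptions: struct
alg_model, enum empty_msg_path and empty_msg_dispatch() are hypothetical
names, and only the order of the checks is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum empty_msg_path {
	EMPTY_USE_CONSTANT, /* copy the precomputed empty-message digest */
	EMPTY_USE_FALLBACK, /* keyed HMAC: compute the digest in software */
	EMPTY_NOTHING,      /* neither case applies */
};

/* Hypothetical model of the per-algorithm data consulted by the driver. */
struct alg_model {
	const unsigned char *hash_zero; /* NULL when no constant exists */
	bool is_hmac;
};

/* Pick how a zero-length request is answered without touching the DMA. */
static enum empty_msg_path empty_msg_dispatch(const struct alg_model *alg)
{
	assert(alg != NULL);
	if (alg->hash_zero)
		return EMPTY_USE_CONSTANT;
	if (alg->is_hmac)
		return EMPTY_USE_FALLBACK;
	return EMPTY_NOTHING;
}
```

The constant is checked first, so plain hashes never pay for the
fallback; only keyed transforms without a constant reach the software
path.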

Cc: stable@vger.kernel.org
Fixes: ec8f5d8f6f76 ("crypto: qce - Qualcomm crypto engine driver")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/sha.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++-
 drivers/crypto/qce/sha.h |  1 +
 2 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/qce/sha.c b/drivers/crypto/qce/sha.c
index dc962296139da334c00237e44290356023cd7420..00e1a8f6d4ec905cfb035db958a71566b1abb0a7 100644
--- a/drivers/crypto/qce/sha.c
+++ b/drivers/crypto/qce/sha.c
@@ -274,6 +274,36 @@ static int qce_ahash_update(struct ahash_request *req)
 	return qce->async_req_enqueue(tmpl->qce, &req->base);
 }
 
+/*
+ * BAM DMA cannot handle zero-length transfers. For plain hashes the result of
+ * an empty message is a known constant (hash_zero), for keyed HMAC it depends
+ * on the key, so compute it with the software fallback.
+ */
+static int qce_ahash_hmac_zero(struct ahash_request *req)
+{
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct qce_sha_ctx *ctx = crypto_tfm_ctx(crypto_ahash_tfm(tfm));
+	struct ahash_request *subreq;
+	struct crypto_wait wait;
+	struct scatterlist sg;
+	int ret;
+
+	subreq = ahash_request_alloc(ctx->fallback, GFP_ATOMIC);
+	if (!subreq)
+		return -ENOMEM;
+
+	crypto_init_wait(&wait);
+	ahash_request_set_callback(subreq, CRYPTO_TFM_REQ_MAY_BACKLOG,
+				   crypto_req_done, &wait);
+	sg_init_one(&sg, NULL, 0);
+	ahash_request_set_crypt(subreq, &sg, req->result, 0);
+
+	ret = crypto_wait_req(crypto_ahash_digest(subreq), &wait);
+
+	ahash_request_free(subreq);
+	return ret;
+}
+
 static int qce_ahash_final(struct ahash_request *req)
 {
 	struct qce_sha_reqctx *rctx = ahash_request_ctx_dma(req);
@@ -284,6 +314,8 @@ static int qce_ahash_final(struct ahash_request *req)
 		if (tmpl->hash_zero)
 			memcpy(req->result, tmpl->hash_zero,
 					tmpl->alg.ahash.halg.digestsize);
+		else if (IS_SHA_HMAC(rctx->flags))
+			return qce_ahash_hmac_zero(req);
 		return 0;
 	}
 
@@ -321,6 +353,8 @@ static int qce_ahash_digest(struct ahash_request *req)
 		if (tmpl->hash_zero)
 			memcpy(req->result, tmpl->hash_zero,
 					tmpl->alg.ahash.halg.digestsize);
+		else if (IS_SHA_HMAC(rctx->flags))
+			return qce_ahash_hmac_zero(req);
 		return 0;
 	}
 
@@ -344,6 +378,17 @@ static int qce_ahash_hmac_setkey(struct crypto_ahash *tfm, const u8 *key,
 	blocksize = crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
 	memset(ctx->authkey, 0, sizeof(ctx->authkey));
 
+	/*
+	 * Keep the software fallback keyed in sync - it is used for empty
+	 * messages, which the DMA engine cannot process.
+	 */
+	crypto_ahash_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
+	crypto_ahash_set_flags(ctx->fallback,
+			       crypto_ahash_get_flags(tfm) & CRYPTO_TFM_REQ_MASK);
+	ret = crypto_ahash_setkey(ctx->fallback, key, keylen);
+	if (ret)
+		return ret;
+
 	if (keylen <= blocksize) {
 		memcpy(ctx->authkey, key, keylen);
 		return 0;
@@ -401,6 +446,36 @@ static int qce_ahash_cra_init(struct crypto_tfm *tfm)
 	return 0;
 }
 
+static int qce_ahash_hmac_cra_init(struct crypto_tfm *tfm)
+{
+	struct qce_sha_ctx *ctx = crypto_tfm_ctx(tfm);
+	struct crypto_ahash *fallback;
+	int ret;
+
+	ret = qce_ahash_cra_init(tfm);
+	if (ret)
+		return ret;
+
+	/*
+	 * The fallback is used to compute HMACs of empty messages, which the
+	 * DMA engine cannot process.
+	 */
+	fallback = crypto_alloc_ahash(crypto_tfm_alg_name(tfm), 0,
+				      CRYPTO_ALG_NEED_FALLBACK);
+	if (IS_ERR(fallback))
+		return PTR_ERR(fallback);
+
+	ctx->fallback = fallback;
+	return 0;
+}
+
+static void qce_ahash_hmac_cra_exit(struct crypto_tfm *tfm)
+{
+	struct qce_sha_ctx *ctx = crypto_tfm_ctx(tfm);
+
+	crypto_free_ahash(ctx->fallback);
+}
+
 struct qce_ahash_def {
 	unsigned long flags;
 	const char *name;
@@ -479,7 +554,14 @@ static int qce_ahash_register_one(const struct qce_ahash_def *def,
 	base->cra_ctxsize = sizeof(struct qce_sha_ctx);
 	base->cra_alignmask = 0;
 	base->cra_module = THIS_MODULE;
-	base->cra_init = qce_ahash_cra_init;
+
+	if (IS_SHA_HMAC(def->flags)) {
+		base->cra_flags |= CRYPTO_ALG_NEED_FALLBACK;
+		base->cra_init = qce_ahash_hmac_cra_init;
+		base->cra_exit = qce_ahash_hmac_cra_exit;
+	} else {
+		base->cra_init = qce_ahash_cra_init;
+	}
 
 	strscpy(base->cra_name, def->name);
 	strscpy(base->cra_driver_name, def->drv_name);
diff --git a/drivers/crypto/qce/sha.h b/drivers/crypto/qce/sha.h
index a22695361f1654cc94325ec5d886a158fa4bfb9c..5ba6b786f450cbae52988cb39cd68d5795fd19db 100644
--- a/drivers/crypto/qce/sha.h
+++ b/drivers/crypto/qce/sha.h
@@ -18,6 +18,7 @@
 
 struct qce_sha_ctx {
 	u8 authkey[QCE_SHA_MAX_BLOCKSIZE];
+	struct crypto_ahash *fallback;
 };
 
 /**

-- 
2.47.3



* [PATCH v2 1/8] crypto: qce - Remove unsafe/deprecated algorithms
From: Bartosz Golaszewski @ 2026-06-15 15:49 UTC (permalink / raw)
  To: Thara Gopinath, Herbert Xu, David S. Miller, Stanimir Varbanov,
	Eneas U de Queiroz, Kuldeep Singh, Eric Biggers
  Cc: linux-crypto, linux-arm-msm, linux-kernel, brgl,
	Bartosz Golaszewski, stable
In-Reply-To: <20260615-qce-fix-self-tests-v2-0-dc911f1aad42@oss.qualcomm.com>

Remove algorithms that are either unsafe or deprecated and have no
in-kernel users that cannot be served by the ARM CE implementations.

AES-ECB reveals plaintext patterns (identical plaintext blocks produce
identical ciphertext blocks) and should not be exposed as a hardware-
accelerated primitive. DES, Triple DES and HMAC-SHA1 have been
deprecated for years.
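
The pattern leak is easy to reproduce with a toy model. The sketch below
is illustration only: toy_encrypt_block() is a made-up keyed permutation
on 8-byte blocks, not AES or DES, and every name in it is hypothetical.
It contrasts ECB with a chained mode (CBC) on two identical plaintext
blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK 8u

/* Toy keyed permutation on one block; NOT a real cipher. */
static void toy_encrypt_block(uint8_t *blk, const uint8_t key[TOY_BLOCK])
{
	for (size_t i = 0; i < TOY_BLOCK; i++) {
		uint8_t v = blk[i] ^ key[i];

		blk[i] = (uint8_t)((v << 3) | (v >> 5)); /* rotate left by 3 */
	}
}

/* ECB: every block is encrypted independently with the same key. */
static void toy_ecb(uint8_t *buf, size_t len, const uint8_t key[TOY_BLOCK])
{
	assert(len % TOY_BLOCK == 0);
	for (size_t off = 0; off < len; off += TOY_BLOCK)
		toy_encrypt_block(buf + off, key);
}

/* CBC: each block is XORed with the previous ciphertext block first. */
static void toy_cbc(uint8_t *buf, size_t len, const uint8_t key[TOY_BLOCK],
		    const uint8_t iv[TOY_BLOCK])
{
	uint8_t prev[TOY_BLOCK];

	assert(len % TOY_BLOCK == 0);
	memcpy(prev, iv, TOY_BLOCK);
	for (size_t off = 0; off < len; off += TOY_BLOCK) {
		for (size_t i = 0; i < TOY_BLOCK; i++)
			buf[off + i] ^= prev[i];
		toy_encrypt_block(buf + off, key);
		memcpy(prev, buf + off, TOY_BLOCK);
	}
}
```

Encrypting two equal blocks with toy_ecb() yields two equal ciphertext
blocks, so an observer learns that the plaintext repeats. With toy_cbc()
the second block depends on the first ciphertext block, so the two
outputs differ unless that block happens to equal the IV.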

Remove ecb(aes), cbc(des), ecb(des3_ede), cbc(des3_ede), hmac(sha1) and
all AEAD variants built on these primitives. Also clean up the now-dead
code, flags and constants.

Cc: stable@vger.kernel.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/aead.c     | 40 -------------------------
 drivers/crypto/qce/common.c   | 27 ++++-------------
 drivers/crypto/qce/common.h   |  7 +----
 drivers/crypto/qce/regs-v5.h  |  1 -
 drivers/crypto/qce/sha.c      |  9 ------
 drivers/crypto/qce/skcipher.c | 69 +------------------------------------------
 6 files changed, 8 insertions(+), 145 deletions(-)

diff --git a/drivers/crypto/qce/aead.c b/drivers/crypto/qce/aead.c
index 03b8042da9a1b4aebdc775ad8ab912abc7b2383d..6a511e5d7f6290a1df0093e463f39f5f2db25f88 100644
--- a/drivers/crypto/qce/aead.c
+++ b/drivers/crypto/qce/aead.c
@@ -592,7 +592,6 @@ static int qce_aead_setkey(struct crypto_aead *tfm, const u8 *key, unsigned int
 	struct qce_aead_ctx *ctx = crypto_aead_ctx(tfm);
 	struct crypto_authenc_keys authenc_keys;
 	unsigned long flags = to_aead_tmpl(tfm)->alg_flags;
-	u32 _key[6];
 	int err;
 
 	err = crypto_authenc_extractkeys(&authenc_keys, key, keylen);
@@ -607,21 +606,6 @@ static int qce_aead_setkey(struct crypto_aead *tfm, const u8 *key, unsigned int
 		err = verify_aead_des_key(tfm, authenc_keys.enckey, authenc_keys.enckeylen);
 		if (err)
 			return err;
-	} else if (IS_3DES(flags)) {
-		err = verify_aead_des3_key(tfm, authenc_keys.enckey, authenc_keys.enckeylen);
-		if (err)
-			return err;
-		/*
-		 * The crypto engine does not support any two keys
-		 * being the same for triple des algorithms. The
-		 * verify_skcipher_des3_key does not check for all the
-		 * below conditions. Schedule fallback in this case.
-		 */
-		memcpy(_key, authenc_keys.enckey, DES3_EDE_KEY_SIZE);
-		if (!((_key[0] ^ _key[2]) | (_key[1] ^ _key[3])) ||
-		    !((_key[2] ^ _key[4]) | (_key[3] ^ _key[5])) ||
-		    !((_key[0] ^ _key[4]) | (_key[1] ^ _key[5])))
-			ctx->need_fallback = true;
 	} else if (IS_AES(flags)) {
 		/* No random key sizes */
 		if (authenc_keys.enckeylen != AES_KEYSIZE_128 &&
@@ -693,22 +677,6 @@ struct qce_aead_def {
 };
 
 static const struct qce_aead_def aead_def[] = {
-	{
-		.flags          = QCE_ALG_DES | QCE_MODE_CBC | QCE_HASH_SHA1_HMAC,
-		.name           = "authenc(hmac(sha1),cbc(des))",
-		.drv_name       = "authenc-hmac-sha1-cbc-des-qce",
-		.blocksize      = DES_BLOCK_SIZE,
-		.ivsize         = DES_BLOCK_SIZE,
-		.maxauthsize	= SHA1_DIGEST_SIZE,
-	},
-	{
-		.flags          = QCE_ALG_3DES | QCE_MODE_CBC | QCE_HASH_SHA1_HMAC,
-		.name           = "authenc(hmac(sha1),cbc(des3_ede))",
-		.drv_name       = "authenc-hmac-sha1-cbc-3des-qce",
-		.blocksize      = DES3_EDE_BLOCK_SIZE,
-		.ivsize         = DES3_EDE_BLOCK_SIZE,
-		.maxauthsize	= SHA1_DIGEST_SIZE,
-	},
 	{
 		.flags          = QCE_ALG_DES | QCE_MODE_CBC | QCE_HASH_SHA256_HMAC,
 		.name           = "authenc(hmac(sha256),cbc(des))",
@@ -717,14 +685,6 @@ static const struct qce_aead_def aead_def[] = {
 		.ivsize         = DES_BLOCK_SIZE,
 		.maxauthsize	= SHA256_DIGEST_SIZE,
 	},
-	{
-		.flags          = QCE_ALG_3DES | QCE_MODE_CBC | QCE_HASH_SHA256_HMAC,
-		.name           = "authenc(hmac(sha256),cbc(des3_ede))",
-		.drv_name       = "authenc-hmac-sha256-cbc-3des-qce",
-		.blocksize      = DES3_EDE_BLOCK_SIZE,
-		.ivsize         = DES3_EDE_BLOCK_SIZE,
-		.maxauthsize	= SHA256_DIGEST_SIZE,
-	},
 	{
 		.flags          =  QCE_ALG_AES | QCE_MODE_CBC | QCE_HASH_SHA256_HMAC,
 		.name           = "authenc(hmac(sha256),cbc(aes))",
diff --git a/drivers/crypto/qce/common.c b/drivers/crypto/qce/common.c
index 54a78a57f63028f01870a3edeb8e390f523bb190..b1f8cf7e0d22ff3c19bb92bdc0154ed403f4c2f1 100644
--- a/drivers/crypto/qce/common.c
+++ b/drivers/crypto/qce/common.c
@@ -115,7 +115,7 @@ static u32 qce_auth_cfg(unsigned long flags, u32 key_size, u32 auth_size)
 			cfg |= AUTH_KEY_SZ_AES256 << AUTH_KEY_SIZE_SHIFT;
 	}
 
-	if (IS_SHA1(flags) || IS_SHA1_HMAC(flags))
+	if (IS_SHA1(flags))
 		cfg |= AUTH_SIZE_SHA1 << AUTH_SIZE_SHIFT;
 	else if (IS_SHA256(flags) || IS_SHA256_HMAC(flags))
 		cfg |= AUTH_SIZE_SHA256 << AUTH_SIZE_SHIFT;
@@ -126,7 +126,7 @@ static u32 qce_auth_cfg(unsigned long flags, u32 key_size, u32 auth_size)
 
 	if (IS_SHA1(flags) || IS_SHA256(flags))
 		cfg |= AUTH_MODE_HASH << AUTH_MODE_SHIFT;
-	else if (IS_SHA1_HMAC(flags) || IS_SHA256_HMAC(flags))
+	else if (IS_SHA256_HMAC(flags))
 		cfg |= AUTH_MODE_HMAC << AUTH_MODE_SHIFT;
 	else if (IS_CCM(flags))
 		cfg |= AUTH_MODE_CCM << AUTH_MODE_SHIFT;
@@ -191,7 +191,7 @@ static int qce_setup_regs_ahash(struct crypto_async_request *async_req)
 	else
 		qce_cpu_to_be32p_array(auth, rctx->digest, digestsize);
 
-	iv_words = (IS_SHA1(rctx->flags) || IS_SHA1_HMAC(rctx->flags)) ? 5 : 8;
+	iv_words = IS_SHA1(rctx->flags) ? 5 : 8;
 	qce_write_array(qce, REG_AUTH_IV0, (u32 *)auth, iv_words);
 
 	if (rctx->first_blk)
@@ -243,15 +243,12 @@ static u32 qce_encr_cfg(unsigned long flags, u32 aes_key_size)
 
 	if (IS_AES(flags))
 		cfg |= ENCR_ALG_AES << ENCR_ALG_SHIFT;
-	else if (IS_DES(flags) || IS_3DES(flags))
+	else if (IS_DES(flags))
 		cfg |= ENCR_ALG_DES << ENCR_ALG_SHIFT;
 
 	if (IS_DES(flags))
 		cfg |= ENCR_KEY_SZ_DES << ENCR_KEY_SZ_SHIFT;
 
-	if (IS_3DES(flags))
-		cfg |= ENCR_KEY_SZ_3DES << ENCR_KEY_SZ_SHIFT;
-
 	switch (flags & QCE_MODE_MASK) {
 	case QCE_MODE_ECB:
 		cfg |= ENCR_MODE_ECB << ENCR_MODE_SHIFT;
@@ -343,9 +340,6 @@ static int qce_setup_regs_skcipher(struct crypto_async_request *async_req)
 	if (IS_DES(flags)) {
 		enciv_words = 2;
 		enckey_words = 2;
-	} else if (IS_3DES(flags)) {
-		enciv_words = 2;
-		enckey_words = 6;
 	} else if (IS_AES(flags)) {
 		if (IS_XTS(flags))
 			qce_xtskey(qce, ctx->enc_key, ctx->enc_keylen,
@@ -393,10 +387,6 @@ static int qce_setup_regs_skcipher(struct crypto_async_request *async_req)
 #endif
 
 #ifdef CONFIG_CRYPTO_DEV_QCE_AEAD
-static const u32 std_iv_sha1[SHA256_DIGEST_SIZE / sizeof(u32)] = {
-	SHA1_H0, SHA1_H1, SHA1_H2, SHA1_H3, SHA1_H4, 0, 0, 0
-};
-
 static const u32 std_iv_sha256[SHA256_DIGEST_SIZE / sizeof(u32)] = {
 	SHA256_H0, SHA256_H1, SHA256_H2, SHA256_H3,
 	SHA256_H4, SHA256_H5, SHA256_H6, SHA256_H7
@@ -473,13 +463,8 @@ static int qce_setup_regs_aead(struct crypto_async_request *async_req)
 	/* Write initial authentication IV only for HMAC algorithms */
 	if (IS_SHA_HMAC(rctx->flags)) {
 		/* Write default authentication iv */
-		if (IS_SHA1_HMAC(rctx->flags)) {
-			auth_ivsize = SHA1_DIGEST_SIZE;
-			memcpy(authiv, std_iv_sha1, auth_ivsize);
-		} else if (IS_SHA256_HMAC(rctx->flags)) {
-			auth_ivsize = SHA256_DIGEST_SIZE;
-			memcpy(authiv, std_iv_sha256, auth_ivsize);
-		}
+		auth_ivsize = SHA256_DIGEST_SIZE;
+		memcpy(authiv, std_iv_sha256, auth_ivsize);
 		authiv_words = auth_ivsize / sizeof(u32);
 		qce_write_array(qce, REG_AUTH_IV0, (u32 *)authiv, authiv_words);
 	} else if (IS_CCM(rctx->flags)) {
diff --git a/drivers/crypto/qce/common.h b/drivers/crypto/qce/common.h
index 02e63ad9f24557c2238caa70b0ec521d49da4f13..c96d907d524725e7738d199a1d345d943d2ca360 100644
--- a/drivers/crypto/qce/common.h
+++ b/drivers/crypto/qce/common.h
@@ -34,13 +34,11 @@
 
 /* cipher algorithms */
 #define QCE_ALG_DES			BIT(0)
-#define QCE_ALG_3DES			BIT(1)
 #define QCE_ALG_AES			BIT(2)
 
 /* hash and hmac algorithms */
 #define QCE_HASH_SHA1			BIT(3)
 #define QCE_HASH_SHA256			BIT(4)
-#define QCE_HASH_SHA1_HMAC		BIT(5)
 #define QCE_HASH_SHA256_HMAC		BIT(6)
 #define QCE_HASH_AES_CMAC		BIT(7)
 
@@ -59,17 +57,14 @@
 #define QCE_DECRYPT			BIT(31)
 
 #define IS_DES(flags)			(flags & QCE_ALG_DES)
-#define IS_3DES(flags)			(flags & QCE_ALG_3DES)
 #define IS_AES(flags)			(flags & QCE_ALG_AES)
 
 #define IS_SHA1(flags)			(flags & QCE_HASH_SHA1)
 #define IS_SHA256(flags)		(flags & QCE_HASH_SHA256)
-#define IS_SHA1_HMAC(flags)		(flags & QCE_HASH_SHA1_HMAC)
 #define IS_SHA256_HMAC(flags)		(flags & QCE_HASH_SHA256_HMAC)
 #define IS_CMAC(flags)			(flags & QCE_HASH_AES_CMAC)
 #define IS_SHA(flags)			(IS_SHA1(flags) || IS_SHA256(flags))
-#define IS_SHA_HMAC(flags)		\
-		(IS_SHA1_HMAC(flags) || IS_SHA256_HMAC(flags))
+#define IS_SHA_HMAC(flags)		IS_SHA256_HMAC(flags)
 
 #define IS_CBC(mode)			(mode & QCE_MODE_CBC)
 #define IS_ECB(mode)			(mode & QCE_MODE_ECB)
diff --git a/drivers/crypto/qce/regs-v5.h b/drivers/crypto/qce/regs-v5.h
index d59ed279890621a8e2e6f4cdb20692dbf39f1461..11a6f3db3ffd05b97a2b9fc0989d954a904c4cd5 100644
--- a/drivers/crypto/qce/regs-v5.h
+++ b/drivers/crypto/qce/regs-v5.h
@@ -285,7 +285,6 @@
 #define ENCR_KEY_SZ_SHIFT		3
 #define ENCR_KEY_SZ_MASK		GENMASK(5, 3)
 #define ENCR_KEY_SZ_DES			0
-#define ENCR_KEY_SZ_3DES		1
 #define ENCR_KEY_SZ_AES128		0
 #define ENCR_KEY_SZ_AES256		2
 
diff --git a/drivers/crypto/qce/sha.c b/drivers/crypto/qce/sha.c
index a3a1a205aaf8559a04809936e2a3b7d564c16c53..dc962296139da334c00237e44290356023cd7420 100644
--- a/drivers/crypto/qce/sha.c
+++ b/drivers/crypto/qce/sha.c
@@ -430,15 +430,6 @@ static const struct qce_ahash_def ahash_def[] = {
 		.statesize	= sizeof(struct qce_sha_saved_state),
 		.std_iv		= std_iv_sha256,
 	},
-	{
-		.flags		= QCE_HASH_SHA1_HMAC,
-		.name		= "hmac(sha1)",
-		.drv_name	= "hmac-sha1-qce",
-		.digestsize	= SHA1_DIGEST_SIZE,
-		.blocksize	= SHA1_BLOCK_SIZE,
-		.statesize	= sizeof(struct qce_sha_saved_state),
-		.std_iv		= std_iv_sha1,
-	},
 	{
 		.flags		= QCE_HASH_SHA256_HMAC,
 		.name		= "hmac(sha256)",
diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 1fef315a7105c869e7fc6a60719087b721e78bb3..eff80ad5cb943c5b2e1e293c723bb1b31102b006 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -224,36 +224,6 @@ static int qce_des_setkey(struct crypto_skcipher *ablk, const u8 *key,
 	return 0;
 }
 
-static int qce_des3_setkey(struct crypto_skcipher *ablk, const u8 *key,
-			   unsigned int keylen)
-{
-	struct qce_cipher_ctx *ctx = crypto_skcipher_ctx(ablk);
-	u32 _key[6];
-	int err;
-
-	err = verify_skcipher_des3_key(ablk, key);
-	if (err)
-		return err;
-
-	/*
-	 * The crypto engine does not support any two keys
-	 * being the same for triple des algorithms. The
-	 * verify_skcipher_des3_key does not check for all the
-	 * below conditions. Return -ENOKEY in case any two keys
-	 * are the same. Revisit to see if a fallback cipher
-	 * is needed to handle this condition.
-	 */
-	memcpy(_key, key, DES3_EDE_KEY_SIZE);
-	if (!((_key[0] ^ _key[2]) | (_key[1] ^ _key[3])) ||
-	    !((_key[2] ^ _key[4]) | (_key[3] ^ _key[5])) ||
-	    !((_key[0] ^ _key[4]) | (_key[1] ^ _key[5])))
-		return -ENOKEY;
-
-	ctx->enc_keylen = keylen;
-	memcpy(ctx->enc_key, key, keylen);
-	return 0;
-}
-
 static int qce_skcipher_crypt(struct skcipher_request *req, int encrypt)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@@ -359,15 +329,6 @@ struct qce_skcipher_def {
 };
 
 static const struct qce_skcipher_def skcipher_def[] = {
-	{
-		.flags		= QCE_ALG_AES | QCE_MODE_ECB,
-		.name		= "ecb(aes)",
-		.drv_name	= "ecb-aes-qce",
-		.blocksize	= AES_BLOCK_SIZE,
-		.ivsize		= 0,
-		.min_keysize	= AES_MIN_KEY_SIZE,
-		.max_keysize	= AES_MAX_KEY_SIZE,
-	},
 	{
 		.flags		= QCE_ALG_AES | QCE_MODE_CBC,
 		.name		= "cbc(aes)",
@@ -405,33 +366,6 @@ static const struct qce_skcipher_def skcipher_def[] = {
 		.min_keysize	= DES_KEY_SIZE,
 		.max_keysize	= DES_KEY_SIZE,
 	},
-	{
-		.flags		= QCE_ALG_DES | QCE_MODE_CBC,
-		.name		= "cbc(des)",
-		.drv_name	= "cbc-des-qce",
-		.blocksize	= DES_BLOCK_SIZE,
-		.ivsize		= DES_BLOCK_SIZE,
-		.min_keysize	= DES_KEY_SIZE,
-		.max_keysize	= DES_KEY_SIZE,
-	},
-	{
-		.flags		= QCE_ALG_3DES | QCE_MODE_ECB,
-		.name		= "ecb(des3_ede)",
-		.drv_name	= "ecb-3des-qce",
-		.blocksize	= DES3_EDE_BLOCK_SIZE,
-		.ivsize		= 0,
-		.min_keysize	= DES3_EDE_KEY_SIZE,
-		.max_keysize	= DES3_EDE_KEY_SIZE,
-	},
-	{
-		.flags		= QCE_ALG_3DES | QCE_MODE_CBC,
-		.name		= "cbc(des3_ede)",
-		.drv_name	= "cbc-3des-qce",
-		.blocksize	= DES3_EDE_BLOCK_SIZE,
-		.ivsize		= DES3_EDE_BLOCK_SIZE,
-		.min_keysize	= DES3_EDE_KEY_SIZE,
-		.max_keysize	= DES3_EDE_KEY_SIZE,
-	},
 };
 
 static int qce_skcipher_register_one(const struct qce_skcipher_def *def,
@@ -455,8 +389,7 @@ static int qce_skcipher_register_one(const struct qce_skcipher_def *def,
 	alg->ivsize			= def->ivsize;
 	alg->min_keysize		= def->min_keysize;
 	alg->max_keysize		= def->max_keysize;
-	alg->setkey			= IS_3DES(def->flags) ? qce_des3_setkey :
-					  IS_DES(def->flags) ? qce_des_setkey :
+	alg->setkey			= IS_DES(def->flags) ? qce_des_setkey :
 					  qce_skcipher_setkey;
 	alg->encrypt			= qce_skcipher_encrypt;
 	alg->decrypt			= qce_skcipher_decrypt;

-- 
2.47.3



* [PATCH v2 0/8] crypto: qce - Fix crypto self-test failures
From: Bartosz Golaszewski @ 2026-06-15 15:49 UTC (permalink / raw)
  To: Thara Gopinath, Herbert Xu, David S. Miller, Stanimir Varbanov,
	Eneas U de Queiroz, Kuldeep Singh, Eric Biggers
  Cc: linux-crypto, linux-arm-msm, linux-kernel, brgl,
	Bartosz Golaszewski, stable

This extends the initial submission from Kuldeep.

The QCE hardware crypto engine has several limitations that cause it to
produce incorrect results or stall on certain inputs. This series fixes
several bugs and adds workarounds allowing the driver to pass crypto
self-tests.

The failures addressed are:

- HMAC self-test failures for empty messages
- AES-XTS returning success on zero-length input (should be -EINVAL)
- AES-CTR: partial final block causes the engine to stall, output IV
  derivation was incorrect
- AES-XTS with key1 == key2 is not supported by the CE
- AES-CCM: partial final block and fragmented payload both stall the
  engine

All fixes were tested on an SM8650 QRD board with
CONFIG_CRYPTO_SELFTESTS=y and CONFIG_CRYPTO_SELFTESTS_FULL=y.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
Changes in v2:
- Add fixes for the full suite of crypto self-tests
- Add Fixes and Cc tags
- Link to v1: https://patch.msgid.link/20260610-qce_selftest_fix-v1-0-1b0504783a46@oss.qualcomm.com/

---
Bartosz Golaszewski (6):
      crypto: qce - Remove unsafe/deprecated algorithms
      crypto: qce - Fix HMAC self-test failures for empty messages
      crypto: qce - Reject empty messages for AES-XTS
      crypto: qce - Use a fallback for AES-CTR with a partial final block
      crypto: qce - Use a fallback for CCM with a partial final block
      crypto: qce - Use fallback for CCM with a fragmented payload

Kuldeep Singh (2):
      crypto: qce - Fix CTR-AES for partial block requests
      crypto: qce - Fix xts-aes-qce for weak keys

 drivers/crypto/qce/aead.c     |  72 +++++++++++---------------
 drivers/crypto/qce/cipher.h   |   1 +
 drivers/crypto/qce/common.c   |  27 +++-------
 drivers/crypto/qce/common.h   |   7 +--
 drivers/crypto/qce/regs-v5.h  |   1 -
 drivers/crypto/qce/sha.c      |  93 +++++++++++++++++++++++++++++----
 drivers/crypto/qce/sha.h      |   1 +
 drivers/crypto/qce/skcipher.c | 116 ++++++++++++++----------------------------
 8 files changed, 162 insertions(+), 156 deletions(-)
---
base-commit: 7f5e2941e7dccc9dfaaa23d0548a40039772a284
change-id: 20260610-qce-fix-self-tests-492ffd2ef955

Best regards,
-- 
Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>



* [PATCH v5] crypto/ccp: Introduce SNP_VERIFY_MITIGATION command
From: Pratik R. Sampat @ 2026-06-15 15:23 UTC (permalink / raw)
  To: ashish.kalra, thomas.lendacky, john.allen, herbert, davem
  Cc: linux-crypto, linux-kernel, aik, tycho, nikunj, michael.roth,
	prsampat

The SEV-SNP firmware provides the SNP_VERIFY_MITIGATION command, which
can be used to query the status of currently supported vulnerability
mitigations and to initiate mitigations within the firmware.

This command is an explicit mechanism to ascertain if a firmware
mitigation is applied without needing a full RMP re-build, which is most
useful in a live firmware update scenario.

The firmware supports two subcommands: STATUS and VERIFY. The STATUS
subcommand is used to query the supported and verified mitigation bits.
The VERIFY subcommand initiates the mitigation process within the FW for
the specified vulnerability. Expose a userspace interface under:
/sys/firmware/sev/vulnerabilities/
  - supported_mitigations (read-only): supported mitigation vector mask
  - verified_mitigations (read/write): current verified mask; write a
    vector to request VERIFY for that bit
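
A userspace consumer of this interface might look like the sketch below.
It is an assumption-laden illustration, not part of the patch: the helper
names parse_mask(), is_single_bit() and read_supported() are
hypothetical. The path, the "0x%llx\n" output format, the 0400 mode and
the one-bit-per-write rule are taken from the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define SUPPORTED_PATH \
	"/sys/firmware/sev/vulnerabilities/supported_mitigations"

/* Parse a mask as printed by the attribute, e.g. "0x5\n". */
static int parse_mask(const char *s, uint64_t *mask)
{
	char *end;
	unsigned long long v;

	assert(s != NULL && mask != NULL);
	errno = 0;
	v = strtoull(s, &end, 0);
	if (errno || end == s)
		return -EINVAL;
	*mask = v;
	return 0;
}

/* True when exactly one bit is set, i.e. the vector names one mitigation. */
static bool is_single_bit(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Read the supported mask; needs root because the attribute mode is 0400. */
static int read_supported(uint64_t *mask)
{
	char buf[32];
	FILE *f = fopen(SUPPORTED_PATH, "r");
	int ret;

	if (!f)
		return -errno;
	ret = fgets(buf, sizeof(buf), f) ? parse_mask(buf, mask) : -EIO;
	fclose(f);
	return ret;
}
```

A caller would check is_single_bit() on a value before writing it to
verified_mitigations, since the store handler rejects any vector that
does not have exactly one bit set.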

The behavior of SNP_VERIFY_MITIGATION and the prerequisites for using
it are bug-specific. Information about supported mitigations and their
corresponding vector bits is to be published as part of the AMD Security
Bulletin.

See the SEV-SNP Firmware ABI specification 1.58, SNP_VERIFY_MITIGATION,
for more details.

Reviewed-by: Tycho Andersen (AMD) <tycho@kernel.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Pratik R. Sampat <prsampat@amd.com>
---
v5:
 * Collect Reviewed-by Tags
 * Check for multiple bits set in the mitigation vector - Tom
 * Add CONFIG_SYSFS option to #else and #endif - Tom
 * Minor whitespace and grammar fixes - Tom
 * Return -EINVAL instead of -EIO for mitigation failure bit set
   reporting - Tycho

v4: https://lore.kernel.org/linux-crypto/4957b07dbb4029a4c59bb3cf35f068c36284aa48.1780693665.git.prsampat@amd.com/
 * Split interface definitions in documentation - Kernel Test Bot
 * Wrap snp_verify_mitigation() under #ifdef CONFIG_SYSFS - Tom
 * Remove check for snp initialized and feature info active for
   registering mitigation interface - Tom
 * Since init vs init races should not be possible anymore[1], remove the
   sysfs mutex guard as sysfs' own synchronization suffices - Tom, Tycho
 * Dropping the reviewed-by since the patch has changed in a meaningful
   way

v3: https://lore.kernel.org/linux-crypto/a043a82c-f3dd-4f29-86fb-60638eaddc9b@amd.com/
  * Remove failed_status interface and report failure via dev_err - Tycho
  * Make vulnerability interfaces root only accessible - Sashiko
  * Move /sys/firmware/vulnerabilities/ to
    /sys/firmware/sev/vulnerabilities/ to be platform specific - Sashiko
  * Guard sysfs creation under a new mutex to avoid racing during
    creation and using the sev_cmd_mutex which would race with
    vulnerability operations - Sashiko

[1]: https://lore.kernel.org/all/20260504165147.1615643-5-tycho@kernel.org/

Patch based on cryptodev-2.6
---
 .../sysfs-firmware-sev-vulnerabilities        |  19 ++
 drivers/crypto/ccp/sev-dev.c                  | 177 ++++++++++++++++++
 drivers/crypto/ccp/sev-dev.h                  |   3 +
 include/linux/psp-sev.h                       |  51 +++++
 4 files changed, 250 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-firmware-sev-vulnerabilities

diff --git a/Documentation/ABI/testing/sysfs-firmware-sev-vulnerabilities b/Documentation/ABI/testing/sysfs-firmware-sev-vulnerabilities
new file mode 100644
index 000000000000..964362558bb2
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-firmware-sev-vulnerabilities
@@ -0,0 +1,19 @@
+What:		/sys/firmware/sev/vulnerabilities/supported_mitigations
+Date:		June 2026
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		Read-only interface that reports the vector of SEV-SNP
+		firmware vulnerability mitigations supported by the firmware.
+
+What:		/sys/firmware/sev/vulnerabilities/verified_mitigations
+Date:		June 2026
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		Read/write interface that reports the vector of SEV-SNP
+		firmware vulnerability mitigations already verified by the
+		firmware. Writing a vector value requests the firmware to
+		VERIFY the corresponding mitigation bit(s).
+
+		The list of supported mitigations and the meaning of each
+		vector bit are both platform- and bug-specific and are
+		published as part of the AMD Security Bulletin.
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 068b901034cb..43d2a7408c52 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -245,6 +245,7 @@ static int sev_cmd_buffer_len(int cmd)
 	case SEV_CMD_SNP_LAUNCH_FINISH:		return sizeof(struct sev_data_snp_launch_finish);
 	case SEV_CMD_SNP_DBG_DECRYPT:		return sizeof(struct sev_data_snp_dbg);
 	case SEV_CMD_SNP_DBG_ENCRYPT:		return sizeof(struct sev_data_snp_dbg);
+	case SEV_CMD_SNP_VERIFY_MITIGATION:	return sizeof(struct sev_data_snp_verify_mitigation);
 	case SEV_CMD_SNP_PAGE_UNSMASH:		return sizeof(struct sev_data_snp_page_unsmash);
 	case SEV_CMD_SNP_PLATFORM_STATUS:	return sizeof(struct sev_data_snp_addr);
 	case SEV_CMD_SNP_GUEST_REQUEST:		return sizeof(struct sev_data_snp_guest_request);
@@ -1352,6 +1353,162 @@ static int snp_filter_reserved_mem_regions(struct resource *rs, void *arg)
 	return 0;
 }
 
+#ifdef CONFIG_SYSFS
+static int snp_verify_mitigation(u16 command, u64 vector,
+				 struct sev_data_snp_verify_mitigation_dst *dst)
+{
+	struct sev_data_snp_verify_mitigation_dst *mit_dst = NULL;
+	struct sev_data_snp_verify_mitigation data = {0};
+	struct sev_device *sev = psp_master->sev_data;
+	int ret, error = 0;
+
+	mit_dst = snp_alloc_firmware_page(GFP_KERNEL | __GFP_ZERO);
+	if (!mit_dst)
+		return -ENOMEM;
+
+	data.length = sizeof(data);
+	data.subcommand = command;
+	data.vector = vector;
+	data.dst_paddr = __psp_pa(mit_dst);
+	data.dst_paddr_en = true;
+
+	ret = sev_do_cmd(SEV_CMD_SNP_VERIFY_MITIGATION, &data, &error);
+	if (!ret)
+		memcpy(dst, mit_dst, sizeof(*mit_dst));
+	else
+		dev_err(sev->dev, "SNP_VERIFY_MITIGATION command failed, ret = %d, error = %#x\n",
+			ret, error);
+
+	snp_free_firmware_page(mit_dst);
+
+	return ret;
+}
+
+static ssize_t supported_mitigations_show(struct kobject *kobj,
+					  struct kobj_attribute *attr, char *buf)
+{
+	struct sev_data_snp_verify_mitigation_dst dst;
+	int ret;
+
+	ret = snp_verify_mitigation(SNP_MIT_SUBCMD_REQ_STATUS, 0, &dst);
+	if (ret)
+		return ret;
+
+	return sysfs_emit(buf, "0x%llx\n", dst.mit_supported_vector);
+}
+
+static struct kobj_attribute supported_attr =
+		__ATTR_RO_MODE(supported_mitigations, 0400);
+
+static ssize_t verified_mitigations_show(struct kobject *kobj,
+					 struct kobj_attribute *attr, char *buf)
+{
+	struct sev_data_snp_verify_mitigation_dst dst;
+	int ret;
+
+	ret = snp_verify_mitigation(SNP_MIT_SUBCMD_REQ_STATUS, 0, &dst);
+	if (ret)
+		return ret;
+
+	return sysfs_emit(buf, "0x%llx\n", dst.mit_verified_vector);
+}
+
+static ssize_t verified_mitigations_store(struct kobject *kobj,
+					  struct kobj_attribute *attr,
+					  const char *buf, size_t count)
+{
+	struct sev_data_snp_verify_mitigation_dst dst;
+	struct sev_device *sev = psp_master->sev_data;
+	u64 vector;
+	int ret;
+
+	ret = kstrtoull(buf, 0, &vector);
+	if (ret)
+		return ret;
+
+	/*
+	 * The firmware verifies a single mitigation per call. Reject vectors
+	 * without exactly one bit set early to avoid a guaranteed-to-fail call.
+	 */
+	if (hweight64(vector) != 1)
+		return -EINVAL;
+
+	ret = snp_verify_mitigation(SNP_MIT_SUBCMD_REQ_VERIFY, vector, &dst);
+	if (ret)
+		return ret;
+
+	if (dst.mit_failure_status) {
+		dev_err(sev->dev, "Verify Mitigation - failure status: 0x%x\n",
+			dst.mit_failure_status);
+		return -EINVAL;
+	}
+
+	return count;
+}
+
+static struct kobj_attribute verified_attr =
+		__ATTR_RW_MODE(verified_mitigations, 0600);
+
+static struct attribute *mitigation_attrs[] = {
+	&supported_attr.attr,
+	&verified_attr.attr,
+	NULL
+};
+
+static const struct attribute_group mit_attr_group = {
+	.attrs = mitigation_attrs,
+};
+
+static void sev_snp_register_verify_mitigation(struct sev_device *sev)
+{
+	int rc;
+
+	if (!(sev->snp_feat_info_0.ecx & SNP_VERIFY_MITIGATION_SUPPORTED) ||
+	    sev->verify_mit)
+		return;
+
+	if (!sev->sev_kobj) {
+		sev->sev_kobj = kobject_create_and_add("sev", firmware_kobj);
+		if (!sev->sev_kobj)
+			return;
+	}
+
+	sev->verify_mit = kobject_create_and_add("vulnerabilities", sev->sev_kobj);
+	if (!sev->verify_mit)
+		goto err_sev_kobj;
+
+	rc = sysfs_create_group(sev->verify_mit, &mit_attr_group);
+	if (rc)
+		goto err_verify_mit;
+
+	return;
+
+err_verify_mit:
+	kobject_put(sev->verify_mit);
+	sev->verify_mit = NULL;
+err_sev_kobj:
+	kobject_put(sev->sev_kobj);
+	sev->sev_kobj = NULL;
+}
+
+static void sev_snp_unregister_verify_mitigation(struct sev_device *sev)
+{
+	if (sev->verify_mit) {
+		sysfs_remove_group(sev->verify_mit, &mit_attr_group);
+		kobject_put(sev->verify_mit);
+		sev->verify_mit = NULL;
+	}
+
+	if (sev->sev_kobj) {
+		kobject_put(sev->sev_kobj);
+		sev->sev_kobj = NULL;
+	}
+}
+#else	// CONFIG_SYSFS
+static void sev_snp_register_verify_mitigation(struct sev_device *sev) { }
+static void sev_snp_unregister_verify_mitigation(struct sev_device *sev) { }
+#endif	// CONFIG_SYSFS
+
 static int __sev_snp_init_locked(int *error, unsigned int max_snp_asid)
 {
 	struct sev_data_range_list *snp_range_list __free(kfree) = NULL;
@@ -1673,6 +1830,17 @@ int sev_platform_init(struct sev_platform_init_args *args)
 	rc = _sev_platform_init_locked(args);
 	mutex_unlock(&sev_cmd_mutex);
 
+	/*
+	 * Register the sysfs interface outside the sev_cmd_mutex. The
+	 * _show()/_store() handlers issue SEV commands that acquire the
+	 * sev_cmd_mutex, so creating (and on the shutdown path, removing) the
+	 * sysfs group must stay outside that lock. sysfs provides its own
+	 * synchronization between group creation/removal and concurrent
+	 * attribute access.
+	 */
+	if (!rc)
+		sev_snp_register_verify_mitigation(psp_master->sev_data);
+
 	return rc;
 }
 EXPORT_SYMBOL_GPL(sev_platform_init);
@@ -2752,6 +2920,15 @@ static void sev_firmware_shutdown(struct sev_device *sev)
 	if (sev->tio_status)
 		sev_tsm_uninit(sev);
 
+	/*
+	 * Remove the sysfs interface before taking the sev_cmd_mutex.
+	 * sysfs_remove_group() waits for in-flight _show()/_store() handlers
+	 * to drain, and those handlers issue SNP_VERIFY_MITIGATION via
+	 * sev_do_cmd() which acquires the sev_cmd_mutex. Removing the group
+	 * while holding the mutex could therefore deadlock.
+	 */
+	sev_snp_unregister_verify_mitigation(sev);
+
 	mutex_lock(&sev_cmd_mutex);
 
 	__sev_firmware_shutdown(sev, false);
diff --git a/drivers/crypto/ccp/sev-dev.h b/drivers/crypto/ccp/sev-dev.h
index b1cd556bbbf6..d5e596606def 100644
--- a/drivers/crypto/ccp/sev-dev.h
+++ b/drivers/crypto/ccp/sev-dev.h
@@ -59,6 +59,9 @@ struct sev_device {
 
 	bool snp_initialized;
 
+	struct kobject *sev_kobj;
+	struct kobject *verify_mit;
+
 	struct sev_user_data_status sev_plat_status;
 
 	struct sev_user_data_snp_status snp_plat_status;
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index d5099a2baca5..98666c5a6f79 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -129,6 +129,7 @@ enum sev_cmd {
 	SEV_CMD_SNP_LAUNCH_FINISH	= 0x0A2,
 	SEV_CMD_SNP_DBG_DECRYPT		= 0x0B0,
 	SEV_CMD_SNP_DBG_ENCRYPT		= 0x0B1,
+	SEV_CMD_SNP_VERIFY_MITIGATION	= 0x0B2,
 	SEV_CMD_SNP_PAGE_SWAP_OUT	= 0x0C0,
 	SEV_CMD_SNP_PAGE_SWAP_IN	= 0x0C1,
 	SEV_CMD_SNP_PAGE_MOVE		= 0x0C2,
@@ -898,10 +899,60 @@ struct snp_feature_info {
 #define SNP_CIPHER_TEXT_HIDING_SUPPORTED	BIT(3)
 #define SNP_AES_256_XTS_POLICY_SUPPORTED	BIT(4)
 #define SNP_CXL_ALLOW_POLICY_SUPPORTED		BIT(5)
+#define SNP_VERIFY_MITIGATION_SUPPORTED	BIT(13)
 
 /* Feature bits in EBX */
 #define SNP_SEV_TIO_SUPPORTED			BIT(1)
 
+#define SNP_MIT_SUBCMD_REQ_STATUS      0x0
+#define SNP_MIT_SUBCMD_REQ_VERIFY      0x1
+
+/**
+ * struct sev_data_snp_verify_mitigation - SNP_VERIFY_MITIGATION command params
+ *
+ * @length: Length of the command buffer read by the PSP
+ * @subcommand: Mitigation sub-command for the firmware to execute.
+ *              REQ_STATUS: 0x0 - Request status about currently supported and
+ *                                verified mitigations
+ *              REQ_VERIFY: 0x1 - Request to initiate a verification
+ *                                operation on a specific mitigation
+ * @rsvd: Reserved
+ * @vector: Bit specifying the vulnerability mitigation to process
+ * @dst_paddr_en: Destination paddr enabled
+ * @src_paddr_en: Source paddr enabled
+ * @rsvd1: Reserved
+ * @rsvd2: Reserved
+ * @src_paddr: Source address for optional input data
+ * @dst_paddr: Destination address to write the result
+ * @rsvd3: Reserved
+ */
+struct sev_data_snp_verify_mitigation {
+	u32 length;
+	u16 subcommand;
+	u16 rsvd;
+	u64 vector;
+	u32 dst_paddr_en : 1,
+	    src_paddr_en : 1,
+	    rsvd1 : 30;
+	u8 rsvd2[4];
+	u64 src_paddr;
+	u64 dst_paddr;
+	u8 rsvd3[24];
+} __packed;
+
+/**
+ * struct sev_data_snp_verify_mitigation_dst - mitigation result vectors
+ *
+ * @mit_verified_vector: Bit vector of vulnerability mitigations verified
+ * @mit_supported_vector: Bit vector of vulnerability mitigations supported
+ * @mit_failure_status: Status of the verification operation
+ */
+struct sev_data_snp_verify_mitigation_dst {
+	u64 mit_verified_vector;                /* OUT */
+	u64 mit_supported_vector;               /* OUT */
+	u32 mit_failure_status;                 /* OUT */
+} __packed;
+
 #ifdef CONFIG_CRYPTO_DEV_SP_PSP
 
 /**
-- 
2.43.0


* Re: i.MX95: EdgeLock Enclave secure storage
From: Fabio Estevam @ 2026-06-15 12:13 UTC (permalink / raw)
  To: Frieder Schrempf
  Cc: Pankaj Gupta,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Peng Fan,
	Stefano Babic, Frank Li
In-Reply-To: <b7c92302-d675-4610-a815-b353ff365e36@kontron.de>

Hi Frieder,

On Mon, Jun 15, 2026 at 4:18 AM Frieder Schrempf
<frieder.schrempf@kontron.de> wrote:

> There is no upstream support for OCOTP access via ELE. The
> imx-ocotp-ele.c driver (despite its name) does not currently use the ELE
> but the FSB to access the fuses (and is therefore limited to read-only
> access).
>
> I have some local WIP to add ELE support for the OCOTP driver. I think I
> can post it soonish.

Thanks for the clarification, appreciate it.

* Re: [PATCH] crypto: ccp: Fix SNP range list bounds check
From: Jarkko Sakkinen @ 2026-06-15 12:05 UTC (permalink / raw)
  To: ZongYao.Chen
  Cc: Ashish Kalra, Tom Lendacky, John Allen, Herbert Xu,
	David S. Miller, Michael Roth, Borislav Petkov (AMD),
	Brijesh Singh, Tianjia Zhang, linux-crypto, linux-kernel, stable
In-Reply-To: <20260612092525.1203150-1-ZongYao.Chen@linux.alibaba.com>

On Fri, Jun 12, 2026 at 05:25:25PM +0800, ZongYao.Chen@linux.alibaba.com wrote:
> From: Zongyao Chen <ZongYao.Chen@linux.alibaba.com>
> 
> snp_filter_reserved_mem_regions() checks the range list size before
> adding a new entry. If the page-sized SNP_INIT_EX buffer is already
> full, the next matching resource can still write one entry past the end
> of the buffer.
> 
> Check that there is room for the next entry before appending it, and
> compute the next entry pointer only after the bounds check.
> 
> Fixes: 1ca5614b84ee ("crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP")
> Cc: stable@vger.kernel.org
> Signed-off-by: Zongyao Chen <ZongYao.Chen@linux.alibaba.com>
> ---
>  drivers/crypto/ccp/sev-dev.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> index d1e9e0ac63b6..9e6efb3ec175 100644
> --- a/drivers/crypto/ccp/sev-dev.c
> +++ b/drivers/crypto/ccp/sev-dev.c
> @@ -1324,17 +1324,19 @@ static int snp_get_platform_data(struct sev_device *sev, int *error)
>  static int snp_filter_reserved_mem_regions(struct resource *rs, void *arg)
>  {
>  	struct sev_data_range_list *range_list = arg;
> -	struct sev_data_range *range = &range_list->ranges[range_list->num_elements];
> +	struct sev_data_range *range;
>  	size_t size;
>  
>  	/*
>  	 * Ensure the list of HV_FIXED pages that will be passed to firmware
>  	 * do not exceed the page-sized argument buffer.
>  	 */
> -	if ((range_list->num_elements * sizeof(struct sev_data_range) +
> +	if (((range_list->num_elements + 1) * sizeof(struct sev_data_range) +
>  	     sizeof(struct sev_data_range_list)) > PAGE_SIZE)
>  		return -E2BIG;
>  
> +	range = &range_list->ranges[range_list->num_elements];
> +
>  	switch (rs->desc) {
>  	case E820_TYPE_RESERVED:
>  	case E820_TYPE_PMEM:
> -- 
> 2.47.3
> 

Obvious enough:

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

* [PATCH v4 2/3] crypto: testmgr - test for multi-data-unit dispatch
From: Leonid Ravich @ 2026-06-15 11:14 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Alasdair Kergon, Ard Biesheuvel, Eric Biggers, Jens Axboe,
	Horia Geanta, Gilad Ben-Yossef, linux-crypto, dm-devel,
	linux-block
In-Reply-To: <20260615111459.9452-1-lravich@amazon.com>

Add a test that runs on every skcipher with ivsize == 16.  It
encrypts random plaintext two ways and compares:

  1. one batched request with skcipher_request_set_data_unit_size()
     set, over a deliberately fragmented scatterlist whose entries do
     not align to the data-unit size (so per-DU views cross SG entries
     and exercise the scatter_walk cursor), and
  2. an independent reference of N single-DU requests with IVs walked
     as a 128-bit LE counter, matching the convention documented in
     skcipher_request_set_data_unit_size().

The two must produce byte-identical ciphertext; this pins the IV
convention rather than only checking encrypt/decrypt symmetry.  The
batched ciphertext is then round-tripped back to plaintext, and the
caller IV is checked unchanged.  Iterates over typical data unit
sizes (512, 1024, 2048, 4096).

Algorithms the validator rejects for multi-DU return -EOPNOTSUPP on
the first call and skip cleanly; a genuine mismatch returns -EBADMSG
so it cannot be confused with a skip.
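
The reference IV walk described above can be sketched in plain userspace
C. This is only an illustration of the 128-bit LE counter convention, not
the testmgr code: iv_le128_add() is a hypothetical helper, while
iv_le128_inc() mirrors the byte-wise increment the patch uses in
test_mdu_iv_inc().

```c
#include <stdint.h>
#include <string.h>

/*
 * Increment a 16-byte IV as a little-endian 128-bit counter: bump the
 * lowest byte and propagate the carry while a byte wraps to zero.
 */
static void iv_le128_inc(uint8_t iv[16])
{
	int i;

	for (i = 0; i < 16; i++)
		if (++iv[i])
			break;
}

/*
 * Hypothetical helper: add a data-unit index to the IV in one step.
 * Data unit N of a batched request is expected to use start_iv + N.
 */
static void iv_le128_add(uint8_t iv[16], uint64_t index)
{
	unsigned int carry = 0;
	int i;

	for (i = 0; i < 16; i++) {
		/* Only the low 8 bytes receive bits from the 64-bit index. */
		unsigned int add = (i < 8) ? (unsigned int)((index >> (8 * i)) & 0xff) : 0;
		unsigned int sum = iv[i] + add + carry;

		iv[i] = (uint8_t)sum;
		carry = sum >> 8;
	}
}
```

Adding N in one step and incrementing N times give the same IV, which is
the property the byte-for-byte comparison relies on.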

Signed-off-by: Leonid Ravich <lravich@amazon.com>
---
 crypto/testmgr.c | 192 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 192 insertions(+)

diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 4d86efae65b2..5cbd0f4b070e 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -3211,6 +3211,194 @@ static int test_skcipher(int enc, const struct cipher_test_suite *suite,
 	return 0;
 }
 
+/* Increment a 16-byte IV as a little-endian 128-bit counter. */
+static void test_mdu_iv_inc(u8 iv[16])
+{
+	int i;
+
+	for (i = 0; i < 16; i++)
+		if (++iv[i])
+			break;
+}
+
+/*
+ * Encrypt one du_size block with a plain single-DU request; used to
+ * build an independent reference for the batched dispatch.
+ */
+static int test_mdu_ref_encrypt(struct crypto_skcipher *tfm, const u8 *in,
+				u8 *out, unsigned int du_size, const u8 iv[16])
+{
+	struct skcipher_request *req;
+	struct scatterlist sg_in;
+	DECLARE_CRYPTO_WAIT(wait);
+	u8 ivbuf[16];
+	int err;
+
+	req = skcipher_request_alloc(tfm, GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+	memcpy(ivbuf, iv, 16);
+	memcpy(out, in, du_size);
+	sg_init_one(&sg_in, out, du_size);
+	skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG |
+				      CRYPTO_TFM_REQ_MAY_SLEEP,
+				      crypto_req_done, &wait);
+	skcipher_request_set_crypt(req, &sg_in, &sg_in, du_size, ivbuf);
+	err = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);
+	skcipher_request_free(req);
+	return err;
+}
+
+/*
+ * Build a deliberately fragmented SG over @buf: entries that do not
+ * align to du_size, so the splitter's per-DU views cross SG entries
+ * and exercise the scatter_walk cursor.
+ */
+static void test_mdu_sg_fragment(struct scatterlist *sg, unsigned int nents,
+				 u8 *buf, unsigned int total)
+{
+	unsigned int chunk = total / nents;
+	unsigned int off = 0, i;
+
+	sg_init_table(sg, nents);
+	for (i = 0; i < nents; i++) {
+		unsigned int len = (i == nents - 1) ? total - off : chunk;
+
+		sg_set_buf(&sg[i], buf + off, len);
+		off += len;
+	}
+}
+
+/*
+ * Multi-DU test: verify the batched dispatch produces byte-identical
+ * ciphertext to an independent N x single-DU reference with per-DU IVs
+ * walked as a 128-bit LE counter (pins the IV convention, not just
+ * enc/dec symmetry), over a fragmented SG, then round-trips.  Real
+ * mismatches return -EBADMSG; ineligible algorithms skip via the
+ * validator's -EOPNOTSUPP.
+ */
+#define TEST_MDU_NR_UNITS	4
+#define TEST_MDU_NR_FRAGS	5
+static int test_skcipher_multi_du_one(struct crypto_skcipher *tfm,
+				      unsigned int du_size)
+{
+	const char *driver = crypto_skcipher_driver_name(tfm);
+	const unsigned int total = du_size * TEST_MDU_NR_UNITS;
+	struct skcipher_request *req = NULL;
+	struct scatterlist sg[TEST_MDU_NR_FRAGS];
+	DECLARE_CRYPTO_WAIT(wait);
+	u8 iv_orig[16], iv_work[16], iv_ref[16];
+	u8 *plain = NULL, *buf = NULL, *ref = NULL;
+	unsigned int u;
+	int err;
+
+	plain = kmalloc(total, GFP_KERNEL);
+	buf = kmalloc(total, GFP_KERNEL);
+	ref = kmalloc(total, GFP_KERNEL);
+	req = skcipher_request_alloc(tfm, GFP_KERNEL);
+	if (!plain || !buf || !ref || !req) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	get_random_bytes(plain, total);
+	get_random_bytes(iv_orig, sizeof(iv_orig));
+
+	/* Reference: per-DU single requests with LE128-walked IVs. */
+	memcpy(iv_ref, iv_orig, sizeof(iv_orig));
+	for (u = 0; u < TEST_MDU_NR_UNITS; u++) {
+		err = test_mdu_ref_encrypt(tfm, plain + u * du_size,
+					   ref + u * du_size, du_size, iv_ref);
+		/* First single-DU call reveals an ineligible algorithm. */
+		if (err == -EOPNOTSUPP && u == 0)
+			goto out;
+		if (err) {
+			pr_err("alg: skcipher: %s multi-DU ref encrypt failed (du=%u): %d\n",
+			       driver, du_size, err);
+			goto out;
+		}
+		test_mdu_iv_inc(iv_ref);
+	}
+
+	/* Batched: one request over a fragmented SG. */
+	memcpy(buf, plain, total);
+	memcpy(iv_work, iv_orig, sizeof(iv_orig));
+	test_mdu_sg_fragment(sg, TEST_MDU_NR_FRAGS, buf, total);
+	skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG |
+				      CRYPTO_TFM_REQ_MAY_SLEEP,
+				      crypto_req_done, &wait);
+	skcipher_request_set_crypt(req, sg, sg, total, iv_work);
+	skcipher_request_set_data_unit_size(req, du_size);
+	err = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);
+	if (err == -EOPNOTSUPP)
+		goto out;
+	if (err) {
+		pr_err("alg: skcipher: %s multi-DU encrypt failed (du=%u): %d\n",
+		       driver, du_size, err);
+		goto out;
+	}
+	if (memcmp(buf, ref, total) != 0) {
+		pr_err("alg: skcipher: %s multi-DU ciphertext differs from single-DU reference (du=%u)\n",
+		       driver, du_size);
+		err = -EBADMSG;
+		goto out;
+	}
+	/* req->iv must be unchanged after multi-DU dispatch. */
+	if (memcmp(iv_work, iv_orig, sizeof(iv_orig)) != 0) {
+		pr_err("alg: skcipher: %s multi-DU encrypt mutated caller IV (du=%u)\n",
+		       driver, du_size);
+		err = -EBADMSG;
+		goto out;
+	}
+
+	/* Round-trip the batched ciphertext back to plaintext. */
+	test_mdu_sg_fragment(sg, TEST_MDU_NR_FRAGS, buf, total);
+	skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG |
+				      CRYPTO_TFM_REQ_MAY_SLEEP,
+				      crypto_req_done, &wait);
+	skcipher_request_set_crypt(req, sg, sg, total, iv_work);
+	skcipher_request_set_data_unit_size(req, du_size);
+	err = crypto_wait_req(crypto_skcipher_decrypt(req), &wait);
+	if (err) {
+		pr_err("alg: skcipher: %s multi-DU decrypt failed (du=%u): %d\n",
+		       driver, du_size, err);
+		goto out;
+	}
+	if (memcmp(buf, plain, total) != 0) {
+		pr_err("alg: skcipher: %s multi-DU round-trip mismatch (du=%u)\n",
+		       driver, du_size);
+		err = -EBADMSG;
+	}
+
+out:
+	skcipher_request_free(req);
+	kfree(ref);
+	kfree(buf);
+	kfree(plain);
+	return err;
+}
+
+static int test_skcipher_multi_du(struct crypto_skcipher *tfm)
+{
+	static const unsigned int du_sizes[] = { 512, 1024, 2048, 4096 };
+	unsigned int j;
+	int err;
+
+	if (crypto_skcipher_ivsize(tfm) != 16)
+		return 0;
+
+	for (j = 0; j < ARRAY_SIZE(du_sizes); j++) {
+		err = test_skcipher_multi_du_one(tfm, du_sizes[j]);
+		/* Ineligible algorithms skip; real failures propagate. */
+		if (err == -EOPNOTSUPP)
+			return 0;
+		if (err)
+			return err;
+		cond_resched();
+	}
+	return 0;
+}
+
 static int alg_test_skcipher(const struct alg_test_desc *desc,
 			     const char *driver, u32 type, u32 mask)
 {
@@ -3259,6 +3447,10 @@ static int alg_test_skcipher(const struct alg_test_desc *desc,
 	if (err)
 		goto out;
 
+	err = test_skcipher_multi_du(tfm);
+	if (err)
+		goto out;
+
 	err = test_skcipher_vs_generic_impl(desc->generic_driver, req, tsgls);
 out:
 	free_cipher_test_sglists(tsgls);
-- 
2.47.3


* [PATCH v4 3/3] dm crypt: batch all sectors of a bio per crypto request
From: Leonid Ravich @ 2026-06-15 11:14 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Alasdair Kergon, Ard Biesheuvel, Eric Biggers, Jens Axboe,
	Horia Geanta, Gilad Ben-Yossef, linux-crypto, dm-devel,
	linux-block
In-Reply-To: <20260615111459.9452-1-lravich@amazon.com>

Submit one skcipher request per bio with
skcipher_request_set_data_unit_size(req, cc->sector_size) instead of
issuing one request per sector.  This removes per-sector overhead in
the crypto API hot path: request allocation, callback dispatch,
completion handling, and SG setup.

The optimisation is enabled automatically at table load when all
of the following hold:

 - the cipher is a non-AEAD skcipher, synchronous, with tfms_count == 1;
 - the IV mode advertises sector_iv_le128, i.e. its per-sector IV
   advances as a 128-bit LE counter, matching the convention
   documented in skcipher_request_set_data_unit_size().  Only plain64
   sets it today (its 64-bit LE counter extends correctly); plain is
   excluded as its 32-bit counter wraps differently across a
   2^32-sector boundary;
 - ivsize is 16 (the core rejects other sizes with -EOPNOTSUPP);
 - the iv_gen_ops->post() hook is unset;
 - dm-integrity is not stacked (no integrity tag or integrity IV).
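
To illustrate why plain64 qualifies and plain does not, here is a minimal
userspace sketch. It assumes the usual dm-crypt IV layouts (plain: 32-bit
LE sector number, remaining bytes zero; plain64: 64-bit LE sector number,
remaining bytes zero); all function names are hypothetical and exist only
for this example.

```c
#include <stdint.h>
#include <string.h>

/* plain: only the low 32 bits of the sector number, little-endian. */
static void iv_plain(uint8_t iv[16], uint64_t sector)
{
	int i;

	memset(iv, 0, 16);
	for (i = 0; i < 4; i++)
		iv[i] = (uint8_t)(sector >> (8 * i));
}

/* plain64: the full 64-bit sector number, little-endian. */
static void iv_plain64(uint8_t iv[16], uint64_t sector)
{
	int i;

	memset(iv, 0, 16);
	for (i = 0; i < 8; i++)
		iv[i] = (uint8_t)(sector >> (8 * i));
}

/* Advance an IV by one data unit as a 128-bit LE counter. */
static void iv_le128_inc(uint8_t iv[16])
{
	int i;

	for (i = 0; i < 16; i++)
		if (++iv[i])
			break;
}

/*
 * Return 1 if stepping the IV of @sector as a 128-bit LE counter yields
 * exactly the IV the mode would generate for @sector + 1.
 */
static int batch_matches(void (*gen)(uint8_t *, uint64_t), uint64_t sector)
{
	uint8_t stepped[16], next[16];

	gen(stepped, sector);
	iv_le128_inc(stepped);
	gen(next, sector + 1);
	return memcmp(stepped, next, 16) == 0;
}
```

At sector 0xffffffff the counter carries into byte 4, which agrees with
plain64 but not with plain, whose IV wraps back to all zeroes.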

The cipher driver does not need to advertise anything: the crypto
API auto-splits multi-data-unit requests for drivers that cannot
handle them natively, so dm-crypt sees the same fast batched
submission contract regardless of the underlying driver.

A new CRYPT_MULTI_DATA_UNIT cipher_flag, set once at construction
time, gates the multi-data-unit dispatch.  The existing per-sector
path in crypt_convert_block_skcipher() is unchanged; the new
crypt_convert_block_skcipher_multi() is reached from a small
dispatch in crypt_convert() and shares the same backlog/-EBUSY/
-EINPROGRESS flow control with the per-sector path.

Heap-allocated scatterlists are stashed in dm_crypt_request and
freed in crypt_free_req_skcipher() to avoid races between the
synchronous-success free path and async-completion reuse from the
request pool.  On scatterlist allocation failure the helper returns
-EAGAIN, and the core returns -EOPNOTSUPP if a driver turns out
unable to do multi-DU; crypt_convert() handles both by clearing its
local multi_du flag and falling back to the per-sector path for the
rest of the current crypt_convert() invocation, ensuring forward progress
on the swap-out-to-dm-crypt path even under total memory exhaustion
(the per-sector path uses only cc->req_pool, a mempool with
reservoir set up at table-load time, and the inline
dmreq->sg_in[]/sg_out[] arrays — no allocation that could fail).
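
The fallback flow can be sketched as follows. This is a simplified
userspace model, not the crypt_convert() code: the submit_fn callback
type, convert_with_fallback() and both stub submitters are hypothetical,
and the sketch assumes the total length is a multiple of the unit size.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical submit callback: process @len bytes starting at @off. */
typedef int (*submit_fn)(size_t off, size_t len, void *ctx);

/*
 * Try one batched submission first. If it reports -EAGAIN (no memory
 * for the SG arrays) or -EOPNOTSUPP (multi-DU not possible), clear the
 * local flag and finish this invocation one unit at a time.
 */
static int convert_with_fallback(size_t total, size_t unit,
				 submit_fn batched, submit_fn per_unit,
				 void *ctx)
{
	bool multi_du = true;
	size_t off = 0;

	while (off < total) {
		size_t len = multi_du ? total - off : unit;
		int r = multi_du ? batched(off, len, ctx)
				 : per_unit(off, len, ctx);

		if (multi_du && (r == -EAGAIN || r == -EOPNOTSUPP)) {
			multi_du = false;	/* sticky for this call only */
			continue;		/* retry the same offset */
		}
		if (r)
			return r;
		off += len;
	}
	return 0;
}

/* Stub: a batched path that always fails its allocation. */
static int stub_batched_nomem(size_t off, size_t len, void *ctx)
{
	(void)off; (void)len; (void)ctx;
	return -EAGAIN;
}

/* Stub: a per-unit path that only counts how often it was called. */
static int stub_per_unit_count(size_t off, size_t len, void *ctx)
{
	(void)off; (void)len;
	(*(int *)ctx)++;
	return 0;
}
```

Because the flag is local, the next invocation starts with batching
enabled again, so a transient allocation failure does not disable the
fast path permanently.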

Verified end-to-end with a byte-equivalence test: encrypted output
of plain64 dm-crypt with the multi-data-unit path matches output of
the single-data-unit path bit-for-bit over a 256 MB device, with
xts-aes-aesni driving the auto-split path.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
---
 drivers/md/dm-crypt.c | 215 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 207 insertions(+), 8 deletions(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 608b617fb817..bfb98dd876d7 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -101,6 +101,9 @@ struct dm_crypt_request {
 	struct scatterlist sg_in[4];
 	struct scatterlist sg_out[4];
 	u64 iv_sector;
+	/* Multi-data-unit SG arrays, NULL when sg_in[]/sg_out[] suffice. */
+	struct scatterlist *sg_in_ext;
+	struct scatterlist *sg_out_ext;
 };
 
 struct crypt_config;
@@ -115,6 +118,12 @@ struct crypt_iv_operations {
 			 struct dm_crypt_request *dmreq);
 	void (*post)(struct crypt_config *cc, u8 *iv,
 		     struct dm_crypt_request *dmreq);
+	/*
+	 * The per-sector IV advances as a 128-bit LE counter, so a bio's
+	 * consecutive sectors share one starting IV and can be batched into
+	 * a single skcipher request via data_unit_size.
+	 */
+	bool sector_iv_le128;
 };
 
 struct iv_benbi_private {
@@ -151,6 +160,7 @@ enum cipher_flags {
 	CRYPT_IV_LARGE_SECTORS,		/* Calculate IV from sector_size, not 512B sectors */
 	CRYPT_ENCRYPT_PREPROCESS,	/* Must preprocess data for encryption (elephant) */
 	CRYPT_KEY_MAC_SIZE_SET,		/* The integrity_key_size option was used */
+	CRYPT_MULTI_DATA_UNIT,		/* Batch all sectors of a bio per crypto request */
 };
 
 /*
@@ -1018,7 +1028,8 @@ static const struct crypt_iv_operations crypt_iv_plain_ops = {
 };
 
 static const struct crypt_iv_operations crypt_iv_plain64_ops = {
-	.generator = crypt_iv_plain64_gen
+	.generator = crypt_iv_plain64_gen,
+	.sector_iv_le128 = true,
 };
 
 static const struct crypt_iv_operations crypt_iv_plain64be_ops = {
@@ -1426,12 +1437,126 @@ static int crypt_convert_block_skcipher(struct crypt_config *cc,
 	return r;
 }
 
+/*
+ * Submit all remaining sectors of the current bio in one skcipher request.
+ * Same return convention as crypt_convert_block_skcipher() except for
+ * -EAGAIN, which the caller must treat as "disable multi-DU and re-enter
+ * the per-sector path" so swap-out-to-dm-crypt always makes forward
+ * progress on the mempool reserve.
+ */
+static int crypt_convert_block_skcipher_multi(struct crypt_config *cc,
+					      struct convert_context *ctx,
+					      struct skcipher_request *req,
+					      unsigned int *out_processed)
+{
+	const unsigned int sector_size = cc->sector_size;
+	const gfp_t gfp = GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN;
+	unsigned int total = ctx->iter_in.bi_size;
+	unsigned int n_sg_in = 0, n_sg_out = 0;
+	struct dm_crypt_request *dmreq = dmreq_of_req(cc, req);
+	struct scatterlist *sg_in = NULL, *sg_out = NULL;
+	struct bvec_iter iter_in, iter_out;
+	struct bio_vec bv;
+	u8 *iv, *org_iv;
+	int r;
+
+	if (WARN_ON_ONCE(ctx->iter_in.bi_size != ctx->iter_out.bi_size))
+		return -EIO;
+	if (unlikely(total & (sector_size - 1)))
+		return -EIO;
+
+	iter_in = ctx->iter_in;
+	iter_in.bi_size = total;
+	__bio_for_each_bvec(bv, ctx->bio_in, iter_in, iter_in)
+		n_sg_in++;
+
+	iter_out = ctx->iter_out;
+	iter_out.bi_size = total;
+	__bio_for_each_bvec(bv, ctx->bio_out, iter_out, iter_out)
+		n_sg_out++;
+
+	sg_in = kmalloc_array(n_sg_in, sizeof(*sg_in), gfp);
+	sg_out = (ctx->bio_in == ctx->bio_out) ? sg_in :
+		 kmalloc_array(n_sg_out, sizeof(*sg_out), gfp);
+	if (!sg_in || !sg_out) {
+		kfree(sg_in);
+		if (sg_out != sg_in)
+			kfree(sg_out);
+		return -EAGAIN;
+	}
+
+	sg_init_table(sg_in, n_sg_in);
+	{
+		unsigned int i = 0;
+
+		iter_in = ctx->iter_in;
+		iter_in.bi_size = total;
+		__bio_for_each_bvec(bv, ctx->bio_in, iter_in, iter_in)
+			sg_set_page(&sg_in[i++], bv.bv_page, bv.bv_len,
+				    bv.bv_offset);
+	}
+
+	if (sg_out != sg_in) {
+		unsigned int i = 0;
+
+		sg_init_table(sg_out, n_sg_out);
+		iter_out = ctx->iter_out;
+		iter_out.bi_size = total;
+		__bio_for_each_bvec(bv, ctx->bio_out, iter_out, iter_out)
+			sg_set_page(&sg_out[i++], bv.bv_page, bv.bv_len,
+				    bv.bv_offset);
+	}
+
+	dmreq->iv_sector = ctx->cc_sector;
+	if (test_bit(CRYPT_IV_LARGE_SECTORS, &cc->cipher_flags))
+		dmreq->iv_sector >>= cc->sector_shift;
+	dmreq->ctx = ctx;
+
+	iv = iv_of_dmreq(cc, dmreq);
+	org_iv = org_iv_of_dmreq(cc, dmreq);
+	r = cc->iv_gen_ops->generator(cc, org_iv, dmreq);
+	if (r < 0)
+		goto out_free_sg;
+	memcpy(iv, org_iv, cc->iv_size);
+
+	dmreq->sg_in_ext = sg_in;
+	dmreq->sg_out_ext = (sg_out == sg_in) ? NULL : sg_out;
+
+	skcipher_request_set_crypt(req, sg_in, sg_out, total, iv);
+	skcipher_request_set_data_unit_size(req, sector_size);
+
+	if (bio_data_dir(ctx->bio_in) == WRITE)
+		r = crypto_skcipher_encrypt(req);
+	else
+		r = crypto_skcipher_decrypt(req);
+
+	/*
+	 * Sync error: kcryptd_async_done won't run, so free the SG
+	 * arrays here.  Async returns (-EINPROGRESS, -EBUSY) hand
+	 * ownership to the completion callback.
+	 */
+	if (r && r != -EINPROGRESS && r != -EBUSY)
+		goto out_free_sg;
+
+	*out_processed = total;
+	return r;
+
+out_free_sg:
+	kfree(sg_in);
+	if (sg_out != sg_in)
+		kfree(sg_out);
+	dmreq->sg_in_ext = NULL;
+	dmreq->sg_out_ext = NULL;
+	return r;
+}
+
 static void kcryptd_async_done(void *async_req, int error);
 
 static int crypt_alloc_req_skcipher(struct crypt_config *cc,
 				     struct convert_context *ctx)
 {
 	unsigned int key_index = ctx->cc_sector & (cc->tfms_count - 1);
+	struct dm_crypt_request *dmreq;
 
 	if (!ctx->r.req) {
 		ctx->r.req = mempool_alloc(&cc->req_pool, in_interrupt() ? GFP_ATOMIC : GFP_NOIO);
@@ -1441,6 +1566,11 @@ static int crypt_alloc_req_skcipher(struct crypt_config *cc,
 
 	skcipher_request_set_tfm(ctx->r.req, cc->cipher_tfm.tfms[key_index]);
 
+	/* Multi-DU SG arrays are owned by the helper that allocates them. */
+	dmreq = dmreq_of_req(cc, ctx->r.req);
+	dmreq->sg_in_ext = NULL;
+	dmreq->sg_out_ext = NULL;
+
 	/*
 	 * Use REQ_MAY_BACKLOG so a cipher driver internally backlogs
 	 * requests if driver request queue is full.
@@ -1487,6 +1617,12 @@ static void crypt_free_req_skcipher(struct crypt_config *cc,
 				    struct skcipher_request *req, struct bio *base_bio)
 {
 	struct dm_crypt_io *io = dm_per_bio_data(base_bio, cc->per_bio_data_size);
+	struct dm_crypt_request *dmreq = dmreq_of_req(cc, req);
+
+	kfree(dmreq->sg_in_ext);
+	dmreq->sg_in_ext = NULL;
+	kfree(dmreq->sg_out_ext);
+	dmreq->sg_out_ext = NULL;
 
 	if ((struct skcipher_request *)(io + 1) != req)
 		mempool_free(req, &cc->req_pool);
@@ -1515,7 +1651,9 @@ static void crypt_free_req(struct crypt_config *cc, void *req, struct bio *base_
 static blk_status_t crypt_convert(struct crypt_config *cc,
 			 struct convert_context *ctx, bool atomic, bool reset_pending)
 {
-	unsigned int sector_step = cc->sector_size >> SECTOR_SHIFT;
+	const unsigned int sector_step = cc->sector_size >> SECTOR_SHIFT;
+	bool multi_du = test_bit(CRYPT_MULTI_DATA_UNIT, &cc->cipher_flags);
+	unsigned int processed;
 	int r;
 
 	/*
@@ -1536,8 +1674,13 @@ static blk_status_t crypt_convert(struct crypt_config *cc,
 
 		atomic_inc(&ctx->cc_pending);
 
+		processed = cc->sector_size;
 		if (crypt_integrity_aead(cc))
 			r = crypt_convert_block_aead(cc, ctx, ctx->r.req_aead, ctx->tag_offset);
+		else if (multi_du)
+			r = crypt_convert_block_skcipher_multi(cc, ctx,
+							       ctx->r.req,
+							       &processed);
 		else
 			r = crypt_convert_block_skcipher(cc, ctx, ctx->r.req, ctx->tag_offset);
 
@@ -1559,8 +1702,19 @@ static blk_status_t crypt_convert(struct crypt_config *cc,
 					 * exit and continue processing in a workqueue
 					 */
 					ctx->r.req = NULL;
-					ctx->tag_offset++;
-					ctx->cc_sector += sector_step;
+					if (!multi_du) {
+						ctx->tag_offset++;
+						ctx->cc_sector += sector_step;
+					} else {
+						bio_advance_iter(ctx->bio_in,
+								 &ctx->iter_in,
+								 processed);
+						bio_advance_iter(ctx->bio_out,
+								 &ctx->iter_out,
+								 processed);
+						ctx->cc_sector +=
+							processed >> SECTOR_SHIFT;
+					}
 					return BLK_STS_DEV_RESOURCE;
 				}
 			} else {
@@ -1574,19 +1728,41 @@ static blk_status_t crypt_convert(struct crypt_config *cc,
 		 */
 		case -EINPROGRESS:
 			ctx->r.req = NULL;
-			ctx->tag_offset++;
-			ctx->cc_sector += sector_step;
+			if (!multi_du) {
+				ctx->tag_offset++;
+				ctx->cc_sector += sector_step;
+			} else {
+				bio_advance_iter(ctx->bio_in, &ctx->iter_in,
+						 processed);
+				bio_advance_iter(ctx->bio_out, &ctx->iter_out,
+						 processed);
+				ctx->cc_sector += processed >> SECTOR_SHIFT;
+			}
 			continue;
 		/*
 		 * The request was already processed (synchronously).
 		 */
 		case 0:
 			atomic_dec(&ctx->cc_pending);
-			ctx->cc_sector += sector_step;
-			ctx->tag_offset++;
+			if (!multi_du) {
+				ctx->cc_sector += sector_step;
+				ctx->tag_offset++;
+			} else {
+				bio_advance_iter(ctx->bio_in, &ctx->iter_in,
+						 processed);
+				bio_advance_iter(ctx->bio_out, &ctx->iter_out,
+						 processed);
+				ctx->cc_sector += processed >> SECTOR_SHIFT;
+			}
 			if (!atomic)
 				cond_resched();
 			continue;
+		/* Multi-DU rejected (no memory or sync-only mismatch): fall back. */
+		case -EAGAIN:
+		case -EOPNOTSUPP:
+			atomic_dec(&ctx->cc_pending);
+			multi_du = false;
+			continue;
 		/*
 		 * There was a data integrity error.
 		 */
@@ -3063,6 +3239,29 @@ static int crypt_ctr_cipher(struct dm_target *ti, char *cipher_in, char *key)
 		}
 	}
 
+	/*
+	 * Enable multi-data-unit batching only when per-DU IVs can be
+	 * derived from one starting IV as a 128-bit LE counter, matching
+	 * skcipher_request_set_data_unit_size().  Only IV modes flagged
+	 * sector_iv_le128 qualify (plain64; not plain, whose 32-bit counter
+	 * wraps differently across a 2^32-sector boundary).  ivsize must be
+	 * 16 (the core rejects otherwise) and the cipher must be sync,
+	 * single-tfm, no integrity, no per-sector post() hook.  The driver
+	 * advertises nothing: the core auto-splits for drivers that lack
+	 * native support.
+	 */
+	if (!crypt_integrity_aead(cc) && cc->tfms_count == 1 &&
+	    cc->iv_gen_ops && cc->iv_gen_ops->sector_iv_le128 &&
+	    !cc->iv_gen_ops->post &&
+	    !cc->integrity_tag_size && !cc->integrity_iv_size &&
+	    crypto_skcipher_ivsize(any_tfm(cc)) == 16 &&
+	    !(crypto_skcipher_alg(any_tfm(cc))->base.cra_flags &
+	      CRYPTO_ALG_ASYNC)) {
+		set_bit(CRYPT_MULTI_DATA_UNIT, &cc->cipher_flags);
+		DMINFO("Using multi-data-unit crypto offload (du=%u)",
+		       cc->sector_size);
+	}
+
 	/* wipe the kernel key payload copy */
 	if (cc->key_string)
 		memset(cc->key, 0, cc->key_size * sizeof(u8));
-- 
2.47.3


* [PATCH v4 1/3] crypto: skcipher - add per-request data_unit_size with auto-splitting
From: Leonid Ravich @ 2026-06-15 11:14 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Alasdair Kergon, Ard Biesheuvel, Eric Biggers, Jens Axboe,
	Horia Geanta, Gilad Ben-Yossef, linux-crypto, dm-devel,
	linux-block
In-Reply-To: <20260615111459.9452-1-lravich@amazon.com>

Add a data_unit_size field to struct skcipher_request that lets a
caller submit several data units (typically 512..4096-byte sectors)
sharing one starting IV in a single request.  Algorithms derive each
data unit's IV from the caller-supplied IV by treating it as a
128-bit little-endian counter and adding the data-unit index, which
matches the layout produced by dm-crypt's plain64 IV mode and by
typical inline-encryption hardware.

This mirrors the data_unit_size concept already exposed by
struct blk_crypto_config for inline encryption.

The crypto API auto-splits a multi-data-unit request into per-DU
sub-requests when the underlying algorithm does not advertise
CRYPTO_ALG_SKCIPHER_NATIVE_MULTI_DU (a type-specific cra_flags bit,
defined in crypto/internal/skcipher.h).  A consumer sets
data_unit_size and submits: a native driver handles all units in one
pass, otherwise the core splits transparently.  The split derives
per-DU IVs as a 128-bit LE counter, so this is correct only for
algorithms using that IV convention (e.g. XTS with plain64-style
IVs); callers are responsible for that match, as they already are for
the IV itself.

skcipher_request_set_tfm() resets the field to 0 so a request reused
from a pool or stack defaults to single-data-unit semantics; callers
that want batching set it explicitly via
skcipher_request_set_data_unit_size() after configuring the tfm.

crypto_skcipher_encrypt()/decrypt() call
crypto_skcipher_validate_multi_du() before any algorithm dispatch.
data_unit_size must be a power of two when non-zero and at least
SKCIPHER_MDU_IVSIZE (realistic sizes are 512..4096, letting the
per-DU loop and the cryptlen alignment check use a mask instead of a
divide), and cryptlen a positive multiple of it; a malformed geometry
is rejected with -EINVAL.  A
target that cannot do multi-DU - ivsize != SKCIPHER_MDU_IVSIZE (16),
an lskcipher, or an async algorithm without the native flag - is
rejected with -EOPNOTSUPP so a caller can fall back.  Async is
excluded because the splitter dispatches synchronously: an
-EINPROGRESS return would leave later units unsubmitted while the
driver still owned the request's scatterlists and IV.  The check
gates the native path too, so algorithms never see a malformed
multi-DU request.
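
As a rough userspace illustration of the checks described above (not the
kernel code itself), the validation order can be modelled as below.
struct mdu_target and mdu_validate() are hypothetical names standing in
for the properties the patch reads from the tfm and algorithm; only the
error codes and the rules come from the description.

```c
#include <errno.h>
#include <stdbool.h>

#define MDU_IVSIZE 16	/* mirrors SKCIPHER_MDU_IVSIZE from the patch */

/* Hypothetical stand-in for what the patch reads from the tfm/alg. */
struct mdu_target {
	unsigned int ivsize;
	bool is_lskcipher;
	bool is_async;
	bool native_multi_du;
};

static int mdu_validate(unsigned int du, unsigned int cryptlen,
			const struct mdu_target *t)
{
	if (du == 0)
		return 0;	/* single-data-unit request: nothing to check */

	/* Geometry: power of two (one bit set), at least one IV in size. */
	if ((du & (du - 1)) != 0 || du < MDU_IVSIZE)
		return -EINVAL;

	/* With a power-of-two du, "multiple of du" is a mask test. */
	if (cryptlen == 0 || (cryptlen & (du - 1)) != 0)
		return -EINVAL;

	/* Capability mismatches are not malformed requests. */
	if (t->is_lskcipher || t->ivsize != MDU_IVSIZE)
		return -EOPNOTSUPP;
	if (t->is_async && !t->native_multi_du)
		return -EOPNOTSUPP;

	return 0;
}
```

A caller in this model treats -EINVAL as a bug in its own request and
-EOPNOTSUPP as a cue to fall back to one request per data unit.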

No in-tree algorithm sets CRYPTO_ALG_SKCIPHER_NATIVE_MULTI_DU yet;
subsequent patches add the testmgr coverage and the dm-crypt
consumer.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
---
 crypto/skcipher.c                  | 132 +++++++++++++++++++++++++++++
 include/crypto/internal/skcipher.h |  10 +++
 include/crypto/skcipher.h          |  28 ++++++
 3 files changed, 170 insertions(+)

diff --git a/crypto/skcipher.c b/crypto/skcipher.c
index 2b31d1d5d268..9262b47acfb9 100644
--- a/crypto/skcipher.c
+++ b/crypto/skcipher.c
@@ -17,6 +17,7 @@
 #include <linux/cryptouser.h>
 #include <linux/err.h>
 #include <linux/kernel.h>
+#include <linux/log2.h>
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/seq_file.h>
@@ -432,15 +433,139 @@ int crypto_skcipher_setkey(struct crypto_skcipher *tfm, const u8 *key,
 }
 EXPORT_SYMBOL_GPL(crypto_skcipher_setkey);
 
+/* IV size for the 128-bit LE-counter multi-data-unit convention. */
+#define SKCIPHER_MDU_IVSIZE	16
+
+static inline void skcipher_iv_inc_le128(u8 *iv)
+{
+	__le64 lo_le, hi_le;
+	u64 lo;
+
+	memcpy(&lo_le, iv, 8);
+	memcpy(&hi_le, iv + 8, 8);
+	lo = le64_to_cpu(lo_le) + 1;
+	lo_le = cpu_to_le64(lo);
+	memcpy(iv, &lo_le, 8);
+	if (unlikely(lo == 0)) {
+		hi_le = cpu_to_le64(le64_to_cpu(hi_le) + 1);
+		memcpy(iv + 8, &hi_le, 8);
+	}
+}
+
+/*
+ * Dispatch a multi-data-unit request as one single-DU sub-request per
+ * unit.  Each unit's IV is the caller's IV plus the unit index, taken
+ * as a 128-bit little-endian counter.  A pair of scatter_walks advances
+ * through src/dst in a single linear pass (O(entries + units)); building
+ * each sub-request's view with scatterwalk_ffwd() would instead rescan
+ * from the head every unit, i.e. O(units^2).
+ */
+static int skcipher_split_data_units(struct skcipher_request *req,
+				     int (*body)(struct skcipher_request *))
+{
+	const unsigned int du = req->data_unit_size;
+	const unsigned int total = req->cryptlen;
+	struct scatterlist *orig_src = req->src;
+	struct scatterlist *orig_dst = req->dst;
+	bool inplace = orig_src == orig_dst;
+	struct scatter_walk src_walk, dst_walk;
+	struct scatterlist src_sg[2], dst_sg[2];
+	u8 iv_orig[SKCIPHER_MDU_IVSIZE];
+	u8 iv_work[SKCIPHER_MDU_IVSIZE];
+	unsigned int off;
+	int err = 0;
+
+	memcpy(iv_orig, req->iv, sizeof(iv_orig));
+	memcpy(iv_work, iv_orig, sizeof(iv_orig));
+
+	sg_init_table(src_sg, 2);
+	scatterwalk_start(&src_walk, orig_src);
+	if (!inplace) {
+		sg_init_table(dst_sg, 2);
+		scatterwalk_start(&dst_walk, orig_dst);
+	}
+
+	/* Stop the per-DU body from re-entering the splitter. */
+	req->data_unit_size = 0;
+	req->src = src_sg;
+	req->dst = inplace ? src_sg : dst_sg;
+
+	for (off = 0; off < total; off += du) {
+		req->cryptlen = du;
+		scatterwalk_get_sglist(&src_walk, src_sg);
+		scatterwalk_skip(&src_walk, du);
+		if (!inplace) {
+			scatterwalk_get_sglist(&dst_walk, dst_sg);
+			scatterwalk_skip(&dst_walk, du);
+		}
+
+		err = body(req);
+		if (err)
+			break;
+
+		skcipher_iv_inc_le128(iv_work);
+		memcpy(req->iv, iv_work, sizeof(iv_work));
+	}
+
+	/* Caller-visible IV is the starting IV regardless of outcome. */
+	memcpy(req->iv, iv_orig, sizeof(iv_orig));
+	req->src = orig_src;
+	req->dst = orig_dst;
+	req->cryptlen = total;
+	req->data_unit_size = du;
+	return err;
+}
+
+static int crypto_skcipher_validate_multi_du(struct skcipher_request *req)
+{
+	const unsigned int du = req->data_unit_size;
+	struct crypto_skcipher *tfm;
+	struct skcipher_alg *alg;
+	u32 cra_flags;
+
+	if (likely(!du))
+		return 0;
+	if (!is_power_of_2(du) || du < SKCIPHER_MDU_IVSIZE)
+		return -EINVAL;
+	if (!req->cryptlen || (req->cryptlen & (du - 1)))
+		return -EINVAL;
+
+	tfm = crypto_skcipher_reqtfm(req);
+	alg = crypto_skcipher_alg(tfm);
+
+	/* lskcipher's *_sg path doesn't honour data_unit_size. */
+	if (alg->co.base.cra_type != &crypto_skcipher_type)
+		return -EOPNOTSUPP;
+
+	/* Capability mismatch, not a malformed request: report -EOPNOTSUPP. */
+	if (crypto_skcipher_ivsize(tfm) != SKCIPHER_MDU_IVSIZE)
+		return -EOPNOTSUPP;
+
+	/* The auto-splitter is sync-only; native drivers own async dispatch. */
+	cra_flags = alg->co.base.cra_flags;
+	if ((cra_flags & CRYPTO_ALG_ASYNC) &&
+	    !(cra_flags & CRYPTO_ALG_SKCIPHER_NATIVE_MULTI_DU))
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
 int crypto_skcipher_encrypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
+	int err;
 
 	if (crypto_skcipher_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
 		return -ENOKEY;
+	err = crypto_skcipher_validate_multi_du(req);
+	if (err)
+		return err;
 	if (alg->co.base.cra_type != &crypto_skcipher_type)
 		return crypto_lskcipher_encrypt_sg(req);
+	if (req->data_unit_size &&
+	    !(alg->co.base.cra_flags & CRYPTO_ALG_SKCIPHER_NATIVE_MULTI_DU))
+		return skcipher_split_data_units(req, alg->encrypt);
 	return alg->encrypt(req);
 }
 EXPORT_SYMBOL_GPL(crypto_skcipher_encrypt);
@@ -449,11 +574,18 @@ int crypto_skcipher_decrypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
+	int err;
 
 	if (crypto_skcipher_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
 		return -ENOKEY;
+	err = crypto_skcipher_validate_multi_du(req);
+	if (err)
+		return err;
 	if (alg->co.base.cra_type != &crypto_skcipher_type)
 		return crypto_lskcipher_decrypt_sg(req);
+	if (req->data_unit_size &&
+	    !(alg->co.base.cra_flags & CRYPTO_ALG_SKCIPHER_NATIVE_MULTI_DU))
+		return skcipher_split_data_units(req, alg->decrypt);
 	return alg->decrypt(req);
 }
 EXPORT_SYMBOL_GPL(crypto_skcipher_decrypt);
diff --git a/include/crypto/internal/skcipher.h b/include/crypto/internal/skcipher.h
index a965b6aabf61..4c826f3bc715 100644
--- a/include/crypto/internal/skcipher.h
+++ b/include/crypto/internal/skcipher.h
@@ -21,6 +21,16 @@
  */
 #define CRYPTO_ALG_SKCIPHER_REQSIZE_LARGE CRYPTO_ALG_OPTIONAL_KEY
 
+/*
+ * Set by an skcipher that handles skcipher_request::data_unit_size > 0
+ * natively in one pass; otherwise the API splits the request.  Lives in
+ * the type-specific 0xff000000 cra_flags range.  A native driver must
+ * derive per-DU IVs as a 128-bit LE counter and leave @iv at the
+ * caller-supplied starting value on return, success or error, matching
+ * the auto-splitter so the two paths are observably identical.
+ */
+#define CRYPTO_ALG_SKCIPHER_NATIVE_MULTI_DU	0x01000000
+
 struct aead_request;
 struct rtattr;
 
diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
index 4efe2ca8c4d1..ced1fae08147 100644
--- a/include/crypto/skcipher.h
+++ b/include/crypto/skcipher.h
@@ -31,6 +31,11 @@ struct scatterlist;
 /**
  *	struct skcipher_request - Symmetric key cipher request
  *	@cryptlen: Number of bytes to encrypt or decrypt
+ *	@data_unit_size: Size in bytes of each data unit, or 0 for a
+ *		single-data-unit request (the default).  When non-zero,
+ *		must be a power of two, @cryptlen must be a positive
+ *		multiple of it, and per-DU IVs are derived from @iv as a
+ *		128-bit little-endian counter.
  *	@iv: Initialisation Vector
  *	@src: Source SG list
  *	@dst: Destination SG list
@@ -39,6 +44,7 @@ struct scatterlist;
  */
 struct skcipher_request {
 	unsigned int cryptlen;
+	unsigned int data_unit_size;
 
 	u8 *iv;
 
@@ -225,6 +231,7 @@ struct lskcipher_alg {
 	struct skcipher_request *name = \
 		(((struct skcipher_request *)__##name##_desc)->base.tfm = \
 			crypto_sync_skcipher_tfm((_tfm)), \
+		 ((struct skcipher_request *)__##name##_desc)->data_unit_size = 0, \
 		 (void *)__##name##_desc)
 
 /**
@@ -819,6 +826,8 @@ static inline void skcipher_request_set_tfm(struct skcipher_request *req,
 					    struct crypto_skcipher *tfm)
 {
 	req->base.tfm = crypto_skcipher_tfm(tfm);
+	/* Reused requests default to single-data-unit. */
+	req->data_unit_size = 0;
 }
 
 static inline void skcipher_request_set_sync_tfm(struct skcipher_request *req,
@@ -937,5 +946,24 @@ static inline void skcipher_request_set_crypt(
 	req->iv = iv;
 }
 
+/**
+ * skcipher_request_set_data_unit_size() - submit as multiple data units
+ * @req: request handle
+ * @data_unit_size: data-unit size in bytes (power of two), or 0 to disable
+ *
+ * Process @req as @cryptlen / @data_unit_size data units sharing one starting
+ * @iv, with per-DU IVs derived as a 128-bit little-endian counter.  @cryptlen
+ * must be a positive multiple of @data_unit_size, else the encrypt/decrypt
+ * call returns -EINVAL; a target that cannot do multi-DU (ivsize != 16, an
+ * lskcipher, or async without native support) returns -EOPNOTSUPP.  Unlike
+ * the single-DU path, @iv is preserved across the call regardless of outcome.
+ */
+static inline void
+skcipher_request_set_data_unit_size(struct skcipher_request *req,
+				    unsigned int data_unit_size)
+{
+	req->data_unit_size = data_unit_size;
+}
+
 #endif	/* _CRYPTO_SKCIPHER_H */
 

base-commit: a8cafdf8c949f17c92eca0045532e88ac0dac30d
-- 
2.47.3



* [PATCH v4 0/3] crypto: skcipher - per-request multi-data-unit batching
From: Leonid Ravich @ 2026-06-15 11:14 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Alasdair Kergon, Ard Biesheuvel, Eric Biggers, Jens Axboe,
	Horia Geanta, Gilad Ben-Yossef, linux-crypto, dm-devel,
	linux-block

This is v4, addressing Herbert's review of v3.  Two architectural
changes:

  - data_unit_size is now per-request (on struct skcipher_request)
    rather than per-tfm.  Reverts to the v1 placement.

  - The crypto API auto-splits multi-data-unit requests when the
    underlying algorithm does not advertise
    CRYPTO_ALG_SKCIPHER_NATIVE_MULTI_DU.  Consumers no longer test
    for multi-DU support before submitting; setting data_unit_size
    on any skcipher request whose algorithm uses the 128-bit LE
    counter IV convention "just works".

These two changes shrink the series from 4 patches to 3 (the
generic xts(...) template needs no special handling - the
auto-splitter calls its single-DU encrypt/decrypt once per data
unit) and simplify the dm-crypt consumer (no advertise-flag check,
no per-tfm setup).

v3: https://lore.kernel.org/linux-crypto/20260601085641.16028-1-lravich@amazon.com/
v2: https://lore.kernel.org/linux-crypto/20260527065021.19525-1-lravich@amazon.com/
v1: https://lore.kernel.org/linux-crypto/20260519115955.27267-1-lravich@amazon.com/

The series adds a per-request "data unit size" to the skcipher API
so a caller can submit several data units (typically 512..4096-byte
sectors) sharing one starting IV in a single request.  Algorithms
derive each data unit's IV from the caller-supplied IV by treating
it as a 128-bit little-endian counter and adding the data-unit
index, matching the layout produced by dm-crypt's plain64 IV mode
and by typical inline-encryption hardware.
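
To make the IV convention concrete, here is a minimal userspace sketch
of deriving the IV for data unit number "index" from the starting IV.
mdu_iv_for_unit() is a hypothetical helper written for illustration; the
series itself increments the IV once per unit rather than adding an
index in one step, but the resulting values should be the same.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the series: treat start_iv as a
 * 128-bit little-endian counter and add the data-unit index to it.
 */
static void mdu_iv_for_unit(const uint8_t start_iv[16], uint64_t index,
			    uint8_t out_iv[16])
{
	uint64_t lo = 0, hi = 0;
	int i;

	/* Decode the two little-endian 64-bit halves byte by byte. */
	for (i = 7; i >= 0; i--) {
		lo = (lo << 8) | start_iv[i];
		hi = (hi << 8) | start_iv[8 + i];
	}

	lo += index;
	if (lo < index)		/* low half wrapped: carry into high half */
		hi++;

	/* Encode back to little-endian bytes. */
	for (i = 0; i < 8; i++) {
		out_iv[i] = (uint8_t)(lo >> (8 * i));
		out_iv[8 + i] = (uint8_t)(hi >> (8 * i));
	}
}
```

With a plain64-style starting IV (64-bit sector number in the low half,
zero in the high half), unit i simply gets sector + i.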

This mirrors the data_unit_size concept already exposed by
struct blk_crypto_config for inline encryption.

The first user is dm-crypt, which today issues one skcipher request
per sector and so pays a per-sector cost in request allocation,
callback dispatch, completion handling, and scatterlist setup.

Proof-of-concept performance numbers from the RFC reply [1]: +19%
throughput / -40% CPU on a single-core arm64 system with a hardware
XTS-AES-256 accelerator running fio 4 KiB sequential writes through
dm-crypt, when an out-of-tree arm64 xts driver advertises
CRYPTO_ALG_SKCIPHER_NATIVE_MULTI_DU.  This series itself does not
include arch enablement; the fast path is opt-in per driver, the
slow path is universal via the auto-splitter.

The native fast path amortises both per-sector dispatch and per-sector
crypto setup across a bio - the measured win above, on an engine that
offloads the AES compute.  The auto-splitter is for correctness and
reach: any consumer can set data_unit_size and get correct output with
the per-request allocation/callback/completion cost removed, but it
still issues one alg->encrypt per data unit, so on a software cipher it
saves only dispatch overhead (no throughput figure claimed - that is
hardware- and workload-dependent).  What it guarantees unconditionally
is byte-identical output (Verification below) at O(entries + units),
walking the scatterlists with a pair of struct scatter_walk cursors
rather than rescanning from the head per unit.
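
The auto-splitter's control flow can be sketched in userspace roughly as
follows.  This is a simplified model over a flat buffer, not the kernel
implementation: the real splitter walks scatterlists, and mdu_split(),
mdu_iv_inc(), mdu_body_fn and mdu_toy_body() are hypothetical names.  The
toy body is a placeholder XOR, not a cipher.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-unit callback: one single-DU call, like alg->encrypt. */
typedef int (*mdu_body_fn)(uint8_t *data, size_t len, const uint8_t iv[16]);

/* Toy stand-in for a cipher (NOT encryption): XOR data with the IV bytes. */
static int mdu_toy_body(uint8_t *data, size_t len, const uint8_t iv[16])
{
	size_t i;

	for (i = 0; i < len; i++)
		data[i] ^= iv[i % 16];
	return 0;
}

/* Increment a 16-byte little-endian counter by one, carrying bytewise. */
static void mdu_iv_inc(uint8_t iv[16])
{
	int i;

	for (i = 0; i < 16; i++)
		if (++iv[i] != 0)
			break;
}

/* Assumes total is a positive multiple of du (validated beforehand). */
static int mdu_split(uint8_t *data, size_t total, size_t du,
		     uint8_t iv[16], mdu_body_fn body)
{
	uint8_t start_iv[16];
	size_t off;
	int err = 0;

	memcpy(start_iv, iv, sizeof(start_iv));

	/* One body call per data unit, each with its own counter value. */
	for (off = 0; off < total; off += du) {
		err = body(data + off, du, iv);
		if (err)
			break;
		mdu_iv_inc(iv);
	}

	/* Caller-visible IV is the starting IV, on success or error. */
	memcpy(iv, start_iv, sizeof(start_iv));
	return err;
}
```

The point of the model is the shape of the loop: the per-unit work is
unchanged, and only the request setup around it is shared.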

[1] https://lore.kernel.org/linux-crypto/20260428101225.24316-1-lravich@amazon.com/

Changes since v3
----------------

- data_unit_size moved from struct crypto_skcipher (per-tfm) to
  struct skcipher_request (per-request).  (Herbert)

- Crypto API auto-splits multi-data-unit requests when the algorithm
  does not advertise CRYPTO_ALG_SKCIPHER_NATIVE_MULTI_DU.  Drops the
  per-tfm setter/probe in favour of a single
  skcipher_request_set_data_unit_size() usable by every consumer.
  (Herbert)

- CRYPTO_ALG_SKCIPHER_NATIVE_MULTI_DU is a type-specific cra_flags
  bit (0x01000000) in crypto/internal/skcipher.h, not a generic bit
  in the public header; drivers set it to opt OUT of auto-splitting.

- The auto-splitter advances through src/dst with a pair of struct
  scatter_walk cursors (scatterwalk_start / scatterwalk_get_sglist /
  scatterwalk_skip) instead of scatterwalk_ffwd() per unit, which
  rescans from the head and is O(units^2) under fragmentation; the
  cursors give a single linear pass.  (Eric)

- crypto_skcipher_validate_multi_du() reports -EINVAL for a malformed
  geometry (du not a power of two, cryptlen not a positive multiple)
  and -EOPNOTSUPP for a target that cannot do multi-DU (ivsize != 16,
  lskcipher, or async without the native flag), so a caller can fall
  back.  Gates the native path too, not just the auto-splitter.
  (Eric)

- testmgr cross-checks the batched dispatch against an independent
  N x single-DU reference with LE128-walked IVs over a fragmented
  scatterlist (pins the IV convention and exercises the cursor),
  round-trips, and checks IV preservation.  Ineligible algorithms
  skip via -EOPNOTSUPP; a real mismatch returns -EBADMSG.

- dm-crypt enables batching only for IV modes flagged sector_iv_le128
  (a new bool on struct crypt_iv_operations, set on plain64 only),
  plus ivsize 16, sync, single-tfm, no integrity, no post() hook.  The
  flag replaces a hardcoded plain64 pointer-compare, so eligibility is
  a self-documenting property of the IV mode rather than a special
  case.  plain stays excluded (its 32-bit counter wraps differently
  past 2^32 sectors).  Sets req->data_unit_size = sector_size and
  submits; -EOPNOTSUPP/-EAGAIN fall back to the per-sector path.
  Mikulas's v2 Reviewed-by is dropped as the dm-crypt patch was
  substantially rewritten.

- The generic xts(...) template needs no separate handling, dropping
  the v3 crypto/xts.c patch (4 -> 3 patches).

Design overview
---------------

* Patch 1 adds the data_unit_size field, the setter, the
  CRYPTO_ALG_SKCIPHER_NATIVE_MULTI_DU flag, and the auto-splitter in
  crypto_skcipher_encrypt()/decrypt().  skcipher_request_set_tfm()
  resets the field so a reused request defaults to single-DU.

* Patch 2 adds the testmgr multi-DU test (every ivsize == 16
  skcipher).

* Patch 3 turns dm-crypt batching on automatically under the
  conditions above and sets req->data_unit_size = cc->sector_size.

This series does NOT add the capability flag to any arch driver; the
auto-splitter ensures correctness without that opt-in.

Verification
------------

A regression protocol is included in the project tree
(.claude/regression-protocol.md, .claude/run-regression.sh).  The
reference run reports 12/12 PASS:

  - x86 + arm64 build clean; checkpatch.pl --strict clean.
  - testmgr multi-DU: PASS for every ivsize == 16 skcipher in-tree.
  - dm-crypt activation gating: plain64 enabled; essiv:sha256 /
    plain64be / plain fall back.
  - dm-crypt round-trip plain64 with multi-DU via the auto-splitter
    (xts-aes-aesni, no native flag): PASS.
  - dm-crypt round-trip essiv:sha256 (per-sector path): PASS.
  - dm-crypt low-memory (mem=128M): PASS, no OOM kill.
  - Byte-equivalence: 256 MB of ciphertext through the auto-splitter
    is bit-identical to an unpatched axboe/for-next baseline (sha256
    4913910b1aa6f8859fcb8f4adec20230274993a3ade8f4dd0140a323dc43efc0).
  - arm64 functional under qemu-aarch64: PASS.



Leonid Ravich (3):
  crypto: skcipher - add per-request data_unit_size with auto-splitting
  crypto: testmgr - test for multi-data-unit dispatch
  dm crypt: batch all sectors of a bio per crypto request

 crypto/skcipher.c                  | 132 +++++++++++++++++++
 crypto/testmgr.c                   | 192 +++++++++++++++++++++++++
 drivers/md/dm-crypt.c              | 215 +++++++++++++++++++++++++++--
 include/crypto/internal/skcipher.h |  10 ++
 include/crypto/skcipher.h          |  28 ++++
 5 files changed, 569 insertions(+), 8 deletions(-)


base-commit: a8cafdf8c949f17c92eca0045532e88ac0dac30d
--
2.47.3



* [PATCH] crypto: sun4i-ss: remove debugfs directory on teardown
From: Pengpeng Hou @ 2026-06-15  9:11 UTC (permalink / raw)
  To: Corentin Labbe, Herbert Xu, David S. Miller, Chen-Yu Tsai,
	Jernej Skrabec, Samuel Holland, linux-crypto, linux-arm-kernel,
	linux-sunxi, linux-kernel
  Cc: Pengpeng Hou

sun4i_ss_probe() creates a debugfs directory and a stats file with struct
sun4i_ss_ctx as private data. The remove path unregisters the crypto
algorithms and tears down runtime PM but leaves the debugfs entries
published.

Remove the debugfs subtree before tearing down the driver state used by
the stats show callback.

Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
---
 drivers/crypto/allwinner/sun4i-ss/sun4i-ss-core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-core.c b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-core.c
index 58a76e2ba64e..bcaddf1b83ca 100644
--- a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-core.c
+++ b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-core.c
@@ -512,6 +512,8 @@ static void sun4i_ss_remove(struct platform_device *pdev)
 	int i;
 	struct sun4i_ss_ctx *ss = platform_get_drvdata(pdev);
 
+	debugfs_remove_recursive(ss->dbgfs_dir);
+
 	for (i = 0; i < ARRAY_SIZE(ss_algs); i++) {
 		switch (ss_algs[i].type) {
 		case CRYPTO_ALG_TYPE_SKCIPHER:
-- 
2.50.1 (Apple Git-155)



* Re: i.MX95: EdgeLock Enclave secure storage
From: Frieder Schrempf @ 2026-06-15  7:18 UTC (permalink / raw)
  To: Fabio Estevam, Pankaj Gupta
  Cc: moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Peng Fan,
	Stefano Babic, Frank Li
In-Reply-To: <CAOMZO5DgENq8RU6s2CPnKsf53i=7zoBeO38m_BtV=w54hr2hgQ@mail.gmail.com>

On 13.06.26 15:58, Fabio Estevam wrote:
> Hi Pankaj,
> 
> First of all, thank you for your work on upstreaming the
> EdgeLock Enclave (ELE) support. It is great to finally see the
> ELE framework landing upstream after a long development effort.
> 
> I am currently evaluating the state of i.MX95 secure-boot and
> storage-security support based on current linux-next, with the
> goal of understanding what can already be achieved using
> upstream software and what pieces are still under development.
> 
> From my review, it appears that the following infrastructure is
> already available upstream:
> 
> - ELE/V2X mailbox support for i.MX95.
> - OCOTP/ELE nvmem support for fuse access.

There is no upstream support for OCOTP access via ELE. The
imx-ocotp-ele.c driver (despite its name) does not currently use the ELE
but the FSB to access the fuses (and is therefore limited to read-only
access).

I have some local WIP to add ELE support for the OCOTP driver. I think I
can post it soonish.

> - Secure-enclave bindings documenting the i.MX95 ELE HSM.
> 
> However, I could not find upstream support for several
> capabilities that would be useful for secure storage
> deployments on i.MX95, including:
> 
> - An ELE-backed trusted-key provider for the Linux trusted key
> framework.
> - Integration allowing Linux to use ELE as a key-sealing/
> unsealing backend.
> - i.MX95-specific crypto acceleration exposed through the Linux
> crypto API for dm-crypt use cases.
> 
> Are you aware of any ongoing upstream or planned development
> activities in these areas, particularly for i.MX95?
> 
> Any information about the upstream roadmap, ongoing
> development, or expected direction for these features would be
> greatly appreciated.
> 
> Thanks again for your work and for any insights you can share.
> 
> Regards,
> 
> Fabio Estevam



* Re: [PATCH v3 2/4] KVM: selftests: Verify SNP VMs are rejected from migration and mirroring
From: Atish Patra @ 2026-06-15  6:46 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Borislav Petkov, Dave Hansen,
	x86, H. Peter Anvin, Tom Lendacky, Peter Gonda, Brijesh Singh,
	Youngjae Lee, Ashish Kalra, Michael Roth, John Allen, Herbert Xu
  Cc: clm, kvm, linux-kernel, linux-crypto, stable, Atish Patra
In-Reply-To: <20260602-sev_snp_fixes-v3-2-585e4783a42f@meta.com>


On 6/2/26 3:11 PM, Atish Patra wrote:
> From: Atish Patra <atishp@meta.com>
>
> Migration and mirroring of SEV-SNP VMs are not supported yet.
>
> Add two selftests that verify KVM rejects intra-host migration and
> mirroring when the source VM is an SNP VM, so the restriction stays enforced
> until proper SNP state transfer is implemented.
>
> Signed-off-by: Atish Patra <atishp@meta.com>
> ---
>   .../testing/selftests/kvm/x86/sev_migrate_tests.c  | 47 ++++++++++++++++++++++
>   1 file changed, 47 insertions(+)
>
> diff --git a/tools/testing/selftests/kvm/x86/sev_migrate_tests.c b/tools/testing/selftests/kvm/x86/sev_migrate_tests.c
> index 6b0928e69051..acef6ab26d3d 100644
> --- a/tools/testing/selftests/kvm/x86/sev_migrate_tests.c
> +++ b/tools/testing/selftests/kvm/x86/sev_migrate_tests.c
> @@ -313,6 +313,49 @@ static void test_sev_mirror_parameters(void)
>   	kvm_vm_free(vm_no_vcpu);
>   }
>   
> +static void test_sev_snp_migrate_reject(void)
> +{
> +	struct kvm_vm *src_vm, *dst_vm;
> +	int ret;
> +
> +	src_vm = vm_create_barebones_type(KVM_X86_SNP_VM);
> +	snp_vm_init(src_vm);
> +	__vm_vcpu_add(src_vm, 0);
> +	vm_sev_launch(src_vm, snp_default_policy(), NULL);
> +
> +	dst_vm = vm_create_barebones_type(KVM_X86_SNP_VM);
> +	__vm_vcpu_add(dst_vm, 0);
> +
> +	ret = __sev_migrate_from(dst_vm, src_vm);
> +	TEST_ASSERT(ret == -1 && errno == EINVAL,
> +		    "SNP VM migration should be rejected. ret: %d, errno: %d",
> +		    ret, errno);
> +
> +	kvm_vm_free(src_vm);
> +	kvm_vm_free(dst_vm);
> +}
> +
> +static void test_sev_snp_mirror_reject(void)
> +{
> +	struct kvm_vm *src_vm, *dst_vm;
> +	int ret;
> +
> +	src_vm = vm_create_barebones_type(KVM_X86_SNP_VM);
> +	snp_vm_init(src_vm);
> +	__vm_vcpu_add(src_vm, 0);
> +	vm_sev_launch(src_vm, snp_default_policy(), NULL);
> +
> +	dst_vm = aux_vm_create(false);
> +
> +	ret = __sev_mirror_create(dst_vm, src_vm);
> +	TEST_ASSERT(ret == -1 && errno == EINVAL,
> +		    "SNP VM mirroring should be rejected. ret: %d, errno: %d",
> +		    ret, errno);
> +
> +	kvm_vm_free(src_vm);
> +	kvm_vm_free(dst_vm);
> +}
> +
>   static void test_sev_move_copy(void)
>   {
>   	struct kvm_vm *dst_vm, *dst2_vm, *dst3_vm, *sev_vm, *mirror_vm,
> @@ -384,12 +427,16 @@ int main(int argc, char *argv[])
>   		test_sev_migrate_parameters();
>   		if (kvm_has_cap(KVM_CAP_VM_COPY_ENC_CONTEXT_FROM))
>   			test_sev_move_copy();
> +		if (kvm_cpu_has(X86_FEATURE_SEV_SNP))
> +			test_sev_snp_migrate_reject();
>   	}
>   	if (kvm_has_cap(KVM_CAP_VM_COPY_ENC_CONTEXT_FROM)) {
>   		test_sev_mirror(/* es= */ false);
>   		if (have_sev_es)
>   			test_sev_mirror(/* es= */ true);
>   		test_sev_mirror_parameters();
> +		if (kvm_cpu_has(X86_FEATURE_SEV_SNP))
> +			test_sev_snp_mirror_reject();
>   	}
>   	return 0;
>   }
>
Gentle ping for any feedback on this patch?


* Re: [PATCH 2/2] dt-bindings: crypto: qcom,inline-crypto-engine: Document Maili ICE
From: Jingyi Wang @ 2026-06-15  6:12 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Herbert Xu, David S. Miller, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Vinod Koul, Bjorn Andersson, aiqun.yu,
	tingwei.zhang, trilok.soni, yijie.yang, linux-arm-msm,
	linux-crypto, devicetree, linux-kernel
In-Reply-To: <20260610-mighty-dalmatian-of-piety-2fa184@quoll>



On 6/10/2026 4:55 PM, Krzysztof Kozlowski wrote:
> On Tue, Jun 09, 2026 at 02:08:57AM -0700, Jingyi Wang wrote:
>> The Inline Crypto Engine found on Maili SoC is compatible with the common
>> baseline IP 'qcom,inline-crypto-engine'. Hence, document the compatible as
>> such.
>>
>> Signed-off-by: Jingyi Wang <jingyi.wang@oss.qualcomm.com>
>> ---
>>   Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml b/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml
>> index db895c50e2d2..c9489f6b8081 100644
>> --- a/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml
>> +++ b/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml
>> @@ -16,6 +16,7 @@ properties:
>>             - qcom,eliza-inline-crypto-engine
>>             - qcom,hawi-inline-crypto-engine
>>             - qcom,kaanapali-inline-crypto-engine
>> +          - qcom,maili-inline-crypto-engine
> 
> Why clocks are flexible?

I have just noticed that this patch has been merged:
https://lore.kernel.org/all/20260416-qcom_ice_power_and_clk_vote-v5-1-5ccf5d7e2846@oss.qualcomm.com/

I will add qcom,maili-inline-crypto-engine to the eliza/milos list in the
next version. (Maybe hawi should be added at the same time?)

Thanks,
Jingyi

> 
> Best regards,
> Krzysztof
> 



