Re: [PATCH 2/4] crypto: rockchip: Add RK356x/RK3588 cryptographic offloader driver

Devicetree
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Dawid Olesinski" <dawidro@gmail.com>
Cc: devicetree@vger.kernel.org, conor+dt@kernel.org, robh@kernel.org
Subject: Re: [PATCH 2/4] crypto: rockchip: Add RK356x/RK3588 cryptographic offloader driver
Date: Sat, 30 May 2026 16:34:47 +0000	[thread overview]
Message-ID: <20260530163448.9B6391F00898@smtp.kernel.org> (raw)
In-Reply-To: <20260530160704.3453555-3-dawidro@gmail.com>

Thank you for your contribution! Sashiko AI review found 9 potential issue(s) to consider:
- [High] Kconfig selects a non-existent symbol `CRYPTO_SM3_GENERIC`.
- [High] Incorrect IV offset calculation uses full scatterlist element capacity instead of the processed length, causing out-of-bounds reads and breaking cipher chaining.
- [High] The shared interrupt handler accesses hardware unconditionally without verifying PM state, causing fatal aborts.
- [High] Device removal cancels pending autosuspend timers without explicitly suspending the hardware, leading to permanent clock and power leakage.
- [High] Algorithm registration copies templates blindly without generating unique `cra_driver_name`s, preventing multi-instance probing.
- [High] A race condition between DMA timeout handling and delayed interrupts can cause spurious completions and memory corruption.
- [High] Hardware descriptors and parameters are assigned using CPU-native integers instead of fixed-endian types, breaking the driver on big-endian architectures.
- [High] 64-bit DMA addresses are silently truncated to 32 bits, risking memory corruption on systems with >4GB of RAM.
- [Medium] Algorithm fallback statistics are updated without atomics in highly concurrent code paths, leading to data races.
--

commit a4f5122cb83b084e2414510f88a5efed52b14807
Author: Dawid Olesinski <dawidro@gmail.com>

crypto: rockchip: Add RK356x/RK3588 cryptographic offloader driver

This commit adds a driver for the second-generation Rockchip cryptographic
hardware accelerator found on RK3568 and RK3588 SoCs. It provides hardware
acceleration for symmetric block ciphers and hashing functions.

> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index d23b58b81ca34..47a891593814a 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
[ ... ]
> @@ -709,6 +709,39 @@ config CRYPTO_DEV_ROCKCHIP_DEBUG
>  	  This will create /sys/kernel/debug/rk3288_crypto/stats for displaying
>  	  the number of requests per algorithm and other internal stats.
>  
> +config CRYPTO_DEV_ROCKCHIP2
> +	tristate "Rockchip's cryptographic offloader"
> +	depends on OF && ARCH_ROCKCHIP
> +	depends on PM
> +	select CRYPTO_ECB
> +	select CRYPTO_CBC
> +	select CRYPTO_AES
> +	select CRYPTO_MD5
> +	select CRYPTO_SHA1
> +	select CRYPTO_SHA256
> +	select CRYPTO_SHA512
> +	select CRYPTO_SM3_GENERIC

[Severity: High]
Does this select a nonexistent config symbol? The symbol CRYPTO_SM3_GENERIC
does not seem to exist in the kernel tree (the correct symbol is
CRYPTO_SM3).

Selecting a nonexistent symbol is silently ignored by Kconfig, potentially
leaving the generic SM3 algorithm disabled. Since the driver unconditionally
attempts to allocate a fallback for sm3 in rk2_hash_init_tfm(), could this
cause runtime allocation failures and prevent the hashing algorithm from
instantiating?

> diff --git a/drivers/crypto/rockchip/rk2_crypto.c b/drivers/crypto/rockchip/rk2_crypto.c
> new file mode 100644
> index 0000000000000..df7dab4d7ca0b
> --- /dev/null
> +++ b/drivers/crypto/rockchip/rk2_crypto.c
[ ... ]
> +static int rk2_crypto_pm_suspend(struct device *dev)
> +{
> +	struct rk2_crypto_dev *rkdev = dev_get_drvdata(dev);
> +
> +	reset_control_assert(rkdev->rst);
> +	udelay(10);
> +	clk_bulk_disable_unprepare(rkdev->num_clks, rkdev->clks);
> +
> +	return 0;
> +}

[Severity: High]
Should this suspend path call synchronize_irq() before gating the clocks?

Since the device requests its interrupt with IRQF_SHARED, another device
sharing the IRQ line could trigger the handler. If the clocks are disabled
while the shared interrupt handler is running or about to run, could this
lead to a system crash when the handler accesses powered-down registers?

[ ... ]
> +static void rk2_crypto_pm_exit(struct rk2_crypto_dev *rkdev)
> +{
> +	pm_runtime_disable(rkdev->dev);
> +}

[Severity: High]
Is it safe to unconditionally call pm_runtime_disable() here without ensuring
the hardware is suspended?

During removal, pm_runtime_disable() will cancel any pending asynchronous
suspends. Without a preceding pm_runtime_put_sync() or explicit hardware
shutdown in the removal path, the hardware might remain permanently locked
in the RPM_ACTIVE state, leaking clocks indefinitely.

[ ... ]
> +static irqreturn_t rk2_crypto_irq_handle(int irq, void *dev_id)
> +{
> +	struct rk2_crypto_dev *rkc = platform_get_drvdata(dev_id);
> +	u32 v;
> +
> +	v = readl(rkc->reg + RK2_CRYPTO_DMA_INT_ST);

[Severity: High]
Can this unconditional register read cause an external abort?

For a shared interrupt handler, the handler might be invoked when the device
is runtime-suspended. Reading from the register without first verifying the
device is active (e.g., using pm_runtime_get_if_active()) could cause a crash
or return 0xffffffff, which might generate a spurious completion.

[ ... ]
> +static int rk2_crypto_probe(struct platform_device *pdev)
> +{
[ ... ]
> +	rkc->num_algs = ARRAY_SIZE(rk2_crypto_algs_template);
> +	rkc->algs = devm_kmemdup(dev, rk2_crypto_algs_template,
> +				 sizeof(rk2_crypto_algs_template), GFP_KERNEL);

[Severity: High]
Does duplicating the static templates preserve identical cra_driver_name
values across multiple device instances?

If a second hardware instance probes (such as on RK3588 SoCs with multiple
crypto cores), submitting these copies to the crypto subsystem without
modifying their cra_driver_name strings will result in a global name collision.
Could this cause crypto_engine_register_skcipher() to fail with -EEXIST,
aborting the probe for subsequent instances?

[ ... ]
> +	rkc->tl = dma_alloc_coherent(dev,
> +				     sizeof(struct rk2_crypto_lli) * MAX_LLI,
> +				     &rkc->t_phy, GFP_KERNEL);

[Severity: High]
Is a DMA mask needed before performing allocations?

The driver programs hardware descriptors using 32-bit DMA fields and writes
the LLI table base address into a 32-bit hardware register. Because the driver
never limits the allocation mask via dma_set_mask_and_coherent(dev,
DMA_BIT_MASK(32)) during initialization, the allocator might provide physical
addresses above 4GB on systems with large RAM.

If the upper 32 bits of the 64-bit dma_addr_t are discarded, could this cause
the DMA controller to read or write random regions of low memory?

> diff --git a/drivers/crypto/rockchip/rk2_crypto_ahash.c b/drivers/crypto/rockchip/rk2_crypto_ahash.c
> new file mode 100644
> index 0000000000000..5aeff32d14028
> --- /dev/null
> +++ b/drivers/crypto/rockchip/rk2_crypto_ahash.c
[ ... ]
> +static bool rk2_ahash_need_fallback(struct ahash_request *areq)
> +{
[ ... ]
> +	if (nents > 1) {
> +		algt->stat_fb_sgdiff++;
> +		return true;
> +	}

[Severity: Medium]
Is it safe to increment these statistics without atomic protection?

Software fallback functions increment performance statistics directly inside
frontend operations. Since these fallbacks bypass the serialized
crypto_engine kthread queue, they can be executed concurrently by multiple
threads on the same algorithm template instance.

Concurrent load-add-store operations on an unsigned long might create a
data race, resulting in lost updates. Should these counters use atomic types
or be protected by locks?

[ ... ]
> +int rk2_hash_run(struct crypto_engine *engine, void *breq)
> +{
[ ... ]
> +	while (sgs && len > 0) {
[ ... ]
> +		todo = min(sg_dma_len(sgs), len);
> +		dd->src_addr = sg_dma_address(sgs);
> +		dd->src_len = todo;

[Severity: High]
Does this directly assign native types to __le32 LLI descriptor fields?

Fields in struct rk2_crypto_lli like src_addr and src_len are defined as
__le32. Assigning CPU-native values to them directly without using the
required cpu_to_le32() conversions will break the driver on big-endian
architectures, as the hardware receives corrupted addresses.

[ ... ]
> +		timeout = wait_for_completion_timeout(&rkc->complete,
> +						      msecs_to_jiffies(2000));
> +	if (!timeout) {
> +		dev_err(rkc->dev, "DMA timeout\n");
> +		err = -ETIMEDOUT;
> +		reset_control_assert(rkc->rst);
> +		udelay(10);
> +		reset_control_deassert(rkc->rst);
> +		goto exit;
> +	}

[Severity: High]
Does the timeout error path need to synchronize with the interrupt handler?

When wait_for_completion_timeout() expires, the driver resets the hardware
and exits. The crypto framework then submits the next request, resetting the
completion structure. If the timed-out request's interrupt was delayed and
fires concurrently on another CPU, it will trigger complete() on the newly
initialized completion.

Could this spurious completion cause the next request to return from wait
immediately, unmapping buffers prematurely while the DMA engine is still
actively reading/writing memory?

> diff --git a/drivers/crypto/rockchip/rk2_crypto_skcipher.c b/drivers/crypto/rockchip/rk2_crypto_skcipher.c
> new file mode 100644
> index 0000000000000..e1a1a1a130964
> --- /dev/null
> +++ b/drivers/crypto/rockchip/rk2_crypto_skcipher.c
[ ... ]
> +int rk2_cipher_run(struct crypto_engine *engine, void *async_req)
> +{
[ ... ]
> +		if (areq->iv && crypto_skcipher_ivsize(tfm) > 0) {
> +			if (rctx->mode & RK2_CRYPTO_DEC) {
> +				offset = sgs->length - ivsize;

[Severity: High]
Does this calculate the IV offset using the full scatterlist element capacity
instead of the processed length?

The driver attempts to back up the output IV by copying the last processed
block. However, scatterlist lengths (sgs->length) represent buffer
capacities, which are often larger than the requested cryptlen or the
processed todo amount.

By using offset = sgs->length - ivsize, the driver might read uninitialized
or unrelated memory from the end of the scatterlist buffer instead of reading
the newly generated ciphertext at todo - ivsize. Could this corrupt the IV,
silently break subsequent chained block operations, and risk leaking
unrelated memory?

[ ... ]
> +		if (algt->is_xts) {
> +			for (i = 0; i < ctx->keylen / 8; i++) {
> +				v = cpu_to_be32(rkey[i]);
> +				writel(v, rkc->reg + RK2_CRYPTO_KEY0 + i * 4);

[Severity: High]
Does combining cpu_to_be32() with writel() incorrectly byte-swap on
big-endian systems?

The writel() function already performs a native-to-little-endian swap. By
pre-swapping the value with cpu_to_be32(), this code creates a double-swap
issue on big-endian architectures. Additionally, casting the key byte array
to u32* violates strict aliasing. Will this result in the hardware receiving
a corrupted key?

[ ... ]
> +		dd->src_addr = sg_dma_address(sgs);
> +		dd->src_len = todo;

[Severity: High]
Are CPU-native values being assigned directly to __le32 LLI descriptor fields
here as well?

Like in the ahash implementation, these fields require cpu_to_le32() conversions
to function correctly on big-endian systems.

[ ... ]
> +		timeout = wait_for_completion_timeout(&rkc->complete,
> +						      msecs_to_jiffies(2000));
> +		if (sgs == sgd) {
> +			dma_unmap_sg(rkc->dev, sgs, 1, DMA_BIDIRECTIONAL);
> +		} else {
> +			dma_unmap_sg(rkc->dev, sgs, 1, DMA_TO_DEVICE);
> +			dma_unmap_sg(rkc->dev, sgd, 1, DMA_FROM_DEVICE);
> +		}
> +
> +		if (!timeout) {
> +			dev_err(rkc->dev, "DMA timeout\n");
> +			err = -ETIMEDOUT;
> +			reset_control_assert(rkc->rst);
> +			udelay(10);
> +			reset_control_deassert(rkc->rst);
> +			goto exit;
> +		}

[Severity: High]
Is this timeout error path subject to the same race condition with delayed
interrupts as in the ahash implementation?

Without a call to synchronize_irq(), a delayed interrupt from this request
could fire during the next request, causing spurious completions and memory
corruption when buffers are unmapped prematurely.

[ ... ]
> +		if (areq->iv && ivsize > 0) {
> +			offset = sgd->length - ivsize;

[Severity: High]
Is the IV offset calculated incorrectly on the encryption path as well?

Using sgd->length instead of the processed length (todo) can read out of
bounds and corrupt the IV for chained block operations.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260530160704.3453555-1-dawidro@gmail.com?part=2

next prev parent reply	other threads:[~2026-05-30 16:34 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-30 16:06 [PATCH 0/4] crypto: rockchip: Add RK356x/RK3588 cryptographic Dawid Olesinski
2026-05-30 16:06 ` [PATCH 1/4] dt-bindings: crypto: rockchip: Add RK356x/RK3588 crypto engine binding Dawid Olesinski
2026-05-30 16:18   ` sashiko-bot
2026-05-30 20:48   ` Rob Herring (Arm)
2026-05-30 16:06 ` [PATCH 2/4] crypto: rockchip: Add RK356x/RK3588 cryptographic offloader driver Dawid Olesinski
2026-05-30 16:34   ` sashiko-bot [this message]
2026-05-30 16:06 ` [PATCH 3/4] arm64: dts: rockchip: Add crypto node to rk356x-base Dawid Olesinski
2026-05-30 16:42   ` sashiko-bot
2026-05-30 16:06 ` [PATCH 4/4] arm64: dts: rockchip: Add crypto node to rk3588-base Dawid Olesinski
2026-05-30 16:50   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260530163448.9B6391F00898@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=dawidro@gmail.com \
    --cc=devicetree@vger.kernel.org \
    --cc=robh@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox