Re: [PATCH v17 13/14] crypto: qce - Add BAM DMA support for crypto register I/O

DMA Engine development
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Bartosz Golaszewski" <bartosz.golaszewski@oss.qualcomm.com>
Cc: vkoul@kernel.org, Frank.Li@kernel.org, dmaengine@vger.kernel.org
Subject: Re: [PATCH v17 13/14] crypto: qce - Add BAM DMA support for crypto register I/O
Date: Tue, 19 May 2026 13:56:30 +0000	[thread overview]
Message-ID: <20260519135637.8FDAAC2BCB8@smtp.kernel.org> (raw)
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-13-53a595414b79@oss.qualcomm.com>

Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider:
- [High] DMA API misuse causing potential memory corruption or IOMMU faults.
- [High] Missing endianness conversion for BAM DMA command data.
- [High] Buffer overflow due to missing bounds checking in BAM DMA arrays.
- [High] Resource leak of DMA mappings on successful crypto operations.
- [Medium] Race condition in `qce_dma_terminate_all()` due to improper DMA unmap ordering.
- [Low] Unused coherent DMA memory allocation for register reads.
- [Low] Suboptimal scatterlist construction for BAM command descriptors.
--

commit 7d591e31fc1682350db9b4ee4360e0e434326203
Author: Bartosz Golaszewski <brgl@kernel.org>

crypto: qce - Add BAM DMA support for crypto register I/O

Switch to using BAM DMA for register I/O in addition to passing data. To
that end: provide the necessary infrastructure in the driver, modify the
ordering of operations as required and replace all direct register writes
with wrappers queueing DMA command descriptors.

> diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
> index 3db46fc0c419a..b66e6386fccda 100644
> --- a/drivers/crypto/qce/dma.c
> +++ b/drivers/crypto/qce/dma.c

[ ... ]

> +int qce_submit_cmd_desc(struct qce_device *qce)
> +{
> +	struct qce_desc_info *qce_desc = qce->dma.bam_txn->desc;
> +	struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
> +	struct dma_async_tx_descriptor *dma_desc;
> +	struct dma_chan *chan = qce->dma.rxchan;
> +	unsigned long attrs = DMA_PREP_CMD;
> +	dma_cookie_t cookie;
> +	unsigned int mapped;
> +	int ret;
> +
> +	mapped = dma_map_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
> +	if (!mapped)
> +		return -ENOMEM;

Does this leak the DMA mapping on successful crypto operations?

The wr_sgl scatterlist is mapped for DMA here, but there does not appear to
be a corresponding dma_unmap_sg() on the successful completion paths of the
crypto requests. The unmapping is currently only handled in
qce_dma_terminate_all() during error cleanup.

> +
> +	dma_desc = dmaengine_prep_slave_sg(chan, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt,
> +					   DMA_MEM_TO_DEV, attrs);

Should this pass the mapped variable instead of bam_txn->wr_sgl_cnt?

Because dma_map_sg() coalesces contiguous physical memory regions, the
number of returned segments can be fewer than the original count. Passing
the original wr_sgl_cnt might force the DMA engine to process trailing,
unmapped scatterlist entries, which could lead to IOMMU faults or memory
corruption.

> +	if (!dma_desc) {
> +		ret = -ENOMEM;
> +		goto err_unmap_sg;
> +	}

[ ... ]

> +static void qce_prep_dma_cmd_desc(struct qce_device *qce, struct qce_dma_data *dma,
> +				  unsigned int addr, void *buf)
> +{
> +	struct qce_bam_transaction *bam_txn = dma->bam_txn;
> +	struct bam_cmd_element *bam_ce_buf;
> +	int bam_ce_size, cnt, idx;
> +
> +	idx = bam_txn->bam_ce_idx;
> +	bam_ce_buf = &bam_txn->bam_ce[idx];
> +	bam_prep_ce_le32(bam_ce_buf, addr, BAM_WRITE_COMMAND, *((__le32 *)buf));

Could this result in incorrect endianness on big-endian hardware?

The value passed to buf originates as a native host-endian u32 from
qce_write_dma(). Casting it directly to __le32 * and dereferencing it skips
the use of cpu_to_le32(). This would write native big-endian values directly
to the little-endian BAM DMA hardware.

> +
> +	bam_ce_buf = &bam_txn->bam_ce[bam_txn->pre_bam_ce_idx];
> +	bam_txn->bam_ce_idx++;

Does this code overflow the bam_txn->bam_ce[] array?

The array is sized to QCE_BAM_CMD_ELEMENT_SIZE (128), but there are no
bounds checks before unconditionally incrementing bam_ce_idx.

> +	bam_ce_size = (bam_txn->bam_ce_idx - bam_txn->pre_bam_ce_idx) * sizeof(*bam_ce_buf);
> +
> +	cnt = bam_txn->wr_sgl_cnt;
> +
> +	sg_set_buf(&bam_txn->wr_sgl[cnt], bam_ce_buf, bam_ce_size);
> +
> +	++bam_txn->wr_sgl_cnt;

Similarly, does this overflow the bam_txn->wr_sgl[] array?

It is also sized to QCE_BAM_CMD_SGL_SIZE (128) and incremented without
checking if the bounds have been exceeded.

> +	bam_txn->pre_bam_ce_idx = bam_txn->bam_ce_idx;

Could the scatterlist construction be improved here?

By setting pre_bam_ce_idx equal to bam_ce_idx after every write, this
appears to create a separate 16-byte scatterlist entry for every individual
register write instead of batching adjacent command elements into a single
contiguous scatterlist segment.

> +}
> +
> +void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val)
> +{
> +	unsigned int reg_addr = ((unsigned int)(qce->base_phys) + offset);
> +
> +	qce_prep_dma_cmd_desc(qce, &qce->dma, reg_addr, &val);
> +}
> +
>  int devm_qce_dma_request(struct qce_device *qce)
>  {

[ ... ]

> +	qce->read_buf = dmam_alloc_coherent(qce->dev, QCE_MAX_REG_READ * sizeof(*qce->read_buf),
> +					    &qce->read_buf_dma, GFP_KERNEL);

Is this allocated memory actually used?

The driver appears to allocate this coherent DMA memory, but continues to
use the synchronous readl() for qce_read(), leaving this buffer unused.

> +	if (!qce->read_buf)
> +		return -ENOMEM;
> +
>  	return 0;
>  }

[ ... ]

>  int qce_dma_terminate_all(struct qce_dma_data *dma)
>  {
> +	struct qce_device *qce = container_of(dma, struct qce_device, dma);
> +	struct qce_bam_transaction *bam_txn = dma->bam_txn;
>  	int ret;
>  
> +	dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
> +
>  	ret = dmaengine_terminate_all(dma->rxchan);
>  	return ret ?: dmaengine_terminate_all(dma->txchan);
>  }

Could this ordering cause an IOMMU fault?

Calling dma_unmap_sg() before dmaengine_terminate_all() can mean that the
memory is unmapped while the DMA engine is still actively processing
descriptors during an error or cancellation path. Should the hardware be
terminated prior to unmapping the memory?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com?part=13

next prev parent reply	other threads:[~2026-05-19 13:56 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-19 13:17 [PATCH v17 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
2026-05-19 13:17 ` [PATCH v17 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
2026-05-19 13:17 ` [PATCH v17 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path Bartosz Golaszewski
2026-05-19 13:48   ` sashiko-bot
2026-05-19 13:17 ` [PATCH v17 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
2026-05-19 13:44   ` sashiko-bot
2026-05-19 13:17 ` [PATCH v17 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
2026-05-19 13:17 ` [PATCH v17 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
2026-05-19 13:17 ` [PATCH v17 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
2026-05-19 13:54   ` sashiko-bot
2026-05-19 13:17 ` [PATCH v17 07/14] crypto: qce - Cancel work on device detach Bartosz Golaszewski
2026-05-19 13:47   ` sashiko-bot
2026-05-19 13:17 ` [PATCH v17 08/14] crypto: qce - Include algapi.h in the core.h header Bartosz Golaszewski
2026-05-19 13:17 ` [PATCH v17 09/14] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
2026-05-19 13:48   ` sashiko-bot
2026-05-19 13:17 ` [PATCH v17 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request() Bartosz Golaszewski
2026-05-19 13:17 ` [PATCH v17 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
2026-05-19 14:00   ` sashiko-bot
2026-05-19 13:17 ` [PATCH v17 12/14] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
2026-05-19 13:52   ` sashiko-bot
2026-05-19 13:17 ` [PATCH v17 13/14] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
2026-05-19 13:56   ` sashiko-bot [this message]
2026-05-19 13:17 ` [PATCH v17 14/14] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
2026-05-19 14:17   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260519135637.8FDAAC2BCB8@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=Frank.Li@kernel.org \
    --cc=bartosz.golaszewski@oss.qualcomm.com \
    --cc=dmaengine@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox