Re: [PATCH v19 06/14] dmaengine: qcom: bam_dma: add support for BAM locking

DMA Engine development
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Bartosz Golaszewski" <bartosz.golaszewski@oss.qualcomm.com>
Cc: Frank.Li@kernel.org, vkoul@kernel.org, dmaengine@vger.kernel.org
Subject: Re: [PATCH v19 06/14] dmaengine: qcom: bam_dma: add support for BAM locking
Date: Tue, 26 May 2026 15:01:13 +0000	[thread overview]
Message-ID: <20260526150113.C3EC11F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260526-qcom-qce-cmd-descr-v19-6-08472fdcbf4a@oss.qualcomm.com>

Thank you for your contribution! Sashiko AI review found 8 potential issue(s) to consider:
- [Critical] Dummy lock descriptors steal DMA cookies and cause premature completion of client transfers.
- [High] Hardware FIFO exhaustion causes unbounded spurious injections of nested LOCK and UNLOCK descriptors.
- [High] A partial allocation or mapping failure leaves the BAM pipeline permanently locked.
- [High] Pipe locking is applied exclusively to TX channels, contradicting the stated intent to support RX channels.
- [High] The dummy command descriptor is mapped using the client peripheral's device instead of the DMA engine's device.
- [High] Missing `cpu_to_le32()` conversion for the DMA descriptor hardware address.
- [High] The `lock_ce` command element shares a CPU cache line with other structure members, leading to false sharing.
- [Medium] Missing length validation when attaching descriptor metadata.
--

commit 8b6f3d804203416c63344d33ae0ad758f79620c7
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

dmaengine: qcom: bam_dma: add support for BAM locking

Add support for BAM pipe locking. When starting DMA on an RX channel, this
prepends the queue of issued descriptors with a dummy command descriptor with
the LOCK bit set, and appends one with the UNLOCK bit once the transaction is
done.

> diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
> index 04fe1d546be73..84fd9e181bdd5 100644
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c
[ ... ]
> @@ -72,6 +76,10 @@ struct bam_async_desc {
>  
>  	struct bam_desc_hw *curr_desc;
>  
> +	/* BAM locking infrastructure */
> +	struct scatterlist lock_sg;
> +	struct bam_cmd_element lock_ce;

[Severity: High]
Is it safe to embed lock_ce directly in struct bam_async_desc without
cache line alignment?

Since lock_ce is mapped for DMA, sharing a CPU cache line with other members
like desc_node could violate the DMA API. On architectures with non-coherent
caches, DMA cache maintenance operations might cause memory corruption or
lost CPU writes to adjacent fields in the structure.

> +
>  	/* list node for the desc in the bam_chan list of descriptors */
>  	struct list_head desc_node;
[ ... ]
> @@ -652,6 +664,35 @@ static int bam_slave_config(struct dma_chan *chan,
>  	return 0;
>  }
>  
> +static int bam_metadata_attach(struct dma_async_tx_descriptor *desc, void *data, size_t len)
> +{
> +	struct bam_chan *bchan = to_bam_chan(desc->chan);
> +	const struct bam_device_data *bdata = bchan->bdev->dev_data;
> +	struct bam_desc_metadata *metadata = data;
> +
> +	if (!data)
> +		return -EINVAL;

[Severity: Medium]
Could this result in an out-of-bounds read if len is smaller than expected?

The driver casts the data pointer to struct bam_desc_metadata * and
dereferences it without validating that len is at least the size of
struct bam_desc_metadata.

> +
> +	if (!bdata->pipe_lock_supported)
[ ... ]
> @@ -1012,13 +1057,105 @@ static void bam_apply_new_config(struct bam_chan *bchan,
[ ... ]
> +static struct bam_async_desc *
> +bam_make_lock_desc(struct bam_chan *bchan, unsigned long flag)
> +{
[ ... ]
> +	bam_prep_ce_le32(&async_desc->lock_ce, bchan->scratchpad_addr, BAM_WRITE_COMMAND, 0);
> +	sg_set_buf(&async_desc->lock_sg, &async_desc->lock_ce, sizeof(async_desc->lock_ce));
> +
> +	mapped = dma_map_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);

[Severity: High]
Should the memory be mapped using the DMA engine's device rather than the
client peripheral's device?

Because the BAM DMA controller executes this command, the memory needs to be
mapped into its own address space. If the DMA controller and the peripheral
are in different IOMMU domains, mapping it for chan->slave might yield an
address that triggers an IOMMU fault when the BAM DMA controller accesses it.

> +	if (!mapped) {
> +		kfree(async_desc);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	desc->flags |= cpu_to_le16(DESC_FLAG_CMD | flag);
> +	desc->addr = sg_dma_address(&async_desc->lock_sg);

[Severity: High]
Are we missing a cpu_to_le32() conversion here?

The physical DMA address returned by sg_dma_address() is assigned directly
to desc->addr, which is of type __le32. On big-endian architectures, this
might cause the address to be written with the wrong byte order, leading the
hardware engine to fetch from an invalid physical address.

> +	desc->size = cpu_to_le16(sizeof(struct bam_cmd_element));
> +
> +	vc = &bchan->vc;
> +	vd = &async_desc->vd;
> +
> +	dma_async_tx_descriptor_init(&vd->tx, &vc->chan);
> +	vd->tx.flags = DMA_PREP_CMD;
> +	vd->tx.desc_free = vchan_tx_desc_free;
> +	vd->tx_result.result = DMA_TRANS_NOERROR;
> +	vd->tx_result.residue = 0;
> +
> +	cookie = dma_cookie_assign(&vd->tx);

[Severity: Critical]
Will assigning a new cookie to the lock descriptor cause premature completion
of client transfers?

Calling dma_cookie_assign() gives this descriptor a higher cookie value than
the client's already-submitted descriptors. Since the LOCK descriptor is
prepended to desc_issued, it executes first. When it completes, the channel's
completed_cookie is updated to this higher value.

Subsequent calls to dma_cookie_status() will falsely report the pending client
descriptors as complete, potentially causing the client to free or reuse memory
while the hardware is still performing DMA.

> +	ret = dma_submit_error(cookie);
[ ... ]
> +	return async_desc;
> +}
> +
> +static int bam_do_setup_pipe_lock(struct bam_chan *bchan, bool lock)
> +{
> +	struct bam_device *bdev = bchan->bdev;
> +	const struct bam_device_data *bdata = bdev->dev_data;
> +	struct bam_async_desc *lock_desc;
> +	unsigned long flag;
> +
> +	lockdep_assert_held(&bchan->vc.lock);
> +
> +	if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr ||
> +	    bchan->direction != DMA_MEM_TO_DEV)

[Severity: High]
Does this logic contradict the commit message's stated intent to support
RX channels?

The check here aborts if the direction is not DMA_MEM_TO_DEV (a TX channel).
This applies pipe locking exclusively to TX channels, completely disabling it
for the intended RX channels.

> +		return 0;
> +
> +	flag = lock ? DESC_FLAG_LOCK : DESC_FLAG_UNLOCK;
[ ... ]
> +	return 0;
> +}
> +
> +static void bam_setup_pipe_lock(struct bam_chan *bchan)
> +{
> +	if (bam_do_setup_pipe_lock(bchan, true) || bam_do_setup_pipe_lock(bchan, false))

[Severity: High]
Could a partial allocation failure leave the BAM pipeline permanently deadlocked?

The insertion relies on short-circuiting logic. If the first call successfully
adds the LOCK descriptor to desc_issued, and the second call for the UNLOCK
subsequently fails (e.g. returning -ENOMEM), an error is logged but the LOCK
descriptor is never rolled back.

The transaction would then execute with the LOCK, but since no UNLOCK is
issued, the channel remains permanently locked.

> +		dev_err(bchan->vc.chan.slave, "Failed to setup BAM pipe lock descriptors");
> +}
> +
>  /**
>   * bam_start_dma - start next transaction
>   * @bchan: bam dma channel
>   */
>  static void bam_start_dma(struct bam_chan *bchan)
>  {
> -	struct virt_dma_desc *vd = vchan_next_desc(&bchan->vc);
> +	struct virt_dma_desc *vd;
>  	struct bam_device *bdev = bchan->bdev;
>  	struct bam_async_desc *async_desc = NULL;
>  	struct bam_desc_hw *desc;
> @@ -1030,6 +1167,9 @@ static void bam_start_dma(struct bam_chan *bchan)
>  
>  	lockdep_assert_held(&bchan->vc.lock);
>  
> +	bam_setup_pipe_lock(bchan);

[Severity: High]
Could this unconditionally accumulate spurious nested LOCK and UNLOCK descriptors
if the hardware FIFO is full?

During execution, if the loop over descriptors breaks early due to
!IS_BUSY(bchan), it leaves the remaining descriptors in desc_issued.
When the next interrupt triggers bam_start_dma() to continue the transaction,
it will unconditionally add another LOCK and UNLOCK pair. This seems to break
the hardware protocol by accumulating duplicate locking commands in the middle
of a continuous transfer.

> +
> +	vd = vchan_next_desc(&bchan->vc);
>  	if (!vd)
>  		return;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526-qcom-qce-cmd-descr-v19-0-08472fdcbf4a@oss.qualcomm.com?part=6

next prev parent reply	other threads:[~2026-05-26 15:01 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path Bartosz Golaszewski
2026-05-26 13:45   ` sashiko-bot
2026-05-26 13:10 ` [PATCH v19 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
2026-05-26 14:17   ` sashiko-bot
2026-05-26 13:10 ` [PATCH v19 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
2026-05-26 15:01   ` sashiko-bot [this message]
2026-05-26 13:10 ` [PATCH v19 07/14] crypto: qce - Cancel work on device detach Bartosz Golaszewski
2026-05-26 15:33   ` sashiko-bot
2026-05-26 13:10 ` [PATCH v19 08/14] crypto: qce - Include algapi.h in the core.h header Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 09/14] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
2026-05-26 15:57   ` sashiko-bot
2026-05-26 13:10 ` [PATCH v19 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request() Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
2026-05-26 16:09   ` sashiko-bot
2026-05-26 13:11 ` [PATCH v19 12/14] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
2026-05-26 16:30   ` sashiko-bot
2026-05-26 13:11 ` [PATCH v19 13/14] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
2026-05-26 17:13   ` sashiko-bot
2026-05-26 13:11 ` [PATCH v19 14/14] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
2026-05-26 18:20   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260526150113.C3EC11F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=Frank.Li@kernel.org \
    --cc=bartosz.golaszewski@oss.qualcomm.com \
    --cc=dmaengine@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox