DMA Engine development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Bartosz Golaszewski" <bartosz.golaszewski@oss.qualcomm.com>
Cc: dmaengine@vger.kernel.org, vkoul@kernel.org, Frank.Li@kernel.org
Subject: Re: [PATCH v20 06/14] dmaengine: qcom: bam_dma: add support for BAM locking
Date: Mon, 29 Jun 2026 10:16:13 +0000	[thread overview]
Message-ID: <20260629101613.DA0821F00A3D@smtp.kernel.org> (raw)
In-Reply-To: <20260629-qcom-qce-cmd-descr-v20-6-56f67da84c05@oss.qualcomm.com>

Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:
- [High] Race condition in locked descriptor processing leaves appended transactions unprotected.
- [High] Missing endianness conversion on physical address assigned to hardware descriptor.
- [High] Per-descriptor metadata is unsafely stored in the shared channel state.
- [High] Mapping an unaligned embedded structure field for DMA.
- [Medium] Internal command buffers are mapped using the client device instead of the DMA controller device.
- [Medium] Logic mismatch restricts BAM locking to TX channels instead of RX channels.
--

commit 74132b5d8141c72f8f781c94cfc7fae626e96be7
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

dmaengine: qcom: bam_dma: add support for BAM locking

Add support for BAM pipe locking. To that end: when starting DMA on an RX
channel - prepend the existing queue of issued descriptors with an
additional "dummy" command descriptor with the LOCK bit set. Once the
transaction is done (no more issued descriptors), issue one more dummy
descriptor with the UNLOCK bit.

[ ... ]

> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c

[ ... ]

> @@ -686,6 +701,35 @@ static int bam_slave_config(struct dma_chan *chan,
>  	return 0;
>  }
>  
> +static int bam_metadata_attach(struct dma_async_tx_descriptor *desc, void *data, size_t len)
> +{
> +	struct bam_chan *bchan = to_bam_chan(desc->chan);
> +	const struct bam_device_data *bdata = bchan->bdev->dev_data;
> +	struct bam_desc_metadata *metadata = data;
> +
> +	if (!data)
> +		return -EINVAL;
> +
> +	if (!bdata->pipe_lock_supported)
> +		/*
> +		 * The client wants to use locking but this BAM version doesn't
> +		 * support it. Don't return an error here as this will stop the
> +		 * client from using DMA at all for no reason.
> +		 */
> +		return 0;
> +
> +	guard(spinlock_irqsave)(&bchan->vc.lock);
> +
> +	bchan->scratchpad_addr = metadata->scratchpad_addr;
> +	bchan->direction = metadata->direction;

[Severity: High]
Does this safely handle multiple descriptors being prepared before they are
issued?

Since the attach callback operates per-descriptor, storing these properties
directly into the shared bchan object could overwrite the state of previously
prepared descriptors. 

When the locked sequence is eventually executed, might the channel apply this
overwritten metadata to all pending transactions, potentially using the wrong
scratchpad address for earlier ones?

> +
> +	return 0;
> +}
> +

[ ... ]

> @@ -1046,13 +1114,101 @@ static void bam_apply_new_config(struct bam_chan *bchan,
>  	bchan->reconfigure = 0;
>  }
>  
> +static struct bam_async_desc *
> +bam_make_lock_desc(struct bam_chan *bchan, unsigned long flag)
> +{
> +	struct dma_chan *chan = &bchan->vc.chan;
> +	struct bam_async_desc *async_desc;
> +	struct bam_desc_hw *desc;
> +	struct virt_dma_desc *vd;
> +	struct virt_dma_chan *vc;
> +	unsigned int mapped;
> +
> +	async_desc = kzalloc_flex(*async_desc, desc, 1, GFP_NOWAIT);
> +	if (!async_desc) {
> +		dev_err(bchan->bdev->dev, "failed to allocate the BAM lock descriptor\n");
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	sg_init_table(&async_desc->lock_sg, 1);
> +
> +	async_desc->num_desc = 1;
> +	async_desc->curr_desc = async_desc->desc;
> +	async_desc->dir = DMA_MEM_TO_DEV;
> +
> +	desc = async_desc->desc;
> +
> +	bam_prep_ce_le32(&async_desc->lock_ce, bchan->scratchpad_addr, BAM_WRITE_COMMAND, 0);
> +	sg_set_buf(&async_desc->lock_sg, &async_desc->lock_ce, sizeof(async_desc->lock_ce));

[Severity: High]
Could this mapping violate DMA API requirements for cacheline alignment?

The lock_ce field is embedded within struct bam_async_desc and is not
guaranteed to be cacheline-aligned to ARCH_DMA_MINALIGN. Mapping it via
dma_map_sg() means it shares a cacheline with other fields.

If the CPU modifies adjacent fields (like async_desc->desc_node) while the
DMA mapping is active, it dirties the cacheline. On non-coherent
architectures, could this lead to unpredictable behavior and trigger DMA API
debug warnings?

> +
> +	mapped = dma_map_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);

[Severity: Medium]
Should this use the BAM DMA controller device instead of the client device?

The DMA mapping is being created in the IOMMU domain of the client device
(chan->slave). If the client device and the BAM controller do not share an
IOMMU domain, the BAM controller might be unable to access the command
element, resulting in an IOMMU fault.

Would it be better to map this using chan->device->dev or bchan->bdev->dev
since the BAM controller is the one reading the command buffer?

> +	if (!mapped) {
> +		kfree(async_desc);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	desc->flags |= cpu_to_le16(DESC_FLAG_CMD | flag);
> +	desc->addr = sg_dma_address(&async_desc->lock_sg);

[Severity: High]
Is an endianness conversion missing here?

The desc->addr field is __le32, but it is being assigned a native
dma_addr_t without cpu_to_le32(). 

On big-endian systems, won't this write the wrong byte order into the hardware
descriptor and cause the BAM controller to fetch from an invalid memory address?

> +	desc->size = cpu_to_le16(sizeof(struct bam_cmd_element));
> +
> +	vc = &bchan->vc;
> +	vd = &async_desc->vd;
> +
> +	dma_async_tx_descriptor_init(&vd->tx, &vc->chan);
> +	vd->tx.flags = DMA_PREP_CMD;
> +	vd->tx_result.result = DMA_TRANS_NOERROR;
> +	vd->tx_result.residue = 0;
> +
> +	return async_desc;
> +}
> +
> +static int bam_setup_pipe_lock(struct bam_chan *bchan)
> +{
> +	const struct bam_device_data *bdata = bchan->bdev->dev_data;
> +	struct bam_async_desc *lock_desc, *unlock_desc;
> +
> +	lockdep_assert_held(&bchan->vc.lock);
> +
> +	if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr ||
> +	    bchan->direction != DMA_MEM_TO_DEV)

[Severity: Medium]
Does this logic correctly target RX channels?

The commit message explicitly states: "when starting DMA on an RX channel",
but DMA_MEM_TO_DEV specifies the TX direction. 

Will this silently skip the lock setup for RX channels (DMA_DEV_TO_MEM)?

> +		return 0;

[ ... ]

> @@ -1064,9 +1220,23 @@ static void bam_start_dma(struct bam_chan *bchan)
>  
>  	lockdep_assert_held(&bchan->vc.lock);
>  
> +	vd = vchan_next_desc(&bchan->vc);
>  	if (!vd)
>  		return;
>  
> +	/*
> +	 * Wrap the issued work with a LOCK/UNLOCK pair exactly once, at the
> +	 * start of a fresh sequence and only when there is real work to lock
> +	 * around. On a re-entry after a full FIFO, we see the BAM is locked
> +	 * and must not add another pair we simply continue loading the
> +	 * remainder of the same locked sequence.
> +	 */
> +	if (!bchan->bam_locked) {
> +		ret = bam_setup_pipe_lock(bchan);
> +		if (ret == 0 && bchan->bam_locked)
> +			vd = vchan_next_desc(&bchan->vc);
> +	}

[Severity: High]
Is there a race condition here if new descriptors are appended?

If a new descriptor is submitted via vchan_issue_pending() while the hardware
is busy processing a locked sequence, the new descriptor is appended after the
UNLOCK descriptor.

When bam_start_dma() loops to process this new descriptor, bchan->bam_locked
is still true because it is only cleared asynchronously when the UNLOCK
hardware interrupt fires:

process_channel_irqs() {
    ...
    if (flags & DESC_FLAG_UNLOCK)
        bchan->bam_locked = false;
    ...
}

Because bchan->bam_locked is still true, won't this skip adding a new LOCK,
leaving the new transaction unprotected in the hardware FIFO?

> +
>  	ret = pm_runtime_get_sync(bdev->dev);
>  	if (ret < 0)
>  		return;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260629-qcom-qce-cmd-descr-v20-0-56f67da84c05@oss.qualcomm.com?part=6

  reply	other threads:[~2026-06-29 10:16 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
2026-06-29 15:04   ` Pandey, Radhey Shyam
2026-06-29 10:01 ` [PATCH v20 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path Bartosz Golaszewski
2026-06-29 10:18   ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
2026-06-29 10:17   ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
2026-06-29 10:18   ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
2026-06-29 10:16   ` sashiko-bot [this message]
2026-06-29 10:01 ` [PATCH v20 07/14] crypto: qce - Cancel work on device detach Bartosz Golaszewski
2026-06-29 10:15   ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 08/14] crypto: qce - Include algapi.h in the core.h header Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 09/14] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request() Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
2026-06-29 10:17   ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 12/14] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
2026-06-29 10:14   ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 13/14] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
2026-06-29 10:22   ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 14/14] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
2026-06-29 10:24   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260629101613.DA0821F00A3D@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=Frank.Li@kernel.org \
    --cc=bartosz.golaszewski@oss.qualcomm.com \
    --cc=dmaengine@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox