From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9D63B3B2FD4 for ; Mon, 29 Jun 2026 10:16:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782728176; cv=none; b=BDR45CIWuZ4MpXGC91Qx8JUSeM4xYRXTxBpjqwThAX14IiyY6vzCo3fOnHRPQXvvRc4ufSjX6Br8pxPNonstfeoTlvyd67zfuwqG5KjmCR6zeI/Oe0rBkEPEQcSzdr2Z42V0UgwhyJ6k5s1iX7PyxyVFIYnWTCQ2CFYmgp70UsM= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782728176; c=relaxed/simple; bh=p4fO68t2YUsdopmamKnSgRC0bVYgGoOzDytI3EAn374=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=vGJdXyrFD3JMtO5ucChUitYkmPsoXU5jEXC+DLNjQunL5slvNdSXh4s/E/0aV6MJ7HBVolel5Njpkimx0VCbgfoZk5+r1CstOV6n/UAZXp93fv/Cx/r7Nr63OOH1czv7B3zSRbsilh5vz/1K8mBa9DxGgRQ+ycIX1X7iaS8AQDA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ACPtIGed; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ACPtIGed" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DA0821F00A3D; Mon, 29 Jun 2026 10:16:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1782728174; bh=v+LZcHQ9oW2mjm1wzuQ102T8kG2XQ0ghvwfrKESkcVo=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=ACPtIGedtf0AVZR7Iek0vbgpx6Uodbkbku36WU/OTNMFw9fpffXibWkSN40IuWfXK iR88xcWtloBYvFHKQ9O0eO/nsfZUv6RzfjgU40bpNTmAhLKU5sPARIV6BGrPqJ5zH3 DSmieYA0C8IiIcFVoHrFsrwgX4UChCFBUrDR0i107uwcFRuBZUTnPT9nRaoumqDkdW 0vsts29mamSolq3lC9N7w9Bd/0S4MGkqIvwinqbqPQN0vF1H8qhkQ+Y/Cfn7ZMgzXN W7iGUF7RbARG08+cnbDGcBnoQtq2h05NZpjst5uUl3zDWcacaT0bvJTPXFmlgESXkD tFtYX43OBkniw== From: sashiko-bot@kernel.org Subject: Re: [PATCH v20 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Reply-To: sashiko-reviews@lists.linux.dev To: "Bartosz Golaszewski" Cc: dmaengine@vger.kernel.org, vkoul@kernel.org, Frank.Li@kernel.org In-Reply-To: <20260629-qcom-qce-cmd-descr-v20-6-56f67da84c05@oss.qualcomm.com> References: <20260629-qcom-qce-cmd-descr-v20-0-56f67da84c05@oss.qualcomm.com> <20260629-qcom-qce-cmd-descr-v20-6-56f67da84c05@oss.qualcomm.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Mon, 29 Jun 2026 10:16:13 +0000 Message-Id: <20260629101613.DA0821F00A3D@smtp.kernel.org> Precedence: bulk X-Mailing-List: dmaengine@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 6 potential issue(= s) to consider: - [High] Race condition in locked descriptor processing leaves appended tra= nsactions unprotected. - [High] Missing endianness conversion on physical address assigned to hard= ware descriptor. - [High] Per-descriptor metadata is unsafely stored in the shared channel s= tate. - [High] Mapping an unaligned embedded structure field for DMA. - [Medium] Internal command buffers are mapped using the client device inst= ead of the DMA controller device. - [Medium] Logic mismatch restricts BAM locking to TX channels instead of R= X channels. -- commit 74132b5d8141c72f8f781c94cfc7fae626e96be7 Author: Bartosz Golaszewski dmaengine: qcom: bam_dma: add support for BAM locking Add support for BAM pipe locking. To that end: when starting DMA on an RX channel - prepend the existing queue of issued descriptors with an additional "dummy" command descriptor with the LOCK bit set. Once the transaction is done (no more issued descriptors), issue one more dummy descriptor with the UNLOCK bit. [ ... ] > --- a/drivers/dma/qcom/bam_dma.c > +++ b/drivers/dma/qcom/bam_dma.c [ ... ] > @@ -686,6 +701,35 @@ static int bam_slave_config(struct dma_chan *chan, > return 0; > } > =20 > +static int bam_metadata_attach(struct dma_async_tx_descriptor *desc, voi= d *data, size_t len) > +{ > + struct bam_chan *bchan =3D to_bam_chan(desc->chan); > + const struct bam_device_data *bdata =3D bchan->bdev->dev_data; > + struct bam_desc_metadata *metadata =3D data; > + > + if (!data) > + return -EINVAL; > + > + if (!bdata->pipe_lock_supported) > + /* > + * The client wants to use locking but this BAM version doesn't > + * support it. Don't return an error here as this will stop the > + * client from using DMA at all for no reason. > + */ > + return 0; > + > + guard(spinlock_irqsave)(&bchan->vc.lock); > + > + bchan->scratchpad_addr =3D metadata->scratchpad_addr; > + bchan->direction =3D metadata->direction; [Severity: High] Does this safely handle multiple descriptors being prepared before they are issued? Since the attach callback operates per-descriptor, storing these properties directly into the shared bchan object could overwrite the state of previous= ly prepared descriptors.=20 When the locked sequence is eventually executed, might the channel apply th= is overwritten metadata to all pending transactions, potentially using the wro= ng scratchpad address for earlier ones? > + > + return 0; > +} > + [ ... ] > @@ -1046,13 +1114,101 @@ static void bam_apply_new_config(struct bam_chan= *bchan, > bchan->reconfigure =3D 0; > } > =20 > +static struct bam_async_desc * > +bam_make_lock_desc(struct bam_chan *bchan, unsigned long flag) > +{ > + struct dma_chan *chan =3D &bchan->vc.chan; > + struct bam_async_desc *async_desc; > + struct bam_desc_hw *desc; > + struct virt_dma_desc *vd; > + struct virt_dma_chan *vc; > + unsigned int mapped; > + > + async_desc =3D kzalloc_flex(*async_desc, desc, 1, GFP_NOWAIT); > + if (!async_desc) { > + dev_err(bchan->bdev->dev, "failed to allocate the BAM lock descriptor\= n"); > + return ERR_PTR(-ENOMEM); > + } > + > + sg_init_table(&async_desc->lock_sg, 1); > + > + async_desc->num_desc =3D 1; > + async_desc->curr_desc =3D async_desc->desc; > + async_desc->dir =3D DMA_MEM_TO_DEV; > + > + desc =3D async_desc->desc; > + > + bam_prep_ce_le32(&async_desc->lock_ce, bchan->scratchpad_addr, BAM_WRIT= E_COMMAND, 0); > + sg_set_buf(&async_desc->lock_sg, &async_desc->lock_ce, sizeof(async_des= c->lock_ce)); [Severity: High] Could this mapping violate DMA API requirements for cacheline alignment? The lock_ce field is embedded within struct bam_async_desc and is not guaranteed to be cacheline-aligned to ARCH_DMA_MINALIGN. Mapping it via dma_map_sg() means it shares a cacheline with other fields. If the CPU modifies adjacent fields (like async_desc->desc_node) while the DMA mapping is active, it dirties the cacheline. On non-coherent architectures, could this lead to unpredictable behavior and trigger DMA API debug warnings? > + > + mapped =3D dma_map_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVI= CE); [Severity: Medium] Should this use the BAM DMA controller device instead of the client device? The DMA mapping is being created in the IOMMU domain of the client device (chan->slave). If the client device and the BAM controller do not share an IOMMU domain, the BAM controller might be unable to access the command element, resulting in an IOMMU fault. Would it be better to map this using chan->device->dev or bchan->bdev->dev since the BAM controller is the one reading the command buffer? > + if (!mapped) { > + kfree(async_desc); > + return ERR_PTR(-ENOMEM); > + } > + > + desc->flags |=3D cpu_to_le16(DESC_FLAG_CMD | flag); > + desc->addr =3D sg_dma_address(&async_desc->lock_sg); [Severity: High] Is an endianness conversion missing here? The desc->addr field is __le32, but it is being assigned a native dma_addr_t without cpu_to_le32().=20 On big-endian systems, won't this write the wrong byte order into the hardw= are descriptor and cause the BAM controller to fetch from an invalid memory add= ress? > + desc->size =3D cpu_to_le16(sizeof(struct bam_cmd_element)); > + > + vc =3D &bchan->vc; > + vd =3D &async_desc->vd; > + > + dma_async_tx_descriptor_init(&vd->tx, &vc->chan); > + vd->tx.flags =3D DMA_PREP_CMD; > + vd->tx_result.result =3D DMA_TRANS_NOERROR; > + vd->tx_result.residue =3D 0; > + > + return async_desc; > +} > + > +static int bam_setup_pipe_lock(struct bam_chan *bchan) > +{ > + const struct bam_device_data *bdata =3D bchan->bdev->dev_data; > + struct bam_async_desc *lock_desc, *unlock_desc; > + > + lockdep_assert_held(&bchan->vc.lock); > + > + if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr || > + bchan->direction !=3D DMA_MEM_TO_DEV) [Severity: Medium] Does this logic correctly target RX channels? The commit message explicitly states: "when starting DMA on an RX channel", but DMA_MEM_TO_DEV specifies the TX direction.=20 Will this silently skip the lock setup for RX channels (DMA_DEV_TO_MEM)? > + return 0; [ ... ] > @@ -1064,9 +1220,23 @@ static void bam_start_dma(struct bam_chan *bchan) > =20 > lockdep_assert_held(&bchan->vc.lock); > =20 > + vd =3D vchan_next_desc(&bchan->vc); > if (!vd) > return; > =20 > + /* > + * Wrap the issued work with a LOCK/UNLOCK pair exactly once, at the > + * start of a fresh sequence and only when there is real work to lock > + * around. On a re-entry after a full FIFO, we see the BAM is locked > + * and must not add another pair we simply continue loading the > + * remainder of the same locked sequence. > + */ > + if (!bchan->bam_locked) { > + ret =3D bam_setup_pipe_lock(bchan); > + if (ret =3D=3D 0 && bchan->bam_locked) > + vd =3D vchan_next_desc(&bchan->vc); > + } [Severity: High] Is there a race condition here if new descriptors are appended? If a new descriptor is submitted via vchan_issue_pending() while the hardwa= re is busy processing a locked sequence, the new descriptor is appended after = the UNLOCK descriptor. When bam_start_dma() loops to process this new descriptor, bchan->bam_locked is still true because it is only cleared asynchronously when the UNLOCK hardware interrupt fires: process_channel_irqs() { ... if (flags & DESC_FLAG_UNLOCK) bchan->bam_locked =3D false; ... } Because bchan->bam_locked is still true, won't this skip adding a new LOCK, leaving the new transaction unprotected in the hardware FIFO? > + > ret =3D pm_runtime_get_sync(bdev->dev); > if (ret < 0) > return; --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260629-qcom-qce-c= md-descr-v20-0-56f67da84c05@oss.qualcomm.com?part=3D6