DMA Engine development
 help / color / mirror / Atom feed
* [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O
@ 2026-05-26 13:10 Bartosz Golaszewski
  2026-05-26 13:10 ` [PATCH v19 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
                   ` (13 more replies)
  0 siblings, 14 replies; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
	Dmitry Baryshkov, Konrad Dybcio

I feel like I fell into the trap of trying to address pre-existing
issues reported by sashiko and in the process provoking more reports so
let this be the last iteration where I do this. Vinod can we get this
queued for v7.2 now and iron out any previously existing problems in
tree?

Merging strategy: there are build-time dependencies between the crypto
and DMA patches so the best approach is for Vinod to create an immutable
branch with the DMA part pulled in by the crypto tree.

This iteration continues to build on top of v12 but uses the BAM's NWD
bit on data descriptors as suggested by Stephan. To that end, there are
some more changes like reversing the order of command and data
descriptors queuedy by the QCE driver.

Currently the QCE crypto driver accesses the crypto engine registers
directly via CPU. Trust Zone may perform crypto operations simultaneously
resulting in a race condition. To remedy that, let's introduce support
for BAM locking/unlocking to the driver. The BAM driver will now wrap
any existing issued descriptor chains with additional descriptors
performing the locking when the client starts the transaction
(dmaengine_issue_pending()). The client wanting to profit from locking
needs to switch to performing register I/O over DMA and communicate the
address to which to perform the dummy writes via a call to
dmaengine_desc_attach_metadata().

In the specific case of the BAM DMA this translates to sending command
descriptors performing dummy writes with the relevant flags set. The BAM
will then lock all other pipes not related to the current pipe group, and
keep handling the current pipe only until it sees the the unlock bit.

In order for the locking to work correctly, we also need to switch to
using DMA for all register I/O.

On top of this, the series contains some additional tweaks and
refactoring.

The goal of this is not to improve the performance but to prepare the
driver for supporting decryption into secure buffers in the future.

Tested with tcrypt.ko, kcapi and cryptsetup.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
Changes in v19:
- Fix more potential issues in remove path (sashiko)
- Remove unneeded return value check for vchan_tx_prep() as it can never
  fail
- Link to v18: https://patch.msgid.link/20260522-qcom-qce-cmd-descr-v18-0-99103926bafc@oss.qualcomm.com

Changes in v18:
- Free the BAM interrupt before disabling the clock in remove() path too
- convert the size assigned to command descriptors to little endian
- don't pass DMA mapping attributes to dma_map_sg() in bam_dma when
  setting up command descriptors
- Cancel the QCE workqueue *after* any outstanding DMA transfer
  completes
- When mapping the scatterlist for command descriptors: use the actual
  number of mapped segments for dmaengine_prep_slave_sg()
- Drop the leftover read_buf field from struct qce_device
- Unmap command descriptors only after terminating the RX transfer
- Pass the actual size of the metadata struct to
  dmaengine_desc_attach_metadata(), this is not really required for our
  use-case but let's do this for correctness and make sashiko happy
- Drop double assignment of bam_ce_idx in qce_clear_bam_transaction()
- Remove unused QCE_MAX_REG_READ
- Link to v17: https://patch.msgid.link/20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com

Changes in v17:
- New patch: free the interrupt before disabling the clock in error path
  in probe()
- New patch: cancel the QCE work on device detach
- Hold the channel lock when attaching the metadata
- Reorder the operations in devm_qce_dma_request() to avoid freeing
  memory that may still be used by the DMA channel
- Register algorithms as the last step in QCE's probe() to avoid making
  the resources available to the system before the DMA is fully set up
- Fix error paths in algo request handlers
- Don't pass dmaengine attributes to map_sg_attrs() as it expects
  dma-mapping attribute flags
- Fix a dma mapping leak for command descriptors
- Rebase on top of v7.1-rc4
- Link to v16: https://patch.msgid.link/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc@oss.qualcomm.com

Changes in v16:
- Fix a reported race between dma_map_sg() called with spinlock taken
  and the corresponding dma_unmap_sg() called without it by moving the
  descriptor locking data into the descriptor struct
- Also queue the TX data descriptors before the command descriptors to
  match what downstream is doing
- Tweak commit messages
- Rebase on top of v7.1-rc1
- Link to v15: https://patch.msgid.link/20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com

Changes in v15:
- Extend the descriptor metadata struct to also carry the channel's
  transfer direction and stop using dmaengine_slave_config() for that
- Link to v14: https://patch.msgid.link/20260323-qcom-qce-cmd-descr-v14-0-f323af411274@oss.qualcomm.com

Changes in v14:
- Don't return an error to a client which wants to use locking on BAM
  that doesn't support it
- Add a comment describing the DMA descriptor metadata structure
- Fix memory leaks
- Remove leftovers from previous iterations
- Propagate errors from dma_cookie_assign() when setting up lock
  descriptors
- Link to v13: https://patch.msgid.link/20260317-qcom-qce-cmd-descr-v13-0-0968eb4f8c40@oss.qualcomm.com

Changes in v13:
- As part of the DMA changes in the QCE driver: reverse the order of
  queueing the descriptors in the QCE driver: queue command descriptors
  with all the register writes first, followed by all the data descriptors,
  this is in line with the recommandations from the BAM HPG
- Set the NWD (notify-when-done) bit (DMA_PREP_FENCE in dmaengine
  parlance) on the data descriptors to ensure that the UNLOCK descriptor
  will not be processed until after they have been processed by the
  engine. While technically the NWD bit is only needed on the final data
  descriptor, it's hard to tell which one *will* be the last from the
  driver's point-of-view and both the downstream driver as well as
  the Qualcomm TZ against which we want to synchronize sets NWD on every
  data descriptor,
- Revert to creating the LOCK/UNLOCK command descriptor pair in one
  place now that the NWD bit is in place,
- Link to v12: https://patch.msgid.link/20260310-qcom-qce-cmd-descr-v12-0-398f37f26ef0@oss.qualcomm.com

Changes in v12:
- Wait until the transaction is done before queueing the UNLOCK command
  descriptor
- Use descriptor metadata for communicating the scratchpad address to
  the BAM driver
- To that end: reverse the order of the series (first BAM, then QCE) to
  maintain bisectability
- Unmap buffers used for dummy writes after the transaction
- Link to v11: https://patch.msgid.link/20260302-qcom-qce-cmd-descr-v11-0-4bf1f5db4802@oss.qualcomm.com

Changes in v11:
- Use new approach, not requiring the client to be involved in locking.
- Add a patch constifying dma_descriptor_metadata_ops
- Rebase on top of v7.0-rc1
- Link to v10: https://lore.kernel.org/r/20251219-qcom-qce-cmd-descr-v10-0-ff7e4bf7dad4@oss.qualcomm.com

Changes in v10:
- Move DESC_FLAG_(UN)LOCK BIT definitions from patch 2 to 3
- Add a patch constifying the dma engine metadata as the first in the
  series
- Use the VERSION register for dummy lock/unlock writes
- Link to v9: https://lore.kernel.org/r/20251128-qcom-qce-cmd-descr-v9-0-9a5f72b89722@linaro.org

Changes in v9:
- Drop the global, generic LOCK/UNLOCK flags and instead use DMA
  descriptor metadata ops to pass BAM-specific information from the QCE
  to the DMA engine
- Link to v8: https://lore.kernel.org/r/20251106-qcom-qce-cmd-descr-v8-0-ecddca23ca26@linaro.org

Changes in v8:
- Rework the command descriptor logic and drop a lot of unneeded code
- Use the physical address for BAM command descriptor access, not the
  mapped DMA address
- Fix the problems with iommu faults on newer platforms
- Generalize the LOCK/UNLOCK flags in dmaengine and reword the docs and
  commit messages
- Make the BAM locking logic stricter in the DMA engine driver
- Add some additional minor QCE driver refactoring changes to the series
- Lots of small reworks and tweaks to rebase on current mainline and fix
  previous issues
- Link to v7: https://lore.kernel.org/all/20250311-qce-cmd-descr-v7-0-db613f5d9c9f@linaro.org/

Changes in v7:
- remove unused code: writing to multiple registers was not used in v6,
  neither were the functions for reading registers over BAM DMA-
- remove
- don't read the SW_VERSION register needlessly in the BAM driver,
  instead: encode the information on whether the IP supports BAM locking
  in device match data
- shrink code where possible with logic modifications (for instance:
  change the implementation of qce_write() instead of replacing it
  everywhere with a new symbol)
- remove duplicated error messages
- rework commit messages
- a lot of shuffling code around for easier review and a more
  streamlined series
- Link to v6: https://lore.kernel.org/all/20250115103004.3350561-1-quic_mdalam@quicinc.com/

Changes in v6:
- change "BAM" to "DMA"
- Ensured this series is compilable with the current Linux-next tip of
  the tree (TOT).

Changes in v5:
- Added DMA_PREP_LOCK and DMA_PREP_UNLOCK flag support in separate patch
- Removed DMA_PREP_LOCK & DMA_PREP_UNLOCK flag
- Added FIELD_GET and GENMASK macro to extract major and minor version

Changes in v4:
- Added feature description and test hardware
  with test command
- Fixed patch version numbering
- Dropped dt-binding patch
- Dropped device tree changes
- Added BAM_SW_VERSION register read
- Handled the error path for the api dma_map_resource()
  in probe
- updated the commit messages for batter redability
- Squash the change where qce_bam_acquire_lock() and
  qce_bam_release_lock() api got introduce to the change where
  the lock/unlock flag get introced
- changed cover letter subject heading to
  "dmaengine: qcom: bam_dma: add cmd descriptor support"
- Added the very initial post for BAM lock/unlock patch link
  as v1 to track this feature

Changes in v3:
- https://lore.kernel.org/lkml/183d4f5e-e00a-8ef6-a589-f5704bc83d4a@quicinc.com/
- Addressed all the comments from v2
- Added the dt-binding
- Fix alignment issue
- Removed type casting from qce_write_reg_dma()
  and qce_read_reg_dma()
- Removed qce_bam_txn = dma->qce_bam_txn; line from
  qce_alloc_bam_txn() api and directly returning
  dma->qce_bam_txn

Changes in v2:
- https://lore.kernel.org/lkml/20231214114239.2635325-1-quic_mdalam@quicinc.com/
- Initial set of patches for cmd descriptor support
- Add client driver to use BAM lock/unlock feature
- Added register read/write via BAM in QCE Crypto driver
  to use BAM lock/unlock feature

---
Bartosz Golaszewski (14):
      dmaengine: constify struct dma_descriptor_metadata_ops
      dmaengine: qcom: bam_dma: free interrupt before the clock in error path
      dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue
      dmaengine: qcom: bam_dma: Extend the driver's device match data
      dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support
      dmaengine: qcom: bam_dma: add support for BAM locking
      crypto: qce - Cancel work on device detach
      crypto: qce - Include algapi.h in the core.h header
      crypto: qce - Remove unused ignore_buf
      crypto: qce - Simplify arguments of devm_qce_dma_request()
      crypto: qce - Use existing devres APIs in devm_qce_dma_request()
      crypto: qce - Map crypto memory for DMA
      crypto: qce - Add BAM DMA support for crypto register I/O
      crypto: qce - Communicate the base physical address to the dmaengine

 drivers/crypto/qce/aead.c        |  10 +-
 drivers/crypto/qce/common.c      |  20 ++--
 drivers/crypto/qce/core.c        |  39 ++++++-
 drivers/crypto/qce/core.h        |   7 ++
 drivers/crypto/qce/dma.c         | 168 ++++++++++++++++++++++++-----
 drivers/crypto/qce/dma.h         |  11 +-
 drivers/crypto/qce/sha.c         |  10 +-
 drivers/crypto/qce/skcipher.c    |  10 +-
 drivers/dma/qcom/bam_dma.c       | 227 +++++++++++++++++++++++++++++++++------
 drivers/dma/ti/k3-udma.c         |   2 +-
 drivers/dma/xilinx/xilinx_dma.c  |   2 +-
 include/linux/dma/qcom_bam_dma.h |  14 +++
 include/linux/dmaengine.h        |   2 +-
 13 files changed, 427 insertions(+), 95 deletions(-)
---
base-commit: def113ae602a35ab7a1dc42a6c43188e180287be
change-id: 20251103-qcom-qce-cmd-descr-c5e9b11fe609

Best regards,
-- 
Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v19 01/14] dmaengine: constify struct dma_descriptor_metadata_ops
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
@ 2026-05-26 13:10 ` Bartosz Golaszewski
  2026-05-26 13:10 ` [PATCH v19 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path Bartosz Golaszewski
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski

There's no reason for the instances of this struct to be modifiable.
Constify the pointer in struct dma_async_tx_descriptor and all drivers
currently using it.

Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/dma/ti/k3-udma.c        | 2 +-
 drivers/dma/xilinx/xilinx_dma.c | 2 +-
 include/linux/dmaengine.h       | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
index c964ebfcf3b68d86e4bbc9b62bad2212f0ce3ee9..8a2f235b669aaf084a6f7b3e6b23d06b04768608 100644
--- a/drivers/dma/ti/k3-udma.c
+++ b/drivers/dma/ti/k3-udma.c
@@ -3408,7 +3408,7 @@ static int udma_set_metadata_len(struct dma_async_tx_descriptor *desc,
 	return 0;
 }
 
-static struct dma_descriptor_metadata_ops metadata_ops = {
+static const struct dma_descriptor_metadata_ops metadata_ops = {
 	.attach = udma_attach_metadata,
 	.get_ptr = udma_get_metadata_ptr,
 	.set_len = udma_set_metadata_len,
diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 404235c1735384635597e88edc25c67c7d250647..165b11a7c776abc6a8d66d631e19da669644577d 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -653,7 +653,7 @@ static void *xilinx_dma_get_metadata_ptr(struct dma_async_tx_descriptor *tx,
 	return seg->hw.app;
 }
 
-static struct dma_descriptor_metadata_ops xilinx_dma_metadata_ops = {
+static const struct dma_descriptor_metadata_ops xilinx_dma_metadata_ops = {
 	.get_ptr = xilinx_dma_get_metadata_ptr,
 };
 
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index b3d251c9734e95e1b75cf6763d4d2c3a1c6a9910..5244edb90e7e7510bf4460b6a74ee2a7f91c1ccc 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -623,7 +623,7 @@ struct dma_async_tx_descriptor {
 	void *callback_param;
 	struct dmaengine_unmap_data *unmap;
 	enum dma_desc_metadata_mode desc_metadata_mode;
-	struct dma_descriptor_metadata_ops *metadata_ops;
+	const struct dma_descriptor_metadata_ops *metadata_ops;
 #ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
 	struct dma_async_tx_descriptor *next;
 	struct dma_async_tx_descriptor *parent;

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
  2026-05-26 13:10 ` [PATCH v19 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
@ 2026-05-26 13:10 ` Bartosz Golaszewski
  2026-05-26 13:45   ` sashiko-bot
  2026-05-26 13:10 ` [PATCH v19 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski

The BAM interrupt is requested with a devres helper and so on error it's
freed after probe() returns. We disable the clock before freeing or
masking it so it may still fire and we may end up reading BAM registers
with clock disabled.

Stop using devres for interrupts as we free it in remove() manually
anyway. Add an appropriate label and free the interrupt before disabling
the clock in error path and in remove().

Fixes: e7c0fe2a5c84 ("dmaengine: add Qualcomm BAM dma driver")
Closes: https://sashiko.dev/#/patchset/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc%40oss.qualcomm.com?part=2
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/dma/qcom/bam_dma.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 19116295f8325767a0d97a7848077885b118241c..b3d36ea79984385fe0d05ce56042d3e6e3030c5a 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -1302,8 +1302,7 @@ static int bam_dma_probe(struct platform_device *pdev)
 	for (i = 0; i < bdev->num_channels; i++)
 		bam_channel_init(bdev, &bdev->channels[i], i);
 
-	ret = devm_request_irq(bdev->dev, bdev->irq, bam_dma_irq,
-			IRQF_TRIGGER_HIGH, "bam_dma", bdev);
+	ret = request_irq(bdev->irq, bam_dma_irq, IRQF_TRIGGER_HIGH, "bam_dma", bdev);
 	if (ret)
 		goto err_bam_channel_exit;
 
@@ -1336,7 +1335,7 @@ static int bam_dma_probe(struct platform_device *pdev)
 	ret = dma_async_device_register(&bdev->common);
 	if (ret) {
 		dev_err(bdev->dev, "failed to register dma async device\n");
-		goto err_bam_channel_exit;
+		goto err_free_irq;
 	}
 
 	ret = of_dma_controller_register(pdev->dev.of_node, bam_dma_xlate,
@@ -1355,6 +1354,8 @@ static int bam_dma_probe(struct platform_device *pdev)
 
 err_unregister_dma:
 	dma_async_device_unregister(&bdev->common);
+err_free_irq:
+	free_irq(bdev->irq, bdev);
 err_bam_channel_exit:
 	for (i = 0; i < bdev->num_channels; i++)
 		tasklet_kill(&bdev->channels[i].vc.task);
@@ -1371,6 +1372,8 @@ static void bam_dma_remove(struct platform_device *pdev)
 	struct bam_device *bdev = platform_get_drvdata(pdev);
 	u32 i;
 
+	free_irq(bdev->irq, bdev);
+
 	pm_runtime_force_suspend(&pdev->dev);
 
 	of_dma_controller_free(pdev->dev.of_node);
@@ -1379,8 +1382,6 @@ static void bam_dma_remove(struct platform_device *pdev)
 	/* mask all interrupts for this execution environment */
 	writel_relaxed(0, bam_addr(bdev, 0,  BAM_IRQ_SRCS_MSK_EE));
 
-	devm_free_irq(bdev->dev, bdev->irq, bdev);
-
 	for (i = 0; i < bdev->num_channels; i++) {
 		bam_dma_terminate_all(&bdev->channels[i].vc.chan);
 		tasklet_kill(&bdev->channels[i].vc.task);

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
  2026-05-26 13:10 ` [PATCH v19 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
  2026-05-26 13:10 ` [PATCH v19 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path Bartosz Golaszewski
@ 2026-05-26 13:10 ` Bartosz Golaszewski
  2026-05-26 14:17   ` sashiko-bot
  2026-05-26 13:10 ` [PATCH v19 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
	Dmitry Baryshkov

BH workqueues are a modern mechanism, aiming to replace legacy tasklets.
Let's convert the BAM DMA driver to using the high-priority variant of
the BH workqueue.

[Vinod: suggested using the BG workqueue instead of the regular one
running in process context]

Suggested-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/dma/qcom/bam_dma.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index b3d36ea79984385fe0d05ce56042d3e6e3030c5a..1c62f845ac0b956e311857b93f5b504086662f45 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -42,6 +42,7 @@
 #include <linux/pm_runtime.h>
 #include <linux/scatterlist.h>
 #include <linux/slab.h>
+#include <linux/workqueue.h>
 
 #include "../dmaengine.h"
 #include "../virt-dma.h"
@@ -397,8 +398,8 @@ struct bam_device {
 	struct clk *bamclk;
 	int irq;
 
-	/* dma start transaction tasklet */
-	struct tasklet_struct task;
+	/* dma start transaction workqueue */
+	struct work_struct work;
 };
 
 /**
@@ -863,7 +864,7 @@ static u32 process_channel_irqs(struct bam_device *bdev)
 			/*
 			 * if complete, process cookie. Otherwise
 			 * push back to front of desc_issued so that
-			 * it gets restarted by the tasklet
+			 * it gets restarted by the work queue.
 			 */
 			if (!async_desc->num_desc) {
 				vchan_cookie_complete(&async_desc->vd);
@@ -893,9 +894,9 @@ static irqreturn_t bam_dma_irq(int irq, void *data)
 
 	srcs |= process_channel_irqs(bdev);
 
-	/* kick off tasklet to start next dma transfer */
+	/* kick off the work queue to start next dma transfer */
 	if (srcs & P_IRQ)
-		tasklet_schedule(&bdev->task);
+		queue_work(system_bh_highpri_wq, &bdev->work);
 
 	ret = pm_runtime_get_sync(bdev->dev);
 	if (ret < 0)
@@ -1091,14 +1092,14 @@ static void bam_start_dma(struct bam_chan *bchan)
 }
 
 /**
- * dma_tasklet - DMA IRQ tasklet
- * @t: tasklet argument (bam controller structure)
+ * bam_dma_work() - DMA interrupt work queue callback
+ * @work: work queue struct embedded in the BAM controller device struct
  *
  * Sets up next DMA operation and then processes all completed transactions
  */
-static void dma_tasklet(struct tasklet_struct *t)
+static void bam_dma_work(struct work_struct *work)
 {
-	struct bam_device *bdev = from_tasklet(bdev, t, task);
+	struct bam_device *bdev = from_work(bdev, work, work);
 	struct bam_chan *bchan;
 	unsigned int i;
 
@@ -1111,14 +1112,13 @@ static void dma_tasklet(struct tasklet_struct *t)
 		if (!list_empty(&bchan->vc.desc_issued) && !IS_BUSY(bchan))
 			bam_start_dma(bchan);
 	}
-
 }
 
 /**
  * bam_issue_pending - starts pending transactions
  * @chan: dma channel
  *
- * Calls tasklet directly which in turn starts any pending transactions
+ * Calls work queue directly which in turn starts any pending transactions
  */
 static void bam_issue_pending(struct dma_chan *chan)
 {
@@ -1286,14 +1286,14 @@ static int bam_dma_probe(struct platform_device *pdev)
 	if (ret)
 		goto err_disable_clk;
 
-	tasklet_setup(&bdev->task, dma_tasklet);
+	INIT_WORK(&bdev->work, bam_dma_work);
 
 	bdev->channels = devm_kcalloc(bdev->dev, bdev->num_channels,
 				sizeof(*bdev->channels), GFP_KERNEL);
 
 	if (!bdev->channels) {
 		ret = -ENOMEM;
-		goto err_tasklet_kill;
+		goto err_workqueue_cancel;
 	}
 
 	/* allocate and initialize channels */
@@ -1359,8 +1359,8 @@ static int bam_dma_probe(struct platform_device *pdev)
 err_bam_channel_exit:
 	for (i = 0; i < bdev->num_channels; i++)
 		tasklet_kill(&bdev->channels[i].vc.task);
-err_tasklet_kill:
-	tasklet_kill(&bdev->task);
+err_workqueue_cancel:
+	cancel_work_sync(&bdev->work);
 err_disable_clk:
 	clk_disable_unprepare(bdev->bamclk);
 
@@ -1394,7 +1394,7 @@ static void bam_dma_remove(struct platform_device *pdev)
 			    bdev->channels[i].fifo_phys);
 	}
 
-	tasklet_kill(&bdev->task);
+	cancel_work_sync(&bdev->work);
 
 	clk_disable_unprepare(bdev->bamclk);
 }

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
                   ` (2 preceding siblings ...)
  2026-05-26 13:10 ` [PATCH v19 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
@ 2026-05-26 13:10 ` Bartosz Golaszewski
  2026-05-26 13:10 ` [PATCH v19 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

In preparation for supporting the pipe locking feature flag, extend the
amount of information we can carry in device match data: create a
separate structure and make the register information one of its fields.
This way, in subsequent patches, it will be just a matter of adding a
new field to the device data.

Reviewed-by: Dmitry Baryshkov <lumag@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/dma/qcom/bam_dma.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 1c62f845ac0b956e311857b93f5b504086662f45..2129ff5261571581a2c086c13dd657dc63e16f90 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -113,6 +113,10 @@ struct reg_offset_data {
 	unsigned int pipe_mult, evnt_mult, ee_mult;
 };
 
+struct bam_device_data {
+	const struct reg_offset_data *reg_info;
+};
+
 static const struct reg_offset_data bam_v1_3_reg_info[] = {
 	[BAM_CTRL]		= { 0x0F80, 0x00, 0x00, 0x00 },
 	[BAM_REVISION]		= { 0x0F84, 0x00, 0x00, 0x00 },
@@ -142,6 +146,10 @@ static const struct reg_offset_data bam_v1_3_reg_info[] = {
 	[BAM_P_FIFO_SIZES]	= { 0x1020, 0x00, 0x40, 0x00 },
 };
 
+static const struct bam_device_data bam_v1_3_data = {
+	.reg_info = bam_v1_3_reg_info,
+};
+
 static const struct reg_offset_data bam_v1_4_reg_info[] = {
 	[BAM_CTRL]		= { 0x0000, 0x00, 0x00, 0x00 },
 	[BAM_REVISION]		= { 0x0004, 0x00, 0x00, 0x00 },
@@ -171,6 +179,10 @@ static const struct reg_offset_data bam_v1_4_reg_info[] = {
 	[BAM_P_FIFO_SIZES]	= { 0x1820, 0x00, 0x1000, 0x00 },
 };
 
+static const struct bam_device_data bam_v1_4_data = {
+	.reg_info = bam_v1_4_reg_info,
+};
+
 static const struct reg_offset_data bam_v1_7_reg_info[] = {
 	[BAM_CTRL]		= { 0x00000, 0x00, 0x00, 0x00 },
 	[BAM_REVISION]		= { 0x01000, 0x00, 0x00, 0x00 },
@@ -200,6 +212,10 @@ static const struct reg_offset_data bam_v1_7_reg_info[] = {
 	[BAM_P_FIFO_SIZES]	= { 0x13820, 0x00, 0x1000, 0x00 },
 };
 
+static const struct bam_device_data bam_v1_7_data = {
+	.reg_info = bam_v1_7_reg_info,
+};
+
 /* BAM CTRL */
 #define BAM_SW_RST			BIT(0)
 #define BAM_EN				BIT(1)
@@ -393,7 +409,7 @@ struct bam_device {
 	bool powered_remotely;
 	u32 active_channels;
 
-	const struct reg_offset_data *layout;
+	const struct bam_device_data *dev_data;
 
 	struct clk *bamclk;
 	int irq;
@@ -411,7 +427,7 @@ struct bam_device {
 static inline void __iomem *bam_addr(struct bam_device *bdev, u32 pipe,
 		enum bam_reg reg)
 {
-	const struct reg_offset_data r = bdev->layout[reg];
+	const struct reg_offset_data r = bdev->dev_data->reg_info[reg];
 
 	return bdev->regs + r.base_offset +
 		r.pipe_mult * pipe +
@@ -1205,9 +1221,9 @@ static void bam_channel_init(struct bam_device *bdev, struct bam_chan *bchan,
 }
 
 static const struct of_device_id bam_of_match[] = {
-	{ .compatible = "qcom,bam-v1.3.0", .data = &bam_v1_3_reg_info },
-	{ .compatible = "qcom,bam-v1.4.0", .data = &bam_v1_4_reg_info },
-	{ .compatible = "qcom,bam-v1.7.0", .data = &bam_v1_7_reg_info },
+	{ .compatible = "qcom,bam-v1.3.0", .data = &bam_v1_3_data },
+	{ .compatible = "qcom,bam-v1.4.0", .data = &bam_v1_4_data },
+	{ .compatible = "qcom,bam-v1.7.0", .data = &bam_v1_7_data },
 	{}
 };
 
@@ -1231,7 +1247,7 @@ static int bam_dma_probe(struct platform_device *pdev)
 		return -ENODEV;
 	}
 
-	bdev->layout = match->data;
+	bdev->dev_data = match->data;
 
 	bdev->regs = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(bdev->regs))

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
                   ` (3 preceding siblings ...)
  2026-05-26 13:10 ` [PATCH v19 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
@ 2026-05-26 13:10 ` Bartosz Golaszewski
  2026-05-26 13:10 ` [PATCH v19 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
	Dmitry Baryshkov

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Extend the device match data with a flag indicating whether the IP
supports the BAM lock/unlock feature. Set it to true on BAM IP versions
1.4.0 and above.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/dma/qcom/bam_dma.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 2129ff5261571581a2c086c13dd657dc63e16f90..04fe1d546be73f074c66c4a5712ad65717e10929 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -115,6 +115,7 @@ struct reg_offset_data {
 
 struct bam_device_data {
 	const struct reg_offset_data *reg_info;
+	bool pipe_lock_supported;
 };
 
 static const struct reg_offset_data bam_v1_3_reg_info[] = {
@@ -181,6 +182,7 @@ static const struct reg_offset_data bam_v1_4_reg_info[] = {
 
 static const struct bam_device_data bam_v1_4_data = {
 	.reg_info = bam_v1_4_reg_info,
+	.pipe_lock_supported = true,
 };
 
 static const struct reg_offset_data bam_v1_7_reg_info[] = {
@@ -214,6 +216,7 @@ static const struct reg_offset_data bam_v1_7_reg_info[] = {
 
 static const struct bam_device_data bam_v1_7_data = {
 	.reg_info = bam_v1_7_reg_info,
+	.pipe_lock_supported = true,
 };
 
 /* BAM CTRL */

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 06/14] dmaengine: qcom: bam_dma: add support for BAM locking
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
                   ` (4 preceding siblings ...)
  2026-05-26 13:10 ` [PATCH v19 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
@ 2026-05-26 13:10 ` Bartosz Golaszewski
  2026-05-26 15:01   ` sashiko-bot
  2026-05-26 13:10 ` [PATCH v19 07/14] crypto: qce - Cancel work on device detach Bartosz Golaszewski
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski

Add support for BAM pipe locking. To that end: when starting DMA on an RX
channel - prepend the existing queue of issued descriptors with an
additional "dummy" command descriptor with the LOCK bit set. Once the
transaction is done (no more issued descriptors), issue one more dummy
descriptor with the UNLOCK bit.

We *must* wait until the transaction is signalled as done because we
must not perform any writes into config registers while the engine is
busy.

The dummy writes must be issued into a scratchpad register of the client
so provide a mechanism to communicate the right address via descriptor
metadata.

Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/dma/qcom/bam_dma.c       | 153 ++++++++++++++++++++++++++++++++++++++-
 include/linux/dma/qcom_bam_dma.h |  14 ++++
 2 files changed, 163 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 04fe1d546be73f074c66c4a5712ad65717e10929..84fd9e181bdd5fd9a4a744050ba57f05f54787c7 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -28,11 +28,13 @@
 #include <linux/clk.h>
 #include <linux/device.h>
 #include <linux/dma-mapping.h>
+#include <linux/dma/qcom_bam_dma.h>
 #include <linux/dmaengine.h>
 #include <linux/init.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
 #include <linux/kernel.h>
+#include <linux/lockdep.h>
 #include <linux/module.h>
 #include <linux/of_address.h>
 #include <linux/of_dma.h>
@@ -60,6 +62,8 @@ struct bam_desc_hw {
 #define DESC_FLAG_EOB BIT(13)
 #define DESC_FLAG_NWD BIT(12)
 #define DESC_FLAG_CMD BIT(11)
+#define DESC_FLAG_LOCK BIT(10)
+#define DESC_FLAG_UNLOCK BIT(9)
 
 struct bam_async_desc {
 	struct virt_dma_desc vd;
@@ -72,6 +76,10 @@ struct bam_async_desc {
 
 	struct bam_desc_hw *curr_desc;
 
+	/* BAM locking infrastructure */
+	struct scatterlist lock_sg;
+	struct bam_cmd_element lock_ce;
+
 	/* list node for the desc in the bam_chan list of descriptors */
 	struct list_head desc_node;
 	enum dma_transfer_direction dir;
@@ -391,6 +399,10 @@ struct bam_chan {
 	struct list_head desc_list;
 
 	struct list_head node;
+
+	/* BAM locking infrastructure */
+	phys_addr_t scratchpad_addr;
+	enum dma_transfer_direction direction;
 };
 
 static inline struct bam_chan *to_bam_chan(struct dma_chan *common)
@@ -652,6 +664,35 @@ static int bam_slave_config(struct dma_chan *chan,
 	return 0;
 }
 
+static int bam_metadata_attach(struct dma_async_tx_descriptor *desc, void *data, size_t len)
+{
+	struct bam_chan *bchan = to_bam_chan(desc->chan);
+	const struct bam_device_data *bdata = bchan->bdev->dev_data;
+	struct bam_desc_metadata *metadata = data;
+
+	if (!data)
+		return -EINVAL;
+
+	if (!bdata->pipe_lock_supported)
+		/*
+		 * The client wants to use locking but this BAM version doesn't
+		 * support it. Don't return an error here as this will stop the
+		 * client from using DMA at all for no reason.
+		 */
+		return 0;
+
+	guard(spinlock_irqsave)(&bchan->vc.lock);
+
+	bchan->scratchpad_addr = metadata->scratchpad_addr;
+	bchan->direction = metadata->direction;
+
+	return 0;
+}
+
+static const struct dma_descriptor_metadata_ops bam_metadata_ops = {
+	.attach = bam_metadata_attach,
+};
+
 /**
  * bam_prep_slave_sg - Prep slave sg transaction
  *
@@ -668,6 +709,7 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
 	void *context)
 {
 	struct bam_chan *bchan = to_bam_chan(chan);
+	struct dma_async_tx_descriptor *tx_desc;
 	struct bam_device *bdev = bchan->bdev;
 	struct bam_async_desc *async_desc;
 	struct scatterlist *sg;
@@ -723,7 +765,10 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
 		} while (remainder > 0);
 	}
 
-	return vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
+	tx_desc = vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
+	tx_desc->metadata_ops = &bam_metadata_ops;
+
+	return tx_desc;
 }
 
 /**
@@ -1012,13 +1057,105 @@ static void bam_apply_new_config(struct bam_chan *bchan,
 	bchan->reconfigure = 0;
 }
 
+static struct bam_async_desc *
+bam_make_lock_desc(struct bam_chan *bchan, unsigned long flag)
+{
+	struct dma_chan *chan = &bchan->vc.chan;
+	struct bam_async_desc *async_desc;
+	struct bam_desc_hw *desc;
+	struct virt_dma_desc *vd;
+	struct virt_dma_chan *vc;
+	unsigned int mapped;
+	dma_cookie_t cookie;
+	int ret;
+
+	async_desc = kzalloc_flex(*async_desc, desc, 1, GFP_NOWAIT);
+	if (!async_desc) {
+		dev_err(bchan->bdev->dev, "failed to allocate the BAM lock descriptor\n");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	sg_init_table(&async_desc->lock_sg, 1);
+
+	async_desc->num_desc = 1;
+	async_desc->curr_desc = async_desc->desc;
+	async_desc->dir = DMA_MEM_TO_DEV;
+
+	desc = async_desc->desc;
+
+	bam_prep_ce_le32(&async_desc->lock_ce, bchan->scratchpad_addr, BAM_WRITE_COMMAND, 0);
+	sg_set_buf(&async_desc->lock_sg, &async_desc->lock_ce, sizeof(async_desc->lock_ce));
+
+	mapped = dma_map_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);
+	if (!mapped) {
+		kfree(async_desc);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	desc->flags |= cpu_to_le16(DESC_FLAG_CMD | flag);
+	desc->addr = sg_dma_address(&async_desc->lock_sg);
+	desc->size = cpu_to_le16(sizeof(struct bam_cmd_element));
+
+	vc = &bchan->vc;
+	vd = &async_desc->vd;
+
+	dma_async_tx_descriptor_init(&vd->tx, &vc->chan);
+	vd->tx.flags = DMA_PREP_CMD;
+	vd->tx.desc_free = vchan_tx_desc_free;
+	vd->tx_result.result = DMA_TRANS_NOERROR;
+	vd->tx_result.residue = 0;
+
+	cookie = dma_cookie_assign(&vd->tx);
+	ret = dma_submit_error(cookie);
+	if (ret) {
+		dma_unmap_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);
+		kfree(async_desc);
+		return ERR_PTR(ret);
+	}
+
+	return async_desc;
+}
+
+static int bam_do_setup_pipe_lock(struct bam_chan *bchan, bool lock)
+{
+	struct bam_device *bdev = bchan->bdev;
+	const struct bam_device_data *bdata = bdev->dev_data;
+	struct bam_async_desc *lock_desc;
+	unsigned long flag;
+
+	lockdep_assert_held(&bchan->vc.lock);
+
+	if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr ||
+	    bchan->direction != DMA_MEM_TO_DEV)
+		return 0;
+
+	flag = lock ? DESC_FLAG_LOCK : DESC_FLAG_UNLOCK;
+
+	lock_desc = bam_make_lock_desc(bchan, flag);
+	if (IS_ERR(lock_desc))
+		return PTR_ERR(lock_desc);
+
+	if (lock)
+		list_add(&lock_desc->vd.node, &bchan->vc.desc_issued);
+	else
+		list_add_tail(&lock_desc->vd.node, &bchan->vc.desc_issued);
+
+	return 0;
+}
+
+static void bam_setup_pipe_lock(struct bam_chan *bchan)
+{
+	if (bam_do_setup_pipe_lock(bchan, true) || bam_do_setup_pipe_lock(bchan, false))
+		dev_err(bchan->vc.chan.slave, "Failed to setup BAM pipe lock descriptors");
+}
+
 /**
  * bam_start_dma - start next transaction
  * @bchan: bam dma channel
  */
 static void bam_start_dma(struct bam_chan *bchan)
 {
-	struct virt_dma_desc *vd = vchan_next_desc(&bchan->vc);
+	struct virt_dma_desc *vd;
 	struct bam_device *bdev = bchan->bdev;
 	struct bam_async_desc *async_desc = NULL;
 	struct bam_desc_hw *desc;
@@ -1030,6 +1167,9 @@ static void bam_start_dma(struct bam_chan *bchan)
 
 	lockdep_assert_held(&bchan->vc.lock);
 
+	bam_setup_pipe_lock(bchan);
+
+	vd = vchan_next_desc(&bchan->vc);
 	if (!vd)
 		return;
 
@@ -1157,8 +1297,12 @@ static void bam_issue_pending(struct dma_chan *chan)
  */
 static void bam_dma_free_desc(struct virt_dma_desc *vd)
 {
-	struct bam_async_desc *async_desc = container_of(vd,
-			struct bam_async_desc, vd);
+	struct bam_async_desc *async_desc = container_of(vd, struct bam_async_desc, vd);
+	struct bam_desc_hw *desc = async_desc->desc;
+	struct dma_chan *chan = vd->tx.chan;
+
+	if (le16_to_cpu(desc->flags) & (DESC_FLAG_LOCK | DESC_FLAG_UNLOCK))
+		dma_unmap_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);
 
 	kfree(async_desc);
 }
@@ -1349,6 +1493,7 @@ static int bam_dma_probe(struct platform_device *pdev)
 	bdev->common.device_terminate_all = bam_dma_terminate_all;
 	bdev->common.device_issue_pending = bam_issue_pending;
 	bdev->common.device_tx_status = bam_tx_status;
+	bdev->common.desc_metadata_modes = DESC_METADATA_CLIENT;
 	bdev->common.dev = bdev->dev;
 
 	ret = dma_async_device_register(&bdev->common);
diff --git a/include/linux/dma/qcom_bam_dma.h b/include/linux/dma/qcom_bam_dma.h
index 68fc0e643b1b97fe4520d5878daa322b81f4f559..a2594264b0f58c4b2b1c85e243cad0d5669c26dc 100644
--- a/include/linux/dma/qcom_bam_dma.h
+++ b/include/linux/dma/qcom_bam_dma.h
@@ -6,6 +6,8 @@
 #ifndef _QCOM_BAM_DMA_H
 #define _QCOM_BAM_DMA_H
 
+#include <linux/dmaengine.h>
+
 #include <asm/byteorder.h>
 
 /*
@@ -34,6 +36,18 @@ enum bam_command_type {
 	BAM_READ_COMMAND,
 };
 
+/**
+ * struct bam_desc_metadata - DMA descriptor metadata specific to the BAM driver.
+ *
+ * @scratchpad_addr: Physical address to use for dummy write operations when
+ *                   queuing command descriptors with LOCK/UNLOCK bits set.
+ * @direction: Transfer direction of this channel.
+ */
+struct bam_desc_metadata {
+	phys_addr_t scratchpad_addr;
+	enum dma_transfer_direction direction;
+};
+
 /*
  * prep_bam_ce_le32 - Wrapper function to prepare a single BAM command
  * element with the data already in le32 format.

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 07/14] crypto: qce - Cancel work on device detach
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
                   ` (5 preceding siblings ...)
  2026-05-26 13:10 ` [PATCH v19 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
@ 2026-05-26 13:10 ` Bartosz Golaszewski
  2026-05-26 15:33   ` sashiko-bot
  2026-05-26 13:10 ` [PATCH v19 08/14] crypto: qce - Include algapi.h in the core.h header Bartosz Golaszewski
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski

The workqueue is setup in probe() but never cancelled on error or in
remove(). Set up a devres action to clean it up. We need to move the
initialization earlier as we don't want to cancel the work before any
outstanding DMA transfer is terminated. Make sure we do terminate all
transfers in qce_dma_release() devres action.

Fixes: eb7986e5e14d ("crypto: qce - convert tasklet to workqueue")
Closes: https://sashiko.dev/#/patchset/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc%40oss.qualcomm.com?part=7
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/core.c | 13 ++++++++++++-
 drivers/crypto/qce/dma.c  |  2 ++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index b966f3365b7de8d2a8f6707397a34aa4facdc4ac..f671946cf7351cd5f0c319909bafd87e3af701c7 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -186,6 +186,13 @@ static int qce_check_version(struct qce_device *qce)
 	return 0;
 }
 
+static void qce_cancel_work(void *data)
+{
+	struct work_struct *work = data;
+
+	cancel_work_sync(work);
+}
+
 static int qce_crypto_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
@@ -227,6 +234,11 @@ static int qce_crypto_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;
 
+	INIT_WORK(&qce->done_work, qce_req_done_work);
+	ret = devm_add_action_or_reset(dev, qce_cancel_work, &qce->done_work);
+	if (ret)
+		return ret;
+
 	ret = devm_qce_dma_request(qce->dev, &qce->dma);
 	if (ret)
 		return ret;
@@ -239,7 +251,6 @@ static int qce_crypto_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;
 
-	INIT_WORK(&qce->done_work, qce_req_done_work);
 	crypto_init_queue(&qce->queue, QCE_QUEUE_LENGTH);
 
 	qce->async_req_enqueue = qce_async_request_enqueue;
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 68cafd4741ad3d91906d39e817fc7873b028d498..7ec9d72fd690fb17e03ade7efe3cc522fb47e1ac 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -13,6 +13,8 @@ static void qce_dma_release(void *data)
 {
 	struct qce_dma_data *dma = data;
 
+	dmaengine_terminate_sync(dma->txchan);
+	dmaengine_terminate_sync(dma->rxchan);
 	dma_release_channel(dma->txchan);
 	dma_release_channel(dma->rxchan);
 	kfree(dma->result_buf);

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 08/14] crypto: qce - Include algapi.h in the core.h header
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
                   ` (6 preceding siblings ...)
  2026-05-26 13:10 ` [PATCH v19 07/14] crypto: qce - Cancel work on device detach Bartosz Golaszewski
@ 2026-05-26 13:10 ` Bartosz Golaszewski
  2026-05-26 13:10 ` [PATCH v19 09/14] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

The header defines a struct embedding struct crypto_queue whose size
needs to be known and which is defined in crypto/algapi.h. Move the
inclusion from core.c to core.h.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/core.c | 1 -
 drivers/crypto/qce/core.h | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index f671946cf7351cd5f0c319909bafd87e3af701c7..ad37c2b8ae53a373bb248aff06c3b7946e8439a8 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -13,7 +13,6 @@
 #include <linux/mod_devicetable.h>
 #include <linux/platform_device.h>
 #include <linux/types.h>
-#include <crypto/algapi.h>
 #include <crypto/internal/hash.h>
 
 #include "core.h"
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index eb6fa7a8b64a81daf9ad5304a3ae4e5e597a70b8..f092ce2d3b04a936a37805c20ac5ba78d8fdd2df 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -8,6 +8,7 @@
 
 #include <linux/mutex.h>
 #include <linux/workqueue.h>
+#include <crypto/algapi.h>
 
 #include "dma.h"
 

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 09/14] crypto: qce - Remove unused ignore_buf
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
                   ` (7 preceding siblings ...)
  2026-05-26 13:10 ` [PATCH v19 08/14] crypto: qce - Include algapi.h in the core.h header Bartosz Golaszewski
@ 2026-05-26 13:10 ` Bartosz Golaszewski
  2026-05-26 15:57   ` sashiko-bot
  2026-05-26 13:10 ` [PATCH v19 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request() Bartosz Golaszewski
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

It's unclear what the purpose of this field is. It has been here since
the initial commit but without any explanation. The driver works fine
without it. We still keep allocating more space in the result buffer, we
just don't need to store its address. While at it: move the
QCE_IGNORE_BUF_SZ definition into dma.c as it's not used outside of this
compilation unit.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/dma.c | 4 ++--
 drivers/crypto/qce/dma.h | 2 --
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 7ec9d72fd690fb17e03ade7efe3cc522fb47e1ac..d1daa229361aa74da5d3d7bfe1bc8ab189761e38 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -9,6 +9,8 @@
 
 #include "dma.h"
 
+#define QCE_IGNORE_BUF_SZ		(2 * QCE_BAM_BURST_SIZE)
+
 static void qce_dma_release(void *data)
 {
 	struct qce_dma_data *dma = data;
@@ -43,8 +45,6 @@ int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma)
 		goto error_nomem;
 	}
 
-	dma->ignore_buf = dma->result_buf + QCE_RESULT_BUF_SZ;
-
 	return devm_add_action_or_reset(dev, qce_dma_release, dma);
 
 error_nomem:
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index 31629185000e12242fa07c2cc08b95fcbd5d4b8c..fc337c435cd14917bdfb99febcf9119275afdeba 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -23,7 +23,6 @@ struct qce_result_dump {
 	u32 status2;
 };
 
-#define QCE_IGNORE_BUF_SZ	(2 * QCE_BAM_BURST_SIZE)
 #define QCE_RESULT_BUF_SZ	\
 		ALIGN(sizeof(struct qce_result_dump), QCE_BAM_BURST_SIZE)
 
@@ -31,7 +30,6 @@ struct qce_dma_data {
 	struct dma_chan *txchan;
 	struct dma_chan *rxchan;
 	struct qce_result_dump *result_buf;
-	void *ignore_buf;
 };
 
 int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma);

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request()
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
                   ` (8 preceding siblings ...)
  2026-05-26 13:10 ` [PATCH v19 09/14] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
@ 2026-05-26 13:10 ` Bartosz Golaszewski
  2026-05-26 13:10 ` [PATCH v19 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

This function can extract all the information it needs from struct
qce_device alone so simplify its arguments. This is done in preparation
for adding support for register I/O over DMA which will require
accessing even more fields from struct qce_device.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/core.c | 2 +-
 drivers/crypto/qce/dma.c  | 5 ++++-
 drivers/crypto/qce/dma.h  | 4 +++-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index ad37c2b8ae53a373bb248aff06c3b7946e8439a8..a0e2eadc3afd5f83e46724c8bc3e3690146b86ba 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -238,7 +238,7 @@ static int qce_crypto_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;
 
-	ret = devm_qce_dma_request(qce->dev, &qce->dma);
+	ret = devm_qce_dma_request(qce);
 	if (ret)
 		return ret;
 
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index d1daa229361aa74da5d3d7bfe1bc8ab189761e38..d60efb5c26d88f8b0259b1dccc8724d0f75571c6 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -7,6 +7,7 @@
 #include <linux/dmaengine.h>
 #include <crypto/scatterwalk.h>
 
+#include "core.h"
 #include "dma.h"
 
 #define QCE_IGNORE_BUF_SZ		(2 * QCE_BAM_BURST_SIZE)
@@ -22,8 +23,10 @@ static void qce_dma_release(void *data)
 	kfree(dma->result_buf);
 }
 
-int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma)
+int devm_qce_dma_request(struct qce_device *qce)
 {
+	struct qce_dma_data *dma = &qce->dma;
+	struct device *dev = qce->dev;
 	int ret;
 
 	dma->txchan = dma_request_chan(dev, "tx");
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index fc337c435cd14917bdfb99febcf9119275afdeba..483789d9fa98e79d1283de8297bf2fc2a773f3a7 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -8,6 +8,8 @@
 
 #include <linux/dmaengine.h>
 
+struct qce_device;
+
 /* maximum data transfer block size between BAM and CE */
 #define QCE_BAM_BURST_SIZE		64
 
@@ -32,7 +34,7 @@ struct qce_dma_data {
 	struct qce_result_dump *result_buf;
 };
 
-int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma);
+int devm_qce_dma_request(struct qce_device *qce);
 int qce_dma_prep_sgs(struct qce_dma_data *dma, struct scatterlist *sg_in,
 		     int in_ents, struct scatterlist *sg_out, int out_ents,
 		     dma_async_tx_callback cb, void *cb_param);

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request()
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
                   ` (9 preceding siblings ...)
  2026-05-26 13:10 ` [PATCH v19 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request() Bartosz Golaszewski
@ 2026-05-26 13:10 ` Bartosz Golaszewski
  2026-05-26 16:09   ` sashiko-bot
  2026-05-26 13:11 ` [PATCH v19 12/14] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:10 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
	Konrad Dybcio

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Switch to devm_kmalloc() and devm_dma_alloc_chan() in
devm_qce_dma_request(). This allows us to drop two labels and shrink the
function.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/dma.c | 34 +++++++++++-----------------------
 1 file changed, 11 insertions(+), 23 deletions(-)

diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index d60efb5c26d88f8b0259b1dccc8724d0f75571c6..c2602d35baa6ad3ca5de734de7ff6160ff29567c 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -12,7 +12,7 @@
 
 #define QCE_IGNORE_BUF_SZ		(2 * QCE_BAM_BURST_SIZE)
 
-static void qce_dma_release(void *data)
+static void qce_dma_terminate(void *data)
 {
 	struct qce_dma_data *dma = data;
 
@@ -27,34 +27,22 @@ int devm_qce_dma_request(struct qce_device *qce)
 {
 	struct qce_dma_data *dma = &qce->dma;
 	struct device *dev = qce->dev;
-	int ret;
 
-	dma->txchan = dma_request_chan(dev, "tx");
+	dma->result_buf = devm_kmalloc(dev, QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ, GFP_KERNEL);
+	if (!dma->result_buf)
+		return -ENOMEM;
+
+	dma->txchan = devm_dma_request_chan(dev, "tx");
 	if (IS_ERR(dma->txchan))
 		return dev_err_probe(dev, PTR_ERR(dma->txchan),
 				     "Failed to get TX DMA channel\n");
 
-	dma->rxchan = dma_request_chan(dev, "rx");
-	if (IS_ERR(dma->rxchan)) {
-		ret = dev_err_probe(dev, PTR_ERR(dma->rxchan),
-				    "Failed to get RX DMA channel\n");
-		goto error_rx;
-	}
-
-	dma->result_buf = kmalloc(QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ,
-				  GFP_KERNEL);
-	if (!dma->result_buf) {
-		ret = -ENOMEM;
-		goto error_nomem;
-	}
-
-	return devm_add_action_or_reset(dev, qce_dma_release, dma);
+	dma->rxchan = devm_dma_request_chan(dev, "rx");
+	if (IS_ERR(dma->rxchan))
+		return dev_err_probe(dev, PTR_ERR(dma->rxchan),
+				     "Failed to get RX DMA channel\n");
 
-error_nomem:
-	dma_release_channel(dma->rxchan);
-error_rx:
-	dma_release_channel(dma->txchan);
-	return ret;
+	return devm_add_action_or_reset(dev, qce_dma_terminate, dma);
 }
 
 struct scatterlist *

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 12/14] crypto: qce - Map crypto memory for DMA
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
                   ` (10 preceding siblings ...)
  2026-05-26 13:10 ` [PATCH v19 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
@ 2026-05-26 13:11 ` Bartosz Golaszewski
  2026-05-26 16:30   ` sashiko-bot
  2026-05-26 13:11 ` [PATCH v19 13/14] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
  2026-05-26 13:11 ` [PATCH v19 14/14] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
  13 siblings, 1 reply; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:11 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

As the first step in converting the driver to using DMA for register
I/O, let's map the crypto memory range.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/core.c | 23 ++++++++++++++++++++++-
 drivers/crypto/qce/core.h |  6 ++++++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index a0e2eadc3afd5f83e46724c8bc3e3690146b86ba..d7b7a3dda464964afe6a6893bb329d5bd5759dcd 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -192,10 +192,19 @@ static void qce_cancel_work(void *data)
 	cancel_work_sync(work);
 }
 
+static void qce_crypto_unmap_dma(void *data)
+{
+	struct qce_device *qce = data;
+
+	dma_unmap_resource(qce->dev, qce->base_dma, qce->dma_size,
+			   DMA_BIDIRECTIONAL, 0);
+}
+
 static int qce_crypto_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
 	struct qce_device *qce;
+	struct resource *res;
 	int ret;
 
 	qce = devm_kzalloc(dev, sizeof(*qce), GFP_KERNEL);
@@ -205,7 +214,7 @@ static int qce_crypto_probe(struct platform_device *pdev)
 	qce->dev = dev;
 	platform_set_drvdata(pdev, qce);
 
-	qce->base = devm_platform_ioremap_resource(pdev, 0);
+	qce->base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
 	if (IS_ERR(qce->base))
 		return PTR_ERR(qce->base);
 
@@ -255,6 +264,18 @@ static int qce_crypto_probe(struct platform_device *pdev)
 	qce->async_req_enqueue = qce_async_request_enqueue;
 	qce->async_req_done = qce_async_request_done;
 
+	qce->dma_size = resource_size(res);
+	qce->base_dma = dma_map_resource(dev, res->start, qce->dma_size,
+					 DMA_BIDIRECTIONAL, 0);
+	qce->base_phys = res->start;
+	ret = dma_mapping_error(dev, qce->base_dma);
+	if (ret)
+		return ret;
+
+	ret = devm_add_action_or_reset(qce->dev, qce_crypto_unmap_dma, qce);
+	if (ret)
+		return ret;
+
 	return devm_qce_register_algs(qce);
 }
 
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index f092ce2d3b04a936a37805c20ac5ba78d8fdd2df..a80e12eac6c87e5321cce16c56a4bf5003474ef0 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -27,6 +27,9 @@
  * @dma: pointer to dma data
  * @burst_size: the crypto burst size
  * @pipe_pair_id: which pipe pair id the device using
+ * @base_dma: base DMA address
+ * @base_phys: base physical address
+ * @dma_size: size of memory mapped for DMA
  * @async_req_enqueue: invoked by every algorithm to enqueue a request
  * @async_req_done: invoked by every algorithm to finish its request
  */
@@ -43,6 +46,9 @@ struct qce_device {
 	struct qce_dma_data dma;
 	int burst_size;
 	unsigned int pipe_pair_id;
+	dma_addr_t base_dma;
+	phys_addr_t base_phys;
+	size_t dma_size;
 	int (*async_req_enqueue)(struct qce_device *qce,
 				 struct crypto_async_request *req);
 	void (*async_req_done)(struct qce_device *qce, int ret);

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 13/14] crypto: qce - Add BAM DMA support for crypto register I/O
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
                   ` (11 preceding siblings ...)
  2026-05-26 13:11 ` [PATCH v19 12/14] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
@ 2026-05-26 13:11 ` Bartosz Golaszewski
  2026-05-26 17:13   ` sashiko-bot
  2026-05-26 13:11 ` [PATCH v19 14/14] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
  13 siblings, 1 reply; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:11 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Switch to using BAM DMA for register I/O in addition to passing data. To
that end: provide the necessary infrastructure in the driver, modify the
ordering of operations as required and replace all direct register writes
with wrappers queueing DMA command descriptors.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/aead.c     |  10 ++--
 drivers/crypto/qce/common.c   |  20 ++++---
 drivers/crypto/qce/dma.c      | 120 ++++++++++++++++++++++++++++++++++++++++--
 drivers/crypto/qce/dma.h      |   5 ++
 drivers/crypto/qce/sha.c      |  10 ++--
 drivers/crypto/qce/skcipher.c |  10 ++--
 6 files changed, 144 insertions(+), 31 deletions(-)

diff --git a/drivers/crypto/qce/aead.c b/drivers/crypto/qce/aead.c
index 03b8042da9a1b4aebdc775ad8ab912abc7b2383d..e271ecbcbb4a33c405fbec85c458cf1daa7e2c55 100644
--- a/drivers/crypto/qce/aead.c
+++ b/drivers/crypto/qce/aead.c
@@ -463,17 +463,17 @@ qce_aead_async_req_handle(struct crypto_async_request *async_req)
 			src_nents = dst_nents - 1;
 	}
 
-	ret = qce_dma_prep_sgs(&qce->dma, rctx->src_sg, src_nents, rctx->dst_sg, dst_nents,
-			       qce_aead_done, async_req);
+	ret = qce_start(async_req, tmpl->crypto_alg_type);
 	if (ret)
 		goto error_unmap_src;
 
-	qce_dma_issue_pending(&qce->dma);
-
-	ret = qce_start(async_req, tmpl->crypto_alg_type);
+	ret = qce_dma_prep_sgs(&qce->dma, rctx->src_sg, src_nents, rctx->dst_sg, dst_nents,
+			       qce_aead_done, async_req);
 	if (ret)
 		goto error_terminate;
 
+	qce_dma_issue_pending(&qce->dma);
+
 	return 0;
 
 error_terminate:
diff --git a/drivers/crypto/qce/common.c b/drivers/crypto/qce/common.c
index 54a78a57f63028f01870a3edeb8e390f523bb190..37bb6f03244d317a887aeb0aa10cefe327b4ce05 100644
--- a/drivers/crypto/qce/common.c
+++ b/drivers/crypto/qce/common.c
@@ -25,7 +25,7 @@ static inline u32 qce_read(struct qce_device *qce, u32 offset)
 
 static inline void qce_write(struct qce_device *qce, u32 offset, u32 val)
 {
-	writel(val, qce->base + offset);
+	qce_write_dma(qce, offset, val);
 }
 
 static inline void qce_write_array(struct qce_device *qce, u32 offset,
@@ -82,6 +82,8 @@ static void qce_setup_config(struct qce_device *qce)
 {
 	u32 config;
 
+	qce_clear_bam_transaction(qce);
+
 	/* get big endianness */
 	config = qce_config_reg(qce, 0);
 
@@ -90,12 +92,14 @@ static void qce_setup_config(struct qce_device *qce)
 	qce_write(qce, REG_CONFIG, config);
 }
 
-static inline void qce_crypto_go(struct qce_device *qce, bool result_dump)
+static inline int qce_crypto_go(struct qce_device *qce, bool result_dump)
 {
 	if (result_dump)
 		qce_write(qce, REG_GOPROC, BIT(GO_SHIFT) | BIT(RESULTS_DUMP_SHIFT));
 	else
 		qce_write(qce, REG_GOPROC, BIT(GO_SHIFT));
+
+	return qce_submit_cmd_desc(qce);
 }
 
 #if defined(CONFIG_CRYPTO_DEV_QCE_SHA) || defined(CONFIG_CRYPTO_DEV_QCE_AEAD)
@@ -223,9 +227,7 @@ static int qce_setup_regs_ahash(struct crypto_async_request *async_req)
 	config = qce_config_reg(qce, 1);
 	qce_write(qce, REG_CONFIG, config);
 
-	qce_crypto_go(qce, true);
-
-	return 0;
+	return qce_crypto_go(qce, true);
 }
 #endif
 
@@ -386,9 +388,7 @@ static int qce_setup_regs_skcipher(struct crypto_async_request *async_req)
 	config = qce_config_reg(qce, 1);
 	qce_write(qce, REG_CONFIG, config);
 
-	qce_crypto_go(qce, true);
-
-	return 0;
+	return qce_crypto_go(qce, true);
 }
 #endif
 
@@ -535,9 +535,7 @@ static int qce_setup_regs_aead(struct crypto_async_request *async_req)
 	qce_write(qce, REG_CONFIG, config);
 
 	/* Start the process */
-	qce_crypto_go(qce, !IS_CCM(flags));
-
-	return 0;
+	return qce_crypto_go(qce, !IS_CCM(flags));
 }
 #endif
 
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index c2602d35baa6ad3ca5de734de7ff6160ff29567c..769cc71da90076be446cbdf7ec7db27f628fa2ac 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -4,6 +4,8 @@
  */
 
 #include <linux/device.h>
+#include <linux/dma/qcom_bam_dma.h>
+#include <linux/dma-mapping.h>
 #include <linux/dmaengine.h>
 #include <crypto/scatterwalk.h>
 
@@ -11,6 +13,96 @@
 #include "dma.h"
 
 #define QCE_IGNORE_BUF_SZ		(2 * QCE_BAM_BURST_SIZE)
+#define QCE_BAM_CMD_SGL_SIZE		128
+#define QCE_BAM_CMD_ELEMENT_SIZE	128
+
+struct qce_desc_info {
+	struct dma_async_tx_descriptor *dma_desc;
+	enum dma_data_direction dir;
+};
+
+struct qce_bam_transaction {
+	struct bam_cmd_element bam_ce[QCE_BAM_CMD_ELEMENT_SIZE];
+	struct scatterlist wr_sgl[QCE_BAM_CMD_SGL_SIZE];
+	struct qce_desc_info *desc;
+	u32 bam_ce_idx;
+	u32 pre_bam_ce_idx;
+	u32 wr_sgl_cnt;
+};
+
+void qce_clear_bam_transaction(struct qce_device *qce)
+{
+	struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
+
+	bam_txn->bam_ce_idx = 0;
+	bam_txn->wr_sgl_cnt = 0;
+	bam_txn->pre_bam_ce_idx = 0;
+}
+
+int qce_submit_cmd_desc(struct qce_device *qce)
+{
+	struct qce_desc_info *qce_desc = qce->dma.bam_txn->desc;
+	struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
+	struct dma_async_tx_descriptor *dma_desc;
+	struct dma_chan *chan = qce->dma.rxchan;
+	unsigned long attrs = DMA_PREP_CMD;
+	dma_cookie_t cookie;
+	unsigned int mapped;
+	int ret;
+
+	mapped = dma_map_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+	if (!mapped)
+		return -ENOMEM;
+
+	dma_desc = dmaengine_prep_slave_sg(chan, bam_txn->wr_sgl, mapped, DMA_MEM_TO_DEV, attrs);
+	if (!dma_desc) {
+		ret = -ENOMEM;
+		goto err_unmap_sg;
+	}
+
+	qce_desc->dma_desc = dma_desc;
+	cookie = dmaengine_submit(qce_desc->dma_desc);
+
+	ret = dma_submit_error(cookie);
+	if (ret)
+		goto err_unmap_sg;
+
+	return 0;
+
+err_unmap_sg:
+	dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+	return ret;
+}
+
+static void qce_prep_dma_cmd_desc(struct qce_device *qce, struct qce_dma_data *dma,
+				  unsigned int addr, void *buf)
+{
+	struct qce_bam_transaction *bam_txn = dma->bam_txn;
+	struct bam_cmd_element *bam_ce_buf;
+	int bam_ce_size, cnt, idx;
+
+	idx = bam_txn->bam_ce_idx;
+	bam_ce_buf = &bam_txn->bam_ce[idx];
+	bam_prep_ce_le32(bam_ce_buf, addr, BAM_WRITE_COMMAND, *((__le32 *)buf));
+
+	bam_ce_buf = &bam_txn->bam_ce[bam_txn->pre_bam_ce_idx];
+	bam_txn->bam_ce_idx++;
+	bam_ce_size = (bam_txn->bam_ce_idx - bam_txn->pre_bam_ce_idx) * sizeof(*bam_ce_buf);
+
+	cnt = bam_txn->wr_sgl_cnt;
+
+	sg_set_buf(&bam_txn->wr_sgl[cnt], bam_ce_buf, bam_ce_size);
+
+	++bam_txn->wr_sgl_cnt;
+	bam_txn->pre_bam_ce_idx = bam_txn->bam_ce_idx;
+}
+
+void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val)
+{
+	unsigned int reg_addr = ((unsigned int)(qce->base_phys) + offset);
+
+	qce_prep_dma_cmd_desc(qce, &qce->dma, reg_addr, &val);
+}
 
 static void qce_dma_terminate(void *data)
 {
@@ -42,6 +134,16 @@ int devm_qce_dma_request(struct qce_device *qce)
 		return dev_err_probe(dev, PTR_ERR(dma->rxchan),
 				     "Failed to get RX DMA channel\n");
 
+	dma->bam_txn = devm_kzalloc(dev, sizeof(*dma->bam_txn), GFP_KERNEL);
+	if (!dma->bam_txn)
+		return -ENOMEM;
+
+	dma->bam_txn->desc = devm_kzalloc(dev, sizeof(*dma->bam_txn->desc), GFP_KERNEL);
+	if (!dma->bam_txn->desc)
+		return -ENOMEM;
+
+	sg_init_table(dma->bam_txn->wr_sgl, QCE_BAM_CMD_SGL_SIZE);
+
 	return devm_add_action_or_reset(dev, qce_dma_terminate, dma);
 }
 
@@ -101,28 +203,36 @@ int qce_dma_prep_sgs(struct qce_dma_data *dma, struct scatterlist *rx_sg,
 {
 	struct dma_chan *rxchan = dma->rxchan;
 	struct dma_chan *txchan = dma->txchan;
-	unsigned long flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
+	unsigned long txflags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
+	unsigned long rxflags = txflags | DMA_PREP_FENCE;
 	int ret;
 
-	ret = qce_dma_prep_sg(rxchan, rx_sg, rx_nents, flags, DMA_MEM_TO_DEV,
+	ret = qce_dma_prep_sg(rxchan, rx_sg, rx_nents, rxflags, DMA_MEM_TO_DEV,
 			     NULL, NULL);
 	if (ret)
 		return ret;
 
-	return qce_dma_prep_sg(txchan, tx_sg, tx_nents, flags, DMA_DEV_TO_MEM,
+	return qce_dma_prep_sg(txchan, tx_sg, tx_nents, txflags, DMA_DEV_TO_MEM,
 			       cb, cb_param);
 }
 
 void qce_dma_issue_pending(struct qce_dma_data *dma)
 {
-	dma_async_issue_pending(dma->rxchan);
 	dma_async_issue_pending(dma->txchan);
+	dma_async_issue_pending(dma->rxchan);
 }
 
 int qce_dma_terminate_all(struct qce_dma_data *dma)
 {
+	struct qce_device *qce = container_of(dma, struct qce_device, dma);
+	struct qce_bam_transaction *bam_txn = dma->bam_txn;
 	int ret;
 
 	ret = dmaengine_terminate_all(dma->rxchan);
-	return ret ?: dmaengine_terminate_all(dma->txchan);
+	if (ret)
+		return ret;
+
+	dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+
+	return dmaengine_terminate_all(dma->txchan);
 }
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index 483789d9fa98e79d1283de8297bf2fc2a773f3a7..f05dfa9e6b25bd60e32f45079a8bc7e6a4cf81f9 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -8,6 +8,7 @@
 
 #include <linux/dmaengine.h>
 
+struct qce_bam_transaction;
 struct qce_device;
 
 /* maximum data transfer block size between BAM and CE */
@@ -32,6 +33,7 @@ struct qce_dma_data {
 	struct dma_chan *txchan;
 	struct dma_chan *rxchan;
 	struct qce_result_dump *result_buf;
+	struct qce_bam_transaction *bam_txn;
 };
 
 int devm_qce_dma_request(struct qce_device *qce);
@@ -43,5 +45,8 @@ int qce_dma_terminate_all(struct qce_dma_data *dma);
 struct scatterlist *
 qce_sgtable_add(struct sg_table *sgt, struct scatterlist *sg_add,
 		unsigned int max_len);
+void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val);
+int qce_submit_cmd_desc(struct qce_device *qce);
+void qce_clear_bam_transaction(struct qce_device *qce);
 
 #endif /* _DMA_H_ */
diff --git a/drivers/crypto/qce/sha.c b/drivers/crypto/qce/sha.c
index a3a1a205aaf8559a04809936e2a3b7d564c16c53..5be82b345753f49202797852cec09dbc7f0a1e03 100644
--- a/drivers/crypto/qce/sha.c
+++ b/drivers/crypto/qce/sha.c
@@ -109,17 +109,17 @@ static int qce_ahash_async_req_handle(struct crypto_async_request *async_req)
 		goto error_unmap_src;
 	}
 
-	ret = qce_dma_prep_sgs(&qce->dma, req->src, rctx->src_nents,
-			       &rctx->result_sg, 1, qce_ahash_done, async_req);
+	ret = qce_start(async_req, tmpl->crypto_alg_type);
 	if (ret)
 		goto error_unmap_dst;
 
-	qce_dma_issue_pending(&qce->dma);
-
-	ret = qce_start(async_req, tmpl->crypto_alg_type);
+	ret = qce_dma_prep_sgs(&qce->dma, req->src, rctx->src_nents,
+			       &rctx->result_sg, 1, qce_ahash_done, async_req);
 	if (ret)
 		goto error_terminate;
 
+	qce_dma_issue_pending(&qce->dma);
+
 	return 0;
 
 error_terminate:
diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 1fef315a7105c869e7fc6a60719087b721e78bb3..6535336a2c57c39db94999011890b8bdad5c58c2 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -142,18 +142,18 @@ qce_skcipher_async_req_handle(struct crypto_async_request *async_req)
 		src_nents = dst_nents - 1;
 	}
 
+	ret = qce_start(async_req, tmpl->crypto_alg_type);
+	if (ret)
+		goto error_unmap_src;
+
 	ret = qce_dma_prep_sgs(&qce->dma, rctx->src_sg, src_nents,
 			       rctx->dst_sg, dst_nents,
 			       qce_skcipher_done, async_req);
 	if (ret)
-		goto error_unmap_src;
+		goto error_terminate;
 
 	qce_dma_issue_pending(&qce->dma);
 
-	ret = qce_start(async_req, tmpl->crypto_alg_type);
-	if (ret)
-		goto error_terminate;
-
 	return 0;
 
 error_terminate:

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v19 14/14] crypto: qce - Communicate the base physical address to the dmaengine
  2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
                   ` (12 preceding siblings ...)
  2026-05-26 13:11 ` [PATCH v19 13/14] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
@ 2026-05-26 13:11 ` Bartosz Golaszewski
  2026-05-26 18:20   ` sashiko-bot
  13 siblings, 1 reply; 24+ messages in thread
From: Bartosz Golaszewski @ 2026-05-26 13:11 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
	Neil Armstrong
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski

In order to communicate to the BAM DMA engine which address should be
used as a scratchpad for dummy writes related to BAM pipe locking,
fill out and attach the provided metadata struct to the descriptor.

Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/dma.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 769cc71da90076be446cbdf7ec7db27f628fa2ac..349c1d9ce9a2f4628087aa4ed5f8dda2bd9eaedb 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -11,6 +11,7 @@
 
 #include "core.h"
 #include "dma.h"
+#include "regs-v5.h"
 
 #define QCE_IGNORE_BUF_SZ		(2 * QCE_BAM_BURST_SIZE)
 #define QCE_BAM_CMD_SGL_SIZE		128
@@ -41,6 +42,10 @@ void qce_clear_bam_transaction(struct qce_device *qce)
 
 int qce_submit_cmd_desc(struct qce_device *qce)
 {
+	struct bam_desc_metadata meta = {
+		.scratchpad_addr = qce->base_phys + REG_VERSION,
+		.direction = DMA_MEM_TO_DEV,
+	};
 	struct qce_desc_info *qce_desc = qce->dma.bam_txn->desc;
 	struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
 	struct dma_async_tx_descriptor *dma_desc;
@@ -60,6 +65,10 @@ int qce_submit_cmd_desc(struct qce_device *qce)
 		goto err_unmap_sg;
 	}
 
+	ret = dmaengine_desc_attach_metadata(dma_desc, &meta, sizeof(meta));
+	if (ret)
+		goto err_unmap_sg;
+
 	qce_desc->dma_desc = dma_desc;
 	cookie = dmaengine_submit(qce_desc->dma_desc);
 

-- 
2.47.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v19 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path
  2026-05-26 13:10 ` [PATCH v19 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path Bartosz Golaszewski
@ 2026-05-26 13:45   ` sashiko-bot
  0 siblings, 0 replies; 24+ messages in thread
From: sashiko-bot @ 2026-05-26 13:45 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: vkoul, Frank.Li, dmaengine

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

New issues:
- [High] Premature `free_irq()` in `bam_dma_remove()` breaks active DMA clients.

Pre-existing issues:
- [High] MMIO registers are accessed after the peripheral clock is disabled, causing synchronous external aborts.
- [High] Hardware access in IRQ handler before ensuring the device is active.
--

commit 9a4b7f5a6e5a0de12bcd192dd65b74252ada73c2
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

dmaengine: qcom: bam_dma: free interrupt before the clock in error path

This commit replaces devres with manual interrupt management in the BAM DMA
driver. It explicitly requests and frees the IRQ to ensure the interrupt is
freed before the clock is disabled in both the probe error path and during
device removal.

Closes: https://sashiko.dev/#/patchset/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc%40oss.qualcomm.com?part=2

> diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
> index 19116295f8325..b3d36ea799843 100644
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c
> @@ -1302,8 +1302,7 @@ static int bam_dma_probe(struct platform_device *pdev)
>  	for (i = 0; i < bdev->num_channels; i++)
>  		bam_channel_init(bdev, &bdev->channels[i], i);
>  
> -	ret = devm_request_irq(bdev->dev, bdev->irq, bam_dma_irq,
> -			IRQF_TRIGGER_HIGH, "bam_dma", bdev);
> +	ret = request_irq(bdev->irq, bam_dma_irq, IRQF_TRIGGER_HIGH, "bam_dma", bdev);

[Severity: High]
This isn't a bug introduced by this patch, but does the interrupt handler
bam_dma_irq() access hardware registers before ensuring the device is active?

If the device is in a suspended state (for example, via auto-suspend), its
clock is disabled. If a delayed or stray interrupt fires in this state,
reading from an unclocked MMIO register would result in a synchronous
external abort.

Should the handler use pm_runtime_get_if_active() to ensure the device is
active before touching MMIO registers, rather than risking a crash when the
clock is disabled?

>  	if (ret)
>  		goto err_bam_channel_exit;
>  

[ ... ]

> @@ -1371,6 +1372,8 @@ static void bam_dma_remove(struct platform_device *pdev)
>  	struct bam_device *bdev = platform_get_drvdata(pdev);
>  	u32 i;
>  
> +	free_irq(bdev->irq, bdev);
> +

[Severity: High]
Does moving this free_irq() to the very beginning of the remove function
introduce a race condition?

If the DMA controller is still registered and active from the subsystem's
perspective, any in-flight DMA transfers will complete in hardware, but
software will never receive their completion interrupts. Could this cause
client drivers to hang indefinitely or time out?

Should the teardown sequence first unregister the dmaengine device to
prevent new requests, terminate all active channels, mask interrupts, and
only then free the IRQ and disable the clock?

>  	pm_runtime_force_suspend(&pdev->dev);
>  
>  	of_dma_controller_free(pdev->dev.of_node);
> @@ -1379,8 +1382,6 @@ static void bam_dma_remove(struct platform_device *pdev)
>  	/* mask all interrupts for this execution environment */
>  	writel_relaxed(0, bam_addr(bdev, 0,  BAM_IRQ_SRCS_MSK_EE));

[Severity: High]
This is a pre-existing issue, but does calling pm_runtime_force_suspend()
earlier in this function cause synchronous external aborts here?

The runtime suspend callback will disable the BAM device clock, but
immediately following this, the function accesses hardware registers:

bam_dma_remove()
    pm_runtime_force_suspend(&pdev->dev); /* clock disabled */
    ...
    writel_relaxed(0, bam_addr(bdev, 0, BAM_IRQ_SRCS_MSK_EE));

It also loops over channels calling bam_dma_terminate_all(), which can call
bam_chan_init_hw() and read/write MMIO registers.

Should the device be kept awake until all hardware registers are masked and
channels are terminated?

> -	devm_free_irq(bdev->dev, bdev->irq, bdev);
> -
>  	for (i = 0; i < bdev->num_channels; i++) {
>  		bam_dma_terminate_all(&bdev->channels[i].vc.chan);
>  		tasklet_kill(&bdev->channels[i].vc.task);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526-qcom-qce-cmd-descr-v19-0-08472fdcbf4a@oss.qualcomm.com?part=2

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v19 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue
  2026-05-26 13:10 ` [PATCH v19 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
@ 2026-05-26 14:17   ` sashiko-bot
  0 siblings, 0 replies; 24+ messages in thread
From: sashiko-bot @ 2026-05-26 14:17 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: dmaengine, Frank.Li, vkoul

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] Incorrect teardown sequence in `bam_dma_remove` leads to a synchronous external abort (bus fault) when masking interrupts.
- [High] The IRQ handler accesses hardware registers before ensuring the device clock is enabled, leading to a potential bus fault.
- [High] Failing to clear a level-triggered interrupt when `pm_runtime_get_sync` fails causes an interrupt storm and device disablement.
--

commit 6a70ab3d7ea9134c2e9e75e415cbff4d0dd6592a
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue

This commit converts the BAM DMA driver from legacy tasklets to use the
modern high-priority variant of the BH workqueue.

> diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
> index b3d36ea799843..1c62f845ac0b9 100644
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c

[ ... ]

> @@ -893,9 +894,9 @@ static irqreturn_t bam_dma_irq(int irq, void *data)
>  
>  	srcs |= process_channel_irqs(bdev);

[Severity: High]
This is a pre-existing issue, but does process_channel_irqs() access
hardware registers before the device clock is guaranteed to be enabled?

In process_channel_irqs(), it immediately reads from BAM_IRQ_SRCS_EE. If a
spurious or delayed interrupt occurs while the device is runtime suspended,
could reading this register with the clock disabled cause a synchronous
external abort?

>  
> -	/* kick off tasklet to start next dma transfer */
> +	/* kick off the work queue to start next dma transfer */
>  	if (srcs & P_IRQ)
> -		tasklet_schedule(&bdev->task);
> +		queue_work(system_bh_highpri_wq, &bdev->work);
>  
>  	ret = pm_runtime_get_sync(bdev->dev);
>  	if (ret < 0)

[Severity: High]
This is also a pre-existing issue, but if pm_runtime_get_sync() fails,
the handler returns IRQ_NONE immediately without clearing the hardware
interrupt source.

Since the interrupt is registered as level-triggered (IRQF_TRIGGER_HIGH),
will exiting the handler without writing to BAM_IRQ_CLR cause the
interrupt controller to immediately re-assert it? Could this lead to an
interrupt storm that permanently disables the IRQ line?

[ ... ]

> @@ -1394,7 +1394,7 @@ static void bam_dma_remove(struct platform_device *pdev)
>  			    bdev->channels[i].fifo_phys);
>  	}
>  
> -	tasklet_kill(&bdev->task);
> +	cancel_work_sync(&bdev->work);
>  
>  	clk_disable_unprepare(bdev->bamclk);
>  }

[Severity: High]
This is a pre-existing issue, but earlier in bam_dma_remove(), the device
is forced into suspend:

bam_dma_remove()
    pm_runtime_force_suspend(&pdev->dev);
    ...
    writel_relaxed(0, bam_addr(bdev, 0,  BAM_IRQ_SRCS_MSK_EE));

Does pm_runtime_force_suspend() disable the device clock? If so, will the
subsequent writel_relaxed() write to an unclocked register and trigger a
synchronous external abort during driver unbind?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526-qcom-qce-cmd-descr-v19-0-08472fdcbf4a@oss.qualcomm.com?part=3

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v19 06/14] dmaengine: qcom: bam_dma: add support for BAM locking
  2026-05-26 13:10 ` [PATCH v19 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
@ 2026-05-26 15:01   ` sashiko-bot
  0 siblings, 0 replies; 24+ messages in thread
From: sashiko-bot @ 2026-05-26 15:01 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: Frank.Li, vkoul, dmaengine

Thank you for your contribution! Sashiko AI review found 8 potential issue(s) to consider:
- [Critical] Dummy lock descriptors steal DMA cookies and cause premature completion of client transfers.
- [High] Hardware FIFO exhaustion causes unbounded spurious injections of nested LOCK and UNLOCK descriptors.
- [High] A partial allocation or mapping failure leaves the BAM pipeline permanently locked.
- [High] Pipe locking is applied exclusively to TX channels, contradicting the stated intent to support RX channels.
- [High] The dummy command descriptor is mapped using the client peripheral's device instead of the DMA engine's device.
- [High] Missing `cpu_to_le32()` conversion for the DMA descriptor hardware address.
- [High] The `lock_ce` command element shares a CPU cache line with other structure members, leading to false sharing.
- [Medium] Missing length validation when attaching descriptor metadata.
--

commit 8b6f3d804203416c63344d33ae0ad758f79620c7
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

dmaengine: qcom: bam_dma: add support for BAM locking

Add support for BAM pipe locking. When starting DMA on an RX channel, this
prepends the queue of issued descriptors with a dummy command descriptor with
the LOCK bit set, and appends one with the UNLOCK bit once the transaction is
done.

> diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
> index 04fe1d546be73..84fd9e181bdd5 100644
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c
[ ... ]
> @@ -72,6 +76,10 @@ struct bam_async_desc {
>  
>  	struct bam_desc_hw *curr_desc;
>  
> +	/* BAM locking infrastructure */
> +	struct scatterlist lock_sg;
> +	struct bam_cmd_element lock_ce;

[Severity: High]
Is it safe to embed lock_ce directly in struct bam_async_desc without
cache line alignment?

Since lock_ce is mapped for DMA, sharing a CPU cache line with other members
like desc_node could violate the DMA API. On architectures with non-coherent
caches, DMA cache maintenance operations might cause memory corruption or
lost CPU writes to adjacent fields in the structure.

> +
>  	/* list node for the desc in the bam_chan list of descriptors */
>  	struct list_head desc_node;
[ ... ]
> @@ -652,6 +664,35 @@ static int bam_slave_config(struct dma_chan *chan,
>  	return 0;
>  }
>  
> +static int bam_metadata_attach(struct dma_async_tx_descriptor *desc, void *data, size_t len)
> +{
> +	struct bam_chan *bchan = to_bam_chan(desc->chan);
> +	const struct bam_device_data *bdata = bchan->bdev->dev_data;
> +	struct bam_desc_metadata *metadata = data;
> +
> +	if (!data)
> +		return -EINVAL;

[Severity: Medium]
Could this result in an out-of-bounds read if len is smaller than expected?

The driver casts the data pointer to struct bam_desc_metadata * and
dereferences it without validating that len is at least the size of
struct bam_desc_metadata.

> +
> +	if (!bdata->pipe_lock_supported)
[ ... ]
> @@ -1012,13 +1057,105 @@ static void bam_apply_new_config(struct bam_chan *bchan,
[ ... ]
> +static struct bam_async_desc *
> +bam_make_lock_desc(struct bam_chan *bchan, unsigned long flag)
> +{
[ ... ]
> +	bam_prep_ce_le32(&async_desc->lock_ce, bchan->scratchpad_addr, BAM_WRITE_COMMAND, 0);
> +	sg_set_buf(&async_desc->lock_sg, &async_desc->lock_ce, sizeof(async_desc->lock_ce));
> +
> +	mapped = dma_map_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);

[Severity: High]
Should the memory be mapped using the DMA engine's device rather than the
client peripheral's device?

Because the BAM DMA controller executes this command, the memory needs to be
mapped into its own address space. If the DMA controller and the peripheral
are in different IOMMU domains, mapping it for chan->slave might yield an
address that triggers an IOMMU fault when the BAM DMA controller accesses it.

> +	if (!mapped) {
> +		kfree(async_desc);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	desc->flags |= cpu_to_le16(DESC_FLAG_CMD | flag);
> +	desc->addr = sg_dma_address(&async_desc->lock_sg);

[Severity: High]
Are we missing a cpu_to_le32() conversion here?

The physical DMA address returned by sg_dma_address() is assigned directly
to desc->addr, which is of type __le32. On big-endian architectures, this
might cause the address to be written with the wrong byte order, leading the
hardware engine to fetch from an invalid physical address.

> +	desc->size = cpu_to_le16(sizeof(struct bam_cmd_element));
> +
> +	vc = &bchan->vc;
> +	vd = &async_desc->vd;
> +
> +	dma_async_tx_descriptor_init(&vd->tx, &vc->chan);
> +	vd->tx.flags = DMA_PREP_CMD;
> +	vd->tx.desc_free = vchan_tx_desc_free;
> +	vd->tx_result.result = DMA_TRANS_NOERROR;
> +	vd->tx_result.residue = 0;
> +
> +	cookie = dma_cookie_assign(&vd->tx);

[Severity: Critical]
Will assigning a new cookie to the lock descriptor cause premature completion
of client transfers?

Calling dma_cookie_assign() gives this descriptor a higher cookie value than
the client's already-submitted descriptors. Since the LOCK descriptor is
prepended to desc_issued, it executes first. When it completes, the channel's
completed_cookie is updated to this higher value.

Subsequent calls to dma_cookie_status() will falsely report the pending client
descriptors as complete, potentially causing the client to free or reuse memory
while the hardware is still performing DMA.

> +	ret = dma_submit_error(cookie);
[ ... ]
> +	return async_desc;
> +}
> +
> +static int bam_do_setup_pipe_lock(struct bam_chan *bchan, bool lock)
> +{
> +	struct bam_device *bdev = bchan->bdev;
> +	const struct bam_device_data *bdata = bdev->dev_data;
> +	struct bam_async_desc *lock_desc;
> +	unsigned long flag;
> +
> +	lockdep_assert_held(&bchan->vc.lock);
> +
> +	if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr ||
> +	    bchan->direction != DMA_MEM_TO_DEV)

[Severity: High]
Does this logic contradict the commit message's stated intent to support
RX channels?

The check here aborts if the direction is not DMA_MEM_TO_DEV (a TX channel).
This applies pipe locking exclusively to TX channels, completely disabling it
for the intended RX channels.

> +		return 0;
> +
> +	flag = lock ? DESC_FLAG_LOCK : DESC_FLAG_UNLOCK;
[ ... ]
> +	return 0;
> +}
> +
> +static void bam_setup_pipe_lock(struct bam_chan *bchan)
> +{
> +	if (bam_do_setup_pipe_lock(bchan, true) || bam_do_setup_pipe_lock(bchan, false))

[Severity: High]
Could a partial allocation failure leave the BAM pipeline permanently deadlocked?

The insertion relies on short-circuiting logic. If the first call successfully
adds the LOCK descriptor to desc_issued, and the second call for the UNLOCK
subsequently fails (e.g. returning -ENOMEM), an error is logged but the LOCK
descriptor is never rolled back.

The transaction would then execute with the LOCK, but since no UNLOCK is
issued, the channel remains permanently locked.

> +		dev_err(bchan->vc.chan.slave, "Failed to setup BAM pipe lock descriptors");
> +}
> +
>  /**
>   * bam_start_dma - start next transaction
>   * @bchan: bam dma channel
>   */
>  static void bam_start_dma(struct bam_chan *bchan)
>  {
> -	struct virt_dma_desc *vd = vchan_next_desc(&bchan->vc);
> +	struct virt_dma_desc *vd;
>  	struct bam_device *bdev = bchan->bdev;
>  	struct bam_async_desc *async_desc = NULL;
>  	struct bam_desc_hw *desc;
> @@ -1030,6 +1167,9 @@ static void bam_start_dma(struct bam_chan *bchan)
>  
>  	lockdep_assert_held(&bchan->vc.lock);
>  
> +	bam_setup_pipe_lock(bchan);

[Severity: High]
Could this unconditionally accumulate spurious nested LOCK and UNLOCK descriptors
if the hardware FIFO is full?

During execution, if the loop over descriptors breaks early due to
!IS_BUSY(bchan), it leaves the remaining descriptors in desc_issued.
When the next interrupt triggers bam_start_dma() to continue the transaction,
it will unconditionally add another LOCK and UNLOCK pair. This seems to break
the hardware protocol by accumulating duplicate locking commands in the middle
of a continuous transfer.

> +
> +	vd = vchan_next_desc(&bchan->vc);
>  	if (!vd)
>  		return;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526-qcom-qce-cmd-descr-v19-0-08472fdcbf4a@oss.qualcomm.com?part=6

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v19 07/14] crypto: qce - Cancel work on device detach
  2026-05-26 13:10 ` [PATCH v19 07/14] crypto: qce - Cancel work on device detach Bartosz Golaszewski
@ 2026-05-26 15:33   ` sashiko-bot
  0 siblings, 0 replies; 24+ messages in thread
From: sashiko-bot @ 2026-05-26 15:33 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: Frank.Li, dmaengine, vkoul

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

New issues:
- [High] The patch introduces a devres action to cancel the workqueue but registers it before DMA request and mutex initialization, causing a teardown order that leads to Use-After-Free and locking of a destroyed mutex.

Pre-existing issues:
- [High] Synchronous freeing of `qce_alg_template` in algorithm unregistration functions can lead to a Use-After-Free if active Transform (tfm) instances are still in use.
--

commit 346a6daf560fef47a46685921528f09b6a107436
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

crypto: qce - Cancel work on device detach

This commit introduces a devres action to cancel the done_work workqueue
during device detach or probe error. It moves the workqueue initialization
earlier in the probe sequence to ensure it is cancelled after outstanding
DMA transfers are terminated. Additionally, it ensures all transfers are
explicitly terminated in the DMA release action.

Closes: https://sashiko.dev/#/patchset/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc%40oss.qualcomm.com?part=7

> diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
> index b966f3365b7de..f671946cf7351 100644
> --- a/drivers/crypto/qce/core.c
> +++ b/drivers/crypto/qce/core.c

[ ... ]

> @@ -227,6 +234,11 @@ static int qce_crypto_probe(struct platform_device *pdev)
>  	if (ret)
>  		return ret;
>  
> +	INIT_WORK(&qce->done_work, qce_req_done_work);
> +	ret = devm_add_action_or_reset(dev, qce_cancel_work, &qce->done_work);
> +	if (ret)
> +		return ret;
> +
>  	ret = devm_qce_dma_request(qce->dev, &qce->dma);
>  	if (ret)
>  		return ret;

[Severity: High]
Does registering the work cancellation before DMA and mutex initialization
introduce a teardown order issue?

Devres actions execute in reverse order of registration (LIFO). During driver
unbind or probe failure, the cleanup sequence would be:

1. Mutex is destroyed (via devm_mutex_init cleanup)
2. qce_dma_release() executes, releasing DMA channels and freeing
   dma->result_buf
3. qce_cancel_work() synchronously cancels the workqueue

If the work item (qce_req_done_work) is scheduled before the workqueue is
cancelled, could it execute concurrently with or after the mutex and DMA
cleanups?

When the work runs, it might attempt to lock the destroyed qce->lock and
dequeue the next crypto request using the freed dma->result_buf, which
could lead to a use-after-free.

> @@ -239,7 +251,6 @@ static int qce_crypto_probe(struct platform_device *pdev)
>  	if (ret)
>  		return ret;
>  
> -	INIT_WORK(&qce->done_work, qce_req_done_work);
>  	crypto_init_queue(&qce->queue, QCE_QUEUE_LENGTH);
>  
>  	qce->async_req_enqueue = qce_async_request_enqueue;

[ ... ]

> diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
> index 68cafd4741ad3..7ec9d72fd690f 100644
> --- a/drivers/crypto/qce/dma.c
> +++ b/drivers/crypto/qce/dma.c
> @@ -13,6 +13,8 @@ static void qce_dma_release(void *data)
>  {
>  	struct qce_dma_data *dma = data;
>  
> +	dmaengine_terminate_sync(dma->txchan);
> +	dmaengine_terminate_sync(dma->rxchan);
>  	dma_release_channel(dma->txchan);
>  	dma_release_channel(dma->rxchan);
>  	kfree(dma->result_buf);

[Severity: High]
This is a pre-existing issue, but is it safe to synchronously free the
algorithm templates during unregistration?

In unregistration paths like qce_skcipher_unregister(), the driver calls
crypto_unregister_skcipher() and immediately calls kfree(tmpl).
However, crypto_unregister_skcipher() only drops the algorithm registration
reference; it does not block until all active transform (tfm) instances are
destroyed.

If an application holds a reference to the algorithm (e.g. via an AF_ALG
socket) and the device is unbound, the remove function could execute and
synchronously free the memory.

If the application then continues to submit requests, could it access the
synchronously freed tmpl structure, leading to a use-after-free?

Would it be better to use the alg->cra_destroy callback to dynamically free
the template when the final reference is dropped?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526-qcom-qce-cmd-descr-v19-0-08472fdcbf4a@oss.qualcomm.com?part=7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v19 09/14] crypto: qce - Remove unused ignore_buf
  2026-05-26 13:10 ` [PATCH v19 09/14] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
@ 2026-05-26 15:57   ` sashiko-bot
  0 siblings, 0 replies; 24+ messages in thread
From: sashiko-bot @ 2026-05-26 15:57 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: vkoul, Frank.Li, dmaengine

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [Critical] The asynchronous workqueue accesses freed DMA channels, freed memory, and a destroyed mutex during device teardown due to incorrect devres registration order.
- [High] The driver leaks uninitialized kernel heap memory or previous cryptographic results to user space when a hardware error occurs.
- [High] Sensitive 3DES key material is copied to a stack buffer but is not explicitly zeroized before the function returns.
--

commit d001aa784f83034e2199faf1b8d96703d4ecbbb7
Author: Bartosz Golaszewski <brgl@kernel.org>

crypto: qce - Remove unused ignore_buf

This commit removes the unused ignore_buf pointer from the qce_dma_data
struct and moves the QCE_IGNORE_BUF_SZ definition directly into dma.c
since it is not used outside of this compilation unit.

> diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
> index 7ec9d72fd690f..d1daa229361aa 100644
> --- a/drivers/crypto/qce/dma.c
> +++ b/drivers/crypto/qce/dma.c

[ ... ]

> @@ -43,8 +45,6 @@ int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma)
>  		goto error_nomem;
>  	}
>  
> -	dma->ignore_buf = dma->result_buf + QCE_RESULT_BUF_SZ;
> -

[Severity: High]
This is a pre-existing issue, but does the allocation of result_buf risk
leaking uninitialized kernel heap memory or stale cryptographic results to
user space?

In devm_qce_dma_request(), result_buf is allocated with kmalloc() without
the __GFP_ZERO flag. During asynchronous request completion in
qce_ahash_done() and qce_aead_done(), the driver appears to unconditionally
copy the digest, authentication tag, or IV from these buffers to the caller's
output buffers (e.g., req->result) which often map directly to user space.

If a hardware error occurs and the BAM DMA fails to write to the result
buffer, wouldn't this copy operation expose uninitialized heap memory or
previous request data before qce_check_status() can handle the error?

[Severity: Critical]
This is a pre-existing issue, but I noticed a potential use-after-free
during device teardown related to the devres registration order here.

If qce_crypto_probe() registers the devres action to cancel the
asynchronous workqueue (qce_cancel_work) first, and then calls
devm_qce_dma_request() and devm_mutex_init(), won't the teardown happen
in reverse order?

During device detach, the LIFO devres execution would destroy the mutex
and free the DMA channels and buffers before actually canceling the
workqueue. If a DMA callback schedules qce->done_work just before DMA
termination, could qce_req_done_work() execute concurrently during
teardown, attempting to acquire the destroyed qce->lock and using the
freed DMA resources?

>  	return devm_add_action_or_reset(dev, qce_dma_release, dma);
>  
>  error_nomem:

[ ... ]

> diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
> index 31629185000e1..fc337c435cd14 100644
> --- a/drivers/crypto/qce/dma.h
> +++ b/drivers/crypto/qce/dma.h

[ ... ]

> @@ -31,7 +30,6 @@ struct qce_dma_data {
>  	struct dma_chan *txchan;
>  	struct dma_chan *rxchan;
>  	struct qce_result_dump *result_buf;
> -	void *ignore_buf;
>  };
>  
>  int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma);

[Severity: High]
This isn't a bug introduced by this patch, but while reviewing the
cryptographic operations I noticed an issue in qce_aead_setkey() and
qce_des3_setkey().

Sensitive 3DES key material is copied into a local stack array
u32 _key[6] to check for weak keys. However, the _key array doesn't
seem to be explicitly zeroized using memzero_explicit() before the
function returns.

Does this leave sensitive cryptographic key material on the kernel
stack, potentially allowing it to be leaked through subsequent
uninitialized stack reads?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526-qcom-qce-cmd-descr-v19-0-08472fdcbf4a@oss.qualcomm.com?part=9

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v19 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request()
  2026-05-26 13:10 ` [PATCH v19 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
@ 2026-05-26 16:09   ` sashiko-bot
  0 siblings, 0 replies; 24+ messages in thread
From: sashiko-bot @ 2026-05-26 16:09 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: dmaengine, vkoul, Frank.Li

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Invalid explicit cleanup of devm-managed resources causes slab corruption and double-free vulnerabilities.
--

commit 633f351bd909cf49fe67df70c5d1504f4356decf
Author: Bartosz Golaszewski <brgl@kernel.org>

crypto: qce - Use existing devres APIs in devm_qce_dma_request()

This commit switches the explicit dma_request_chan and kmalloc calls to
their devm_* counterparts in devm_qce_dma_request. It also simplifies the
error handling paths and renames the release action.

> diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
> index d60efb5c26d88..c2602d35baa6a 100644
> --- a/drivers/crypto/qce/dma.c
> +++ b/drivers/crypto/qce/dma.c
> @@ -12,7 +12,7 @@
>  
>  #define QCE_IGNORE_BUF_SZ		(2 * QCE_BAM_BURST_SIZE)
>  
> -static void qce_dma_release(void *data)
> +static void qce_dma_terminate(void *data)
>  {
>  	struct qce_dma_data *dma = data;

[Severity: High]
Does qce_dma_terminate() still call kfree(dma->result_buf) and
dma_release_channel() on the DMA channels?

Looking at the unmodified portion of the function:

static void qce_dma_terminate(void *data)
{
        struct qce_dma_data *dma = data;

        dmaengine_terminate_sync(dma->txchan);
        dmaengine_terminate_sync(dma->rxchan);
        dma_release_channel(dma->txchan);
        dma_release_channel(dma->rxchan);
        kfree(dma->result_buf);
}

Since dma->result_buf, dma->txchan, and dma->rxchan are now allocated using
devm_kmalloc() and devm_dma_request_chan() below, won't the devres framework
automatically clean them up during device teardown?

Because devm_add_action_or_reset() adds this callback to the devres list, it
will execute first and manually free these resources. When the devres
framework subsequently cleans up the devm allocations, could this lead to a
double-free and slab corruption?

[ ... ]

> @@ -27,34 +27,22 @@ int devm_qce_dma_request(struct qce_device *qce)
>  {
>  	struct qce_dma_data *dma = &qce->dma;
>  	struct device *dev = qce->dev;
> -	int ret;
>  
> -	dma->txchan = dma_request_chan(dev, "tx");
> +	dma->result_buf = devm_kmalloc(dev, QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ, GFP_KERNEL);
> +	if (!dma->result_buf)
> +		return -ENOMEM;
> +
> +	dma->txchan = devm_dma_request_chan(dev, "tx");

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526-qcom-qce-cmd-descr-v19-0-08472fdcbf4a@oss.qualcomm.com?part=11

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v19 12/14] crypto: qce - Map crypto memory for DMA
  2026-05-26 13:11 ` [PATCH v19 12/14] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
@ 2026-05-26 16:30   ` sashiko-bot
  0 siblings, 0 replies; 24+ messages in thread
From: sashiko-bot @ 2026-05-26 16:30 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: vkoul, Frank.Li, dmaengine

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

New issues:
- [High] The newly added DMA mapping is unmapped before the DMA engine is terminated and the workqueue is cancelled, leading to potential IOMMU faults or memory corruption during teardown.

Pre-existing issues:
- [High] Explicit free of devm-managed resources in `qce_dma_terminate` causes a double-free and memory pool corruption.
- [High] The background completion work is cancelled after its required lock is destroyed and DMA channels are released, potentially causing lock corruption or a use-after-free.
--

commit 0c5470463f2ecb5c08f0a1037a6585137db4fa3e
Author: Bartosz Golaszewski <brgl@kernel.org>
Subject: crypto: qce - Map crypto memory for DMA

This commit maps the crypto memory range for DMA as an initial step towards
converting the driver's register I/O to use DMA.

> diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
> index a0e2eadc3afd5..d7b7a3dda4649 100644
> --- a/drivers/crypto/qce/core.c
> +++ b/drivers/crypto/qce/core.c
> @@ -192,10 +192,19 @@ static void qce_cancel_work(void *data)
>  	cancel_work_sync(work);
>  }

[Severity: High]
This isn't a bug introduced by this patch, but is it possible that the
background completion work is cancelled after its required lock is destroyed?

Because devres executes cleanups in LIFO order, if the devm cleanup action
for the workqueue (qce_cancel_work) is registered before
devm_qce_dma_request() and devm_mutex_init(), the mutex is destroyed and
DMA channels are released before cancel_work_sync() is called.

If the background work is executing or gets scheduled during this teardown,
qce_req_done_work() might attempt to lock a destroyed mutex and access
already-freed DMA channels, potentially causing lock corruption or a
use-after-free.

[ ... ]

> @@ -255,6 +264,18 @@ static int qce_crypto_probe(struct platform_device *pdev)
>  	qce->async_req_enqueue = qce_async_request_enqueue;
>  	qce->async_req_done = qce_async_request_done;
>  
> +	qce->dma_size = resource_size(res);
> +	qce->base_dma = dma_map_resource(dev, res->start, qce->dma_size,
> +					 DMA_BIDIRECTIONAL, 0);
> +	qce->base_phys = res->start;
> +	ret = dma_mapping_error(dev, qce->base_dma);
> +	if (ret)
> +		return ret;
> +
> +	ret = devm_add_action_or_reset(qce->dev, qce_crypto_unmap_dma, qce);

[Severity: High]
Will this cause the DMA mapping to be unmapped while the DMA engine is
still active?

Since this devres cleanup action is registered near the end of the probe
sequence, devres will execute it first in LIFO order during device unbind
or a late probe failure.

However, because the background work cancellation and DMA channel
termination were registered earlier in the probe sequence, they will
execute later. This means the memory could be unmapped from the IOMMU
while the DMA engine is processing in-flight requests or while the
workqueue can still submit new requests, potentially leading to IOMMU
faults or memory corruption.

[Severity: High]
This is a pre-existing issue, but I wanted to ask about the cleanup
sequence in qce_dma_terminate().

In a preceding patch that converted the driver to use devm_kmalloc() for
dma->result_buf and devm_dma_request_chan() for the TX/RX DMA channels,
the explicit kfree() and dma_release_channel() calls don't appear to have
been removed from qce_dma_terminate().

Calling kfree() on a devm_kmalloc() pointer bypasses the devres allocation
header and corrupts the slab allocator. Likewise, calling
dma_release_channel() on devm-managed channels usually results in a
double-free when the devres framework later runs its own cleanup loop.

> +	if (ret)
> +		return ret;
> +
>  	return devm_qce_register_algs(qce);
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526-qcom-qce-cmd-descr-v19-0-08472fdcbf4a@oss.qualcomm.com?part=12

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v19 13/14] crypto: qce - Add BAM DMA support for crypto register I/O
  2026-05-26 13:11 ` [PATCH v19 13/14] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
@ 2026-05-26 17:13   ` sashiko-bot
  0 siblings, 0 replies; 24+ messages in thread
From: sashiko-bot @ 2026-05-26 17:13 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: vkoul, dmaengine, Frank.Li

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

New issues:
- [High] Host-endian register values are written directly to BAM DMA command descriptors without byte-swapping, breaking register writes on big-endian architectures.
- [Medium] Early return on error in `qce_dma_terminate_all()` leaks a DMA mapping and leaves the TX DMA channel running.

Pre-existing issues:
- [High] `qce_dma_terminate` explicitly frees devres-managed resources, causing double-frees and kernel panics.
--

commit 9259099a81bab443e790b41fd7e4fc53cf2bad0a
Author: Bartosz Golaszewski <brgl@kernel.org>

crypto: qce - Add BAM DMA support for crypto register I/O

This patch switches the driver to use BAM DMA for register I/O in addition
to passing data. It provides the necessary infrastructure to queue DMA command
descriptors and replaces direct register writes with DMA write wrappers.

> diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
> index c2602d35baa6a..769cc71da9007 100644
> --- a/drivers/crypto/qce/dma.c
> +++ b/drivers/crypto/qce/dma.c
> @@ -11,6 +13,96 @@

[ ... ]

> +static void qce_prep_dma_cmd_desc(struct qce_device *qce, struct qce_dma_data *dma,
> +				  unsigned int addr, void *buf)
> +{
> +	struct qce_bam_transaction *bam_txn = dma->bam_txn;
> +	struct bam_cmd_element *bam_ce_buf;
> +	int bam_ce_size, cnt, idx;
> +
> +	idx = bam_txn->bam_ce_idx;
> +	bam_ce_buf = &bam_txn->bam_ce[idx];
> +	bam_prep_ce_le32(bam_ce_buf, addr, BAM_WRITE_COMMAND, *((__le32 *)buf));

[Severity: High]
Does this cast to __le32 correctly handle endianness on big-endian
architectures?

Since buf points to a host-endian u32 passed from qce_write_dma(), casting
it directly bypasses byte-swapping. The prior writel() mechanism
automatically handled endianness conversion.

Could this result in reverse byte order being written to the BAM command
element on big-endian platforms?

[ ... ]

> @@ -42,6 +134,16 @@ int devm_qce_dma_request(struct qce_device *qce)
>  		return dev_err_probe(dev, PTR_ERR(dma->rxchan),
>  				     "Failed to get RX DMA channel\n");
>  
> +	dma->bam_txn = devm_kzalloc(dev, sizeof(*dma->bam_txn), GFP_KERNEL);
> +	if (!dma->bam_txn)
> +		return -ENOMEM;
> +
> +	dma->bam_txn->desc = devm_kzalloc(dev, sizeof(*dma->bam_txn->desc), GFP_KERNEL);
> +	if (!dma->bam_txn->desc)
> +		return -ENOMEM;
> +
> +	sg_init_table(dma->bam_txn->wr_sgl, QCE_BAM_CMD_SGL_SIZE);
> +
>  	return devm_add_action_or_reset(dev, qce_dma_terminate, dma);
>  }

[Severity: High]
This isn't a bug introduced by this patch, but looking at the
qce_dma_terminate callback that is being registered here:

static void qce_dma_terminate(void *data)
{
        ...
        dma_release_channel(dma->txchan);
        dma_release_channel(dma->rxchan);
        kfree(dma->result_buf);
}

Does explicitly freeing devres-managed resources here cause a double-free?

Passing a devres-managed pointer to kfree() can cause an immediate slab
allocator panic because it points past the internal devres header.
Additionally, manually releasing devm DMA channels will result in a
double-free when the kernel later cleans up the devres resources.

[ ... ]

> @@ -101,28 +203,36 @@ int qce_dma_prep_sgs(struct qce_dma_data *dma, struct scatterlist *rx_sg,

[ ... ]

>  int qce_dma_terminate_all(struct qce_dma_data *dma)
>  {
> +	struct qce_device *qce = container_of(dma, struct qce_device, dma);
> +	struct qce_bam_transaction *bam_txn = dma->bam_txn;
>  	int ret;
>  
>  	ret = dmaengine_terminate_all(dma->rxchan);
> -	return ret ?: dmaengine_terminate_all(dma->txchan);
> +	if (ret)
> +		return ret;
> +
> +	dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
> +
> +	return dmaengine_terminate_all(dma->txchan);
>  }

[Severity: Medium]
If dmaengine_terminate_all(dma->rxchan) returns an error, does this early
return leak the DMA mapping for bam_txn->wr_sgl?

It appears that bypassing the rest of the teardown sequence will leave the
mapping unfreed and potentially leave the TX DMA channel running.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526-qcom-qce-cmd-descr-v19-0-08472fdcbf4a@oss.qualcomm.com?part=13

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v19 14/14] crypto: qce - Communicate the base physical address to the dmaengine
  2026-05-26 13:11 ` [PATCH v19 14/14] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
@ 2026-05-26 18:20   ` sashiko-bot
  0 siblings, 0 replies; 24+ messages in thread
From: sashiko-bot @ 2026-05-26 18:20 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: Frank.Li, vkoul, dmaengine

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Orphaned DMA descriptor memory leak upon metadata attachment failure leading to channel exhaustion.
--

commit 433ea109d3dff3d196874c0f1de536092d9aa63c
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

crypto: qce - Communicate the base physical address to the dmaengine
    
This commit fills out a bam_desc_metadata struct containing the scratchpad
address and attaches it to the DMA descriptor via
dmaengine_desc_attach_metadata() prior to submission.

> diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
> --- a/drivers/crypto/qce/dma.c
> +++ b/drivers/crypto/qce/dma.c
[ ... ]
> @@ -60,6 +65,10 @@ int qce_submit_cmd_desc(struct qce_device *qce)
>  		goto err_unmap_sg;
>  	}
>  
> +	ret = dmaengine_desc_attach_metadata(dma_desc, &meta, sizeof(meta));
> +	if (ret)
> +		goto err_unmap_sg;
> +

[Severity: High]
Does this error path leak the DMA descriptor allocated by
dmaengine_prep_slave_sg()?

If dmaengine_desc_attach_metadata() fails, the code jumps to err_unmap_sg
without freeing or submitting dma_desc. When the error propagates to the
caller, the cleanup path bypasses dmaengine_terminate_all() because the
transaction was never submitted.

If this fails repeatedly, could it rapidly exhaust the DMA channel's
descriptor pool and permanently prevent future requests? Should the
unsubmitted descriptor be explicitly freed before returning the error?

>  	qce_desc->dma_desc = dma_desc;
>  	cookie = dmaengine_submit(qce_desc->dma_desc);
[ ... ]
>  err_unmap_sg:
>  	dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
>  	return ret;
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526-qcom-qce-cmd-descr-v19-0-08472fdcbf4a@oss.qualcomm.com?part=14

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2026-05-26 18:20 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-26 13:10 [PATCH v19 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path Bartosz Golaszewski
2026-05-26 13:45   ` sashiko-bot
2026-05-26 13:10 ` [PATCH v19 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
2026-05-26 14:17   ` sashiko-bot
2026-05-26 13:10 ` [PATCH v19 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
2026-05-26 15:01   ` sashiko-bot
2026-05-26 13:10 ` [PATCH v19 07/14] crypto: qce - Cancel work on device detach Bartosz Golaszewski
2026-05-26 15:33   ` sashiko-bot
2026-05-26 13:10 ` [PATCH v19 08/14] crypto: qce - Include algapi.h in the core.h header Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 09/14] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
2026-05-26 15:57   ` sashiko-bot
2026-05-26 13:10 ` [PATCH v19 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request() Bartosz Golaszewski
2026-05-26 13:10 ` [PATCH v19 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
2026-05-26 16:09   ` sashiko-bot
2026-05-26 13:11 ` [PATCH v19 12/14] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
2026-05-26 16:30   ` sashiko-bot
2026-05-26 13:11 ` [PATCH v19 13/14] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
2026-05-26 17:13   ` sashiko-bot
2026-05-26 13:11 ` [PATCH v19 14/14] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
2026-05-26 18:20   ` sashiko-bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox