* [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O
@ 2026-06-29 10:01 Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
` (13 more replies)
0 siblings, 14 replies; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
Dmitry Baryshkov, Konrad Dybcio
This iteration addresses the main concerns raised by Sashiko: abuse of
the DMA engine API with wrong usage of cookies and FIFO-full
rescheduling with locks enabled.
Merging strategy: there are build-time dependencies between the crypto
and DMA patches so the best approach is for Vinod to create an immutable
branch with the DMA part pulled in by the crypto tree.
Currently the QCE crypto driver accesses the crypto engine registers
directly via CPU. Trust Zone may perform crypto operations simultaneously
resulting in a race condition. To remedy that, let's introduce support
for BAM locking/unlocking to the driver. The BAM driver will now wrap
any existing issued descriptor chains with additional descriptors
performing the locking when the client starts the transaction
(dmaengine_issue_pending()). The client wanting to profit from locking
needs to switch to performing register I/O over DMA and communicate the
address to which to perform the dummy writes via a call to
dmaengine_desc_attach_metadata().
In the specific case of the BAM DMA this translates to sending command
descriptors performing dummy writes with the relevant flags set. The BAM
will then lock all other pipes not related to the current pipe group, and
keep handling the current pipe only until it sees the the unlock bit.
In order for the locking to work correctly, we also need to switch to
using DMA for all register I/O.
On top of this, the series contains some additional tweaks and
refactoring.
The goal of this is not to improve the performance but to prepare the
driver for supporting decryption into secure buffers in the future.
Tested with tcrypt.ko, kcapi and cryptsetup.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
Changes in v20:
- Don't use DMA cookies for LOCK/UNLOCK descriptors as this leads to
dmaengine state corruption
- Handle re-scheduling of a DMA transaction on full FIFO
- Fix DMA descriptor leak in qce_submit_cmd_desc()
- Link to v19: https://patch.msgid.link/20260526-qcom-qce-cmd-descr-v19-0-08472fdcbf4a@oss.qualcomm.com
Changes in v19:
- Fix more potential issues in remove path (sashiko)
- Remove unneeded return value check for vchan_tx_prep() as it can never
fail
- Link to v18: https://patch.msgid.link/20260522-qcom-qce-cmd-descr-v18-0-99103926bafc@oss.qualcomm.com
Changes in v18:
- Free the BAM interrupt before disabling the clock in remove() path too
- convert the size assigned to command descriptors to little endian
- don't pass DMA mapping attributes to dma_map_sg() in bam_dma when
setting up command descriptors
- Cancel the QCE workqueue *after* any outstanding DMA transfer
completes
- When mapping the scatterlist for command descriptors: use the actual
number of mapped segments for dmaengine_prep_slave_sg()
- Drop the leftover read_buf field from struct qce_device
- Unmap command descriptors only after terminating the RX transfer
- Pass the actual size of the metadata struct to
dmaengine_desc_attach_metadata(), this is not really required for our
use-case but let's do this for correctness and make sashiko happy
- Drop double assignment of bam_ce_idx in qce_clear_bam_transaction()
- Remove unused QCE_MAX_REG_READ
- Link to v17: https://patch.msgid.link/20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com
Changes in v17:
- New patch: free the interrupt before disabling the clock in error path
in probe()
- New patch: cancel the QCE work on device detach
- Hold the channel lock when attaching the metadata
- Reorder the operations in devm_qce_dma_request() to avoid freeing
memory that may still be used by the DMA channel
- Register algorithms as the last step in QCE's probe() to avoid making
the resources available to the system before the DMA is fully set up
- Fix error paths in algo request handlers
- Don't pass dmaengine attributes to map_sg_attrs() as it expects
dma-mapping attribute flags
- Fix a dma mapping leak for command descriptors
- Rebase on top of v7.1-rc4
- Link to v16: https://patch.msgid.link/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc@oss.qualcomm.com
Changes in v16:
- Fix a reported race between dma_map_sg() called with spinlock taken
and the corresponding dma_unmap_sg() called without it by moving the
descriptor locking data into the descriptor struct
- Also queue the TX data descriptors before the command descriptors to
match what downstream is doing
- Tweak commit messages
- Rebase on top of v7.1-rc1
- Link to v15: https://patch.msgid.link/20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com
Changes in v15:
- Extend the descriptor metadata struct to also carry the channel's
transfer direction and stop using dmaengine_slave_config() for that
- Link to v14: https://patch.msgid.link/20260323-qcom-qce-cmd-descr-v14-0-f323af411274@oss.qualcomm.com
Changes in v14:
- Don't return an error to a client which wants to use locking on BAM
that doesn't support it
- Add a comment describing the DMA descriptor metadata structure
- Fix memory leaks
- Remove leftovers from previous iterations
- Propagate errors from dma_cookie_assign() when setting up lock
descriptors
- Link to v13: https://patch.msgid.link/20260317-qcom-qce-cmd-descr-v13-0-0968eb4f8c40@oss.qualcomm.com
Changes in v13:
- As part of the DMA changes in the QCE driver: reverse the order of
queueing the descriptors in the QCE driver: queue command descriptors
with all the register writes first, followed by all the data descriptors,
this is in line with the recommandations from the BAM HPG
- Set the NWD (notify-when-done) bit (DMA_PREP_FENCE in dmaengine
parlance) on the data descriptors to ensure that the UNLOCK descriptor
will not be processed until after they have been processed by the
engine. While technically the NWD bit is only needed on the final data
descriptor, it's hard to tell which one *will* be the last from the
driver's point-of-view and both the downstream driver as well as
the Qualcomm TZ against which we want to synchronize sets NWD on every
data descriptor,
- Revert to creating the LOCK/UNLOCK command descriptor pair in one
place now that the NWD bit is in place,
- Link to v12: https://patch.msgid.link/20260310-qcom-qce-cmd-descr-v12-0-398f37f26ef0@oss.qualcomm.com
Changes in v12:
- Wait until the transaction is done before queueing the UNLOCK command
descriptor
- Use descriptor metadata for communicating the scratchpad address to
the BAM driver
- To that end: reverse the order of the series (first BAM, then QCE) to
maintain bisectability
- Unmap buffers used for dummy writes after the transaction
- Link to v11: https://patch.msgid.link/20260302-qcom-qce-cmd-descr-v11-0-4bf1f5db4802@oss.qualcomm.com
Changes in v11:
- Use new approach, not requiring the client to be involved in locking.
- Add a patch constifying dma_descriptor_metadata_ops
- Rebase on top of v7.0-rc1
- Link to v10: https://lore.kernel.org/r/20251219-qcom-qce-cmd-descr-v10-0-ff7e4bf7dad4@oss.qualcomm.com
Changes in v10:
- Move DESC_FLAG_(UN)LOCK BIT definitions from patch 2 to 3
- Add a patch constifying the dma engine metadata as the first in the
series
- Use the VERSION register for dummy lock/unlock writes
- Link to v9: https://lore.kernel.org/r/20251128-qcom-qce-cmd-descr-v9-0-9a5f72b89722@linaro.org
Changes in v9:
- Drop the global, generic LOCK/UNLOCK flags and instead use DMA
descriptor metadata ops to pass BAM-specific information from the QCE
to the DMA engine
- Link to v8: https://lore.kernel.org/r/20251106-qcom-qce-cmd-descr-v8-0-ecddca23ca26@linaro.org
Changes in v8:
- Rework the command descriptor logic and drop a lot of unneeded code
- Use the physical address for BAM command descriptor access, not the
mapped DMA address
- Fix the problems with iommu faults on newer platforms
- Generalize the LOCK/UNLOCK flags in dmaengine and reword the docs and
commit messages
- Make the BAM locking logic stricter in the DMA engine driver
- Add some additional minor QCE driver refactoring changes to the series
- Lots of small reworks and tweaks to rebase on current mainline and fix
previous issues
- Link to v7: https://lore.kernel.org/all/20250311-qce-cmd-descr-v7-0-db613f5d9c9f@linaro.org/
Changes in v7:
- remove unused code: writing to multiple registers was not used in v6,
neither were the functions for reading registers over BAM DMA-
- remove
- don't read the SW_VERSION register needlessly in the BAM driver,
instead: encode the information on whether the IP supports BAM locking
in device match data
- shrink code where possible with logic modifications (for instance:
change the implementation of qce_write() instead of replacing it
everywhere with a new symbol)
- remove duplicated error messages
- rework commit messages
- a lot of shuffling code around for easier review and a more
streamlined series
- Link to v6: https://lore.kernel.org/all/20250115103004.3350561-1-quic_mdalam@quicinc.com/
Changes in v6:
- change "BAM" to "DMA"
- Ensured this series is compilable with the current Linux-next tip of
the tree (TOT).
Changes in v5:
- Added DMA_PREP_LOCK and DMA_PREP_UNLOCK flag support in separate patch
- Removed DMA_PREP_LOCK & DMA_PREP_UNLOCK flag
- Added FIELD_GET and GENMASK macro to extract major and minor version
Changes in v4:
- Added feature description and test hardware
with test command
- Fixed patch version numbering
- Dropped dt-binding patch
- Dropped device tree changes
- Added BAM_SW_VERSION register read
- Handled the error path for the api dma_map_resource()
in probe
- updated the commit messages for batter redability
- Squash the change where qce_bam_acquire_lock() and
qce_bam_release_lock() api got introduce to the change where
the lock/unlock flag get introced
- changed cover letter subject heading to
"dmaengine: qcom: bam_dma: add cmd descriptor support"
- Added the very initial post for BAM lock/unlock patch link
as v1 to track this feature
Changes in v3:
- https://lore.kernel.org/lkml/183d4f5e-e00a-8ef6-a589-f5704bc83d4a@quicinc.com/
- Addressed all the comments from v2
- Added the dt-binding
- Fix alignment issue
- Removed type casting from qce_write_reg_dma()
and qce_read_reg_dma()
- Removed qce_bam_txn = dma->qce_bam_txn; line from
qce_alloc_bam_txn() api and directly returning
dma->qce_bam_txn
Changes in v2:
- https://lore.kernel.org/lkml/20231214114239.2635325-1-quic_mdalam@quicinc.com/
- Initial set of patches for cmd descriptor support
- Add client driver to use BAM lock/unlock feature
- Added register read/write via BAM in QCE Crypto driver
to use BAM lock/unlock feature
---
Bartosz Golaszewski (14):
dmaengine: constify struct dma_descriptor_metadata_ops
dmaengine: qcom: bam_dma: free interrupt before the clock in error path
dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue
dmaengine: qcom: bam_dma: Extend the driver's device match data
dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support
dmaengine: qcom: bam_dma: add support for BAM locking
crypto: qce - Cancel work on device detach
crypto: qce - Include algapi.h in the core.h header
crypto: qce - Remove unused ignore_buf
crypto: qce - Simplify arguments of devm_qce_dma_request()
crypto: qce - Use existing devres APIs in devm_qce_dma_request()
crypto: qce - Map crypto memory for DMA
crypto: qce - Add BAM DMA support for crypto register I/O
crypto: qce - Communicate the base physical address to the dmaengine
drivers/crypto/qce/aead.c | 10 +-
drivers/crypto/qce/common.c | 20 ++-
drivers/crypto/qce/core.c | 39 +++++-
drivers/crypto/qce/core.h | 7 +
drivers/crypto/qce/dma.c | 173 ++++++++++++++++++++-----
drivers/crypto/qce/dma.h | 11 +-
drivers/crypto/qce/sha.c | 10 +-
drivers/crypto/qce/skcipher.c | 10 +-
drivers/dma/qcom/bam_dma.c | 270 ++++++++++++++++++++++++++++++++++-----
drivers/dma/ti/k3-udma.c | 2 +-
drivers/dma/xilinx/xilinx_dma.c | 2 +-
include/linux/dma/qcom_bam_dma.h | 14 ++
include/linux/dmaengine.h | 2 +-
13 files changed, 468 insertions(+), 102 deletions(-)
---
base-commit: 80a4205b805dd461d0a5b5c684079bb120df96d0
change-id: 20251103-qcom-qce-cmd-descr-c5e9b11fe609
Best regards,
--
Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v20 01/14] dmaengine: constify struct dma_descriptor_metadata_ops
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 15:04 ` Pandey, Radhey Shyam
2026-06-29 10:01 ` [PATCH v20 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path Bartosz Golaszewski
` (12 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
There's no reason for the instances of this struct to be modifiable.
Constify the pointer in struct dma_async_tx_descriptor and all drivers
currently using it.
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/ti/k3-udma.c | 2 +-
drivers/dma/xilinx/xilinx_dma.c | 2 +-
include/linux/dmaengine.h | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
index 1cf158eb7bdb541c4e7f4f79f65ab70be4311fad..fb21e0df5ab7b20e4e16777b5ff7f61d2ae67b2b 100644
--- a/drivers/dma/ti/k3-udma.c
+++ b/drivers/dma/ti/k3-udma.c
@@ -3408,7 +3408,7 @@ static int udma_set_metadata_len(struct dma_async_tx_descriptor *desc,
return 0;
}
-static struct dma_descriptor_metadata_ops metadata_ops = {
+static const struct dma_descriptor_metadata_ops metadata_ops = {
.attach = udma_attach_metadata,
.get_ptr = udma_get_metadata_ptr,
.set_len = udma_set_metadata_len,
diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 404235c1735384635597e88edc25c67c7d250647..165b11a7c776abc6a8d66d631e19da669644577d 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -653,7 +653,7 @@ static void *xilinx_dma_get_metadata_ptr(struct dma_async_tx_descriptor *tx,
return seg->hw.app;
}
-static struct dma_descriptor_metadata_ops xilinx_dma_metadata_ops = {
+static const struct dma_descriptor_metadata_ops xilinx_dma_metadata_ops = {
.get_ptr = xilinx_dma_get_metadata_ptr,
};
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index b3d251c9734e95e1b75cf6763d4d2c3a1c6a9910..5244edb90e7e7510bf4460b6a74ee2a7f91c1ccc 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -623,7 +623,7 @@ struct dma_async_tx_descriptor {
void *callback_param;
struct dmaengine_unmap_data *unmap;
enum dma_desc_metadata_mode desc_metadata_mode;
- struct dma_descriptor_metadata_ops *metadata_ops;
+ const struct dma_descriptor_metadata_ops *metadata_ops;
#ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
struct dma_async_tx_descriptor *next;
struct dma_async_tx_descriptor *parent;
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:18 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
` (11 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
The BAM interrupt is requested with a devres helper and so on error it's
freed after probe() returns. We disable the clock before freeing or
masking it so it may still fire and we may end up reading BAM registers
with clock disabled.
Stop using devres for interrupts as we free it in remove() manually
anyway. Add an appropriate label and free the interrupt before disabling
the clock in error path and in remove().
Fixes: e7c0fe2a5c84 ("dmaengine: add Qualcomm BAM dma driver")
Closes: https://sashiko.dev/#/patchset/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc%40oss.qualcomm.com?part=2
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 1bb26af0405f3a16f97e0d4b86c945c252d97f57..fc155e0d1870cbb7e099a2c4280f9f8fbdf6cf15 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -1332,8 +1332,7 @@ static int bam_dma_probe(struct platform_device *pdev)
for (i = 0; i < bdev->num_channels; i++)
bam_channel_init(bdev, &bdev->channels[i], i);
- ret = devm_request_irq(bdev->dev, bdev->irq, bam_dma_irq,
- IRQF_TRIGGER_HIGH, "bam_dma", bdev);
+ ret = request_irq(bdev->irq, bam_dma_irq, IRQF_TRIGGER_HIGH, "bam_dma", bdev);
if (ret)
goto err_bam_channel_exit;
@@ -1366,7 +1365,7 @@ static int bam_dma_probe(struct platform_device *pdev)
ret = dma_async_device_register(&bdev->common);
if (ret) {
dev_err(bdev->dev, "failed to register dma async device\n");
- goto err_bam_channel_exit;
+ goto err_free_irq;
}
ret = of_dma_controller_register(pdev->dev.of_node, bam_dma_xlate,
@@ -1385,6 +1384,8 @@ static int bam_dma_probe(struct platform_device *pdev)
err_unregister_dma:
dma_async_device_unregister(&bdev->common);
+err_free_irq:
+ free_irq(bdev->irq, bdev);
err_bam_channel_exit:
for (i = 0; i < bdev->num_channels; i++)
tasklet_kill(&bdev->channels[i].vc.task);
@@ -1401,6 +1402,8 @@ static void bam_dma_remove(struct platform_device *pdev)
struct bam_device *bdev = platform_get_drvdata(pdev);
u32 i;
+ free_irq(bdev->irq, bdev);
+
pm_runtime_force_suspend(&pdev->dev);
of_dma_controller_free(pdev->dev.of_node);
@@ -1409,8 +1412,6 @@ static void bam_dma_remove(struct platform_device *pdev)
/* mask all interrupts for this execution environment */
writel_relaxed(0, bam_addr(bdev, 0, BAM_IRQ_SRCS_MSK_EE));
- devm_free_irq(bdev->dev, bdev->irq, bdev);
-
for (i = 0; i < bdev->num_channels; i++) {
bam_dma_terminate_all(&bdev->channels[i].vc.chan);
tasklet_kill(&bdev->channels[i].vc.task);
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:17 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
` (10 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
Dmitry Baryshkov
BH workqueues are a modern mechanism, aiming to replace legacy tasklets.
Let's convert the BAM DMA driver to using the high-priority variant of
the BH workqueue.
[Vinod: suggested using the BG workqueue instead of the regular one
running in process context]
Suggested-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index fc155e0d1870cbb7e099a2c4280f9f8fbdf6cf15..ea3df28e777f99c0532761b6aee6807ab23ab4ca 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -42,6 +42,7 @@
#include <linux/pm_runtime.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>
+#include <linux/workqueue.h>
#include "../dmaengine.h"
#include "../virt-dma.h"
@@ -426,8 +427,8 @@ struct bam_device {
struct clk *bamclk;
int irq;
- /* dma start transaction tasklet */
- struct tasklet_struct task;
+ /* dma start transaction workqueue */
+ struct work_struct work;
};
/**
@@ -892,7 +893,7 @@ static u32 process_channel_irqs(struct bam_device *bdev)
/*
* if complete, process cookie. Otherwise
* push back to front of desc_issued so that
- * it gets restarted by the tasklet
+ * it gets restarted by the work queue.
*/
if (!async_desc->num_desc) {
vchan_cookie_complete(&async_desc->vd);
@@ -922,9 +923,9 @@ static irqreturn_t bam_dma_irq(int irq, void *data)
srcs |= process_channel_irqs(bdev);
- /* kick off tasklet to start next dma transfer */
+ /* kick off the work queue to start next dma transfer */
if (srcs & P_IRQ)
- tasklet_schedule(&bdev->task);
+ queue_work(system_bh_highpri_wq, &bdev->work);
ret = pm_runtime_get_sync(bdev->dev);
if (ret < 0)
@@ -1120,14 +1121,14 @@ static void bam_start_dma(struct bam_chan *bchan)
}
/**
- * dma_tasklet - DMA IRQ tasklet
- * @t: tasklet argument (bam controller structure)
+ * bam_dma_work() - DMA interrupt work queue callback
+ * @work: work queue struct embedded in the BAM controller device struct
*
* Sets up next DMA operation and then processes all completed transactions
*/
-static void dma_tasklet(struct tasklet_struct *t)
+static void bam_dma_work(struct work_struct *work)
{
- struct bam_device *bdev = from_tasklet(bdev, t, task);
+ struct bam_device *bdev = from_work(bdev, work, work);
struct bam_chan *bchan;
unsigned int i;
@@ -1140,14 +1141,13 @@ static void dma_tasklet(struct tasklet_struct *t)
if (!list_empty(&bchan->vc.desc_issued) && !IS_BUSY(bchan))
bam_start_dma(bchan);
}
-
}
/**
* bam_issue_pending - starts pending transactions
* @chan: dma channel
*
- * Calls tasklet directly which in turn starts any pending transactions
+ * Calls work queue directly which in turn starts any pending transactions
*/
static void bam_issue_pending(struct dma_chan *chan)
{
@@ -1316,14 +1316,14 @@ static int bam_dma_probe(struct platform_device *pdev)
if (ret)
goto err_disable_clk;
- tasklet_setup(&bdev->task, dma_tasklet);
+ INIT_WORK(&bdev->work, bam_dma_work);
bdev->channels = devm_kcalloc(bdev->dev, bdev->num_channels,
sizeof(*bdev->channels), GFP_KERNEL);
if (!bdev->channels) {
ret = -ENOMEM;
- goto err_tasklet_kill;
+ goto err_workqueue_cancel;
}
/* allocate and initialize channels */
@@ -1389,8 +1389,8 @@ static int bam_dma_probe(struct platform_device *pdev)
err_bam_channel_exit:
for (i = 0; i < bdev->num_channels; i++)
tasklet_kill(&bdev->channels[i].vc.task);
-err_tasklet_kill:
- tasklet_kill(&bdev->task);
+err_workqueue_cancel:
+ cancel_work_sync(&bdev->work);
err_disable_clk:
clk_disable_unprepare(bdev->bamclk);
@@ -1424,7 +1424,7 @@ static void bam_dma_remove(struct platform_device *pdev)
bdev->channels[i].fifo_phys);
}
- tasklet_kill(&bdev->task);
+ cancel_work_sync(&bdev->work);
clk_disable_unprepare(bdev->bamclk);
}
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (2 preceding siblings ...)
2026-06-29 10:01 ` [PATCH v20 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
` (9 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
In preparation for supporting the pipe locking feature flag, extend the
amount of information we can carry in device match data: create a
separate structure and make the register information one of its fields.
This way, in subsequent patches, it will be just a matter of adding a
new field to the device data.
Reviewed-by: Dmitry Baryshkov <lumag@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 34 +++++++++++++++++++++++++++-------
1 file changed, 27 insertions(+), 7 deletions(-)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index ea3df28e777f99c0532761b6aee6807ab23ab4ca..8ce0fe085c5fea6cc614edd692b5cfd264b94d5a 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -113,6 +113,10 @@ struct reg_offset_data {
unsigned int pipe_mult, evnt_mult, ee_mult;
};
+struct bam_device_data {
+ const struct reg_offset_data *reg_info;
+};
+
static const struct reg_offset_data bam_v1_3_reg_info[] = {
[BAM_CTRL] = { 0x0F80, 0x00, 0x00, 0x00 },
[BAM_REVISION] = { 0x0F84, 0x00, 0x00, 0x00 },
@@ -142,6 +146,10 @@ static const struct reg_offset_data bam_v1_3_reg_info[] = {
[BAM_P_FIFO_SIZES] = { 0x1020, 0x00, 0x40, 0x00 },
};
+static const struct bam_device_data bam_v1_3_data = {
+ .reg_info = bam_v1_3_reg_info,
+};
+
static const struct reg_offset_data bam_v1_4_reg_info[] = {
[BAM_CTRL] = { 0x0000, 0x00, 0x00, 0x00 },
[BAM_REVISION] = { 0x0004, 0x00, 0x00, 0x00 },
@@ -171,6 +179,10 @@ static const struct reg_offset_data bam_v1_4_reg_info[] = {
[BAM_P_FIFO_SIZES] = { 0x1820, 0x00, 0x1000, 0x00 },
};
+static const struct bam_device_data bam_v1_4_data = {
+ .reg_info = bam_v1_4_reg_info,
+};
+
static const struct reg_offset_data bam_v1_7_reg_info[] = {
[BAM_CTRL] = { 0x00000, 0x00, 0x00, 0x00 },
[BAM_REVISION] = { 0x01000, 0x00, 0x00, 0x00 },
@@ -200,6 +212,10 @@ static const struct reg_offset_data bam_v1_7_reg_info[] = {
[BAM_P_FIFO_SIZES] = { 0x13820, 0x00, 0x1000, 0x00 },
};
+static const struct bam_device_data bam_v1_7_data = {
+ .reg_info = bam_v1_7_reg_info,
+};
+
static const struct reg_offset_data bam_v2_0_reg_info[] = {
[BAM_CTRL] = { 0x0000, 0x00, 0x00, 0x00 },
[BAM_REVISION] = { 0x1000, 0x00, 0x00, 0x00 },
@@ -229,6 +245,10 @@ static const struct reg_offset_data bam_v2_0_reg_info[] = {
[BAM_P_FIFO_SIZES] = { 0xC820, 0x00, 0x1000, 0x00 },
};
+static const struct bam_device_data bam_v2_0_data = {
+ .reg_info = bam_v2_0_reg_info,
+};
+
/* BAM CTRL */
#define BAM_SW_RST BIT(0)
#define BAM_EN BIT(1)
@@ -422,7 +442,7 @@ struct bam_device {
bool powered_remotely;
u32 active_channels;
- const struct reg_offset_data *layout;
+ const struct bam_device_data *dev_data;
struct clk *bamclk;
int irq;
@@ -440,7 +460,7 @@ struct bam_device {
static inline void __iomem *bam_addr(struct bam_device *bdev, u32 pipe,
enum bam_reg reg)
{
- const struct reg_offset_data r = bdev->layout[reg];
+ const struct reg_offset_data r = bdev->dev_data->reg_info[reg];
return bdev->regs + r.base_offset +
r.pipe_mult * pipe +
@@ -1234,10 +1254,10 @@ static void bam_channel_init(struct bam_device *bdev, struct bam_chan *bchan,
}
static const struct of_device_id bam_of_match[] = {
- { .compatible = "qcom,bam-v1.3.0", .data = &bam_v1_3_reg_info },
- { .compatible = "qcom,bam-v1.4.0", .data = &bam_v1_4_reg_info },
- { .compatible = "qcom,bam-v1.7.0", .data = &bam_v1_7_reg_info },
- { .compatible = "qcom,bam-v2.0.0", .data = &bam_v2_0_reg_info },
+ { .compatible = "qcom,bam-v1.3.0", .data = &bam_v1_3_data },
+ { .compatible = "qcom,bam-v1.4.0", .data = &bam_v1_4_data },
+ { .compatible = "qcom,bam-v1.7.0", .data = &bam_v1_7_data },
+ { .compatible = "qcom,bam-v2.0.0", .data = &bam_v2_0_data },
{}
};
@@ -1261,7 +1281,7 @@ static int bam_dma_probe(struct platform_device *pdev)
return -ENODEV;
}
- bdev->layout = match->data;
+ bdev->dev_data = match->data;
bdev->regs = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(bdev->regs))
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (3 preceding siblings ...)
2026-06-29 10:01 ` [PATCH v20 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:18 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
` (8 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
Dmitry Baryshkov
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Extend the device match data with a flag indicating whether the IP
supports the BAM lock/unlock feature. Set it to true on BAM IP versions
1.4.0 and above.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 8ce0fe085c5fea6cc614edd692b5cfd264b94d5a..f3e713a5259c2c7c24cfdcec094814eb1202971a 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -115,6 +115,7 @@ struct reg_offset_data {
struct bam_device_data {
const struct reg_offset_data *reg_info;
+ bool pipe_lock_supported;
};
static const struct reg_offset_data bam_v1_3_reg_info[] = {
@@ -181,6 +182,7 @@ static const struct reg_offset_data bam_v1_4_reg_info[] = {
static const struct bam_device_data bam_v1_4_data = {
.reg_info = bam_v1_4_reg_info,
+ .pipe_lock_supported = true,
};
static const struct reg_offset_data bam_v1_7_reg_info[] = {
@@ -214,6 +216,7 @@ static const struct reg_offset_data bam_v1_7_reg_info[] = {
static const struct bam_device_data bam_v1_7_data = {
.reg_info = bam_v1_7_reg_info,
+ .pipe_lock_supported = true,
};
static const struct reg_offset_data bam_v2_0_reg_info[] = {
@@ -247,6 +250,7 @@ static const struct reg_offset_data bam_v2_0_reg_info[] = {
static const struct bam_device_data bam_v2_0_data = {
.reg_info = bam_v2_0_reg_info,
+ .pipe_lock_supported = true,
};
/* BAM CTRL */
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 06/14] dmaengine: qcom: bam_dma: add support for BAM locking
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (4 preceding siblings ...)
2026-06-29 10:01 ` [PATCH v20 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:16 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 07/14] crypto: qce - Cancel work on device detach Bartosz Golaszewski
` (7 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
Add support for BAM pipe locking. To that end: when starting DMA on an RX
channel - prepend the existing queue of issued descriptors with an
additional "dummy" command descriptor with the LOCK bit set. Once the
transaction is done (no more issued descriptors), issue one more dummy
descriptor with the UNLOCK bit.
We *must* wait until the transaction is signalled as done because we
must not perform any writes into config registers while the engine is
busy.
The dummy writes must be issued into a scratchpad register of the client
so provide a mechanism to communicate the right address via descriptor
metadata.
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 189 +++++++++++++++++++++++++++++++++++++--
include/linux/dma/qcom_bam_dma.h | 14 +++
2 files changed, 196 insertions(+), 7 deletions(-)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index f3e713a5259c2c7c24cfdcec094814eb1202971a..f4f258994264a234f60debd3e66e31a6b35d1dc5 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -28,11 +28,13 @@
#include <linux/clk.h>
#include <linux/device.h>
#include <linux/dma-mapping.h>
+#include <linux/dma/qcom_bam_dma.h>
#include <linux/dmaengine.h>
#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/kernel.h>
+#include <linux/lockdep.h>
#include <linux/module.h>
#include <linux/of_address.h>
#include <linux/of_dma.h>
@@ -60,6 +62,8 @@ struct bam_desc_hw {
#define DESC_FLAG_EOB BIT(13)
#define DESC_FLAG_NWD BIT(12)
#define DESC_FLAG_CMD BIT(11)
+#define DESC_FLAG_LOCK BIT(10)
+#define DESC_FLAG_UNLOCK BIT(9)
struct bam_async_desc {
struct virt_dma_desc vd;
@@ -72,6 +76,10 @@ struct bam_async_desc {
struct bam_desc_hw *curr_desc;
+ /* BAM locking infrastructure */
+ struct scatterlist lock_sg;
+ struct bam_cmd_element lock_ce;
+
/* list node for the desc in the bam_chan list of descriptors */
struct list_head desc_node;
enum dma_transfer_direction dir;
@@ -425,6 +433,11 @@ struct bam_chan {
struct list_head desc_list;
struct list_head node;
+
+ /* BAM locking infrastructure */
+ phys_addr_t scratchpad_addr;
+ enum dma_transfer_direction direction;
+ bool bam_locked;
};
static inline struct bam_chan *to_bam_chan(struct dma_chan *common)
@@ -638,8 +651,10 @@ static void bam_free_chan(struct dma_chan *chan)
goto err;
}
- scoped_guard(spinlock_irqsave, &bchan->vc.lock)
+ scoped_guard(spinlock_irqsave, &bchan->vc.lock) {
bam_reset_channel(bchan);
+ bchan->bam_locked = false;
+ }
dma_free_wc(bdev->dev, BAM_DESC_FIFO_SIZE, bchan->fifo_virt,
bchan->fifo_phys);
@@ -686,6 +701,35 @@ static int bam_slave_config(struct dma_chan *chan,
return 0;
}
+static int bam_metadata_attach(struct dma_async_tx_descriptor *desc, void *data, size_t len)
+{
+ struct bam_chan *bchan = to_bam_chan(desc->chan);
+ const struct bam_device_data *bdata = bchan->bdev->dev_data;
+ struct bam_desc_metadata *metadata = data;
+
+ if (!data)
+ return -EINVAL;
+
+ if (!bdata->pipe_lock_supported)
+ /*
+ * The client wants to use locking but this BAM version doesn't
+ * support it. Don't return an error here as this will stop the
+ * client from using DMA at all for no reason.
+ */
+ return 0;
+
+ guard(spinlock_irqsave)(&bchan->vc.lock);
+
+ bchan->scratchpad_addr = metadata->scratchpad_addr;
+ bchan->direction = metadata->direction;
+
+ return 0;
+}
+
+static const struct dma_descriptor_metadata_ops bam_metadata_ops = {
+ .attach = bam_metadata_attach,
+};
+
/**
* bam_prep_slave_sg - Prep slave sg transaction
*
@@ -702,6 +746,7 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
void *context)
{
struct bam_chan *bchan = to_bam_chan(chan);
+ struct dma_async_tx_descriptor *tx_desc;
struct bam_device *bdev = bchan->bdev;
struct bam_async_desc *async_desc;
struct scatterlist *sg;
@@ -757,7 +802,10 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
} while (remainder > 0);
}
- return vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
+ tx_desc = vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
+ tx_desc->metadata_ops = &bam_metadata_ops;
+
+ return tx_desc;
}
/**
@@ -802,6 +850,7 @@ static int bam_dma_terminate_all(struct dma_chan *chan)
}
vchan_get_all_descriptors(&bchan->vc, &head);
+ bchan->bam_locked = false;
}
vchan_dma_desc_free_list(&bchan->vc, &head);
@@ -859,6 +908,15 @@ static int bam_resume(struct dma_chan *chan)
return 0;
}
+static void bam_dma_free_lock_desc(struct virt_dma_desc *vd)
+{
+ struct bam_async_desc *async_desc = container_of(vd, struct bam_async_desc, vd);
+ struct dma_chan *chan = vd->tx.chan;
+
+ dma_unmap_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);
+ kfree(async_desc);
+}
+
/**
* process_channel_irqs - processes the channel interrupts
* @bdev: bam controller
@@ -919,13 +977,23 @@ static u32 process_channel_irqs(struct bam_device *bdev)
* push back to front of desc_issued so that
* it gets restarted by the work queue.
*/
+
+ list_del(&async_desc->desc_node);
if (!async_desc->num_desc) {
- vchan_cookie_complete(&async_desc->vd);
+ struct bam_desc_hw *hdesc = async_desc->desc;
+ u16 flags = le16_to_cpu(hdesc->flags);
+
+ if (flags & (DESC_FLAG_LOCK | DESC_FLAG_UNLOCK)) {
+ if (flags & DESC_FLAG_UNLOCK)
+ bchan->bam_locked = false;
+ bam_dma_free_lock_desc(&async_desc->vd);
+ } else {
+ vchan_cookie_complete(&async_desc->vd);
+ }
} else {
list_add(&async_desc->vd.node,
&bchan->vc.desc_issued);
}
- list_del(&async_desc->desc_node);
}
}
@@ -1046,13 +1114,101 @@ static void bam_apply_new_config(struct bam_chan *bchan,
bchan->reconfigure = 0;
}
+static struct bam_async_desc *
+bam_make_lock_desc(struct bam_chan *bchan, unsigned long flag)
+{
+ struct dma_chan *chan = &bchan->vc.chan;
+ struct bam_async_desc *async_desc;
+ struct bam_desc_hw *desc;
+ struct virt_dma_desc *vd;
+ struct virt_dma_chan *vc;
+ unsigned int mapped;
+
+ async_desc = kzalloc_flex(*async_desc, desc, 1, GFP_NOWAIT);
+ if (!async_desc) {
+ dev_err(bchan->bdev->dev, "failed to allocate the BAM lock descriptor\n");
+ return ERR_PTR(-ENOMEM);
+ }
+
+ sg_init_table(&async_desc->lock_sg, 1);
+
+ async_desc->num_desc = 1;
+ async_desc->curr_desc = async_desc->desc;
+ async_desc->dir = DMA_MEM_TO_DEV;
+
+ desc = async_desc->desc;
+
+ bam_prep_ce_le32(&async_desc->lock_ce, bchan->scratchpad_addr, BAM_WRITE_COMMAND, 0);
+ sg_set_buf(&async_desc->lock_sg, &async_desc->lock_ce, sizeof(async_desc->lock_ce));
+
+ mapped = dma_map_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);
+ if (!mapped) {
+ kfree(async_desc);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ desc->flags |= cpu_to_le16(DESC_FLAG_CMD | flag);
+ desc->addr = sg_dma_address(&async_desc->lock_sg);
+ desc->size = cpu_to_le16(sizeof(struct bam_cmd_element));
+
+ vc = &bchan->vc;
+ vd = &async_desc->vd;
+
+ dma_async_tx_descriptor_init(&vd->tx, &vc->chan);
+ vd->tx.flags = DMA_PREP_CMD;
+ vd->tx_result.result = DMA_TRANS_NOERROR;
+ vd->tx_result.residue = 0;
+
+ return async_desc;
+}
+
+static int bam_setup_pipe_lock(struct bam_chan *bchan)
+{
+ const struct bam_device_data *bdata = bchan->bdev->dev_data;
+ struct bam_async_desc *lock_desc, *unlock_desc;
+
+ lockdep_assert_held(&bchan->vc.lock);
+
+ if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr ||
+ bchan->direction != DMA_MEM_TO_DEV)
+ return 0;
+
+ /*
+ * Allocate both the LOCK and the UNLOCK descriptors up-front so the
+ * operation is all-or-nothing: if either allocation fails we free both
+ * and run the sequence unlocked rather than leave the pipe locked with
+ * no matching UNLOCK.
+ *
+ * Both are queued in-band around the currently issued work: the LOCK is
+ * prepended so it enters the FIFO first, the UNLOCK is appended so it is
+ * the last descriptor of the sequence. They are loaded together with the
+ * payload in a single operation so the engine executes LOCK, the work
+ * and UNLOCK as one ordered batch.
+ */
+ lock_desc = bam_make_lock_desc(bchan, DESC_FLAG_LOCK);
+ if (IS_ERR(lock_desc))
+ return PTR_ERR(lock_desc);
+
+ unlock_desc = bam_make_lock_desc(bchan, DESC_FLAG_UNLOCK);
+ if (IS_ERR(unlock_desc)) {
+ bam_dma_free_lock_desc(&lock_desc->vd);
+ return PTR_ERR(unlock_desc);
+ }
+
+ list_add(&lock_desc->vd.node, &bchan->vc.desc_issued);
+ list_add_tail(&unlock_desc->vd.node, &bchan->vc.desc_issued);
+ bchan->bam_locked = true;
+
+ return 0;
+}
+
/**
* bam_start_dma - start next transaction
* @bchan: bam dma channel
*/
static void bam_start_dma(struct bam_chan *bchan)
{
- struct virt_dma_desc *vd = vchan_next_desc(&bchan->vc);
+ struct virt_dma_desc *vd;
struct bam_device *bdev = bchan->bdev;
struct bam_async_desc *async_desc = NULL;
struct bam_desc_hw *desc;
@@ -1064,9 +1220,23 @@ static void bam_start_dma(struct bam_chan *bchan)
lockdep_assert_held(&bchan->vc.lock);
+ vd = vchan_next_desc(&bchan->vc);
if (!vd)
return;
+ /*
+ * Wrap the issued work with a LOCK/UNLOCK pair exactly once, at the
+ * start of a fresh sequence and only when there is real work to lock
+ * around. On a re-entry after a full FIFO, we see the BAM is locked
+ * and must not add another pair we simply continue loading the
+ * remainder of the same locked sequence.
+ */
+ if (!bchan->bam_locked) {
+ ret = bam_setup_pipe_lock(bchan);
+ if (ret == 0 && bchan->bam_locked)
+ vd = vchan_next_desc(&bchan->vc);
+ }
+
ret = pm_runtime_get_sync(bdev->dev);
if (ret < 0)
return;
@@ -1191,8 +1361,12 @@ static void bam_issue_pending(struct dma_chan *chan)
*/
static void bam_dma_free_desc(struct virt_dma_desc *vd)
{
- struct bam_async_desc *async_desc = container_of(vd,
- struct bam_async_desc, vd);
+ struct bam_async_desc *async_desc = container_of(vd, struct bam_async_desc, vd);
+ struct bam_desc_hw *desc = async_desc->desc;
+ struct dma_chan *chan = vd->tx.chan;
+
+ if (le16_to_cpu(desc->flags) & (DESC_FLAG_LOCK | DESC_FLAG_UNLOCK))
+ dma_unmap_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);
kfree(async_desc);
}
@@ -1384,6 +1558,7 @@ static int bam_dma_probe(struct platform_device *pdev)
bdev->common.device_terminate_all = bam_dma_terminate_all;
bdev->common.device_issue_pending = bam_issue_pending;
bdev->common.device_tx_status = bam_tx_status;
+ bdev->common.desc_metadata_modes = DESC_METADATA_CLIENT;
bdev->common.dev = bdev->dev;
ret = dma_async_device_register(&bdev->common);
diff --git a/include/linux/dma/qcom_bam_dma.h b/include/linux/dma/qcom_bam_dma.h
index 68fc0e643b1b97fe4520d5878daa322b81f4f559..a2594264b0f58c4b2b1c85e243cad0d5669c26dc 100644
--- a/include/linux/dma/qcom_bam_dma.h
+++ b/include/linux/dma/qcom_bam_dma.h
@@ -6,6 +6,8 @@
#ifndef _QCOM_BAM_DMA_H
#define _QCOM_BAM_DMA_H
+#include <linux/dmaengine.h>
+
#include <asm/byteorder.h>
/*
@@ -34,6 +36,18 @@ enum bam_command_type {
BAM_READ_COMMAND,
};
+/**
+ * struct bam_desc_metadata - DMA descriptor metadata specific to the BAM driver.
+ *
+ * @scratchpad_addr: Physical address to use for dummy write operations when
+ * queuing command descriptors with LOCK/UNLOCK bits set.
+ * @direction: Transfer direction of this channel.
+ */
+struct bam_desc_metadata {
+ phys_addr_t scratchpad_addr;
+ enum dma_transfer_direction direction;
+};
+
/*
* prep_bam_ce_le32 - Wrapper function to prepare a single BAM command
* element with the data already in le32 format.
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 07/14] crypto: qce - Cancel work on device detach
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (5 preceding siblings ...)
2026-06-29 10:01 ` [PATCH v20 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:15 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 08/14] crypto: qce - Include algapi.h in the core.h header Bartosz Golaszewski
` (6 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
The workqueue is setup in probe() but never cancelled on error or in
remove(). Set up a devres action to clean it up. We need to move the
initialization earlier as we don't want to cancel the work before any
outstanding DMA transfer is terminated. Make sure we do terminate all
transfers in qce_dma_release() devres action.
Fixes: eb7986e5e14d ("crypto: qce - convert tasklet to workqueue")
Closes: https://sashiko.dev/#/patchset/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc%40oss.qualcomm.com?part=7
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/core.c | 13 ++++++++++++-
drivers/crypto/qce/dma.c | 2 ++
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index b966f3365b7de8d2a8f6707397a34aa4facdc4ac..f671946cf7351cd5f0c319909bafd87e3af701c7 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -186,6 +186,13 @@ static int qce_check_version(struct qce_device *qce)
return 0;
}
+static void qce_cancel_work(void *data)
+{
+ struct work_struct *work = data;
+
+ cancel_work_sync(work);
+}
+
static int qce_crypto_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
@@ -227,6 +234,11 @@ static int qce_crypto_probe(struct platform_device *pdev)
if (ret)
return ret;
+ INIT_WORK(&qce->done_work, qce_req_done_work);
+ ret = devm_add_action_or_reset(dev, qce_cancel_work, &qce->done_work);
+ if (ret)
+ return ret;
+
ret = devm_qce_dma_request(qce->dev, &qce->dma);
if (ret)
return ret;
@@ -239,7 +251,6 @@ static int qce_crypto_probe(struct platform_device *pdev)
if (ret)
return ret;
- INIT_WORK(&qce->done_work, qce_req_done_work);
crypto_init_queue(&qce->queue, QCE_QUEUE_LENGTH);
qce->async_req_enqueue = qce_async_request_enqueue;
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 68cafd4741ad3d91906d39e817fc7873b028d498..7ec9d72fd690fb17e03ade7efe3cc522fb47e1ac 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -13,6 +13,8 @@ static void qce_dma_release(void *data)
{
struct qce_dma_data *dma = data;
+ dmaengine_terminate_sync(dma->txchan);
+ dmaengine_terminate_sync(dma->rxchan);
dma_release_channel(dma->txchan);
dma_release_channel(dma->rxchan);
kfree(dma->result_buf);
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 08/14] crypto: qce - Include algapi.h in the core.h header
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (6 preceding siblings ...)
2026-06-29 10:01 ` [PATCH v20 07/14] crypto: qce - Cancel work on device detach Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 09/14] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
` (5 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
The header defines a struct embedding struct crypto_queue whose size
needs to be known and which is defined in crypto/algapi.h. Move the
inclusion from core.c to core.h.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/core.c | 1 -
drivers/crypto/qce/core.h | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index f671946cf7351cd5f0c319909bafd87e3af701c7..ad37c2b8ae53a373bb248aff06c3b7946e8439a8 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -13,7 +13,6 @@
#include <linux/mod_devicetable.h>
#include <linux/platform_device.h>
#include <linux/types.h>
-#include <crypto/algapi.h>
#include <crypto/internal/hash.h>
#include "core.h"
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index eb6fa7a8b64a81daf9ad5304a3ae4e5e597a70b8..f092ce2d3b04a936a37805c20ac5ba78d8fdd2df 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -8,6 +8,7 @@
#include <linux/mutex.h>
#include <linux/workqueue.h>
+#include <crypto/algapi.h>
#include "dma.h"
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 09/14] crypto: qce - Remove unused ignore_buf
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (7 preceding siblings ...)
2026-06-29 10:01 ` [PATCH v20 08/14] crypto: qce - Include algapi.h in the core.h header Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request() Bartosz Golaszewski
` (4 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
It's unclear what the purpose of this field is. It has been here since
the initial commit but without any explanation. The driver works fine
without it. We still keep allocating more space in the result buffer, we
just don't need to store its address. While at it: move the
QCE_IGNORE_BUF_SZ definition into dma.c as it's not used outside of this
compilation unit.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/dma.c | 4 ++--
drivers/crypto/qce/dma.h | 2 --
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 7ec9d72fd690fb17e03ade7efe3cc522fb47e1ac..d1daa229361aa74da5d3d7bfe1bc8ab189761e38 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -9,6 +9,8 @@
#include "dma.h"
+#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
+
static void qce_dma_release(void *data)
{
struct qce_dma_data *dma = data;
@@ -43,8 +45,6 @@ int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma)
goto error_nomem;
}
- dma->ignore_buf = dma->result_buf + QCE_RESULT_BUF_SZ;
-
return devm_add_action_or_reset(dev, qce_dma_release, dma);
error_nomem:
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index 31629185000e12242fa07c2cc08b95fcbd5d4b8c..fc337c435cd14917bdfb99febcf9119275afdeba 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -23,7 +23,6 @@ struct qce_result_dump {
u32 status2;
};
-#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
#define QCE_RESULT_BUF_SZ \
ALIGN(sizeof(struct qce_result_dump), QCE_BAM_BURST_SIZE)
@@ -31,7 +30,6 @@ struct qce_dma_data {
struct dma_chan *txchan;
struct dma_chan *rxchan;
struct qce_result_dump *result_buf;
- void *ignore_buf;
};
int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma);
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request()
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (8 preceding siblings ...)
2026-06-29 10:01 ` [PATCH v20 09/14] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
` (3 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
This function can extract all the information it needs from struct
qce_device alone so simplify its arguments. This is done in preparation
for adding support for register I/O over DMA which will require
accessing even more fields from struct qce_device.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/core.c | 2 +-
drivers/crypto/qce/dma.c | 5 ++++-
drivers/crypto/qce/dma.h | 4 +++-
3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index ad37c2b8ae53a373bb248aff06c3b7946e8439a8..a0e2eadc3afd5f83e46724c8bc3e3690146b86ba 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -238,7 +238,7 @@ static int qce_crypto_probe(struct platform_device *pdev)
if (ret)
return ret;
- ret = devm_qce_dma_request(qce->dev, &qce->dma);
+ ret = devm_qce_dma_request(qce);
if (ret)
return ret;
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index d1daa229361aa74da5d3d7bfe1bc8ab189761e38..d60efb5c26d88f8b0259b1dccc8724d0f75571c6 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -7,6 +7,7 @@
#include <linux/dmaengine.h>
#include <crypto/scatterwalk.h>
+#include "core.h"
#include "dma.h"
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
@@ -22,8 +23,10 @@ static void qce_dma_release(void *data)
kfree(dma->result_buf);
}
-int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma)
+int devm_qce_dma_request(struct qce_device *qce)
{
+ struct qce_dma_data *dma = &qce->dma;
+ struct device *dev = qce->dev;
int ret;
dma->txchan = dma_request_chan(dev, "tx");
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index fc337c435cd14917bdfb99febcf9119275afdeba..483789d9fa98e79d1283de8297bf2fc2a773f3a7 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -8,6 +8,8 @@
#include <linux/dmaengine.h>
+struct qce_device;
+
/* maximum data transfer block size between BAM and CE */
#define QCE_BAM_BURST_SIZE 64
@@ -32,7 +34,7 @@ struct qce_dma_data {
struct qce_result_dump *result_buf;
};
-int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma);
+int devm_qce_dma_request(struct qce_device *qce);
int qce_dma_prep_sgs(struct qce_dma_data *dma, struct scatterlist *sg_in,
int in_ents, struct scatterlist *sg_out, int out_ents,
dma_async_tx_callback cb, void *cb_param);
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request()
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (9 preceding siblings ...)
2026-06-29 10:01 ` [PATCH v20 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request() Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:17 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 12/14] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
` (2 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
Konrad Dybcio
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Switch to devm_kmalloc() and devm_dma_alloc_chan() in
devm_qce_dma_request(). This allows us to drop two labels and shrink the
function.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/dma.c | 37 +++++++++++--------------------------
1 file changed, 11 insertions(+), 26 deletions(-)
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index d60efb5c26d88f8b0259b1dccc8724d0f75571c6..26347e9fc078adede712722107e74958538accdf 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -12,49 +12,34 @@
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
-static void qce_dma_release(void *data)
+static void qce_dma_terminate(void *data)
{
struct qce_dma_data *dma = data;
dmaengine_terminate_sync(dma->txchan);
dmaengine_terminate_sync(dma->rxchan);
- dma_release_channel(dma->txchan);
- dma_release_channel(dma->rxchan);
- kfree(dma->result_buf);
}
int devm_qce_dma_request(struct qce_device *qce)
{
struct qce_dma_data *dma = &qce->dma;
struct device *dev = qce->dev;
- int ret;
- dma->txchan = dma_request_chan(dev, "tx");
+ dma->result_buf = devm_kmalloc(dev, QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ, GFP_KERNEL);
+ if (!dma->result_buf)
+ return -ENOMEM;
+
+ dma->txchan = devm_dma_request_chan(dev, "tx");
if (IS_ERR(dma->txchan))
return dev_err_probe(dev, PTR_ERR(dma->txchan),
"Failed to get TX DMA channel\n");
- dma->rxchan = dma_request_chan(dev, "rx");
- if (IS_ERR(dma->rxchan)) {
- ret = dev_err_probe(dev, PTR_ERR(dma->rxchan),
- "Failed to get RX DMA channel\n");
- goto error_rx;
- }
-
- dma->result_buf = kmalloc(QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ,
- GFP_KERNEL);
- if (!dma->result_buf) {
- ret = -ENOMEM;
- goto error_nomem;
- }
-
- return devm_add_action_or_reset(dev, qce_dma_release, dma);
+ dma->rxchan = devm_dma_request_chan(dev, "rx");
+ if (IS_ERR(dma->rxchan))
+ return dev_err_probe(dev, PTR_ERR(dma->rxchan),
+ "Failed to get RX DMA channel\n");
-error_nomem:
- dma_release_channel(dma->rxchan);
-error_rx:
- dma_release_channel(dma->txchan);
- return ret;
+ return devm_add_action_or_reset(dev, qce_dma_terminate, dma);
}
struct scatterlist *
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 12/14] crypto: qce - Map crypto memory for DMA
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (10 preceding siblings ...)
2026-06-29 10:01 ` [PATCH v20 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:14 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 13/14] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 14/14] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
13 siblings, 1 reply; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
As the first step in converting the driver to using DMA for register
I/O, let's map the crypto memory range.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/core.c | 23 ++++++++++++++++++++++-
drivers/crypto/qce/core.h | 6 ++++++
2 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index a0e2eadc3afd5f83e46724c8bc3e3690146b86ba..d7b7a3dda464964afe6a6893bb329d5bd5759dcd 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -192,10 +192,19 @@ static void qce_cancel_work(void *data)
cancel_work_sync(work);
}
+static void qce_crypto_unmap_dma(void *data)
+{
+ struct qce_device *qce = data;
+
+ dma_unmap_resource(qce->dev, qce->base_dma, qce->dma_size,
+ DMA_BIDIRECTIONAL, 0);
+}
+
static int qce_crypto_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
struct qce_device *qce;
+ struct resource *res;
int ret;
qce = devm_kzalloc(dev, sizeof(*qce), GFP_KERNEL);
@@ -205,7 +214,7 @@ static int qce_crypto_probe(struct platform_device *pdev)
qce->dev = dev;
platform_set_drvdata(pdev, qce);
- qce->base = devm_platform_ioremap_resource(pdev, 0);
+ qce->base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
if (IS_ERR(qce->base))
return PTR_ERR(qce->base);
@@ -255,6 +264,18 @@ static int qce_crypto_probe(struct platform_device *pdev)
qce->async_req_enqueue = qce_async_request_enqueue;
qce->async_req_done = qce_async_request_done;
+ qce->dma_size = resource_size(res);
+ qce->base_dma = dma_map_resource(dev, res->start, qce->dma_size,
+ DMA_BIDIRECTIONAL, 0);
+ qce->base_phys = res->start;
+ ret = dma_mapping_error(dev, qce->base_dma);
+ if (ret)
+ return ret;
+
+ ret = devm_add_action_or_reset(qce->dev, qce_crypto_unmap_dma, qce);
+ if (ret)
+ return ret;
+
return devm_qce_register_algs(qce);
}
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index f092ce2d3b04a936a37805c20ac5ba78d8fdd2df..a80e12eac6c87e5321cce16c56a4bf5003474ef0 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -27,6 +27,9 @@
* @dma: pointer to dma data
* @burst_size: the crypto burst size
* @pipe_pair_id: which pipe pair id the device using
+ * @base_dma: base DMA address
+ * @base_phys: base physical address
+ * @dma_size: size of memory mapped for DMA
* @async_req_enqueue: invoked by every algorithm to enqueue a request
* @async_req_done: invoked by every algorithm to finish its request
*/
@@ -43,6 +46,9 @@ struct qce_device {
struct qce_dma_data dma;
int burst_size;
unsigned int pipe_pair_id;
+ dma_addr_t base_dma;
+ phys_addr_t base_phys;
+ size_t dma_size;
int (*async_req_enqueue)(struct qce_device *qce,
struct crypto_async_request *req);
void (*async_req_done)(struct qce_device *qce, int ret);
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 13/14] crypto: qce - Add BAM DMA support for crypto register I/O
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (11 preceding siblings ...)
2026-06-29 10:01 ` [PATCH v20 12/14] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:22 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 14/14] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
13 siblings, 1 reply; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Switch to using BAM DMA for register I/O in addition to passing data. To
that end: provide the necessary infrastructure in the driver, modify the
ordering of operations as required and replace all direct register writes
with wrappers queueing DMA command descriptors.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/aead.c | 10 ++--
drivers/crypto/qce/common.c | 20 ++++---
drivers/crypto/qce/dma.c | 120 ++++++++++++++++++++++++++++++++++++++++--
drivers/crypto/qce/dma.h | 5 ++
drivers/crypto/qce/sha.c | 10 ++--
drivers/crypto/qce/skcipher.c | 10 ++--
6 files changed, 144 insertions(+), 31 deletions(-)
diff --git a/drivers/crypto/qce/aead.c b/drivers/crypto/qce/aead.c
index 1461a08e6c58b00e60aa35515f3392c096726f6a..544a3cf8709248a5f3eb2b669e30b09183d3a69d 100644
--- a/drivers/crypto/qce/aead.c
+++ b/drivers/crypto/qce/aead.c
@@ -463,17 +463,17 @@ qce_aead_async_req_handle(struct crypto_async_request *async_req)
src_nents = dst_nents - 1;
}
- ret = qce_dma_prep_sgs(&qce->dma, rctx->src_sg, src_nents, rctx->dst_sg, dst_nents,
- qce_aead_done, async_req);
+ ret = qce_start(async_req, tmpl->crypto_alg_type);
if (ret)
goto error_unmap_src;
- qce_dma_issue_pending(&qce->dma);
-
- ret = qce_start(async_req, tmpl->crypto_alg_type);
+ ret = qce_dma_prep_sgs(&qce->dma, rctx->src_sg, src_nents, rctx->dst_sg, dst_nents,
+ qce_aead_done, async_req);
if (ret)
goto error_terminate;
+ qce_dma_issue_pending(&qce->dma);
+
return 0;
error_terminate:
diff --git a/drivers/crypto/qce/common.c b/drivers/crypto/qce/common.c
index 54a78a57f63028f01870a3edeb8e390f523bb190..37bb6f03244d317a887aeb0aa10cefe327b4ce05 100644
--- a/drivers/crypto/qce/common.c
+++ b/drivers/crypto/qce/common.c
@@ -25,7 +25,7 @@ static inline u32 qce_read(struct qce_device *qce, u32 offset)
static inline void qce_write(struct qce_device *qce, u32 offset, u32 val)
{
- writel(val, qce->base + offset);
+ qce_write_dma(qce, offset, val);
}
static inline void qce_write_array(struct qce_device *qce, u32 offset,
@@ -82,6 +82,8 @@ static void qce_setup_config(struct qce_device *qce)
{
u32 config;
+ qce_clear_bam_transaction(qce);
+
/* get big endianness */
config = qce_config_reg(qce, 0);
@@ -90,12 +92,14 @@ static void qce_setup_config(struct qce_device *qce)
qce_write(qce, REG_CONFIG, config);
}
-static inline void qce_crypto_go(struct qce_device *qce, bool result_dump)
+static inline int qce_crypto_go(struct qce_device *qce, bool result_dump)
{
if (result_dump)
qce_write(qce, REG_GOPROC, BIT(GO_SHIFT) | BIT(RESULTS_DUMP_SHIFT));
else
qce_write(qce, REG_GOPROC, BIT(GO_SHIFT));
+
+ return qce_submit_cmd_desc(qce);
}
#if defined(CONFIG_CRYPTO_DEV_QCE_SHA) || defined(CONFIG_CRYPTO_DEV_QCE_AEAD)
@@ -223,9 +227,7 @@ static int qce_setup_regs_ahash(struct crypto_async_request *async_req)
config = qce_config_reg(qce, 1);
qce_write(qce, REG_CONFIG, config);
- qce_crypto_go(qce, true);
-
- return 0;
+ return qce_crypto_go(qce, true);
}
#endif
@@ -386,9 +388,7 @@ static int qce_setup_regs_skcipher(struct crypto_async_request *async_req)
config = qce_config_reg(qce, 1);
qce_write(qce, REG_CONFIG, config);
- qce_crypto_go(qce, true);
-
- return 0;
+ return qce_crypto_go(qce, true);
}
#endif
@@ -535,9 +535,7 @@ static int qce_setup_regs_aead(struct crypto_async_request *async_req)
qce_write(qce, REG_CONFIG, config);
/* Start the process */
- qce_crypto_go(qce, !IS_CCM(flags));
-
- return 0;
+ return qce_crypto_go(qce, !IS_CCM(flags));
}
#endif
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 26347e9fc078adede712722107e74958538accdf..1b43c56503334154be4b8000e5a9330b2005cb64 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -4,6 +4,8 @@
*/
#include <linux/device.h>
+#include <linux/dma/qcom_bam_dma.h>
+#include <linux/dma-mapping.h>
#include <linux/dmaengine.h>
#include <crypto/scatterwalk.h>
@@ -11,6 +13,96 @@
#include "dma.h"
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
+#define QCE_BAM_CMD_SGL_SIZE 128
+#define QCE_BAM_CMD_ELEMENT_SIZE 128
+
+struct qce_desc_info {
+ struct dma_async_tx_descriptor *dma_desc;
+ enum dma_data_direction dir;
+};
+
+struct qce_bam_transaction {
+ struct bam_cmd_element bam_ce[QCE_BAM_CMD_ELEMENT_SIZE];
+ struct scatterlist wr_sgl[QCE_BAM_CMD_SGL_SIZE];
+ struct qce_desc_info *desc;
+ u32 bam_ce_idx;
+ u32 pre_bam_ce_idx;
+ u32 wr_sgl_cnt;
+};
+
+void qce_clear_bam_transaction(struct qce_device *qce)
+{
+ struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
+
+ bam_txn->bam_ce_idx = 0;
+ bam_txn->wr_sgl_cnt = 0;
+ bam_txn->pre_bam_ce_idx = 0;
+}
+
+int qce_submit_cmd_desc(struct qce_device *qce)
+{
+ struct qce_desc_info *qce_desc = qce->dma.bam_txn->desc;
+ struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
+ struct dma_async_tx_descriptor *dma_desc;
+ struct dma_chan *chan = qce->dma.rxchan;
+ unsigned long attrs = DMA_PREP_CMD;
+ dma_cookie_t cookie;
+ unsigned int mapped;
+ int ret;
+
+ mapped = dma_map_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+ if (!mapped)
+ return -ENOMEM;
+
+ dma_desc = dmaengine_prep_slave_sg(chan, bam_txn->wr_sgl, mapped, DMA_MEM_TO_DEV, attrs);
+ if (!dma_desc) {
+ ret = -ENOMEM;
+ goto err_unmap_sg;
+ }
+
+ qce_desc->dma_desc = dma_desc;
+ cookie = dmaengine_submit(qce_desc->dma_desc);
+
+ ret = dma_submit_error(cookie);
+ if (ret)
+ goto err_unmap_sg;
+
+ return 0;
+
+err_unmap_sg:
+ dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+ return ret;
+}
+
+static void qce_prep_dma_cmd_desc(struct qce_device *qce, struct qce_dma_data *dma,
+ unsigned int addr, void *buf)
+{
+ struct qce_bam_transaction *bam_txn = dma->bam_txn;
+ struct bam_cmd_element *bam_ce_buf;
+ int bam_ce_size, cnt, idx;
+
+ idx = bam_txn->bam_ce_idx;
+ bam_ce_buf = &bam_txn->bam_ce[idx];
+ bam_prep_ce_le32(bam_ce_buf, addr, BAM_WRITE_COMMAND, *((__le32 *)buf));
+
+ bam_ce_buf = &bam_txn->bam_ce[bam_txn->pre_bam_ce_idx];
+ bam_txn->bam_ce_idx++;
+ bam_ce_size = (bam_txn->bam_ce_idx - bam_txn->pre_bam_ce_idx) * sizeof(*bam_ce_buf);
+
+ cnt = bam_txn->wr_sgl_cnt;
+
+ sg_set_buf(&bam_txn->wr_sgl[cnt], bam_ce_buf, bam_ce_size);
+
+ ++bam_txn->wr_sgl_cnt;
+ bam_txn->pre_bam_ce_idx = bam_txn->bam_ce_idx;
+}
+
+void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val)
+{
+ unsigned int reg_addr = ((unsigned int)(qce->base_phys) + offset);
+
+ qce_prep_dma_cmd_desc(qce, &qce->dma, reg_addr, &val);
+}
static void qce_dma_terminate(void *data)
{
@@ -39,6 +131,16 @@ int devm_qce_dma_request(struct qce_device *qce)
return dev_err_probe(dev, PTR_ERR(dma->rxchan),
"Failed to get RX DMA channel\n");
+ dma->bam_txn = devm_kzalloc(dev, sizeof(*dma->bam_txn), GFP_KERNEL);
+ if (!dma->bam_txn)
+ return -ENOMEM;
+
+ dma->bam_txn->desc = devm_kzalloc(dev, sizeof(*dma->bam_txn->desc), GFP_KERNEL);
+ if (!dma->bam_txn->desc)
+ return -ENOMEM;
+
+ sg_init_table(dma->bam_txn->wr_sgl, QCE_BAM_CMD_SGL_SIZE);
+
return devm_add_action_or_reset(dev, qce_dma_terminate, dma);
}
@@ -98,28 +200,36 @@ int qce_dma_prep_sgs(struct qce_dma_data *dma, struct scatterlist *rx_sg,
{
struct dma_chan *rxchan = dma->rxchan;
struct dma_chan *txchan = dma->txchan;
- unsigned long flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
+ unsigned long txflags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
+ unsigned long rxflags = txflags | DMA_PREP_FENCE;
int ret;
- ret = qce_dma_prep_sg(rxchan, rx_sg, rx_nents, flags, DMA_MEM_TO_DEV,
+ ret = qce_dma_prep_sg(rxchan, rx_sg, rx_nents, rxflags, DMA_MEM_TO_DEV,
NULL, NULL);
if (ret)
return ret;
- return qce_dma_prep_sg(txchan, tx_sg, tx_nents, flags, DMA_DEV_TO_MEM,
+ return qce_dma_prep_sg(txchan, tx_sg, tx_nents, txflags, DMA_DEV_TO_MEM,
cb, cb_param);
}
void qce_dma_issue_pending(struct qce_dma_data *dma)
{
- dma_async_issue_pending(dma->rxchan);
dma_async_issue_pending(dma->txchan);
+ dma_async_issue_pending(dma->rxchan);
}
int qce_dma_terminate_all(struct qce_dma_data *dma)
{
+ struct qce_device *qce = container_of(dma, struct qce_device, dma);
+ struct qce_bam_transaction *bam_txn = dma->bam_txn;
int ret;
ret = dmaengine_terminate_all(dma->rxchan);
- return ret ?: dmaengine_terminate_all(dma->txchan);
+ if (ret)
+ return ret;
+
+ dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+
+ return dmaengine_terminate_all(dma->txchan);
}
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index 483789d9fa98e79d1283de8297bf2fc2a773f3a7..f05dfa9e6b25bd60e32f45079a8bc7e6a4cf81f9 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -8,6 +8,7 @@
#include <linux/dmaengine.h>
+struct qce_bam_transaction;
struct qce_device;
/* maximum data transfer block size between BAM and CE */
@@ -32,6 +33,7 @@ struct qce_dma_data {
struct dma_chan *txchan;
struct dma_chan *rxchan;
struct qce_result_dump *result_buf;
+ struct qce_bam_transaction *bam_txn;
};
int devm_qce_dma_request(struct qce_device *qce);
@@ -43,5 +45,8 @@ int qce_dma_terminate_all(struct qce_dma_data *dma);
struct scatterlist *
qce_sgtable_add(struct sg_table *sgt, struct scatterlist *sg_add,
unsigned int max_len);
+void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val);
+int qce_submit_cmd_desc(struct qce_device *qce);
+void qce_clear_bam_transaction(struct qce_device *qce);
#endif /* _DMA_H_ */
diff --git a/drivers/crypto/qce/sha.c b/drivers/crypto/qce/sha.c
index 5476d4d30fae7eb72bbcbcdd7d8be7a76f6732c2..5cfd769a59a791a79da42e2a5b0554ad974f7631 100644
--- a/drivers/crypto/qce/sha.c
+++ b/drivers/crypto/qce/sha.c
@@ -109,17 +109,17 @@ static int qce_ahash_async_req_handle(struct crypto_async_request *async_req)
goto error_unmap_src;
}
- ret = qce_dma_prep_sgs(&qce->dma, req->src, rctx->src_nents,
- &rctx->result_sg, 1, qce_ahash_done, async_req);
+ ret = qce_start(async_req, tmpl->crypto_alg_type);
if (ret)
goto error_unmap_dst;
- qce_dma_issue_pending(&qce->dma);
-
- ret = qce_start(async_req, tmpl->crypto_alg_type);
+ ret = qce_dma_prep_sgs(&qce->dma, req->src, rctx->src_nents,
+ &rctx->result_sg, 1, qce_ahash_done, async_req);
if (ret)
goto error_terminate;
+ qce_dma_issue_pending(&qce->dma);
+
return 0;
error_terminate:
diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index a9b59e68df4b6837805d45391f5a5fe43fd47709..b4ef3748fbb4dde542b0307f32d4c871b7c33ac2 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -142,18 +142,18 @@ qce_skcipher_async_req_handle(struct crypto_async_request *async_req)
src_nents = dst_nents - 1;
}
+ ret = qce_start(async_req, tmpl->crypto_alg_type);
+ if (ret)
+ goto error_unmap_src;
+
ret = qce_dma_prep_sgs(&qce->dma, rctx->src_sg, src_nents,
rctx->dst_sg, dst_nents,
qce_skcipher_done, async_req);
if (ret)
- goto error_unmap_src;
+ goto error_terminate;
qce_dma_issue_pending(&qce->dma);
- ret = qce_start(async_req, tmpl->crypto_alg_type);
- if (ret)
- goto error_terminate;
-
return 0;
error_terminate:
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v20 14/14] crypto: qce - Communicate the base physical address to the dmaengine
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (12 preceding siblings ...)
2026-06-29 10:01 ` [PATCH v20 13/14] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
@ 2026-06-29 10:01 ` Bartosz Golaszewski
2026-06-29 10:24 ` sashiko-bot
13 siblings, 1 reply; 25+ messages in thread
From: Bartosz Golaszewski @ 2026-06-29 10:01 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In order to communicate to the BAM DMA engine which address should be
used as a scratchpad for dummy writes related to BAM pipe locking,
fill out and attach the provided metadata struct to the descriptor.
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/dma.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 1b43c56503334154be4b8000e5a9330b2005cb64..6410f8dc5bcf517223c768a3e8f87af245076c84 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -11,6 +11,7 @@
#include "core.h"
#include "dma.h"
+#include "regs-v5.h"
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
#define QCE_BAM_CMD_SGL_SIZE 128
@@ -41,6 +42,10 @@ void qce_clear_bam_transaction(struct qce_device *qce)
int qce_submit_cmd_desc(struct qce_device *qce)
{
+ struct bam_desc_metadata meta = {
+ .scratchpad_addr = qce->base_phys + REG_VERSION,
+ .direction = DMA_MEM_TO_DEV,
+ };
struct qce_desc_info *qce_desc = qce->dma.bam_txn->desc;
struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
struct dma_async_tx_descriptor *dma_desc;
@@ -60,15 +65,21 @@ int qce_submit_cmd_desc(struct qce_device *qce)
goto err_unmap_sg;
}
+ ret = dmaengine_desc_attach_metadata(dma_desc, &meta, sizeof(meta));
+ if (ret)
+ goto err_free_desc;
+
qce_desc->dma_desc = dma_desc;
cookie = dmaengine_submit(qce_desc->dma_desc);
ret = dma_submit_error(cookie);
if (ret)
- goto err_unmap_sg;
+ goto err_free_desc;
return 0;
+err_free_desc:
+ dmaengine_desc_free(dma_desc);
err_unmap_sg:
dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
return ret;
--
2.47.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v20 12/14] crypto: qce - Map crypto memory for DMA
2026-06-29 10:01 ` [PATCH v20 12/14] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
@ 2026-06-29 10:14 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-29 10:14 UTC (permalink / raw)
To: Bartosz Golaszewski; +Cc: dmaengine, vkoul, Frank.Li
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [High] The DMA resource for the crypto memory range is unmapped before the DMA channels are terminated during device unbind, leading to a potential use-after-unmap.
Pre-existing issues:
- [High] Incorrect algorithm unregistration during probe rollback in devm_qce_register_algs().
- [High] The worker `done_work` can submit new DMA requests during device unbind after the DMA channels have been terminated, leading to hardware accesses with disabled clocks.
--
commit ee076bef79e68732cc680da917ccaf5eb87c11ef
Author: Bartosz Golaszewski <brgl@kernel.org>
crypto: qce - Map crypto memory for DMA
As the first step in converting the driver to using DMA for register
I/O, let's map the crypto memory range.
> diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
> index a0e2eadc3afd5..d7b7a3dda4649 100644
> --- a/drivers/crypto/qce/core.c
> +++ b/drivers/crypto/qce/core.c
> @@ -192,10 +192,19 @@ static void qce_cancel_work(void *data)
> cancel_work_sync(work);
> }
[Severity: High]
This is a pre-existing issue, but is the teardown order of done_work safe
relative to DMA termination?
In qce_crypto_probe(), the qce_cancel_work devres action is registered before
devm_qce_dma_request(). Since devres executes in reverse order, on device
unbind, qce_dma_terminate() runs before qce_cancel_work().
While qce_dma_terminate() cancels pending DMA transfers, it doesn't prevent
new submissions.
If done_work runs after qce_dma_terminate(), could it dequeue the next crypto
request and submit a new transfer to the BAM DMA engine?
The active engine might then attempt to access unclocked crypto registers when
devres proceeds to disable the clocks, potentially crashing the bus.
[ ... ]
> @@ -255,6 +264,18 @@ static int qce_crypto_probe(struct platform_device *pdev)
> qce->async_req_enqueue = qce_async_request_enqueue;
> qce->async_req_done = qce_async_request_done;
>
> + qce->dma_size = resource_size(res);
> + qce->base_dma = dma_map_resource(dev, res->start, qce->dma_size,
> + DMA_BIDIRECTIONAL, 0);
> + qce->base_phys = res->start;
> + ret = dma_mapping_error(dev, qce->base_dma);
> + if (ret)
> + return ret;
> +
> + ret = devm_add_action_or_reset(qce->dev, qce_crypto_unmap_dma, qce);
[Severity: High]
Is this devres action added too late in the probe sequence?
It is registered after devm_qce_dma_request() has registered the action to
terminate the DMA channels. Due to devres LIFO execution, on unbind
qce_crypto_unmap_dma() will execute before qce_dma_terminate().
If a crypto request is running when the device unbinds, could the IOMMU
mapping for the crypto registers be destroyed while the BAM DMA engine is
still actively accessing them, causing memory corruption or an IOMMU fault?
> + if (ret)
> + return ret;
> +
> return devm_qce_register_algs(qce);
[Severity: High]
This isn't a bug introduced by this patch, but there appears to be an issue
with the error handling inside devm_qce_register_algs() if algorithm
registration fails.
When it enters the rollback loop, it doesn't update the ops pointer:
drivers/crypto/qce/core.c:devm_qce_register_algs() {
for (i = 0; i < ARRAY_SIZE(qce_ops); i++) {
ops = qce_ops[i];
ret = ops->register_algs(qce);
if (ret) {
for (j = i - 1; j >= 0; j--)
ops->unregister_algs(qce);
return ret;
}
}
...
}
Because ops is not updated to qce_ops[j], won't this repeatedly unregister
the algorithm that just failed to register, while failing to unregister the
successful ones?
This could leave algorithms registered after the driver fails to probe,
leading to a use-after-free later.
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260629-qcom-qce-cmd-descr-v20-0-56f67da84c05@oss.qualcomm.com?part=12
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v20 07/14] crypto: qce - Cancel work on device detach
2026-06-29 10:01 ` [PATCH v20 07/14] crypto: qce - Cancel work on device detach Bartosz Golaszewski
@ 2026-06-29 10:15 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-29 10:15 UTC (permalink / raw)
To: Bartosz Golaszewski; +Cc: vkoul, dmaengine, Frank.Li
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Incorrect devres registration order causes Use-After-Free and Use-After-Destroy during device unbind.
- [High] Canceling the workqueue without completing pending requests causes a system hang.
--
commit bbc1062dac800956a5d7f42e6fe7594986dd0251
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
crypto: qce - Cancel work on device detach
The workqueue is setup in probe() but never cancelled on error or in
remove(). Set up a devres action to clean it up early, ensuring work is
not cancelled before outstanding DMA transfers terminate.
Closes: https://sashiko.dev/#/patchset/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc%40oss.qualcomm.com?part=7
> diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
> index b966f3365b7de..f671946cf7351 100644
> --- a/drivers/crypto/qce/core.c
> +++ b/drivers/crypto/qce/core.c
> @@ -186,6 +186,13 @@ static int qce_check_version(struct qce_device *qce)
> return 0;
> }
>
> +static void qce_cancel_work(void *data)
> +{
> + struct work_struct *work = data;
> +
> + cancel_work_sync(work);
[Severity: High]
If the work is cancelled here, does the active crypto request get leaked?
If a DMA transfer completes and schedules the work, but the device unbinds
before the work executes, cancel_work_sync() will discard the pending work
item. Since qce_cancel_work() does not implement a fallback to propagate an
error completion (like -ESHUTDOWN) to the active crypto request,
crypto_request_complete() is never called.
Will upstream callers of the crypto API waiting on the request hang
indefinitely, potentially causing a system lockup?
> +}
> +
> static int qce_crypto_probe(struct platform_device *pdev)
> {
[ ... ]
> @@ -227,6 +234,11 @@ static int qce_crypto_probe(struct platform_device *pdev)
> if (ret)
> return ret;
>
> + INIT_WORK(&qce->done_work, qce_req_done_work);
> + ret = devm_add_action_or_reset(dev, qce_cancel_work, &qce->done_work);
> + if (ret)
> + return ret;
> +
> ret = devm_qce_dma_request(qce->dev, &qce->dma);
[Severity: High]
Will this devres registration order cause a use-after-free and
use-after-destroy during device unbind?
Devres executes actions in LIFO order. Because qce_cancel_work is registered
here before devm_qce_dma_request() and devm_mutex_init() (which happens later
in qce_crypto_probe), the teardown sequence will be:
1. Destroy qce->lock
2. Terminate and free DMA channels
3. Wait for the workqueue via cancel_work_sync()
If a DMA callback schedules qce->done_work just before teardown,
qce_req_done_work() could execute while cancel_work_sync() is waiting.
Does this allow the work item to attempt to lock the already destroyed
qce->lock and access the freed DMA channels?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260629-qcom-qce-cmd-descr-v20-0-56f67da84c05@oss.qualcomm.com?part=7
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v20 06/14] dmaengine: qcom: bam_dma: add support for BAM locking
2026-06-29 10:01 ` [PATCH v20 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
@ 2026-06-29 10:16 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-29 10:16 UTC (permalink / raw)
To: Bartosz Golaszewski; +Cc: dmaengine, vkoul, Frank.Li
Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:
- [High] Race condition in locked descriptor processing leaves appended transactions unprotected.
- [High] Missing endianness conversion on physical address assigned to hardware descriptor.
- [High] Per-descriptor metadata is unsafely stored in the shared channel state.
- [High] Mapping an unaligned embedded structure field for DMA.
- [Medium] Internal command buffers are mapped using the client device instead of the DMA controller device.
- [Medium] Logic mismatch restricts BAM locking to TX channels instead of RX channels.
--
commit 74132b5d8141c72f8f781c94cfc7fae626e96be7
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
dmaengine: qcom: bam_dma: add support for BAM locking
Add support for BAM pipe locking. To that end: when starting DMA on an RX
channel - prepend the existing queue of issued descriptors with an
additional "dummy" command descriptor with the LOCK bit set. Once the
transaction is done (no more issued descriptors), issue one more dummy
descriptor with the UNLOCK bit.
[ ... ]
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c
[ ... ]
> @@ -686,6 +701,35 @@ static int bam_slave_config(struct dma_chan *chan,
> return 0;
> }
>
> +static int bam_metadata_attach(struct dma_async_tx_descriptor *desc, void *data, size_t len)
> +{
> + struct bam_chan *bchan = to_bam_chan(desc->chan);
> + const struct bam_device_data *bdata = bchan->bdev->dev_data;
> + struct bam_desc_metadata *metadata = data;
> +
> + if (!data)
> + return -EINVAL;
> +
> + if (!bdata->pipe_lock_supported)
> + /*
> + * The client wants to use locking but this BAM version doesn't
> + * support it. Don't return an error here as this will stop the
> + * client from using DMA at all for no reason.
> + */
> + return 0;
> +
> + guard(spinlock_irqsave)(&bchan->vc.lock);
> +
> + bchan->scratchpad_addr = metadata->scratchpad_addr;
> + bchan->direction = metadata->direction;
[Severity: High]
Does this safely handle multiple descriptors being prepared before they are
issued?
Since the attach callback operates per-descriptor, storing these properties
directly into the shared bchan object could overwrite the state of previously
prepared descriptors.
When the locked sequence is eventually executed, might the channel apply this
overwritten metadata to all pending transactions, potentially using the wrong
scratchpad address for earlier ones?
> +
> + return 0;
> +}
> +
[ ... ]
> @@ -1046,13 +1114,101 @@ static void bam_apply_new_config(struct bam_chan *bchan,
> bchan->reconfigure = 0;
> }
>
> +static struct bam_async_desc *
> +bam_make_lock_desc(struct bam_chan *bchan, unsigned long flag)
> +{
> + struct dma_chan *chan = &bchan->vc.chan;
> + struct bam_async_desc *async_desc;
> + struct bam_desc_hw *desc;
> + struct virt_dma_desc *vd;
> + struct virt_dma_chan *vc;
> + unsigned int mapped;
> +
> + async_desc = kzalloc_flex(*async_desc, desc, 1, GFP_NOWAIT);
> + if (!async_desc) {
> + dev_err(bchan->bdev->dev, "failed to allocate the BAM lock descriptor\n");
> + return ERR_PTR(-ENOMEM);
> + }
> +
> + sg_init_table(&async_desc->lock_sg, 1);
> +
> + async_desc->num_desc = 1;
> + async_desc->curr_desc = async_desc->desc;
> + async_desc->dir = DMA_MEM_TO_DEV;
> +
> + desc = async_desc->desc;
> +
> + bam_prep_ce_le32(&async_desc->lock_ce, bchan->scratchpad_addr, BAM_WRITE_COMMAND, 0);
> + sg_set_buf(&async_desc->lock_sg, &async_desc->lock_ce, sizeof(async_desc->lock_ce));
[Severity: High]
Could this mapping violate DMA API requirements for cacheline alignment?
The lock_ce field is embedded within struct bam_async_desc and is not
guaranteed to be cacheline-aligned to ARCH_DMA_MINALIGN. Mapping it via
dma_map_sg() means it shares a cacheline with other fields.
If the CPU modifies adjacent fields (like async_desc->desc_node) while the
DMA mapping is active, it dirties the cacheline. On non-coherent
architectures, could this lead to unpredictable behavior and trigger DMA API
debug warnings?
> +
> + mapped = dma_map_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);
[Severity: Medium]
Should this use the BAM DMA controller device instead of the client device?
The DMA mapping is being created in the IOMMU domain of the client device
(chan->slave). If the client device and the BAM controller do not share an
IOMMU domain, the BAM controller might be unable to access the command
element, resulting in an IOMMU fault.
Would it be better to map this using chan->device->dev or bchan->bdev->dev
since the BAM controller is the one reading the command buffer?
> + if (!mapped) {
> + kfree(async_desc);
> + return ERR_PTR(-ENOMEM);
> + }
> +
> + desc->flags |= cpu_to_le16(DESC_FLAG_CMD | flag);
> + desc->addr = sg_dma_address(&async_desc->lock_sg);
[Severity: High]
Is an endianness conversion missing here?
The desc->addr field is __le32, but it is being assigned a native
dma_addr_t without cpu_to_le32().
On big-endian systems, won't this write the wrong byte order into the hardware
descriptor and cause the BAM controller to fetch from an invalid memory address?
> + desc->size = cpu_to_le16(sizeof(struct bam_cmd_element));
> +
> + vc = &bchan->vc;
> + vd = &async_desc->vd;
> +
> + dma_async_tx_descriptor_init(&vd->tx, &vc->chan);
> + vd->tx.flags = DMA_PREP_CMD;
> + vd->tx_result.result = DMA_TRANS_NOERROR;
> + vd->tx_result.residue = 0;
> +
> + return async_desc;
> +}
> +
> +static int bam_setup_pipe_lock(struct bam_chan *bchan)
> +{
> + const struct bam_device_data *bdata = bchan->bdev->dev_data;
> + struct bam_async_desc *lock_desc, *unlock_desc;
> +
> + lockdep_assert_held(&bchan->vc.lock);
> +
> + if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr ||
> + bchan->direction != DMA_MEM_TO_DEV)
[Severity: Medium]
Does this logic correctly target RX channels?
The commit message explicitly states: "when starting DMA on an RX channel",
but DMA_MEM_TO_DEV specifies the TX direction.
Will this silently skip the lock setup for RX channels (DMA_DEV_TO_MEM)?
> + return 0;
[ ... ]
> @@ -1064,9 +1220,23 @@ static void bam_start_dma(struct bam_chan *bchan)
>
> lockdep_assert_held(&bchan->vc.lock);
>
> + vd = vchan_next_desc(&bchan->vc);
> if (!vd)
> return;
>
> + /*
> + * Wrap the issued work with a LOCK/UNLOCK pair exactly once, at the
> + * start of a fresh sequence and only when there is real work to lock
> + * around. On a re-entry after a full FIFO, we see the BAM is locked
> + * and must not add another pair we simply continue loading the
> + * remainder of the same locked sequence.
> + */
> + if (!bchan->bam_locked) {
> + ret = bam_setup_pipe_lock(bchan);
> + if (ret == 0 && bchan->bam_locked)
> + vd = vchan_next_desc(&bchan->vc);
> + }
[Severity: High]
Is there a race condition here if new descriptors are appended?
If a new descriptor is submitted via vchan_issue_pending() while the hardware
is busy processing a locked sequence, the new descriptor is appended after the
UNLOCK descriptor.
When bam_start_dma() loops to process this new descriptor, bchan->bam_locked
is still true because it is only cleared asynchronously when the UNLOCK
hardware interrupt fires:
process_channel_irqs() {
...
if (flags & DESC_FLAG_UNLOCK)
bchan->bam_locked = false;
...
}
Because bchan->bam_locked is still true, won't this skip adding a new LOCK,
leaving the new transaction unprotected in the hardware FIFO?
> +
> ret = pm_runtime_get_sync(bdev->dev);
> if (ret < 0)
> return;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260629-qcom-qce-cmd-descr-v20-0-56f67da84c05@oss.qualcomm.com?part=6
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v20 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request()
2026-06-29 10:01 ` [PATCH v20 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
@ 2026-06-29 10:17 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-29 10:17 UTC (permalink / raw)
To: Bartosz Golaszewski; +Cc: vkoul, dmaengine, Frank.Li
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
New issues:
- [Low] Commit message references a non-existent API function.
Pre-existing issues:
- [High] The devres unwinding order frees the DMA channels and result buffer BEFORE the background workqueue is explicitly canceled, creating a Use-After-Free window.
--
commit f6e312d1a680b31426e3cdbbdcdf493a9628353f
Author: Bartosz Golaszewski <brgl@kernel.org>
crypto: qce - Use existing devres APIs in devm_qce_dma_request()
Switch to devm_kmalloc() and devm_dma_alloc_chan() in devm_qce_dma_request()
to simplify the function and drop error labels.
> Switch to devm_kmalloc() and devm_dma_alloc_chan() in
> devm_qce_dma_request(). This allows us to drop two labels and shrink the
> function.
[Severity: Low]
This isn't a bug, but the commit message references devm_dma_alloc_chan()
instead of the actually used devm_dma_request_chan().
> diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
> index d60efb5c26d88..26347e9fc078a 100644
> --- a/drivers/crypto/qce/dma.c
> +++ b/drivers/crypto/qce/dma.c
[ ... ]
> - dma->rxchan = dma_request_chan(dev, "rx");
> - if (IS_ERR(dma->rxchan)) {
> - ret = dev_err_probe(dev, PTR_ERR(dma->rxchan),
> - "Failed to get RX DMA channel\n");
> - goto error_rx;
> - }
> -
> - dma->result_buf = kmalloc(QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ,
> - GFP_KERNEL);
> - if (!dma->result_buf) {
> - ret = -ENOMEM;
> - goto error_nomem;
> - }
> -
> - return devm_add_action_or_reset(dev, qce_dma_release, dma);
> + dma->rxchan = devm_dma_request_chan(dev, "rx");
> + if (IS_ERR(dma->rxchan))
> + return dev_err_probe(dev, PTR_ERR(dma->rxchan),
> + "Failed to get RX DMA channel\n");
>
> -error_nomem:
> - dma_release_channel(dma->rxchan);
> -error_rx:
> - dma_release_channel(dma->txchan);
> - return ret;
> + return devm_add_action_or_reset(dev, qce_dma_terminate, dma);
> }
[Severity: High]
This is a pre-existing issue, but does the devres unwinding order here
create a window for a use-after-free?
In qce_crypto_probe(), qce_cancel_work is registered before the DMA
resources are requested:
drivers/crypto/qce/core.c:qce_crypto_probe() {
...
INIT_WORK(&qce->done_work, qce_req_done_work);
ret = devm_add_action_or_reset(dev, qce_cancel_work, &qce->done_work);
if (ret)
return ret;
ret = devm_qce_dma_request(qce);
...
}
Since devres actions are unwound in reverse order of registration (LIFO),
during unbind or late probe failure, the DMA termination registered here
will execute first. This frees txchan, rxchan, and result_buf.
Only after the DMA resources are freed does the unwinding sequence call
cancel_work_sync() for the done_work. If the background workqueue
qce_req_done_work() is already running or scheduled, could it process
subsequent crypto requests by calling qce_handle_queue() and accessing the
now-freed DMA channels and buffers?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260629-qcom-qce-cmd-descr-v20-0-56f67da84c05@oss.qualcomm.com?part=11
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v20 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue
2026-06-29 10:01 ` [PATCH v20 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
@ 2026-06-29 10:17 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-29 10:17 UTC (permalink / raw)
To: Bartosz Golaszewski; +Cc: vkoul, Frank.Li, dmaengine
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
Pre-existing issues:
- [Critical] A modulo math mismatch between the ring buffer size and the CIRC_SPACE calculation allows the tail pointer to jump over the head pointer.
- [High] Hardware registers are accessed after device clock is disabled during removal, due to pm_runtime_force_suspend() being called too early.
- [High] Hardware registers accessed in IRQ handler before device is confirmed active, risking a bus fault.
- [High] Missing usage counter decrement when pm_runtime_get_sync() fails, leading to PM runtime refcount leak.
--
commit 0c2081a4425fb055ae1074f9c27f82d0875546e3
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Subject: dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue
This commit converts the BAM DMA driver to use the high-priority variant
of the BH workqueue instead of legacy tasklets for starting DMA transactions.
> diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
> index fc155e0d1870c..ea3df28e777f9 100644
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c
[ ... ]
> @@ -922,9 +923,9 @@ static irqreturn_t bam_dma_irq(int irq, void *data)
>
> srcs |= process_channel_irqs(bdev);
[Severity: High]
This is a pre-existing issue, but does process_channel_irqs() read from
hardware registers before the device is confirmed active? If a spurious
interrupt occurs while the device is suspended, could this unprotected
hardware access trigger an unclocked access exception and crash the system?
>
> - /* kick off tasklet to start next dma transfer */
> + /* kick off the work queue to start next dma transfer */
> if (srcs & P_IRQ)
> - tasklet_schedule(&bdev->task);
> + queue_work(system_bh_highpri_wq, &bdev->work);
>
> ret = pm_runtime_get_sync(bdev->dev);
> if (ret < 0)
[Severity: High]
This isn't a bug introduced by this patch, but if pm_runtime_get_sync()
fails here, should we call pm_runtime_put_noidle() before returning?
Without it, the PM runtime usage counter might leak and prevent the device
from entering runtime suspend again.
[ ... ]
> @@ -1120,14 +1121,14 @@ static void bam_start_dma(struct bam_chan *bchan)
> }
[Severity: Critical]
This is a pre-existing issue, but there appears to be a modulo math mismatch
in bam_start_dma() when evaluating available buffer space:
avail = CIRC_SPACE(bchan->tail, bchan->head, MAX_DESCRIPTORS + 1);
...
bchan->tail += async_desc->xfer_len;
bchan->tail %= MAX_DESCRIPTORS;
Could this mismatch between CIRC_SPACE assuming a capacity of 4096 and the
explicit modulo using 4095 allow the tail pointer to jump over the head
pointer? When the buffer is full (tail reaches 4094 and wraps to 0), does
CIRC_SPACE calculate the space as 4095, falsely indicating an empty buffer
and allowing the hardware to overwrite active descriptors?
[Severity: High]
Also in bam_start_dma(), as a pre-existing issue, if pm_runtime_get_sync()
fails:
ret = pm_runtime_get_sync(bdev->dev);
if (ret < 0)
return;
Does this early return need a call to pm_runtime_put_noidle() to avoid
leaking the PM runtime usage counter?
[ ... ]
> @@ -1424,7 +1424,7 @@ static void bam_dma_remove(struct platform_device *pdev)
> bdev->channels[i].fifo_phys);
> }
>
> - tasklet_kill(&bdev->task);
> + cancel_work_sync(&bdev->work);
>
> clk_disable_unprepare(bdev->bamclk);
> }
[Severity: High]
This isn't a bug introduced by this patch, but does bam_dma_remove()
access hardware registers after the device clock is disabled?
pm_runtime_force_suspend(&pdev->dev);
...
writel_relaxed(0, bam_addr(bdev, 0, BAM_IRQ_SRCS_MSK_EE));
Since pm_runtime_force_suspend() disables the BAM clock, could accessing
hardware registers immediately after trigger a bus fault and crash during
driver unbind?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260629-qcom-qce-cmd-descr-v20-0-56f67da84c05@oss.qualcomm.com?part=3
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v20 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support
2026-06-29 10:01 ` [PATCH v20 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
@ 2026-06-29 10:18 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-29 10:18 UTC (permalink / raw)
To: Bartosz Golaszewski; +Cc: dmaengine, Frank.Li, vkoul
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
New issues:
- [High] Prematurely freeing the IRQ in `bam_dma_remove()` before unregistering the DMA device.
Pre-existing issues:
- [High] Accessing hardware registers after the clock has been disabled in `bam_dma_remove()`.
--
commit c3d67a708644d5b3039f73321dfd2a2034b5a7b2
Author: Bartosz Golaszewski <brgl@kernel.org>
dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support
Extend the device match data with a flag indicating whether the IP
supports the BAM lock/unlock feature. Set it to true on BAM IP versions
1.4.0 and above.
> diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
> index 8ce0fe085c5fe..f3e713a5259c2 100644
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c
> @@ -115,6 +115,7 @@ struct reg_offset_data {
>
> struct bam_device_data {
> const struct reg_offset_data *reg_info;
> + bool pipe_lock_supported;
> };
>
> static const struct reg_offset_data bam_v1_3_reg_info[] = {
[ ... ]
Looking at bam_dma_remove() in drivers/dma/qcom/bam_dma.c:
drivers/dma/qcom/bam_dma.c:bam_dma_remove() {
...
free_irq(bdev->irq, bdev);
pm_runtime_force_suspend(&pdev->dev);
of_dma_controller_free(pdev->dev.of_node);
dma_async_device_unregister(&bdev->common);
...
}
[Severity: High]
If a client has an active DMA transaction when the DMA engine is removed (e.g.,
via sysfs unbind or module unload), does freeing the interrupt early mean the
completion interrupt will never be serviced?
This could cause the client driver to wait indefinitely for the DMA transaction
to finish, leading to unkillable tasks or system hangs. Should the DMA engine
be unregistered and channels properly quiesced before the IRQ is released?
drivers/dma/qcom/bam_dma.c:bam_dma_remove() {
...
pm_runtime_force_suspend(&pdev->dev);
of_dma_controller_free(pdev->dev.of_node);
dma_async_device_unregister(&bdev->common);
/* mask all interrupts for this execution environment */
writel_relaxed(0, bam_addr(bdev, 0, BAM_IRQ_SRCS_MSK_EE));
for (i = 0; i < bdev->num_channels; i++) {
bam_dma_terminate_all(&bdev->channels[i].vc.chan);
...
}
[Severity: High]
This is a pre-existing issue, but can accessing hardware registers after the
clock has been disabled here cause a system crash?
The clock is disabled by pm_runtime_force_suspend(), making the subsequent
writel_relaxed() and bam_dma_terminate_all() register accesses fatal. On
Qualcomm SoCs, accessing BAM registers while the clock is gated triggers a
synchronous external abort (NOC error), resulting in an immediate kernel panic.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260629-qcom-qce-cmd-descr-v20-0-56f67da84c05@oss.qualcomm.com?part=5
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v20 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path
2026-06-29 10:01 ` [PATCH v20 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path Bartosz Golaszewski
@ 2026-06-29 10:18 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-29 10:18 UTC (permalink / raw)
To: Bartosz Golaszewski; +Cc: vkoul, dmaengine, Frank.Li
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
New issues:
- [High] Moving `free_irq()` to the start of `bam_dma_remove()` creates an interrupt storm window.
Pre-existing issues:
- [Critical] `bam_dma_remove()` performs unconditional unclocked hardware accesses after `pm_runtime_force_suspend()`, causing a system crash.
- [High] Concurrent execution of the pending async workqueue with `pm_runtime_force_suspend()` causes unclocked hardware accesses.
- [High] The interrupt handler `bam_dma_irq()` unconditionally accesses hardware registers, causing a crash on spurious interrupts.
--
commit 8349aa20615406eee52bb3c2ccef1b8f57571551
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
dmaengine: qcom: bam_dma: free interrupt before the clock in error path
The BAM interrupt is requested with a devres helper and so on error it's
freed after probe() returns. We disable the clock before freeing or
masking it so it may still fire and we may end up reading BAM registers
with clock disabled.
Stop using devres for interrupts as we free it in remove() manually
anyway. Add an appropriate label and free the interrupt before disabling
the clock in error path and in remove().
Closes: https://sashiko.dev/#/patchset/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc%40oss.qualcomm.com?part=2
> diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
> index 1bb26af0405f3..fc155e0d1870c 100644
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c
> @@ -1332,8 +1332,7 @@ static int bam_dma_probe(struct platform_device *pdev)
> for (i = 0; i < bdev->num_channels; i++)
> bam_channel_init(bdev, &bdev->channels[i], i);
>
> - ret = devm_request_irq(bdev->dev, bdev->irq, bam_dma_irq,
> - IRQF_TRIGGER_HIGH, "bam_dma", bdev);
> + ret = request_irq(bdev->irq, bam_dma_irq, IRQF_TRIGGER_HIGH, "bam_dma", bdev);
[Severity: High]
This is a pre-existing issue, but does bam_dma_irq() unconditionally access
hardware registers before enabling the clock?
If a spurious interrupt triggers while the device is runtime-suspended,
process_channel_irqs() might read BAM_IRQ_SRCS_EE before pm_runtime_get_sync()
is called to turn the clock on.
Could this lead to a synchronous external abort?
[ ... ]
> @@ -1401,6 +1402,8 @@ static void bam_dma_remove(struct platform_device *pdev)
> struct bam_device *bdev = platform_get_drvdata(pdev);
> u32 i;
>
> + free_irq(bdev->irq, bdev);
> +
[Severity: High]
By unregistering the handler here, before hardware interrupts are masked,
could this create a window for an unhandled interrupt storm?
If a DMA transaction completes and asserts the level-triggered
IRQF_TRIGGER_HIGH line, there is no software handler to clear it until the
mask is written below. This might force the kernel to permanently disable
the IRQ line.
> pm_runtime_force_suspend(&pdev->dev);
>
[Severity: High]
This isn't a bug introduced by this patch, but does forcing the device to
suspend here in bam_dma_remove() race with pending asynchronous work?
If a DMA interrupt schedules the workqueue, the async work might execute
concurrently and acquire a runtime PM reference. Because
pm_runtime_force_suspend() ignores active usage counters, it could forcefully
disable the clock while the workqueue is still writing to BAM registers.
> of_dma_controller_free(pdev->dev.of_node);
> dma_async_device_unregister(&bdev->common);
>
> /* mask all interrupts for this execution environment */
> writel_relaxed(0, bam_addr(bdev, 0, BAM_IRQ_SRCS_MSK_EE));
[Severity: Critical]
This is a pre-existing issue, but does writing to BAM hardware registers here
in bam_dma_remove() cause a crash?
The preceding call to pm_runtime_force_suspend() disables the bamclk, so this
writel_relaxed() and subsequent termination calls appear to perform unclocked
hardware accesses, which could trigger a synchronous external abort.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260629-qcom-qce-cmd-descr-v20-0-56f67da84c05@oss.qualcomm.com?part=2
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v20 13/14] crypto: qce - Add BAM DMA support for crypto register I/O
2026-06-29 10:01 ` [PATCH v20 13/14] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
@ 2026-06-29 10:22 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-29 10:22 UTC (permalink / raw)
To: Bartosz Golaszewski; +Cc: vkoul, dmaengine, Frank.Li
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
New issues:
- [High] Incorrect endianness conversion for BAM DMA register writes on big-endian architectures.
- [Medium] DMA descriptor leak in qce_submit_cmd_desc() on error paths.
Pre-existing issues:
- [High] Memory allocated for rctx->adata is never freed.
- [High] Incorrect dma_map_sg failure check for src_sg ignores mapping errors.
--
commit bf3a155521eb3078dcc0111460444f36ebb69ebf
Author: Bartosz Golaszewski <brgl@kernel.org>
crypto: qce - Add BAM DMA support for crypto register I/O
Switch to using BAM DMA for register I/O in addition to passing data. To
that end: provide the necessary infrastructure in the driver, modify the
ordering of operations as required and replace all direct register writes
with wrappers queueing DMA command descriptors.
> diff --git a/drivers/crypto/qce/aead.c b/drivers/crypto/qce/aead.c
> --- a/drivers/crypto/qce/aead.c
> +++ b/drivers/crypto/qce/aead.c
> @@ -463,17 +463,17 @@ qce_aead_async_req_handle(struct crypto_async_request *async_req)
[Severity: High]
This is a pre-existing issue, but does this code leak the rctx->adata buffer
allocated for AES-CCM operations? It looks like if memory is dynamically
allocated for rctx->adata via kzalloc() during CCM processing, it is never
freed in either the success completion path or error handling paths of
qce_aead_async_req_handle(), which could lead to memory exhaustion when
processing untrusted network traffic.
[Severity: High]
This is also a pre-existing issue, but does the error check for dma_map_sg()
earlier in qce_aead_async_req_handle() correctly handle mapping failures?
dma_map_sg() returns 0 on error, not a negative value. The check for
(src_nents < 0) evaluates to false, which means a mapping failure could be
ignored, causing the driver to proceed and later call dma_unmap_sg() with
invalid parameters.
[ ... ]
> diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
> --- a/drivers/crypto/qce/dma.c
> +++ b/drivers/crypto/qce/dma.c
[ ... ]
> +int qce_submit_cmd_desc(struct qce_device *qce)
> +{
[ ... ]
> + cookie = dmaengine_submit(qce_desc->dma_desc);
> +
> + ret = dma_submit_error(cookie);
> + if (ret)
> + goto err_unmap_sg;
> +
> + return 0;
> +
> +err_unmap_sg:
> + dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
> + return ret;
> +}
[Severity: Medium]
Does this error path leak the unsubmitted DMA descriptor? If the DMA
submission fails and we go to err_unmap_sg, the rollback bypasses
qce_dma_terminate_all() or any mechanism to flush the descriptor. Since the
DMA_CTRL_REUSE flag was not passed during preparation, attempting to explicitly
free it later will return an error without freeing the descriptor.
[ ... ]
> +static void qce_prep_dma_cmd_desc(struct qce_device *qce, struct qce_dma_data *dma,
> + unsigned int addr, void *buf)
> +{
[ ... ]
> + idx = bam_txn->bam_ce_idx;
> + bam_ce_buf = &bam_txn->bam_ce[idx];
> + bam_prep_ce_le32(bam_ce_buf, addr, BAM_WRITE_COMMAND, *((__le32 *)buf));
[ ... ]
> +void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val)
> +{
> + unsigned int reg_addr = ((unsigned int)(qce->base_phys) + offset);
> +
> + qce_prep_dma_cmd_desc(qce, &qce->dma, reg_addr, &val);
> +}
[Severity: High]
Is there an endianness mismatch here on big-endian architectures?
In qce_write_dma(), a CPU-endian u32 stack address is passed to
qce_prep_dma_cmd_desc(). That function then casts the pointer to __le32* and
reads it directly into the BAM descriptor without byte swapping. Will this
result in reversed bytes being written to the crypto hardware registers on
big-endian CPUs?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260629-qcom-qce-cmd-descr-v20-0-56f67da84c05@oss.qualcomm.com?part=13
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v20 14/14] crypto: qce - Communicate the base physical address to the dmaengine
2026-06-29 10:01 ` [PATCH v20 14/14] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
@ 2026-06-29 10:24 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-29 10:24 UTC (permalink / raw)
To: Bartosz Golaszewski; +Cc: dmaengine, Frank.Li, vkoul
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Ineffective descriptor cleanup on error paths leads to a memory leak because `dmaengine_desc_free()` is called on a non-reusable descriptor.
--
commit 7c11f5b2471b4e0caed5825bea7adc1ce06d76de
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
crypto: qce - Communicate the base physical address to the dmaengine
This commit attaches metadata containing the base physical address to the
DMA descriptor for the BAM DMA engine. It also adds error handling paths
to free the descriptor and unmap the scatterlist if the metadata attachment
or the DMA submission fails.
> diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
> index 1b43c56503334..6410f8dc5bcf5 100644
> --- a/drivers/crypto/qce/dma.c
> +++ b/drivers/crypto/qce/dma.c
[ ... ]
> @@ -60,15 +65,21 @@ int qce_submit_cmd_desc(struct qce_device *qce)
> goto err_unmap_sg;
> }
>
> + ret = dmaengine_desc_attach_metadata(dma_desc, &meta, sizeof(meta));
> + if (ret)
> + goto err_free_desc;
> +
> qce_desc->dma_desc = dma_desc;
> cookie = dmaengine_submit(qce_desc->dma_desc);
>
> ret = dma_submit_error(cookie);
> if (ret)
> - goto err_unmap_sg;
> + goto err_free_desc;
>
> return 0;
>
> +err_free_desc:
> + dmaengine_desc_free(dma_desc);
[Severity: Medium]
Does this error path successfully free the descriptor?
Earlier in qce_submit_cmd_desc(), the descriptor attributes are defined
without the reuse flag:
unsigned long attrs = DMA_PREP_CMD;
...
dma_desc = dmaengine_prep_slave_sg(chan, bam_txn->wr_sgl, mapped,
DMA_MEM_TO_DEV, attrs);
Since the descriptor is not prepared with the DMA_CTRL_REUSE flag,
dmaengine_desc_free() will return an error instead of freeing the memory:
include/linux/dmaengine.h:dmaengine_desc_free() {
/* this is supported for reusable desc, so check that */
if (!dmaengine_desc_test_reuse(desc))
return -EPERM;
return desc->desc_free(desc);
}
Could this cause a memory leak of the DMA descriptor if
dmaengine_desc_attach_metadata() fails or dma_submit_error() returns an error?
> err_unmap_sg:
> dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
> return ret;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260629-qcom-qce-cmd-descr-v20-0-56f67da84c05@oss.qualcomm.com?part=14
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v20 01/14] dmaengine: constify struct dma_descriptor_metadata_ops
2026-06-29 10:01 ` [PATCH v20 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
@ 2026-06-29 15:04 ` Pandey, Radhey Shyam
0 siblings, 0 replies; 25+ messages in thread
From: Pandey, Radhey Shyam @ 2026-06-29 15:04 UTC (permalink / raw)
To: Bartosz Golaszewski, Vinod Koul, Jonathan Corbet, Thara Gopinath,
Herbert Xu, David S. Miller, Udit Tiwari, Md Sadre Alam,
Dmitry Baryshkov, Manivannan Sadhasivam, Stephan Gerhold,
Bjorn Andersson, Peter Ujfalusi, Michal Simek, Frank Li,
Andy Gross, Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski
> There's no reason for the instances of this struct to be modifiable.
> Constify the pointer in struct dma_async_tx_descriptor and all drivers
> currently using it.
>
> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Thanks!
> ---
> drivers/dma/ti/k3-udma.c | 2 +-
> drivers/dma/xilinx/xilinx_dma.c | 2 +-
> include/linux/dmaengine.h | 2 +-
> 3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
> index 1cf158eb7bdb541c4e7f4f79f65ab70be4311fad..fb21e0df5ab7b20e4e16777b5ff7f61d2ae67b2b 100644
> --- a/drivers/dma/ti/k3-udma.c
> +++ b/drivers/dma/ti/k3-udma.c
> @@ -3408,7 +3408,7 @@ static int udma_set_metadata_len(struct dma_async_tx_descriptor *desc,
> return 0;
> }
>
> -static struct dma_descriptor_metadata_ops metadata_ops = {
> +static const struct dma_descriptor_metadata_ops metadata_ops = {
> .attach = udma_attach_metadata,
> .get_ptr = udma_get_metadata_ptr,
> .set_len = udma_set_metadata_len,
> diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
> index 404235c1735384635597e88edc25c67c7d250647..165b11a7c776abc6a8d66d631e19da669644577d 100644
> --- a/drivers/dma/xilinx/xilinx_dma.c
> +++ b/drivers/dma/xilinx/xilinx_dma.c
> @@ -653,7 +653,7 @@ static void *xilinx_dma_get_metadata_ptr(struct dma_async_tx_descriptor *tx,
> return seg->hw.app;
> }
>
> -static struct dma_descriptor_metadata_ops xilinx_dma_metadata_ops = {
> +static const struct dma_descriptor_metadata_ops xilinx_dma_metadata_ops = {
> .get_ptr = xilinx_dma_get_metadata_ptr,
> };
>
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index b3d251c9734e95e1b75cf6763d4d2c3a1c6a9910..5244edb90e7e7510bf4460b6a74ee2a7f91c1ccc 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -623,7 +623,7 @@ struct dma_async_tx_descriptor {
> void *callback_param;
> struct dmaengine_unmap_data *unmap;
> enum dma_desc_metadata_mode desc_metadata_mode;
> - struct dma_descriptor_metadata_ops *metadata_ops;
> + const struct dma_descriptor_metadata_ops *metadata_ops;
> #ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
> struct dma_async_tx_descriptor *next;
> struct dma_async_tx_descriptor *parent;
>
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2026-06-29 15:04 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-29 10:01 [PATCH v20 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 01/14] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
2026-06-29 15:04 ` Pandey, Radhey Shyam
2026-06-29 10:01 ` [PATCH v20 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path Bartosz Golaszewski
2026-06-29 10:18 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
2026-06-29 10:17 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
2026-06-29 10:18 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 06/14] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
2026-06-29 10:16 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 07/14] crypto: qce - Cancel work on device detach Bartosz Golaszewski
2026-06-29 10:15 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 08/14] crypto: qce - Include algapi.h in the core.h header Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 09/14] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request() Bartosz Golaszewski
2026-06-29 10:01 ` [PATCH v20 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
2026-06-29 10:17 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 12/14] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
2026-06-29 10:14 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 13/14] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
2026-06-29 10:22 ` sashiko-bot
2026-06-29 10:01 ` [PATCH v20 14/14] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
2026-06-29 10:24 ` sashiko-bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox