* [PATCH v12 01/12] dmaengine: constify struct dma_descriptor_metadata_ops
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-11 12:43 ` Manivannan Sadhasivam
2026-03-10 15:44 ` [PATCH v12 02/12] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
` (11 subsequent siblings)
12 siblings, 1 reply; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
There's no reason for the instances of this struct to be modifiable.
Constify the pointer in struct dma_async_tx_descriptor and all drivers
currently using it.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/ti/k3-udma.c | 2 +-
drivers/dma/xilinx/xilinx_dma.c | 2 +-
include/linux/dmaengine.h | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
index c964ebfcf3b68d86e4bbc9b62bad2212f0ce3ee9..8a2f235b669aaf084a6f7b3e6b23d06b04768608 100644
--- a/drivers/dma/ti/k3-udma.c
+++ b/drivers/dma/ti/k3-udma.c
@@ -3408,7 +3408,7 @@ static int udma_set_metadata_len(struct dma_async_tx_descriptor *desc,
return 0;
}
-static struct dma_descriptor_metadata_ops metadata_ops = {
+static const struct dma_descriptor_metadata_ops metadata_ops = {
.attach = udma_attach_metadata,
.get_ptr = udma_get_metadata_ptr,
.set_len = udma_set_metadata_len,
diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index b53292e02448fe528f1ae9ba33b4bcf408f89fd6..97b934ca54101ea699e3ab28d419bed1b45dee4a 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -653,7 +653,7 @@ static void *xilinx_dma_get_metadata_ptr(struct dma_async_tx_descriptor *tx,
return seg->hw.app;
}
-static struct dma_descriptor_metadata_ops xilinx_dma_metadata_ops = {
+static const struct dma_descriptor_metadata_ops xilinx_dma_metadata_ops = {
.get_ptr = xilinx_dma_get_metadata_ptr,
};
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 99efe2b9b4ea9844ca6161208362ef18ef111d96..92566c4c100e98f48750de21249ae3b5de06c763 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -623,7 +623,7 @@ struct dma_async_tx_descriptor {
void *callback_param;
struct dmaengine_unmap_data *unmap;
enum dma_desc_metadata_mode desc_metadata_mode;
- struct dma_descriptor_metadata_ops *metadata_ops;
+ const struct dma_descriptor_metadata_ops *metadata_ops;
#ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
struct dma_async_tx_descriptor *next;
struct dma_async_tx_descriptor *parent;
--
2.47.3
* Re: [PATCH v12 01/12] dmaengine: constify struct dma_descriptor_metadata_ops
2026-03-10 15:44 ` [PATCH v12 01/12] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
@ 2026-03-11 12:43 ` Manivannan Sadhasivam
0 siblings, 0 replies; 23+ messages in thread
From: Manivannan Sadhasivam @ 2026-03-11 12:43 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li,
dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski
On Tue, Mar 10, 2026 at 04:44:15PM +0100, Bartosz Golaszewski wrote:
> There's no reason for the instances of this struct to be modifiable.
> Constify the pointer in struct dma_async_tx_descriptor and all drivers
> currently using it.
>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
- Mani
> [quoted patch body snipped; identical to the patch above]
--
மணிவண்ணன் சதாசிவம்
* [PATCH v12 02/12] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
2026-03-10 15:44 ` [PATCH v12 01/12] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-11 12:45 ` Manivannan Sadhasivam
2026-03-10 15:44 ` [PATCH v12 03/12] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
` (10 subsequent siblings)
12 siblings, 1 reply; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
Dmitry Baryshkov, Bjorn Andersson
BH workqueues are a modern mechanism, aiming to replace legacy tasklets.
Let's convert the BAM DMA driver to use the high-priority variant of
the BH workqueue.
[Vinod: suggested using the BH workqueue instead of the regular one
running in process context]
Suggested-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 19116295f8325767a0d97a7848077885b118241c..c8601bac555edf1bb4384fd39cb3449ec6e86334 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -42,6 +42,7 @@
#include <linux/pm_runtime.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>
+#include <linux/workqueue.h>
#include "../dmaengine.h"
#include "../virt-dma.h"
@@ -397,8 +398,8 @@ struct bam_device {
struct clk *bamclk;
int irq;
- /* dma start transaction tasklet */
- struct tasklet_struct task;
+ /* dma start transaction workqueue */
+ struct work_struct work;
};
/**
@@ -863,7 +864,7 @@ static u32 process_channel_irqs(struct bam_device *bdev)
/*
* if complete, process cookie. Otherwise
* push back to front of desc_issued so that
- * it gets restarted by the tasklet
+ * it gets restarted by the work queue.
*/
if (!async_desc->num_desc) {
vchan_cookie_complete(&async_desc->vd);
@@ -893,9 +894,9 @@ static irqreturn_t bam_dma_irq(int irq, void *data)
srcs |= process_channel_irqs(bdev);
- /* kick off tasklet to start next dma transfer */
+ /* kick off the work queue to start next dma transfer */
if (srcs & P_IRQ)
- tasklet_schedule(&bdev->task);
+ queue_work(system_bh_highpri_wq, &bdev->work);
ret = pm_runtime_get_sync(bdev->dev);
if (ret < 0)
@@ -1091,14 +1092,14 @@ static void bam_start_dma(struct bam_chan *bchan)
}
/**
- * dma_tasklet - DMA IRQ tasklet
- * @t: tasklet argument (bam controller structure)
+ * bam_dma_work() - DMA interrupt work queue callback
+ * @work: work queue struct embedded in the BAM controller device struct
*
* Sets up next DMA operation and then processes all completed transactions
*/
-static void dma_tasklet(struct tasklet_struct *t)
+static void bam_dma_work(struct work_struct *work)
{
- struct bam_device *bdev = from_tasklet(bdev, t, task);
+ struct bam_device *bdev = from_work(bdev, work, work);
struct bam_chan *bchan;
unsigned int i;
@@ -1111,14 +1112,13 @@ static void dma_tasklet(struct tasklet_struct *t)
if (!list_empty(&bchan->vc.desc_issued) && !IS_BUSY(bchan))
bam_start_dma(bchan);
}
-
}
/**
* bam_issue_pending - starts pending transactions
* @chan: dma channel
*
- * Calls tasklet directly which in turn starts any pending transactions
+ * Calls work queue directly which in turn starts any pending transactions
*/
static void bam_issue_pending(struct dma_chan *chan)
{
@@ -1286,14 +1286,14 @@ static int bam_dma_probe(struct platform_device *pdev)
if (ret)
goto err_disable_clk;
- tasklet_setup(&bdev->task, dma_tasklet);
+ INIT_WORK(&bdev->work, bam_dma_work);
bdev->channels = devm_kcalloc(bdev->dev, bdev->num_channels,
sizeof(*bdev->channels), GFP_KERNEL);
if (!bdev->channels) {
ret = -ENOMEM;
- goto err_tasklet_kill;
+ goto err_workqueue_cancel;
}
/* allocate and initialize channels */
@@ -1358,8 +1358,8 @@ static int bam_dma_probe(struct platform_device *pdev)
err_bam_channel_exit:
for (i = 0; i < bdev->num_channels; i++)
tasklet_kill(&bdev->channels[i].vc.task);
-err_tasklet_kill:
- tasklet_kill(&bdev->task);
+err_workqueue_cancel:
+ cancel_work_sync(&bdev->work);
err_disable_clk:
clk_disable_unprepare(bdev->bamclk);
@@ -1393,7 +1393,7 @@ static void bam_dma_remove(struct platform_device *pdev)
bdev->channels[i].fifo_phys);
}
- tasklet_kill(&bdev->task);
+ cancel_work_sync(&bdev->work);
clk_disable_unprepare(bdev->bamclk);
}
--
2.47.3
* Re: [PATCH v12 02/12] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue
2026-03-10 15:44 ` [PATCH v12 02/12] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
@ 2026-03-11 12:45 ` Manivannan Sadhasivam
0 siblings, 0 replies; 23+ messages in thread
From: Manivannan Sadhasivam @ 2026-03-11 12:45 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li,
dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Dmitry Baryshkov,
Bjorn Andersson
On Tue, Mar 10, 2026 at 04:44:16PM +0100, Bartosz Golaszewski wrote:
> BH workqueues are a modern mechanism, aiming to replace legacy tasklets.
> Let's convert the BAM DMA driver to use the high-priority variant of
> the BH workqueue.
>
> [Vinod: suggested using the BH workqueue instead of the regular one
> running in process context]
>
> Suggested-by: Vinod Koul <vkoul@kernel.org>
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
> Reviewed-by: Bjorn Andersson <andersson@kernel.org>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
- Mani
> [quoted patch body snipped; identical to the patch above]
--
மணிவண்ணன் சதாசிவம்
* [PATCH v12 03/12] dmaengine: qcom: bam_dma: Extend the driver's device match data
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
2026-03-10 15:44 ` [PATCH v12 01/12] dmaengine: constify struct dma_descriptor_metadata_ops Bartosz Golaszewski
2026-03-10 15:44 ` [PATCH v12 02/12] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-11 12:47 ` Manivannan Sadhasivam
2026-03-10 15:44 ` [PATCH v12 04/12] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
` (9 subsequent siblings)
12 siblings, 1 reply; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
In preparation for supporting the pipe locking feature flag, extend the
amount of information we can carry in device match data: create a
separate structure and make the register information one of its fields.
This way, in subsequent patches, it will be just a matter of adding a
new field to the device data.
Reviewed-by: Dmitry Baryshkov <lumag@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 28 ++++++++++++++++++++++------
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index c8601bac555edf1bb4384fd39cb3449ec6e86334..8f6d03f6c673b57ed13aeca6c8331c71596d077b 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -113,6 +113,10 @@ struct reg_offset_data {
unsigned int pipe_mult, evnt_mult, ee_mult;
};
+struct bam_device_data {
+ const struct reg_offset_data *reg_info;
+};
+
static const struct reg_offset_data bam_v1_3_reg_info[] = {
[BAM_CTRL] = { 0x0F80, 0x00, 0x00, 0x00 },
[BAM_REVISION] = { 0x0F84, 0x00, 0x00, 0x00 },
@@ -142,6 +146,10 @@ static const struct reg_offset_data bam_v1_3_reg_info[] = {
[BAM_P_FIFO_SIZES] = { 0x1020, 0x00, 0x40, 0x00 },
};
+static const struct bam_device_data bam_v1_3_data = {
+ .reg_info = bam_v1_3_reg_info,
+};
+
static const struct reg_offset_data bam_v1_4_reg_info[] = {
[BAM_CTRL] = { 0x0000, 0x00, 0x00, 0x00 },
[BAM_REVISION] = { 0x0004, 0x00, 0x00, 0x00 },
@@ -171,6 +179,10 @@ static const struct reg_offset_data bam_v1_4_reg_info[] = {
[BAM_P_FIFO_SIZES] = { 0x1820, 0x00, 0x1000, 0x00 },
};
+static const struct bam_device_data bam_v1_4_data = {
+ .reg_info = bam_v1_4_reg_info,
+};
+
static const struct reg_offset_data bam_v1_7_reg_info[] = {
[BAM_CTRL] = { 0x00000, 0x00, 0x00, 0x00 },
[BAM_REVISION] = { 0x01000, 0x00, 0x00, 0x00 },
@@ -200,6 +212,10 @@ static const struct reg_offset_data bam_v1_7_reg_info[] = {
[BAM_P_FIFO_SIZES] = { 0x13820, 0x00, 0x1000, 0x00 },
};
+static const struct bam_device_data bam_v1_7_data = {
+ .reg_info = bam_v1_7_reg_info,
+};
+
/* BAM CTRL */
#define BAM_SW_RST BIT(0)
#define BAM_EN BIT(1)
@@ -393,7 +409,7 @@ struct bam_device {
bool powered_remotely;
u32 active_channels;
- const struct reg_offset_data *layout;
+ const struct bam_device_data *dev_data;
struct clk *bamclk;
int irq;
@@ -411,7 +427,7 @@ struct bam_device {
static inline void __iomem *bam_addr(struct bam_device *bdev, u32 pipe,
enum bam_reg reg)
{
- const struct reg_offset_data r = bdev->layout[reg];
+ const struct reg_offset_data r = bdev->dev_data->reg_info[reg];
return bdev->regs + r.base_offset +
r.pipe_mult * pipe +
@@ -1205,9 +1221,9 @@ static void bam_channel_init(struct bam_device *bdev, struct bam_chan *bchan,
}
static const struct of_device_id bam_of_match[] = {
- { .compatible = "qcom,bam-v1.3.0", .data = &bam_v1_3_reg_info },
- { .compatible = "qcom,bam-v1.4.0", .data = &bam_v1_4_reg_info },
- { .compatible = "qcom,bam-v1.7.0", .data = &bam_v1_7_reg_info },
+ { .compatible = "qcom,bam-v1.3.0", .data = &bam_v1_3_data },
+ { .compatible = "qcom,bam-v1.4.0", .data = &bam_v1_4_data },
+ { .compatible = "qcom,bam-v1.7.0", .data = &bam_v1_7_data },
{}
};
@@ -1231,7 +1247,7 @@ static int bam_dma_probe(struct platform_device *pdev)
return -ENODEV;
}
- bdev->layout = match->data;
+ bdev->dev_data = match->data;
bdev->regs = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(bdev->regs))
--
2.47.3
* Re: [PATCH v12 03/12] dmaengine: qcom: bam_dma: Extend the driver's device match data
2026-03-10 15:44 ` [PATCH v12 03/12] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
@ 2026-03-11 12:47 ` Manivannan Sadhasivam
0 siblings, 0 replies; 23+ messages in thread
From: Manivannan Sadhasivam @ 2026-03-11 12:47 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li,
dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski
On Tue, Mar 10, 2026 at 04:44:17PM +0100, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> In preparation for supporting the pipe locking feature flag, extend the
> amount of information we can carry in device match data: create a
> separate structure and make the register information one of its fields.
> This way, in subsequent patches, it will be just a matter of adding a
> new field to the device data.
>
Nit: s/patches/commits
> Reviewed-by: Dmitry Baryshkov <lumag@kernel.org>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
- Mani
> [quoted patch body snipped; identical to the patch above]
--
மணிவண்ணன் சதாசிவம்
* [PATCH v12 04/12] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (2 preceding siblings ...)
2026-03-10 15:44 ` [PATCH v12 03/12] dmaengine: qcom: bam_dma: Extend the driver's device match data Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-11 12:52 ` Manivannan Sadhasivam
2026-03-10 15:44 ` [PATCH v12 05/12] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
` (8 subsequent siblings)
12 siblings, 1 reply; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
Dmitry Baryshkov
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Extend the device match data with a flag indicating whether the IP
supports the BAM lock/unlock feature. Set it to true on BAM IP versions
1.4.0 and above.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 8f6d03f6c673b57ed13aeca6c8331c71596d077b..83491e7c2f17d8c9d12a1a055baea7e3a0a75a53 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -115,6 +115,7 @@ struct reg_offset_data {
struct bam_device_data {
const struct reg_offset_data *reg_info;
+ bool pipe_lock_supported;
};
static const struct reg_offset_data bam_v1_3_reg_info[] = {
@@ -181,6 +182,7 @@ static const struct reg_offset_data bam_v1_4_reg_info[] = {
static const struct bam_device_data bam_v1_4_data = {
.reg_info = bam_v1_4_reg_info,
+ .pipe_lock_supported = true,
};
static const struct reg_offset_data bam_v1_7_reg_info[] = {
@@ -214,6 +216,7 @@ static const struct reg_offset_data bam_v1_7_reg_info[] = {
static const struct bam_device_data bam_v1_7_data = {
.reg_info = bam_v1_7_reg_info,
+ .pipe_lock_supported = true,
};
/* BAM CTRL */
--
2.47.3
* Re: [PATCH v12 04/12] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support
2026-03-10 15:44 ` [PATCH v12 04/12] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
@ 2026-03-11 12:52 ` Manivannan Sadhasivam
0 siblings, 0 replies; 23+ messages in thread
From: Manivannan Sadhasivam @ 2026-03-11 12:52 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li,
dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Dmitry Baryshkov
On Tue, Mar 10, 2026 at 04:44:18PM +0100, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> Extend the device match data with a flag indicating whether the IP
> supports the BAM lock/unlock feature. Set it to true on BAM IP versions
> 1.4.0 and above.
>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Nit: I'd move the 'pipe_lock_supported = true' hunks to after the next patch, so
that the order becomes: add pipe lock support in the driver, then enable it for
the supported versions.
- Mani
> [quoted patch body snipped; identical to the patch above]
--
மணிவண்ணன் சதாசிவம்
* [PATCH v12 05/12] dmaengine: qcom: bam_dma: add support for BAM locking
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (3 preceding siblings ...)
2026-03-10 15:44 ` [PATCH v12 04/12] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-11 10:00 ` Stephan Gerhold
2026-03-11 13:26 ` Manivannan Sadhasivam
2026-03-10 15:44 ` [PATCH v12 06/12] crypto: qce - Include algapi.h in the core.h header Bartosz Golaszewski
` (7 subsequent siblings)
12 siblings, 2 replies; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
Add support for BAM pipe locking. To that end, when starting DMA on an RX
channel, prepend the existing queue of issued descriptors with an
additional "dummy" command descriptor with the LOCK bit set. Once the
transaction is done (no more issued descriptors), issue one more dummy
descriptor with the UNLOCK bit set.
We *must* wait until the transaction is signalled as done because we
must not perform any writes into config registers while the engine is
busy.
The dummy writes must be issued into a scratchpad register of the client,
so provide a mechanism to communicate the right address via descriptor
metadata.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 175 ++++++++++++++++++++++++++++++++++++++-
include/linux/dma/qcom_bam_dma.h | 4 +
2 files changed, 176 insertions(+), 3 deletions(-)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 83491e7c2f17d8c9d12a1a055baea7e3a0a75a53..627c85a2df4dcdbac247d831a4aef047c2189456 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -28,11 +28,13 @@
#include <linux/clk.h>
#include <linux/device.h>
#include <linux/dma-mapping.h>
+#include <linux/dma/qcom_bam_dma.h>
#include <linux/dmaengine.h>
#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/kernel.h>
+#include <linux/lockdep.h>
#include <linux/module.h>
#include <linux/of_address.h>
#include <linux/of_dma.h>
@@ -60,6 +62,8 @@ struct bam_desc_hw {
#define DESC_FLAG_EOB BIT(13)
#define DESC_FLAG_NWD BIT(12)
#define DESC_FLAG_CMD BIT(11)
+#define DESC_FLAG_LOCK BIT(10)
+#define DESC_FLAG_UNLOCK BIT(9)
struct bam_async_desc {
struct virt_dma_desc vd;
@@ -391,6 +395,14 @@ struct bam_chan {
struct list_head desc_list;
struct list_head node;
+
+ /* BAM locking infrastructure */
+ bool locked;
+ phys_addr_t scratchpad_addr;
+ struct scatterlist lock_sg;
+ struct scatterlist unlock_sg;
+ struct bam_cmd_element lock_ce;
+ struct bam_cmd_element unlock_ce;
};
static inline struct bam_chan *to_bam_chan(struct dma_chan *common)
@@ -652,6 +664,27 @@ static int bam_slave_config(struct dma_chan *chan,
return 0;
}
+static int bam_metadata_attach(struct dma_async_tx_descriptor *desc, void *data, size_t len)
+{
+ struct bam_chan *bchan = to_bam_chan(desc->chan);
+ const struct bam_device_data *bdata = bchan->bdev->dev_data;
+ struct bam_desc_metadata *metadata = data;
+
+ if (!data)
+ return -EINVAL;
+
+ if (!bdata->pipe_lock_supported)
+ return -EOPNOTSUPP;
+
+ bchan->scratchpad_addr = metadata->scratchpad_addr;
+
+ return 0;
+}
+
+static const struct dma_descriptor_metadata_ops bam_metadata_ops = {
+ .attach = bam_metadata_attach,
+};
+
/**
* bam_prep_slave_sg - Prep slave sg transaction
*
@@ -668,6 +701,7 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
void *context)
{
struct bam_chan *bchan = to_bam_chan(chan);
+ struct dma_async_tx_descriptor *tx_desc;
struct bam_device *bdev = bchan->bdev;
struct bam_async_desc *async_desc;
struct scatterlist *sg;
@@ -723,7 +757,12 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
} while (remainder > 0);
}
- return vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
+ tx_desc = vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
+ if (!tx_desc)
+ return NULL;
+
+ tx_desc->metadata_ops = &bam_metadata_ops;
+ return tx_desc;
}
/**
@@ -1012,6 +1051,112 @@ static void bam_apply_new_config(struct bam_chan *bchan,
bchan->reconfigure = 0;
}
+static struct bam_async_desc *
+bam_make_lock_desc(struct bam_chan *bchan, struct scatterlist *sg,
+ struct bam_cmd_element *ce, unsigned long flag)
+{
+ struct dma_chan *chan = &bchan->vc.chan;
+ struct bam_async_desc *async_desc;
+ struct bam_desc_hw *desc;
+ struct virt_dma_desc *vd;
+ struct virt_dma_chan *vc;
+ unsigned int mapped;
+ dma_cookie_t cookie;
+ int ret;
+
+ sg_init_table(sg, 1);
+
+ async_desc = kzalloc_flex(*async_desc, desc, 1, GFP_NOWAIT);
+ if (!async_desc) {
+ dev_err(bchan->bdev->dev, "failed to allocate the BAM lock descriptor\n");
+ return NULL;
+ }
+
+ async_desc->num_desc = 1;
+ async_desc->curr_desc = async_desc->desc;
+ async_desc->dir = DMA_MEM_TO_DEV;
+
+ desc = async_desc->desc;
+
+ bam_prep_ce_le32(ce, bchan->scratchpad_addr, BAM_WRITE_COMMAND, 0);
+ sg_set_buf(sg, ce, sizeof(*ce));
+
+ mapped = dma_map_sg_attrs(chan->slave, sg, 1, DMA_TO_DEVICE, DMA_PREP_CMD);
+ if (!mapped) {
+ kfree(async_desc);
+ return NULL;
+ }
+
+ desc->flags |= cpu_to_le16(DESC_FLAG_CMD | flag);
+ desc->addr = sg_dma_address(sg);
+ desc->size = sizeof(struct bam_cmd_element);
+
+ vc = &bchan->vc;
+ vd = &async_desc->vd;
+
+ dma_async_tx_descriptor_init(&vd->tx, &vc->chan);
+ vd->tx.flags = DMA_PREP_CMD;
+ vd->tx.desc_free = vchan_tx_desc_free;
+ vd->tx_result.result = DMA_TRANS_NOERROR;
+ vd->tx_result.residue = 0;
+
+ cookie = dma_cookie_assign(&vd->tx);
+ ret = dma_submit_error(cookie);
+ if (ret)
+ return NULL;
+
+ return async_desc;
+}
+
+static int bam_do_setup_pipe_lock(struct bam_chan *bchan, bool lock)
+{
+ struct bam_device *bdev = bchan->bdev;
+ const struct bam_device_data *bdata = bdev->dev_data;
+ struct bam_async_desc *lock_desc;
+ struct bam_cmd_element *ce;
+ struct scatterlist *sgl;
+ unsigned long flag;
+
+ lockdep_assert_held(&bchan->vc.lock);
+
+ if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr ||
+ bchan->slave.direction != DMA_MEM_TO_DEV)
+ return 0;
+
+ if (lock) {
+ sgl = &bchan->lock_sg;
+ ce = &bchan->lock_ce;
+ flag = DESC_FLAG_LOCK;
+ } else {
+ sgl = &bchan->unlock_sg;
+ ce = &bchan->unlock_ce;
+ flag = DESC_FLAG_UNLOCK;
+ }
+
+ lock_desc = bam_make_lock_desc(bchan, sgl, ce, flag);
+ if (!lock_desc)
+ return -ENOMEM;
+
+ if (lock)
+ list_add(&lock_desc->vd.node, &bchan->vc.desc_issued);
+ else
+ list_add_tail(&lock_desc->vd.node, &bchan->vc.desc_issued);
+
+ bchan->locked = lock;
+
+ return 0;
+}
+
+static int bam_setup_pipe_lock(struct bam_chan *bchan)
+{
+ return bam_do_setup_pipe_lock(bchan, true);
+}
+
+static int bam_setup_pipe_unlock(struct bam_chan *bchan)
+{
+ return bam_do_setup_pipe_lock(bchan, false);
+}
+
/**
* bam_start_dma - start next transaction
* @bchan: bam dma channel
@@ -1121,6 +1266,7 @@ static void bam_dma_work(struct work_struct *work)
struct bam_device *bdev = from_work(bdev, work, work);
struct bam_chan *bchan;
unsigned int i;
+ int ret;
/* go through the channels and kick off transactions */
for (i = 0; i < bdev->num_channels; i++) {
@@ -1128,6 +1274,13 @@ static void bam_dma_work(struct work_struct *work)
guard(spinlock_irqsave)(&bchan->vc.lock);
+ if (list_empty(&bchan->vc.desc_issued) && bchan->locked) {
+ ret = bam_setup_pipe_unlock(bchan);
+ if (ret)
+ dev_err(bchan->vc.chan.slave,
+ "Failed to set up the pipe unlock descriptor\n");
+ }
+
if (!list_empty(&bchan->vc.desc_issued) && !IS_BUSY(bchan))
bam_start_dma(bchan);
}
@@ -1142,9 +1295,17 @@ static void bam_dma_work(struct work_struct *work)
static void bam_issue_pending(struct dma_chan *chan)
{
struct bam_chan *bchan = to_bam_chan(chan);
+ int ret;
guard(spinlock_irqsave)(&bchan->vc.lock);
+ if (!bchan->locked) {
+ ret = bam_setup_pipe_lock(bchan);
+ if (ret)
+ dev_err(bchan->vc.chan.slave,
+ "Failed to set up the pipe lock descriptor\n");
+ }
+
/* if work pending and idle, start a transaction */
if (vchan_issue_pending(&bchan->vc) && !IS_BUSY(bchan))
bam_start_dma(bchan);
@@ -1157,8 +1318,15 @@ static void bam_issue_pending(struct dma_chan *chan)
*/
static void bam_dma_free_desc(struct virt_dma_desc *vd)
{
- struct bam_async_desc *async_desc = container_of(vd,
- struct bam_async_desc, vd);
+ struct bam_async_desc *async_desc = container_of(vd, struct bam_async_desc, vd);
+ struct bam_desc_hw *desc = async_desc->desc;
+ struct dma_chan *chan = vd->tx.chan;
+ struct bam_chan *bchan = to_bam_chan(chan);
+
+ if (le16_to_cpu(desc->flags) & DESC_FLAG_LOCK)
+ dma_unmap_sg(chan->slave, &bchan->lock_sg, 1, DMA_TO_DEVICE);
+ else if (le16_to_cpu(desc->flags) & DESC_FLAG_UNLOCK)
+ dma_unmap_sg(chan->slave, &bchan->unlock_sg, 1, DMA_TO_DEVICE);
kfree(async_desc);
}
@@ -1350,6 +1518,7 @@ static int bam_dma_probe(struct platform_device *pdev)
bdev->common.device_terminate_all = bam_dma_terminate_all;
bdev->common.device_issue_pending = bam_issue_pending;
bdev->common.device_tx_status = bam_tx_status;
+ bdev->common.desc_metadata_modes = DESC_METADATA_CLIENT;
bdev->common.dev = bdev->dev;
ret = dma_async_device_register(&bdev->common);
diff --git a/include/linux/dma/qcom_bam_dma.h b/include/linux/dma/qcom_bam_dma.h
index 68fc0e643b1b97fe4520d5878daa322b81f4f559..f85e0c72407b5e1a733750ac87bbaba6af6e8c78 100644
--- a/include/linux/dma/qcom_bam_dma.h
+++ b/include/linux/dma/qcom_bam_dma.h
@@ -34,6 +34,10 @@ enum bam_command_type {
BAM_READ_COMMAND,
};
+struct bam_desc_metadata {
+ phys_addr_t scratchpad_addr;
+};
+
/*
* prep_bam_ce_le32 - Wrapper function to prepare a single BAM command
* element with the data already in le32 format.
--
2.47.3
^ permalink raw reply related	[flat|nested] 23+ messages in thread
* Re: [PATCH v12 05/12] dmaengine: qcom: bam_dma: add support for BAM locking
2026-03-10 15:44 ` [PATCH v12 05/12] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
@ 2026-03-11 10:00 ` Stephan Gerhold
2026-03-11 10:32 ` Bartosz Golaszewski
2026-03-11 13:26 ` Manivannan Sadhasivam
1 sibling, 1 reply; 23+ messages in thread
From: Stephan Gerhold @ 2026-03-11 10:00 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li,
dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski
On Tue, Mar 10, 2026 at 04:44:19PM +0100, Bartosz Golaszewski wrote:
> Add support for BAM pipe locking. To that end: when starting DMA on an RX
> channel - prepend the existing queue of issued descriptors with an
> additional "dummy" command descriptor with the LOCK bit set. Once the
> transaction is done (no more issued descriptors), issue one more dummy
> descriptor with the UNLOCK bit.
>
> We *must* wait until the transaction is signalled as done because we
> must not perform any writes into config registers while the engine is
> busy.
>
> [...]
> diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
> index 83491e7c2f17d8c9d12a1a055baea7e3a0a75a53..627c85a2df4dcdbac247d831a4aef047c2189456 100644
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c
> [...]
> +static int bam_do_setup_pipe_lock(struct bam_chan *bchan, bool lock)
> +{
> + struct bam_device *bdev = bchan->bdev;
> + const struct bam_device_data *bdata = bdev->dev_data;
> + struct bam_async_desc *lock_desc;
> + struct bam_cmd_element *ce;
> + struct scatterlist *sgl;
> + unsigned long flag;
> +
> + lockdep_assert_held(&bchan->vc.lock);
> +
> + if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr ||
> + bchan->slave.direction != DMA_MEM_TO_DEV)
> + return 0;
> +
> + if (lock) {
> + sgl = &bchan->lock_sg;
> + ce = &bchan->lock_ce;
> + flag = DESC_FLAG_LOCK;
> + } else {
> + sgl = &bchan->unlock_sg;
> + ce = &bchan->unlock_ce;
> + flag = DESC_FLAG_UNLOCK;
> + }
> +
> + lock_desc = bam_make_lock_desc(bchan, sgl, ce, flag);
> + if (!lock_desc)
> + return -ENOMEM;
> +
> + if (lock)
> + list_add(&lock_desc->vd.node, &bchan->vc.desc_issued);
> + else
> + list_add_tail(&lock_desc->vd.node, &bchan->vc.desc_issued);
> +
> + bchan->locked = lock;
> +
> + return 0;
> +}
> +
> +static int bam_setup_pipe_lock(struct bam_chan *bchan)
> +{
> + return bam_do_setup_pipe_lock(bchan, true);
> +}
> +
> +static int bam_setup_pipe_unlock(struct bam_chan *bchan)
> +{
> + return bam_do_setup_pipe_lock(bchan, false);
> +}
> +
> /**
> * bam_start_dma - start next transaction
> * @bchan: bam dma channel
> @@ -1121,6 +1266,7 @@ static void bam_dma_work(struct work_struct *work)
> struct bam_device *bdev = from_work(bdev, work, work);
> struct bam_chan *bchan;
> unsigned int i;
> + int ret;
>
> /* go through the channels and kick off transactions */
> for (i = 0; i < bdev->num_channels; i++) {
> @@ -1128,6 +1274,13 @@ static void bam_dma_work(struct work_struct *work)
>
> guard(spinlock_irqsave)(&bchan->vc.lock);
>
> + if (list_empty(&bchan->vc.desc_issued) && bchan->locked) {
> + ret = bam_setup_pipe_unlock(bchan);
> + if (ret)
> + dev_err(bchan->vc.chan.slave,
> + "Failed to set up the pipe unlock descriptor\n");
> + }
> +
> if (!list_empty(&bchan->vc.desc_issued) && !IS_BUSY(bchan))
> bam_start_dma(bchan);
> }
I'm not entirely sure if this actually guarantees waiting with the
unlock until the transaction is "done", for two reasons:
1. &bchan->vc.desc_issued looks like a "TODO" list for descriptors we
haven't fully managed to squeeze into the BAM FIFO yet. It doesn't
tell you which descriptors have been consumed and finished
processing inside the FIFO.
Consider e.g. the following case: The client has issued a number of
descriptors, they all fit into the FIFO. The first descriptor has a
callback assigned, so we ask the BAM to send us an interrupt when it
has been consumed. We get the interrupt for the first descriptor and
process_channel_irqs() marks it as completed, the rest of the
descriptors are still pending. &bchan->vc.desc_issued is empty, so
you queue the unlock command before the rest of the descriptors have
finished.
2. From reading the BAM chapter in the APQ8016E TRM I get the
impression that by default an interrupt for a descriptor just tells
you that the descriptor was consumed by the BAM (and forwarded to
the peripheral). If you want to guarantee that the transaction is
actually done on the peripheral side before allowing writes into
config registers, you would need to set the NWD (Notify When Done)
bit (aka DMA_PREP_FENCE) on the last descriptor before the unlock
command.
NWD seems to stall descriptor processing until the peripheral
signals completion, so this might allow you to immediately queue the
unlock command like in v11. The downside is that you would need to
make assumptions about the set of commands submitted by the client
again... The downstream driver seems to set NWD on the data
descriptor immediately before the UNLOCK command [1].
The chapter in the APQ8016E TRM kind of contradicts itself
sometimes, but there is this sentence for example: "On the data
descriptor preceding command descriptor, NWD bit must be asserted in
order to assure that all the data has been transferred and the
peripheral is ready to be re-configured."
Thanks,
Stephan
[1]: https://git.codelinaro.org/clo/la/platform/vendor/qcom/opensource/securemsm-kernel/-/blob/fa55a96773d3fbfcd96beb2965efcaaae5697816/crypto-qti/qce50.c#L5361-5362
^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: [PATCH v12 05/12] dmaengine: qcom: bam_dma: add support for BAM locking
2026-03-11 10:00 ` Stephan Gerhold
@ 2026-03-11 10:32 ` Bartosz Golaszewski
2026-03-11 11:25 ` Stephan Gerhold
0 siblings, 1 reply; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-11 10:32 UTC (permalink / raw)
To: Stephan Gerhold
Cc: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li,
dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
On Wed, Mar 11, 2026 at 11:00 AM Stephan Gerhold
<stephan.gerhold@linaro.org> wrote:
>
> I'm not entirely sure if this actually guarantees waiting with the
> unlock until the transaction is "done", for two reasons:
>
> 1. &bchan->vc.desc_issued looks like a "TODO" list for descriptors we
> haven't fully managed to squeeze into the BAM FIFO yet. It doesn't
> tell you which descriptors have been consumed and finished
> processing inside the FIFO.
>
> Consider e.g. the following case: The client has issued a number of
> descriptors, they all fit into the FIFO. The first descriptor has a
> callback assigned, so we ask the BAM to send us an interrupt when it
> has been consumed. We get the interrupt for the first descriptor and
> process_channel_irqs() marks it as completed, the rest of the
> descriptors are still pending. &bchan->vc.desc_issued is empty, so
> you queue the unlock command before the rest of the descriptors have
> finished.
>
Thanks for looking into it. Good catch, I think you're right.
> 2. From reading the BAM chapter in the APQ8016E TRM I get the
> impression that by default an interrupt for a descriptor just tells
> you that the descriptor was consumed by the BAM (and forwarded to
> the peripheral). If you want to guarantee that the transaction is
> actually done on the peripheral side before allowing writes into
> config registers, you would need to set the NWD (Notify When Done)
> bit (aka DMA_PREP_FENCE) on the last descriptor before the unlock
> command.
>
> NWD seems to stall descriptor processing until the peripheral
> signals completion, so this might allow you to immediately queue the
> unlock command like in v11. The downside is that you would need to
> make assumptions about the set of commands submitted by the client
> again... The downstream driver seems to set NWD on the data
> descriptor immediately before the UNLOCK command [1].
>
If what we have in the queue is:
[DATA] [DATA] [DATA] [CMD]
And we want to extend it with LOCK/UNLOCK like so:
[LOCK] [DATA] [DATA] [DATA] [CMD] [UNLOCK]
Should the NWD go with the last DATA descriptor, or with the last
descriptor period, whether data or command?
It's, again, not very clear from reading that part.
Bart
> The chapter in the APQ8016E TRM kind of contradicts itself
> sometimes, but there is this sentence for example: "On the data
> descriptor preceding command descriptor, NWD bit must be asserted in
> order to assure that all the data has been transferred and the
> peripheral is ready to be re-configured."
>
> Thanks,
> Stephan
>
> [1]: https://git.codelinaro.org/clo/la/platform/vendor/qcom/opensource/securemsm-kernel/-/blob/fa55a96773d3fbfcd96beb2965efcaaae5697816/crypto-qti/qce50.c#L5361-5362
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v12 05/12] dmaengine: qcom: bam_dma: add support for BAM locking
2026-03-11 10:32 ` Bartosz Golaszewski
@ 2026-03-11 11:25 ` Stephan Gerhold
0 siblings, 0 replies; 23+ messages in thread
From: Stephan Gerhold @ 2026-03-11 11:25 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li,
dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, Bartosz Golaszewski, Bartosz Golaszewski
On Wed, Mar 11, 2026 at 03:32:38AM -0700, Bartosz Golaszewski wrote:
> On Wed, Mar 11, 2026 at 11:00 AM Stephan Gerhold
> <stephan.gerhold@linaro.org> wrote:
> >
> > I'm not entirely sure if this actually guarantees waiting with the
> > unlock until the transaction is "done", for two reasons:
> >
> > 1. &bchan->vc.desc_issued looks like a "TODO" list for descriptors we
> > haven't fully managed to squeeze into the BAM FIFO yet. It doesn't
> > tell you which descriptors have been consumed and finished
> > processing inside the FIFO.
> >
> > Consider e.g. the following case: The client has issued a number of
> > descriptors, they all fit into the FIFO. The first descriptor has a
> > callback assigned, so we ask the BAM to send us an interrupt when it
> > has been consumed. We get the interrupt for the first descriptor and
> > process_channel_irqs() marks it as completed, the rest of the
> > descriptors are still pending. &bchan->vc.desc_issued is empty, so
> > you queue the unlock command before the rest of the descriptors have
> > finished.
> >
>
> Thanks for looking into it. Good catch, I think you're right.
>
> > 2. From reading the BAM chapter in the APQ8016E TRM I get the
> > impression that by default an interrupt for a descriptor just tells
> > you that the descriptor was consumed by the BAM (and forwarded to
> > the peripheral). If you want to guarantee that the transaction is
> > actually done on the peripheral side before allowing writes into
> > config registers, you would need to set the NWD (Notify When Done)
> > bit (aka DMA_PREP_FENCE) on the last descriptor before the unlock
> > command.
> >
> > NWD seems to stall descriptor processing until the peripheral
> > signals completion, so this might allow you to immediately queue the
> > unlock command like in v11. The downside is that you would need to
> > make assumptions about the set of commands submitted by the client
> > again... The downstream driver seems to set NWD on the data
> > descriptor immediately before the UNLOCK command [1].
> >
>
> If what we have in the queue is:
>
> [DATA] [DATA] [DATA] [CMD]
>
> And we want to extend it with LOCK/UNLOCK like so:
>
> [LOCK] [DATA] [DATA] [DATA] [CMD] [UNLOCK]
>
> Should the NWD go with the last DATA descriptor or the last descriptor period
> whether data or command?
>
> It's, again, not very clear from reading tha part.
>
I'm not sure, my impression is that the exact behavior of NWD is quite
specific to the actual peripheral (i.e. QCE, QPIC NAND, etc). In the
downstream drivers:
- QCE seems to add NWD to the last data descriptor before the UNLOCK
(as I wrote, it seems to queue command descriptors before data).
- QPIC NAND has a dedicated "cmd" pipe that doesn't get any data
descriptors, it specifies NWD always for the EXEC_CMD register write,
which isn't even the last descriptor. This is also done in mainline
already (see NAND_BAM_NWD in qcom_write_reg_dma() [1]).
It is possible that NWD works only when attached to certain descriptors
(when there is an actual operation running that gets completed by a
certain descriptor), so we might not be able to simply add NWD to the
last descriptor. :/
I suppose you could argue that "make sure engine does not get
re-configured while busy" is a requirement of QCE and not BAM, so
perhaps it would be easiest and safest if you just add DMA_PREP_FENCE to
the right descriptor inside the QCE driver. qcom_nandc has that already.
Thanks,
Stephan
[1]: https://elixir.bootlin.com/linux/v7.0-rc3/source/drivers/mtd/nand/qpic_common.c#L484
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v12 05/12] dmaengine: qcom: bam_dma: add support for BAM locking
2026-03-10 15:44 ` [PATCH v12 05/12] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
2026-03-11 10:00 ` Stephan Gerhold
@ 2026-03-11 13:26 ` Manivannan Sadhasivam
1 sibling, 0 replies; 23+ messages in thread
From: Manivannan Sadhasivam @ 2026-03-11 13:26 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li,
dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski
On Tue, Mar 10, 2026 at 04:44:19PM +0100, Bartosz Golaszewski wrote:
> Add support for BAM pipe locking. To that end: when starting DMA on an RX
> channel - prepend the existing queue of issued descriptors with an
> additional "dummy" command descriptor with the LOCK bit set. Once the
> transaction is done (no more issued descriptors), issue one more dummy
> descriptor with the UNLOCK bit.
>
> We *must* wait until the transaction is signalled as done because we
> must not perform any writes into config registers while the engine is
> busy.
>
> The dummy writes must be issued into a scratchpad register of the client
> so provide a mechanism to communicate the right address via descriptor
> metadata.
>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
> ---
> drivers/dma/qcom/bam_dma.c | 175 ++++++++++++++++++++++++++++++++++++++-
> include/linux/dma/qcom_bam_dma.h | 4 +
> 2 files changed, 176 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
> index 83491e7c2f17d8c9d12a1a055baea7e3a0a75a53..627c85a2df4dcdbac247d831a4aef047c2189456 100644
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c
> @@ -28,11 +28,13 @@
> #include <linux/clk.h>
> #include <linux/device.h>
> #include <linux/dma-mapping.h>
> +#include <linux/dma/qcom_bam_dma.h>
> #include <linux/dmaengine.h>
> #include <linux/init.h>
> #include <linux/interrupt.h>
> #include <linux/io.h>
> #include <linux/kernel.h>
> +#include <linux/lockdep.h>
> #include <linux/module.h>
> #include <linux/of_address.h>
> #include <linux/of_dma.h>
> @@ -60,6 +62,8 @@ struct bam_desc_hw {
> #define DESC_FLAG_EOB BIT(13)
> #define DESC_FLAG_NWD BIT(12)
> #define DESC_FLAG_CMD BIT(11)
> +#define DESC_FLAG_LOCK BIT(10)
> +#define DESC_FLAG_UNLOCK BIT(9)
>
> struct bam_async_desc {
> struct virt_dma_desc vd;
> @@ -391,6 +395,14 @@ struct bam_chan {
> struct list_head desc_list;
>
> struct list_head node;
> +
> + /* BAM locking infrastructure */
> + bool locked;
Nit: Move this boolean at the end to avoid hole in-between.
> + phys_addr_t scratchpad_addr;
> + struct scatterlist lock_sg;
> + struct scatterlist unlock_sg;
> + struct bam_cmd_element lock_ce;
> + struct bam_cmd_element unlock_ce;
> };
>
> static inline struct bam_chan *to_bam_chan(struct dma_chan *common)
> @@ -652,6 +664,27 @@ static int bam_slave_config(struct dma_chan *chan,
> return 0;
> }
>
> +static int bam_metadata_attach(struct dma_async_tx_descriptor *desc, void *data, size_t len)
> +{
> + struct bam_chan *bchan = to_bam_chan(desc->chan);
> + const struct bam_device_data *bdata = bchan->bdev->dev_data;
> + struct bam_desc_metadata *metadata = data;
> +
> + if (!data)
> + return -EINVAL;
> +
> + if (!bdata->pipe_lock_supported)
> + return -EOPNOTSUPP;
I don't think you should error out if pipe lock is not supported. You can safely
return 0 so that the client can continue to do DMA. Otherwise, if the client
tries to do DMA on a non-pipe lock supported platform (a valid case), DMA will
fail.
There is also no incentive for the clients to know whether pipe lock is
supported or not as they can proceed anyway.
> +
> + bchan->scratchpad_addr = metadata->scratchpad_addr;
> +
> + return 0;
> +}
> +
> +static const struct dma_descriptor_metadata_ops bam_metadata_ops = {
> + .attach = bam_metadata_attach,
> +};
> +
> /**
> * bam_prep_slave_sg - Prep slave sg transaction
> *
> @@ -668,6 +701,7 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
> void *context)
> {
> struct bam_chan *bchan = to_bam_chan(chan);
> + struct dma_async_tx_descriptor *tx_desc;
> struct bam_device *bdev = bchan->bdev;
> struct bam_async_desc *async_desc;
> struct scatterlist *sg;
> @@ -723,7 +757,12 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
> } while (remainder > 0);
> }
>
> - return vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
> + tx_desc = vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
> + if (!tx_desc)
> + return NULL;
> +
> + tx_desc->metadata_ops = &bam_metadata_ops;
> + return tx_desc;
> }
>
> /**
> @@ -1012,6 +1051,112 @@ static void bam_apply_new_config(struct bam_chan *bchan,
> bchan->reconfigure = 0;
> }
>
> +static struct bam_async_desc *
> +bam_make_lock_desc(struct bam_chan *bchan, struct scatterlist *sg,
> + struct bam_cmd_element *ce, unsigned long flag)
> +{
> + struct dma_chan *chan = &bchan->vc.chan;
> + struct bam_async_desc *async_desc;
> + struct bam_desc_hw *desc;
> + struct virt_dma_desc *vd;
> + struct virt_dma_chan *vc;
> + unsigned int mapped;
> + dma_cookie_t cookie;
> + int ret;
> +
> + sg_init_table(sg, 1);
> +
> + async_desc = kzalloc_flex(*async_desc, desc, 1, GFP_NOWAIT);
> + if (!async_desc) {
> + dev_err(bchan->bdev->dev, "failed to allocate the BAM lock descriptor\n");
> + return NULL;
> + }
> +
> + async_desc->num_desc = 1;
> + async_desc->curr_desc = async_desc->desc;
> + async_desc->dir = DMA_MEM_TO_DEV;
> +
> + desc = async_desc->desc;
> +
> + bam_prep_ce_le32(ce, bchan->scratchpad_addr, BAM_WRITE_COMMAND, 0);
> + sg_set_buf(sg, ce, sizeof(*ce));
> +
> + mapped = dma_map_sg_attrs(chan->slave, sg, 1, DMA_TO_DEVICE, DMA_PREP_CMD);
> + if (!mapped) {
> + kfree(async_desc);
> + return NULL;
> + }
> +
> + desc->flags |= cpu_to_le16(DESC_FLAG_CMD | flag);
> + desc->addr = sg_dma_address(sg);
> + desc->size = sizeof(struct bam_cmd_element);
> +
> + vc = &bchan->vc;
> + vd = &async_desc->vd;
> +
> + dma_async_tx_descriptor_init(&vd->tx, &vc->chan);
> + vd->tx.flags = DMA_PREP_CMD;
> + vd->tx.desc_free = vchan_tx_desc_free;
> + vd->tx_result.result = DMA_TRANS_NOERROR;
> + vd->tx_result.residue = 0;
> +
> + cookie = dma_cookie_assign(&vd->tx);
> + ret = dma_submit_error(cookie);
> + if (ret)
> + return NULL;
Why can't you pass the actual error pointer to the caller? Right now, the
caller just treats all failures as -ENOMEM.
> +
> + return async_desc;
> +}
> +
> +static int bam_do_setup_pipe_lock(struct bam_chan *bchan, bool lock)
> +{
> + struct bam_device *bdev = bchan->bdev;
> + const struct bam_device_data *bdata = bdev->dev_data;
> + struct bam_async_desc *lock_desc;
> + struct bam_cmd_element *ce;
> + struct scatterlist *sgl;
> + unsigned long flag;
> +
> + lockdep_assert_held(&bchan->vc.lock);
> +
> + if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr ||
> + bchan->slave.direction != DMA_MEM_TO_DEV)
> + return 0;
> +
> + if (lock) {
> + sgl = &bchan->lock_sg;
> + ce = &bchan->lock_ce;
> + flag = DESC_FLAG_LOCK;
> + } else {
> + sgl = &bchan->unlock_sg;
> + ce = &bchan->unlock_ce;
> + flag = DESC_FLAG_UNLOCK;
> + }
> +
> + lock_desc = bam_make_lock_desc(bchan, sgl, ce, flag);
> + if (!lock_desc)
> + return -ENOMEM;
> +
> + if (lock)
> + list_add(&lock_desc->vd.node, &bchan->vc.desc_issued);
> + else
> + list_add_tail(&lock_desc->vd.node, &bchan->vc.desc_issued);
> +
> + bchan->locked = lock;
> +
> + return 0;
> +}
> +
> +static int bam_setup_pipe_lock(struct bam_chan *bchan)
> +{
> + return bam_do_setup_pipe_lock(bchan, true);
> +}
> +
> +static int bam_setup_pipe_unlock(struct bam_chan *bchan)
> +{
> + return bam_do_setup_pipe_lock(bchan, false);
> +}
> +
> /**
> * bam_start_dma - start next transaction
> * @bchan: bam dma channel
> @@ -1121,6 +1266,7 @@ static void bam_dma_work(struct work_struct *work)
> struct bam_device *bdev = from_work(bdev, work, work);
> struct bam_chan *bchan;
> unsigned int i;
> + int ret;
>
> /* go through the channels and kick off transactions */
> for (i = 0; i < bdev->num_channels; i++) {
> @@ -1128,6 +1274,13 @@ static void bam_dma_work(struct work_struct *work)
>
> guard(spinlock_irqsave)(&bchan->vc.lock);
>
> + if (list_empty(&bchan->vc.desc_issued) && bchan->locked) {
I fully agree with Stephan that we cannot rely on this to ensure completion of
previous commands. But I can't help with the NWD behavior :/
> + ret = bam_setup_pipe_unlock(bchan);
if (bam_setup_pipe_unlock())?
> + if (ret)
> + dev_err(bchan->vc.chan.slave,
> + "Failed to set up the pipe unlock descriptor\n");
> + }
> +
> if (!list_empty(&bchan->vc.desc_issued) && !IS_BUSY(bchan))
> bam_start_dma(bchan);
> }
> @@ -1142,9 +1295,17 @@ static void bam_dma_work(struct work_struct *work)
> static void bam_issue_pending(struct dma_chan *chan)
> {
> struct bam_chan *bchan = to_bam_chan(chan);
> + int ret;
>
> guard(spinlock_irqsave)(&bchan->vc.lock);
>
> + if (!bchan->locked) {
> + ret = bam_setup_pipe_lock(bchan);
if (bam_setup_pipe_lock())?
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v12 06/12] crypto: qce - Include algapi.h in the core.h header
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (4 preceding siblings ...)
2026-03-10 15:44 ` [PATCH v12 05/12] dmaengine: qcom: bam_dma: add support for BAM locking Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-10 15:44 ` [PATCH v12 07/12] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
` (6 subsequent siblings)
12 siblings, 0 replies; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
The header defines a struct embedding struct crypto_queue, so its full
definition, which lives in crypto/algapi.h, needs to be visible. Move
the inclusion from core.c to core.h.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/core.c | 1 -
drivers/crypto/qce/core.h | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index b966f3365b7de8d2a8f6707397a34aa4facdc4ac..65205100c3df961ffaa4b7bc9e217e8d3e08ed57 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -13,7 +13,6 @@
#include <linux/mod_devicetable.h>
#include <linux/platform_device.h>
#include <linux/types.h>
-#include <crypto/algapi.h>
#include <crypto/internal/hash.h>
#include "core.h"
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index eb6fa7a8b64a81daf9ad5304a3ae4e5e597a70b8..f092ce2d3b04a936a37805c20ac5ba78d8fdd2df 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -8,6 +8,7 @@
#include <linux/mutex.h>
#include <linux/workqueue.h>
+#include <crypto/algapi.h>
#include "dma.h"
--
2.47.3
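The rule the commit message relies on can be illustrated outside the kernel: embedding a struct member (as core.h does with struct crypto_queue) forces the compiler to see the full definition, while a pointer member compiles against a mere forward declaration. The types below are hypothetical stand-ins, not the real crypto/algapi.h definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct crypto_queue: any embedded member
 * needs this full definition to be visible, which is why core.h must
 * include the header that provides it. */
struct demo_crypto_queue {
	unsigned int qlen;
	unsigned int max_qlen;
};

struct demo_opaque;	/* forward declaration only */

struct demo_qce_device {
	struct demo_crypto_queue queue;	/* embedded: full type required */
	struct demo_opaque *priv;	/* pointer: forward decl suffices */
};

size_t demo_qce_device_size(void)
{
	return sizeof(struct demo_qce_device);
}
```

If core.h only held a pointer to the queue, the forward declaration would be enough and the include could stay in core.c.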
* [PATCH v12 07/12] crypto: qce - Remove unused ignore_buf
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (5 preceding siblings ...)
2026-03-10 15:44 ` [PATCH v12 06/12] crypto: qce - Include algapi.h in the core.h header Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-10 15:44 ` [PATCH v12 08/12] crypto: qce - Simplify arguments of devm_qce_dma_request() Bartosz Golaszewski
` (5 subsequent siblings)
12 siblings, 0 replies; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
It's unclear what the purpose of this field is. It has been here since
the initial commit but without any explanation. The driver works fine
without it. We still keep allocating more space in the result buffer, we
just don't need to store its address. While at it: move the
QCE_IGNORE_BUF_SZ definition into dma.c as it's not used outside of this
compilation unit.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/dma.c | 4 ++--
drivers/crypto/qce/dma.h | 2 --
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 68cafd4741ad3d91906d39e817fc7873b028d498..08bf3e8ec12433c1a8ee17003f3487e41b7329e4 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -9,6 +9,8 @@
#include "dma.h"
+#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
+
static void qce_dma_release(void *data)
{
struct qce_dma_data *dma = data;
@@ -41,8 +43,6 @@ int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma)
goto error_nomem;
}
- dma->ignore_buf = dma->result_buf + QCE_RESULT_BUF_SZ;
-
return devm_add_action_or_reset(dev, qce_dma_release, dma);
error_nomem:
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index 31629185000e12242fa07c2cc08b95fcbd5d4b8c..fc337c435cd14917bdfb99febcf9119275afdeba 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -23,7 +23,6 @@ struct qce_result_dump {
u32 status2;
};
-#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
#define QCE_RESULT_BUF_SZ \
ALIGN(sizeof(struct qce_result_dump), QCE_BAM_BURST_SIZE)
@@ -31,7 +30,6 @@ struct qce_dma_data {
struct dma_chan *txchan;
struct dma_chan *rxchan;
struct qce_result_dump *result_buf;
- void *ignore_buf;
};
int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma);
--
2.47.3
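The sizing arithmetic this patch keeps intact can be sketched in plain C: the result dump is rounded up to the BAM burst size and two extra "ignore" bursts are still allocated after it. The macro and the dump struct's fields below are illustrative approximations of the dma.h definitions, not the exact kernel ones.

```c
#include <assert.h>

/* Approximation of the qce buffer sizing: QCE_RESULT_BUF_SZ aligns the
 * dump struct to the burst size and QCE_IGNORE_BUF_SZ reserves two more
 * bursts, even though the driver no longer keeps a pointer to that
 * ignore area. */
#define DEMO_BAM_BURST_SIZE	64
#define DEMO_ALIGN_UP(x, a)	(((x) + (a) - 1) / (a) * (a))

struct demo_result_dump {
	unsigned int auth_iv[16];
	unsigned int encr_cntr_iv[4];
	unsigned int status;
	unsigned int status2;
};

#define DEMO_RESULT_BUF_SZ \
	DEMO_ALIGN_UP(sizeof(struct demo_result_dump), DEMO_BAM_BURST_SIZE)
#define DEMO_IGNORE_BUF_SZ	(2 * DEMO_BAM_BURST_SIZE)

unsigned long demo_result_alloc_size(void)
{
	return DEMO_RESULT_BUF_SZ + DEMO_IGNORE_BUF_SZ;
}
```

With these hypothetical fields the 88-byte dump rounds up to 128 bytes and the ignore area adds another 128, so the allocation stays burst-aligned.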
* [PATCH v12 08/12] crypto: qce - Simplify arguments of devm_qce_dma_request()
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (6 preceding siblings ...)
2026-03-10 15:44 ` [PATCH v12 07/12] crypto: qce - Remove unused ignore_buf Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-10 15:44 ` [PATCH v12 09/12] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
` (4 subsequent siblings)
12 siblings, 0 replies; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
This function can extract all the information it needs from struct
qce_device alone, so simplify its arguments. This is done in preparation
for adding support for register I/O over DMA which will require
accessing even more fields from struct qce_device.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/core.c | 2 +-
drivers/crypto/qce/dma.c | 5 ++++-
drivers/crypto/qce/dma.h | 4 +++-
3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index 65205100c3df961ffaa4b7bc9e217e8d3e08ed57..8b7bcd0c420c45caf8b29e5455e0f384fd5c5616 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -226,7 +226,7 @@ static int qce_crypto_probe(struct platform_device *pdev)
if (ret)
return ret;
- ret = devm_qce_dma_request(qce->dev, &qce->dma);
+ ret = devm_qce_dma_request(qce);
if (ret)
return ret;
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 08bf3e8ec12433c1a8ee17003f3487e41b7329e4..c29b0abe9445381a019e0447d30acfd7319d5c1f 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -7,6 +7,7 @@
#include <linux/dmaengine.h>
#include <crypto/scatterwalk.h>
+#include "core.h"
#include "dma.h"
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
@@ -20,8 +21,10 @@ static void qce_dma_release(void *data)
kfree(dma->result_buf);
}
-int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma)
+int devm_qce_dma_request(struct qce_device *qce)
{
+ struct qce_dma_data *dma = &qce->dma;
+ struct device *dev = qce->dev;
int ret;
dma->txchan = dma_request_chan(dev, "tx");
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index fc337c435cd14917bdfb99febcf9119275afdeba..483789d9fa98e79d1283de8297bf2fc2a773f3a7 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -8,6 +8,8 @@
#include <linux/dmaengine.h>
+struct qce_device;
+
/* maximum data transfer block size between BAM and CE */
#define QCE_BAM_BURST_SIZE 64
@@ -32,7 +34,7 @@ struct qce_dma_data {
struct qce_result_dump *result_buf;
};
-int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma);
+int devm_qce_dma_request(struct qce_device *qce);
int qce_dma_prep_sgs(struct qce_dma_data *dma, struct scatterlist *sg_in,
int in_ents, struct scatterlist *sg_out, int out_ents,
dma_async_tx_callback cb, void *cb_param);
--
2.47.3
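The refactor can be mimicked in miniature: once the DMA data is embedded in the device struct, a single pointer argument reaches both the device handle and the channel bookkeeping. The structs and values below are simplified stand-ins for the qce types, not the real driver code.

```c
#include <assert.h>

/* Simplified stand-ins: the dma data lives inside the device struct, so
 * passing the device alone is sufficient, mirroring what
 * devm_qce_dma_request() now does in the real driver. */
struct demo_dma_data {
	int txchan;
	int rxchan;
};

struct demo_qce_device {
	int dev;			/* stand-in for struct device * */
	struct demo_dma_data dma;
};

int demo_dma_request_ok(void)
{
	struct demo_qce_device qce = { .dev = 42 };
	struct demo_dma_data *dma = &qce.dma;

	/* stand-ins for dma_request_chan(dev, "tx"/"rx") */
	dma->txchan = 1;
	dma->rxchan = 2;

	return dma->txchan == 1 && dma->rxchan == 2 && qce.dev == 42;
}
```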
* [PATCH v12 09/12] crypto: qce - Use existing devres APIs in devm_qce_dma_request()
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (7 preceding siblings ...)
2026-03-10 15:44 ` [PATCH v12 08/12] crypto: qce - Simplify arguments of devm_qce_dma_request() Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-10 15:44 ` [PATCH v12 10/12] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
` (3 subsequent siblings)
12 siblings, 0 replies; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
Konrad Dybcio
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Switch to devm_kmalloc() and devm_dma_request_chan() in
devm_qce_dma_request(). This allows us to drop two labels and shrink the
function.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/dma.c | 39 +++++++++------------------------------
1 file changed, 9 insertions(+), 30 deletions(-)
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index c29b0abe9445381a019e0447d30acfd7319d5c1f..a46264735bb895b6199969e83391383ccbbacc5f 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -12,47 +12,26 @@
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
-static void qce_dma_release(void *data)
-{
- struct qce_dma_data *dma = data;
-
- dma_release_channel(dma->txchan);
- dma_release_channel(dma->rxchan);
- kfree(dma->result_buf);
-}
-
int devm_qce_dma_request(struct qce_device *qce)
{
struct qce_dma_data *dma = &qce->dma;
struct device *dev = qce->dev;
- int ret;
- dma->txchan = dma_request_chan(dev, "tx");
+ dma->txchan = devm_dma_request_chan(dev, "tx");
if (IS_ERR(dma->txchan))
return dev_err_probe(dev, PTR_ERR(dma->txchan),
"Failed to get TX DMA channel\n");
- dma->rxchan = dma_request_chan(dev, "rx");
- if (IS_ERR(dma->rxchan)) {
- ret = dev_err_probe(dev, PTR_ERR(dma->rxchan),
- "Failed to get RX DMA channel\n");
- goto error_rx;
- }
-
- dma->result_buf = kmalloc(QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ,
- GFP_KERNEL);
- if (!dma->result_buf) {
- ret = -ENOMEM;
- goto error_nomem;
- }
+ dma->rxchan = devm_dma_request_chan(dev, "rx");
+ if (IS_ERR(dma->rxchan))
+ return dev_err_probe(dev, PTR_ERR(dma->rxchan),
+ "Failed to get RX DMA channel\n");
- return devm_add_action_or_reset(dev, qce_dma_release, dma);
+ dma->result_buf = devm_kmalloc(dev, QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ, GFP_KERNEL);
+ if (!dma->result_buf)
+ return -ENOMEM;
-error_nomem:
- dma_release_channel(dma->rxchan);
-error_rx:
- dma_release_channel(dma->txchan);
- return ret;
+ return 0;
}
struct scatterlist *
--
2.47.3
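What lets the error labels disappear can be modeled with a toy action stack: devres-style cleanup registers callbacks as resources are acquired and runs them in reverse order on teardown. This is a userspace sketch of the concept under simplified assumptions, not the kernel devres implementation.

```c
#include <assert.h>

/* Toy devres: cleanup callbacks are pushed as resources are acquired
 * and run in reverse (LIFO) order on teardown, mirroring how the
 * devm_dma_request_chan()/devm_kmalloc() conversions release resources
 * without explicit error labels. */
#define DEMO_MAX_ACTIONS 8

struct demo_devres {
	void (*action[DEMO_MAX_ACTIONS])(void *);
	void *data[DEMO_MAX_ACTIONS];
	int count;
};

static int released_order[DEMO_MAX_ACTIONS];
static int released_idx;

int demo_devm_add_action(struct demo_devres *dr, void (*fn)(void *),
			 void *data)
{
	if (dr->count == DEMO_MAX_ACTIONS)
		return -1;
	dr->action[dr->count] = fn;
	dr->data[dr->count] = data;
	dr->count++;
	return 0;
}

void demo_device_teardown(struct demo_devres *dr)
{
	while (dr->count > 0) {
		dr->count--;
		dr->action[dr->count](dr->data[dr->count]);
	}
}

static void demo_release(void *data)
{
	released_order[released_idx++] = *(int *)data;
}

int demo_devres_lifo_ok(void)
{
	struct demo_devres dr = { .count = 0 };
	int tx = 1, rx = 2, buf = 3;

	demo_devm_add_action(&dr, demo_release, &tx);
	demo_devm_add_action(&dr, demo_release, &rx);
	demo_devm_add_action(&dr, demo_release, &buf);
	demo_device_teardown(&dr);

	/* released in reverse acquisition order: buf, rx, tx */
	return released_order[0] == 3 && released_order[1] == 2 &&
	       released_order[2] == 1;
}
```

The LIFO order matters: the result buffer registered last is freed before the channels acquired earlier, just as the old error labels unwound in reverse.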
* [PATCH v12 10/12] crypto: qce - Map crypto memory for DMA
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (8 preceding siblings ...)
2026-03-10 15:44 ` [PATCH v12 09/12] crypto: qce - Use existing devres APIs in devm_qce_dma_request() Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-10 15:44 ` [PATCH v12 11/12] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
` (2 subsequent siblings)
12 siblings, 0 replies; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
As the first step in converting the driver to using DMA for register
I/O, let's map the crypto memory range.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/core.c | 25 +++++++++++++++++++++++--
drivers/crypto/qce/core.h | 6 ++++++
2 files changed, 29 insertions(+), 2 deletions(-)
diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index 8b7bcd0c420c45caf8b29e5455e0f384fd5c5616..2667fcd67fee826a44080da8f88a3e2abbb9b2cf 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -185,10 +185,19 @@ static int qce_check_version(struct qce_device *qce)
return 0;
}
+static void qce_crypto_unmap_dma(void *data)
+{
+ struct qce_device *qce = data;
+
+ dma_unmap_resource(qce->dev, qce->base_dma, qce->dma_size,
+ DMA_BIDIRECTIONAL, 0);
+}
+
static int qce_crypto_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
struct qce_device *qce;
+ struct resource *res;
int ret;
qce = devm_kzalloc(dev, sizeof(*qce), GFP_KERNEL);
@@ -198,7 +207,7 @@ static int qce_crypto_probe(struct platform_device *pdev)
qce->dev = dev;
platform_set_drvdata(pdev, qce);
- qce->base = devm_platform_ioremap_resource(pdev, 0);
+ qce->base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
if (IS_ERR(qce->base))
return PTR_ERR(qce->base);
@@ -244,7 +253,19 @@ static int qce_crypto_probe(struct platform_device *pdev)
qce->async_req_enqueue = qce_async_request_enqueue;
qce->async_req_done = qce_async_request_done;
- return devm_qce_register_algs(qce);
+ ret = devm_qce_register_algs(qce);
+ if (ret)
+ return ret;
+
+ qce->dma_size = resource_size(res);
+ qce->base_dma = dma_map_resource(dev, res->start, qce->dma_size,
+ DMA_BIDIRECTIONAL, 0);
+ qce->base_phys = res->start;
+ ret = dma_mapping_error(dev, qce->base_dma);
+ if (ret)
+ return ret;
+
+ return devm_add_action_or_reset(qce->dev, qce_crypto_unmap_dma, qce);
}
static const struct of_device_id qce_crypto_of_match[] = {
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index f092ce2d3b04a936a37805c20ac5ba78d8fdd2df..a80e12eac6c87e5321cce16c56a4bf5003474ef0 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -27,6 +27,9 @@
* @dma: pointer to dma data
* @burst_size: the crypto burst size
* @pipe_pair_id: which pipe pair id the device using
+ * @base_dma: base DMA address
+ * @base_phys: base physical address
+ * @dma_size: size of memory mapped for DMA
* @async_req_enqueue: invoked by every algorithm to enqueue a request
* @async_req_done: invoked by every algorithm to finish its request
*/
@@ -43,6 +46,9 @@ struct qce_device {
struct qce_dma_data dma;
int burst_size;
unsigned int pipe_pair_id;
+ dma_addr_t base_dma;
+ phys_addr_t base_phys;
+ size_t dma_size;
int (*async_req_enqueue)(struct qce_device *qce,
struct crypto_async_request *req);
void (*async_req_done)(struct qce_device *qce, int ret);
--
2.47.3
* [PATCH v12 11/12] crypto: qce - Add BAM DMA support for crypto register I/O
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (9 preceding siblings ...)
2026-03-10 15:44 ` [PATCH v12 10/12] crypto: qce - Map crypto memory for DMA Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-10 15:44 ` [PATCH v12 12/12] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
2026-03-11 8:03 ` [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
12 siblings, 0 replies; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Implement the infrastructure for performing register I/O over BAM DMA
instead of from the CPU.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/aead.c | 2 -
drivers/crypto/qce/common.c | 20 ++++----
drivers/crypto/qce/core.h | 4 ++
drivers/crypto/qce/dma.c | 109 ++++++++++++++++++++++++++++++++++++++++++
drivers/crypto/qce/dma.h | 5 ++
drivers/crypto/qce/sha.c | 2 -
drivers/crypto/qce/skcipher.c | 2 -
7 files changed, 127 insertions(+), 17 deletions(-)
diff --git a/drivers/crypto/qce/aead.c b/drivers/crypto/qce/aead.c
index abb438d2f8888d313d134161fda97dcc9d82d6d9..a4e8158803eb59cd0d40076674d0059bb94759ce 100644
--- a/drivers/crypto/qce/aead.c
+++ b/drivers/crypto/qce/aead.c
@@ -473,8 +473,6 @@ qce_aead_async_req_handle(struct crypto_async_request *async_req)
if (ret)
goto error_unmap_src;
- qce_dma_issue_pending(&qce->dma);
-
ret = qce_start(async_req, tmpl->crypto_alg_type);
if (ret)
goto error_terminate;
diff --git a/drivers/crypto/qce/common.c b/drivers/crypto/qce/common.c
index 04253a8d33409a2a51db527435d09ae85a7880af..b2b0e751a06517ac06e7a468599bd18666210e0c 100644
--- a/drivers/crypto/qce/common.c
+++ b/drivers/crypto/qce/common.c
@@ -25,7 +25,7 @@ static inline u32 qce_read(struct qce_device *qce, u32 offset)
static inline void qce_write(struct qce_device *qce, u32 offset, u32 val)
{
- writel(val, qce->base + offset);
+ qce_write_dma(qce, offset, val);
}
static inline void qce_write_array(struct qce_device *qce, u32 offset,
@@ -82,6 +82,8 @@ static void qce_setup_config(struct qce_device *qce)
{
u32 config;
+ qce_clear_bam_transaction(qce);
+
/* get big endianness */
config = qce_config_reg(qce, 0);
@@ -90,12 +92,14 @@ static void qce_setup_config(struct qce_device *qce)
qce_write(qce, REG_CONFIG, config);
}
-static inline void qce_crypto_go(struct qce_device *qce, bool result_dump)
+static inline int qce_crypto_go(struct qce_device *qce, bool result_dump)
{
if (result_dump)
qce_write(qce, REG_GOPROC, BIT(GO_SHIFT) | BIT(RESULTS_DUMP_SHIFT));
else
qce_write(qce, REG_GOPROC, BIT(GO_SHIFT));
+
+ return qce_submit_cmd_desc(qce);
}
#if defined(CONFIG_CRYPTO_DEV_QCE_SHA) || defined(CONFIG_CRYPTO_DEV_QCE_AEAD)
@@ -223,9 +227,7 @@ static int qce_setup_regs_ahash(struct crypto_async_request *async_req)
config = qce_config_reg(qce, 1);
qce_write(qce, REG_CONFIG, config);
- qce_crypto_go(qce, true);
-
- return 0;
+ return qce_crypto_go(qce, true);
}
#endif
@@ -386,9 +388,7 @@ static int qce_setup_regs_skcipher(struct crypto_async_request *async_req)
config = qce_config_reg(qce, 1);
qce_write(qce, REG_CONFIG, config);
- qce_crypto_go(qce, true);
-
- return 0;
+ return qce_crypto_go(qce, true);
}
#endif
@@ -535,9 +535,7 @@ static int qce_setup_regs_aead(struct crypto_async_request *async_req)
qce_write(qce, REG_CONFIG, config);
/* Start the process */
- qce_crypto_go(qce, !IS_CCM(flags));
-
- return 0;
+ return qce_crypto_go(qce, !IS_CCM(flags));
}
#endif
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index a80e12eac6c87e5321cce16c56a4bf5003474ef0..d238097f834e4605f3825f23d0316d4196439116 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -30,6 +30,8 @@
* @base_dma: base DMA address
* @base_phys: base physical address
* @dma_size: size of memory mapped for DMA
+ * @read_buf: Buffer for DMA to write back to
+ * @read_buf_dma: Mapped address of the read buffer
* @async_req_enqueue: invoked by every algorithm to enqueue a request
* @async_req_done: invoked by every algorithm to finish its request
*/
@@ -49,6 +51,8 @@ struct qce_device {
dma_addr_t base_dma;
phys_addr_t base_phys;
size_t dma_size;
+ __le32 *read_buf;
+ dma_addr_t read_buf_dma;
int (*async_req_enqueue)(struct qce_device *qce,
struct crypto_async_request *req);
void (*async_req_done)(struct qce_device *qce, int ret);
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index a46264735bb895b6199969e83391383ccbbacc5f..ba7a52fd4c6349d59c075c346f75741defeb6034 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -4,6 +4,8 @@
*/
#include <linux/device.h>
+#include <linux/dma/qcom_bam_dma.h>
+#include <linux/dma-mapping.h>
#include <linux/dmaengine.h>
#include <crypto/scatterwalk.h>
@@ -11,6 +13,98 @@
#include "dma.h"
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
+#define QCE_BAM_CMD_SGL_SIZE 128
+#define QCE_BAM_CMD_ELEMENT_SIZE 128
+#define QCE_MAX_REG_READ 8
+
+struct qce_desc_info {
+ struct dma_async_tx_descriptor *dma_desc;
+ enum dma_data_direction dir;
+};
+
+struct qce_bam_transaction {
+ struct bam_cmd_element bam_ce[QCE_BAM_CMD_ELEMENT_SIZE];
+ struct scatterlist wr_sgl[QCE_BAM_CMD_SGL_SIZE];
+ struct qce_desc_info *desc;
+ u32 bam_ce_idx;
+ u32 pre_bam_ce_idx;
+ u32 wr_sgl_cnt;
+};
+
+void qce_clear_bam_transaction(struct qce_device *qce)
+{
+ struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
+
+ bam_txn->bam_ce_idx = 0;
+ bam_txn->wr_sgl_cnt = 0;
+ bam_txn->bam_ce_idx = 0;
+ bam_txn->pre_bam_ce_idx = 0;
+}
+
+int qce_submit_cmd_desc(struct qce_device *qce)
+{
+ struct qce_desc_info *qce_desc = qce->dma.bam_txn->desc;
+ struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
+ struct dma_async_tx_descriptor *dma_desc;
+ struct dma_chan *chan = qce->dma.rxchan;
+ unsigned long attrs = DMA_PREP_CMD;
+ dma_cookie_t cookie;
+ unsigned int mapped;
+ int ret;
+
+ mapped = dma_map_sg_attrs(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt,
+ DMA_TO_DEVICE, attrs);
+ if (!mapped)
+ return -ENOMEM;
+
+ dma_desc = dmaengine_prep_slave_sg(chan, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt,
+ DMA_MEM_TO_DEV, attrs);
+ if (!dma_desc) {
+ dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+ return -ENOMEM;
+ }
+
+ qce_desc->dma_desc = dma_desc;
+ cookie = dmaengine_submit(qce_desc->dma_desc);
+
+ ret = dma_submit_error(cookie);
+ if (ret)
+ return ret;
+
+ qce_dma_issue_pending(&qce->dma);
+
+ return 0;
+}
+
+static void qce_prep_dma_cmd_desc(struct qce_device *qce, struct qce_dma_data *dma,
+ unsigned int addr, void *buf)
+{
+ struct qce_bam_transaction *bam_txn = dma->bam_txn;
+ struct bam_cmd_element *bam_ce_buf;
+ int bam_ce_size, cnt, idx;
+
+ idx = bam_txn->bam_ce_idx;
+ bam_ce_buf = &bam_txn->bam_ce[idx];
+ bam_prep_ce_le32(bam_ce_buf, addr, BAM_WRITE_COMMAND, *((__le32 *)buf));
+
+ bam_ce_buf = &bam_txn->bam_ce[bam_txn->pre_bam_ce_idx];
+ bam_txn->bam_ce_idx++;
+ bam_ce_size = (bam_txn->bam_ce_idx - bam_txn->pre_bam_ce_idx) * sizeof(*bam_ce_buf);
+
+ cnt = bam_txn->wr_sgl_cnt;
+
+ sg_set_buf(&bam_txn->wr_sgl[cnt], bam_ce_buf, bam_ce_size);
+
+ ++bam_txn->wr_sgl_cnt;
+ bam_txn->pre_bam_ce_idx = bam_txn->bam_ce_idx;
+}
+
+void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val)
+{
+ unsigned int reg_addr = ((unsigned int)(qce->base_phys) + offset);
+
+ qce_prep_dma_cmd_desc(qce, &qce->dma, reg_addr, &val);
+}
int devm_qce_dma_request(struct qce_device *qce)
{
@@ -31,6 +125,21 @@ int devm_qce_dma_request(struct qce_device *qce)
if (!dma->result_buf)
return -ENOMEM;
+ dma->bam_txn = devm_kzalloc(dev, sizeof(*dma->bam_txn), GFP_KERNEL);
+ if (!dma->bam_txn)
+ return -ENOMEM;
+
+ dma->bam_txn->desc = devm_kzalloc(dev, sizeof(*dma->bam_txn->desc), GFP_KERNEL);
+ if (!dma->bam_txn->desc)
+ return -ENOMEM;
+
+ sg_init_table(dma->bam_txn->wr_sgl, QCE_BAM_CMD_SGL_SIZE);
+
+ qce->read_buf = dmam_alloc_coherent(qce->dev, QCE_MAX_REG_READ * sizeof(*qce->read_buf),
+ &qce->read_buf_dma, GFP_KERNEL);
+ if (!qce->read_buf)
+ return -ENOMEM;
+
return 0;
}
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index 483789d9fa98e79d1283de8297bf2fc2a773f3a7..f05dfa9e6b25bd60e32f45079a8bc7e6a4cf81f9 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -8,6 +8,7 @@
#include <linux/dmaengine.h>
+struct qce_bam_transaction;
struct qce_device;
/* maximum data transfer block size between BAM and CE */
@@ -32,6 +33,7 @@ struct qce_dma_data {
struct dma_chan *txchan;
struct dma_chan *rxchan;
struct qce_result_dump *result_buf;
+ struct qce_bam_transaction *bam_txn;
};
int devm_qce_dma_request(struct qce_device *qce);
@@ -43,5 +45,8 @@ int qce_dma_terminate_all(struct qce_dma_data *dma);
struct scatterlist *
qce_sgtable_add(struct sg_table *sgt, struct scatterlist *sg_add,
unsigned int max_len);
+void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val);
+int qce_submit_cmd_desc(struct qce_device *qce);
+void qce_clear_bam_transaction(struct qce_device *qce);
#endif /* _DMA_H_ */
diff --git a/drivers/crypto/qce/sha.c b/drivers/crypto/qce/sha.c
index d7b6d042fb44f4856a6b4f9c901376dd7531454d..9552a74bf191def412fc32f3859356e569e5d400 100644
--- a/drivers/crypto/qce/sha.c
+++ b/drivers/crypto/qce/sha.c
@@ -113,8 +113,6 @@ static int qce_ahash_async_req_handle(struct crypto_async_request *async_req)
if (ret)
goto error_unmap_dst;
- qce_dma_issue_pending(&qce->dma);
-
ret = qce_start(async_req, tmpl->crypto_alg_type);
if (ret)
goto error_terminate;
diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 872b65318233ed21e3559853f6bbdad030a1b81f..e80452c19b03496faaee38d4ac792289f560d082 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -147,8 +147,6 @@ qce_skcipher_async_req_handle(struct crypto_async_request *async_req)
if (ret)
goto error_unmap_src;
- qce_dma_issue_pending(&qce->dma);
-
ret = qce_start(async_req, tmpl->crypto_alg_type);
if (ret)
goto error_terminate;
--
2.47.3
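The bookkeeping in qce_prep_dma_cmd_desc() — append one command element per register write, then wrap the newly added elements in one scatterlist entry and advance the "previous" index — can be sketched with plain arrays. The structs below are simplified models, not the real bam_cmd_element or scatterlist.

```c
#include <assert.h>

/* Simplified model of the bam_txn indices: each register write appends
 * one command element; the span between pre_ce_idx and ce_idx is then
 * recorded as one sgl entry, after which pre_ce_idx catches up. */
#define DEMO_CE_MAX	128

struct demo_cmd_elem {
	unsigned int addr;
	unsigned int data;
};

struct demo_bam_txn {
	struct demo_cmd_elem ce[DEMO_CE_MAX];
	unsigned int ce_idx;
	unsigned int pre_ce_idx;
	unsigned int sgl_cnt;
	unsigned int sgl_len[DEMO_CE_MAX];	/* bytes per sgl entry */
};

void demo_prep_cmd(struct demo_bam_txn *txn, unsigned int addr,
		   unsigned int val)
{
	txn->ce[txn->ce_idx].addr = addr;
	txn->ce[txn->ce_idx].data = val;
	txn->ce_idx++;

	txn->sgl_len[txn->sgl_cnt] =
		(txn->ce_idx - txn->pre_ce_idx) * sizeof(struct demo_cmd_elem);
	txn->sgl_cnt++;
	txn->pre_ce_idx = txn->ce_idx;
}

unsigned int demo_two_writes_sgl_cnt(void)
{
	struct demo_bam_txn txn = { .ce_idx = 0 };

	demo_prep_cmd(&txn, 0x1000, 0xA);	/* e.g. a config register */
	demo_prep_cmd(&txn, 0x1004, 0xB);	/* e.g. a go register */
	return txn.sgl_cnt;
}
```

In the driver the resulting scatterlist is then mapped and handed to dmaengine_prep_slave_sg() with DMA_PREP_CMD; the sketch only shows the index arithmetic.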
* [PATCH v12 12/12] crypto: qce - Communicate the base physical address to the dmaengine
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (10 preceding siblings ...)
2026-03-10 15:44 ` [PATCH v12 11/12] crypto: qce - Add BAM DMA support for crypto register I/O Bartosz Golaszewski
@ 2026-03-10 15:44 ` Bartosz Golaszewski
2026-03-11 8:03 ` [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
12 siblings, 0 replies; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-10 15:44 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In order to communicate to the BAM DMA engine which address should be
used as a scratchpad for dummy writes related to BAM pipe locking, fill
out and attach the provided metadata struct to the descriptor, and set
the RX channel's direction via the slave config struct.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/dma.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index ba7a52fd4c6349d59c075c346f75741defeb6034..6ab352261223c3c4815a01e84238447e8e61e040 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -11,6 +11,7 @@
#include "core.h"
#include "dma.h"
+#include "regs-v5.h"
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
#define QCE_BAM_CMD_SGL_SIZE 128
@@ -43,6 +44,7 @@ void qce_clear_bam_transaction(struct qce_device *qce)
int qce_submit_cmd_desc(struct qce_device *qce)
{
+ struct bam_desc_metadata meta = { .scratchpad_addr = qce->base_phys + REG_VERSION };
struct qce_desc_info *qce_desc = qce->dma.bam_txn->desc;
struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
struct dma_async_tx_descriptor *dma_desc;
@@ -64,6 +66,12 @@ int qce_submit_cmd_desc(struct qce_device *qce)
return -ENOMEM;
}
+ ret = dmaengine_desc_attach_metadata(dma_desc, &meta, 0);
+ if (ret) {
+ dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+ return ret;
+ }
+
qce_desc->dma_desc = dma_desc;
cookie = dmaengine_submit(qce_desc->dma_desc);
@@ -109,7 +117,9 @@ void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val)
int devm_qce_dma_request(struct qce_device *qce)
{
struct qce_dma_data *dma = &qce->dma;
+ struct dma_slave_config cfg = { };
struct device *dev = qce->dev;
+ int ret;
dma->txchan = devm_dma_request_chan(dev, "tx");
if (IS_ERR(dma->txchan))
@@ -121,6 +131,11 @@ int devm_qce_dma_request(struct qce_device *qce)
return dev_err_probe(dev, PTR_ERR(dma->rxchan),
"Failed to get RX DMA channel\n");
+ cfg.direction = DMA_MEM_TO_DEV;
+ ret = dmaengine_slave_config(dma->rxchan, &cfg);
+ if (ret)
+ return ret;
+
dma->result_buf = devm_kmalloc(dev, QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ, GFP_KERNEL);
if (!dma->result_buf)
return -ENOMEM;
--
2.47.3
^ permalink raw reply related [flat|nested] 23+ messages in thread* Re: [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O
2026-03-10 15:44 [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
` (11 preceding siblings ...)
2026-03-10 15:44 ` [PATCH v12 12/12] crypto: qce - Communicate the base physical address to the dmaengine Bartosz Golaszewski
@ 2026-03-11 8:03 ` Bartosz Golaszewski
2026-03-11 8:06 ` Stephan Gerhold
12 siblings, 1 reply; 23+ messages in thread
From: Bartosz Golaszewski @ 2026-03-11 8:03 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi, Md Sadre Alam,
Dmitry Baryshkov, Peter Ujfalusi, Michal Simek, Frank Li,
dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, Bartosz Golaszewski, Dmitry Baryshkov,
Bjorn Andersson, Konrad Dybcio, Manivannan Sadhasivam,
Stephan Gerhold
On Tue, Mar 10, 2026 at 4:44 PM Bartosz Golaszewski
<bartosz.golaszewski@oss.qualcomm.com> wrote:
>
> This iteration is built on top of the v11 RFC with remaining issues
> fixed and the mechanism for communicating the scratchpad address from
> clients to the BAM driver changed from slave config to descriptor
> metadata.
>
> However: during stress-testing I noticed that sometimes a transaction
> would end with an error. The engine was indicating that a write/read to
> the config registers was performed while the engine was busy (bit 17 of
> the STATUS register was set). It turns out that we must not just
> unconditionally append the UNLOCK descriptor to the "issued" queue, we
> must wait for the transaction to end before we queue it so this version
> takes this into account and queues the UNLOCK descriptor from the
> workqueue.
>
> With this all stress tests and benchmarks from cryptsetup work fine.
>
Mani, Stephan: sorry, I forgot to update the cover letter to Cc you.
Doing it now here.
Stephan: I tried to use READ command but it would crash on sm8650, so
I went with WRITE. :(
Bartosz
* Re: [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O
2026-03-11 8:03 ` [PATCH v12 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O Bartosz Golaszewski
@ 2026-03-11 8:06 ` Stephan Gerhold
0 siblings, 0 replies; 23+ messages in thread
From: Stephan Gerhold @ 2026-03-11 8:06 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Bartosz Golaszewski, Vinod Koul, Jonathan Corbet, Thara Gopinath,
Herbert Xu, David S. Miller, Udit Tiwari, Daniel Perez-Zoghbi,
Md Sadre Alam, Dmitry Baryshkov, Peter Ujfalusi, Michal Simek,
Frank Li, dmaengine, linux-doc, linux-kernel, linux-arm-msm,
linux-crypto, linux-arm-kernel, Bartosz Golaszewski,
Dmitry Baryshkov, Bjorn Andersson, Konrad Dybcio,
Manivannan Sadhasivam
On Wed, Mar 11, 2026 at 09:03:37AM +0100, Bartosz Golaszewski wrote:
> On Tue, Mar 10, 2026 at 4:44 PM Bartosz Golaszewski
> <bartosz.golaszewski@oss.qualcomm.com> wrote:
> >
> > This iteration is built on top of the v11 RFC with remaining issues
> > fixed and the mechanism for communicating the scratchpad address from
> > clients to the BAM driver changed from slave config to descriptor
> > metadata.
> >
> > However: during stress-testing I noticed that sometimes a transaction
> > would end with an error. The engine was indicating that a write/read to
> > the config registers was performed while the engine was busy (bit 17 of
> > the STATUS register was set). It turns out that we must not just
> > unconditionally append the UNLOCK descriptor to the "issued" queue, we
> > must wait for the transaction to end before we queue it so this version
> > takes this into account and queues the UNLOCK descriptor from the
> > workqueue.
> >
> > With this all stress tests and benchmarks from cryptsetup work fine.
> >
>
> Mani, Stephan: sorry, I forgot to update the cover letter to Cc you.
> Doing it now here.
>
> Stephan: I tried to use READ command but it would crash on sm8650, so
> I went with WRITE. :(
>
No worries, thanks for testing this!
Stephan