* Re: [PATCH v6 03/11] dt-bindings: mfd: add documentation for S2MU005 PMIC
From: Krzysztof Kozlowski @ 2026-05-19 13:31 UTC (permalink / raw)
To: Kaustabh Chakraborty, Conor Dooley
Cc: Lee Jones, Pavel Machek, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, MyungJoo Ham, Chanwoo Choi, Sebastian Reichel,
André Draszik, Alexandre Belloni, Jonathan Corbet,
Shuah Khan, Nam Tran, Łukasz Lebiedziński, linux-leds,
devicetree, linux-kernel, linux-pm, linux-samsung-soc, linux-rtc,
linux-doc
In-Reply-To: <DIMN3D9E8YCT.3T2PGAYYB2IOO@disroot.org>
On 19/05/2026 14:07, Kaustabh Chakraborty wrote:
>>>>
>>>> I don't think the compatible should be here, but I also don't want to
>>>> stall that patchset. I understand that it is inconsistent review from my
>>>> side, because other similar patchsets receive comment to drop the
>>>> compatible. But I don't think we will be fair asking to drop the
>>>> compatible now, when we did not ask for that in the early versions at all.
>>>
>>>
>>> I think you misunderstood, we were talking about the ordering of the
>>> properties in the binding file being alphanumerical, rather than the
>>> more typical approach of approximately following the order of
>>> dts-coding-style.
>>
>>
>> Ah, then I misunderstood and, even though it is a nit, I do care because
>> old code is then used for new patches. Bindings follow DTS rules, thus
>> should be:
>> 1. compatible
>> 2. reg
>> 3. core properties
>> 4. vendor properties
>>
>> Kaustabh, can you change it please?
>
> Ack, will do that in v8 then.
>
> While at it, do you also want me to drop the multi-led compatible string?
> So it would be:
Yes
>
> multi-led:
> $ref: /schemas/leds/leds-class-multicolor.yaml#
with "unevaluatedProperties: false"
Best regards,
Krzysztof
^ permalink raw reply
* Re: [PATCH 00/12] misc/syncobj: add /dev/syncobj device
From: Christian König @ 2026-05-19 13:28 UTC (permalink / raw)
To: Julian Orth
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Sumit Semwal, Jonathan Corbet, Shuah Khan,
Arnd Bergmann, Greg Kroah-Hartman, dri-devel, linux-kernel,
linux-media, linaro-mm-sig, linux-doc, wayland-devel,
Michel Dänzer
In-Reply-To: <CAHijbEWHp960qvZFoK7+9ppHAqkAR7=UQhtMUccqWzGd_pFPQA@mail.gmail.com>
On 5/19/26 15:19, Julian Orth wrote:
> On Tue, May 19, 2026 at 10:18 AM Christian König
> <christian.koenig@amd.com> wrote:
>>
>> On 5/18/26 14:58, Julian Orth wrote:
>>> On Mon, May 18, 2026 at 2:41 PM Christian König
>>> <christian.koenig@amd.com> wrote:
>> ...
>>>> It could be that we have eventfd integration for that as well now, but in that case you could give the compositor an eventfd instead of a drm_syncobj fd in the first place.
>>>
>>> Yes, all compositors use the DRM_IOCTL_SYNCOBJ_EVENTFD ioctl to wait
>>> async for the timeline point to materialize and/or be signaled. The
>>> wayland protocol was the motivation for that ioctl.
>>>
>>>>
>>>> So as far as I can see using drm_syncobj for software rendering really doesn't make sense, eventfd is a much better fit for that use case.
>>>
>>> Using eventfd has some disadvantages:
>>>
>>> - We've just added syncobj support to vulkan:
>>> https://github.com/KhronosGroup/Vulkan-Docs/issues/2473#issuecomment-4446117280.
>>> For eventfd we would not only have to add yet another extension, that
>>> would realistically only be exposed by llvmpipe, but also every
>>> compositor and every client would have to support both extensions.
>>> - Similarly, a new wayland protocol would need to be designed to
>>> support sync over eventfd.
>>> - Eventfd does not support timeline semantics. Meaning that you would
>>> have to send two eventfds over the wire for each commit, one for the
>>> acquire point and one for the release point. Whereas with syncobj you
>>> only need to send two integers per commit.
>>>
>>> I don't see the advantage when drm_syncobj already does everything we need.
>>>
>>> You seem to believe that compositors would not be ready for this and
>>> from that perspective I can understand your apprehension. But I can
>>> assure you that compositors are already fully set up to support all of
>>> the usecases I've described: The wayland protocol requires the
>>> compositor to support wait before signal.
>> Yeah that's much better than I thought it would be.
>>
>> And that eventfds don't support timeline points is indeed a pretty good argument.
>>
>> But I still don't see much justification for creating a /dev/syncobj device, this is clearly something DRM specific.
>
> The justification is given in the cover letter. To repeat them briefly:
>
> 1. This series makes the ability to manipulate syncobjs available
> independently of attached hardware.
> 2. It makes it available under a consistent path /dev/syncobj.
Exactly that is a big no-go. This has to be under /dev/dri.
> 3. It removes the need to translate between syncobjs fds and handles.
That's a pretty big no-go as well. The differentiation between FDs and handles is completely intentional.
>
>>
>> What about using VGEM for this?
>
> If the vgem render node were made available unconditionally under,
Software rendering is a complete corner case, I don't think that this will be enabled by default.
Regards,
Christian.
> say, /dev/vgem and DRIVER_SYNCOBJ_TIMELINE were added to the driver,
> then maybe that could solve points 1 and 2 above.
>
> But it would not solve point 3 and it sounds like a hack to me to have
> a render node available outside of /dev/dri.
>
>>
>> Regards,
>> Christian.
>>
>>>
>>>>
>>>> Regards,
>>>> Christian.
^ permalink raw reply
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: David Woodhouse @ 2026-05-19 13:24 UTC (permalink / raw)
To: Marc Zyngier, Paolo Bonzini
Cc: Will Deacon, Jonathan Corbet, Shuah Khan, kvm,
Linux Doc Mailing List, Kernel Mailing List, Linux,
Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann,
Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest
In-Reply-To: <86pl2rwoat.wl-maz@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 2101 bytes --]
On Tue, 2026-05-19 at 13:56 +0100, Marc Zyngier wrote:
> On Tue, 19 May 2026 13:38:57 +0100,
> Marc Zyngier <maz@kernel.org> wrote:
> >
> > As I said before, I'd be OK with something that would restore IIDR to
> > REV1. But not something that actively breaks the GIC emulation by
> > reintroducing a bug. That's, by construction, dead code that will only
> > bitrot, because there is no SW that can make use of this nonsense.
>
> I will also add that if we make it a policy to preserve buggy
> behaviours that the guest cannot be relying on, then I question
> whether we should be fixing anything at all.
I think we just have a different understanding of what it practically
means to have behaviour "that the guest cannot be relying on", as in
the examples I just described for the IIDR issue.
> For example, 6.19 fixed a totally buggy behaviour where a guest
> couldn't not have more than (on most HW) 4 interrupts in flight at any
> given time. This was obviously totally bogus, and this was fixed
> unconditionally, as legitimate guests could experience gold-platted
> lock-ups.
And marked with a Fixes: tag and backported to stable, one presumes?
I'm confused that you think this is relevant. Can you contrive a
situation where a guest actually relied on this bug and *survived*,
like the situations I just explained for the IIDR issue?
You can nit-pick my hypotheticals as unlikely — and they are. But if we
always just YOLO it and change guest-visible behaviour on the basis
that it's "unlikely" to break anyone, and there are many such changes
in a given kernel deployment (e.g. from v6.6 to v6.18), then the
cumulative probability of being bitten by one of those "unlikely"
problems approaches 1.
There's a reason we do a *huge* amount of testing of what the guest
sees as we move from one kernel to the next, and back again, and
endeavour to eliminate all those differences.
And once the new kernel *is* deployed and won't be rolled back, of
course, all new launches can get the newer behaviour (and the latest
version of PSCI, etc...)
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]
^ permalink raw reply
* Re: [PATCH 00/12] misc/syncobj: add /dev/syncobj device
From: Julian Orth @ 2026-05-19 13:19 UTC (permalink / raw)
To: Christian König
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Sumit Semwal, Jonathan Corbet, Shuah Khan,
Arnd Bergmann, Greg Kroah-Hartman, dri-devel, linux-kernel,
linux-media, linaro-mm-sig, linux-doc, wayland-devel,
Michel Dänzer
In-Reply-To: <38551bfe-75e1-4978-b57d-adc43cebc85e@amd.com>
On Tue, May 19, 2026 at 10:18 AM Christian König
<christian.koenig@amd.com> wrote:
>
> On 5/18/26 14:58, Julian Orth wrote:
> > On Mon, May 18, 2026 at 2:41 PM Christian König
> > <christian.koenig@amd.com> wrote:
> ...
> >> It could be that we have eventfd integration for that as well now, but in that case you could give the compositor an eventfd instead of a drm_syncobj fd in the first place.
> >
> > Yes, all compositors use the DRM_IOCTL_SYNCOBJ_EVENTFD ioctl to wait
> > async for the timeline point to materialize and/or be signaled. The
> > wayland protocol was the motivation for that ioctl.
> >
> >>
> >> So as far as I can see using drm_syncobj for software rendering really doesn't make sense, eventfd is a much better fit for that use case.
> >
> > Using eventfd has some disadvantages:
> >
> > - We've just added syncobj support to vulkan:
> > https://github.com/KhronosGroup/Vulkan-Docs/issues/2473#issuecomment-4446117280.
> > For eventfd we would not only have to add yet another extension, that
> > would realistically only be exposed by llvmpipe, but also every
> > compositor and every client would have to support both extensions.
> > - Similarly, a new wayland protocol would need to be designed to
> > support sync over eventfd.
> > - Eventfd does not support timeline semantics. Meaning that you would
> > have to send two eventfds over the wire for each commit, one for the
> > acquire point and one for the release point. Whereas with syncobj you
> > only need to send two integers per commit.
> >
> > I don't see the advantage when drm_syncobj already does everything we need.
> >
> > You seem to believe that compositors would not be ready for this and
> > from that perspective I can understand your apprehension. But I can
> > assure you that compositors are already fully set up to support all of
> > the usecases I've described: The wayland protocol requires the
> > compositor to support wait before signal.
> Yeah that's much better than I thought it would be.
>
> And that eventfds don't support timeline points is indeed a pretty good argument.
>
> But I still don't see much justification for creating a /dev/syncobj device, this is clearly something DRM specific.
The justification is given in the cover letter. To repeat them briefly:
1. This series makes the ability to manipulate syncobjs available
independently of attached hardware.
2. It makes it available under a consistent path /dev/syncobj.
3. It removes the need to translate between syncobjs fds and handles.
>
> What about using VGEM for this?
If the vgem render node were made available unconditionally under,
say, /dev/vgem and DRIVER_SYNCOBJ_TIMELINE were added to the driver,
then maybe that could solve points 1 and 2 above.
But it would not solve point 3 and it sounds like a hack to me to have
a render node available outside of /dev/dri.
>
> Regards,
> Christian.
>
> >
> >>
> >> Regards,
> >> Christian.
^ permalink raw reply
* [PATCH v17 14/14] crypto: qce - Communicate the base physical address to the dmaengine
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
In order to communicate to the BAM DMA engine which address should be
used as a scratchpad for dummy writes related to BAM pipe locking,
fill out and attach the provided metadata struct to the descriptor.
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/dma.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index b66e6386fccda20d9462e70e51b8b485be85dec8..97b0f02c2b4d212f9e9ad41bbcb3a33e0b64835a 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -11,6 +11,7 @@
#include "core.h"
#include "dma.h"
+#include "regs-v5.h"
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
#define QCE_BAM_CMD_SGL_SIZE 128
@@ -43,6 +44,10 @@ void qce_clear_bam_transaction(struct qce_device *qce)
int qce_submit_cmd_desc(struct qce_device *qce)
{
+ struct bam_desc_metadata meta = {
+ .scratchpad_addr = qce->base_phys + REG_VERSION,
+ .direction = DMA_MEM_TO_DEV,
+ };
struct qce_desc_info *qce_desc = qce->dma.bam_txn->desc;
struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
struct dma_async_tx_descriptor *dma_desc;
@@ -63,6 +68,10 @@ int qce_submit_cmd_desc(struct qce_device *qce)
goto err_unmap_sg;
}
+ ret = dmaengine_desc_attach_metadata(dma_desc, &meta, 0);
+ if (ret)
+ goto err_unmap_sg;
+
qce_desc->dma_desc = dma_desc;
cookie = dmaengine_submit(qce_desc->dma_desc);
--
2.47.3
^ permalink raw reply related
* [PATCH v17 13/14] crypto: qce - Add BAM DMA support for crypto register I/O
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Switch to using BAM DMA for register I/O in addition to passing data. To
that end: provide the necessary infrastructure in the driver, modify the
ordering of operations as required and replace all direct register writes
with wrappers queueing DMA command descriptors.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/aead.c | 10 ++--
drivers/crypto/qce/common.c | 20 ++++---
drivers/crypto/qce/core.h | 4 ++
drivers/crypto/qce/dma.c | 123 ++++++++++++++++++++++++++++++++++++++++--
drivers/crypto/qce/dma.h | 5 ++
drivers/crypto/qce/sha.c | 10 ++--
drivers/crypto/qce/skcipher.c | 10 ++--
7 files changed, 152 insertions(+), 30 deletions(-)
diff --git a/drivers/crypto/qce/aead.c b/drivers/crypto/qce/aead.c
index 03b8042da9a1b4aebdc775ad8ab912abc7b2383d..e271ecbcbb4a33c405fbec85c458cf1daa7e2c55 100644
--- a/drivers/crypto/qce/aead.c
+++ b/drivers/crypto/qce/aead.c
@@ -463,17 +463,17 @@ qce_aead_async_req_handle(struct crypto_async_request *async_req)
src_nents = dst_nents - 1;
}
- ret = qce_dma_prep_sgs(&qce->dma, rctx->src_sg, src_nents, rctx->dst_sg, dst_nents,
- qce_aead_done, async_req);
+ ret = qce_start(async_req, tmpl->crypto_alg_type);
if (ret)
goto error_unmap_src;
- qce_dma_issue_pending(&qce->dma);
-
- ret = qce_start(async_req, tmpl->crypto_alg_type);
+ ret = qce_dma_prep_sgs(&qce->dma, rctx->src_sg, src_nents, rctx->dst_sg, dst_nents,
+ qce_aead_done, async_req);
if (ret)
goto error_terminate;
+ qce_dma_issue_pending(&qce->dma);
+
return 0;
error_terminate:
diff --git a/drivers/crypto/qce/common.c b/drivers/crypto/qce/common.c
index 54a78a57f63028f01870a3edeb8e390f523bb190..37bb6f03244d317a887aeb0aa10cefe327b4ce05 100644
--- a/drivers/crypto/qce/common.c
+++ b/drivers/crypto/qce/common.c
@@ -25,7 +25,7 @@ static inline u32 qce_read(struct qce_device *qce, u32 offset)
static inline void qce_write(struct qce_device *qce, u32 offset, u32 val)
{
- writel(val, qce->base + offset);
+ qce_write_dma(qce, offset, val);
}
static inline void qce_write_array(struct qce_device *qce, u32 offset,
@@ -82,6 +82,8 @@ static void qce_setup_config(struct qce_device *qce)
{
u32 config;
+ qce_clear_bam_transaction(qce);
+
/* get big endianness */
config = qce_config_reg(qce, 0);
@@ -90,12 +92,14 @@ static void qce_setup_config(struct qce_device *qce)
qce_write(qce, REG_CONFIG, config);
}
-static inline void qce_crypto_go(struct qce_device *qce, bool result_dump)
+static inline int qce_crypto_go(struct qce_device *qce, bool result_dump)
{
if (result_dump)
qce_write(qce, REG_GOPROC, BIT(GO_SHIFT) | BIT(RESULTS_DUMP_SHIFT));
else
qce_write(qce, REG_GOPROC, BIT(GO_SHIFT));
+
+ return qce_submit_cmd_desc(qce);
}
#if defined(CONFIG_CRYPTO_DEV_QCE_SHA) || defined(CONFIG_CRYPTO_DEV_QCE_AEAD)
@@ -223,9 +227,7 @@ static int qce_setup_regs_ahash(struct crypto_async_request *async_req)
config = qce_config_reg(qce, 1);
qce_write(qce, REG_CONFIG, config);
- qce_crypto_go(qce, true);
-
- return 0;
+ return qce_crypto_go(qce, true);
}
#endif
@@ -386,9 +388,7 @@ static int qce_setup_regs_skcipher(struct crypto_async_request *async_req)
config = qce_config_reg(qce, 1);
qce_write(qce, REG_CONFIG, config);
- qce_crypto_go(qce, true);
-
- return 0;
+ return qce_crypto_go(qce, true);
}
#endif
@@ -535,9 +535,7 @@ static int qce_setup_regs_aead(struct crypto_async_request *async_req)
qce_write(qce, REG_CONFIG, config);
/* Start the process */
- qce_crypto_go(qce, !IS_CCM(flags));
-
- return 0;
+ return qce_crypto_go(qce, !IS_CCM(flags));
}
#endif
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index a80e12eac6c87e5321cce16c56a4bf5003474ef0..d238097f834e4605f3825f23d0316d4196439116 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -30,6 +30,8 @@
* @base_dma: base DMA address
* @base_phys: base physical address
* @dma_size: size of memory mapped for DMA
+ * @read_buf: Buffer for DMA to write back to
+ * @read_buf_dma: Mapped address of the read buffer
* @async_req_enqueue: invoked by every algorithm to enqueue a request
* @async_req_done: invoked by every algorithm to finish its request
*/
@@ -49,6 +51,8 @@ struct qce_device {
dma_addr_t base_dma;
phys_addr_t base_phys;
size_t dma_size;
+ __le32 *read_buf;
+ dma_addr_t read_buf_dma;
int (*async_req_enqueue)(struct qce_device *qce,
struct crypto_async_request *req);
void (*async_req_done)(struct qce_device *qce, int ret);
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 3db46fc0c419a0a387abce93649084fbf4b1f128..b66e6386fccda20d9462e70e51b8b485be85dec8 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -4,6 +4,8 @@
*/
#include <linux/device.h>
+#include <linux/dma/qcom_bam_dma.h>
+#include <linux/dma-mapping.h>
#include <linux/dmaengine.h>
#include <crypto/scatterwalk.h>
@@ -11,6 +13,99 @@
#include "dma.h"
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
+#define QCE_BAM_CMD_SGL_SIZE 128
+#define QCE_BAM_CMD_ELEMENT_SIZE 128
+#define QCE_MAX_REG_READ 8
+
+struct qce_desc_info {
+ struct dma_async_tx_descriptor *dma_desc;
+ enum dma_data_direction dir;
+};
+
+struct qce_bam_transaction {
+ struct bam_cmd_element bam_ce[QCE_BAM_CMD_ELEMENT_SIZE];
+ struct scatterlist wr_sgl[QCE_BAM_CMD_SGL_SIZE];
+ struct qce_desc_info *desc;
+ u32 bam_ce_idx;
+ u32 pre_bam_ce_idx;
+ u32 wr_sgl_cnt;
+};
+
+void qce_clear_bam_transaction(struct qce_device *qce)
+{
+ struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
+
+ bam_txn->bam_ce_idx = 0;
+ bam_txn->wr_sgl_cnt = 0;
+ bam_txn->bam_ce_idx = 0;
+ bam_txn->pre_bam_ce_idx = 0;
+}
+
+int qce_submit_cmd_desc(struct qce_device *qce)
+{
+ struct qce_desc_info *qce_desc = qce->dma.bam_txn->desc;
+ struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
+ struct dma_async_tx_descriptor *dma_desc;
+ struct dma_chan *chan = qce->dma.rxchan;
+ unsigned long attrs = DMA_PREP_CMD;
+ dma_cookie_t cookie;
+ unsigned int mapped;
+ int ret;
+
+ mapped = dma_map_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+ if (!mapped)
+ return -ENOMEM;
+
+ dma_desc = dmaengine_prep_slave_sg(chan, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt,
+ DMA_MEM_TO_DEV, attrs);
+ if (!dma_desc) {
+ ret = -ENOMEM;
+ goto err_unmap_sg;
+ }
+
+ qce_desc->dma_desc = dma_desc;
+ cookie = dmaengine_submit(qce_desc->dma_desc);
+
+ ret = dma_submit_error(cookie);
+ if (ret)
+ goto err_unmap_sg;
+
+ return 0;
+
+err_unmap_sg:
+ dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+ return ret;
+}
+
+static void qce_prep_dma_cmd_desc(struct qce_device *qce, struct qce_dma_data *dma,
+ unsigned int addr, void *buf)
+{
+ struct qce_bam_transaction *bam_txn = dma->bam_txn;
+ struct bam_cmd_element *bam_ce_buf;
+ int bam_ce_size, cnt, idx;
+
+ idx = bam_txn->bam_ce_idx;
+ bam_ce_buf = &bam_txn->bam_ce[idx];
+ bam_prep_ce_le32(bam_ce_buf, addr, BAM_WRITE_COMMAND, *((__le32 *)buf));
+
+ bam_ce_buf = &bam_txn->bam_ce[bam_txn->pre_bam_ce_idx];
+ bam_txn->bam_ce_idx++;
+ bam_ce_size = (bam_txn->bam_ce_idx - bam_txn->pre_bam_ce_idx) * sizeof(*bam_ce_buf);
+
+ cnt = bam_txn->wr_sgl_cnt;
+
+ sg_set_buf(&bam_txn->wr_sgl[cnt], bam_ce_buf, bam_ce_size);
+
+ ++bam_txn->wr_sgl_cnt;
+ bam_txn->pre_bam_ce_idx = bam_txn->bam_ce_idx;
+}
+
+void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val)
+{
+ unsigned int reg_addr = ((unsigned int)(qce->base_phys) + offset);
+
+ qce_prep_dma_cmd_desc(qce, &qce->dma, reg_addr, &val);
+}
int devm_qce_dma_request(struct qce_device *qce)
{
@@ -31,6 +126,21 @@ int devm_qce_dma_request(struct qce_device *qce)
return dev_err_probe(dev, PTR_ERR(dma->rxchan),
"Failed to get RX DMA channel\n");
+ dma->bam_txn = devm_kzalloc(dev, sizeof(*dma->bam_txn), GFP_KERNEL);
+ if (!dma->bam_txn)
+ return -ENOMEM;
+
+ dma->bam_txn->desc = devm_kzalloc(dev, sizeof(*dma->bam_txn->desc), GFP_KERNEL);
+ if (!dma->bam_txn->desc)
+ return -ENOMEM;
+
+ sg_init_table(dma->bam_txn->wr_sgl, QCE_BAM_CMD_SGL_SIZE);
+
+ qce->read_buf = dmam_alloc_coherent(qce->dev, QCE_MAX_REG_READ * sizeof(*qce->read_buf),
+ &qce->read_buf_dma, GFP_KERNEL);
+ if (!qce->read_buf)
+ return -ENOMEM;
+
return 0;
}
@@ -90,28 +200,33 @@ int qce_dma_prep_sgs(struct qce_dma_data *dma, struct scatterlist *rx_sg,
{
struct dma_chan *rxchan = dma->rxchan;
struct dma_chan *txchan = dma->txchan;
- unsigned long flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
+ unsigned long txflags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
+ unsigned long rxflags = txflags | DMA_PREP_FENCE;
int ret;
- ret = qce_dma_prep_sg(rxchan, rx_sg, rx_nents, flags, DMA_MEM_TO_DEV,
+ ret = qce_dma_prep_sg(rxchan, rx_sg, rx_nents, rxflags, DMA_MEM_TO_DEV,
NULL, NULL);
if (ret)
return ret;
- return qce_dma_prep_sg(txchan, tx_sg, tx_nents, flags, DMA_DEV_TO_MEM,
+ return qce_dma_prep_sg(txchan, tx_sg, tx_nents, txflags, DMA_DEV_TO_MEM,
cb, cb_param);
}
void qce_dma_issue_pending(struct qce_dma_data *dma)
{
- dma_async_issue_pending(dma->rxchan);
dma_async_issue_pending(dma->txchan);
+ dma_async_issue_pending(dma->rxchan);
}
int qce_dma_terminate_all(struct qce_dma_data *dma)
{
+ struct qce_device *qce = container_of(dma, struct qce_device, dma);
+ struct qce_bam_transaction *bam_txn = dma->bam_txn;
int ret;
+ dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+
ret = dmaengine_terminate_all(dma->rxchan);
return ret ?: dmaengine_terminate_all(dma->txchan);
}
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index 483789d9fa98e79d1283de8297bf2fc2a773f3a7..f05dfa9e6b25bd60e32f45079a8bc7e6a4cf81f9 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -8,6 +8,7 @@
#include <linux/dmaengine.h>
+struct qce_bam_transaction;
struct qce_device;
/* maximum data transfer block size between BAM and CE */
@@ -32,6 +33,7 @@ struct qce_dma_data {
struct dma_chan *txchan;
struct dma_chan *rxchan;
struct qce_result_dump *result_buf;
+ struct qce_bam_transaction *bam_txn;
};
int devm_qce_dma_request(struct qce_device *qce);
@@ -43,5 +45,8 @@ int qce_dma_terminate_all(struct qce_dma_data *dma);
struct scatterlist *
qce_sgtable_add(struct sg_table *sgt, struct scatterlist *sg_add,
unsigned int max_len);
+void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val);
+int qce_submit_cmd_desc(struct qce_device *qce);
+void qce_clear_bam_transaction(struct qce_device *qce);
#endif /* _DMA_H_ */
diff --git a/drivers/crypto/qce/sha.c b/drivers/crypto/qce/sha.c
index a3a1a205aaf8559a04809936e2a3b7d564c16c53..5be82b345753f49202797852cec09dbc7f0a1e03 100644
--- a/drivers/crypto/qce/sha.c
+++ b/drivers/crypto/qce/sha.c
@@ -109,17 +109,17 @@ static int qce_ahash_async_req_handle(struct crypto_async_request *async_req)
goto error_unmap_src;
}
- ret = qce_dma_prep_sgs(&qce->dma, req->src, rctx->src_nents,
- &rctx->result_sg, 1, qce_ahash_done, async_req);
+ ret = qce_start(async_req, tmpl->crypto_alg_type);
if (ret)
goto error_unmap_dst;
- qce_dma_issue_pending(&qce->dma);
-
- ret = qce_start(async_req, tmpl->crypto_alg_type);
+ ret = qce_dma_prep_sgs(&qce->dma, req->src, rctx->src_nents,
+ &rctx->result_sg, 1, qce_ahash_done, async_req);
if (ret)
goto error_terminate;
+ qce_dma_issue_pending(&qce->dma);
+
return 0;
error_terminate:
diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 1fef315a7105c869e7fc6a60719087b721e78bb3..6535336a2c57c39db94999011890b8bdad5c58c2 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -142,18 +142,18 @@ qce_skcipher_async_req_handle(struct crypto_async_request *async_req)
src_nents = dst_nents - 1;
}
+ ret = qce_start(async_req, tmpl->crypto_alg_type);
+ if (ret)
+ goto error_unmap_src;
+
ret = qce_dma_prep_sgs(&qce->dma, rctx->src_sg, src_nents,
rctx->dst_sg, dst_nents,
qce_skcipher_done, async_req);
if (ret)
- goto error_unmap_src;
+ goto error_terminate;
qce_dma_issue_pending(&qce->dma);
- ret = qce_start(async_req, tmpl->crypto_alg_type);
- if (ret)
- goto error_terminate;
-
return 0;
error_terminate:
--
2.47.3
^ permalink raw reply related
* [PATCH v17 12/14] crypto: qce - Map crypto memory for DMA
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
As the first step in converting the driver to using DMA for register
I/O, let's map the crypto memory range.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/core.c | 23 ++++++++++++++++++++++-
drivers/crypto/qce/core.h | 6 ++++++
2 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index 90f44db6606173d8afbd295a6dadea177b7bcd11..92e551f4570c0c69cbbbb709a0752fbc16c307e5 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -192,10 +192,19 @@ static void qce_cancel_work(void *data)
cancel_work_sync(work);
}
+static void qce_crypto_unmap_dma(void *data)
+{
+ struct qce_device *qce = data;
+
+ dma_unmap_resource(qce->dev, qce->base_dma, qce->dma_size,
+ DMA_BIDIRECTIONAL, 0);
+}
+
static int qce_crypto_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
struct qce_device *qce;
+ struct resource *res;
int ret;
qce = devm_kzalloc(dev, sizeof(*qce), GFP_KERNEL);
@@ -205,7 +214,7 @@ static int qce_crypto_probe(struct platform_device *pdev)
qce->dev = dev;
platform_set_drvdata(pdev, qce);
- qce->base = devm_platform_ioremap_resource(pdev, 0);
+ qce->base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
if (IS_ERR(qce->base))
return PTR_ERR(qce->base);
@@ -256,6 +265,18 @@ static int qce_crypto_probe(struct platform_device *pdev)
qce->async_req_enqueue = qce_async_request_enqueue;
qce->async_req_done = qce_async_request_done;
+ qce->dma_size = resource_size(res);
+ qce->base_dma = dma_map_resource(dev, res->start, qce->dma_size,
+ DMA_BIDIRECTIONAL, 0);
+ qce->base_phys = res->start;
+ ret = dma_mapping_error(dev, qce->base_dma);
+ if (ret)
+ return ret;
+
+ ret = devm_add_action_or_reset(qce->dev, qce_crypto_unmap_dma, qce);
+ if (ret)
+ return ret;
+
return devm_qce_register_algs(qce);
}
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index f092ce2d3b04a936a37805c20ac5ba78d8fdd2df..a80e12eac6c87e5321cce16c56a4bf5003474ef0 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -27,6 +27,9 @@
* @dma: pointer to dma data
* @burst_size: the crypto burst size
* @pipe_pair_id: which pipe pair id the device using
+ * @base_dma: base DMA address
+ * @base_phys: base physical address
+ * @dma_size: size of memory mapped for DMA
* @async_req_enqueue: invoked by every algorithm to enqueue a request
* @async_req_done: invoked by every algorithm to finish its request
*/
@@ -43,6 +46,9 @@ struct qce_device {
struct qce_dma_data dma;
int burst_size;
unsigned int pipe_pair_id;
+ dma_addr_t base_dma;
+ phys_addr_t base_phys;
+ size_t dma_size;
int (*async_req_enqueue)(struct qce_device *qce,
struct crypto_async_request *req);
void (*async_req_done)(struct qce_device *qce, int ret);
--
2.47.3
^ permalink raw reply related
* [PATCH v17 11/14] crypto: qce - Use existing devres APIs in devm_qce_dma_request()
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
Konrad Dybcio
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Switch to devm_kmalloc() and devm_dma_alloc_chan() in
devm_qce_dma_request(). This allows us to drop two labels and shrink the
function.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/dma.c | 41 ++++++++++-------------------------------
1 file changed, 10 insertions(+), 31 deletions(-)
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index c29b0abe9445381a019e0447d30acfd7319d5c1f..3db46fc0c419a0a387abce93649084fbf4b1f128 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -12,47 +12,26 @@
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
-static void qce_dma_release(void *data)
-{
- struct qce_dma_data *dma = data;
-
- dma_release_channel(dma->txchan);
- dma_release_channel(dma->rxchan);
- kfree(dma->result_buf);
-}
-
int devm_qce_dma_request(struct qce_device *qce)
{
struct qce_dma_data *dma = &qce->dma;
struct device *dev = qce->dev;
- int ret;
- dma->txchan = dma_request_chan(dev, "tx");
+ dma->result_buf = devm_kmalloc(dev, QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ, GFP_KERNEL);
+ if (!dma->result_buf)
+ return -ENOMEM;
+
+ dma->txchan = devm_dma_request_chan(dev, "tx");
if (IS_ERR(dma->txchan))
return dev_err_probe(dev, PTR_ERR(dma->txchan),
"Failed to get TX DMA channel\n");
- dma->rxchan = dma_request_chan(dev, "rx");
- if (IS_ERR(dma->rxchan)) {
- ret = dev_err_probe(dev, PTR_ERR(dma->rxchan),
- "Failed to get RX DMA channel\n");
- goto error_rx;
- }
-
- dma->result_buf = kmalloc(QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ,
- GFP_KERNEL);
- if (!dma->result_buf) {
- ret = -ENOMEM;
- goto error_nomem;
- }
-
- return devm_add_action_or_reset(dev, qce_dma_release, dma);
+ dma->rxchan = devm_dma_request_chan(dev, "rx");
+ if (IS_ERR(dma->rxchan))
+ return dev_err_probe(dev, PTR_ERR(dma->rxchan),
+ "Failed to get RX DMA channel\n");
-error_nomem:
- dma_release_channel(dma->rxchan);
-error_rx:
- dma_release_channel(dma->txchan);
- return ret;
+ return 0;
}
struct scatterlist *
--
2.47.3
^ permalink raw reply related
* [PATCH v17 10/14] crypto: qce - Simplify arguments of devm_qce_dma_request()
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
This function can extract all the information it needs from struct
qce_device alone so simplify its arguments. This is done in preparation
for adding support for register I/O over DMA which will require
accessing even more fields from struct qce_device.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/core.c | 2 +-
drivers/crypto/qce/dma.c | 5 ++++-
drivers/crypto/qce/dma.h | 4 +++-
3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index 5f724db7c65930991218557394d99574418fb68c..90f44db6606173d8afbd295a6dadea177b7bcd11 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -233,7 +233,7 @@ static int qce_crypto_probe(struct platform_device *pdev)
if (ret)
return ret;
- ret = devm_qce_dma_request(qce->dev, &qce->dma);
+ ret = devm_qce_dma_request(qce);
if (ret)
return ret;
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 08bf3e8ec12433c1a8ee17003f3487e41b7329e4..c29b0abe9445381a019e0447d30acfd7319d5c1f 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -7,6 +7,7 @@
#include <linux/dmaengine.h>
#include <crypto/scatterwalk.h>
+#include "core.h"
#include "dma.h"
#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
@@ -20,8 +21,10 @@ static void qce_dma_release(void *data)
kfree(dma->result_buf);
}
-int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma)
+int devm_qce_dma_request(struct qce_device *qce)
{
+ struct qce_dma_data *dma = &qce->dma;
+ struct device *dev = qce->dev;
int ret;
dma->txchan = dma_request_chan(dev, "tx");
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index fc337c435cd14917bdfb99febcf9119275afdeba..483789d9fa98e79d1283de8297bf2fc2a773f3a7 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -8,6 +8,8 @@
#include <linux/dmaengine.h>
+struct qce_device;
+
/* maximum data transfer block size between BAM and CE */
#define QCE_BAM_BURST_SIZE 64
@@ -32,7 +34,7 @@ struct qce_dma_data {
struct qce_result_dump *result_buf;
};
-int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma);
+int devm_qce_dma_request(struct qce_device *qce);
int qce_dma_prep_sgs(struct qce_dma_data *dma, struct scatterlist *sg_in,
int in_ents, struct scatterlist *sg_out, int out_ents,
dma_async_tx_callback cb, void *cb_param);
--
2.47.3
^ permalink raw reply related
* [PATCH v17 09/14] crypto: qce - Remove unused ignore_buf
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
It's unclear what the purpose of this field is. It has been here since
the initial commit but without any explanation. The driver works fine
without it. We still keep allocating more space in the result buffer, we
just don't need to store its address. While at it: move the
QCE_IGNORE_BUF_SZ definition into dma.c as it's not used outside of this
compilation unit.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/dma.c | 4 ++--
drivers/crypto/qce/dma.h | 2 --
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 68cafd4741ad3d91906d39e817fc7873b028d498..08bf3e8ec12433c1a8ee17003f3487e41b7329e4 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -9,6 +9,8 @@
#include "dma.h"
+#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
+
static void qce_dma_release(void *data)
{
struct qce_dma_data *dma = data;
@@ -41,8 +43,6 @@ int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma)
goto error_nomem;
}
- dma->ignore_buf = dma->result_buf + QCE_RESULT_BUF_SZ;
-
return devm_add_action_or_reset(dev, qce_dma_release, dma);
error_nomem:
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index 31629185000e12242fa07c2cc08b95fcbd5d4b8c..fc337c435cd14917bdfb99febcf9119275afdeba 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -23,7 +23,6 @@ struct qce_result_dump {
u32 status2;
};
-#define QCE_IGNORE_BUF_SZ (2 * QCE_BAM_BURST_SIZE)
#define QCE_RESULT_BUF_SZ \
ALIGN(sizeof(struct qce_result_dump), QCE_BAM_BURST_SIZE)
@@ -31,7 +30,6 @@ struct qce_dma_data {
struct dma_chan *txchan;
struct dma_chan *rxchan;
struct qce_result_dump *result_buf;
- void *ignore_buf;
};
int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma);
--
2.47.3
^ permalink raw reply related
* [PATCH v17 08/14] crypto: qce - Include algapi.h in the core.h header
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
The header defines a struct embedding struct crypto_queue whose size
needs to be known and which is defined in crypto/algapi.h. Move the
inclusion from core.c to core.h.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/core.c | 1 -
drivers/crypto/qce/core.h | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index e82fc862c74b20c34ea5abd6c0b98b71089a3fee..5f724db7c65930991218557394d99574418fb68c 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -13,7 +13,6 @@
#include <linux/mod_devicetable.h>
#include <linux/platform_device.h>
#include <linux/types.h>
-#include <crypto/algapi.h>
#include <crypto/internal/hash.h>
#include "core.h"
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index eb6fa7a8b64a81daf9ad5304a3ae4e5e597a70b8..f092ce2d3b04a936a37805c20ac5ba78d8fdd2df 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -8,6 +8,7 @@
#include <linux/mutex.h>
#include <linux/workqueue.h>
+#include <crypto/algapi.h>
#include "dma.h"
--
2.47.3
^ permalink raw reply related
* [PATCH v17 07/14] crypto: qce - Cancel work on device detach
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
The workqueue is setup in probe() but never cancelled on error or in
remove(). Set up a devres action to clean it up.
Fixes: eb7986e5e14d ("crypto: qce - convert tasklet to workqueue")
Closes: https://sashiko.dev/#/patchset/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc%40oss.qualcomm.com?part=7
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/crypto/qce/core.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index b966f3365b7de8d2a8f6707397a34aa4facdc4ac..e82fc862c74b20c34ea5abd6c0b98b71089a3fee 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -186,6 +186,13 @@ static int qce_check_version(struct qce_device *qce)
return 0;
}
+static void qce_cancel_work(void *data)
+{
+ struct work_struct *work = data;
+
+ cancel_work_sync(work);
+}
+
static int qce_crypto_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
@@ -240,6 +247,11 @@ static int qce_crypto_probe(struct platform_device *pdev)
return ret;
INIT_WORK(&qce->done_work, qce_req_done_work);
+
+ ret = devm_add_action_or_reset(dev, qce_cancel_work, &qce->done_work);
+ if (ret)
+ return ret;
+
crypto_init_queue(&qce->queue, QCE_QUEUE_LENGTH);
qce->async_req_enqueue = qce_async_request_enqueue;
--
2.47.3
^ permalink raw reply related
* [PATCH v17 06/14] dmaengine: qcom: bam_dma: add support for BAM locking
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
Add support for BAM pipe locking. To that end: when starting DMA on an RX
channel - prepend the existing queue of issued descriptors with an
additional "dummy" command descriptor with the LOCK bit set. Once the
transaction is done (no more issued descriptors), issue one more dummy
descriptor with the UNLOCK bit.
We *must* wait until the transaction is signalled as done because we
must not perform any writes into config registers while the engine is
busy.
The dummy writes must be issued into a scratchpad register of the client
so provide a mechanism to communicate the right address via descriptor
metadata.
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 156 ++++++++++++++++++++++++++++++++++++++-
include/linux/dma/qcom_bam_dma.h | 14 ++++
2 files changed, 166 insertions(+), 4 deletions(-)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 30318cf01ee20b7e64a988e8ce1ec04dab55e3c3..2c9f90313c313851f84ebea8d99e73b37829b297 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -28,11 +28,13 @@
#include <linux/clk.h>
#include <linux/device.h>
#include <linux/dma-mapping.h>
+#include <linux/dma/qcom_bam_dma.h>
#include <linux/dmaengine.h>
#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/kernel.h>
+#include <linux/lockdep.h>
#include <linux/module.h>
#include <linux/of_address.h>
#include <linux/of_dma.h>
@@ -60,6 +62,8 @@ struct bam_desc_hw {
#define DESC_FLAG_EOB BIT(13)
#define DESC_FLAG_NWD BIT(12)
#define DESC_FLAG_CMD BIT(11)
+#define DESC_FLAG_LOCK BIT(10)
+#define DESC_FLAG_UNLOCK BIT(9)
struct bam_async_desc {
struct virt_dma_desc vd;
@@ -72,6 +76,10 @@ struct bam_async_desc {
struct bam_desc_hw *curr_desc;
+ /* BAM locking infrastructure */
+ struct scatterlist lock_sg;
+ struct bam_cmd_element lock_ce;
+
/* list node for the desc in the bam_chan list of descriptors */
struct list_head desc_node;
enum dma_transfer_direction dir;
@@ -391,6 +399,10 @@ struct bam_chan {
struct list_head desc_list;
struct list_head node;
+
+ /* BAM locking infrastructure */
+ phys_addr_t scratchpad_addr;
+ enum dma_transfer_direction direction;
};
static inline struct bam_chan *to_bam_chan(struct dma_chan *common)
@@ -652,6 +664,35 @@ static int bam_slave_config(struct dma_chan *chan,
return 0;
}
+static int bam_metadata_attach(struct dma_async_tx_descriptor *desc, void *data, size_t len)
+{
+ struct bam_chan *bchan = to_bam_chan(desc->chan);
+ const struct bam_device_data *bdata = bchan->bdev->dev_data;
+ struct bam_desc_metadata *metadata = data;
+
+ if (!data)
+ return -EINVAL;
+
+ if (!bdata->pipe_lock_supported)
+ /*
+ * The client wants to use locking but this BAM version doesn't
+ * support it. Don't return an error here as this will stop the
+ * client from using DMA at all for no reason.
+ */
+ return 0;
+
+ guard(spinlock_irqsave)(&bchan->vc.lock);
+
+ bchan->scratchpad_addr = metadata->scratchpad_addr;
+ bchan->direction = metadata->direction;
+
+ return 0;
+}
+
+static const struct dma_descriptor_metadata_ops bam_metadata_ops = {
+ .attach = bam_metadata_attach,
+};
+
/**
* bam_prep_slave_sg - Prep slave sg transaction
*
@@ -668,6 +709,7 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
void *context)
{
struct bam_chan *bchan = to_bam_chan(chan);
+ struct dma_async_tx_descriptor *tx_desc;
struct bam_device *bdev = bchan->bdev;
struct bam_async_desc *async_desc;
struct scatterlist *sg;
@@ -723,7 +765,12 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
} while (remainder > 0);
}
- return vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
+ tx_desc = vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
+ if (!tx_desc)
+ return NULL;
+
+ tx_desc->metadata_ops = &bam_metadata_ops;
+ return tx_desc;
}
/**
@@ -1012,13 +1059,106 @@ static void bam_apply_new_config(struct bam_chan *bchan,
bchan->reconfigure = 0;
}
+static struct bam_async_desc *
+bam_make_lock_desc(struct bam_chan *bchan, unsigned long flag)
+{
+ struct dma_chan *chan = &bchan->vc.chan;
+ struct bam_async_desc *async_desc;
+ struct bam_desc_hw *desc;
+ struct virt_dma_desc *vd;
+ struct virt_dma_chan *vc;
+ unsigned int mapped;
+ dma_cookie_t cookie;
+ int ret;
+
+ async_desc = kzalloc_flex(*async_desc, desc, 1, GFP_NOWAIT);
+ if (!async_desc) {
+ dev_err(bchan->bdev->dev, "failed to allocate the BAM lock descriptor\n");
+ return ERR_PTR(-ENOMEM);
+ }
+
+ sg_init_table(&async_desc->lock_sg, 1);
+
+ async_desc->num_desc = 1;
+ async_desc->curr_desc = async_desc->desc;
+ async_desc->dir = DMA_MEM_TO_DEV;
+
+ desc = async_desc->desc;
+
+ bam_prep_ce_le32(&async_desc->lock_ce, bchan->scratchpad_addr, BAM_WRITE_COMMAND, 0);
+ sg_set_buf(&async_desc->lock_sg, &async_desc->lock_ce, sizeof(async_desc->lock_ce));
+
+ mapped = dma_map_sg_attrs(chan->slave, &async_desc->lock_sg,
+ 1, DMA_TO_DEVICE, DMA_PREP_CMD);
+ if (!mapped) {
+ kfree(async_desc);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ desc->flags |= cpu_to_le16(DESC_FLAG_CMD | flag);
+ desc->addr = sg_dma_address(&async_desc->lock_sg);
+ desc->size = sizeof(struct bam_cmd_element);
+
+ vc = &bchan->vc;
+ vd = &async_desc->vd;
+
+ dma_async_tx_descriptor_init(&vd->tx, &vc->chan);
+ vd->tx.flags = DMA_PREP_CMD;
+ vd->tx.desc_free = vchan_tx_desc_free;
+ vd->tx_result.result = DMA_TRANS_NOERROR;
+ vd->tx_result.residue = 0;
+
+ cookie = dma_cookie_assign(&vd->tx);
+ ret = dma_submit_error(cookie);
+ if (ret) {
+ dma_unmap_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);
+ kfree(async_desc);
+ return ERR_PTR(ret);
+ }
+
+ return async_desc;
+}
+
+static int bam_do_setup_pipe_lock(struct bam_chan *bchan, bool lock)
+{
+ struct bam_device *bdev = bchan->bdev;
+ const struct bam_device_data *bdata = bdev->dev_data;
+ struct bam_async_desc *lock_desc;
+ unsigned long flag;
+
+ lockdep_assert_held(&bchan->vc.lock);
+
+ if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr ||
+ bchan->direction != DMA_MEM_TO_DEV)
+ return 0;
+
+ flag = lock ? DESC_FLAG_LOCK : DESC_FLAG_UNLOCK;
+
+ lock_desc = bam_make_lock_desc(bchan, flag);
+ if (IS_ERR(lock_desc))
+ return PTR_ERR(lock_desc);
+
+ if (lock)
+ list_add(&lock_desc->vd.node, &bchan->vc.desc_issued);
+ else
+ list_add_tail(&lock_desc->vd.node, &bchan->vc.desc_issued);
+
+ return 0;
+}
+
+static void bam_setup_pipe_lock(struct bam_chan *bchan)
+{
+ if (bam_do_setup_pipe_lock(bchan, true) || bam_do_setup_pipe_lock(bchan, false))
+ dev_err(bchan->vc.chan.slave, "Failed to setup BAM pipe lock descriptors");
+}
+
/**
* bam_start_dma - start next transaction
* @bchan: bam dma channel
*/
static void bam_start_dma(struct bam_chan *bchan)
{
- struct virt_dma_desc *vd = vchan_next_desc(&bchan->vc);
+ struct virt_dma_desc *vd;
struct bam_device *bdev = bchan->bdev;
struct bam_async_desc *async_desc = NULL;
struct bam_desc_hw *desc;
@@ -1030,6 +1170,9 @@ static void bam_start_dma(struct bam_chan *bchan)
lockdep_assert_held(&bchan->vc.lock);
+ bam_setup_pipe_lock(bchan);
+
+ vd = vchan_next_desc(&bchan->vc);
if (!vd)
return;
@@ -1157,8 +1300,12 @@ static void bam_issue_pending(struct dma_chan *chan)
*/
static void bam_dma_free_desc(struct virt_dma_desc *vd)
{
- struct bam_async_desc *async_desc = container_of(vd,
- struct bam_async_desc, vd);
+ struct bam_async_desc *async_desc = container_of(vd, struct bam_async_desc, vd);
+ struct bam_desc_hw *desc = async_desc->desc;
+ struct dma_chan *chan = vd->tx.chan;
+
+ if (le16_to_cpu(desc->flags) & (DESC_FLAG_LOCK | DESC_FLAG_UNLOCK))
+ dma_unmap_sg(chan->slave, &async_desc->lock_sg, 1, DMA_TO_DEVICE);
kfree(async_desc);
}
@@ -1349,6 +1496,7 @@ static int bam_dma_probe(struct platform_device *pdev)
bdev->common.device_terminate_all = bam_dma_terminate_all;
bdev->common.device_issue_pending = bam_issue_pending;
bdev->common.device_tx_status = bam_tx_status;
+ bdev->common.desc_metadata_modes = DESC_METADATA_CLIENT;
bdev->common.dev = bdev->dev;
ret = dma_async_device_register(&bdev->common);
diff --git a/include/linux/dma/qcom_bam_dma.h b/include/linux/dma/qcom_bam_dma.h
index 68fc0e643b1b97fe4520d5878daa322b81f4f559..a2594264b0f58c4b2b1c85e243cad0d5669c26dc 100644
--- a/include/linux/dma/qcom_bam_dma.h
+++ b/include/linux/dma/qcom_bam_dma.h
@@ -6,6 +6,8 @@
#ifndef _QCOM_BAM_DMA_H
#define _QCOM_BAM_DMA_H
+#include <linux/dmaengine.h>
+
#include <asm/byteorder.h>
/*
@@ -34,6 +36,18 @@ enum bam_command_type {
BAM_READ_COMMAND,
};
+/**
+ * struct bam_desc_metadata - DMA descriptor metadata specific to the BAM driver.
+ *
+ * @scratchpad_addr: Physical address to use for dummy write operations when
+ * queuing command descriptors with LOCK/UNLOCK bits set.
+ * @direction: Transfer direction of this channel.
+ */
+struct bam_desc_metadata {
+ phys_addr_t scratchpad_addr;
+ enum dma_transfer_direction direction;
+};
+
/*
* prep_bam_ce_le32 - Wrapper function to prepare a single BAM command
* element with the data already in le32 format.
--
2.47.3
^ permalink raw reply related
* [PATCH v17 05/14] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
Dmitry Baryshkov
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Extend the device match data with a flag indicating whether the IP
supports the BAM lock/unlock feature. Set it to true on BAM IP versions
1.4.0 and above.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 7f3d1b6dd5d7660d2743dafcc43878e5f7952b8d..30318cf01ee20b7e64a988e8ce1ec04dab55e3c3 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -115,6 +115,7 @@ struct reg_offset_data {
struct bam_device_data {
const struct reg_offset_data *reg_info;
+ bool pipe_lock_supported;
};
static const struct reg_offset_data bam_v1_3_reg_info[] = {
@@ -181,6 +182,7 @@ static const struct reg_offset_data bam_v1_4_reg_info[] = {
static const struct bam_device_data bam_v1_4_data = {
.reg_info = bam_v1_4_reg_info,
+ .pipe_lock_supported = true,
};
static const struct reg_offset_data bam_v1_7_reg_info[] = {
@@ -214,6 +216,7 @@ static const struct reg_offset_data bam_v1_7_reg_info[] = {
static const struct bam_device_data bam_v1_7_data = {
.reg_info = bam_v1_7_reg_info,
+ .pipe_lock_supported = true,
};
/* BAM CTRL */
--
2.47.3
^ permalink raw reply related
* [PATCH v17 04/14] dmaengine: qcom: bam_dma: Extend the driver's device match data
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
In preparation for supporting the pipe locking feature flag, extend the
amount of information we can carry in device match data: create a
separate structure and make the register information one of its fields.
This way, in subsequent patches, it will be just a matter of adding a
new field to the device data.
Reviewed-by: Dmitry Baryshkov <lumag@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 28 ++++++++++++++++++++++------
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index e2f16efcdb55f7465950fb81e22acb451e63ba0c..7f3d1b6dd5d7660d2743dafcc43878e5f7952b8d 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -113,6 +113,10 @@ struct reg_offset_data {
unsigned int pipe_mult, evnt_mult, ee_mult;
};
+struct bam_device_data {
+ const struct reg_offset_data *reg_info;
+};
+
static const struct reg_offset_data bam_v1_3_reg_info[] = {
[BAM_CTRL] = { 0x0F80, 0x00, 0x00, 0x00 },
[BAM_REVISION] = { 0x0F84, 0x00, 0x00, 0x00 },
@@ -142,6 +146,10 @@ static const struct reg_offset_data bam_v1_3_reg_info[] = {
[BAM_P_FIFO_SIZES] = { 0x1020, 0x00, 0x40, 0x00 },
};
+static const struct bam_device_data bam_v1_3_data = {
+ .reg_info = bam_v1_3_reg_info,
+};
+
static const struct reg_offset_data bam_v1_4_reg_info[] = {
[BAM_CTRL] = { 0x0000, 0x00, 0x00, 0x00 },
[BAM_REVISION] = { 0x0004, 0x00, 0x00, 0x00 },
@@ -171,6 +179,10 @@ static const struct reg_offset_data bam_v1_4_reg_info[] = {
[BAM_P_FIFO_SIZES] = { 0x1820, 0x00, 0x1000, 0x00 },
};
+static const struct bam_device_data bam_v1_4_data = {
+ .reg_info = bam_v1_4_reg_info,
+};
+
static const struct reg_offset_data bam_v1_7_reg_info[] = {
[BAM_CTRL] = { 0x00000, 0x00, 0x00, 0x00 },
[BAM_REVISION] = { 0x01000, 0x00, 0x00, 0x00 },
@@ -200,6 +212,10 @@ static const struct reg_offset_data bam_v1_7_reg_info[] = {
[BAM_P_FIFO_SIZES] = { 0x13820, 0x00, 0x1000, 0x00 },
};
+static const struct bam_device_data bam_v1_7_data = {
+ .reg_info = bam_v1_7_reg_info,
+};
+
/* BAM CTRL */
#define BAM_SW_RST BIT(0)
#define BAM_EN BIT(1)
@@ -393,7 +409,7 @@ struct bam_device {
bool powered_remotely;
u32 active_channels;
- const struct reg_offset_data *layout;
+ const struct bam_device_data *dev_data;
struct clk *bamclk;
int irq;
@@ -411,7 +427,7 @@ struct bam_device {
static inline void __iomem *bam_addr(struct bam_device *bdev, u32 pipe,
enum bam_reg reg)
{
- const struct reg_offset_data r = bdev->layout[reg];
+ const struct reg_offset_data r = bdev->dev_data->reg_info[reg];
return bdev->regs + r.base_offset +
r.pipe_mult * pipe +
@@ -1205,9 +1221,9 @@ static void bam_channel_init(struct bam_device *bdev, struct bam_chan *bchan,
}
static const struct of_device_id bam_of_match[] = {
- { .compatible = "qcom,bam-v1.3.0", .data = &bam_v1_3_reg_info },
- { .compatible = "qcom,bam-v1.4.0", .data = &bam_v1_4_reg_info },
- { .compatible = "qcom,bam-v1.7.0", .data = &bam_v1_7_reg_info },
+ { .compatible = "qcom,bam-v1.3.0", .data = &bam_v1_3_data },
+ { .compatible = "qcom,bam-v1.4.0", .data = &bam_v1_4_data },
+ { .compatible = "qcom,bam-v1.7.0", .data = &bam_v1_7_data },
{}
};
@@ -1231,7 +1247,7 @@ static int bam_dma_probe(struct platform_device *pdev)
return -ENODEV;
}
- bdev->layout = match->data;
+ bdev->dev_data = match->data;
bdev->regs = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(bdev->regs))
--
2.47.3
^ permalink raw reply related
* [PATCH v17 03/14] dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
Dmitry Baryshkov
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
BH workqueues are a modern mechanism, aiming to replace legacy tasklets.
Let's convert the BAM DMA driver to using the high-priority variant of
the BH workqueue.
[Vinod: suggested using the BG workqueue instead of the regular one
running in process context]
Suggested-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index cea44833201d641ce6a657840d354abb443501b5..e2f16efcdb55f7465950fb81e22acb451e63ba0c 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -42,6 +42,7 @@
#include <linux/pm_runtime.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>
+#include <linux/workqueue.h>
#include "../dmaengine.h"
#include "../virt-dma.h"
@@ -397,8 +398,8 @@ struct bam_device {
struct clk *bamclk;
int irq;
- /* dma start transaction tasklet */
- struct tasklet_struct task;
+ /* dma start transaction workqueue */
+ struct work_struct work;
};
/**
@@ -863,7 +864,7 @@ static u32 process_channel_irqs(struct bam_device *bdev)
/*
* if complete, process cookie. Otherwise
* push back to front of desc_issued so that
- * it gets restarted by the tasklet
+ * it gets restarted by the work queue.
*/
if (!async_desc->num_desc) {
vchan_cookie_complete(&async_desc->vd);
@@ -893,9 +894,9 @@ static irqreturn_t bam_dma_irq(int irq, void *data)
srcs |= process_channel_irqs(bdev);
- /* kick off tasklet to start next dma transfer */
+ /* kick off the work queue to start next dma transfer */
if (srcs & P_IRQ)
- tasklet_schedule(&bdev->task);
+ queue_work(system_bh_highpri_wq, &bdev->work);
ret = pm_runtime_get_sync(bdev->dev);
if (ret < 0)
@@ -1091,14 +1092,14 @@ static void bam_start_dma(struct bam_chan *bchan)
}
/**
- * dma_tasklet - DMA IRQ tasklet
- * @t: tasklet argument (bam controller structure)
+ * bam_dma_work() - DMA interrupt work queue callback
+ * @work: work queue struct embedded in the BAM controller device struct
*
* Sets up next DMA operation and then processes all completed transactions
*/
-static void dma_tasklet(struct tasklet_struct *t)
+static void bam_dma_work(struct work_struct *work)
{
- struct bam_device *bdev = from_tasklet(bdev, t, task);
+ struct bam_device *bdev = from_work(bdev, work, work);
struct bam_chan *bchan;
unsigned int i;
@@ -1111,14 +1112,13 @@ static void dma_tasklet(struct tasklet_struct *t)
if (!list_empty(&bchan->vc.desc_issued) && !IS_BUSY(bchan))
bam_start_dma(bchan);
}
-
}
/**
* bam_issue_pending - starts pending transactions
* @chan: dma channel
*
- * Calls tasklet directly which in turn starts any pending transactions
+ * Calls work queue directly which in turn starts any pending transactions
*/
static void bam_issue_pending(struct dma_chan *chan)
{
@@ -1286,14 +1286,14 @@ static int bam_dma_probe(struct platform_device *pdev)
if (ret)
goto err_disable_clk;
- tasklet_setup(&bdev->task, dma_tasklet);
+ INIT_WORK(&bdev->work, bam_dma_work);
bdev->channels = devm_kcalloc(bdev->dev, bdev->num_channels,
sizeof(*bdev->channels), GFP_KERNEL);
if (!bdev->channels) {
ret = -ENOMEM;
- goto err_tasklet_kill;
+ goto err_workqueue_cancel;
}
/* allocate and initialize channels */
@@ -1359,8 +1359,8 @@ static int bam_dma_probe(struct platform_device *pdev)
err_bam_channel_exit:
for (i = 0; i < bdev->num_channels; i++)
tasklet_kill(&bdev->channels[i].vc.task);
-err_tasklet_kill:
- tasklet_kill(&bdev->task);
+err_workqueue_cancel:
+ cancel_work_sync(&bdev->work);
err_disable_clk:
clk_disable_unprepare(bdev->bamclk);
@@ -1394,7 +1394,7 @@ static void bam_dma_remove(struct platform_device *pdev)
bdev->channels[i].fifo_phys);
}
- tasklet_kill(&bdev->task);
+ cancel_work_sync(&bdev->work);
clk_disable_unprepare(bdev->bamclk);
}
--
2.47.3
^ permalink raw reply related
* [PATCH v17 02/14] dmaengine: qcom: bam_dma: free interrupt before the clock in error path
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
The BAM interrupt is requested with a devres helper and so on error it's
freed after probe() returns. We disable the clock before freeing or
masking it so it may still fire and we may end up reading BAM registers
with clock disabled.
Stop using devres for interrupts as we free it in remove() manually
anyway. Add an appropriate label and free the interrupt before disabling
the clock in error path.
Fixes: e7c0fe2a5c84 ("dmaengine: add Qualcomm BAM dma driver")
Closes: https://sashiko.dev/#/patchset/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc%40oss.qualcomm.com?part=2
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/qcom/bam_dma.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 19116295f8325767a0d97a7848077885b118241c..cea44833201d641ce6a657840d354abb443501b5 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -1302,8 +1302,7 @@ static int bam_dma_probe(struct platform_device *pdev)
for (i = 0; i < bdev->num_channels; i++)
bam_channel_init(bdev, &bdev->channels[i], i);
- ret = devm_request_irq(bdev->dev, bdev->irq, bam_dma_irq,
- IRQF_TRIGGER_HIGH, "bam_dma", bdev);
+ ret = request_irq(bdev->irq, bam_dma_irq, IRQF_TRIGGER_HIGH, "bam_dma", bdev);
if (ret)
goto err_bam_channel_exit;
@@ -1336,7 +1335,7 @@ static int bam_dma_probe(struct platform_device *pdev)
ret = dma_async_device_register(&bdev->common);
if (ret) {
dev_err(bdev->dev, "failed to register dma async device\n");
- goto err_bam_channel_exit;
+ goto err_free_irq;
}
ret = of_dma_controller_register(pdev->dev.of_node, bam_dma_xlate,
@@ -1355,6 +1354,8 @@ static int bam_dma_probe(struct platform_device *pdev)
err_unregister_dma:
dma_async_device_unregister(&bdev->common);
+err_free_irq:
+ free_irq(bdev->irq, bdev);
err_bam_channel_exit:
for (i = 0; i < bdev->num_channels; i++)
tasklet_kill(&bdev->channels[i].vc.task);
@@ -1379,7 +1380,7 @@ static void bam_dma_remove(struct platform_device *pdev)
/* mask all interrupts for this execution environment */
writel_relaxed(0, bam_addr(bdev, 0, BAM_IRQ_SRCS_MSK_EE));
- devm_free_irq(bdev->dev, bdev->irq, bdev);
+ free_irq(bdev->irq, bdev);
for (i = 0; i < bdev->num_channels; i++) {
bam_dma_terminate_all(&bdev->channels[i].vc.chan);
--
2.47.3
^ permalink raw reply related
* [PATCH v17 01/14] dmaengine: constify struct dma_descriptor_metadata_ops
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260519-qcom-qce-cmd-descr-v17-0-53a595414b79@oss.qualcomm.com>
There's no reason for the instances of this struct to be modifiable.
Constify the pointer in struct dma_async_tx_descriptor and all drivers
currently using it.
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
drivers/dma/ti/k3-udma.c | 2 +-
drivers/dma/xilinx/xilinx_dma.c | 2 +-
include/linux/dmaengine.h | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
index c964ebfcf3b68d86e4bbc9b62bad2212f0ce3ee9..8a2f235b669aaf084a6f7b3e6b23d06b04768608 100644
--- a/drivers/dma/ti/k3-udma.c
+++ b/drivers/dma/ti/k3-udma.c
@@ -3408,7 +3408,7 @@ static int udma_set_metadata_len(struct dma_async_tx_descriptor *desc,
return 0;
}
-static struct dma_descriptor_metadata_ops metadata_ops = {
+static const struct dma_descriptor_metadata_ops metadata_ops = {
.attach = udma_attach_metadata,
.get_ptr = udma_get_metadata_ptr,
.set_len = udma_set_metadata_len,
diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 404235c1735384635597e88edc25c67c7d250647..165b11a7c776abc6a8d66d631e19da669644577d 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -653,7 +653,7 @@ static void *xilinx_dma_get_metadata_ptr(struct dma_async_tx_descriptor *tx,
return seg->hw.app;
}
-static struct dma_descriptor_metadata_ops xilinx_dma_metadata_ops = {
+static const struct dma_descriptor_metadata_ops xilinx_dma_metadata_ops = {
.get_ptr = xilinx_dma_get_metadata_ptr,
};
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index b3d251c9734e95e1b75cf6763d4d2c3a1c6a9910..5244edb90e7e7510bf4460b6a74ee2a7f91c1ccc 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -623,7 +623,7 @@ struct dma_async_tx_descriptor {
void *callback_param;
struct dmaengine_unmap_data *unmap;
enum dma_desc_metadata_mode desc_metadata_mode;
- struct dma_descriptor_metadata_ops *metadata_ops;
+ const struct dma_descriptor_metadata_ops *metadata_ops;
#ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
struct dma_async_tx_descriptor *next;
struct dma_async_tx_descriptor *parent;
--
2.47.3
^ permalink raw reply related
* [PATCH v17 00/14] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O
From: Bartosz Golaszewski @ 2026-05-19 13:17 UTC (permalink / raw)
To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
Peter Ujfalusi, Michal Simek, Frank Li, Andy Gross,
Neil Armstrong
Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
Dmitry Baryshkov, Konrad Dybcio
This revision addresses some issues pointed out by sashiko.
Merging strategy: there are build-time dependencies between the crypto
and DMA patches so the best approach is for Vinod to create an immutable
branch with the DMA part pulled in by the crypto tree.
This iteration continues to build on top of v12 but uses the BAM's NWD
bit on data descriptors as suggested by Stephan. To that end, there are
some more changes like reversing the order of command and data
descriptors queuedy by the QCE driver.
Currently the QCE crypto driver accesses the crypto engine registers
directly via CPU. Trust Zone may perform crypto operations simultaneously
resulting in a race condition. To remedy that, let's introduce support
for BAM locking/unlocking to the driver. The BAM driver will now wrap
any existing issued descriptor chains with additional descriptors
performing the locking when the client starts the transaction
(dmaengine_issue_pending()). The client wanting to profit from locking
needs to switch to performing register I/O over DMA and communicate the
address to which to perform the dummy writes via a call to
dmaengine_desc_attach_metadata().
In the specific case of the BAM DMA this translates to sending command
descriptors performing dummy writes with the relevant flags set. The BAM
will then lock all other pipes not related to the current pipe group, and
keep handling the current pipe only until it sees the the unlock bit.
In order for the locking to work correctly, we also need to switch to
using DMA for all register I/O.
On top of this, the series contains some additional tweaks and
refactoring.
The goal of this is not to improve the performance but to prepare the
driver for supporting decryption into secure buffers in the future.
Tested with tcrypt.ko, kcapi and cryptsetup.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
Changes in v17:
- New patch: free the interrupt before disabling the clock in error path
in probe()
- New patch: cancel the QCE work on device detach
- Hold the channel lock when attaching the metadata
- Reorder the operations in devm_qce_dma_request() to avoid freeing
memory that may still be used by the DMA channel
- Register algorithms as the last step in QCE's probe() to avoid making
the resources available to the system before the DMA is fully set up
- Fix error paths in algo request handlers
- Don't pass dmaengine attributes to map_sg_attrs() as it expects
dma-mapping attribute flags
- Fix a dma mapping leak for command descriptors
- Rebase on top of v7.1-rc4
- Link to v16: https://patch.msgid.link/20260427-qcom-qce-cmd-descr-v16-0-945fd1cafbbc@oss.qualcomm.com
Changes in v16:
- Fix a reported race between dma_map_sg() called with spinlock taken
and the corresponding dma_unmap_sg() called without it by moving the
descriptor locking data into the descriptor struct
- Also queue the TX data descriptors before the command descriptors to
match what downstream is doing
- Tweak commit messages
- Rebase on top of v7.1-rc1
- Link to v15: https://patch.msgid.link/20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com
Changes in v15:
- Extend the descriptor metadata struct to also carry the channel's
transfer direction and stop using dmaengine_slave_config() for that
- Link to v14: https://patch.msgid.link/20260323-qcom-qce-cmd-descr-v14-0-f323af411274@oss.qualcomm.com
Changes in v14:
- Don't return an error to a client which wants to use locking on BAM
that doesn't support it
- Add a comment describing the DMA descriptor metadata structure
- Fix memory leaks
- Remove leftovers from previous iterations
- Propagate errors from dma_cookie_assign() when setting up lock
descriptors
- Link to v13: https://patch.msgid.link/20260317-qcom-qce-cmd-descr-v13-0-0968eb4f8c40@oss.qualcomm.com
Changes in v13:
- As part of the DMA changes in the QCE driver: reverse the order of
queueing the descriptors in the QCE driver: queue command descriptors
with all the register writes first, followed by all the data descriptors,
this is in line with the recommandations from the BAM HPG
- Set the NWD (notify-when-done) bit (DMA_PREP_FENCE in dmaengine
parlance) on the data descriptors to ensure that the UNLOCK descriptor
will not be processed until after they have been processed by the
engine. While technically the NWD bit is only needed on the final data
descriptor, it's hard to tell which one *will* be the last from the
driver's point-of-view and both the downstream driver as well as
the Qualcomm TZ against which we want to synchronize sets NWD on every
data descriptor,
- Revert to creating the LOCK/UNLOCK command descriptor pair in one
place now that the NWD bit is in place,
- Link to v12: https://patch.msgid.link/20260310-qcom-qce-cmd-descr-v12-0-398f37f26ef0@oss.qualcomm.com
Changes in v12:
- Wait until the transaction is done before queueing the UNLOCK command
descriptor
- Use descriptor metadata for communicating the scratchpad address to
the BAM driver
- To that end: reverse the order of the series (first BAM, then QCE) to
maintain bisectability
- Unmap buffers used for dummy writes after the transaction
- Link to v11: https://patch.msgid.link/20260302-qcom-qce-cmd-descr-v11-0-4bf1f5db4802@oss.qualcomm.com
Changes in v11:
- Use new approach, not requiring the client to be involved in locking.
- Add a patch constifying dma_descriptor_metadata_ops
- Rebase on top of v7.0-rc1
- Link to v10: https://lore.kernel.org/r/20251219-qcom-qce-cmd-descr-v10-0-ff7e4bf7dad4@oss.qualcomm.com
Changes in v10:
- Move DESC_FLAG_(UN)LOCK BIT definitions from patch 2 to 3
- Add a patch constifying the dma engine metadata as the first in the
series
- Use the VERSION register for dummy lock/unlock writes
- Link to v9: https://lore.kernel.org/r/20251128-qcom-qce-cmd-descr-v9-0-9a5f72b89722@linaro.org
Changes in v9:
- Drop the global, generic LOCK/UNLOCK flags and instead use DMA
descriptor metadata ops to pass BAM-specific information from the QCE
to the DMA engine
- Link to v8: https://lore.kernel.org/r/20251106-qcom-qce-cmd-descr-v8-0-ecddca23ca26@linaro.org
Changes in v8:
- Rework the command descriptor logic and drop a lot of unneeded code
- Use the physical address for BAM command descriptor access, not the
mapped DMA address
- Fix the problems with iommu faults on newer platforms
- Generalize the LOCK/UNLOCK flags in dmaengine and reword the docs and
commit messages
- Make the BAM locking logic stricter in the DMA engine driver
- Add some additional minor QCE driver refactoring changes to the series
- Lots of small reworks and tweaks to rebase on current mainline and fix
previous issues
- Link to v7: https://lore.kernel.org/all/20250311-qce-cmd-descr-v7-0-db613f5d9c9f@linaro.org/
Changes in v7:
- remove unused code: writing to multiple registers was not used in v6,
neither were the functions for reading registers over BAM DMA-
- remove
- don't read the SW_VERSION register needlessly in the BAM driver,
instead: encode the information on whether the IP supports BAM locking
in device match data
- shrink code where possible with logic modifications (for instance:
change the implementation of qce_write() instead of replacing it
everywhere with a new symbol)
- remove duplicated error messages
- rework commit messages
- a lot of shuffling code around for easier review and a more
streamlined series
- Link to v6: https://lore.kernel.org/all/20250115103004.3350561-1-quic_mdalam@quicinc.com/
Changes in v6:
- change "BAM" to "DMA"
- Ensured this series is compilable with the current Linux-next tip of
the tree (TOT).
Changes in v5:
- Added DMA_PREP_LOCK and DMA_PREP_UNLOCK flag support in separate patch
- Removed DMA_PREP_LOCK & DMA_PREP_UNLOCK flag
- Added FIELD_GET and GENMASK macro to extract major and minor version
Changes in v4:
- Added feature description and test hardware
with test command
- Fixed patch version numbering
- Dropped dt-binding patch
- Dropped device tree changes
- Added BAM_SW_VERSION register read
- Handled the error path for the api dma_map_resource()
in probe
- updated the commit messages for batter redability
- Squash the change where qce_bam_acquire_lock() and
qce_bam_release_lock() api got introduce to the change where
the lock/unlock flag get introced
- changed cover letter subject heading to
"dmaengine: qcom: bam_dma: add cmd descriptor support"
- Added the very initial post for BAM lock/unlock patch link
as v1 to track this feature
Changes in v3:
- https://lore.kernel.org/lkml/183d4f5e-e00a-8ef6-a589-f5704bc83d4a@quicinc.com/
- Addressed all the comments from v2
- Added the dt-binding
- Fix alignment issue
- Removed type casting from qce_write_reg_dma()
and qce_read_reg_dma()
- Removed qce_bam_txn = dma->qce_bam_txn; line from
qce_alloc_bam_txn() api and directly returning
dma->qce_bam_txn
Changes in v2:
- https://lore.kernel.org/lkml/20231214114239.2635325-1-quic_mdalam@quicinc.com/
- Initial set of patches for cmd descriptor support
- Add client driver to use BAM lock/unlock feature
- Added register read/write via BAM in QCE Crypto driver
to use BAM lock/unlock feature
---
Bartosz Golaszewski (14):
dmaengine: constify struct dma_descriptor_metadata_ops
dmaengine: qcom: bam_dma: free interrupt before the clock in error path
dmaengine: qcom: bam_dma: convert tasklet to a BH workqueue
dmaengine: qcom: bam_dma: Extend the driver's device match data
dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support
dmaengine: qcom: bam_dma: add support for BAM locking
crypto: qce - Cancel work on device detach
crypto: qce - Include algapi.h in the core.h header
crypto: qce - Remove unused ignore_buf
crypto: qce - Simplify arguments of devm_qce_dma_request()
crypto: qce - Use existing devres APIs in devm_qce_dma_request()
crypto: qce - Map crypto memory for DMA
crypto: qce - Add BAM DMA support for crypto register I/O
crypto: qce - Communicate the base physical address to the dmaengine
drivers/crypto/qce/aead.c | 10 +-
drivers/crypto/qce/common.c | 20 ++--
drivers/crypto/qce/core.c | 38 ++++++-
drivers/crypto/qce/core.h | 11 ++
drivers/crypto/qce/dma.c | 168 +++++++++++++++++++++++------
drivers/crypto/qce/dma.h | 11 +-
drivers/crypto/qce/sha.c | 10 +-
drivers/crypto/qce/skcipher.c | 10 +-
drivers/dma/qcom/bam_dma.c | 228 +++++++++++++++++++++++++++++++++------
drivers/dma/ti/k3-udma.c | 2 +-
drivers/dma/xilinx/xilinx_dma.c | 2 +-
include/linux/dma/qcom_bam_dma.h | 14 +++
include/linux/dmaengine.h | 2 +-
13 files changed, 430 insertions(+), 96 deletions(-)
---
base-commit: b4a253871ac29e454a62b6746b0385d52cfe7b24
change-id: 20251103-qcom-qce-cmd-descr-c5e9b11fe609
Best regards,
--
Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
^ permalink raw reply
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: David Woodhouse @ 2026-05-19 12:59 UTC (permalink / raw)
To: Marc Zyngier, Paolo Bonzini
Cc: Will Deacon, Jonathan Corbet, Shuah Khan, kvm,
Linux Doc Mailing List, Kernel Mailing List, Linux,
Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann,
Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest
In-Reply-To: <86qzn7wp3y.wl-maz@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 1774 bytes --]
On Tue, 2026-05-19 at 13:38 +0100, Marc Zyngier wrote:
>
> > But in this case there's an ID register that tells KVM if userspace
> > wants the old or the new behavior, independent of whether that old
> > behavior is architecturally valid or not.
>
> But the "old behaviour" makes no sense, and cannot be used by a guest:
>
> - either the guest doesn't use the alternative interrupt groups, then
> it wasn't affected by the bug. That's 100% of the guests.
>
> - or the guest did try to use the alternative groups, and it *NEVER*
> worked, as it wouldn't get any interrupt at all. What is the point
> of preserving a "feature" that only results in a non-working guest?
Given how long this bug existed in KVM, it's entirely feasible that
some guests *check* for it and refrain from trying to use the
alternative groups if the registers aren't actually writable.
If such a guest boots on the new kernel and *does* use alternative
groups, and then the kernel is rolled back, it breaks.
Or some guest configurations which have only ever been tested under KVM
could have a bug where they *rely* on the registers not being writable,
and write values which are inconsistent with the rest of their
configuration. Which breaks the moment those registers become writable.
Even in that latter case, when we're hosting customer guests under KVM
and we make a change which breaks things, we don't get to tell
customers "you deserved it".
And those hypothetical cases *do* happen. All of the time. There's a
massive zoo of guest operating systems; not just the major players like
Linux, FreeBSD and Windows but a whole bunch of embedded home-grown and
network appliance kernels.
Nobody is claiming that we shouldn't fix any bug ever.
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]
^ permalink raw reply
* Re: [PATCH] liveupdate: document liveupdate=on
From: Luca Boccassi @ 2026-05-19 12:58 UTC (permalink / raw)
To: Pratyush Yadav
Cc: Pasha Tatashin, Mike Rapoport, Jonathan Corbet, linux-kernel,
kexec, linux-doc
In-Reply-To: <20260519125714.2435640-1-pratyush@kernel.org>
On Tue, 19 May 2026 at 13:57, Pratyush Yadav <pratyush@kernel.org> wrote:
>
> From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
>
> While the liveupdate= parameter is documented in kernel-parameters.txt,
> it is not listed in LUO's user facing documentation. This can make it
> hard for users to figure out how to enable the subsystem, since enabling
> just the config isn't enough.
>
> Note the need for the kernel parameter in LUO core documentation, which
> gets exported to Documentation/core-api/liveupdate.rst.
>
> Suggested-by: Luca Boccassi <luca.boccassi@gmail.com>
> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
> ---
>
> Notes:
> I think we should take this patch through the liveupdate/fixes branch.
>
> kernel/liveupdate/luo_core.c | 4 ++++
> 1 file changed, 4 insertions(+)
Acked-by: Luca Boccassi <luca.boccassi@gmail.com>
^ permalink raw reply
* [PATCH] liveupdate: document liveupdate=on
From: Pratyush Yadav @ 2026-05-19 12:57 UTC (permalink / raw)
To: Pasha Tatashin, Mike Rapoport, Pratyush Yadav, Luca Boccassi,
Jonathan Corbet
Cc: linux-kernel, kexec, linux-doc
From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
While the liveupdate= parameter is documented in kernel-parameters.txt,
it is not listed in LUO's user facing documentation. This can make it
hard for users to figure out how to enable the subsystem, since enabling
just the config isn't enough.
Note the need for the kernel parameter in LUO core documentation, which
gets exported to Documentation/core-api/liveupdate.rst.
Suggested-by: Luca Boccassi <luca.boccassi@gmail.com>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
---
Notes:
I think we should take this patch through the liveupdate/fixes branch.
kernel/liveupdate/luo_core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c
index 803f51c84275..5d5827ced73c 100644
--- a/kernel/liveupdate/luo_core.c
+++ b/kernel/liveupdate/luo_core.c
@@ -36,6 +36,10 @@
*
* LUO uses Kexec Handover to transfer memory state from the current kernel to
* the next kernel. For more details see Documentation/core-api/kho/index.rst.
+ *
+ * .. note::
+ * To enable LUO, boot the kernel with the ``liveupdate=on`` command line
+ * parameter.
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
base-commit: b1378127003b61930ce30064328640503ad3ef6d
--
2.54.0.563.g4f69b47b94-goog
^ permalink raw reply related
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: Marc Zyngier @ 2026-05-19 12:56 UTC (permalink / raw)
To: Paolo Bonzini
Cc: David Woodhouse, Will Deacon, Jonathan Corbet, Shuah Khan, kvm,
Linux Doc Mailing List, Kernel Mailing List, Linux,
Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann,
Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest
In-Reply-To: <86qzn7wp3y.wl-maz@kernel.org>
On Tue, 19 May 2026 13:38:57 +0100,
Marc Zyngier <maz@kernel.org> wrote:
>
> As I said before, I'd be OK with something that would restore IIDR to
> REV1. But not something that actively breaks the GIC emulation by
> reintroducing a bug. That's, by construction, dead code that will only
> bitrot, because there is no SW that can make use of this nonsense.
I will also add that if we make it a policy to preserve buggy
behaviours that the guest cannot be relying on, then I question
whether we should be fixing anything at all.
For example, 6.19 fixed a totally buggy behaviour where a guest
couldn't not have more than (on most HW) 4 interrupts in flight at any
given time. This was obviously totally bogus, and this was fixed
unconditionally, as legitimate guests could experience gold-platted
lock-ups.
Should we revert to the previous behaviour? In the affirmative, I will
simply stop fixing things, and someone else can have fun retrofitting
buggy crap.
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: David Woodhouse @ 2026-05-19 12:42 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Will Deacon, Marc Zyngier, Jonathan Corbet, Shuah Khan, kvm,
Linux Doc Mailing List, Kernel Mailing List, Linux,
Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann,
Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest
In-Reply-To: <CABgObfacAYexR25SMi1kSZMRnHx3EDGj8=E84V1DumER66ibnQ@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 2201 bytes --]
On Tue, 2026-05-19 at 14:13 +0200, Paolo Bonzini wrote:
> I admit that my knowledge of Arm is really limited, and I do not
> understand which IIDR values have architecturally allowed behaviors
> and which (if any) were made up by KVM; but even if I cannot honestly
> remark on the code or even the approach, a compatibility knob is the
> right thing to have. That's a userspace API design matter, not an Arm
> or GIC matter.
To be clear: the "IID" in IIDR is "implementer identification"; the
implementer in this case being KVM.
The revision field in the IIDR literally *is* the compatibility knob.
The values are indeed 'made up by KVM', and have been correctly bumped
from 1 to 2 to 3 as guest-visible behaviour has changed.
The only problem is that there's no way to set it *back* to 1 again, on
a newer kernel (which defaults to 3). We can only set it to 2 or 3.
The patch which is causing all this fuss is little more than a one-
liner which allows userspace to set it to 1 again, and a second line to
actually *honour* the corresponding behaviour that certain registers
aren't writable.
This is the only way to retain that historical behaviour so that we
don't have to change it underneath running guests on a kernel upgrade
(or worse, rip the new behaviour *away* from newly-launched guests, if
we have to roll back to the old kernel after launching on the new one).
> I will certainly take this patch, but I won't override Marc. However
> I'd like to better understand his point of view, because right now I
> just don't get it.
Indeed. Like you, I just don't get it. I cannot see any reason *not* to
take the fix, and I am *trying* (with limited success) to limit the
expression of my frustration to the specific technical issue at hand.
Marc, I have a huge amount of respect for you, and I'm painfully aware
that I risk burning bridges here by pressing the issue. But on this
specific topic I respectfully believe that you have made the wrong
decision, and I beg you to reconsider.
We *need* to be able to upgrade without changing behaviour for guests.
Even if the old behaviour was "wrong" according to the architecture
specification.
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]
^ permalink raw reply
* Re: [PATCH v13 04/15] arm64: kexec_file: Fix potential buffer overflow in prepare_elf_headers()
From: Jinjie Ruan @ 2026-05-19 12:42 UTC (permalink / raw)
To: Breno Leitao
Cc: corbet, skhan, catalin.marinas, will, chenhuacai, kernel, maddy,
mpe, npiggin, chleroy, pjw, palmer, aou, alex, tglx, mingo, bp,
dave.hansen, hpa, robh, saravanak, akpm, bhe, rppt,
pasha.tatashin, pratyush, ruirui.yang, rdunlap, pmladek,
dapeng1.mi, kees, elver, kuba, ebiggers, lirongqing, paulmck,
sourabhjain, coxu, jbohac, ryan.roberts, osandov, cfsworks,
tangyouling, ritesh.list, adityag, guoren, songshuaishuai,
kevin.brodsky, vishal.moola, junhui.liu, wangruikang, namcao,
chao.gao, seanjc, fuqiang.wang, ardb, chenjiahao16, hbathini,
takahiro.akashi, james.morse, lizhengyu3, x86, linux-doc,
linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
linux-riscv, devicetree, kexec
In-Reply-To: <agGkvrg06KNDNfDi@gmail.com>
On 5/11/2026 5:46 PM, Breno Leitao wrote:
> On Mon, May 11, 2026 at 11:04:43AM +0800, Jinjie Ruan wrote:
>> There is a race condition between the kexec_load() system call
>> (crash kernel loading path) and memory hotplug operations that can
>> lead to buffer overflow and potential kernel crash.
>>
>> During prepare_elf_headers(), the following steps occur:
>> 1. The first for_each_mem_range() queries current System RAM memory ranges
>> 2. Allocates buffer based on queried count
>> 3. The 2st for_each_mem_range() populates ranges from memblock
>>
>> If memory hotplug occurs between step 1 and step 3, the number of ranges
>> can increase, causing out-of-bounds write when populating cmem->ranges[].
>>
>> This happens because kexec_load() uses kexec_trylock (atomic_t) while
>> memory hotplug uses device_hotplug_lock (mutex), so they don't serialize
>> with each other.
>>
>> Add the explicit bounds checking to prevent out-of-bounds access.
>
> It seems you have a TOCTOU type of issue, and this seems to be shrinking
> the window, but not fully solving it?
I plan to fix this issue as follows, and would appreciate your feedback
on whether this is reasonable.
Sashiko AI code review pointed out there is a TOCTOU (Time-of-Check to
Time-of-Use) race condition in prepare_elf_headers() between the initial
pass that counts System RAM ranges and the second pass that populates them.
If a memory hotplug event occurs between these two steps, the number of
memory regions may increase, causing an out-of-bounds write to
the cmem->ranges[] array.
To resolve this and ensure data consistency, this patch:
1. Wraps the counting and population passes with get_online_mems() and
crash_hotplug_lock(). This serializes the kexec_file_load() path
with concurrent memory hotplug operations, ensuring the memory
map remains consistent throughout the header preparation.
2. Adds an explicit boundary check in prepare_elf64_ram_headers_callback().
If the number of ranges exceeds the allocated maximum, it now returns
-EAGAIN, which indicates a transient race, signaling userspace
kexec-tools to retry the syscall instead of leaving the system
without a loaded crash kernel.
index daf81a873bbd..546be6261177 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -15,6 +15,7 @@
#include <linux/kexec.h>
#include <linux/libfdt.h>
#include <linux/memblock.h>
+#include <linux/memory_hotplug.h>
#include <linux/of.h>
#include <linux/of_fdt.h>
#include <linux/slab.h>
@@ -40,7 +41,7 @@ int arch_kimage_file_post_load_cleanup(struct kimage
*image)
}
#ifdef CONFIG_CRASH_DUMP
-int prepare_elf_headers(void **addr, unsigned long *sz)
+static int __prepare_elf_headers(void **addr, unsigned long *sz)
{
struct crash_mem *cmem;
unsigned int nr_ranges;
@@ -59,6 +60,11 @@ int prepare_elf_headers(void **addr, unsigned long *sz)
cmem->max_nr_ranges = nr_ranges;
cmem->nr_ranges = 0;
for_each_mem_range(i, &start, &end) {
+ if (cmem->nr_ranges >= cmem->max_nr_ranges) {
+ ret = -EAGAIN;
+ goto out;
+ }
+
cmem->ranges[cmem->nr_ranges].start = start;
cmem->ranges[cmem->nr_ranges].end = end - 1;
cmem->nr_ranges++;
@@ -81,6 +87,21 @@ int prepare_elf_headers(void **addr, unsigned long *sz)
kfree(cmem);
return ret;
}
+
+int prepare_elf_headers(void **addr, unsigned long *sz)
+{
+ int ret;
+
+ crash_hotplug_lock();
+ get_online_mems();
+
+ ret = __prepare_elf_headers(addr, sz);
+
+ put_online_mems();
+ crash_hotplug_unlock();
+
+ return ret;
+}
#endif
>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Baoquan He <bhe@redhat.com>
>> Cc: Breno Leitao <leitao@debian.org>
>> Cc: stable@vger.kernel.org
>> Fixes: 3751e728cef2 ("arm64: kexec_file: add crash dump support")
>> Closes: https://sashiko.dev/#/patchset/20260323072745.2481719-1-ruanjinjie%40huawei.com
>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>> ---
>> arch/arm64/kernel/machine_kexec_file.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
>> index e31fabed378a..a67e7b1abbab 100644
>> --- a/arch/arm64/kernel/machine_kexec_file.c
>> +++ b/arch/arm64/kernel/machine_kexec_file.c
>> @@ -59,6 +59,11 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
>> cmem->max_nr_ranges = nr_ranges;
>> cmem->nr_ranges = 0;
>> for_each_mem_range(i, &start, &end) {
>> + if (cmem->nr_ranges >= cmem->max_nr_ranges) {
>> + ret = -ENOMEM;
>
> -ENOMEM seems to be the the wrong errno. This isn't an allocation
> failure; it's a transient race. -EBUSY or -EAGAIN would be more honest
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox