Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH v15 04/12] dmaengine: qcom: bam_dma: Add pipe_lock_supported flag support
From: Bartosz Golaszewski @ 2026-04-02 14:55 UTC
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
	Dmitry Baryshkov
In-Reply-To: <20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com>

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Extend the device match data with a flag indicating whether the IP
supports the BAM lock/unlock feature. Set it to true on BAM IP versions
1.4.0 and above.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
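Illustration only, not part of the commit: the gating pattern described
above can be sketched in plain user-space C. Everything named demo_* below,
including the "demo,..." match strings, is hypothetical and does not exist
in the driver. The point it tries to show is that variants which never set
the flag read it back as false through zero-initialization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct bam_device_data. */
struct demo_device_data {
	const char *compatible;		/* illustrative match string */
	bool pipe_lock_supported;
};

static const struct demo_device_data demo_variants[] = {
	/* Flag not set: designated initializers leave it false. */
	{ .compatible = "demo,bam-v1.3.0" },
	{ .compatible = "demo,bam-v1.4.0", .pipe_lock_supported = true },
	{ .compatible = "demo,bam-v1.7.0", .pipe_lock_supported = true },
};

/* Look up the per-variant data, or NULL for an unknown variant. */
static const struct demo_device_data *demo_match(const char *compatible)
{
	size_t n = sizeof(demo_variants) / sizeof(demo_variants[0]);

	assert(compatible != NULL);

	for (size_t i = 0; i < n; i++)
		if (strcmp(demo_variants[i].compatible, compatible) == 0)
			return &demo_variants[i];

	return NULL;
}

/* Feature check a caller would perform before using lock/unlock. */
static bool demo_can_lock(const char *compatible)
{
	const struct demo_device_data *data = demo_match(compatible);

	return data && data->pipe_lock_supported;
}
```

A caller would test demo_can_lock() once and skip the locking path when it
returns false, rather than failing the whole transfer.
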
 drivers/dma/qcom/bam_dma.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 8f6d03f6c673b57ed13aeca6c8331c71596d077b..83491e7c2f17d8c9d12a1a055baea7e3a0a75a53 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -115,6 +115,7 @@ struct reg_offset_data {
 
 struct bam_device_data {
 	const struct reg_offset_data *reg_info;
+	bool pipe_lock_supported;
 };
 
 static const struct reg_offset_data bam_v1_3_reg_info[] = {
@@ -181,6 +182,7 @@ static const struct reg_offset_data bam_v1_4_reg_info[] = {
 
 static const struct bam_device_data bam_v1_4_data = {
 	.reg_info = bam_v1_4_reg_info,
+	.pipe_lock_supported = true,
 };
 
 static const struct reg_offset_data bam_v1_7_reg_info[] = {
@@ -214,6 +216,7 @@ static const struct reg_offset_data bam_v1_7_reg_info[] = {
 
 static const struct bam_device_data bam_v1_7_data = {
 	.reg_info = bam_v1_7_reg_info,
+	.pipe_lock_supported = true,
 };
 
 /* BAM CTRL */

-- 
2.47.3




* [PATCH v15 06/12] crypto: qce - Include algapi.h in the core.h header
From: Bartosz Golaszewski @ 2026-04-02 14:55 UTC
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com>

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

The header defines a struct embedding struct crypto_queue whose size
needs to be known and which is defined in crypto/algapi.h. Move the
inclusion from core.c to core.h.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/core.c | 1 -
 drivers/crypto/qce/core.h | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index b966f3365b7de8d2a8f6707397a34aa4facdc4ac..65205100c3df961ffaa4b7bc9e217e8d3e08ed57 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -13,7 +13,6 @@
 #include <linux/mod_devicetable.h>
 #include <linux/platform_device.h>
 #include <linux/types.h>
-#include <crypto/algapi.h>
 #include <crypto/internal/hash.h>
 
 #include "core.h"
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index eb6fa7a8b64a81daf9ad5304a3ae4e5e597a70b8..f092ce2d3b04a936a37805c20ac5ba78d8fdd2df 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -8,6 +8,7 @@
 
 #include <linux/mutex.h>
 #include <linux/workqueue.h>
+#include <crypto/algapi.h>
 
 #include "dma.h"
 

-- 
2.47.3




* [PATCH v15 07/12] crypto: qce - Remove unused ignore_buf
From: Bartosz Golaszewski @ 2026-04-02 14:55 UTC
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com>

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

It's unclear what the purpose of this field is. It has been here since
the initial commit but without any explanation. The driver works fine
without it. We still allocate the extra space in the result buffer; we
just no longer store its address. While at it: move the
QCE_IGNORE_BUF_SZ definition into dma.c as it's not used outside of this
compilation unit.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
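Illustration only, not part of the commit: a user-space sketch of the
sizing arithmetic the message refers to. struct demo_result_dump is a
made-up layout (the real struct qce_result_dump is not fully visible in
this diff), and the demo_* names are hypothetical. Only the burst size of
64 and the "2 * burst size" factor are taken from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BURST_SIZE		64u	/* mirrors QCE_BAM_BURST_SIZE */
#define DEMO_IGNORE_BUF_SZ	(2 * DEMO_BURST_SIZE)

/* Round x up to the next multiple of a. */
#define DEMO_ALIGN(x, a)	((((x) + (a) - 1) / (a)) * (a))

/* Hypothetical result layout: 26 32-bit words, i.e. 104 bytes. */
struct demo_result_dump {
	uint32_t words[24];
	uint32_t status;
	uint32_t status2;
};

/* Size of the result area, rounded up to a whole number of bursts. */
static size_t demo_result_buf_sz(void)
{
	return DEMO_ALIGN(sizeof(struct demo_result_dump), DEMO_BURST_SIZE);
}

/* Total allocation: result area plus the trailing extra space. */
static size_t demo_alloc_sz(void)
{
	return demo_result_buf_sz() + DEMO_IGNORE_BUF_SZ;
}

/* Offset of the trailing space; derivable, so no pointer is stored. */
static size_t demo_ignore_offset(void)
{
	assert(demo_alloc_sz() > demo_result_buf_sz());
	return demo_result_buf_sz();
}
```

With this made-up layout the result area rounds up from 104 to 128 bytes
and the whole allocation is 256 bytes. The start of the extra space can
always be recomputed from the buffer base, which is why keeping a separate
pointer adds nothing.
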
 drivers/crypto/qce/dma.c | 4 ++--
 drivers/crypto/qce/dma.h | 2 --
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 68cafd4741ad3d91906d39e817fc7873b028d498..08bf3e8ec12433c1a8ee17003f3487e41b7329e4 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -9,6 +9,8 @@
 
 #include "dma.h"
 
+#define QCE_IGNORE_BUF_SZ		(2 * QCE_BAM_BURST_SIZE)
+
 static void qce_dma_release(void *data)
 {
 	struct qce_dma_data *dma = data;
@@ -41,8 +43,6 @@ int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma)
 		goto error_nomem;
 	}
 
-	dma->ignore_buf = dma->result_buf + QCE_RESULT_BUF_SZ;
-
 	return devm_add_action_or_reset(dev, qce_dma_release, dma);
 
 error_nomem:
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index 31629185000e12242fa07c2cc08b95fcbd5d4b8c..fc337c435cd14917bdfb99febcf9119275afdeba 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -23,7 +23,6 @@ struct qce_result_dump {
 	u32 status2;
 };
 
-#define QCE_IGNORE_BUF_SZ	(2 * QCE_BAM_BURST_SIZE)
 #define QCE_RESULT_BUF_SZ	\
 		ALIGN(sizeof(struct qce_result_dump), QCE_BAM_BURST_SIZE)
 
@@ -31,7 +30,6 @@ struct qce_dma_data {
 	struct dma_chan *txchan;
 	struct dma_chan *rxchan;
 	struct qce_result_dump *result_buf;
-	void *ignore_buf;
 };
 
 int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma);

-- 
2.47.3




* [PATCH v15 05/12] dmaengine: qcom: bam_dma: add support for BAM locking
From: Bartosz Golaszewski @ 2026-04-02 14:55 UTC
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com>

Add support for BAM pipe locking. When starting DMA on an RX channel,
prepend the existing queue of issued descriptors with an additional
"dummy" command descriptor with the LOCK bit set. Once the transaction
is done (no more issued descriptors), issue one more dummy descriptor
with the UNLOCK bit set.

We *must* wait until the transaction is signalled as done because we
must not perform any writes into config registers while the engine is
busy.

The dummy writes must be issued into a scratchpad register of the client
so provide a mechanism to communicate the right address via descriptor
metadata.

Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
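Illustration only, not part of the commit: the queue manipulation described
in the message could be modelled in user space roughly as follows. This is
a sketch under stated assumptions, not the driver code. struct demo_queue
and the demo_* helpers are hypothetical; only the three flag bit positions
are copied from the DESC_FLAG_* definitions in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit positions copied from the DESC_FLAG_* definitions in the diff. */
#define DEMO_FLAG_CMD		(1u << 11)
#define DEMO_FLAG_LOCK		(1u << 10)
#define DEMO_FLAG_UNLOCK	(1u << 9)

#define DEMO_QUEUE_MAX		8

/* Hypothetical issued-descriptor queue: one flags word per descriptor. */
struct demo_queue {
	unsigned int flags[DEMO_QUEUE_MAX];
	size_t len;
};

/* Insert at the head, shifting existing entries back. */
static void demo_push_front(struct demo_queue *q, unsigned int flags)
{
	assert(q->len < DEMO_QUEUE_MAX);

	for (size_t i = q->len; i > 0; i--)
		q->flags[i] = q->flags[i - 1];
	q->flags[0] = flags;
	q->len++;
}

/* Append at the tail. */
static void demo_push_back(struct demo_queue *q, unsigned int flags)
{
	assert(q->len < DEMO_QUEUE_MAX);
	q->flags[q->len++] = flags;
}

/*
 * Bracket the queued descriptors with LOCK and UNLOCK command descriptors.
 * Returns false only when there is no room. A channel without lock support
 * or without a scratchpad address is left untouched and is not an error.
 */
static bool demo_wrap_with_lock(struct demo_queue *q, bool lock_supported,
				unsigned long scratchpad_addr)
{
	if (!lock_supported || !scratchpad_addr)
		return true;
	if (q->len + 2 > DEMO_QUEUE_MAX)
		return false;

	demo_push_front(q, DEMO_FLAG_CMD | DEMO_FLAG_LOCK);
	demo_push_back(q, DEMO_FLAG_CMD | DEMO_FLAG_UNLOCK);
	return true;
}
```

The model treats "feature unavailable" as a silent no-op, mirroring the
comment in bam_metadata_attach() about not blocking clients on BAM versions
that lack the feature.
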
 drivers/dma/qcom/bam_dma.c       | 167 ++++++++++++++++++++++++++++++++++++++-
 include/linux/dma/qcom_bam_dma.h |  14 ++++
 2 files changed, 177 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 83491e7c2f17d8c9d12a1a055baea7e3a0a75a53..c3c1d39b6e7cce16cb4eaf450220c9e9a4dffe3f 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -28,11 +28,13 @@
 #include <linux/clk.h>
 #include <linux/device.h>
 #include <linux/dma-mapping.h>
+#include <linux/dma/qcom_bam_dma.h>
 #include <linux/dmaengine.h>
 #include <linux/init.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
 #include <linux/kernel.h>
+#include <linux/lockdep.h>
 #include <linux/module.h>
 #include <linux/of_address.h>
 #include <linux/of_dma.h>
@@ -60,6 +62,8 @@ struct bam_desc_hw {
 #define DESC_FLAG_EOB BIT(13)
 #define DESC_FLAG_NWD BIT(12)
 #define DESC_FLAG_CMD BIT(11)
+#define DESC_FLAG_LOCK BIT(10)
+#define DESC_FLAG_UNLOCK BIT(9)
 
 struct bam_async_desc {
 	struct virt_dma_desc vd;
@@ -391,6 +395,14 @@ struct bam_chan {
 	struct list_head desc_list;
 
 	struct list_head node;
+
+	/* BAM locking infrastructure */
+	phys_addr_t scratchpad_addr;
+	enum dma_transfer_direction direction;
+	struct scatterlist lock_sg;
+	struct scatterlist unlock_sg;
+	struct bam_cmd_element lock_ce;
+	struct bam_cmd_element unlock_ce;
 };
 
 static inline struct bam_chan *to_bam_chan(struct dma_chan *common)
@@ -652,6 +664,33 @@ static int bam_slave_config(struct dma_chan *chan,
 	return 0;
 }
 
+static int bam_metadata_attach(struct dma_async_tx_descriptor *desc, void *data, size_t len)
+{
+	struct bam_chan *bchan = to_bam_chan(desc->chan);
+	const struct bam_device_data *bdata = bchan->bdev->dev_data;
+	struct bam_desc_metadata *metadata = data;
+
+	if (!data)
+		return -EINVAL;
+
+	if (!bdata->pipe_lock_supported)
+		/*
+		 * The client wants to use locking but this BAM version doesn't
+		 * support it. Don't return an error here as this will stop the
+		 * client from using DMA at all for no reason.
+		 */
+		return 0;
+
+	bchan->scratchpad_addr = metadata->scratchpad_addr;
+	bchan->direction = metadata->direction;
+
+	return 0;
+}
+
+static const struct dma_descriptor_metadata_ops bam_metadata_ops = {
+	.attach = bam_metadata_attach,
+};
+
 /**
  * bam_prep_slave_sg - Prep slave sg transaction
  *
@@ -668,6 +707,7 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
 	void *context)
 {
 	struct bam_chan *bchan = to_bam_chan(chan);
+	struct dma_async_tx_descriptor *tx_desc;
 	struct bam_device *bdev = bchan->bdev;
 	struct bam_async_desc *async_desc;
 	struct scatterlist *sg;
@@ -723,7 +763,12 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
 		} while (remainder > 0);
 	}
 
-	return vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
+	tx_desc = vchan_tx_prep(&bchan->vc, &async_desc->vd, flags);
+	if (!tx_desc)
+		return NULL;
+
+	tx_desc->metadata_ops = &bam_metadata_ops;
+	return tx_desc;
 }
 
 /**
@@ -1012,13 +1057,116 @@ static void bam_apply_new_config(struct bam_chan *bchan,
 	bchan->reconfigure = 0;
 }
 
+static struct bam_async_desc *
+bam_make_lock_desc(struct bam_chan *bchan, struct scatterlist *sg,
+		   struct bam_cmd_element *ce, unsigned long flag)
+{
+	struct dma_chan *chan = &bchan->vc.chan;
+	struct bam_async_desc *async_desc;
+	struct bam_desc_hw *desc;
+	struct virt_dma_desc *vd;
+	struct virt_dma_chan *vc;
+	unsigned int mapped;
+	dma_cookie_t cookie;
+	int ret;
+
+	sg_init_table(sg, 1);
+
+	async_desc = kzalloc_flex(*async_desc, desc, 1, GFP_NOWAIT);
+	if (!async_desc) {
+		dev_err(bchan->bdev->dev, "failed to allocate the BAM lock descriptor\n");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	async_desc->num_desc = 1;
+	async_desc->curr_desc = async_desc->desc;
+	async_desc->dir = DMA_MEM_TO_DEV;
+
+	desc = async_desc->desc;
+
+	bam_prep_ce_le32(ce, bchan->scratchpad_addr, BAM_WRITE_COMMAND, 0);
+	sg_set_buf(sg, ce, sizeof(*ce));
+
+	mapped = dma_map_sg_attrs(chan->slave, sg, 1, DMA_TO_DEVICE, DMA_PREP_CMD);
+	if (!mapped) {
+		kfree(async_desc);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	desc->flags |= cpu_to_le16(DESC_FLAG_CMD | flag);
+	desc->addr = sg_dma_address(sg);
+	desc->size = sizeof(struct bam_cmd_element);
+
+	vc = &bchan->vc;
+	vd = &async_desc->vd;
+
+	dma_async_tx_descriptor_init(&vd->tx, &vc->chan);
+	vd->tx.flags = DMA_PREP_CMD;
+	vd->tx.desc_free = vchan_tx_desc_free;
+	vd->tx_result.result = DMA_TRANS_NOERROR;
+	vd->tx_result.residue = 0;
+
+	cookie = dma_cookie_assign(&vd->tx);
+	ret = dma_submit_error(cookie);
+	if (ret) {
+		dma_unmap_sg(chan->slave, sg, 1, DMA_TO_DEVICE);
+		kfree(async_desc);
+		return ERR_PTR(ret);
+	}
+
+	return async_desc;
+}
+
+static int bam_do_setup_pipe_lock(struct bam_chan *bchan, bool lock)
+{
+	struct bam_device *bdev = bchan->bdev;
+	const struct bam_device_data *bdata = bdev->dev_data;
+	struct bam_async_desc *lock_desc;
+	struct bam_cmd_element *ce;
+	struct scatterlist *sgl;
+	unsigned long flag;
+
+	lockdep_assert_held(&bchan->vc.lock);
+
+	if (!bdata->pipe_lock_supported || !bchan->scratchpad_addr ||
+	    bchan->direction != DMA_MEM_TO_DEV)
+		return 0;
+
+	if (lock) {
+		sgl = &bchan->lock_sg;
+		ce = &bchan->lock_ce;
+		flag = DESC_FLAG_LOCK;
+	} else {
+		sgl = &bchan->unlock_sg;
+		ce = &bchan->unlock_ce;
+		flag = DESC_FLAG_UNLOCK;
+	}
+
+	lock_desc = bam_make_lock_desc(bchan, sgl, ce, flag);
+	if (IS_ERR(lock_desc))
+		return PTR_ERR(lock_desc);
+
+	if (lock)
+		list_add(&lock_desc->vd.node, &bchan->vc.desc_issued);
+	else
+		list_add_tail(&lock_desc->vd.node, &bchan->vc.desc_issued);
+
+	return 0;
+}
+
+static void bam_setup_pipe_lock(struct bam_chan *bchan)
+{
+	if (bam_do_setup_pipe_lock(bchan, true) || bam_do_setup_pipe_lock(bchan, false))
+		dev_err(bchan->vc.chan.slave, "Failed to setup BAM pipe lock descriptors\n");
+}
+
 /**
  * bam_start_dma - start next transaction
  * @bchan: bam dma channel
  */
 static void bam_start_dma(struct bam_chan *bchan)
 {
-	struct virt_dma_desc *vd = vchan_next_desc(&bchan->vc);
+	struct virt_dma_desc *vd;
 	struct bam_device *bdev = bchan->bdev;
 	struct bam_async_desc *async_desc = NULL;
 	struct bam_desc_hw *desc;
@@ -1030,6 +1178,9 @@ static void bam_start_dma(struct bam_chan *bchan)
 
 	lockdep_assert_held(&bchan->vc.lock);
 
+	bam_setup_pipe_lock(bchan);
+
+	vd = vchan_next_desc(&bchan->vc);
 	if (!vd)
 		return;
 
@@ -1157,8 +1308,15 @@ static void bam_issue_pending(struct dma_chan *chan)
  */
 static void bam_dma_free_desc(struct virt_dma_desc *vd)
 {
-	struct bam_async_desc *async_desc = container_of(vd,
-			struct bam_async_desc, vd);
+	struct bam_async_desc *async_desc = container_of(vd, struct bam_async_desc, vd);
+	struct bam_desc_hw *desc = async_desc->desc;
+	struct dma_chan *chan = vd->tx.chan;
+	struct bam_chan *bchan = to_bam_chan(chan);
+
+	if (le16_to_cpu(desc->flags) & DESC_FLAG_LOCK)
+		dma_unmap_sg(chan->slave, &bchan->lock_sg, 1, DMA_TO_DEVICE);
+	else if (le16_to_cpu(desc->flags) & DESC_FLAG_UNLOCK)
+		dma_unmap_sg(chan->slave, &bchan->unlock_sg, 1, DMA_TO_DEVICE);
 
 	kfree(async_desc);
 }
@@ -1350,6 +1508,7 @@ static int bam_dma_probe(struct platform_device *pdev)
 	bdev->common.device_terminate_all = bam_dma_terminate_all;
 	bdev->common.device_issue_pending = bam_issue_pending;
 	bdev->common.device_tx_status = bam_tx_status;
+	bdev->common.desc_metadata_modes = DESC_METADATA_CLIENT;
 	bdev->common.dev = bdev->dev;
 
 	ret = dma_async_device_register(&bdev->common);
diff --git a/include/linux/dma/qcom_bam_dma.h b/include/linux/dma/qcom_bam_dma.h
index 68fc0e643b1b97fe4520d5878daa322b81f4f559..a2594264b0f58c4b2b1c85e243cad0d5669c26dc 100644
--- a/include/linux/dma/qcom_bam_dma.h
+++ b/include/linux/dma/qcom_bam_dma.h
@@ -6,6 +6,8 @@
 #ifndef _QCOM_BAM_DMA_H
 #define _QCOM_BAM_DMA_H
 
+#include <linux/dmaengine.h>
+
 #include <asm/byteorder.h>
 
 /*
@@ -34,6 +36,18 @@ enum bam_command_type {
 	BAM_READ_COMMAND,
 };
 
+/**
+ * struct bam_desc_metadata - DMA descriptor metadata specific to the BAM driver.
+ *
+ * @scratchpad_addr: Physical address to use for dummy write operations when
+ *                   queuing command descriptors with LOCK/UNLOCK bits set.
+ * @direction: Transfer direction of this channel.
+ */
+struct bam_desc_metadata {
+	phys_addr_t scratchpad_addr;
+	enum dma_transfer_direction direction;
+};
+
 /*
  * prep_bam_ce_le32 - Wrapper function to prepare a single BAM command
  * element with the data already in le32 format.

-- 
2.47.3




* [PATCH v15 08/12] crypto: qce - Simplify arguments of devm_qce_dma_request()
From: Bartosz Golaszewski @ 2026-04-02 14:55 UTC
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com>

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

This function can extract all the information it needs from struct
qce_device alone so simplify its arguments. This is done in preparation
for adding support for register I/O over DMA which will require
accessing even more fields from struct qce_device.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/core.c | 2 +-
 drivers/crypto/qce/dma.c  | 5 ++++-
 drivers/crypto/qce/dma.h  | 4 +++-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index 65205100c3df961ffaa4b7bc9e217e8d3e08ed57..8b7bcd0c420c45caf8b29e5455e0f384fd5c5616 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -226,7 +226,7 @@ static int qce_crypto_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;
 
-	ret = devm_qce_dma_request(qce->dev, &qce->dma);
+	ret = devm_qce_dma_request(qce);
 	if (ret)
 		return ret;
 
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 08bf3e8ec12433c1a8ee17003f3487e41b7329e4..c29b0abe9445381a019e0447d30acfd7319d5c1f 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -7,6 +7,7 @@
 #include <linux/dmaengine.h>
 #include <crypto/scatterwalk.h>
 
+#include "core.h"
 #include "dma.h"
 
 #define QCE_IGNORE_BUF_SZ		(2 * QCE_BAM_BURST_SIZE)
@@ -20,8 +21,10 @@ static void qce_dma_release(void *data)
 	kfree(dma->result_buf);
 }
 
-int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma)
+int devm_qce_dma_request(struct qce_device *qce)
 {
+	struct qce_dma_data *dma = &qce->dma;
+	struct device *dev = qce->dev;
 	int ret;
 
 	dma->txchan = dma_request_chan(dev, "tx");
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index fc337c435cd14917bdfb99febcf9119275afdeba..483789d9fa98e79d1283de8297bf2fc2a773f3a7 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -8,6 +8,8 @@
 
 #include <linux/dmaengine.h>
 
+struct qce_device;
+
 /* maximum data transfer block size between BAM and CE */
 #define QCE_BAM_BURST_SIZE		64
 
@@ -32,7 +34,7 @@ struct qce_dma_data {
 	struct qce_result_dump *result_buf;
 };
 
-int devm_qce_dma_request(struct device *dev, struct qce_dma_data *dma);
+int devm_qce_dma_request(struct qce_device *qce);
 int qce_dma_prep_sgs(struct qce_dma_data *dma, struct scatterlist *sg_in,
 		     int in_ents, struct scatterlist *sg_out, int out_ents,
 		     dma_async_tx_callback cb, void *cb_param);

-- 
2.47.3




* [PATCH v15 09/12] crypto: qce - Use existing devres APIs in devm_qce_dma_request()
From: Bartosz Golaszewski @ 2026-04-02 14:55 UTC
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski,
	Konrad Dybcio
In-Reply-To: <20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com>

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Switch to devm_kmalloc() and devm_dma_request_chan() in
devm_qce_dma_request(). This allows us to drop two labels and shrink the
function.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
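Illustration only, not part of the commit: a small user-space model of why
managed (devres-style) acquisition lets the error labels go away. The
demo_* names are hypothetical and this is not the kernel devres
implementation. It assumes only the well-known property that managed
resources are released in reverse order of registration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ACTIONS	8

typedef void (*demo_release_fn)(void *data);

struct demo_action {
	demo_release_fn fn;
	void *data;
};

/* Hypothetical device carrying a stack of release actions. */
struct demo_dev {
	struct demo_action actions[DEMO_MAX_ACTIONS];
	size_t count;
};

/* Records the order in which resources were released. */
static int demo_release_log[DEMO_MAX_ACTIONS];
static size_t demo_release_count;

static void demo_record_release(void *data)
{
	assert(demo_release_count < DEMO_MAX_ACTIONS);
	demo_release_log[demo_release_count++] = *(int *)data;
}

/* Register a release action; it runs when the device goes away. */
static int demo_add_action(struct demo_dev *dev, demo_release_fn fn, void *data)
{
	if (dev->count == DEMO_MAX_ACTIONS)
		return -1;

	dev->actions[dev->count].fn = fn;
	dev->actions[dev->count].data = data;
	dev->count++;
	return 0;
}

/* Run all registered actions, newest first. */
static void demo_release_all(struct demo_dev *dev)
{
	while (dev->count > 0) {
		dev->count--;
		dev->actions[dev->count].fn(dev->actions[dev->count].data);
	}
}

/*
 * Acquire three resources. A failure at step fail_at is a plain return:
 * the earlier steps are already registered, so no unwind labels are needed.
 */
static int demo_request(struct demo_dev *dev, int ids[3], int fail_at)
{
	for (int step = 0; step < 3; step++) {
		if (step == fail_at)
			return -1;
		if (demo_add_action(dev, demo_record_release, &ids[step]))
			return -1;
	}

	return 0;
}
```

When the third acquisition fails in this model, the caller's single
demo_release_all() releases the second resource and then the first, which
is the unwinding the removed labels used to do by hand.
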
 drivers/crypto/qce/dma.c | 39 +++++++++------------------------------
 1 file changed, 9 insertions(+), 30 deletions(-)

diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index c29b0abe9445381a019e0447d30acfd7319d5c1f..a46264735bb895b6199969e83391383ccbbacc5f 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -12,47 +12,26 @@
 
 #define QCE_IGNORE_BUF_SZ		(2 * QCE_BAM_BURST_SIZE)
 
-static void qce_dma_release(void *data)
-{
-	struct qce_dma_data *dma = data;
-
-	dma_release_channel(dma->txchan);
-	dma_release_channel(dma->rxchan);
-	kfree(dma->result_buf);
-}
-
 int devm_qce_dma_request(struct qce_device *qce)
 {
 	struct qce_dma_data *dma = &qce->dma;
 	struct device *dev = qce->dev;
-	int ret;
 
-	dma->txchan = dma_request_chan(dev, "tx");
+	dma->txchan = devm_dma_request_chan(dev, "tx");
 	if (IS_ERR(dma->txchan))
 		return dev_err_probe(dev, PTR_ERR(dma->txchan),
 				     "Failed to get TX DMA channel\n");
 
-	dma->rxchan = dma_request_chan(dev, "rx");
-	if (IS_ERR(dma->rxchan)) {
-		ret = dev_err_probe(dev, PTR_ERR(dma->rxchan),
-				    "Failed to get RX DMA channel\n");
-		goto error_rx;
-	}
-
-	dma->result_buf = kmalloc(QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ,
-				  GFP_KERNEL);
-	if (!dma->result_buf) {
-		ret = -ENOMEM;
-		goto error_nomem;
-	}
+	dma->rxchan = devm_dma_request_chan(dev, "rx");
+	if (IS_ERR(dma->rxchan))
+		return dev_err_probe(dev, PTR_ERR(dma->rxchan),
+				     "Failed to get RX DMA channel\n");
 
-	return devm_add_action_or_reset(dev, qce_dma_release, dma);
+	dma->result_buf = devm_kmalloc(dev, QCE_RESULT_BUF_SZ + QCE_IGNORE_BUF_SZ, GFP_KERNEL);
+	if (!dma->result_buf)
+		return -ENOMEM;
 
-error_nomem:
-	dma_release_channel(dma->rxchan);
-error_rx:
-	dma_release_channel(dma->txchan);
-	return ret;
+	return 0;
 }
 
 struct scatterlist *

-- 
2.47.3




* [PATCH v15 10/12] crypto: qce - Map crypto memory for DMA
From: Bartosz Golaszewski @ 2026-04-02 14:55 UTC
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com>

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

As the first step in converting the driver to using DMA for register
I/O, let's map the crypto memory range.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/core.c | 25 +++++++++++++++++++++++--
 drivers/crypto/qce/core.h |  6 ++++++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index 8b7bcd0c420c45caf8b29e5455e0f384fd5c5616..2667fcd67fee826a44080da8f88a3e2abbb9b2cf 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -185,10 +185,19 @@ static int qce_check_version(struct qce_device *qce)
 	return 0;
 }
 
+static void qce_crypto_unmap_dma(void *data)
+{
+	struct qce_device *qce = data;
+
+	dma_unmap_resource(qce->dev, qce->base_dma, qce->dma_size,
+			   DMA_BIDIRECTIONAL, 0);
+}
+
 static int qce_crypto_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
 	struct qce_device *qce;
+	struct resource *res;
 	int ret;
 
 	qce = devm_kzalloc(dev, sizeof(*qce), GFP_KERNEL);
@@ -198,7 +207,7 @@ static int qce_crypto_probe(struct platform_device *pdev)
 	qce->dev = dev;
 	platform_set_drvdata(pdev, qce);
 
-	qce->base = devm_platform_ioremap_resource(pdev, 0);
+	qce->base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
 	if (IS_ERR(qce->base))
 		return PTR_ERR(qce->base);
 
@@ -244,7 +253,19 @@ static int qce_crypto_probe(struct platform_device *pdev)
 	qce->async_req_enqueue = qce_async_request_enqueue;
 	qce->async_req_done = qce_async_request_done;
 
-	return devm_qce_register_algs(qce);
+	ret = devm_qce_register_algs(qce);
+	if (ret)
+		return ret;
+
+	qce->dma_size = resource_size(res);
+	qce->base_dma = dma_map_resource(dev, res->start, qce->dma_size,
+					 DMA_BIDIRECTIONAL, 0);
+	qce->base_phys = res->start;
+	ret = dma_mapping_error(dev, qce->base_dma);
+	if (ret)
+		return ret;
+
+	return devm_add_action_or_reset(qce->dev, qce_crypto_unmap_dma, qce);
 }
 
 static const struct of_device_id qce_crypto_of_match[] = {
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index f092ce2d3b04a936a37805c20ac5ba78d8fdd2df..a80e12eac6c87e5321cce16c56a4bf5003474ef0 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -27,6 +27,9 @@
  * @dma: pointer to dma data
  * @burst_size: the crypto burst size
  * @pipe_pair_id: which pipe pair id the device using
+ * @base_dma: base DMA address
+ * @base_phys: base physical address
+ * @dma_size: size of memory mapped for DMA
  * @async_req_enqueue: invoked by every algorithm to enqueue a request
  * @async_req_done: invoked by every algorithm to finish its request
  */
@@ -43,6 +46,9 @@ struct qce_device {
 	struct qce_dma_data dma;
 	int burst_size;
 	unsigned int pipe_pair_id;
+	dma_addr_t base_dma;
+	phys_addr_t base_phys;
+	size_t dma_size;
 	int (*async_req_enqueue)(struct qce_device *qce,
 				 struct crypto_async_request *req);
 	void (*async_req_done)(struct qce_device *qce, int ret);

-- 
2.47.3




* [PATCH v15 11/12] crypto: qce - Add BAM DMA support for crypto register I/O
From: Bartosz Golaszewski @ 2026-04-02 14:55 UTC
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com>

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Implement the infrastructure for performing register I/O over BAM DMA
instead of the CPU.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
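Illustration only, not part of the commit: the general idea of queuing
register writes and handing them over in one batch can be modelled in user
space as below. This is a simplified sketch, not the driver code. The
demo_* names and the two-field element layout are hypothetical; the real
struct bam_cmd_element and the DMA submission path are different. Only the
element count of 128 is taken from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CE	128	/* same count as QCE_BAM_CMD_ELEMENT_SIZE */

/* Hypothetical command element: one queued register write. */
struct demo_cmd_element {
	uint32_t addr;	/* byte offset of the register */
	uint32_t data;	/* value to write */
};

/* Hypothetical transaction: the batch being assembled. */
struct demo_txn {
	struct demo_cmd_element ce[DEMO_MAX_CE];
	size_t ce_idx;
};

/* Start a new batch. */
static void demo_clear(struct demo_txn *txn)
{
	txn->ce_idx = 0;
}

/* Queue a write instead of touching the register immediately. */
static int demo_write(struct demo_txn *txn, uint32_t addr, uint32_t val)
{
	if (txn->ce_idx == DEMO_MAX_CE)
		return -1;

	txn->ce[txn->ce_idx].addr = addr;
	txn->ce[txn->ce_idx].data = val;
	txn->ce_idx++;
	return 0;
}

/*
 * Stand-in for handing the batch to the DMA engine: apply the queued
 * writes, in order, to a fake register file. Returns how many were applied.
 */
static size_t demo_submit(const struct demo_txn *txn, uint32_t *regs, size_t nregs)
{
	size_t applied = 0;

	assert(regs != NULL);

	for (size_t i = 0; i < txn->ce_idx; i++) {
		size_t slot = txn->ce[i].addr / sizeof(uint32_t);

		if (slot >= nregs)
			break;
		regs[slot] = txn->ce[i].data;
		applied++;
	}

	return applied;
}
```

In this model nothing reaches the registers until demo_submit() runs, so
the batch has to be submitted before any work that depends on the
configuration is started.
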
 drivers/crypto/qce/aead.c     |   8 +--
 drivers/crypto/qce/common.c   |  20 ++++----
 drivers/crypto/qce/core.h     |   4 ++
 drivers/crypto/qce/dma.c      | 114 ++++++++++++++++++++++++++++++++++++++++--
 drivers/crypto/qce/dma.h      |   5 ++
 drivers/crypto/qce/sha.c      |   8 +--
 drivers/crypto/qce/skcipher.c |   8 +--
 7 files changed, 141 insertions(+), 26 deletions(-)

diff --git a/drivers/crypto/qce/aead.c b/drivers/crypto/qce/aead.c
index abb438d2f8888d313d134161fda97dcc9d82d6d9..0cfea1cbfb0f927e0b8bcd57c47004cbe41175a0 100644
--- a/drivers/crypto/qce/aead.c
+++ b/drivers/crypto/qce/aead.c
@@ -468,6 +468,10 @@ qce_aead_async_req_handle(struct crypto_async_request *async_req)
 			src_nents = dst_nents - 1;
 	}
 
+	ret = qce_start(async_req, tmpl->crypto_alg_type);
+	if (ret)
+		goto error_terminate;
+
 	ret = qce_dma_prep_sgs(&qce->dma, rctx->src_sg, src_nents, rctx->dst_sg, dst_nents,
 			       qce_aead_done, async_req);
 	if (ret)
@@ -475,10 +479,6 @@ qce_aead_async_req_handle(struct crypto_async_request *async_req)
 
 	qce_dma_issue_pending(&qce->dma);
 
-	ret = qce_start(async_req, tmpl->crypto_alg_type);
-	if (ret)
-		goto error_terminate;
-
 	return 0;
 
 error_terminate:
diff --git a/drivers/crypto/qce/common.c b/drivers/crypto/qce/common.c
index 04253a8d33409a2a51db527435d09ae85a7880af..b2b0e751a06517ac06e7a468599bd18666210e0c 100644
--- a/drivers/crypto/qce/common.c
+++ b/drivers/crypto/qce/common.c
@@ -25,7 +25,7 @@ static inline u32 qce_read(struct qce_device *qce, u32 offset)
 
 static inline void qce_write(struct qce_device *qce, u32 offset, u32 val)
 {
-	writel(val, qce->base + offset);
+	qce_write_dma(qce, offset, val);
 }
 
 static inline void qce_write_array(struct qce_device *qce, u32 offset,
@@ -82,6 +82,8 @@ static void qce_setup_config(struct qce_device *qce)
 {
 	u32 config;
 
+	qce_clear_bam_transaction(qce);
+
 	/* get big endianness */
 	config = qce_config_reg(qce, 0);
 
@@ -90,12 +92,14 @@ static void qce_setup_config(struct qce_device *qce)
 	qce_write(qce, REG_CONFIG, config);
 }
 
-static inline void qce_crypto_go(struct qce_device *qce, bool result_dump)
+static inline int qce_crypto_go(struct qce_device *qce, bool result_dump)
 {
 	if (result_dump)
 		qce_write(qce, REG_GOPROC, BIT(GO_SHIFT) | BIT(RESULTS_DUMP_SHIFT));
 	else
 		qce_write(qce, REG_GOPROC, BIT(GO_SHIFT));
+
+	return qce_submit_cmd_desc(qce);
 }
 
 #if defined(CONFIG_CRYPTO_DEV_QCE_SHA) || defined(CONFIG_CRYPTO_DEV_QCE_AEAD)
@@ -223,9 +227,7 @@ static int qce_setup_regs_ahash(struct crypto_async_request *async_req)
 	config = qce_config_reg(qce, 1);
 	qce_write(qce, REG_CONFIG, config);
 
-	qce_crypto_go(qce, true);
-
-	return 0;
+	return qce_crypto_go(qce, true);
 }
 #endif
 
@@ -386,9 +388,7 @@ static int qce_setup_regs_skcipher(struct crypto_async_request *async_req)
 	config = qce_config_reg(qce, 1);
 	qce_write(qce, REG_CONFIG, config);
 
-	qce_crypto_go(qce, true);
-
-	return 0;
+	return qce_crypto_go(qce, true);
 }
 #endif
 
@@ -535,9 +535,7 @@ static int qce_setup_regs_aead(struct crypto_async_request *async_req)
 	qce_write(qce, REG_CONFIG, config);
 
 	/* Start the process */
-	qce_crypto_go(qce, !IS_CCM(flags));
-
-	return 0;
+	return qce_crypto_go(qce, !IS_CCM(flags));
 }
 #endif
 
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index a80e12eac6c87e5321cce16c56a4bf5003474ef0..d238097f834e4605f3825f23d0316d4196439116 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -30,6 +30,8 @@
  * @base_dma: base DMA address
  * @base_phys: base physical address
  * @dma_size: size of memory mapped for DMA
+ * @read_buf: Buffer for DMA to write back to
+ * @read_buf_dma: Mapped address of the read buffer
  * @async_req_enqueue: invoked by every algorithm to enqueue a request
  * @async_req_done: invoked by every algorithm to finish its request
  */
@@ -49,6 +51,8 @@ struct qce_device {
 	dma_addr_t base_dma;
 	phys_addr_t base_phys;
 	size_t dma_size;
+	__le32 *read_buf;
+	dma_addr_t read_buf_dma;
 	int (*async_req_enqueue)(struct qce_device *qce,
 				 struct crypto_async_request *req);
 	void (*async_req_done)(struct qce_device *qce, int ret);
diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index a46264735bb895b6199969e83391383ccbbacc5f..5c42fc7ddf01e11a6562d272ba7c90c906e0e312 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -4,6 +4,8 @@
  */
 
 #include <linux/device.h>
+#include <linux/dma/qcom_bam_dma.h>
+#include <linux/dma-mapping.h>
 #include <linux/dmaengine.h>
 #include <crypto/scatterwalk.h>
 
@@ -11,6 +13,96 @@
 #include "dma.h"
 
 #define QCE_IGNORE_BUF_SZ		(2 * QCE_BAM_BURST_SIZE)
+#define QCE_BAM_CMD_SGL_SIZE		128
+#define QCE_BAM_CMD_ELEMENT_SIZE	128
+#define QCE_MAX_REG_READ		8
+
+struct qce_desc_info {
+	struct dma_async_tx_descriptor *dma_desc;
+	enum dma_data_direction dir;
+};
+
+struct qce_bam_transaction {
+	struct bam_cmd_element bam_ce[QCE_BAM_CMD_ELEMENT_SIZE];
+	struct scatterlist wr_sgl[QCE_BAM_CMD_SGL_SIZE];
+	struct qce_desc_info *desc;
+	u32 bam_ce_idx;
+	u32 pre_bam_ce_idx;
+	u32 wr_sgl_cnt;
+};
+
+void qce_clear_bam_transaction(struct qce_device *qce)
+{
+	struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
+
+	bam_txn->bam_ce_idx = 0;
+	bam_txn->wr_sgl_cnt = 0;
+	bam_txn->bam_ce_idx = 0;
+	bam_txn->pre_bam_ce_idx = 0;
+}
+
+int qce_submit_cmd_desc(struct qce_device *qce)
+{
+	struct qce_desc_info *qce_desc = qce->dma.bam_txn->desc;
+	struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
+	struct dma_async_tx_descriptor *dma_desc;
+	struct dma_chan *chan = qce->dma.rxchan;
+	unsigned long attrs = DMA_PREP_CMD;
+	dma_cookie_t cookie;
+	unsigned int mapped;
+	int ret;
+
+	mapped = dma_map_sg_attrs(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt,
+				  DMA_TO_DEVICE, attrs);
+	if (!mapped)
+		return -ENOMEM;
+
+	dma_desc = dmaengine_prep_slave_sg(chan, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt,
+					   DMA_MEM_TO_DEV, attrs);
+	if (!dma_desc) {
+		dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+		return -ENOMEM;
+	}
+
+	qce_desc->dma_desc = dma_desc;
+	cookie = dmaengine_submit(qce_desc->dma_desc);
+
+	ret = dma_submit_error(cookie);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static void qce_prep_dma_cmd_desc(struct qce_device *qce, struct qce_dma_data *dma,
+				  unsigned int addr, void *buf)
+{
+	struct qce_bam_transaction *bam_txn = dma->bam_txn;
+	struct bam_cmd_element *bam_ce_buf;
+	int bam_ce_size, cnt, idx;
+
+	idx = bam_txn->bam_ce_idx;
+	bam_ce_buf = &bam_txn->bam_ce[idx];
+	bam_prep_ce_le32(bam_ce_buf, addr, BAM_WRITE_COMMAND, *((__le32 *)buf));
+
+	bam_ce_buf = &bam_txn->bam_ce[bam_txn->pre_bam_ce_idx];
+	bam_txn->bam_ce_idx++;
+	bam_ce_size = (bam_txn->bam_ce_idx - bam_txn->pre_bam_ce_idx) * sizeof(*bam_ce_buf);
+
+	cnt = bam_txn->wr_sgl_cnt;
+
+	sg_set_buf(&bam_txn->wr_sgl[cnt], bam_ce_buf, bam_ce_size);
+
+	++bam_txn->wr_sgl_cnt;
+	bam_txn->pre_bam_ce_idx = bam_txn->bam_ce_idx;
+}
+
+void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val)
+{
+	unsigned int reg_addr = ((unsigned int)(qce->base_phys) + offset);
+
+	qce_prep_dma_cmd_desc(qce, &qce->dma, reg_addr, &val);
+}
 
 int devm_qce_dma_request(struct qce_device *qce)
 {
@@ -31,6 +123,21 @@ int devm_qce_dma_request(struct qce_device *qce)
 	if (!dma->result_buf)
 		return -ENOMEM;
 
+	dma->bam_txn = devm_kzalloc(dev, sizeof(*dma->bam_txn), GFP_KERNEL);
+	if (!dma->bam_txn)
+		return -ENOMEM;
+
+	dma->bam_txn->desc = devm_kzalloc(dev, sizeof(*dma->bam_txn->desc), GFP_KERNEL);
+	if (!dma->bam_txn->desc)
+		return -ENOMEM;
+
+	sg_init_table(dma->bam_txn->wr_sgl, QCE_BAM_CMD_SGL_SIZE);
+
+	qce->read_buf = dmam_alloc_coherent(qce->dev, QCE_MAX_REG_READ * sizeof(*qce->read_buf),
+					    &qce->read_buf_dma, GFP_KERNEL);
+	if (!qce->read_buf)
+		return -ENOMEM;
+
 	return 0;
 }
 
@@ -90,15 +197,16 @@ int qce_dma_prep_sgs(struct qce_dma_data *dma, struct scatterlist *rx_sg,
 {
 	struct dma_chan *rxchan = dma->rxchan;
 	struct dma_chan *txchan = dma->txchan;
-	unsigned long flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
+	unsigned long txflags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
+	unsigned long rxflags = txflags | DMA_PREP_FENCE;
 	int ret;
 
-	ret = qce_dma_prep_sg(rxchan, rx_sg, rx_nents, flags, DMA_MEM_TO_DEV,
+	ret = qce_dma_prep_sg(rxchan, rx_sg, rx_nents, rxflags, DMA_MEM_TO_DEV,
 			     NULL, NULL);
 	if (ret)
 		return ret;
 
-	return qce_dma_prep_sg(txchan, tx_sg, tx_nents, flags, DMA_DEV_TO_MEM,
+	return qce_dma_prep_sg(txchan, tx_sg, tx_nents, txflags, DMA_DEV_TO_MEM,
 			       cb, cb_param);
 }
 
diff --git a/drivers/crypto/qce/dma.h b/drivers/crypto/qce/dma.h
index 483789d9fa98e79d1283de8297bf2fc2a773f3a7..f05dfa9e6b25bd60e32f45079a8bc7e6a4cf81f9 100644
--- a/drivers/crypto/qce/dma.h
+++ b/drivers/crypto/qce/dma.h
@@ -8,6 +8,7 @@
 
 #include <linux/dmaengine.h>
 
+struct qce_bam_transaction;
 struct qce_device;
 
 /* maximum data transfer block size between BAM and CE */
@@ -32,6 +33,7 @@ struct qce_dma_data {
 	struct dma_chan *txchan;
 	struct dma_chan *rxchan;
 	struct qce_result_dump *result_buf;
+	struct qce_bam_transaction *bam_txn;
 };
 
 int devm_qce_dma_request(struct qce_device *qce);
@@ -43,5 +45,8 @@ int qce_dma_terminate_all(struct qce_dma_data *dma);
 struct scatterlist *
 qce_sgtable_add(struct sg_table *sgt, struct scatterlist *sg_add,
 		unsigned int max_len);
+void qce_write_dma(struct qce_device *qce, unsigned int offset, u32 val);
+int qce_submit_cmd_desc(struct qce_device *qce);
+void qce_clear_bam_transaction(struct qce_device *qce);
 
 #endif /* _DMA_H_ */
diff --git a/drivers/crypto/qce/sha.c b/drivers/crypto/qce/sha.c
index d7b6d042fb44f4856a6b4f9c901376dd7531454d..f7e1f49b11b9344a5c45a9caddd485d3dac91046 100644
--- a/drivers/crypto/qce/sha.c
+++ b/drivers/crypto/qce/sha.c
@@ -108,6 +108,10 @@ static int qce_ahash_async_req_handle(struct crypto_async_request *async_req)
 		goto error_unmap_src;
 	}
 
+	ret = qce_start(async_req, tmpl->crypto_alg_type);
+	if (ret)
+		goto error_terminate;
+
 	ret = qce_dma_prep_sgs(&qce->dma, req->src, rctx->src_nents,
 			       &rctx->result_sg, 1, qce_ahash_done, async_req);
 	if (ret)
@@ -115,10 +119,6 @@ static int qce_ahash_async_req_handle(struct crypto_async_request *async_req)
 
 	qce_dma_issue_pending(&qce->dma);
 
-	ret = qce_start(async_req, tmpl->crypto_alg_type);
-	if (ret)
-		goto error_terminate;
-
 	return 0;
 
 error_terminate:
diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 872b65318233ed21e3559853f6bbdad030a1b81f..a386b407cfb1b1b8d72ff9c2d255476c6327a3c2 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -141,6 +141,10 @@ qce_skcipher_async_req_handle(struct crypto_async_request *async_req)
 		src_nents = dst_nents - 1;
 	}
 
+	ret = qce_start(async_req, tmpl->crypto_alg_type);
+	if (ret)
+		goto error_terminate;
+
 	ret = qce_dma_prep_sgs(&qce->dma, rctx->src_sg, src_nents,
 			       rctx->dst_sg, dst_nents,
 			       qce_skcipher_done, async_req);
@@ -149,10 +153,6 @@ qce_skcipher_async_req_handle(struct crypto_async_request *async_req)
 
 	qce_dma_issue_pending(&qce->dma);
 
-	ret = qce_start(async_req, tmpl->crypto_alg_type);
-	if (ret)
-		goto error_terminate;
-
 	return 0;
 
 error_terminate:

-- 
2.47.3



^ permalink raw reply related
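
Note on the index bookkeeping in qce_prep_dma_cmd_desc() from the patch
above: each qce_write_dma() call fills one command element and then wraps
every element queued since the previous call into one scatterlist entry.
Below is a minimal userspace sketch of that accounting, not the driver
code. The names struct cmd_element, struct sg_entry, struct txn,
txn_clear() and txn_write() are hypothetical stand-ins for the kernel
types and helpers, and the bounds check is an assumption added for the
sketch; the patch itself does not perform one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_ELEMENT_MAX 128 /* mirrors QCE_BAM_CMD_ELEMENT_SIZE */
#define CMD_SGL_MAX     128 /* mirrors QCE_BAM_CMD_SGL_SIZE */

/* Hypothetical stand-in for struct bam_cmd_element. */
struct cmd_element {
	uint32_t addr;
	uint32_t data;
};

/* Hypothetical stand-in for struct scatterlist: a buffer and its length. */
struct sg_entry {
	const void *buf;
	size_t len;
};

struct txn {
	struct cmd_element ce[CMD_ELEMENT_MAX];
	struct sg_entry sgl[CMD_SGL_MAX];
	uint32_t ce_idx;     /* next free command element */
	uint32_t pre_ce_idx; /* first element not yet covered by an SG entry */
	uint32_t sgl_cnt;    /* SG entries queued so far */
};

/* Reset all counters, as qce_clear_bam_transaction() does. */
static void txn_clear(struct txn *t)
{
	t->ce_idx = 0;
	t->pre_ce_idx = 0;
	t->sgl_cnt = 0;
}

/*
 * Queue one register write. Returns 0 on success, or -1 when either table
 * is full (this check is an addition for the sketch).
 */
static int txn_write(struct txn *t, uint32_t base, uint32_t offset, uint32_t val)
{
	struct cmd_element *first;
	size_t size;

	if (t->ce_idx >= CMD_ELEMENT_MAX || t->sgl_cnt >= CMD_SGL_MAX)
		return -1;

	/* Fill the next command element with the register address and value. */
	t->ce[t->ce_idx].addr = base + offset;
	t->ce[t->ce_idx].data = val;
	t->ce_idx++;

	/* Cover every element added since the last SG entry. */
	first = &t->ce[t->pre_ce_idx];
	size = (t->ce_idx - t->pre_ce_idx) * sizeof(*first);

	t->sgl[t->sgl_cnt].buf = first;
	t->sgl[t->sgl_cnt].len = size;
	t->sgl_cnt++;

	/* The next SG entry starts after the elements just covered. */
	t->pre_ce_idx = t->ce_idx;
	return 0;
}
```

Because pre_ce_idx catches up with ce_idx at the end of every call, each
scatterlist entry in this model spans exactly one command element, so the
two tables fill at the same rate.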

* [PATCH v15 12/12] crypto: qce - Communicate the base physical address to the dmaengine
From: Bartosz Golaszewski @ 2026-04-02 14:55 UTC (permalink / raw)
  To: Vinod Koul, Jonathan Corbet, Thara Gopinath, Herbert Xu,
	David S. Miller, Udit Tiwari, Md Sadre Alam, Dmitry Baryshkov,
	Manivannan Sadhasivam, Stephan Gerhold, Bjorn Andersson,
	Peter Ujfalusi, Michal Simek, Frank Li
  Cc: dmaengine, linux-doc, linux-kernel, linux-arm-msm, linux-crypto,
	linux-arm-kernel, brgl, Bartosz Golaszewski, Bartosz Golaszewski
In-Reply-To: <20260402-qcom-qce-cmd-descr-v15-0-98b5361f7ed7@oss.qualcomm.com>

In order to communicate to the BAM DMA engine which address should be
used as a scratchpad for dummy writes related to BAM pipe locking,
fill out and attach the provided metadata struct to the descriptor.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
 drivers/crypto/qce/dma.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/crypto/qce/dma.c b/drivers/crypto/qce/dma.c
index 5c42fc7ddf01e11a6562d272ba7c90c906e0e312..7d214ed6f703e6ea0c8b6dbb1d7620fcaf4d5163 100644
--- a/drivers/crypto/qce/dma.c
+++ b/drivers/crypto/qce/dma.c
@@ -11,6 +11,7 @@
 
 #include "core.h"
 #include "dma.h"
+#include "regs-v5.h"
 
 #define QCE_IGNORE_BUF_SZ		(2 * QCE_BAM_BURST_SIZE)
 #define QCE_BAM_CMD_SGL_SIZE		128
@@ -43,6 +44,10 @@ void qce_clear_bam_transaction(struct qce_device *qce)
 
 int qce_submit_cmd_desc(struct qce_device *qce)
 {
+	struct bam_desc_metadata meta = {
+		.scratchpad_addr = qce->base_phys + REG_VERSION,
+		.direction = DMA_MEM_TO_DEV,
+	};
 	struct qce_desc_info *qce_desc = qce->dma.bam_txn->desc;
 	struct qce_bam_transaction *bam_txn = qce->dma.bam_txn;
 	struct dma_async_tx_descriptor *dma_desc;
@@ -64,6 +69,12 @@ int qce_submit_cmd_desc(struct qce_device *qce)
 		return -ENOMEM;
 	}
 
+	ret = dmaengine_desc_attach_metadata(dma_desc, &meta, 0);
+	if (ret) {
+		dma_unmap_sg(qce->dev, bam_txn->wr_sgl, bam_txn->wr_sgl_cnt, DMA_TO_DEVICE);
+		return ret;
+	}
+
 	qce_desc->dma_desc = dma_desc;
 	cookie = dmaengine_submit(qce_desc->dma_desc);
 

-- 
2.47.3



^ permalink raw reply related

* [PATCH v2] stmmac: cleanup dead dependencies on STMMAC_PLATFORM and STMMAC_ETH in Kconfig
From: Julian Braha @ 2026-04-02 14:58 UTC (permalink / raw)
  To: davem, peppe.cavallaro, alexandre.torgue, mcoquelin.stm32, linux,
	kuba
  Cc: netdev, linux-arm-kernel, linux-kernel, Julian Braha,
	Russell King (Oracle)

There are already 'if STMMAC_ETH' and 'STMMAC_PLATFORM'
conditions wrapping these config options, making the
'depends on' statements duplicate dependencies (dead code).

I propose leaving the outer 'if STMMAC_PLATFORM...endif' and
'if STMMAC_ETH...endif' conditions, and removing the
individual 'depends on' statements.

This dead code was found by kconfirm, a static analysis tool for Kconfig.

Signed-off-by: Julian Braha <julianbraha@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
v2: add back default STMMAC_PLATFORM for DWMAC_GENERIC
Link to v1: https://lore.kernel.org/all/20260331125817.117091-1-julianbraha@gmail.com/
---
 drivers/net/ethernet/stmicro/stmmac/Kconfig | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index c2cb530fd0a2..e3dd5adda5ac 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -20,7 +20,6 @@ if STMMAC_ETH
 config STMMAC_SELFTESTS
 	bool "Support for STMMAC Selftests"
 	depends on INET
-	depends on STMMAC_ETH
 	default n
 	help
 	  This adds support for STMMAC Selftests using ethtool. Enable this
@@ -29,7 +28,6 @@ config STMMAC_SELFTESTS
 
 config STMMAC_PLATFORM
 	tristate "STMMAC Platform bus support"
-	depends on STMMAC_ETH
 	select MFD_SYSCON
 	default y
 	help
@@ -336,7 +334,6 @@ config DWMAC_IMX8
 config DWMAC_INTEL_PLAT
 	tristate "Intel dwmac support"
 	depends on OF && COMMON_CLK
-	depends on STMMAC_ETH
 	help
 	  Support for ethernet controllers on Intel SoCs
 
@@ -371,7 +368,7 @@ config DWMAC_VISCONTI
 	help
 	  Support for ethernet controller on Visconti SoCs.
 
-endif
+endif # STMMAC_PLATFORM
 
 config STMMAC_LIBPCI
 	tristate
@@ -381,7 +378,7 @@ config STMMAC_LIBPCI
 config DWMAC_INTEL
 	tristate "Intel GMAC support"
 	default X86
-	depends on X86 && STMMAC_ETH && PCI
+	depends on X86 && PCI
 	depends on COMMON_CLK
 	depends on ACPI
 	help
@@ -420,4 +417,4 @@ config STMMAC_PCI
 	  If you have a controller with this interface, say Y or M here.
 
 	  If unsure, say N.
-endif
+endif # STMMAC_ETH
-- 
2.53.0



^ permalink raw reply related

* Re: [PATCH 3/8] firmware: sysfb: Make CONFIG_SYSFB a user-selectable option
From: Arnd Bergmann @ 2026-04-02 14:59 UTC (permalink / raw)
  To: Thomas Zimmermann, Javier Martinez Canillas, Ard Biesheuvel,
	Ilias Apalodimas, Huacai Chen, WANG Xuerui, Maarten Lankhorst,
	Maxime Ripard, Dave Airlie, Simona Vetter, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, longli, Helge Deller
  Cc: linux-arm-kernel, loongarch, linux-efi, linux-riscv, dri-devel,
	linux-hyperv, linux-fbdev
In-Reply-To: <3e466158-c2e5-4e23-934f-dcdbb71ad41f@suse.de>

On Thu, Apr 2, 2026, at 16:10, Thomas Zimmermann wrote:
> Am 02.04.26 um 15:08 schrieb Arnd Bergmann:
>> On Thu, Apr 2, 2026, at 11:09, Thomas Zimmermann wrote:
>> I don't really like this part of the series and would prefer
>> to keep CONFIG_SYSFB hidden as much as possible as an x86
>> (and EFI) specific implementation detail, with the hope
>> of eventually separating out the x86 bits from the EFI ones.
>
> You mean, you want to use the EFI-provided framebuffers without the 
> intermediate step of going through sysfb_primary_display?
>
> In that case, CONFIG_SYSFB would become an x86-internal thing, right?

The part that is still needed from sysfb is the arbitration
between DRM_EFI and the PCI device driver for the same hardware,
so I think some part of sysfb is clearly needed, in particular
the sysfb_disable() function that removes the EFI framebuffer
when there is a conflicting simpledrm or hardware specific
driver.

The parts that I want to keep out of that is anything
related to the x86 boot protocol, non-EFI framebuffers,
text console, and kexec handoff, which we don't need on
non-x86 UEFI systems.

I don't mind the idea of having a sysfb_primary_display
in the EFI code if that helps keep EFI sane on x86,
but it would be good to make that local to
drivers/firmware/efi and (eventually) detached from
include/uapi/linux/screen_info.h.

>> In general, I am always in favor of properly using Kconfig
>> dependencies over 'select' statements, for the same reasons
>> you describe, but I don't want the x86 logic for
>> the legacy VESA and VGA console handling to leak into more
>> architectures than necessary.
>>
>> Do you think we could instead move the sysfb_init()
>> function into the same two places that contain the
>> sysfb_primary_display definition (arch/x86/kernel/setup.c,
>> drivers/firmware/efi/efi-init.c) and simplify the efi version
>> to take out the x86 bits? That would reduce the rest
>> of sysfb-primary.c to the logic to unregister the device,
>> and that could then be selected by both x86 and EFI.
>
> No, I'm more than happy that sysfb finally consolidates all the 
> init-framebuffer setup and detection that floated around in the kernel. 
> I would not want it to be duplicated again.
>
> For now, we could certainly keep CONFIG_SYSFB hidden and autoselected. 
> Although I think this will require some sort of solution at a later point.

Can you clarify which problem you are trying to solve
with that?

     Arnd


^ permalink raw reply

* Re: [PATCH net v4 0/2] stmmac crash/stall fixes when under memory pressure
From: Jakub Kicinski @ 2026-04-02 15:05 UTC (permalink / raw)
  To: Sam Edwards
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
	Maxime Coquelin, Alexandre Torgue, Russell King (Oracle),
	Maxime Chevallier, Ovidiu Panait, Vladimir Oltean, Baruch Siach,
	Serge Semin, Giuseppe Cavallaro, netdev, linux-stm32,
	linux-arm-kernel, linux-kernel
In-Reply-To: <20260401041929.12392-1-CFSworks@gmail.com>

On Tue, 31 Mar 2026 21:19:27 -0700 Sam Edwards wrote:
> - Changed patch 2 to tolerate dirty stragglers up to a critical threshold (the
>   same threshold tolerated by the zero-copy path), to avoid nuisance looping
>   during OOM conditions (thanks Jakub)

I meant we need both a threshold, and a delay :(


^ permalink raw reply

* Re: [PATCH] dt-bindings: arm: arm,vexpress-scc: convert to DT schema
From: Liviu Dudau @ 2026-04-02 15:11 UTC (permalink / raw)
  To: Khushal Chitturi
  Cc: robh, krzk+dt, conor+dt, sudeep.holla, lpieralisi, pawel.moll,
	devicetree, linux-arm-kernel, linux-kernel
In-Reply-To: <20260331172959.35745-1-khushalchitturi@gmail.com>

Hello,

Thanks for your patch. I have some suggestions to improve it.

On Tue, Mar 31, 2026 at 10:59:59PM +0530, Khushal Chitturi wrote:
> Convert the ARM Versatile Express Serial Configuration Controller
> bindings to DT schema.
> 
> Signed-off-by: Khushal Chitturi <khushalchitturi@gmail.com>
> ---
> Note:
> * This patch is part of the GSoC2026 application process for device tree bindings conversions
> * https://github.com/LinuxFoundationGSoC/ProjectIdeas/wiki/GSoC-2026-Device-Tree-Bindings
> 
>  .../bindings/arm/arm,vexpress-scc.yaml        | 51 +++++++++++++++++++
>  .../devicetree/bindings/arm/vexpress-scc.txt  | 33 ------------
>  2 files changed, 51 insertions(+), 33 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/arm/arm,vexpress-scc.yaml
>  delete mode 100644 Documentation/devicetree/bindings/arm/vexpress-scc.txt
> 
> diff --git a/Documentation/devicetree/bindings/arm/arm,vexpress-scc.yaml b/Documentation/devicetree/bindings/arm/arm,vexpress-scc.yaml
> new file mode 100644
> index 000000000000..7870410211a0
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/arm,vexpress-scc.yaml
> @@ -0,0 +1,51 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/arm/arm,vexpress-scc.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: ARM Versatile Express Serial Configuration Controller
> +
> +maintainers:
> +  - Pawel Moll <pawel.moll@arm.com>

I'm not sure Pawel wants to be maintainer for this file, maybe add me and Sudeep
instead. I'd also wait until Pawel replies.

> +
> +description: |
> +  Test chips for ARM Versatile Express platform implement SCC (Serial
> +  Configuration Controller) interface, used to set initial conditions
> +  for the test chip.
> +
> +  In some cases its registers are also mapped in normal address space
> +  and can be used to obtain runtime information about the chip internals
> +  (like silicon temperature sensors) and as interface to other subsystems
> +  like platform configuration control and power management.
> +
> +properties:
> +  compatible:
> +    items:
> +      - pattern: "^arm,vexpress-scc,[a-z0-9_-]+$"

This is way too generic. I suggest you have a look at bindings/arm/arm,vexpress-juno.yaml
and see how we defined the possible values for the compatible string there. For the initial
conversion I would suggest you only define as valid the "arm,vexpress-scc,v2p-ca15_a7" value
but in a way similar to Juno's file so that it can be extended in the future.

Best regards,
Liviu

> +      - const: arm,vexpress-scc
> +
> +  reg:
> +    maxItems: 1
> +
> +  interrupts:
> +    maxItems: 1
> +
> +required:
> +  - compatible
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +    bus {
> +        #address-cells = <2>;
> +        #size-cells = <2>;
> +
> +        scc@7fff0000 {
> +            compatible = "arm,vexpress-scc,v2p-ca15_a7", "arm,vexpress-scc";
> +            reg = <0 0x7fff0000 0 0x1000>;
> +            interrupts = <0 95 4>;
> +        };
> +    };
> +...
> diff --git a/Documentation/devicetree/bindings/arm/vexpress-scc.txt b/Documentation/devicetree/bindings/arm/vexpress-scc.txt
> deleted file mode 100644
> index ae5043e42e5d..000000000000
> --- a/Documentation/devicetree/bindings/arm/vexpress-scc.txt
> +++ /dev/null
> @@ -1,33 +0,0 @@
> -ARM Versatile Express Serial Configuration Controller
> ------------------------------------------------------
> -
> -Test chips for ARM Versatile Express platform implement SCC (Serial
> -Configuration Controller) interface, used to set initial conditions
> -for the test chip.
> -
> -In some cases its registers are also mapped in normal address space
> -and can be used to obtain runtime information about the chip internals
> -(like silicon temperature sensors) and as interface to other subsystems
> -like platform configuration control and power management.
> -
> -Required properties:
> -
> -- compatible value: "arm,vexpress-scc,<model>", "arm,vexpress-scc";
> -		    where <model> is the full tile model name (as used
> -		    in the tile's Technical Reference Manual),
> -		    eg. for Coretile Express A15x2 A7x3 (V2P-CA15_A7):
> -	compatible = "arm,vexpress-scc,v2p-ca15_a7", "arm,vexpress-scc";
> -
> -Optional properties:
> -
> -- reg: when the SCC is memory mapped, physical address and size of the
> -       registers window
> -- interrupts: when the SCC can generate a system-level interrupt
> -
> -Example:
> -
> -	scc@7fff0000 {
> -		compatible = "arm,vexpress-scc,v2p-ca15_a7", "arm,vexpress-scc";
> -		reg = <0 0x7fff0000 0 0x1000>;
> -		interrupts = <0 95 4>;
> -	};
> -- 
> 2.53.0
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯


^ permalink raw reply

* Re: Re: Re: Re: Re: [PATCH 1/2] dt-bindings: gpu: mali-valhall-csf: Document i.MX952 support
From: Liviu Dudau @ 2026-04-02 15:14 UTC (permalink / raw)
  To: Guangliu Ding
  Cc: Daniel Baluta (OSS), Daniel Almeida, Alice Ryhl, Boris Brezillon,
	Steven Price, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam,
	dri-devel@lists.freedesktop.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, imx@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org, Jiyu Yang
In-Reply-To: <AM0PR04MB470716BD5A538CFBB6685C87F350A@AM0PR04MB4707.eurprd04.prod.outlook.com>

On Wed, Apr 01, 2026 at 06:03:28PM +0000, Guangliu Ding wrote:
> Hi Liviu
> 
> Thanks a lot for your sharing.
> 
> > On Wed, Apr 01, 2026 at 03:59:23PM +0000, Guangliu Ding wrote:
> > > Hi Liviu
> > >
> > > > On Wed, Apr 01, 2026 at 10:31:01AM +0000, Guangliu Ding wrote:
> > > > > Hi Liviu
> > > > >
> > > > > > On Wed, Apr 01, 2026 at 09:43:12AM +0000, Guangliu Ding wrote:
> > > > > > > Hi Daniel
> > > > > > >
> > > > > > > > On 4/1/26 11:48, Guangliu Ding wrote:
> > > > > > > > > Hi Liviu
> > > > > > > > >
> > > > > > > > > Thanks for your review. Please refer to my comments below:
> > > > > > > > >
> > > > > > > > > >> On Tue, Mar 31, 2026 at 06:12:38PM +0800, Guangliu Ding wrote:
> > > > > > > > >>> Add compatible string of Mali G310 GPU on i.MX952 board.
> > > > > > > > >>>
> > > > > > > > >>> Signed-off-by: Guangliu Ding <guangliu.ding@nxp.com>
> > > > > > > > >>> Reviewed-by: Jiyu Yang <jiyu.yang@nxp.com>
> > > > > > > > >>> ---
> > > > > > > > >>>
> > > > > > > > > >>> Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml | 1 +
> > > > > > > > > >>>  1 file changed, 1 insertion(+)
> > > > > > > > > >>>
> > > > > > > > > >>> diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> > > > > > > > > >>> index 8eccd4338a2b..6a10843a26e2 100644
> > > > > > > > > >>> --- a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> > > > > > > > > >>> +++ b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> > > > > > > > >>> @@ -20,6 +20,7 @@ properties:
> > > > > > > > >>>            - enum:
> > > > > > > > >>>                - mediatek,mt8196-mali
> > > > > > > > >>>                - nxp,imx95-mali            # G310
> > > > > > > > >>> +              - nxp,imx952-mali           # G310
> > > > > > > > >> Can you explain why this is needed? Can it not be covered
> > > > > > > > >> by the existing compatible?
> > > > > > > > > > There are functional differences in the GPU module (GPUMIX)
> > > > > > > > > > between i.MX95 and i.MX952, so they cannot be fully covered
> > > > > > > > > > by a single existing compatible.
> > > > > > > > > > On i.MX952, the GPU clock is controlled by a hardware GPU
> > > > > > > > > > auto clock-gating mechanism, while the GPU clock is
> > > > > > > > > > managed explicitly by the driver on i.MX95.
> > > > > > > > > > Because of these behavioral differences, separate
> > > > > > > > > > compatible strings "nxp,imx95-mali" and "nxp,imx952-mali"
> > > > > > > > > > are needed to allow the driver to handle the two variants
> > > > > > > > > > independently and to keep room for future divergence.
> > > > > > > >
> > > > > > > >
> > > > > > > > This information should be added in the commit message
> > > > > > > > explaining why
> > > > > > > >
> > > > > > > > the change is needed.
> > > > > > > >
> > > > > > > >
> > > > > > > > > But then where is the driver code taking care of these differences?
> > > > > > > >
> > > > > > >
> > > > > > > > Yes. Currently the driver does not require the "nxp,imx952-mali"
> > > > > > > > string. However, when GPU ipa_counters are enabled to calculate
> > > > > > > > the GPU busy_time/idle_time for the GPU DVFS feature, they
> > > > > > > > conflict with the hardware GPU auto clock-gating mechanism,
> > > > > > > > causing the GPU clock to remain always on.
> > > > > > > > In such cases, ipa_counters need to be disabled, keyed on the
> > > > > > > > "nxp,imx952-mali" string, so that the GPU auto clock-gating
> > > > > > > > mechanism can operate normally.
> > > > > >
> > > > > > OK, I understand that you're following guidance from some other
> > > > > > senior people on how to upstream patches so you've tried to
> > > > > > create the smallest patchset to ensure that it gets reviewed and
> > > > > > accepted, but in this case we need to see the other patches as
> > > > > > well to decide if your approach is the right one and we do need
> > > > > > a separate compatible string.
> > > > > >
> > > > > > If enabling GPU ipa_counters causes the clocks to get stuck
> > > > > > active, that feels like a hardware bug, so figuring out how to
> > > > > > handle that is more important than adding a compatible string.
> > > > > >
> > > > > > Either add the patch(es) that use the compatible to this series
> > > > > > in v2, or put a comment in the commit message on where we can
> > > > > > see the driver changes.
> > > > > >
> > > > >
> > > > > According to discussions with the GPU vendor, this is a hardware
> > > > > limitation of Mali-G310 rather than a hardware bug, and it has
> > > > > been addressed in newer Mali GPU families.
> > > >
> > > > I represent the said GPU vendor and I think I know what you're
> > > > talking about, but you're taking the wrong approach. All G310s have
> > > > a problem where in order to enable access to the ipa_counters the
> > > > automatic clock gating gets disabled. So the solution that needs to
> > > > be implemented when we add support for IPA_COUNTERs will apply to all
> > > > GPUs, not just MX952.
> > >
> > > Yes. We have brought up the G310 (V2) GPU on both i.MX95 and i.MX952. The
> > > auto clock gating mechanism was first introduced in i.MX952 (it is not
> > > supported on i.MX95).
> > > According to your update, a solution needs to be implemented for all GPUs
> > > which support the auto clock gating mechanism once IPA_COUNTERs are
> > > supported in the driver, right?
> > 
> > A solution is needed, yes.
> > 
> > > What's your suggestions for 952 gpu dtb node?
> > 
> > There is no IPA_COUNTER use in Panthor at the moment. Unless your DVFS
> > controller uses that, I would suggest that we don't introduce a compatible for
> > 952 until the time we add support for reading the counters.
> > 
> > It helps if you think in terms of what is already in upstream, rather than mixing
> > with the tests that uses kbase code. Does your hardware need extra code in
> > upstream in order to function? If so, where is that code? If not, then let's not
> > introduce the compatible until we are absolutely sure we need it because we
> > have code specific to that SoC. For everything else we will implement an
> > architecture fix if needed.
> > 
> 
> Got it. The following compatible string is the correct choice since the GPU on
> i.MX952 is fully compatible with the GPU on i.MX95 now.
> compatible = "nxp,imx95-mali", "arm,mali-valhall-csf";
> 
> I will not mix tests with kbase code in the following upstream patches for panthor driver.

OK. I think if you drop patch 2/3 from your series you can send a v3.

Best regards,
Liviu

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯


^ permalink raw reply

* Re: [PATCH 3/8] firmware: sysfb: Make CONFIG_SYSFB a user-selectable option
From: Thomas Zimmermann @ 2026-04-02 15:27 UTC (permalink / raw)
  To: Arnd Bergmann, Javier Martinez Canillas, Ard Biesheuvel,
	Ilias Apalodimas, Huacai Chen, WANG Xuerui, Maarten Lankhorst,
	Maxime Ripard, Dave Airlie, Simona Vetter, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, longli, Helge Deller
  Cc: linux-arm-kernel, loongarch, linux-efi, linux-riscv, dri-devel,
	linux-hyperv, linux-fbdev
In-Reply-To: <d3d7c545-3b07-4881-a16d-45b6f039de19@app.fastmail.com>

Hi

Am 02.04.26 um 16:59 schrieb Arnd Bergmann:
> On Thu, Apr 2, 2026, at 16:10, Thomas Zimmermann wrote:
>> Am 02.04.26 um 15:08 schrieb Arnd Bergmann:
>>> On Thu, Apr 2, 2026, at 11:09, Thomas Zimmermann wrote:
>>> I don't really like this part of the series and would prefer
>>> to keep CONFIG_SYSFB hidden as much as possible as an x86
>>> (and EFI) specific implementation detail, with the hope
>>> of eventually separating out the x86 bits from the EFI ones.
>> You mean, you want to use the EFI-provided framebuffers without the
>> intermediate step of going through sysfb_primary_display?
>>
>> In that case, CONFIG_SYSFB would become an x86-internal thing, right?
> The part that is still needed from sysfb is the arbitration
> between DRM_EFI and the PCI device driver for the same hardware,
> so I think some part of sysfb is clearly needed, in particular
> the sysfb_disable() function that removes the EFI framebuffer
> when there is a conflicting simpledrm or hardware specific
> driver.

We do most of that in the aperture-helper module. (see 
<linux/aperture.h>). Calling sysfb_disable() from there is a workaround 
for some corner cases. We can have an EFI-specific function that does 
the same.

BTW, simpledrm-on-EFI/VESA is considered obsolete and should preferably 
be removed from that driver. Simpledrm should become a driver for 
Devicetree nodes of type simple-framebuffer (as it originally has been 
intended).

>
> The parts that I want to keep out of that is anything
> related to the x86 boot protocol, non-EFI framebuffers,
> text console, and kexec handoff, which we don't need on
> non-x86 UEFI systems.
>
> I don't mind the idea of having a sysfb_primary_display
> in the EFI code if that helps keep EFI sane on x86,
> but it would be good to make that local to
> drivers/firmware/efi and (eventually) detached from
> include/uapi/linux/screen_info.h.

Efidrm retrieves the framebuffer settings from the contained struct 
screen_info. Disconnecting from screen_info would require separate 
graphics drivers for x86 and non-x86. If we split off EFI from sysfb, 
we'd likely need a sysfbdrm driver of some sort. Just saying.

I think we'd also have to duplicate the framebuffer-relocation code that 
currently works on anything using struct screen_info (see patch 5).

>
>>> In general, I am always in favor of properly using Kconfig
>>> dependencies over 'select' statements, for the same reasons
>>> you describe, but I don't want the x86 logic for
>>> the legacy VESA and VGA console handling to leak into more
>>> architectures than necessary.
>>>
>>> Do you think we could instead move the sysfb_init()
>>> function into the same two places that contain the
>>> sysfb_primary_display definition (arch/x86/kernel/setup.c,
>>> drivers/firmware/efi/efi-init.c) and simplify the efi version
>>> to take out the x86 bits? That would reduce the rest
>>> of sysfb-primary.c to the logic to unregister the device,
>>> and that could then be selected by both x86 and EFI.
>> No, I'm more than happy that sysfb finally consolidates all the
>> init-framebuffer setup and detection that floated around in the kernel.
>> I would not want it to be duplicated again.
>>
>> For now, we could certainly keep CONFIG_SYSFB hidden and autoselected.
>> Although I think this will require some sort of solution at a later point.
> Can you clarify which problem you are trying to solve
> with that?

One thing is that some users simply want control over their kernel build.

I also think that there might be systems that want to use
sysfb_primary_display (plus the relocation feature), but not create the
framebuffer device, say for efi-earlycon. Doing that requires user
control over the SYSFB option.

As a side-effect, user-configurable SYSFB gives us a nice place to put 
SYSFB_SIMPLEFB and FIRMWARE_EDID; two options that currently float 
around in the config somewhat arbitrarily.

Best regards
Thomas

>
>       Arnd

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)




^ permalink raw reply

* Re: [PATCH 3/4] perf arm_spe: Decode Arm N1 IMPDEF events
From: Ian Rogers @ 2026-04-02 15:26 UTC (permalink / raw)
  To: James Clark
  Cc: John Garry, Will Deacon, Mike Leach, Leo Yan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Al Grant,
	linux-arm-kernel, linux-perf-users, linux-kernel
In-Reply-To: <20260401-james-spe-impdef-decode-v1-3-ad0d372c220c@linaro.org>

On Wed, Apr 1, 2026 at 7:26 AM James Clark <james.clark@linaro.org> wrote:
>
> From the TRM [1], N1 has one IMPDEF event which isn't covered by the
> common list. Add a framework so that more cores can be added in the
> future and that the N1 IMPDEF event can be decoded. Also increase the
> size of the buffer because we're adding more strings and if it gets
> truncated it falls back to a hex dump only.
>
> [1]: https://developer.arm.com/documentation/100616/0401/Statistical-Profiling-Extension/implementation-defined-features-of-SPE
> Suggested-by: Al Grant <al.grant@arm.com>
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
>  tools/perf/util/arm-spe-decoder/Build              |  2 +
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.c     | 45 ++++++++++++++++++++--
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.h     |  5 ++-
>  tools/perf/util/arm-spe.c                          | 13 ++++---
>  4 files changed, 54 insertions(+), 11 deletions(-)
>
> diff --git a/tools/perf/util/arm-spe-decoder/Build b/tools/perf/util/arm-spe-decoder/Build
> index ab500e0efe24..97a298d1e279 100644
> --- a/tools/perf/util/arm-spe-decoder/Build
> +++ b/tools/perf/util/arm-spe-decoder/Build
> @@ -1 +1,3 @@
>  perf-util-y += arm-spe-pkt-decoder.o arm-spe-decoder.o
> +
> +CFLAGS_arm-spe-pkt-decoder.o += -I$(srctree)/tools/arch/arm64/include/ -I$(OUTPUT)arch/arm64/include/generated/
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index c880b0dec3a1..42a7501d4dfe 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -15,6 +15,8 @@
>
>  #include "arm-spe-pkt-decoder.h"
>
> +#include "../../arm64/include/asm/cputype.h"

Sashiko spotted:
https://sashiko.dev/#/patchset/20260401-james-spe-impdef-decode-v1-0-ad0d372c220c%40linaro.org
"""
This isn't a bug, but does this include directive rely on accidental
path normalization?

The relative path ../../arm64/include/asm/cputype.h does not exist relative
to arm-spe-pkt-decoder.c. It only compiles because the Build file adds
-I$(srctree)/tools/arch/arm64/include/ to CFLAGS.

Would it be cleaner to use #include <asm/cputype.h> to explicitly rely on
the include path?
[ ... ]
"""
I wouldn't use <asm/cputype.h> due to cross-compilation and the like;
instead, just add the extra "../" to the include path.

> +
>  static const char * const arm_spe_packet_name[] = {
>         [ARM_SPE_PAD]           = "PAD",
>         [ARM_SPE_END]           = "END",
> @@ -307,6 +309,11 @@ static const struct ev_string common_ev_strings[] = {
>         { .event = 0, .desc = NULL },
>  };
>
> +static const struct ev_string n1_event_strings[] = {
> +       { .event = 12, .desc = "LATE-PREFETCH" },
> +       { .event = 0, .desc = NULL },
> +};
> +
>  static u64 print_event_list(int *err, char **buf, size_t *buf_len,
>                             const struct ev_string *ev_strings, u64 payload)
>  {
> @@ -318,14 +325,44 @@ static u64 print_event_list(int *err, char **buf, size_t *buf_len,
>         return payload;
>  }
>
> +struct event_print_handle {
> +       const struct midr_range *midr_ranges;
> +       const struct ev_string *ev_strings;
> +};
> +
> +#define EV_PRINT(range, strings)                       \
> +       {                                       \
> +               .midr_ranges = range,           \
> +               .ev_strings = strings,  \
> +       }
> +
> +static const struct midr_range n1_event_encoding_cpus[] = {
> +       MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
> +       {},
> +};
> +
> +static const struct event_print_handle event_print_handles[] = {
> +       EV_PRINT(n1_event_encoding_cpus, n1_event_strings),
> +};
> +
>  static int arm_spe_pkt_desc_event(const struct arm_spe_pkt *packet,
> -                                 char *buf, size_t buf_len)
> +                                 char *buf, size_t buf_len, u64 midr)
>  {
>         u64 payload = packet->payload;
>         int err = 0;
>
>         arm_spe_pkt_out_string(&err, &buf, &buf_len, "EV");
> -       print_event_list(&err, &buf, &buf_len, common_ev_strings, payload);
> +       payload = print_event_list(&err, &buf, &buf_len, common_ev_strings,
> +                                  payload);
> +
> +       /* Try to decode IMPDEF bits for known CPUs */
> +       for (unsigned int i = 0; i < ARRAY_SIZE(event_print_handles); i++) {
> +               if (is_midr_in_range_list(midr,
> +                                         event_print_handles[i].midr_ranges))
> +                       payload = print_event_list(&err, &buf, &buf_len,
> +                                                  event_print_handles[i].ev_strings,
> +                                                  payload);
> +       }
>
>         return err;
>  }
> @@ -506,7 +543,7 @@ static int arm_spe_pkt_desc_counter(const struct arm_spe_pkt *packet,
>  }
>
>  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
> -                    size_t buf_len)
> +                    size_t buf_len, u64 midr)
>  {
>         int idx = packet->index;
>         unsigned long long payload = packet->payload;
> @@ -522,7 +559,7 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>                 arm_spe_pkt_out_string(&err, &buf, &blen, "%s", name);
>                 break;
>         case ARM_SPE_EVENTS:
> -               err = arm_spe_pkt_desc_event(packet, buf, buf_len);
> +               err = arm_spe_pkt_desc_event(packet, buf, buf_len, midr);
>                 break;
>         case ARM_SPE_OP_TYPE:
>                 err = arm_spe_pkt_desc_op_type(packet, buf, buf_len);
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index adf4cde320aa..17b067fe3c87 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -11,7 +11,7 @@
>  #include <stddef.h>
>  #include <stdint.h>
>
> -#define ARM_SPE_PKT_DESC_MAX           256
> +#define ARM_SPE_PKT_DESC_MAX           512
>
>  #define ARM_SPE_NEED_MORE_BYTES                -1
>  #define ARM_SPE_BAD_PACKET             -2
> @@ -186,5 +186,6 @@ const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
>  int arm_spe_get_packet(const unsigned char *buf, size_t len,
>                        struct arm_spe_pkt *packet);
>
> -int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf, size_t len);
> +int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf, size_t len,
> +                    u64 midr);
>  #endif
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index 7447b000f9cd..46f0309c092b 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -135,7 +135,7 @@ struct data_source_handle {
>         }
>
>  static void arm_spe_dump(struct arm_spe *spe __maybe_unused,
> -                        unsigned char *buf, size_t len)
> +                        unsigned char *buf, size_t len, u64 midr)
>  {
>         struct arm_spe_pkt packet;
>         size_t pos = 0;
> @@ -161,7 +161,7 @@ static void arm_spe_dump(struct arm_spe *spe __maybe_unused,
>                         color_fprintf(stdout, color, "   ");
>                 if (ret > 0) {
>                         ret = arm_spe_pkt_desc(&packet, desc,
> -                                              ARM_SPE_PKT_DESC_MAX);
> +                                              ARM_SPE_PKT_DESC_MAX, midr);
>                         if (!ret)
>                                 color_fprintf(stdout, color, " %s\n", desc);
>                 } else {
> @@ -174,10 +174,10 @@ static void arm_spe_dump(struct arm_spe *spe __maybe_unused,
>  }
>
>  static void arm_spe_dump_event(struct arm_spe *spe, unsigned char *buf,
> -                              size_t len)
> +                              size_t len, u64 midr)
>  {
>         printf(".\n");
> -       arm_spe_dump(spe, buf, len);
> +       arm_spe_dump(spe, buf, len, midr);
>  }
>
>  static int arm_spe_get_trace(struct arm_spe_buffer *b, void *data)
> @@ -1469,8 +1469,11 @@ static int arm_spe_process_auxtrace_event(struct perf_session *session,
>                 /* Dump here now we have copied a piped trace out of the pipe */
>                 if (dump_trace) {
>                         if (auxtrace_buffer__get_data(buffer, fd)) {
> +                               u64 midr = 0;
> +
> +                               arm_spe__get_midr(spe, buffer->cpu.cpu, &midr);

Sashiko claims to have spotted an issue here:
"""
Is it possible for arm_spe__get_midr() to cause a segmentation fault here?

If the trace is from an older recording (metadata version 1) and the
environment lacks a CPUID string (such as during cross-architecture
analysis), perf_env__cpuid() returns NULL.

It appears arm_spe__get_midr() then passes this NULL pointer to
strtol(cpuid, NULL, 16), which leads to undefined behavior.
"""

But this feels like a case where, if it happens, you're already having a
bad time, and these changes aren't necessarily making things worse.
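
If a guard is wanted anyway, it could look roughly like the sketch below.
This is a standalone userspace illustration, not the perf code:
parse_midr_checked() is a hypothetical helper, and the only assumption
taken from the report above is that the CPUID string may be NULL and is
otherwise parsed as hex.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of perf): parse a hex CPUID string into
 * a MIDR value, rejecting a NULL or unparsable string instead of
 * handing it straight to the strto*() family.
 */
static int parse_midr_checked(const char *cpuid, uint64_t *midr)
{
	unsigned long long val;
	char *end;

	assert(midr);		/* caller bug if the output pointer is NULL */

	if (!cpuid)
		return -EINVAL;	/* e.g. no CPUID string in the environment */

	errno = 0;
	val = strtoull(cpuid, &end, 16);
	if (errno || end == cpuid)
		return -EINVAL;	/* out of range, or no hex digits consumed */

	*midr = val;
	return 0;
}
```

On failure the output value is left untouched, so a caller that
initialises midr to 0 keeps that default.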

Thanks,
Ian

>                                 arm_spe_dump_event(spe, buffer->data,
> -                                               buffer->size);
> +                                               buffer->size, midr);
>                                 auxtrace_buffer__put_data(buffer);
>                         }
>                 }
>
> --
> 2.34.1
>


^ permalink raw reply

* Re: [PATCH 2/4] perf arm_spe: Turn event name mappings into an array
From: Ian Rogers @ 2026-04-02 15:30 UTC (permalink / raw)
  To: James Clark
  Cc: John Garry, Will Deacon, Mike Leach, Leo Yan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Al Grant,
	linux-arm-kernel, linux-perf-users, linux-kernel
In-Reply-To: <20260401-james-spe-impdef-decode-v1-2-ad0d372c220c@linaro.org>

On Wed, Apr 1, 2026 at 7:26 AM James Clark <james.clark@linaro.org> wrote:
>
> This is so we can have a single function that prints events and can be
> used with multiple mappings from different CPUs. Remove any bit that was
> printed so that later we can print out the remaining unknown impdef
> bits.
>
> No functional changes intended.
>
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.c     | 88 +++++++++++-----------
>  1 file changed, 43 insertions(+), 45 deletions(-)
>
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 5769ba2f4140..c880b0dec3a1 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -276,6 +276,48 @@ static int arm_spe_pkt_out_string(int *err, char **buf_p, size_t *blen,
>         return ret;
>  }
>
> +struct ev_string {
> +       u8 event;
> +       const char *desc;
> +};
> +
> +static const struct ev_string common_ev_strings[] = {
> +       { .event = EV_EXCEPTION_GEN, .desc = "EXCEPTION-GEN" },
> +       { .event = EV_RETIRED, .desc = "RETIRED" },
> +       { .event = EV_L1D_ACCESS, .desc = "L1D-ACCESS" },
> +       { .event = EV_L1D_REFILL, .desc = "L1D-REFILL" },
> +       { .event = EV_TLB_ACCESS, .desc = "TLB-ACCESS" },
> +       { .event = EV_TLB_WALK, .desc = "TLB-REFILL" },
> +       { .event = EV_NOT_TAKEN, .desc = "NOT-TAKEN" },
> +       { .event = EV_MISPRED, .desc = "MISPRED" },
> +       { .event = EV_LLC_ACCESS, .desc = "LLC-ACCESS" },
> +       { .event = EV_LLC_MISS, .desc = "LLC-REFILL" },
> +       { .event = EV_REMOTE_ACCESS, .desc = "REMOTE-ACCESS" },
> +       { .event = EV_ALIGNMENT, .desc = "ALIGNMENT" },
> +       { .event = EV_TRANSACTIONAL, .desc = "TXN" },
> +       { .event = EV_PARTIAL_PREDICATE, .desc = "SVE-PARTIAL-PRED" },
> +       { .event = EV_EMPTY_PREDICATE, .desc = "SVE-EMPTY-PRED" },
> +       { .event = EV_L2D_ACCESS, .desc = "L2D-ACCESS" },
> +       { .event = EV_L2D_MISS, .desc = "L2D-MISS" },
> +       { .event = EV_CACHE_DATA_MODIFIED, .desc = "HITM" },
> +       { .event = EV_RECENTLY_FETCHED, .desc = "LFB" },
> +       { .event = EV_DATA_SNOOPED, .desc = "SNOOPED" },
> +       { .event = EV_STREAMING_SVE_MODE, .desc = "STREAMING-SVE" },
> +       { .event = EV_SMCU, .desc = "SMCU" },
> +       { .event = 0, .desc = NULL },
> +};
> +
> +static u64 print_event_list(int *err, char **buf, size_t *buf_len,
> +                           const struct ev_string *ev_strings, u64 payload)
> +{
> +       for (const struct ev_string *ev = ev_strings; ev->desc != NULL; ev++) {
> +               if (payload & BIT(ev->event))
> +                       arm_spe_pkt_out_string(err, buf, buf_len, " %s", ev->desc);
> +               payload &= ~BIT(ev->event);

Sashiko has a bunch of worries in these patches about 32-bit builds:
https://sashiko.dev/#/patchset/20260401-james-spe-impdef-decode-v1-0-ad0d372c220c%40linaro.org
The one here is:
"""
Since payload is a u64, does using ~BIT(ev->event) unintentionally clear
the upper 32 bits on 32-bit architectures?

On 32-bit systems, BIT() evaluates to a 32-bit unsigned long. The bitwise
NOT operation creates a 32-bit mask, which is then zero-extended to 64 bits
during the compound assignment to the payload.

This would discard bits 32-63, contradicting the intent mentioned in the
commit message to preserve the remaining unknown impdef bits for later
printing.

Would using BIT_ULL() be safer here to ensure a 64-bit inverted mask?
"""
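
To illustrate the concern, here is a minimal userspace sketch. BIT32()
and BIT64() are hypothetical stand-ins for what BIT() and BIT_ULL()
would effectively be on a 32-bit build; they are not the kernel
definitions, and the sketch assumes a platform where int is 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: BIT() on a 32-bit build vs. BIT_ULL(). */
#define BIT32(n) ((uint32_t)1 << (n))
#define BIT64(n) ((uint64_t)1 << (n))

/*
 * The inverted mask is only 32 bits wide and gets zero-extended for the
 * 64-bit AND, so bits 32-63 of the payload are cleared as well.
 */
static uint64_t clear_bit_narrow(uint64_t payload, unsigned int n)
{
	payload &= ~BIT32(n);
	return payload;
}

/* Inverting a 64-bit mask clears only bit n. */
static uint64_t clear_bit_wide(uint64_t payload, unsigned int n)
{
	payload &= ~BIT64(n);
	return payload;
}

/* Compare both variants on a payload with one low and one high bit set. */
static void check_masks(void)
{
	uint64_t payload = (1ULL << 40) | (1ULL << 3);

	assert(clear_bit_narrow(payload, 3) == 0);		/* bit 40 lost */
	assert(clear_bit_wide(payload, 3) == (1ULL << 40));	/* bit 40 kept */
}
```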

Thanks,
Ian

> +       }
> +       return payload;
> +}
> +
>  static int arm_spe_pkt_desc_event(const struct arm_spe_pkt *packet,
>                                   char *buf, size_t buf_len)
>  {
> @@ -283,51 +325,7 @@ static int arm_spe_pkt_desc_event(const struct arm_spe_pkt *packet,
>         int err = 0;
>
>         arm_spe_pkt_out_string(&err, &buf, &buf_len, "EV");
> -
> -       if (payload & BIT(EV_EXCEPTION_GEN))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " EXCEPTION-GEN");
> -       if (payload & BIT(EV_RETIRED))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " RETIRED");
> -       if (payload & BIT(EV_L1D_ACCESS))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " L1D-ACCESS");
> -       if (payload & BIT(EV_L1D_REFILL))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " L1D-REFILL");
> -       if (payload & BIT(EV_TLB_ACCESS))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " TLB-ACCESS");
> -       if (payload & BIT(EV_TLB_WALK))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " TLB-REFILL");
> -       if (payload & BIT(EV_NOT_TAKEN))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " NOT-TAKEN");
> -       if (payload & BIT(EV_MISPRED))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " MISPRED");
> -       if (payload & BIT(EV_LLC_ACCESS))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " LLC-ACCESS");
> -       if (payload & BIT(EV_LLC_MISS))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " LLC-REFILL");
> -       if (payload & BIT(EV_REMOTE_ACCESS))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " REMOTE-ACCESS");
> -       if (payload & BIT(EV_ALIGNMENT))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " ALIGNMENT");
> -       if (payload & BIT(EV_TRANSACTIONAL))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " TXN");
> -       if (payload & BIT(EV_PARTIAL_PREDICATE))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " SVE-PARTIAL-PRED");
> -       if (payload & BIT(EV_EMPTY_PREDICATE))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " SVE-EMPTY-PRED");
> -       if (payload & BIT(EV_L2D_ACCESS))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " L2D-ACCESS");
> -       if (payload & BIT(EV_L2D_MISS))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " L2D-MISS");
> -       if (payload & BIT(EV_CACHE_DATA_MODIFIED))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " HITM");
> -       if (payload & BIT(EV_RECENTLY_FETCHED))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " LFB");
> -       if (payload & BIT(EV_DATA_SNOOPED))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " SNOOPED");
> -       if (payload & BIT(EV_STREAMING_SVE_MODE))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " STREAMING-SVE");
> -       if (payload & BIT(EV_SMCU))
> -               arm_spe_pkt_out_string(&err, &buf, &buf_len, " SMCU");
> +       print_event_list(&err, &buf, &buf_len, common_ev_strings, payload);
>
>         return err;
>  }
>
> --
> 2.34.1
>


^ permalink raw reply

* Re: [PATCH 5/8] thermal: khadas-mcu-fan: Add fan config from platform data Add regulator support
From: Neil Armstrong @ 2026-04-02 15:39 UTC (permalink / raw)
  To: Ronald Claveau, Lee Jones, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Andi Shyti, Kevin Hilman, Jerome Brunet,
	Martin Blumenstingl, Beniamino Galvani, Rafael J. Wysocki,
	Daniel Lezcano, Zhang Rui, Lukasz Luba, Liam Girdwood, Mark Brown
  Cc: linux-amlogic, devicetree, linux-kernel, linux-i2c,
	linux-arm-kernel, linux-pm
In-Reply-To: <20260402-add-mcu-fan-khadas-vim4-v1-5-2b12eb4ac7b0@aliel.fr>

On 4/2/26 16:27, Ronald Claveau wrote:
> Replace the hardcoded MAX_LEVEL constant and fan register
> with values read from platform_data (fan_reg, max_level),
> as new MCUs need different values.
> 
> Optionally acquire and enable a "fan" regulator supply
> at probe time and on resume,
> so boards that gate fan power through a regulator are handled.
> 
> Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
> ---
>   drivers/thermal/khadas_mcu_fan.c | 43 ++++++++++++++++++++++++++++++++++------
>   1 file changed, 37 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/thermal/khadas_mcu_fan.c b/drivers/thermal/khadas_mcu_fan.c
> index d35e5313bea41..55b496625e3bd 100644
> --- a/drivers/thermal/khadas_mcu_fan.c
> +++ b/drivers/thermal/khadas_mcu_fan.c
> @@ -13,13 +13,15 @@
>   #include <linux/regmap.h>
>   #include <linux/sysfs.h>
>   #include <linux/thermal.h>
> -
> -#define MAX_LEVEL 3
> +#include <linux/regulator/consumer.h>
>   
>   struct khadas_mcu_fan_ctx {
>   	struct khadas_mcu *mcu;
> +	unsigned int fan_reg;
>   	unsigned int level;
> +	unsigned int max_level;
>   	struct thermal_cooling_device *cdev;
> +	struct regulator *power;
>   };
>   
>   static int khadas_mcu_fan_set_level(struct khadas_mcu_fan_ctx *ctx,
> @@ -27,8 +29,7 @@ static int khadas_mcu_fan_set_level(struct khadas_mcu_fan_ctx *ctx,
>   {
>   	int ret;
>   
> -	ret = regmap_write(ctx->mcu->regmap, KHADAS_MCU_CMD_FAN_STATUS_CTRL_REG,
> -			   level);
> +	ret = regmap_write(ctx->mcu->regmap, ctx->fan_reg, level);
>   	if (ret)
>   		return ret;
>   
> @@ -40,7 +41,9 @@ static int khadas_mcu_fan_set_level(struct khadas_mcu_fan_ctx *ctx,
>   static int khadas_mcu_fan_get_max_state(struct thermal_cooling_device *cdev,
>   					unsigned long *state)
>   {
> -	*state = MAX_LEVEL;
> +	struct khadas_mcu_fan_ctx *ctx = cdev->devdata;
> +
> +	*state = ctx->max_level;
>   
>   	return 0;
>   }
> @@ -61,7 +64,7 @@ khadas_mcu_fan_set_cur_state(struct thermal_cooling_device *cdev,
>   {
>   	struct khadas_mcu_fan_ctx *ctx = cdev->devdata;
>   
> -	if (state > MAX_LEVEL)
> +	if (state > ctx->max_level)
>   		return -EINVAL;
>   
>   	if (state == ctx->level)
> @@ -83,11 +86,32 @@ static int khadas_mcu_fan_probe(struct platform_device *pdev)
>   	struct device *dev = &pdev->dev;
>   	struct khadas_mcu_fan_ctx *ctx;
>   	int ret;
> +	const struct khadas_mcu_fan_pdata *pdata = dev_get_platdata(&pdev->dev);
>   
>   	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
>   	if (!ctx)
>   		return -ENOMEM;
> +
>   	ctx->mcu = mcu;
> +	ctx->fan_reg   = pdata->fan_reg;
> +	ctx->max_level = pdata->max_level;
> +
> +	ctx->power = devm_regulator_get_optional(dev->parent, "fan");
> +	if (IS_ERR(ctx->power)) {
> +		if (PTR_ERR(ctx->power) == -ENODEV)
> +			ctx->power = NULL;
> +		else
> +			return PTR_ERR(ctx->power);
> +	}
> +
> +	if (ctx->power) {
> +		ret = regulator_enable(ctx->power);
> +		if (ret) {
> +			dev_err(dev, "Failed to enable fan power supply: %d\n", ret);
> +			return ret;
> +		}
> +	}
> +
>   	platform_set_drvdata(pdev, ctx);
>   
>   	cdev = devm_thermal_of_cooling_device_register(dev->parent,
> @@ -130,6 +154,13 @@ static int khadas_mcu_fan_suspend(struct device *dev)
>   static int khadas_mcu_fan_resume(struct device *dev)
>   {
>   	struct khadas_mcu_fan_ctx *ctx = dev_get_drvdata(dev);
> +	int ret;
> +
> +	if (ctx->power) {
> +		ret = regulator_enable(ctx->power);

Seems you're missing a regulator_disable() on suspend.
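
To show why the balance matters, here is a small userspace model. It is
only a sketch: struct fake_regulator and its helpers are hypothetical
and merely assume that enables are reference-counted and must be paired
with disables, as the regulator consumer API expects.

```c
#include <assert.h>

/* Userspace model only: a bare use counter standing in for a regulator. */
struct fake_regulator {
	int use_count;
};

static void fake_enable(struct fake_regulator *r)
{
	r->use_count++;
}

static void fake_disable(struct fake_regulator *r)
{
	assert(r->use_count > 0);	/* unbalanced disable would be a bug */
	r->use_count--;
}

/*
 * Model probe followed by a number of suspend/resume cycles and return
 * the final use count. disable_on_suspend selects whether the suspend
 * path drops the reference that resume takes again.
 */
static int run_cycles(int cycles, int disable_on_suspend)
{
	struct fake_regulator r = { 0 };

	fake_enable(&r);			/* probe */
	for (int i = 0; i < cycles; i++) {
		if (disable_on_suspend)
			fake_disable(&r);	/* suspend */
		fake_enable(&r);		/* resume */
	}
	return r.use_count;
}
```

Without the disable in suspend, the count grows by one per cycle, so the
supply could never be switched off again; with it, the count stays at 1.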

Neil

> +		if (ret)
> +			return ret;
> +	}
>   
>   	return khadas_mcu_fan_set_level(ctx, ctx->level);
>   }
> 



^ permalink raw reply

* Re: [PATCH] Bluetooth: btmtk: hide unused btmtk_mt6639_devs[] array
From: Paul Menzel @ 2026-04-02 15:41 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Marcel Holtmann, Luiz Augusto von Dentz, Matthias Brugger,
	AngeloGioacchino Del Regno, Javier Tia, Arnd Bergmann, Chris Lu,
	Kees Cook, Johan Hovold, Sean Wang, Jiande Lu, linux-bluetooth,
	linux-kernel, linux-arm-kernel, linux-mediatek
In-Reply-To: <20260402141119.2732591-1-arnd@kernel.org>

Dear Arnd,


Thank you for your patch.

Am 02.04.26 um 16:11 schrieb Arnd Bergmann:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> When USB support is disabled, the array is not referenced anywhere,
> causing a warning:
> 
> drivers/bluetooth/btmtk.c:35:3: error: 'btmtk_mt6639_devs' defined but not used [-Werror=unused-const-variable=]
>     35 | } btmtk_mt6639_devs[] = {
>        |   ^~~~~~~~~~~~~~~~~
> 
> Move it into the #ifdef block.
> 
> Fixes: 4cdd001ff03f ("Bluetooth: btmtk: Add MT6639 (MT7927) Bluetooth support")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>   drivers/bluetooth/btmtk.c | 32 ++++++++++++++++----------------
>   1 file changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/bluetooth/btmtk.c b/drivers/bluetooth/btmtk.c
> index 099188bf772e..6fb6ca274808 100644
> --- a/drivers/bluetooth/btmtk.c
> +++ b/drivers/bluetooth/btmtk.c
> @@ -25,22 +25,6 @@
>   /* It is for mt79xx iso data transmission setting */
>   #define MTK_ISO_THRESHOLD	264
>   
> -/* Known MT6639 (MT7927) Bluetooth USB devices.
> - * Used to scope the zero-CHIPID workaround to real MT6639 hardware,
> - * since some boards return 0x0000 from the MMIO chip ID register.
> - */
> -static const struct {
> -	u16 vendor;
> -	u16 product;
> -} btmtk_mt6639_devs[] = {
> -	{ 0x0489, 0xe13a },	/* ASUS ROG Crosshair X870E Hero */
> -	{ 0x0489, 0xe0fa },	/* Lenovo Legion Pro 7 16ARX9 */
> -	{ 0x0489, 0xe10f },	/* Gigabyte Z790 AORUS MASTER X */
> -	{ 0x0489, 0xe110 },	/* MSI X870E Ace Max */
> -	{ 0x0489, 0xe116 },	/* TP-Link Archer TBE550E */
> -	{ 0x13d3, 0x3588 },	/* ASUS ROG STRIX X870E-E */
> -};
> -
>   struct btmtk_patch_header {
>   	u8 datetime[16];
>   	u8 platform[4];
> @@ -483,6 +467,22 @@ int btmtk_process_coredump(struct hci_dev *hdev, struct sk_buff *skb)
>   EXPORT_SYMBOL_GPL(btmtk_process_coredump);
>   
>   #if IS_ENABLED(CONFIG_BT_HCIBTUSB_MTK)
> +/* Known MT6639 (MT7927) Bluetooth USB devices.
> + * Used to scope the zero-CHIPID workaround to real MT6639 hardware,
> + * since some boards return 0x0000 from the MMIO chip ID register.
> + */
> +static const struct {
> +	u16 vendor;
> +	u16 product;
> +} btmtk_mt6639_devs[] = {
> +	{ 0x0489, 0xe13a },	/* ASUS ROG Crosshair X870E Hero */
> +	{ 0x0489, 0xe0fa },	/* Lenovo Legion Pro 7 16ARX9 */
> +	{ 0x0489, 0xe10f },	/* Gigabyte Z790 AORUS MASTER X */
> +	{ 0x0489, 0xe110 },	/* MSI X870E Ace Max */
> +	{ 0x0489, 0xe116 },	/* TP-Link Archer TBE550E */
> +	{ 0x13d3, 0x3588 },	/* ASUS ROG STRIX X870E-E */
> +};
> +
>   static void btmtk_usb_wmt_recv(struct urb *urb)
>   {
>   	struct hci_dev *hdev = urb->context;

Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>


Kind regards,

Paul


^ permalink raw reply

* Re: [PATCH 0/2] pwm: mediatek: fix mt7628 register offset and clock source
From: Uwe Kleine-König @ 2026-04-02 15:44 UTC (permalink / raw)
  To: Shiji Yang, Matthias Brugger, AngeloGioacchino Del Regno
  Cc: linux-pwm, linux-kernel, linux-arm-kernel, linux-mediatek
In-Reply-To: <OS7PR01MB1360282ADC135931ECCAD9AF6BC74A@OS7PR01MB13602.jpnprd01.prod.outlook.com>

[-- Attachment #1: Type: text/plain, Size: 247 bytes --]

Hello,

On Tue, Feb 24, 2026 at 04:51:00PM +0800, Shiji Yang wrote:
> This patch series fixes support for mt7628.

The series looks reasonable to me. It would be great to get some
feedback from the Mediatek maintainers, though.

Best regards
Uwe

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH v2] iommu: Always fill in gather when unmapping
From: Greg KH @ 2026-04-02 15:49 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alexandre Ghiti, AngeloGioacchino Del Regno, Albert Ou, asahi,
	Baolin Wang, iommu, Janne Grunau, Jernej Skrabec, Joerg Roedel,
	Jean-Philippe Brucker, linux-arm-kernel, linux-mediatek,
	linux-riscv, linux-sunxi, Matthias Brugger, Neal Gompa,
	Orson Zhai, Palmer Dabbelt, Paul Walmsley, Samuel Holland,
	Sven Peter, virtualization, Chen-Yu Tsai, Will Deacon, Yong Wu,
	Chunyan Zhang, Lu Baolu, Janusz Krzysztofik, Joerg Roedel,
	Jon Hunter, patches, Pranjal Shrivastava, Robin Murphy,
	Samiullah Khawaja, stable, Vasant Hegde
In-Reply-To: <0-v2-b24668f107b2+11bbe-iommu_gather_always_jgg@nvidia.com>

On Thu, Apr 02, 2026 at 11:25:16AM -0300, Jason Gunthorpe wrote:
> The fixed commit assumed that the gather would always be populated if an
> iotlb_sync was required.
> 
> arm-smmu-v3, amd, VT-d, riscv, s390, and mtk all use information from the
> gather during their iotlb_sync() and this approach works for them.
> 
> However, arm-smmu, qcom_iommu, ipmmu-vmsa, sun50i, sprd, virtio, and
> apple-dart all ignore the gather during their iotlb_sync(). They mostly
> issue a full flush.
> 
> Unfortunately the latter set of drivers often don't bother to add anything
> to the gather since they don't intend on using it. Since the core code now
> blocks gathers that were never filled, this caused those drivers to stop
> getting their iotlb_sync() calls and breaks them.
> 
> Since it is impossible to tell the difference between gathers that are
> empty because there is nothing to do and gathers that are empty because
> they are not used, fill in the gathers for the missing cases.
> 
> mtk uses io-pgtable-arm-v7s but added the range to the gather in the unmap
> callback. Move this into the io-pgtable-arm-v7s unmap itself. That will
> fix all the armv7 using drivers (arm-smmu, qcom_iommu, ipmmu-vmsa).
> 
> io-pgtable-arm needs to accommodate drivers like arm-smmu that don't want
> to use the gather by just adding a simple range, and drivers like SMMUv3
> that need to use gather->pgsize and also have a disjoint check. Move
> SMMUv3 to a new tlb_add_range() op which replaces calling
> iommu_iotlb_gather_add_page() in a loop with a single call to update the
> gather with the range and required pgsize.
> 
> iommu_iotlb_gather_add_page() is repurposed since nothing but SMMUv3 uses it
> now that amd, VT-d and riscv are using iommupt.
> 
> Add a trivial gather population to io-pgtable-dart.
> 
> Add trivial populations to sprd, sun50i and virtio-iommu in their unmap
> functions.
> 
> Fixes: 90c5def10bea ("iommu: Do not call drivers for empty gathers")
> Reported-by: Jon Hunter <jonathanh@nvidia.com>
> Closes: https://lore.kernel.org/r/8800a38b-8515-4bbe-af15-0dae81274bf7@nvidia.com
> Tested-by: Jon Hunter <jonathanh@nvidia.com>
> Acked-by: Pranjal Shrivastava <praan@google.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 ++++++-----
>  drivers/iommu/io-pgtable-arm-v7s.c          |  4 ++++
>  drivers/iommu/io-pgtable-arm.c              | 19 ++++++++++++++++---
>  drivers/iommu/io-pgtable-dart.c             |  3 +++
>  drivers/iommu/mtk_iommu.c                   |  1 -
>  drivers/iommu/sprd-iommu.c                  |  1 +
>  drivers/iommu/sun50i-iommu.c                |  1 +
>  drivers/iommu/virtio-iommu.c                |  2 ++
>  include/linux/io-pgtable.h                  |  3 +++
>  include/linux/iommu.h                       | 19 ++++++++++---------
>  10 files changed, 46 insertions(+), 18 deletions(-)
> 

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>


^ permalink raw reply

* Re: [PATCH v2] iommu: Always fill in gather when unmapping
From: Robin Murphy @ 2026-04-02 15:59 UTC (permalink / raw)
  To: Jason Gunthorpe, Alexandre Ghiti, AngeloGioacchino Del Regno,
	Albert Ou, asahi, Baolin Wang, iommu, Janne Grunau,
	Jernej Skrabec, Joerg Roedel, Jean-Philippe Brucker,
	linux-arm-kernel, linux-mediatek, linux-riscv, linux-sunxi,
	Matthias Brugger, Neal Gompa, Orson Zhai, Palmer Dabbelt,
	Paul Walmsley, Samuel Holland, Sven Peter, virtualization,
	Chen-Yu Tsai, Will Deacon, Yong Wu, Chunyan Zhang
  Cc: Lu Baolu, Janusz Krzysztofik, Joerg Roedel, Jon Hunter, patches,
	Pranjal Shrivastava, Samiullah Khawaja, stable, Vasant Hegde
In-Reply-To: <0-v2-b24668f107b2+11bbe-iommu_gather_always_jgg@nvidia.com>

On 2026-04-02 3:25 pm, Jason Gunthorpe wrote:
> The fixed commit assumed that the gather would always be populated if an
> iotlb_sync was required.
> 
> arm-smmu-v3, amd, VT-d, riscv, s390, and mtk all use information from the
> gather during their iotlb_sync() and this approach works for them.
> 
> However, arm-smmu, qcom_iommu, ipmmu-vmsa, sun50i, sprd, virtio, and
> apple-dart all ignore the gather during their iotlb_sync(). They mostly
> issue a full flush.
> 
> Unfortunately the latter set of drivers often don't bother to add anything
> to the gather since they don't intend on using it. Since the core code now
> blocks gathers that were never filled, this caused those drivers to stop
> getting their iotlb_sync() calls and breaks them.
> 
> Since it is impossible to tell the difference between gathers that are
> empty because there is nothing to do and gathers that are empty because
> they are not used, fill in the gathers for the missing cases.
> 
> mtk uses io-pgtable-arm-v7s but added the range to the gather in the unmap
> callback. Move this into the io-pgtable-arm-v7s unmap itself. That will
> fix all the armv7 using drivers (arm-smmu, qcom_iommu, ipmmu-vmsa).
> 
> io-pgtable-arm needs to accommodate drivers like arm-smmu that don't want
> to use the gather by just adding a simple range, and drivers like SMMUv3
> that need to use gather->pgsize and also have a disjoint check. Move
> SMMUv3 to a new tlb_add_range() op which replaces calling
> iommu_iotlb_gather_add_page() in a loop with a single call to update the
> gather with the range and required pgsize.
> 
> iommu_iotlb_gather_add_page() is repurposed since nothing but SMMUv3 uses it
> now that amd, VT-d and riscv are using iommupt.
> 
> Add a trivial gather population to io-pgtable-dart.
> 
> Add trivial populations to sprd, sun50i and virtio-iommu in their unmap
> functions.
> 
> Fixes: 90c5def10bea ("iommu: Do not call drivers for empty gathers")
> Reported-by: Jon Hunter <jonathanh@nvidia.com>
> Closes: https://lore.kernel.org/r/8800a38b-8515-4bbe-af15-0dae81274bf7@nvidia.com
> Tested-by: Jon Hunter <jonathanh@nvidia.com>
> Acked-by: Pranjal Shrivastava <praan@google.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 ++++++-----
>   drivers/iommu/io-pgtable-arm-v7s.c          |  4 ++++
>   drivers/iommu/io-pgtable-arm.c              | 19 ++++++++++++++++---
>   drivers/iommu/io-pgtable-dart.c             |  3 +++
>   drivers/iommu/mtk_iommu.c                   |  1 -
>   drivers/iommu/sprd-iommu.c                  |  1 +
>   drivers/iommu/sun50i-iommu.c                |  1 +
>   drivers/iommu/virtio-iommu.c                |  2 ++
>   include/linux/io-pgtable.h                  |  3 +++
>   include/linux/iommu.h                       | 19 ++++++++++---------
>   10 files changed, 46 insertions(+), 18 deletions(-)
> 
> v2:
>   - Add missed hunk for io-pgtable-armv7
>   - Revise the commit message to fix the miss about smmuv3's gather flow
>   - Make smmuv3 push its gather with a range instead of per-page
> v1: https://patch.msgid.link/r/0-v1-664d3acaabb9+78b-iommu_gather_always_jgg@nvidia.com
> 
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index e8d7dbe495f030..97e78a351cf35b 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2775,14 +2775,15 @@ void arm_smmu_domain_inv_range(struct arm_smmu_domain *smmu_domain,
>   	rcu_read_unlock();
>   }
>   
> -static void arm_smmu_tlb_inv_page_nosync(struct iommu_iotlb_gather *gather,
> -					 unsigned long iova, size_t granule,
> -					 void *cookie)
> +static void arm_smmu_tlb_inv_range_nosync(struct iommu_iotlb_gather *gather,
> +					  unsigned long iova, size_t size,
> +					  size_t granule, void *cookie)
>   {
>   	struct arm_smmu_domain *smmu_domain = cookie;
>   	struct iommu_domain *domain = &smmu_domain->domain;
>   
> -	iommu_iotlb_gather_add_page(domain, gather, iova, granule);
> +	iommu_iotlb_gather_add_range_pgsize(domain, gather, iova, size,
> +					    granule);
>   }
>   
>   static void arm_smmu_tlb_inv_walk(unsigned long iova, size_t size,
> @@ -2796,7 +2797,7 @@ static void arm_smmu_tlb_inv_walk(unsigned long iova, size_t size,
>   static const struct iommu_flush_ops arm_smmu_flush_ops = {
>   	.tlb_flush_all	= arm_smmu_tlb_inv_context,
>   	.tlb_flush_walk = arm_smmu_tlb_inv_walk,
> -	.tlb_add_page	= arm_smmu_tlb_inv_page_nosync,
> +	.tlb_add_range	= arm_smmu_tlb_inv_range_nosync,
>   };
>   
>   static bool arm_smmu_dbm_capable(struct arm_smmu_device *smmu)
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> index 40e33257d3c2c5..87292a7f094687 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -596,6 +596,10 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable *data,
>   
>   		__arm_v7s_set_pte(ptep, 0, num_entries, &iop->cfg);
>   
> +		if (!iommu_iotlb_gather_queued(gather))
> +			iommu_iotlb_gather_add_range(gather, iova,
> +						     num_entries * blk_size);
> +
>   		for (i = 0; i < num_entries; i++) {
>   			if (ARM_V7S_PTE_IS_TABLE(pte[i], lvl)) {
>   				/* Also flush any partial walks */
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 0208e5897c299a..d51531330f8dea 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -666,9 +666,22 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>   		/* Clear the remaining entries */
>   		__arm_lpae_clear_pte(ptep, &iop->cfg, i);
>   
> -		if (gather && !iommu_iotlb_gather_queued(gather))
> -			for (int j = 0; j < i; j++)
> -				io_pgtable_tlb_add_page(iop, gather, iova + j * size, size);
> +		if (gather && !iommu_iotlb_gather_queued(gather)) {
> +			if (iop->cfg.tlb && iop->cfg.tlb->tlb_add_range) {
> +				iop->cfg.tlb->tlb_add_range(gather, iova,
> +							    i * size, size,
> +							    iop->cookie);
> +
> +			} else {
> +				iommu_iotlb_gather_add_range(gather, iova,
> +							     i * size);
> +
> +				for (int j = 0; j < i; j++)
> +					io_pgtable_tlb_add_page(iop, gather,
> +								iova + j * size,
> +								size);
> +			}
> +		}

NAK, this is insane.

If you'd rather make gathers mandatory for all drivers than fix it in 
the core code, then for goodness' sake just add the trivial one-liner to 
the handful of .unmap_pages implementations which need it, consistently 
with those which already exist, plus the ones you're also adding here 
anyway. The entire diff should still be smaller than this absurd hunk 
alone...
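
For illustration, the "trivial one-liner" approach can be modelled in
userspace roughly as below. This is only a sketch under stated assumptions:
struct gather_model and every gather_model_*/model_* name are hypothetical
stand-ins, not kernel API, and the emptiness test is a guess at the kind of
check the core performs, not the actual code from the fixed commit.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct iommu_iotlb_gather. */
struct gather_model {
	unsigned long start;
	unsigned long end;	/* inclusive */
};

/* An untouched gather covers nothing: start is above end. */
static void gather_model_init(struct gather_model *g)
{
	g->start = ULONG_MAX;
	g->end = 0;
}

/* Widen the gathered range to cover [iova, iova + size - 1]. */
static void gather_model_add_range(struct gather_model *g,
				   unsigned long iova, size_t size)
{
	unsigned long end;

	assert(size > 0);
	end = iova + size - 1;

	if (g->start > iova)
		g->start = iova;
	if (g->end < end)
		g->end = end;
}

/* Assumed shape of a core-side "was anything gathered?" test. */
static bool gather_model_is_empty(const struct gather_model *g)
{
	return g->start > g->end;
}

/* Hypothetical driver unmap whose sync does a full flush anyway. */
static size_t model_unmap_pages(unsigned long iova, size_t pgsize,
				size_t pgcount, struct gather_model *g)
{
	size_t unmapped = pgsize * pgcount;	/* pretend the PTEs were cleared */

	/* The one-liner: record the range so the gather is no longer empty. */
	gather_model_add_range(g, iova, unmapped);

	return unmapped;
}
```

In this model a driver that never reads the gather still leaves it
non-empty after a successful unmap, so a core that skips empty gathers
would keep calling its sync.
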

Thanks,
Robin.

>   		return i * size;
>   	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
> diff --git a/drivers/iommu/io-pgtable-dart.c b/drivers/iommu/io-pgtable-dart.c
> index cbc5d6aa2daa23..75d699dc28e7b0 100644
> --- a/drivers/iommu/io-pgtable-dart.c
> +++ b/drivers/iommu/io-pgtable-dart.c
> @@ -330,6 +330,9 @@ static size_t dart_unmap_pages(struct io_pgtable_ops *ops, unsigned long iova,
>   		i++;
>   	}
>   
> +	if (i && !iommu_iotlb_gather_queued(gather))
> +		iommu_iotlb_gather_add_range(gather, iova, i * pgsize);
> +
>   	return i * pgsize;
>   }
>   
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 2be990c108de2b..a2f80a92f51f2c 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -828,7 +828,6 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
>   {
>   	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
>   
> -	iommu_iotlb_gather_add_range(gather, iova, pgsize * pgcount);
>   	return dom->iop->unmap_pages(dom->iop, iova, pgsize, pgcount, gather);
>   }
>   
> diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
> index c1a34445d244fb..893ea67d322644 100644
> --- a/drivers/iommu/sprd-iommu.c
> +++ b/drivers/iommu/sprd-iommu.c
> @@ -340,6 +340,7 @@ static size_t sprd_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
>   	spin_lock_irqsave(&dom->pgtlock, flags);
>   	memset(pgt_base_iova, 0, pgcount * sizeof(u32));
>   	spin_unlock_irqrestore(&dom->pgtlock, flags);
> +	iommu_iotlb_gather_add_range(iotlb_gather, iova, size);
>   
>   	return size;
>   }
> diff --git a/drivers/iommu/sun50i-iommu.c b/drivers/iommu/sun50i-iommu.c
> index be3f1ce696ba29..b9aa4bbc82acad 100644
> --- a/drivers/iommu/sun50i-iommu.c
> +++ b/drivers/iommu/sun50i-iommu.c
> @@ -655,6 +655,7 @@ static size_t sun50i_iommu_unmap(struct iommu_domain *domain, unsigned long iova
>   
>   	memset(pte_addr, 0, sizeof(*pte_addr));
>   	sun50i_table_flush(sun50i_domain, pte_addr, 1);
> +	iommu_iotlb_gather_add_range(gather, iova, SZ_4K);
>   
>   	return SZ_4K;
>   }
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 587fc13197f122..5865b8f6c6e67a 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -897,6 +897,8 @@ static size_t viommu_unmap_pages(struct iommu_domain *domain, unsigned long iova
>   	if (unmapped < size)
>   		return 0;
>   
> +	iommu_iotlb_gather_add_range(gather, iova, unmapped);
> +
>   	/* Device already removed all mappings after detach. */
>   	if (!vdomain->nr_endpoints)
>   		return unmapped;
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index e19872e37e067f..b109c95b5ff53d 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -42,6 +42,9 @@ struct iommu_flush_ops {
>   			       void *cookie);
>   	void (*tlb_add_page)(struct iommu_iotlb_gather *gather,
>   			     unsigned long iova, size_t granule, void *cookie);
> +	void (*tlb_add_range)(struct iommu_iotlb_gather *gather,
> +			      unsigned long iova, size_t size, size_t granule,
> +			      void *cookie);
>   };
>   
>   /**
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index e587d4ac4d3310..d8fcdb61e44c42 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -1034,30 +1034,31 @@ static inline void iommu_iotlb_gather_add_range(struct iommu_iotlb_gather *gathe
>   }
>   
>   /**
> - * iommu_iotlb_gather_add_page - Gather for page-based TLB invalidation
> + * iommu_iotlb_gather_add_range_pgsize - Include pgsize in the gather
>    * @domain: IOMMU domain to be invalidated
>    * @gather: TLB gather data
>    * @iova: start of page to invalidate
>    * @size: size of page to invalidate
> + * @pgsize: page granularity of the invalidation
>    *
> - * Helper for IOMMU drivers to build invalidation commands based on individual
> - * pages, or with page size/table level hints which cannot be gathered if they
> - * differ.
> + * Helper for IOMMU drivers to build invalidation commands when using the pgsize
> + * hint. Unlike iommu_iotlb_gather_add_range() this also flushes if the range is
> + * disjoint.
>    */
> -static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
> -					       struct iommu_iotlb_gather *gather,
> -					       unsigned long iova, size_t size)
> +static inline void iommu_iotlb_gather_add_range_pgsize(
> +	struct iommu_domain *domain, struct iommu_iotlb_gather *gather,
> +	unsigned long iova, size_t size, size_t pgsize)
>   {
>   	/*
>   	 * If the new page is disjoint from the current range or is mapped at
>   	 * a different granularity, then sync the TLB so that the gather
>   	 * structure can be rewritten.
>   	 */
> -	if ((gather->pgsize && gather->pgsize != size) ||
> +	if ((gather->pgsize && gather->pgsize != pgsize) ||
>   	    iommu_iotlb_gather_is_disjoint(gather, iova, size))
>   		iommu_iotlb_sync(domain, gather);
>   
> -	gather->pgsize = size;
> +	gather->pgsize = pgsize;
>   	iommu_iotlb_gather_add_range(gather, iova, size);
>   }
>   
> 
> base-commit: 23f3682fd3605da81b90738ad3d2a30f18c46e98



^ permalink raw reply

* Re: [PATCH 2/3] drm: lcdif: Use dedicated set/clr registers for polarity/edge
From: Paul Kocialkowski @ 2026-04-02 16:08 UTC (permalink / raw)
  To: Lucas Stach
  Cc: dri-devel, imx, linux-arm-kernel, linux-kernel, Marek Vasut,
	Stefan Agner, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, Simona Vetter, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Krzysztof Hałasa,
	Marco Felsch, Liu Ying
In-Reply-To: <3b221a4997c367df716bf107a7e98ce10cd6860b.camel@pengutronix.de>

[-- Attachment #1: Type: text/plain, Size: 2792 bytes --]

Hi Lucas,

On Tue 31 Mar 26, 18:04, Lucas Stach wrote:
> Am Dienstag, dem 31.03.2026 um 17:17 +0200 schrieb Paul Kocialkowski:
> > Le Tue 31 Mar 26, 11:09, Lucas Stach a écrit :
> > > Am Dienstag, dem 31.03.2026 um 00:46 +0200 schrieb Paul Kocialkowski:
> > > > The lcdif v3 hardware comes with dedicated registers to set and clear
> > > > polarity bits in the CTRL register. It is unclear if there is a
> > > > difference with writing to the CTRL register directly.
> > > > 
> > > > Follow the NXP BSP reference by using these registers, in case there is
> > > > a subtle difference caused by using them.
> > > > 
> > > I don't really like that patch, as it blows up what is currently a
> > > single register access to three separate ones. If there is no clear
> > > benefit (as in it has been shown to fix any issue), I would prefer this
> > > code to stay as-is.
> > 
> > Well I guess the cost of a few writes vs a single one is rather
> > negligible.
> > 
> Yea, a few writes don't really hurt. But I don't think there is a very
> good reason to set this register this way, see below.
> 
> > I'm rather worried that there might be an undocumented
> > reason why these registers exist in the first place and why they are
> > used in the BSP.
> > 
> > But yes this is only speculation and I could not witness any actual
> > issue. My setup (lcdif3 with hdmi) uses all positive polarities which is
> > the default state, so not a good way to check.
> > 
> > It would be great if somebody from NXP could confirm whether this is
> > needed or not. In the meantime I guess we can drop the patch. It'll stay
> > on the list in case someone has polarity issues later :)
> 
> The separate clr/set registers are a rather common design feature found on
> Freescale/NXP IP blocks from the MXS era. On some of those IP blocks
> _all_ registers are presented as a base/clr/set triplet in the
> registers space.

Ah yes I vaguely remember seeing this with units used in the imx6.

> As far as I can tell they are mostly useful when you
> want to set/clear individual bits from a register without having to
> remember or execute a readback of the current state.
> 
> In cases like the one changed in this patch, where the full register
> state is set in one go, directly writing to the base register is the
> right thing to do.

Okay then if you're confident these registers are just here for
convenience purposes we can continue writing the full register value.

I'll drop this patch in the next iteration.
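
For reference, the base/set/clr behaviour described above can be modelled
as follows. This is only a sketch, assuming the usual semantics where a
write to the set alias ORs bits in and a write to the clr alias clears
them; struct reg_model and the function names are hypothetical and are not
taken from the lcdif driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one register exposed as a base/set/clr triplet. */
enum reg_alias { REG_BASE, REG_SET, REG_CLR };

struct reg_model {
	uint32_t value;
};

static void reg_model_write(struct reg_model *r, enum reg_alias alias,
			    uint32_t v)
{
	switch (alias) {
	case REG_BASE:	/* plain write: replaces the whole register */
		r->value = v;
		break;
	case REG_SET:	/* sets the bits that are 1 in v, leaves the rest */
		r->value |= v;
		break;
	case REG_CLR:	/* clears the bits that are 1 in v, leaves the rest */
		r->value &= ~v;
		break;
	}
}

/* Update only the bits under mask, without reading the register back. */
static uint32_t program_via_set_clr(struct reg_model *r, uint32_t mask,
				    uint32_t bits)
{
	assert((bits & ~mask) == 0);
	reg_model_write(r, REG_CLR, mask);
	reg_model_write(r, REG_SET, bits);
	return r->value;
}

/* Program the full register state with a single write to the base. */
static uint32_t program_via_base(struct reg_model *r, uint32_t full)
{
	reg_model_write(r, REG_BASE, full);
	return r->value;
}
```

When the complete target value is already known, both paths end in the
same state in this model, and the base write gets there in one access.
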

All the best,

Paul

-- 
Paul Kocialkowski,

Independent contractor - sys-base - https://www.sys-base.io/
Free software developer - https://www.paulk.fr/

Expert in multimedia, graphics and embedded hardware support with Linux.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH] arm64: dts: imx8mp-phyboard-pollux: Add HDMI support
From: Paul Kocialkowski @ 2026-04-02 16:12 UTC (permalink / raw)
  To: Yannic Moog
  Cc: devicetree@vger.kernel.org, imx@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Frank Li, Sascha Hauer, Pengutronix Kernel Team,
	Fabio Estevam
In-Reply-To: <573e4ebd9679517086a6b4acb162d72463429f35.camel@phytec.de>

[-- Attachment #1: Type: text/plain, Size: 3212 bytes --]

Hi Yannic,

On Wed 01 Apr 26, 12:06, Yannic Moog wrote:
> On Tue, 2026-03-31 at 00:37 +0200, Paul Kocialkowski wrote:
> > The PHYTEC phyBOARD Pollux comes with a HDMI port on the base board.
> > Add the required device-tree nodes to enable support for it.
> 
> "Only" video is supported, but it does work. You could add that to the description since audio is
> also supported upstream and part of HDMI.

You're right, I didn't think about the audio part.
I'll add it in the next iteration.

All the best,

Paul
 
> Yannic
> 
> > 
> > Signed-off-by: Paul Kocialkowski <paulk@sys-base.io>
> 
> Reviewed-by: Yannic Moog <y.moog@phytec.de>
> Tested-by: Yannic Moog <y.moog@phytec.de>
> 
> > ---
> >  .../freescale/imx8mp-phyboard-pollux-rdk.dts  | 47 +++++++++++++++++++
> >  1 file changed, 47 insertions(+)
> > 
> > diff --git a/arch/arm64/boot/dts/freescale/imx8mp-phyboard-pollux-rdk.dts b/arch/arm64/boot/dts/freescale/imx8mp-phyboard-pollux-rdk.dts
> > index 0fe52c73fc8f..0d52f29813f1 100644
> > --- a/arch/arm64/boot/dts/freescale/imx8mp-phyboard-pollux-rdk.dts
> > +++ b/arch/arm64/boot/dts/freescale/imx8mp-phyboard-pollux-rdk.dts
> > @@ -38,6 +38,18 @@ fan0: fan {
> >  		#cooling-cells = <2>;
> >  	};
> >  
> > +	hdmi-connector {
> > +		compatible = "hdmi-connector";
> > +		label = "hdmi";
> > +		type = "a";
> > +
> > +		port {
> > +			hdmi_connector_in: endpoint {
> > +				remote-endpoint = <&hdmi_tx_out>;
> > +			};
> > +		};
> > +	};
> > +
> >  	panel_lvds1: panel-lvds1 {
> >  		/* compatible panel in overlay */
> >  		backlight = <&backlight_lvds1>;
> > @@ -201,6 +213,28 @@ &flexcan2 {
> >  	status = "okay";
> >  };
> >  
> > +&hdmi_pvi {
> > +	status = "okay";
> > +};
> > +
> > +&hdmi_tx {
> > +	pinctrl-names = "default";
> > +	pinctrl-0 = <&pinctrl_hdmi>;
> > +	status = "okay";
> > +
> > +	ports {
> > +		port@1 {
> > +			hdmi_tx_out: endpoint {
> > +				remote-endpoint = <&hdmi_connector_in>;
> > +			};
> > +		};
> > +	};
> > +};
> > +
> > +&hdmi_tx_phy {
> > +	status = "okay";
> > +};
> > +
> >  &i2c2 {
> >  	clock-frequency = <400000>;
> >  	pinctrl-names = "default", "gpio";
> > @@ -244,6 +278,10 @@ &i2c3 {
> >  	scl-gpios = <&gpio5 19 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
> >  };
> >  
> > +&lcdif3 {
> > +	status = "okay";
> > +};
> > +
> >  &ldb_lvds_ch1 {
> >  	remote-endpoint = <&panel1_in>;
> >  };
> > @@ -444,6 +482,15 @@ MX8MP_IOMUXC_SAI5_RXD0__GPIO3_IO21	0x154
> >  		>;
> >  	};
> >  
> > +	pinctrl_hdmi: hdmigrp {
> > +		fsl,pins = <
> > +			MX8MP_IOMUXC_HDMI_DDC_SCL__HDMIMIX_HDMI_SCL			0x1c3
> > +			MX8MP_IOMUXC_HDMI_DDC_SDA__HDMIMIX_HDMI_SDA			0x1c3
> > +			MX8MP_IOMUXC_HDMI_HPD__HDMIMIX_HDMI_HPD				0x19
> > +			MX8MP_IOMUXC_HDMI_CEC__HDMIMIX_HDMI_CEC				0x19
> > +		>;
> > +	};
> > +
> >  	pinctrl_i2c2: i2c2grp {
> >  		fsl,pins = <
> >  			MX8MP_IOMUXC_I2C2_SCL__I2C2_SCL		0x400001c2

-- 
Paul Kocialkowski,

Independent contractor - sys-base - https://www.sys-base.io/
Free software developer - https://www.paulk.fr/

Expert in multimedia, graphics and embedded hardware support with Linux.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH v5 2/4] software node: remove software_node_exit()
From: Andy Shevchenko @ 2026-04-02 16:19 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
	Daniel Scally, Heikki Krogerus, Sakari Ailus, Aaro Koskinen,
	Janusz Krzysztofik, Tony Lindgren, Russell King, Dmitry Torokhov,
	Kevin Hilman, Arnd Bergmann, brgl, driver-core, linux-kernel,
	linux-acpi, linux-arm-kernel, linux-omap
In-Reply-To: <20260402-nokia770-gpio-swnodes-v5-2-d730db3dd299@oss.qualcomm.com>

On Thu, Apr 02, 2026 at 04:15:03PM +0200, Bartosz Golaszewski wrote:
> software_node_exit() is an __exitcall() in a built-in compilation unit
> so effectively dead code. Remove it.

a dead code

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

-- 
With Best Regards,
Andy Shevchenko




^ permalink raw reply

