public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/3] drm/panthor: Coherency related fixes
@ 2024-10-24 14:54 Akash Goel
  2024-10-24 14:54 ` [PATCH 1/3] drm/panthor: Update memattr programing to align with GPU spec Akash Goel
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Akash Goel @ 2024-10-24 14:54 UTC
  To: boris.brezillon, liviu.dudau, steven.price
  Cc: dri-devel, linux-kernel, mihail.atanassov, ketil.johnsen,
	florent.tomasin, maarten.lankhorst, mripard, tzimmermann, airlied,
	daniel, nd, Akash Goel

This patch series contains 3 cache coherency related fixes for the
Panthor driver.
- The first fix, regarding the Inner-shareability, is mandatory to
  ensure things work on all platforms (including Juno FPGA) when
  no_coherency protocol is selected.
- The second fix regarding the coherency feature/enable register is
  required to avoid potential misalignment on certain platforms.
- The third fix, regarding the potential overwrite of buffer objects,
  has been prepared speculatively & it may not be required in practice.
  Please provide feedback on it.

Akash Goel (3):
  drm/panthor: Update memattr programing to align with GPU spec
  drm/panthor: Explicitly set the coherency mode
  drm/panthor: Prevent potential overwrite of buffer objects

 drivers/gpu/drm/panthor/panthor_device.c | 22 ++++++++++++++++++-
 drivers/gpu/drm/panthor/panthor_gem.h    | 10 +++++++++
 drivers/gpu/drm/panthor/panthor_gpu.c    |  9 ++++++++
 drivers/gpu/drm/panthor/panthor_mmu.c    | 28 +++++++++++++++++-------
 4 files changed, 60 insertions(+), 9 deletions(-)

-- 
2.25.1



* [PATCH 1/3] drm/panthor: Update memattr programing to align with GPU spec
  2024-10-24 14:54 [PATCH 0/3] drm/panthor: Coherency related fixes Akash Goel
@ 2024-10-24 14:54 ` Akash Goel
  2024-10-24 15:49   ` Boris Brezillon
  2024-10-25 10:03   ` Steven Price
  2024-10-24 14:54 ` [PATCH 2/3] drm/panthor: Explicitly set the coherency mode Akash Goel
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 17+ messages in thread
From: Akash Goel @ 2024-10-24 14:54 UTC
  To: boris.brezillon, liviu.dudau, steven.price
  Cc: dri-devel, linux-kernel, mihail.atanassov, ketil.johnsen,
	florent.tomasin, maarten.lankhorst, mripard, tzimmermann, airlied,
	daniel, nd, Akash Goel

Mali GPU Arch spec forbids the GPU PTEs to indicate Inner or Outer
shareability when no_coherency protocol is selected. Doing so results in
unexpected or undesired snooping of the CPU caches on some platforms,
such as Juno FPGA, causing functional issues. For example the boot of
MCU firmware fails as GPU ends up reading stale data for the FW memory
pages from the CPU's cache. The FW memory pages are initialized with
uncached mapping when the device is not reported to be dma-coherent.
The shareability bits are set to inner-shareable when IOMMU_CACHE flag
is passed to map_pages() callback and IOMMU_CACHE flag is passed by
Panthor driver when memory needs to be mapped as cached on the GPU side.

IOMMU_CACHE seems to imply cache coherent and is probably not fit for
purpose for the memory that is mapped as cached on GPU side but doesn't
need to remain coherent with the CPU.

This commit updates the programming of MEMATTR register to use
MIDGARD_INNER instead of CPU_INNER when coherency is disabled. That way
the inner-shareability specified in the GPU PTEs would map to Mali's
internal-shareable mode, which is always supported by the GPU regardless
of the coherency protocol and is required by the Userspace driver to
ensure coherency between the shader cores.

Signed-off-by: Akash Goel <akash.goel@arm.com>
---
 drivers/gpu/drm/panthor/panthor_mmu.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index f3ee5d2753f1..f522a116c1b1 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -1927,7 +1927,7 @@ struct panthor_heap_pool *panthor_vm_get_heap_pool(struct panthor_vm *vm, bool c
 	return pool;
 }
 
-static u64 mair_to_memattr(u64 mair)
+static u64 mair_to_memattr(u64 mair, bool coherent)
 {
 	u64 memattr = 0;
 	u32 i;
@@ -1946,14 +1946,21 @@ static u64 mair_to_memattr(u64 mair)
 				   AS_MEMATTR_AARCH64_SH_MIDGARD_INNER |
 				   AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(false, false);
 		} else {
-			/* Use SH_CPU_INNER mode so SH_IS, which is used when
-			 * IOMMU_CACHE is set, actually maps to the standard
-			 * definition of inner-shareable and not Mali's
-			 * internal-shareable mode.
-			 */
 			out_attr = AS_MEMATTR_AARCH64_INNER_OUTER_WB |
-				   AS_MEMATTR_AARCH64_SH_CPU_INNER |
 				   AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(inner & 1, inner & 2);
+			/* Use SH_MIDGARD_INNER mode when device isn't coherent,
+			 * so SH_IS, which is used when IOMMU_CACHE is set, maps
+			 * to Mali's internal-shareable mode. As per the Mali
+			 * Spec, inner and outer-shareable modes aren't allowed
+			 * for WB memory when coherency is disabled.
+			 * Use SH_CPU_INNER mode when coherency is enabled, so
+			 * that SH_IS actually maps to the standard definition of
+			 * inner-shareable.
+			 */
+			if (!coherent)
+				out_attr |= AS_MEMATTR_AARCH64_SH_MIDGARD_INNER;
+			else
+				out_attr |= AS_MEMATTR_AARCH64_SH_CPU_INNER;
 		}
 
 		memattr |= (u64)out_attr << (8 * i);
@@ -2325,7 +2332,7 @@ panthor_vm_create(struct panthor_device *ptdev, bool for_mcu,
 		goto err_sched_fini;
 
 	mair = io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg.arm_lpae_s1_cfg.mair;
-	vm->memattr = mair_to_memattr(mair);
+	vm->memattr = mair_to_memattr(mair, ptdev->coherent);
 
 	mutex_lock(&ptdev->mmu->vm.lock);
 	list_add_tail(&vm->node, &ptdev->mmu->vm.list);
-- 
2.25.1
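
For readers following the thread, the shareability selection made by this
patch can be sketched in isolation as below. This is a minimal standalone
sketch, not the driver code: the DEMO_* constants and the demo_wb_attr()
and demo_memattr() helpers are hypothetical, and the bit values are
placeholders rather than the real AS_MEMATTR_AARCH64_* encodings from
panthor_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder encodings, chosen only for illustration. They are NOT the
 * real AS_MEMATTR_AARCH64_* values from panthor_regs.h.
 */
#define DEMO_SH_MIDGARD_INNER	(0u << 4)
#define DEMO_SH_CPU_INNER	(1u << 4)
#define DEMO_INNER_OUTER_WB	(2u << 6)

/* Hypothetical helper: build the attribute byte for a write-back slot. */
static uint8_t demo_wb_attr(bool coherent)
{
	uint8_t attr = DEMO_INNER_OUTER_WB;

	/* Without IO coherency, inner-shareable PTEs must resolve to Mali's
	 * internal-shareable mode so the GPU never snoops the CPU caches.
	 */
	attr |= coherent ? DEMO_SH_CPU_INNER : DEMO_SH_MIDGARD_INNER;
	return attr;
}

/* Hypothetical helper: pack the same attribute into 8 slots, one byte per
 * slot, using the same "<< (8 * i)" packing as mair_to_memattr().
 */
static uint64_t demo_memattr(bool coherent)
{
	uint64_t memattr = 0;
	unsigned int i;

	for (i = 0; i < 8; i++)
		memattr |= (uint64_t)demo_wb_attr(coherent) << (8 * i);

	return memattr;
}
```

The point of the sketch is that only the shareability field depends on the
coherency setting; the cacheability part of the attribute is unchanged.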



* [PATCH 2/3] drm/panthor: Explicitly set the coherency mode
  2024-10-24 14:54 [PATCH 0/3] drm/panthor: Coherency related fixes Akash Goel
  2024-10-24 14:54 ` [PATCH 1/3] drm/panthor: Update memattr programing to align with GPU spec Akash Goel
@ 2024-10-24 14:54 ` Akash Goel
  2024-10-24 15:51   ` Boris Brezillon
                     ` (2 more replies)
  2024-10-24 14:54 ` [PATCH 3/3] drm/panthor: Prevent potential overwrite of buffer objects Akash Goel
  2024-10-30 15:46 ` [PATCH 0/3] drm/panthor: Coherency related fixes Boris Brezillon
  3 siblings, 3 replies; 17+ messages in thread
From: Akash Goel @ 2024-10-24 14:54 UTC
  To: boris.brezillon, liviu.dudau, steven.price
  Cc: dri-devel, linux-kernel, mihail.atanassov, ketil.johnsen,
	florent.tomasin, maarten.lankhorst, mripard, tzimmermann, airlied,
	daniel, nd, Akash Goel

This commit fixes a potential misalignment between the value of the device
tree property "dma-coherent" and the default value of the COHERENCY_ENABLE
register.
The Panthor driver didn't explicitly program the COHERENCY_ENABLE register
with the desired coherency mode. The default value of the COHERENCY_ENABLE
register is implementation defined, so it may not always be aligned with
the "dma-coherent" property value.
The commit also checks the COHERENCY_FEATURES register to confirm that
the coherency protocol is actually supported.

Signed-off-by: Akash Goel <akash.goel@arm.com>
---
 drivers/gpu/drm/panthor/panthor_device.c | 22 +++++++++++++++++++++-
 drivers/gpu/drm/panthor/panthor_gpu.c    |  9 +++++++++
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
index 4082c8f2951d..984615f4ed27 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -22,6 +22,24 @@
 #include "panthor_regs.h"
 #include "panthor_sched.h"
 
+static int panthor_gpu_coherency_init(struct panthor_device *ptdev)
+{
+	ptdev->coherent = device_get_dma_attr(ptdev->base.dev) == DEV_DMA_COHERENT;
+
+	if (!ptdev->coherent)
+		return 0;
+
+	/* Check if the ACE-Lite coherency protocol is actually supported by the GPU.
+	 * ACE protocol has never been supported for command stream frontend GPUs.
+	 */
+	if ((gpu_read(ptdev, GPU_COHERENCY_FEATURES) &
+		      GPU_COHERENCY_PROT_BIT(ACE_LITE)))
+		return 0;
+
+	drm_err(&ptdev->base, "Coherency not supported by the device");
+	return -ENOTSUPP;
+}
+
 static int panthor_clk_init(struct panthor_device *ptdev)
 {
 	ptdev->clks.core = devm_clk_get(ptdev->base.dev, NULL);
@@ -156,7 +174,9 @@ int panthor_device_init(struct panthor_device *ptdev)
 	struct page *p;
 	int ret;
 
-	ptdev->coherent = device_get_dma_attr(ptdev->base.dev) == DEV_DMA_COHERENT;
+	ret = panthor_gpu_coherency_init(ptdev);
+	if (ret)
+		return ret;
 
 	init_completion(&ptdev->unplug.done);
 	ret = drmm_mutex_init(&ptdev->base, &ptdev->unplug.lock);
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index 5251d8764e7d..1e24f08a519a 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -77,6 +77,12 @@ static const struct panthor_model gpu_models[] = {
 	 GPU_IRQ_RESET_COMPLETED | \
 	 GPU_IRQ_CLEAN_CACHES_COMPLETED)
 
+static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
+{
+	gpu_write(ptdev, GPU_COHERENCY_PROTOCOL,
+		ptdev->coherent ? GPU_COHERENCY_PROT_BIT(ACE_LITE) : GPU_COHERENCY_NONE);
+}
+
 static void panthor_gpu_init_info(struct panthor_device *ptdev)
 {
 	const struct panthor_model *model;
@@ -365,6 +371,9 @@ int panthor_gpu_l2_power_on(struct panthor_device *ptdev)
 			      hweight64(ptdev->gpu_info.shader_present));
 	}
 
+	/* Set the desired coherency mode before the power up of L2 */
+	panthor_gpu_coherency_set(ptdev);
+
 	return panthor_gpu_power_on(ptdev, L2, 1, 20000);
 }
 
-- 
2.25.1
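
The probe-time decision made by this patch can be sketched as a pure
function, which may help when reasoning about the three possible outcomes.
This is only an illustrative sketch under stated assumptions: the DEMO_*
names and demo_coherency_init() are hypothetical, the protocol numbering is
a placeholder, and the sketch returns -EOPNOTSUPP because the kernel-internal
ENOTSUPP used in the patch is not available in a userspace <errno.h>.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder protocol numbering for illustration only; consult
 * panthor_regs.h and the GPU documentation for the real values.
 */
#define DEMO_PROT_ACE_LITE	1u
#define DEMO_PROT_BIT(p)	(1u << (p))

/* Hypothetical helper mirroring the probe-time check: a device described
 * as dma-coherent is accepted only if the features word advertises
 * ACE-Lite. Returns 0 on success and a negative errno otherwise.
 */
static int demo_coherency_init(bool dma_coherent, uint32_t features,
			       bool *coherent)
{
	*coherent = dma_coherent;

	/* Non-coherent devices need no hardware support check. */
	if (!dma_coherent)
		return 0;

	if (features & DEMO_PROT_BIT(DEMO_PROT_ACE_LITE))
		return 0;

	return -EOPNOTSUPP;
}
```

A device that is not marked dma-coherent always succeeds, so the failure
path is reached only when the device tree and the hardware disagree.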



* [PATCH 3/3] drm/panthor: Prevent potential overwrite of buffer objects
  2024-10-24 14:54 [PATCH 0/3] drm/panthor: Coherency related fixes Akash Goel
  2024-10-24 14:54 ` [PATCH 1/3] drm/panthor: Update memattr programing to align with GPU spec Akash Goel
  2024-10-24 14:54 ` [PATCH 2/3] drm/panthor: Explicitly set the coherency mode Akash Goel
@ 2024-10-24 14:54 ` Akash Goel
  2024-10-24 15:39   ` Boris Brezillon
  2024-10-30 15:46 ` [PATCH 0/3] drm/panthor: Coherency related fixes Boris Brezillon
  3 siblings, 1 reply; 17+ messages in thread
From: Akash Goel @ 2024-10-24 14:54 UTC
  To: boris.brezillon, liviu.dudau, steven.price
  Cc: dri-devel, linux-kernel, mihail.atanassov, ketil.johnsen,
	florent.tomasin, maarten.lankhorst, mripard, tzimmermann, airlied,
	daniel, nd, Akash Goel

All CPU mappings are forced to be uncached for Panthor buffer objects when
system (IO) coherency is disabled. Physical backing for Panthor BOs is
allocated by shmem, which also clears the pages after allocation. But
no explicit cache flush is done after the pages are cleared.
So dirty cachelines for the BOs could still be in the CPU cache when the
BOs are accessed from the CPU side through an uncached CPU mapping, and
the later eviction of those cachelines would overwrite the data of the BOs.

This commit tries to avoid the potential overwrite scenario.

Signed-off-by: Akash Goel <akash.goel@arm.com>
---
 drivers/gpu/drm/panthor/panthor_gem.h | 10 ++++++++++
 drivers/gpu/drm/panthor/panthor_mmu.c |  5 +++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h
index e43021cf6d45..4b0f43f1edf1 100644
--- a/drivers/gpu/drm/panthor/panthor_gem.h
+++ b/drivers/gpu/drm/panthor/panthor_gem.h
@@ -46,6 +46,16 @@ struct panthor_gem_object {
 
 	/** @flags: Combination of drm_panthor_bo_flags flags. */
 	u32 flags;
+
+	/**
+	 * @cleaned: The buffer object pages have been cleaned.
+	 *
+	 * There could be dirty CPU cachelines for the pages of buffer object
+	 * after allocation, as shmem will zero out the pages. The cachelines
+	 * need to be cleaned if the pages are going to be accessed with an
+	 * uncached CPU mapping.
+	 */
+	bool cleaned;
 };
 
 /**
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index f522a116c1b1..d8cc9e7d064e 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -1249,6 +1249,11 @@ static int panthor_vm_prepare_map_op_ctx(struct panthor_vm_op_ctx *op_ctx,
 
 	op_ctx->map.sgt = sgt;
 
+	if (bo->base.map_wc && !bo->cleaned) {
+		dma_sync_sgtable_for_device(vm->ptdev->base.dev, sgt, DMA_TO_DEVICE);
+		bo->cleaned = true;
+	}
+
 	preallocated_vm_bo = drm_gpuvm_bo_create(&vm->base, &bo->base.base);
 	if (!preallocated_vm_bo) {
 		if (!bo->base.base.import_attach)
-- 
2.25.1
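
The "clean once" behaviour added by this patch can be sketched outside the
driver as below. This is a hedged illustration only: struct demo_bo,
demo_cache_clean() and demo_prepare_map() are hypothetical stand-ins, and
the real cache maintenance done by dma_sync_sgtable_for_device() is replaced
by a call counter.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object; only the two flags that
 * matter for the example are modelled.
 */
struct demo_bo {
	bool map_wc;	/* CPU mappings are uncached/write-combined */
	bool cleaned;	/* cache clean already performed */
	int sync_calls;	/* counts simulated cache maintenance operations */
};

/* Stand-in for dma_sync_sgtable_for_device(); it only records the call. */
static void demo_cache_clean(struct demo_bo *bo)
{
	bo->sync_calls++;
}

/* Clean the CPU cache at most once, and only for BOs mapped uncached. */
static void demo_prepare_map(struct demo_bo *bo)
{
	if (bo->map_wc && !bo->cleaned) {
		demo_cache_clean(bo);
		bo->cleaned = true;
	}
}
```

Mapping the same BO several times triggers the clean only on the first
call, and BOs with cached CPU mappings are never touched.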



* Re: [PATCH 3/3] drm/panthor: Prevent potential overwrite of buffer objects
  2024-10-24 14:54 ` [PATCH 3/3] drm/panthor: Prevent potential overwrite of buffer objects Akash Goel
@ 2024-10-24 15:39   ` Boris Brezillon
  2024-10-31 21:42     ` Akash Goel
  0 siblings, 1 reply; 17+ messages in thread
From: Boris Brezillon @ 2024-10-24 15:39 UTC
  To: Akash Goel
  Cc: liviu.dudau, steven.price, dri-devel, linux-kernel,
	mihail.atanassov, ketil.johnsen, florent.tomasin,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel, nd

On Thu, 24 Oct 2024 15:54:32 +0100
Akash Goel <akash.goel@arm.com> wrote:

> All CPU mappings are forced to be uncached for Panthor buffer objects when
> system (IO) coherency is disabled. Physical backing for Panthor BOs is
> allocated by shmem, which also clears the pages after allocation. But
> no explicit cache flush is done after the pages are cleared.
> So dirty cachelines for the BOs could still be in the CPU cache when the
> BOs are accessed from the CPU side through an uncached CPU mapping, and
> the later eviction of those cachelines would overwrite the data of the BOs.

Hm, this looks like something that should be handled at the
drm_gem_shmem level when drm_gem_shmem_object::map_wc=true, as I
suspect other drivers can hit the same issue (I'm thinking of panfrost
and lima, but there might be others).

> 
> This commit tries to avoid the potential overwrite scenario.
> 
> Signed-off-by: Akash Goel <akash.goel@arm.com>
> ---
>  drivers/gpu/drm/panthor/panthor_gem.h | 10 ++++++++++
>  drivers/gpu/drm/panthor/panthor_mmu.c |  5 +++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h
> index e43021cf6d45..4b0f43f1edf1 100644
> --- a/drivers/gpu/drm/panthor/panthor_gem.h
> +++ b/drivers/gpu/drm/panthor/panthor_gem.h
> @@ -46,6 +46,16 @@ struct panthor_gem_object {
>  
>  	/** @flags: Combination of drm_panthor_bo_flags flags. */
>  	u32 flags;
> +
> +	/**
> +	 * @cleaned: The buffer object pages have been cleaned.
> +	 *
> +	 * There could be dirty CPU cachelines for the pages of buffer object
> +	 * after allocation, as shmem will zero out the pages. The cachelines
> +	 * need to be cleaned if the pages are going to be accessed with an
> +	 * uncached CPU mapping.
> +	 */
> +	bool cleaned;
>  };
>  
>  /**
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index f522a116c1b1..d8cc9e7d064e 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -1249,6 +1249,11 @@ static int panthor_vm_prepare_map_op_ctx(struct panthor_vm_op_ctx *op_ctx,
>  
>  	op_ctx->map.sgt = sgt;
>  
> +	if (bo->base.map_wc && !bo->cleaned) {
> +		dma_sync_sgtable_for_device(vm->ptdev->base.dev, sgt, DMA_TO_DEVICE);
> +		bo->cleaned = true;
> +	}
> +
>  	preallocated_vm_bo = drm_gpuvm_bo_create(&vm->base, &bo->base.base);
>  	if (!preallocated_vm_bo) {
>  		if (!bo->base.base.import_attach)



* Re: [PATCH 1/3] drm/panthor: Update memattr programing to align with GPU spec
  2024-10-24 14:54 ` [PATCH 1/3] drm/panthor: Update memattr programing to align with GPU spec Akash Goel
@ 2024-10-24 15:49   ` Boris Brezillon
  2024-10-25  9:24     ` Liviu Dudau
  2024-10-25 10:03   ` Steven Price
  1 sibling, 1 reply; 17+ messages in thread
From: Boris Brezillon @ 2024-10-24 15:49 UTC
  To: Akash Goel, Robin Murphy
  Cc: liviu.dudau, steven.price, dri-devel, linux-kernel,
	mihail.atanassov, ketil.johnsen, florent.tomasin,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel, nd

+Robin for the MMU details

On Thu, 24 Oct 2024 15:54:30 +0100
Akash Goel <akash.goel@arm.com> wrote:

> Mali GPU Arch spec forbids the GPU PTEs to indicate Inner or Outer
> shareability when no_coherency protocol is selected. Doing so results in
> unexpected or undesired snooping of the CPU caches on some platforms,
> such as Juno FPGA, causing functional issues. For example the boot of
> MCU firmware fails as GPU ends up reading stale data for the FW memory
> pages from the CPU's cache. The FW memory pages are initialized with
> uncached mapping when the device is not reported to be dma-coherent.
> The shareability bits are set to inner-shareable when IOMMU_CACHE flag
> is passed to map_pages() callback and IOMMU_CACHE flag is passed by
> Panthor driver when memory needs to be mapped as cached on the GPU side.
> 
> IOMMU_CACHE seems to imply cache coherent and is probably not fit for
> purpose for the memory that is mapped as cached on GPU side but doesn't
> need to remain coherent with the CPU.

Yeah, IIRC I've been abusing the _CACHE flag to mean GPU-cached, not
cache-coherent. I think it would be good to sit down with Rob and add the
necessary IOMMU_ flags so we can express all the shareability and
cacheability variants we have with the "Mali" MMU. For instance, I
think the shareability between MCU/GPU can be expressed properly at the
moment, and we unconditionally map things uncached because of that.

> 
> This commit updates the programming of MEMATTR register to use
> MIDGARD_INNER instead of CPU_INNER when coherency is disabled. That way
> the inner-shareability specified in the GPU PTEs would map to Mali's
> internal-shareable mode, which is always supported by the GPU regardless
> of the coherency protocol and is required by the Userspace driver to
> ensure coherency between the shader cores.
> 
> Signed-off-by: Akash Goel <akash.goel@arm.com>

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

> ---
>  drivers/gpu/drm/panthor/panthor_mmu.c | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index f3ee5d2753f1..f522a116c1b1 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -1927,7 +1927,7 @@ struct panthor_heap_pool *panthor_vm_get_heap_pool(struct panthor_vm *vm, bool c
>  	return pool;
>  }
>  
> -static u64 mair_to_memattr(u64 mair)
> +static u64 mair_to_memattr(u64 mair, bool coherent)
>  {
>  	u64 memattr = 0;
>  	u32 i;
> @@ -1946,14 +1946,21 @@ static u64 mair_to_memattr(u64 mair)
>  				   AS_MEMATTR_AARCH64_SH_MIDGARD_INNER |
>  				   AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(false, false);
>  		} else {
> -			/* Use SH_CPU_INNER mode so SH_IS, which is used when
> -			 * IOMMU_CACHE is set, actually maps to the standard
> -			 * definition of inner-shareable and not Mali's
> -			 * internal-shareable mode.
> -			 */
>  			out_attr = AS_MEMATTR_AARCH64_INNER_OUTER_WB |
> -				   AS_MEMATTR_AARCH64_SH_CPU_INNER |
>  				   AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(inner & 1, inner & 2);
> +			/* Use SH_MIDGARD_INNER mode when device isn't coherent,
> +			 * so SH_IS, which is used when IOMMU_CACHE is set, maps
> +			 * to Mali's internal-shareable mode. As per the Mali
> +			 * Spec, inner and outer-shareable modes aren't allowed
> +			 * for WB memory when coherency is disabled.
> +			 * Use SH_CPU_INNER mode when coherency is enabled, so
> +			 * that SH_IS actually maps to the standard definition of
> +			 * inner-shareable.
> +			 */
> +			if (!coherent)
> +				out_attr |= AS_MEMATTR_AARCH64_SH_MIDGARD_INNER;
> +			else
> +				out_attr |= AS_MEMATTR_AARCH64_SH_CPU_INNER;
>  		}
>  
>  		memattr |= (u64)out_attr << (8 * i);
> @@ -2325,7 +2332,7 @@ panthor_vm_create(struct panthor_device *ptdev, bool for_mcu,
>  		goto err_sched_fini;
>  
>  	mair = io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg.arm_lpae_s1_cfg.mair;
> -	vm->memattr = mair_to_memattr(mair);
> +	vm->memattr = mair_to_memattr(mair, ptdev->coherent);
>  
>  	mutex_lock(&ptdev->mmu->vm.lock);
>  	list_add_tail(&vm->node, &ptdev->mmu->vm.list);



* Re: [PATCH 2/3] drm/panthor: Explicitly set the coherency mode
  2024-10-24 14:54 ` [PATCH 2/3] drm/panthor: Explicitly set the coherency mode Akash Goel
@ 2024-10-24 15:51   ` Boris Brezillon
  2024-10-25  9:29   ` Liviu Dudau
  2024-10-25 10:03   ` Steven Price
  2 siblings, 0 replies; 17+ messages in thread
From: Boris Brezillon @ 2024-10-24 15:51 UTC
  To: Akash Goel
  Cc: liviu.dudau, steven.price, dri-devel, linux-kernel,
	mihail.atanassov, ketil.johnsen, florent.tomasin,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel, nd

On Thu, 24 Oct 2024 15:54:31 +0100
Akash Goel <akash.goel@arm.com> wrote:

> This commit fixes a potential misalignment between the value of the device
> tree property "dma-coherent" and the default value of the COHERENCY_ENABLE
> register.
> The Panthor driver didn't explicitly program the COHERENCY_ENABLE register
> with the desired coherency mode. The default value of the COHERENCY_ENABLE
> register is implementation defined, so it may not always be aligned with
> the "dma-coherent" property value.
> The commit also checks the COHERENCY_FEATURES register to confirm that
> the coherency protocol is actually supported.
> 
> Signed-off-by: Akash Goel <akash.goel@arm.com>

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

> ---
>  drivers/gpu/drm/panthor/panthor_device.c | 22 +++++++++++++++++++++-
>  drivers/gpu/drm/panthor/panthor_gpu.c    |  9 +++++++++
>  2 files changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index 4082c8f2951d..984615f4ed27 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -22,6 +22,24 @@
>  #include "panthor_regs.h"
>  #include "panthor_sched.h"
>  
> +static int panthor_gpu_coherency_init(struct panthor_device *ptdev)
> +{
> +	ptdev->coherent = device_get_dma_attr(ptdev->base.dev) == DEV_DMA_COHERENT;
> +
> +	if (!ptdev->coherent)
> +		return 0;
> +
> +	/* Check if the ACE-Lite coherency protocol is actually supported by the GPU.
> +	 * ACE protocol has never been supported for command stream frontend GPUs.
> +	 */
> +	if ((gpu_read(ptdev, GPU_COHERENCY_FEATURES) &
> +		      GPU_COHERENCY_PROT_BIT(ACE_LITE)))
> +		return 0;
> +
> +	drm_err(&ptdev->base, "Coherency not supported by the device");
> +	return -ENOTSUPP;
> +}
> +
>  static int panthor_clk_init(struct panthor_device *ptdev)
>  {
>  	ptdev->clks.core = devm_clk_get(ptdev->base.dev, NULL);
> @@ -156,7 +174,9 @@ int panthor_device_init(struct panthor_device *ptdev)
>  	struct page *p;
>  	int ret;
>  
> -	ptdev->coherent = device_get_dma_attr(ptdev->base.dev) == DEV_DMA_COHERENT;
> +	ret = panthor_gpu_coherency_init(ptdev);
> +	if (ret)
> +		return ret;
>  
>  	init_completion(&ptdev->unplug.done);
>  	ret = drmm_mutex_init(&ptdev->base, &ptdev->unplug.lock);
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index 5251d8764e7d..1e24f08a519a 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -77,6 +77,12 @@ static const struct panthor_model gpu_models[] = {
>  	 GPU_IRQ_RESET_COMPLETED | \
>  	 GPU_IRQ_CLEAN_CACHES_COMPLETED)
>  
> +static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
> +{
> +	gpu_write(ptdev, GPU_COHERENCY_PROTOCOL,
> +		ptdev->coherent ? GPU_COHERENCY_PROT_BIT(ACE_LITE) : GPU_COHERENCY_NONE);
> +}
> +
>  static void panthor_gpu_init_info(struct panthor_device *ptdev)
>  {
>  	const struct panthor_model *model;
> @@ -365,6 +371,9 @@ int panthor_gpu_l2_power_on(struct panthor_device *ptdev)
>  			      hweight64(ptdev->gpu_info.shader_present));
>  	}
>  
> +	/* Set the desired coherency mode before the power up of L2 */
> +	panthor_gpu_coherency_set(ptdev);
> +
>  	return panthor_gpu_power_on(ptdev, L2, 1, 20000);
>  }
>  



* Re: [PATCH 1/3] drm/panthor: Update memattr programing to align with GPU spec
  2024-10-24 15:49   ` Boris Brezillon
@ 2024-10-25  9:24     ` Liviu Dudau
  2024-10-25 12:31       ` Boris Brezillon
  0 siblings, 1 reply; 17+ messages in thread
From: Liviu Dudau @ 2024-10-25  9:24 UTC
  To: Boris Brezillon
  Cc: Akash Goel, Robin Murphy, steven.price, dri-devel, linux-kernel,
	mihail.atanassov, ketil.johnsen, florent.tomasin,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel, nd

On Thu, Oct 24, 2024 at 05:49:44PM +0200, Boris Brezillon wrote:
> +Robin for the MMU details
> 
> On Thu, 24 Oct 2024 15:54:30 +0100
> Akash Goel <akash.goel@arm.com> wrote:
> 
> > Mali GPU Arch spec forbids the GPU PTEs to indicate Inner or Outer
> > shareability when no_coherency protocol is selected. Doing so results in
> > unexpected or undesired snooping of the CPU caches on some platforms,
> > such as Juno FPGA, causing functional issues. For example the boot of
> > MCU firmware fails as GPU ends up reading stale data for the FW memory
> > pages from the CPU's cache. The FW memory pages are initialized with
> > uncached mapping when the device is not reported to be dma-coherent.
> > The shareability bits are set to inner-shareable when IOMMU_CACHE flag
> > is passed to map_pages() callback and IOMMU_CACHE flag is passed by
> > Panthor driver when memory needs to be mapped as cached on the GPU side.
> > 
> > IOMMU_CACHE seems to imply cache coherent and is probably not fit for
> > purpose for the memory that is mapped as cached on GPU side but doesn't
> > need to remain coherent with the CPU.
> 
> Yeah, IIRC I've been abusing the _CACHE flag to mean GPU-cached, not
> cache-coherent. I think it would be good to sit down with Rob and add the
> necessary IOMMU_ flags so we can express all the shareability and
> cacheability variants we have with the "Mali" MMU. For instance, I
> think the shareability between MCU/GPU can be expressed properly at the
> moment, and we unconditionally map things uncached because of that.

Boris, did you mean to say "shareability between MCU/GPU *can't* be expressed
properly" ? Currently the sentence reads a bit strange, as if there was a
negation somewhere.

Our GPU's architecture dictates a lot of coherency attributes, especially at
the read-write/read-only L1$, so using _CACHE as a flag for something else
is indeed tempting. We should talk with Rob to see how we can improve things
here.

> 
> > 
> > This commit updates the programming of MEMATTR register to use
> > MIDGARD_INNER instead of CPU_INNER when coherency is disabled. That way
> > the inner-shareability specified in the GPU PTEs would map to Mali's
> > internal-shareable mode, which is always supported by the GPU regardless
> > of the coherency protocol and is required by the Userspace driver to
> > ensure coherency between the shader cores.
> > 
> > Signed-off-by: Akash Goel <akash.goel@arm.com>
> 
> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>

Best regards,
Liviu

> 
> > ---
> >  drivers/gpu/drm/panthor/panthor_mmu.c | 23 +++++++++++++++--------
> >  1 file changed, 15 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> > index f3ee5d2753f1..f522a116c1b1 100644
> > --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> > +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> > @@ -1927,7 +1927,7 @@ struct panthor_heap_pool *panthor_vm_get_heap_pool(struct panthor_vm *vm, bool c
> >  	return pool;
> >  }
> >  
> > -static u64 mair_to_memattr(u64 mair)
> > +static u64 mair_to_memattr(u64 mair, bool coherent)
> >  {
> >  	u64 memattr = 0;
> >  	u32 i;
> > @@ -1946,14 +1946,21 @@ static u64 mair_to_memattr(u64 mair)
> >  				   AS_MEMATTR_AARCH64_SH_MIDGARD_INNER |
> >  				   AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(false, false);
> >  		} else {
> > -			/* Use SH_CPU_INNER mode so SH_IS, which is used when
> > -			 * IOMMU_CACHE is set, actually maps to the standard
> > -			 * definition of inner-shareable and not Mali's
> > -			 * internal-shareable mode.
> > -			 */
> >  			out_attr = AS_MEMATTR_AARCH64_INNER_OUTER_WB |
> > -				   AS_MEMATTR_AARCH64_SH_CPU_INNER |
> >  				   AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(inner & 1, inner & 2);
> > +			/* Use SH_MIDGARD_INNER mode when device isn't coherent,
> > +			 * so SH_IS, which is used when IOMMU_CACHE is set, maps
> > +			 * to Mali's internal-shareable mode. As per the Mali
> > +			 * Spec, inner and outer-shareable modes aren't allowed
> > +			 * for WB memory when coherency is disabled.
> > +			 * Use SH_CPU_INNER mode when coherency is enabled, so
> > +			 * that SH_IS actually maps to the standard definition of
> > +			 * inner-shareable.
> > +			 */
> > +			if (!coherent)
> > +				out_attr |= AS_MEMATTR_AARCH64_SH_MIDGARD_INNER;
> > +			else
> > +				out_attr |= AS_MEMATTR_AARCH64_SH_CPU_INNER;
> >  		}
> >  
> >  		memattr |= (u64)out_attr << (8 * i);
> > @@ -2325,7 +2332,7 @@ panthor_vm_create(struct panthor_device *ptdev, bool for_mcu,
> >  		goto err_sched_fini;
> >  
> >  	mair = io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg.arm_lpae_s1_cfg.mair;
> > -	vm->memattr = mair_to_memattr(mair);
> > +	vm->memattr = mair_to_memattr(mair, ptdev->coherent);
> >  
> >  	mutex_lock(&ptdev->mmu->vm.lock);
> >  	list_add_tail(&vm->node, &ptdev->mmu->vm.list);
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯


* Re: [PATCH 2/3] drm/panthor: Explicitly set the coherency mode
  2024-10-24 14:54 ` [PATCH 2/3] drm/panthor: Explicitly set the coherency mode Akash Goel
  2024-10-24 15:51   ` Boris Brezillon
@ 2024-10-25  9:29   ` Liviu Dudau
  2024-10-25 10:03   ` Steven Price
  2 siblings, 0 replies; 17+ messages in thread
From: Liviu Dudau @ 2024-10-25  9:29 UTC
  To: Akash Goel
  Cc: boris.brezillon, steven.price, dri-devel, linux-kernel,
	mihail.atanassov, ketil.johnsen, florent.tomasin,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel, nd

On Thu, Oct 24, 2024 at 03:54:31PM +0100, Akash Goel wrote:
> This commit fixes a potential misalignment between the value of the device
> tree property "dma-coherent" and the default value of the COHERENCY_ENABLE
> register.
> The Panthor driver didn't explicitly program the COHERENCY_ENABLE register
> with the desired coherency mode. The default value of the COHERENCY_ENABLE
> register is implementation defined, so it may not always be aligned with
> the "dma-coherent" property value.
> The commit also checks the COHERENCY_FEATURES register to confirm that
> the coherency protocol is actually supported.
> 
> Signed-off-by: Akash Goel <akash.goel@arm.com>

Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>

Best regards,
Liviu

> ---
>  drivers/gpu/drm/panthor/panthor_device.c | 22 +++++++++++++++++++++-
>  drivers/gpu/drm/panthor/panthor_gpu.c    |  9 +++++++++
>  2 files changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index 4082c8f2951d..984615f4ed27 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -22,6 +22,24 @@
>  #include "panthor_regs.h"
>  #include "panthor_sched.h"
>  
> +static int panthor_gpu_coherency_init(struct panthor_device *ptdev)
> +{
> +	ptdev->coherent = device_get_dma_attr(ptdev->base.dev) == DEV_DMA_COHERENT;
> +
> +	if (!ptdev->coherent)
> +		return 0;
> +
> +	/* Check if the ACE-Lite coherency protocol is actually supported by the GPU.
> +	 * ACE protocol has never been supported for command stream frontend GPUs.
> +	 */
> +	if ((gpu_read(ptdev, GPU_COHERENCY_FEATURES) &
> +		      GPU_COHERENCY_PROT_BIT(ACE_LITE)))
> +		return 0;
> +
> +	drm_err(&ptdev->base, "Coherency not supported by the device");
> +	return -ENOTSUPP;
> +}
> +
>  static int panthor_clk_init(struct panthor_device *ptdev)
>  {
>  	ptdev->clks.core = devm_clk_get(ptdev->base.dev, NULL);
> @@ -156,7 +174,9 @@ int panthor_device_init(struct panthor_device *ptdev)
>  	struct page *p;
>  	int ret;
>  
> -	ptdev->coherent = device_get_dma_attr(ptdev->base.dev) == DEV_DMA_COHERENT;
> +	ret = panthor_gpu_coherency_init(ptdev);
> +	if (ret)
> +		return ret;
>  
>  	init_completion(&ptdev->unplug.done);
>  	ret = drmm_mutex_init(&ptdev->base, &ptdev->unplug.lock);
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index 5251d8764e7d..1e24f08a519a 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -77,6 +77,12 @@ static const struct panthor_model gpu_models[] = {
>  	 GPU_IRQ_RESET_COMPLETED | \
>  	 GPU_IRQ_CLEAN_CACHES_COMPLETED)
>  
> +static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
> +{
> +	gpu_write(ptdev, GPU_COHERENCY_PROTOCOL,
> +		ptdev->coherent ? GPU_COHERENCY_PROT_BIT(ACE_LITE) : GPU_COHERENCY_NONE);
> +}
> +
>  static void panthor_gpu_init_info(struct panthor_device *ptdev)
>  {
>  	const struct panthor_model *model;
> @@ -365,6 +371,9 @@ int panthor_gpu_l2_power_on(struct panthor_device *ptdev)
>  			      hweight64(ptdev->gpu_info.shader_present));
>  	}
>  
> +	/* Set the desired coherency mode before the power up of L2 */
> +	panthor_gpu_coherency_set(ptdev);
> +
>  	return panthor_gpu_power_on(ptdev, L2, 1, 20000);
>  }
>  
> -- 
> 2.25.1
> 



* Re: [PATCH 1/3] drm/panthor: Update memattr programing to align with GPU spec
  2024-10-24 14:54 ` [PATCH 1/3] drm/panthor: Update memattr programing to align with GPU spec Akash Goel
  2024-10-24 15:49   ` Boris Brezillon
@ 2024-10-25 10:03   ` Steven Price
  1 sibling, 0 replies; 17+ messages in thread
From: Steven Price @ 2024-10-25 10:03 UTC (permalink / raw)
  To: Akash Goel, boris.brezillon, liviu.dudau
  Cc: dri-devel, linux-kernel, mihail.atanassov, ketil.johnsen,
	florent.tomasin, maarten.lankhorst, mripard, tzimmermann, airlied,
	daniel, nd

On 24/10/2024 15:54, Akash Goel wrote:
> Mali GPU Arch spec forbids the GPU PTEs to indicate Inner or Outer
> shareability when no_coherency protocol is selected. Doing so results in
> unexpected or undesired snooping of the CPU caches on some platforms,
> such as Juno FPGA, causing functional issues. For example the boot of
> MCU firmware fails as GPU ends up reading stale data for the FW memory
> pages from the CPU's cache. The FW memory pages are initialized with
> uncached mapping when the device is not reported to be dma-coherent.
> The shareability bits are set to inner-shareable when IOMMU_CACHE flag
> is passed to map_pages() callback and IOMMU_CACHE flag is passed by
> Panthor driver when memory needs to be mapped as cached on the GPU side.
> 
> IOMMU_CACHE seems to imply cache coherent and is probably not fit for
> purpose for the memory that is mapped as cached on GPU side but doesn't
> need to remain coherent with the CPU.
> 
> This commit updates the programming of MEMATTR register to use
> MIDGARD_INNER instead of CPU_INNER when coherency is disabled. That way
> the inner-shareability specified in the GPU PTEs would map to Mali's
> internal-shareable mode, which is always supported by the GPU regardless
> of the coherency protocol and is required by the Userspace driver to
> ensure coherency between the shader cores.
> 
> Signed-off-by: Akash Goel <akash.goel@arm.com>

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/gpu/drm/panthor/panthor_mmu.c | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index f3ee5d2753f1..f522a116c1b1 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -1927,7 +1927,7 @@ struct panthor_heap_pool *panthor_vm_get_heap_pool(struct panthor_vm *vm, bool c
>  	return pool;
>  }
>  
> -static u64 mair_to_memattr(u64 mair)
> +static u64 mair_to_memattr(u64 mair, bool coherent)
>  {
>  	u64 memattr = 0;
>  	u32 i;
> @@ -1946,14 +1946,21 @@ static u64 mair_to_memattr(u64 mair)
>  				   AS_MEMATTR_AARCH64_SH_MIDGARD_INNER |
>  				   AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(false, false);
>  		} else {
> -			/* Use SH_CPU_INNER mode so SH_IS, which is used when
> -			 * IOMMU_CACHE is set, actually maps to the standard
> -			 * definition of inner-shareable and not Mali's
> -			 * internal-shareable mode.
> -			 */
>  			out_attr = AS_MEMATTR_AARCH64_INNER_OUTER_WB |
> -				   AS_MEMATTR_AARCH64_SH_CPU_INNER |
>  				   AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(inner & 1, inner & 2);
> +			/* Use SH_MIDGARD_INNER mode when device isn't coherent,
> +			 * so SH_IS, which is used when IOMMU_CACHE is set, maps
> +			 * to Mali's internal-shareable mode. As per the Mali
> +			 * Spec, inner and outer-shareable modes aren't allowed
> +			 * for WB memory when coherency is disabled.
> +			 * Use SH_CPU_INNER mode when coherency is enabled, so
> +			 * that SH_IS actually maps to the standard definition of
> +			 * inner-shareable.
> +			 */
> +			if (!coherent)
> +				out_attr |= AS_MEMATTR_AARCH64_SH_MIDGARD_INNER;
> +			else
> +				out_attr |= AS_MEMATTR_AARCH64_SH_CPU_INNER;
>  		}
>  
>  		memattr |= (u64)out_attr << (8 * i);
> @@ -2325,7 +2332,7 @@ panthor_vm_create(struct panthor_device *ptdev, bool for_mcu,
>  		goto err_sched_fini;
>  
>  	mair = io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg.arm_lpae_s1_cfg.mair;
> -	vm->memattr = mair_to_memattr(mair);
> +	vm->memattr = mair_to_memattr(mair, ptdev->coherent);
>  
>  	mutex_lock(&ptdev->mmu->vm.lock);
>  	list_add_tail(&vm->node, &ptdev->mmu->vm.list);
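
As a rough illustration of the hunk above, here is a minimal userspace
sketch of the shareability selection. It is not the driver code: the SK_*
encodings are placeholders rather than the values in panthor_regs.h, the
write-back test is a simplification assumed for the example, and the
allocation-hint bits are left out. It only shows that a write-back slot gets
Mali's internal-shareable mode when the device is not coherent, and the CPU
inner-shareable mode when it is.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder field encodings for illustration only; the real values
 * live in panthor_regs.h and are not reproduced here.
 */
#define SK_SH_MIDGARD_INNER 0x00u /* Mali internal-shareable */
#define SK_SH_CPU_INNER     0x10u /* architectural inner-shareable */
#define SK_NC               0x40u /* inner/outer non-cacheable */
#define SK_WB               0x80u /* inner/outer write-back */

/* Simplified test (an assumption of this sketch): a MAIR nibble is
 * treated as write-back when bits [3:2] are both set.
 */
static bool sk_nibble_is_wb(uint8_t nibble)
{
	return (nibble & 0xCu) == 0xCu;
}

/* Translate one 8-bit MAIR slot into one 8-bit MEMATTR slot. */
static uint8_t sk_slot_to_memattr(uint8_t mair_slot, bool coherent)
{
	uint8_t outer = mair_slot >> 4, inner = mair_slot & 0xFu;

	/* Anything that is not write-back on both levels stays uncached. */
	if (!sk_nibble_is_wb(outer) || !sk_nibble_is_wb(inner))
		return SK_NC | SK_SH_MIDGARD_INNER;

	/* Write-back memory: the shareability domain follows coherency. */
	return SK_WB | (coherent ? SK_SH_CPU_INNER : SK_SH_MIDGARD_INNER);
}

/* Apply the per-slot translation to all eight MAIR slots. */
static uint64_t sk_mair_to_memattr(uint64_t mair, bool coherent)
{
	uint64_t memattr = 0;

	for (unsigned int i = 0; i < 8; i++) {
		uint8_t slot = (uint8_t)(mair >> (8 * i));

		memattr |= (uint64_t)sk_slot_to_memattr(slot, coherent) << (8 * i);
	}

	return memattr;
}
```

With these placeholder values, a write-back slot such as 0xff translates to
0x80 without coherency and to 0x90 with it; non-cacheable slots are not
affected by the flag.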



* Re: [PATCH 2/3] drm/panthor: Explicitly set the coherency mode
  2024-10-24 14:54 ` [PATCH 2/3] drm/panthor: Explicitly set the coherency mode Akash Goel
  2024-10-24 15:51   ` Boris Brezillon
  2024-10-25  9:29   ` Liviu Dudau
@ 2024-10-25 10:03   ` Steven Price
  2 siblings, 0 replies; 17+ messages in thread
From: Steven Price @ 2024-10-25 10:03 UTC (permalink / raw)
  To: Akash Goel, boris.brezillon, liviu.dudau
  Cc: dri-devel, linux-kernel, mihail.atanassov, ketil.johnsen,
	florent.tomasin, maarten.lankhorst, mripard, tzimmermann, airlied,
	daniel, nd

On 24/10/2024 15:54, Akash Goel wrote:
> This commit fixes the potential misalignment between the value of device
> tree property "dma-coherent" and default value of COHERENCY_ENABLE
> register.
> Panthor driver didn't explicitly program the COHERENCY_ENABLE register
> with the desired coherency mode. The default value of COHERENCY_ENABLE
> register is implementation defined, so it may not be always aligned with
> the "dma-coherent" property value.
> The commit also checks the COHERENCY_FEATURES register to confirm that
> the coherency protocol is actually supported or not.
> 
> Signed-off-by: Akash Goel <akash.goel@arm.com>

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/gpu/drm/panthor/panthor_device.c | 22 +++++++++++++++++++++-
>  drivers/gpu/drm/panthor/panthor_gpu.c    |  9 +++++++++
>  2 files changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index 4082c8f2951d..984615f4ed27 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -22,6 +22,24 @@
>  #include "panthor_regs.h"
>  #include "panthor_sched.h"
>  
> +static int panthor_gpu_coherency_init(struct panthor_device *ptdev)
> +{
> +	ptdev->coherent = device_get_dma_attr(ptdev->base.dev) == DEV_DMA_COHERENT;
> +
> +	if (!ptdev->coherent)
> +		return 0;
> +
> +	/* Check if the ACE-Lite coherency protocol is actually supported by the GPU.
> +	 * ACE protocol has never been supported for command stream frontend GPUs.
> +	 */
> +	if ((gpu_read(ptdev, GPU_COHERENCY_FEATURES) &
> +		      GPU_COHERENCY_PROT_BIT(ACE_LITE)))
> +		return 0;
> +
> +	drm_err(&ptdev->base, "Coherency not supported by the device");
> +	return -ENOTSUPP;
> +}
> +
>  static int panthor_clk_init(struct panthor_device *ptdev)
>  {
>  	ptdev->clks.core = devm_clk_get(ptdev->base.dev, NULL);
> @@ -156,7 +174,9 @@ int panthor_device_init(struct panthor_device *ptdev)
>  	struct page *p;
>  	int ret;
>  
> -	ptdev->coherent = device_get_dma_attr(ptdev->base.dev) == DEV_DMA_COHERENT;
> +	ret = panthor_gpu_coherency_init(ptdev);
> +	if (ret)
> +		return ret;
>  
>  	init_completion(&ptdev->unplug.done);
>  	ret = drmm_mutex_init(&ptdev->base, &ptdev->unplug.lock);
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index 5251d8764e7d..1e24f08a519a 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -77,6 +77,12 @@ static const struct panthor_model gpu_models[] = {
>  	 GPU_IRQ_RESET_COMPLETED | \
>  	 GPU_IRQ_CLEAN_CACHES_COMPLETED)
>  
> +static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
> +{
> +	gpu_write(ptdev, GPU_COHERENCY_PROTOCOL,
> +		ptdev->coherent ? GPU_COHERENCY_PROT_BIT(ACE_LITE) : GPU_COHERENCY_NONE);
> +}
> +
>  static void panthor_gpu_init_info(struct panthor_device *ptdev)
>  {
>  	const struct panthor_model *model;
> @@ -365,6 +371,9 @@ int panthor_gpu_l2_power_on(struct panthor_device *ptdev)
>  			      hweight64(ptdev->gpu_info.shader_present));
>  	}
>  
> +	/* Set the desired coherency mode before the power up of L2 */
> +	panthor_gpu_coherency_set(ptdev);
> +
>  	return panthor_gpu_power_on(ptdev, L2, 1, 20000);
>  }
>  
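
For readers who want to see the decision flow of the patch in isolation,
here is a minimal userspace sketch. Everything named sk_* or SK_* is
hypothetical: the struct stands in for the device-tree attribute and the two
registers, and the bit values are placeholders rather than the definitions
in panthor_regs.h. The patch returns -ENOTSUPP, which is a kernel-internal
code; the sketch uses -EOPNOTSUPP only because that is what userspace
errno.h provides.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder encodings, assumed for this sketch only. */
#define SK_ACE_LITE        0u
#define SK_COHERENCY_NONE  31u
#define SK_PROT_BIT(n)     (1u << (n))

/* Stand-in for the pieces of device state the patch touches. */
struct sk_gpu {
	bool dma_coherent;           /* models the "dma-coherent" DT property */
	uint32_t coherency_features; /* models COHERENCY_FEATURES (read-only) */
	bool coherent;               /* models ptdev->coherent */
	uint32_t coherency_protocol; /* models the value written at power-on */
};

/* Probe-time check: refuse to continue if the platform asks for
 * coherency but the GPU does not advertise ACE-Lite.
 */
static int sk_coherency_init(struct sk_gpu *gpu)
{
	gpu->coherent = gpu->dma_coherent;

	if (!gpu->coherent)
		return 0;

	if (gpu->coherency_features & SK_PROT_BIT(SK_ACE_LITE))
		return 0;

	return -EOPNOTSUPP;
}

/* Power-on step: always program the mode explicitly instead of
 * relying on the implementation-defined reset value.
 */
static void sk_coherency_set(struct sk_gpu *gpu)
{
	gpu->coherency_protocol = gpu->coherent ? SK_PROT_BIT(SK_ACE_LITE)
						: SK_COHERENCY_NONE;
}
```

As in the patch, the second helper is meant to run before the L2 is powered
up, so that the mode is in place by the time the cache becomes active.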



* Re: [PATCH 1/3] drm/panthor: Update memattr programing to align with GPU spec
  2024-10-25  9:24     ` Liviu Dudau
@ 2024-10-25 12:31       ` Boris Brezillon
  0 siblings, 0 replies; 17+ messages in thread
From: Boris Brezillon @ 2024-10-25 12:31 UTC (permalink / raw)
  To: Liviu Dudau
  Cc: Akash Goel, Robin Murphy, steven.price, dri-devel, linux-kernel,
	mihail.atanassov, ketil.johnsen, florent.tomasin,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel, nd

On Fri, 25 Oct 2024 10:24:32 +0100
Liviu Dudau <liviu.dudau@arm.com> wrote:

> On Thu, Oct 24, 2024 at 05:49:44PM +0200, Boris Brezillon wrote:
> > +Robin for the MMU details
> > 
> > On Thu, 24 Oct 2024 15:54:30 +0100
> > Akash Goel <akash.goel@arm.com> wrote:
> >   
> > > Mali GPU Arch spec forbids the GPU PTEs to indicate Inner or Outer
> > > shareability when no_coherency protocol is selected. Doing so results in
> > > unexpected or undesired snooping of the CPU caches on some platforms,
> > > such as Juno FPGA, causing functional issues. For example the boot of
> > > MCU firmware fails as GPU ends up reading stale data for the FW memory
> > > pages from the CPU's cache. The FW memory pages are initialized with
> > > uncached mapping when the device is not reported to be dma-coherent.
> > > The shareability bits are set to inner-shareable when IOMMU_CACHE flag
> > > is passed to map_pages() callback and IOMMU_CACHE flag is passed by
> > > Panthor driver when memory needs to be mapped as cached on the GPU side.
> > > 
> > > IOMMU_CACHE seems to imply cache coherent and is probably not fit for
> > > purpose for the memory that is mapped as cached on GPU side but doesn't
> > > need to remain coherent with the CPU.  
> > 
> > Yeah, IIRC I've been abusing the _CACHE flag to mean GPU-cached, not
> > cache-coherent. I think it would be good to sit down with Rob and add the
> > necessary IOMMU_ flags so we can express all the shareability and
> > cacheability variants we have with the "Mali" MMU. For instance, I
> > think the shareability between MCU/GPU can be expressed properly at the
> > moment, and we unconditionally map things uncached because of that.  
> 
> Boris, did you mean to say "shareability between MCU/GPU *can't* be expressed
> properly" ? Currently the sentence reads a bit strange, as if there was a
> negation somewhere.

Yes, sorry, I meant "can't".


* Re: [PATCH 0/3] drm/panthor: Coherency related fixes
  2024-10-24 14:54 [PATCH 0/3] drm/panthor: Coherency related fixes Akash Goel
                   ` (2 preceding siblings ...)
  2024-10-24 14:54 ` [PATCH 3/3] drm/panthor: Prevent potential overwrite of buffer objects Akash Goel
@ 2024-10-30 15:46 ` Boris Brezillon
  3 siblings, 0 replies; 17+ messages in thread
From: Boris Brezillon @ 2024-10-30 15:46 UTC (permalink / raw)
  To: Akash Goel
  Cc: liviu.dudau, steven.price, dri-devel, linux-kernel,
	mihail.atanassov, ketil.johnsen, florent.tomasin,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel, nd

On Thu, 24 Oct 2024 15:54:29 +0100
Akash Goel <akash.goel@arm.com> wrote:

> This patch series contains 3 cache coherency related fixes for the
> Panthor driver.
> - The first fix, regarding the Inner-shareability, is mandatory to
>   ensure things work on all platforms (including Juno FPGA) when
>   no_coherency protocol is selected.
> - The second fix regarding the coherency feature/enable register is
>   required to avoid potential misalignment on certain platforms.
> - The third fix, regarding the potential overwrite of buffer objects,
>   has been prepared speculatively & it may not be required in practice.
>   Please provide feedback on it.
> 
> Akash Goel (3):
>   drm/panthor: Update memattr programing to align with GPU spec
>   drm/panthor: Explicitly set the coherency mode
>   drm/panthor: Prevent potential overwrite of buffer objects

For some reason, our replies didn't make it to patchwork [1], meaning I
don't have the R-b tags when I apply the patch. Could you send a v2
with the R-bs added, so I can at least apply the first two patches to
drm-misc-fixes?

[1] https://patchwork.freedesktop.org/series/140498/

> 
>  drivers/gpu/drm/panthor/panthor_device.c | 22 ++++++++++++++++++-
>  drivers/gpu/drm/panthor/panthor_gem.h    | 10 +++++++++
>  drivers/gpu/drm/panthor/panthor_gpu.c    |  9 ++++++++
>  drivers/gpu/drm/panthor/panthor_mmu.c    | 28 +++++++++++++++++-------
>  4 files changed, 60 insertions(+), 9 deletions(-)
> 



* Re: [PATCH 3/3] drm/panthor: Prevent potential overwrite of buffer objects
  2024-10-24 15:39   ` Boris Brezillon
@ 2024-10-31 21:42     ` Akash Goel
  2024-11-04 11:16       ` Boris Brezillon
  0 siblings, 1 reply; 17+ messages in thread
From: Akash Goel @ 2024-10-31 21:42 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: liviu.dudau, steven.price, dri-devel, linux-kernel,
	mihail.atanassov, ketil.johnsen, florent.tomasin,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel, nd



On 10/24/24 16:39, Boris Brezillon wrote:
> On Thu, 24 Oct 2024 15:54:32 +0100
> Akash Goel <akash.goel@arm.com> wrote:
> 
>> All CPU mappings are forced as uncached for Panthor buffer objects when
>> system(IO) coherency is disabled. Physical backing for Panthor BOs is
>> allocated by shmem, which clears the pages also after allocation. But
>> there is no explicit cache flush done after the clearing of pages.
>> So it could happen that there are dirty cachelines in the CPU cache
>> for the BOs, when they are accessed from the CPU side through uncached
>> CPU mapping, and the eviction of cachelines overwrites the data of BOs.
> 
> Hm, this looks like something that should be handled at the
> drm_gem_shmem level when drm_gem_shmem_object::map_wc=true, as I
> suspect other drivers can hit the same issue (I'm thinking of panfrost
> and lima, but there might be others).
> 


I am sorry for the late reply.
Many thanks for the quick feedback.

I assume you also reckon that there is a potential problem here for arm64.

I fully agree with your suggestion that the handling needs to be at the
drm_gem_shmem level. I was not sure whether we really need to do anything,
as I didn't observe any overwrite issue during testing, so I thought it
better to limit the change to Panthor and get some feedback.

shmem calls 'flush_dcache_folio()' after clearing the pages, but that
just clears the 'PG_dcache_clean' bit; the CPU cache is not cleaned
immediately.

I realize that this patch is not foolproof, as Userspace can try to
populate the BO from the CPU side before mapping it on the GPU side.

I am not sure whether we also need to consider the case where shmem pages
are swapped out; I don't know whether a similar situation of dirty
cachelines could arise after the swap-in.

I am also not sure whether dma_sync_sgtable_for_device() can be called from
drm_gem_shmem_get_pages(), as the sg_table won't be available at that point.

Please let me know your thoughts.

Best regards
Akash



>>
>> This commit tries to avoid the potential overwrite scenario.
>>
>> Signed-off-by: Akash Goel <akash.goel@arm.com>
>> ---
>>   drivers/gpu/drm/panthor/panthor_gem.h | 10 ++++++++++
>>   drivers/gpu/drm/panthor/panthor_mmu.c |  5 +++++
>>   2 files changed, 15 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h
>> index e43021cf6d45..4b0f43f1edf1 100644
>> --- a/drivers/gpu/drm/panthor/panthor_gem.h
>> +++ b/drivers/gpu/drm/panthor/panthor_gem.h
>> @@ -46,6 +46,16 @@ struct panthor_gem_object {
>>   
>>   	/** @flags: Combination of drm_panthor_bo_flags flags. */
>>   	u32 flags;
>> +
>> +	/**
>> +	 * @cleaned: The buffer object pages have been cleaned.
>> +	 *
>> +	 * There could be dirty CPU cachelines for the pages of buffer object
>> +	 * after allocation, as shmem will zero out the pages. The cachelines
>> +	 * need to be cleaned if the pages are going to be accessed with an
>> +	 * uncached CPU mapping.
>> +	 */
>> +	bool cleaned;
>>   };
>>   
>>   /**
>> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
>> index f522a116c1b1..d8cc9e7d064e 100644
>> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
>> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
>> @@ -1249,6 +1249,11 @@ static int panthor_vm_prepare_map_op_ctx(struct panthor_vm_op_ctx *op_ctx,
>>   
>>   	op_ctx->map.sgt = sgt;
>>   
>> +	if (bo->base.map_wc && !bo->cleaned) {
>> +		dma_sync_sgtable_for_device(vm->ptdev->base.dev, sgt, DMA_TO_DEVICE);
>> +		bo->cleaned = true;
>> +	}
>> +
>>   	preallocated_vm_bo = drm_gpuvm_bo_create(&vm->base, &bo->base.base);
>>   	if (!preallocated_vm_bo) {
>>   		if (!bo->base.base.import_attach)
> 


* Re: [PATCH 3/3] drm/panthor: Prevent potential overwrite of buffer objects
  2024-10-31 21:42     ` Akash Goel
@ 2024-11-04 11:16       ` Boris Brezillon
  2024-11-04 12:49         ` Akash Goel
  0 siblings, 1 reply; 17+ messages in thread
From: Boris Brezillon @ 2024-11-04 11:16 UTC (permalink / raw)
  To: Akash Goel
  Cc: liviu.dudau, steven.price, dri-devel, linux-kernel,
	mihail.atanassov, ketil.johnsen, florent.tomasin,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel, nd

Hi Akash,

On Thu, 31 Oct 2024 21:42:27 +0000
Akash Goel <akash.goel@arm.com> wrote:

> I assume you also reckon that there is a potential problem here for arm64.

It impacts any system that's not IO-coherent I would say, and this
comment seems to prove this is a known issue [3].

> 
> Fully agree with your suggestion that the handling needs to be at the 
> drm_gem_shmem level. I was not sure if we really need to do anything, as 
> I didn't observe any overwrite issue during the testing. So thought 
> better to limit the change to Panthor and get some feedback.

Actually, I wonder if PowerVR isn't papering over the same issue with
[1], so it looks like at least two drivers would benefit from a fix at
the drm_gem_shmem level.

> 
> shmem calls 'flush_dcache_folio()' after clearing the pages but that 
> just clears the 'PG_dcache_clean' bit and CPU cache is not cleaned 
> immediately.
> 
> I realize that this patch is not foolproof, as Userspace can try to 
> populate the BO from CPU side before mapping it on the GPU side.
> 
> Not sure if we also need to consider the case when shmem pages are 
> swapped out. Don't know if there could be a similar situation of dirty 
> cachelines after the swap in.

I think we do. We basically need to flush CPU caches any time
pages are [re]allocated, because the shmem layer will either zero-out
(first allocation) or populate (swap-in) in that path, and in both
cases, it involves a CPU copy to a cached mapping.

> 
> Also not sure if dma_sync_sgtable_for_device() can be called from 
> drm_gem_shmem_get_pages() as the sg_table won't be available at that point.

Okay, that's indeed an issue. Maybe we should tie the sgt allocation to
the pages allocation, as I can't think of a case where we would
allocate pages without needing the sg table that goes with it. And if
there are drivers that want the sgt to be lazily allocated, we can
always add a drm_gem_shmem_object::lazy_sgt_alloc flag.

Regards,

Boris

[1] https://elixir.bootlin.com/linux/v6.11.6/source/drivers/gpu/drm/imagination/pvr_gem.c#L363
[2] https://elixir.bootlin.com/linux/v6.11.6/source/drivers/gpu/drm/drm_gem_shmem_helper.c#L177
[3] https://elixir.bootlin.com/linux/v6.11.6/source/drivers/gpu/drm/drm_gem_shmem_helper.c#L185


* Re: [PATCH 3/3] drm/panthor: Prevent potential overwrite of buffer objects
  2024-11-04 11:16       ` Boris Brezillon
@ 2024-11-04 12:49         ` Akash Goel
  2024-11-04 13:41           ` Boris Brezillon
  0 siblings, 1 reply; 17+ messages in thread
From: Akash Goel @ 2024-11-04 12:49 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: liviu.dudau, steven.price, dri-devel, linux-kernel,
	mihail.atanassov, ketil.johnsen, florent.tomasin,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel, nd



On 11/4/24 11:16, Boris Brezillon wrote:
> Hi Akash,
> 
> On Thu, 31 Oct 2024 21:42:27 +0000
> Akash Goel <akash.goel@arm.com> wrote:
> 
>> I assume you also reckon that there is a potential problem here for arm64.
> 
> It impacts any system that's not IO-coherent I would say, and this
> comment seems to prove this is a known issue [3].
> 

Thanks for confirming.

Actually, I had tried to check with Daniel Vetter about [3], as it was
not clear to me how exactly that code helps in the x86 case.
As far as I understand, [3] updates the attributes of the direct kernel
mapping of the shmem pages to WC, so as to be consistent with the
Userspace mapping of the pages or their vmapping inside the kernel.
But I didn't get how that alignment actually helps in cleaning the dirty
cache lines.

>>
>> Fully agree with your suggestion that the handling needs to be at the
>> drm_gem_shmem level. I was not sure if we really need to do anything, as
>> I didn't observe any overwrite issue during the testing. So thought
>> better to limit the change to Panthor and get some feedback.
> 
> Actually, I wonder if PowerVR isn't papering over the same issue with
> [1], so it looks like at least two drivers would benefit from a fix at
> the drm_gem_shmem level.
> 

Thanks for the reference to the PowerVR code.
It unconditionally calls dma_sync_sgtable after acquiring the
pages, so it could be papering over the issue as you suspected.

>>
>> shmem calls 'flush_dcache_folio()' after clearing the pages but that
>> just clears the 'PG_dcache_clean' bit and CPU cache is not cleaned
>> immediately.
>>
>> I realize that this patch is not foolproof, as Userspace can try to
>> populate the BO from CPU side before mapping it on the GPU side.
>>
>> Not sure if we also need to consider the case when shmem pages are
>> swapped out. Don't know if there could be a similar situation of dirty
>> cachelines after the swap in.
> 
> I think we do. We basically need to flush CPU caches any time
> pages are [re]allocated, because the shmem layer will either zero-out
> (first allocation) or populate (swap-in) in that path, and in both
> cases, it involves a CPU copy to a cached mapping.
> 

Thanks for confirming.

I think we may have to do the cache flush page by page.
Not all pages might get swapped out, and the initial allocation of all
pages may not happen at the same time.
Please correct me if my understanding is wrong.


>>
>> Also not sure if dma_sync_sgtable_for_device() can be called from
>> drm_gem_shmem_get_pages() as the sg_table won't be available at that point.
> 
> Okay, that's indeed an issue. Maybe we should tie the sgt allocation to
> the pages allocation, as I can't think of a case where we would
> allocate pages without needing the sg table that goes with it. And if
> there are driver that want the sgt to be lazily allocated, we can
> always add a drm_gem_shmem_object::lazy_sgt_alloc flag.
> 

Many thanks for the suggestion.

Will try to see how we can progress this work.

Best regards
Akash


> Regards,
> 
> Boris
> 
> [1] https://elixir.bootlin.com/linux/v6.11.6/source/drivers/gpu/drm/imagination/pvr_gem.c#L363
> [2] https://elixir.bootlin.com/linux/v6.11.6/source/drivers/gpu/drm/drm_gem_shmem_helper.c#L177
> [3] https://elixir.bootlin.com/linux/v6.11.6/source/drivers/gpu/drm/drm_gem_shmem_helper.c#L185


* Re: [PATCH 3/3] drm/panthor: Prevent potential overwrite of buffer objects
  2024-11-04 12:49         ` Akash Goel
@ 2024-11-04 13:41           ` Boris Brezillon
  0 siblings, 0 replies; 17+ messages in thread
From: Boris Brezillon @ 2024-11-04 13:41 UTC (permalink / raw)
  To: Akash Goel
  Cc: liviu.dudau, steven.price, dri-devel, linux-kernel,
	mihail.atanassov, ketil.johnsen, florent.tomasin,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel, nd

On Mon, 4 Nov 2024 12:49:56 +0000
Akash Goel <akash.goel@arm.com> wrote:

> On 11/4/24 11:16, Boris Brezillon wrote:
> > Hi Akash,
> > 
> > On Thu, 31 Oct 2024 21:42:27 +0000
> > Akash Goel <akash.goel@arm.com> wrote:
> >   
> >> I assume you also reckon that there is a potential problem here for arm64.  
> > 
> > It impacts any system that's not IO-coherent I would say, and this
> > comment seems to prove this is a known issue [3].
> >   
> 
> Thanks for confirming.
> 
> Actually I had tried to check with Daniel Vetter about [3], as it was 
> not clear to me that how that code exactly helped in x86 case.
> As far as I understand, [3] updates the attribute of direct kernel 
> mapping of the shmem pages to WC, so as to be consistent with the 
> Userspace mapping of the pages or their vmapping inside the kernel.
> But didn't get how that alignment actually helped in cleaning the dirty 
> cache lines.

Yeah, I was not referring to the code but rather the fact that x86,
with its IO coherency model, is a special case here, and that other
archs probably need explicit flushes in a few places.

> >>
> >> shmem calls 'flush_dcache_folio()' after clearing the pages but that
> >> just clears the 'PG_dcache_clean' bit and CPU cache is not cleaned
> >> immediately.
> >>
> >> I realize that this patch is not foolproof, as Userspace can try to
> >> populate the BO from CPU side before mapping it on the GPU side.
> >>
> >> Not sure if we also need to consider the case when shmem pages are
> >> swapped out. Don't know if there could be a similar situation of dirty
> >> cachelines after the swap in.  
> > 
> > I think we do. We basically need to flush CPU caches any time
> > pages are [re]allocated, because the shmem layer will either zero-out
> > (first allocation) or populate (swap-in) in that path, and in both
> > cases, it involves a CPU copy to a cached mapping.
> >   
> 
> Thanks for confirming.
> 
> I think we may have to do cache flush page by page.
> Not all pages might get swapped out and the initial allocation of all 
> pages may not happen at the same time.

If the pages are mapped GPU-side, it's always all pages at a time (at
least until we add support for lazy page allocation, AKA growing/heap
buffers). You're right that GPU buffers that have only been mapped
CPU-side with mmap() get their pages lazily allocated, though I'm not
really sure we care about optimizing that case just yet.

> Please correct me if my understanding is wrong.

Eviction should be rare enough that we can probably pay the price of a
flush on the entire BO range.

