* [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing
@ 2025-12-03 9:01 Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 01/13] drm/panthor: Provide a custom dma_buf implementation Boris Brezillon
` (12 more replies)
0 siblings, 13 replies; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, Boris Brezillon, kernel
This series implements cached maps and explicit flushing for both panfrost
and panthor.
The PanVK MR to use this lives here:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385
This version moves away from the code sharing proposed at the
drm_prime/gem_shmem level and hand-roll a bunch of things that
could potentially be shared. The goal here is to take the path
of least resistance and discuss this controversial code-sharing
topic in a follow-up MR. Keeping everyone Cc-ed on the previous
version involved, so they're aware of this change of direction,
sorry for the noise if you don't care :-/.
Changes in v2:
- Expose the coherency so userspace can know when it should skip cache
maintenance
- Hook things up at drm_gem_object_funcs level to dma-buf cpu_prep hooks
can be implemented generically
- Revisit the semantics of the flags passed to gem_sync()
- Add BO_QUERY_INFO ioctls to query BO flags on imported objects and
let the UMD know when cache maintenance is needed on those
Changes in v3:
- New patch to fix panthor_gpu_coherency_set()
- No other major changes, check each patch changelog for more details
Changes in v4:
- Two trivial fixes, check each patch changelog for more details
Changes in v5:
- Add a way to overload dma_buf_ops while still relying on the drm_prime
boilerplate
- Add default shmem implementation for
dma_buf_ops::{begin,end}_cpu_access()
- Provide custom dma_buf_ops to deal with CPU cache flushes around CPU
accesses when the BO is CPU-cacheable
- Go back to a version of drm_gem_shmem_sync() that only deals with
cache maintenance, and adjust the semantics to make it clear this is
the only thing it cares about
- Adjust the BO_SYNC ioctls according to the new drm_gem_shmem_sync()
semantics
Changes in v6:
- No major changes, check the changelog in each patch for more details
Changes in v7:
- Drop the drm_prime/gem_shmem helpers and duplicate the logic in
panthor/panfrost
Boris Brezillon (8):
drm/panthor: Provide a custom dma_buf implementation
drm/panthor: Fix panthor_gpu_coherency_set()
drm/panthor: Expose the selected coherency protocol to the UMD
drm/panthor: Add a PANTHOR_BO_SYNC ioctl
drm/panthor: Add an ioctl to query BO flags
drm/panfrost: Provide a custom dma_buf implementation
drm/panfrost: Expose the selected coherency protocol to the UMD
drm/panfrost: Add an ioctl to query BO flags
Faith Ekstrand (4):
drm/panthor: Bump the driver version to 1.6
drm/panfrost: Add a PANFROST_SYNC_BO ioctl
drm/panfrost: Add flag to map GEM object Write-Back Cacheable
drm/panfrost: Bump the driver version to 1.6
Loïc Molinari (1):
drm/panthor: Add flag to map GEM object Write-Back Cacheable
drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
drivers/gpu/drm/panfrost/panfrost_drv.c | 101 ++++++++-
drivers/gpu/drm/panfrost/panfrost_gem.c | 240 +++++++++++++++++++++
drivers/gpu/drm/panfrost/panfrost_gem.h | 10 +
drivers/gpu/drm/panfrost/panfrost_gpu.c | 26 ++-
drivers/gpu/drm/panfrost/panfrost_regs.h | 10 +-
drivers/gpu/drm/panthor/panthor_device.c | 10 +-
drivers/gpu/drm/panthor/panthor_drv.c | 79 ++++++-
drivers/gpu/drm/panthor/panthor_gem.c | 240 ++++++++++++++++++++-
drivers/gpu/drm/panthor/panthor_gem.h | 6 +
drivers/gpu/drm/panthor/panthor_gpu.c | 2 +-
drivers/gpu/drm/panthor/panthor_sched.c | 18 +-
include/uapi/drm/panfrost_drm.h | 76 ++++++-
include/uapi/drm/panthor_drm.h | 157 +++++++++++++-
14 files changed, 954 insertions(+), 22 deletions(-)
--
2.51.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v7 01/13] drm/panthor: Provide a custom dma_buf implementation
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 02/13] drm/panthor: Fix panthor_gpu_coherency_set() Boris Brezillon
` (11 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, Boris Brezillon, kernel
Before we introduce cached CPU mappings, we want a dma_buf
implementation satisfying synchronization requests around CPU
accesses coming from a dma_buf exported by our driver. Let's
provide our own implementation relying on the default
gem_shmem_prime helpers designed for that purpose.
v5:
- New patch
v6:
- Collect R-b
v7:
- Hand-roll the dma_buf sync/import logic (was previously done by
generic prime/shmem helpers)
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
drivers/gpu/drm/panthor/panthor_drv.c | 1 +
drivers/gpu/drm/panthor/panthor_gem.c | 118 +++++++++++++++++++++++++-
drivers/gpu/drm/panthor/panthor_gem.h | 4 +
3 files changed, 122 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index d1d4c50da5bf..d12ac4cb0ac4 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -1621,6 +1621,7 @@ static const struct drm_driver panthor_drm_driver = {
.gem_create_object = panthor_gem_create_object,
.gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table,
+ .gem_prime_import = panthor_gem_prime_import,
#ifdef CONFIG_DEBUG_FS
.debugfs_init = panthor_debugfs_init,
#endif
diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
index 10d255cccc09..173d42d65000 100644
--- a/drivers/gpu/drm/panthor/panthor_gem.c
+++ b/drivers/gpu/drm/panthor/panthor_gem.c
@@ -184,14 +184,130 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
return ERR_PTR(ret);
}
+static struct sg_table *
+panthor_gem_prime_map_dma_buf(struct dma_buf_attachment *attach,
+ enum dma_data_direction dir)
+{
+ struct sg_table *sgt = drm_gem_map_dma_buf(attach, dir);
+
+ if (!IS_ERR(sgt))
+ attach->priv = sgt;
+
+ return sgt;
+}
+
+static void
+panthor_gem_prime_unmap_dma_buf(struct dma_buf_attachment *attach,
+ struct sg_table *sgt,
+ enum dma_data_direction dir)
+{
+ attach->priv = NULL;
+ drm_gem_unmap_dma_buf(attach, sgt, dir);
+}
+
+static int
+panthor_gem_prime_begin_cpu_access(struct dma_buf *dma_buf,
+ enum dma_data_direction dir)
+{
+ struct drm_gem_object *obj = dma_buf->priv;
+ struct drm_device *dev = obj->dev;
+ struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
+ struct dma_buf_attachment *attach;
+
+ dma_resv_lock(obj->resv, NULL);
+ if (shmem->sgt)
+ dma_sync_sgtable_for_cpu(dev->dev, shmem->sgt, dir);
+
+ if (shmem->vaddr)
+ invalidate_kernel_vmap_range(shmem->vaddr, shmem->base.size);
+
+ list_for_each_entry(attach, &dma_buf->attachments, node) {
+ struct sg_table *sgt = attach->priv;
+
+ if (sgt)
+ dma_sync_sgtable_for_cpu(attach->dev, sgt, dir);
+ }
+ dma_resv_unlock(obj->resv);
+
+ return 0;
+}
+
+static int
+panthor_gem_prime_end_cpu_access(struct dma_buf *dma_buf,
+ enum dma_data_direction dir)
+{
+ struct drm_gem_object *obj = dma_buf->priv;
+ struct drm_device *dev = obj->dev;
+ struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
+ struct dma_buf_attachment *attach;
+
+ dma_resv_lock(obj->resv, NULL);
+ list_for_each_entry(attach, &dma_buf->attachments, node) {
+ struct sg_table *sgt = attach->priv;
+
+ if (sgt)
+ dma_sync_sgtable_for_device(attach->dev, sgt, dir);
+ }
+
+ if (shmem->vaddr)
+ flush_kernel_vmap_range(shmem->vaddr, shmem->base.size);
+
+ if (shmem->sgt)
+ dma_sync_sgtable_for_device(dev->dev, shmem->sgt, dir);
+
+ dma_resv_unlock(obj->resv);
+ return 0;
+}
+
+static const struct dma_buf_ops panthor_dma_buf_ops = {
+ .attach = drm_gem_map_attach,
+ .detach = drm_gem_map_detach,
+ .map_dma_buf = panthor_gem_prime_map_dma_buf,
+ .unmap_dma_buf = panthor_gem_prime_unmap_dma_buf,
+ .release = drm_gem_dmabuf_release,
+ .mmap = drm_gem_dmabuf_mmap,
+ .vmap = drm_gem_dmabuf_vmap,
+ .vunmap = drm_gem_dmabuf_vunmap,
+ .begin_cpu_access = panthor_gem_prime_begin_cpu_access,
+ .end_cpu_access = panthor_gem_prime_end_cpu_access,
+};
+
static struct dma_buf *
panthor_gem_prime_export(struct drm_gem_object *obj, int flags)
{
+ struct drm_device *dev = obj->dev;
+ struct dma_buf_export_info exp_info = {
+ .exp_name = KBUILD_MODNAME,
+ .owner = THIS_MODULE,
+ .ops = &panthor_dma_buf_ops,
+ .size = obj->size,
+ .flags = flags,
+ .priv = obj,
+ .resv = obj->resv,
+ };
+
/* We can't export GEMs that have an exclusive VM. */
if (to_panthor_bo(obj)->exclusive_vm_root_gem)
return ERR_PTR(-EINVAL);
- return drm_gem_prime_export(obj, flags);
+ return drm_gem_dmabuf_export(dev, &exp_info);
+}
+
+struct drm_gem_object *
+panthor_gem_prime_import(struct drm_device *dev,
+ struct dma_buf *dma_buf)
+{
+ struct drm_gem_object *obj = dma_buf->priv;
+
+ if (dma_buf->ops == &panthor_dma_buf_ops && obj->dev == dev) {
+ /* Importing dmabuf exported from our own gem increases
+ * refcount on gem itself instead of f_count of dmabuf.
+ */
+ drm_gem_object_get(obj);
+ return obj;
+ }
+
+ return drm_gem_prime_import(dev, dma_buf);
}
static enum drm_gem_object_status panthor_gem_status(struct drm_gem_object *obj)
diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h
index 80c6e24112d0..91d1880f8a5d 100644
--- a/drivers/gpu/drm/panthor/panthor_gem.h
+++ b/drivers/gpu/drm/panthor/panthor_gem.h
@@ -147,6 +147,10 @@ panthor_gem_create_with_handle(struct drm_file *file,
void panthor_gem_bo_set_label(struct drm_gem_object *obj, const char *label);
void panthor_gem_kernel_bo_set_label(struct panthor_kernel_bo *bo, const char *label);
+struct drm_gem_object *
+panthor_gem_prime_import(struct drm_device *dev,
+ struct dma_buf *dma_buf);
+
static inline u64
panthor_kernel_bo_gpuva(struct panthor_kernel_bo *bo)
{
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 02/13] drm/panthor: Fix panthor_gpu_coherency_set()
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 01/13] drm/panthor: Provide a custom dma_buf implementation Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 03/13] drm/panthor: Expose the selected coherency protocol to the UMD Boris Brezillon
` (10 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, Boris Brezillon, kernel, Akash Goel
GPU_COHERENCY_PROTOCOL takes one of GPU_COHERENCY_xx
not BIT(GPU_COHERENCY_xx).
v3:
- New commit
v4:
- Add Steve's R-b
v5:
- No changes
v6:
- No changes
v7:
- No changes
Cc: Akash Goel <akash.goel@arm.com>
Fixes: dd7db8d911a1 ("drm/panthor: Explicitly set the coherency mode")
Reported-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
drivers/gpu/drm/panthor/panthor_gpu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index 9cb5dee93212..ff5231269518 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -51,7 +51,7 @@ struct panthor_gpu {
static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
{
gpu_write(ptdev, GPU_COHERENCY_PROTOCOL,
- ptdev->coherent ? GPU_COHERENCY_PROT_BIT(ACE_LITE) : GPU_COHERENCY_NONE);
+ ptdev->coherent ? GPU_COHERENCY_ACE_LITE : GPU_COHERENCY_NONE);
}
static void panthor_gpu_l2_config_set(struct panthor_device *ptdev)
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 03/13] drm/panthor: Expose the selected coherency protocol to the UMD
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 01/13] drm/panthor: Provide a custom dma_buf implementation Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 02/13] drm/panthor: Fix panthor_gpu_coherency_set() Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 04/13] drm/panthor: Add a PANTHOR_BO_SYNC ioctl Boris Brezillon
` (9 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, Boris Brezillon, kernel
If we want to be able to skip CPU cache maintenance operations on
CPU-cached mappings, the UMD needs to know the kind of coherency
in place. Add a field to drm_panthor_gpu_info to do that. We can re-use
a padding field for that since this object is write-only from the
KMD perspective, and the UMD should just ignore it.
v2:
- New commit
v3:
- Make coherency protocol a real enum, not a bitmask
- Add BUILD_BUG_ON()s to make sure the values in panthor_regs.h and
those exposed through the uAPI match
v4:
- Add Steve's R-b
v5:
- No changes
v6:
- No changes
v7:
- Fix kernel doc
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
drivers/gpu/drm/panthor/panthor_device.c | 10 +++++-
drivers/gpu/drm/panthor/panthor_gpu.c | 2 +-
include/uapi/drm/panthor_drm.h | 39 ++++++++++++++++++++++--
3 files changed, 46 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
index e133b1e0ad6d..a66fc66999c2 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -27,6 +27,12 @@
static int panthor_gpu_coherency_init(struct panthor_device *ptdev)
{
+ BUILD_BUG_ON(GPU_COHERENCY_NONE != DRM_PANTHOR_GPU_COHERENCY_NONE);
+ BUILD_BUG_ON(GPU_COHERENCY_ACE_LITE != DRM_PANTHOR_GPU_COHERENCY_ACE_LITE);
+ BUILD_BUG_ON(GPU_COHERENCY_ACE != DRM_PANTHOR_GPU_COHERENCY_ACE);
+
+ /* Start with no coherency, and update it if the device is flagged coherent. */
+ ptdev->gpu_info.selected_coherency = GPU_COHERENCY_NONE;
ptdev->coherent = device_get_dma_attr(ptdev->base.dev) == DEV_DMA_COHERENT;
if (!ptdev->coherent)
@@ -36,8 +42,10 @@ static int panthor_gpu_coherency_init(struct panthor_device *ptdev)
* ACE protocol has never been supported for command stream frontend GPUs.
*/
if ((gpu_read(ptdev, GPU_COHERENCY_FEATURES) &
- GPU_COHERENCY_PROT_BIT(ACE_LITE)))
+ GPU_COHERENCY_PROT_BIT(ACE_LITE))) {
+ ptdev->gpu_info.selected_coherency = GPU_COHERENCY_ACE_LITE;
return 0;
+ }
drm_err(&ptdev->base, "Coherency not supported by the device");
return -ENOTSUPP;
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index ff5231269518..057e167468d0 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -51,7 +51,7 @@ struct panthor_gpu {
static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
{
gpu_write(ptdev, GPU_COHERENCY_PROTOCOL,
- ptdev->coherent ? GPU_COHERENCY_ACE_LITE : GPU_COHERENCY_NONE);
+ ptdev->gpu_info.selected_coherency);
}
static void panthor_gpu_l2_config_set(struct panthor_device *ptdev)
diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
index 467d365ed7ba..28cf9e878db6 100644
--- a/include/uapi/drm/panthor_drm.h
+++ b/include/uapi/drm/panthor_drm.h
@@ -245,6 +245,26 @@ enum drm_panthor_dev_query_type {
DRM_PANTHOR_DEV_QUERY_GROUP_PRIORITIES_INFO,
};
+/**
+ * enum drm_panthor_gpu_coherency: Type of GPU coherency
+ */
+enum drm_panthor_gpu_coherency {
+ /**
+ * @DRM_PANTHOR_GPU_COHERENCY_ACE_LITE: ACE Lite coherency.
+ */
+ DRM_PANTHOR_GPU_COHERENCY_ACE_LITE = 0,
+
+ /**
+ * @DRM_PANTHOR_GPU_COHERENCY_ACE: ACE coherency.
+ */
+ DRM_PANTHOR_GPU_COHERENCY_ACE = 1,
+
+ /**
+ * @DRM_PANTHOR_GPU_COHERENCY_NONE: No coherency.
+ */
+ DRM_PANTHOR_GPU_COHERENCY_NONE = 31,
+};
+
/**
* struct drm_panthor_gpu_info - GPU information
*
@@ -301,7 +321,16 @@ struct drm_panthor_gpu_info {
*/
__u32 thread_max_barrier_size;
- /** @coherency_features: Coherency features. */
+ /**
+ * @coherency_features: Coherency features.
+ *
+ * Combination of drm_panthor_gpu_coherency flags.
+ *
+ * Note that this is just what the coherency protocols supported by the
+ * GPU, but the actual coherency in place depends on the SoC
+ * integration and is reflected by
+ * drm_panthor_gpu_info::selected_coherency.
+ */
__u32 coherency_features;
/** @texture_features: Texture features. */
@@ -310,8 +339,12 @@ struct drm_panthor_gpu_info {
/** @as_present: Bitmask encoding the number of address-space exposed by the MMU. */
__u32 as_present;
- /** @pad0: MBZ. */
- __u32 pad0;
+ /**
+ * @select_coherency: Coherency selected for this device.
+ *
+ * One of drm_panthor_gpu_coherency.
+ */
+ __u32 selected_coherency;
/** @shader_present: Bitmask encoding the shader cores exposed by the GPU. */
__u64 shader_present;
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 04/13] drm/panthor: Add a PANTHOR_BO_SYNC ioctl
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
` (2 preceding siblings ...)
2025-12-03 9:01 ` [PATCH v7 03/13] drm/panthor: Expose the selected coherency protocol to the UMD Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-04 15:42 ` Steven Price
2025-12-03 9:01 ` [PATCH v7 05/13] drm/panthor: Add an ioctl to query BO flags Boris Brezillon
` (8 subsequent siblings)
12 siblings, 1 reply; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, Boris Brezillon, kernel
This will be used by the UMD to synchronize CPU-cached mappings when
the UMD can't do it directly (no usermode cache maintenance instruction
on Arm32).
v2:
- Change the flags so they better match the drm_gem_shmem_sync()
semantics
v3:
- Add Steve's R-b
v4:
- No changes
v5:
- Drop Steve's R-b (the semantics changes call for a new review)
v6:
- Drop ret initialization in panthor_ioctl_bo_sync()
- Bail out early in panthor_ioctl_bo_sync() if ops.count is zero
- Drop unused PANTHOR_BO_SYNC_OP_FLAGS definition
v7:
- Hand-roll the sync logic (was previously provided by gem_shmem)
Signed-off-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_drv.c | 41 ++++++++++++-
drivers/gpu/drm/panthor/panthor_gem.c | 85 +++++++++++++++++++++++++++
drivers/gpu/drm/panthor/panthor_gem.h | 2 +
include/uapi/drm/panthor_drm.h | 52 ++++++++++++++++
4 files changed, 179 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index d12ac4cb0ac4..cab19621917f 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -177,7 +177,8 @@ panthor_get_uobj_array(const struct drm_panthor_obj_array *in, u32 min_stride,
PANTHOR_UOBJ_DECL(struct drm_panthor_sync_op, timeline_value), \
PANTHOR_UOBJ_DECL(struct drm_panthor_queue_submit, syncs), \
PANTHOR_UOBJ_DECL(struct drm_panthor_queue_create, ringbuf_size), \
- PANTHOR_UOBJ_DECL(struct drm_panthor_vm_bind_op, syncs))
+ PANTHOR_UOBJ_DECL(struct drm_panthor_vm_bind_op, syncs), \
+ PANTHOR_UOBJ_DECL(struct drm_panthor_bo_sync_op, size))
/**
* PANTHOR_UOBJ_SET() - Copy a kernel object to a user object.
@@ -1396,6 +1397,43 @@ static int panthor_ioctl_set_user_mmio_offset(struct drm_device *ddev,
return 0;
}
+static int panthor_ioctl_bo_sync(struct drm_device *ddev, void *data,
+ struct drm_file *file)
+{
+ struct drm_panthor_bo_sync *args = data;
+ struct drm_panthor_bo_sync_op *ops;
+ struct drm_gem_object *obj;
+ int ret;
+
+ if (!args->ops.count)
+ return 0;
+
+ ret = PANTHOR_UOBJ_GET_ARRAY(ops, &args->ops);
+ if (ret)
+ return ret;
+
+ for (u32 i = 0; i < args->ops.count; i++) {
+ obj = drm_gem_object_lookup(file, ops[i].handle);
+ if (!obj) {
+ ret = -ENOENT;
+ goto err_ops;
+ }
+
+ ret = panthor_gem_sync(obj, ops[i].type, ops[i].offset,
+ ops[i].size);
+
+ drm_gem_object_put(obj);
+
+ if (ret)
+ goto err_ops;
+ }
+
+err_ops:
+ kvfree(ops);
+
+ return ret;
+}
+
static int
panthor_open(struct drm_device *ddev, struct drm_file *file)
{
@@ -1470,6 +1508,7 @@ static const struct drm_ioctl_desc panthor_drm_driver_ioctls[] = {
PANTHOR_IOCTL(GROUP_SUBMIT, group_submit, DRM_RENDER_ALLOW),
PANTHOR_IOCTL(BO_SET_LABEL, bo_set_label, DRM_RENDER_ALLOW),
PANTHOR_IOCTL(SET_USER_MMIO_OFFSET, set_user_mmio_offset, DRM_RENDER_ALLOW),
+ PANTHOR_IOCTL(BO_SYNC, bo_sync, DRM_RENDER_ALLOW),
};
static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
index 173d42d65000..4be32fc1732b 100644
--- a/drivers/gpu/drm/panthor/panthor_gem.c
+++ b/drivers/gpu/drm/panthor/panthor_gem.c
@@ -447,6 +447,91 @@ panthor_gem_kernel_bo_set_label(struct panthor_kernel_bo *bo, const char *label)
panthor_gem_bo_set_label(bo->obj, str);
}
+int
+panthor_gem_sync(struct drm_gem_object *obj, u32 type,
+ u64 offset, u64 size)
+{
+ struct panthor_gem_object *bo = to_panthor_bo(obj);
+ struct drm_gem_shmem_object *shmem = &bo->base;
+ const struct drm_device *dev = shmem->base.dev;
+ struct sg_table *sgt;
+ struct scatterlist *sgl;
+ unsigned int count;
+
+ /* Make sure the range is in bounds. */
+ if (offset + size < offset || offset + size > shmem->base.size)
+ return -EINVAL;
+
+ /* Disallow CPU-cache maintenance on imported buffers. */
+ if (drm_gem_is_imported(&shmem->base))
+ return -EINVAL;
+
+ switch (type) {
+ case DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH:
+ case DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE:
+ break;
+
+ default:
+ return -EINVAL;
+ }
+
+ /* Don't bother if it's WC-mapped */
+ if (shmem->map_wc)
+ return 0;
+
+ /* Nothing to do if the size is zero. */
+ if (size == 0)
+ return 0;
+
+ sgt = drm_gem_shmem_get_pages_sgt(shmem);
+ if (IS_ERR(sgt))
+ return PTR_ERR(sgt);
+
+ for_each_sgtable_dma_sg(sgt, sgl, count) {
+ if (size == 0)
+ break;
+
+ dma_addr_t paddr = sg_dma_address(sgl);
+ size_t len = sg_dma_len(sgl);
+
+ if (len <= offset) {
+ offset -= len;
+ continue;
+ }
+
+ paddr += offset;
+ len -= offset;
+ len = min_t(size_t, len, size);
+ size -= len;
+ offset = 0;
+
+ /* It's unclear whether dma_sync_xxx() is the right API to do CPU
+ * cache maintenance given an IOMMU can register their own
+ * implementation doing more than just CPU cache flushes/invalidation,
+ * and what we really care about here is CPU caches only, but that's
+ * the best we have that is both arch-agnostic and does at least the
+ * CPU cache maintenance on a <page,offset,size> tuple.
+ *
+ * Also, I wish we could do a single
+ *
+ * dma_sync_single_for_device(BIDIR)
+ *
+ * and get a flush+invalidate, but that's not how it's implemented
+ * in practice (at least on arm64), so we have to make it
+ *
+ * dma_sync_single_for_device(TO_DEVICE)
+ * dma_sync_single_for_cpu(FROM_DEVICE)
+ *
+ * for the flush+invalidate case.
+ */
+ dma_sync_single_for_device(dev->dev, paddr, len, DMA_TO_DEVICE);
+ if (type == DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE)
+ dma_sync_single_for_cpu(dev->dev, paddr, len, DMA_FROM_DEVICE);
+ }
+
+ return 0;
+}
+
#ifdef CONFIG_DEBUG_FS
struct gem_size_totals {
size_t size;
diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h
index 91d1880f8a5d..bbf9ae75c360 100644
--- a/drivers/gpu/drm/panthor/panthor_gem.h
+++ b/drivers/gpu/drm/panthor/panthor_gem.h
@@ -146,6 +146,8 @@ panthor_gem_create_with_handle(struct drm_file *file,
void panthor_gem_bo_set_label(struct drm_gem_object *obj, const char *label);
void panthor_gem_kernel_bo_set_label(struct panthor_kernel_bo *bo, const char *label);
+int panthor_gem_sync(struct drm_gem_object *obj,
+ u32 type, u64 offset, u64 size);
struct drm_gem_object *
panthor_gem_prime_import(struct drm_device *dev,
diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
index 28cf9e878db6..9f810305db6e 100644
--- a/include/uapi/drm/panthor_drm.h
+++ b/include/uapi/drm/panthor_drm.h
@@ -144,6 +144,9 @@ enum drm_panthor_ioctl_id {
* pgoff_t size.
*/
DRM_PANTHOR_SET_USER_MMIO_OFFSET,
+
+ /** @DRM_PANTHOR_BO_SYNC: Sync BO data to/from the device */
+ DRM_PANTHOR_BO_SYNC,
};
/**
@@ -1073,6 +1076,53 @@ struct drm_panthor_set_user_mmio_offset {
__u64 offset;
};
+/**
+ * enum drm_panthor_bo_sync_op_type - BO sync type
+ */
+enum drm_panthor_bo_sync_op_type {
+ /** @DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH: Flush CPU caches. */
+ DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH = 0,
+
+ /** @DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE: Flush and invalidate CPU caches. */
+ DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE = 1,
+};
+
+/**
+ * struct drm_panthor_bo_sync_op - BO map sync op
+ */
+struct drm_panthor_bo_sync_op {
+ /** @handle: Handle of the buffer object to sync. */
+ __u32 handle;
+
+ /** @type: Type of operation. */
+ __u32 type;
+
+ /**
+ * @offset: Offset into the BO at which the sync range starts.
+ *
+ * This will be rounded down to the nearest cache line as needed.
+ */
+ __u64 offset;
+
+ /**
+ * @size: Size of the range to sync
+ *
+ * @size + @offset will be rounded up to the nearest cache line as
+ * needed.
+ */
+ __u64 size;
+};
+
+/**
+ * struct drm_panthor_bo_sync - BO map sync request
+ */
+struct drm_panthor_bo_sync {
+ /**
+ * @ops: Array of struct drm_panthor_bo_sync_op sync operations.
+ */
+ struct drm_panthor_obj_array ops;
+};
+
/**
* DRM_IOCTL_PANTHOR() - Build a Panthor IOCTL number
* @__access: Access type. Must be R, W or RW.
@@ -1119,6 +1169,8 @@ enum {
DRM_IOCTL_PANTHOR(WR, BO_SET_LABEL, bo_set_label),
DRM_IOCTL_PANTHOR_SET_USER_MMIO_OFFSET =
DRM_IOCTL_PANTHOR(WR, SET_USER_MMIO_OFFSET, set_user_mmio_offset),
+ DRM_IOCTL_PANTHOR_BO_SYNC =
+ DRM_IOCTL_PANTHOR(WR, BO_SYNC, bo_sync),
};
#if defined(__cplusplus)
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 05/13] drm/panthor: Add an ioctl to query BO flags
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
` (3 preceding siblings ...)
2025-12-03 9:01 ` [PATCH v7 04/13] drm/panthor: Add a PANTHOR_BO_SYNC ioctl Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 06/13] drm/panthor: Add flag to map GEM object Write-Back Cacheable Boris Brezillon
` (7 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, Boris Brezillon, kernel
This is useful when importing BOs, so we can know about cacheability
and flush the caches when needed.
We can also know when the buffer comes from a different subsystem and
take proper actions (avoid CPU mappings, or do kernel-based syncs
instead of userland cache flushes).
v2:
- New commit
v3:
- Add Steve's R-b
v4:
- No changes
v5:
- No changes
v6:
- No changes
v7:
- No changes
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
drivers/gpu/drm/panthor/panthor_drv.c | 24 +++++++++++
include/uapi/drm/panthor_drm.h | 57 +++++++++++++++++++++++++++
2 files changed, 81 insertions(+)
diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index cab19621917f..db04f13217b9 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -1434,6 +1434,29 @@ static int panthor_ioctl_bo_sync(struct drm_device *ddev, void *data,
return ret;
}
+static int panthor_ioctl_bo_query_info(struct drm_device *ddev, void *data,
+ struct drm_file *file)
+{
+ struct drm_panthor_bo_query_info *args = data;
+ struct panthor_gem_object *bo;
+ struct drm_gem_object *obj;
+
+ obj = drm_gem_object_lookup(file, args->handle);
+ if (!obj)
+ return -ENOENT;
+
+ bo = to_panthor_bo(obj);
+ args->pad = 0;
+ args->create_flags = bo->flags;
+
+ args->extra_flags = 0;
+ if (drm_gem_is_imported(&bo->base.base))
+ args->extra_flags |= DRM_PANTHOR_BO_IS_IMPORTED;
+
+ drm_gem_object_put(obj);
+ return 0;
+}
+
static int
panthor_open(struct drm_device *ddev, struct drm_file *file)
{
@@ -1509,6 +1532,7 @@ static const struct drm_ioctl_desc panthor_drm_driver_ioctls[] = {
PANTHOR_IOCTL(BO_SET_LABEL, bo_set_label, DRM_RENDER_ALLOW),
PANTHOR_IOCTL(SET_USER_MMIO_OFFSET, set_user_mmio_offset, DRM_RENDER_ALLOW),
PANTHOR_IOCTL(BO_SYNC, bo_sync, DRM_RENDER_ALLOW),
+ PANTHOR_IOCTL(BO_QUERY_INFO, bo_query_info, DRM_RENDER_ALLOW),
};
static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
index 9f810305db6e..39d5ce815742 100644
--- a/include/uapi/drm/panthor_drm.h
+++ b/include/uapi/drm/panthor_drm.h
@@ -147,6 +147,13 @@ enum drm_panthor_ioctl_id {
/** @DRM_PANTHOR_BO_SYNC: Sync BO data to/from the device */
DRM_PANTHOR_BO_SYNC,
+
+ /**
+ * @DRM_PANTHOR_BO_QUERY_INFO: Query information about a BO.
+ *
+ * This is useful for imported BOs.
+ */
+ DRM_PANTHOR_BO_QUERY_INFO,
};
/**
@@ -1123,6 +1130,54 @@ struct drm_panthor_bo_sync {
struct drm_panthor_obj_array ops;
};
+/**
+ * enum drm_panthor_bo_extra_flags - Set of flags returned on a BO_QUERY_INFO request
+ *
+ * Those are flags reflecting BO properties that are not directly coming from the flags
+ * passed are creation time, or information on BOs that were imported from other drivers.
+ */
+enum drm_panthor_bo_extra_flags {
+ /**
+ * @DRM_PANTHOR_BO_IS_IMPORTED: BO has been imported from an external driver.
+ *
+ * Note that imported dma-buf handles are not flagged as imported if they
+ * where exported by panthor. Only buffers that are coming from other drivers
+ * (dma heaps, other GPUs, display controllers, V4L, ...).
+ *
+ * It's also important to note that all imported BOs are mapped cached and can't
+ * be considered IO-coherent even if the GPU is. This means they require explicit
+ * syncs that must go through the DRM_PANTHOR_BO_SYNC ioctl (userland cache
+ * maintenance is not allowed in that case, because extra operations might be
+ * needed to make changes visible to the CPU/device, like buffer migration when the
+ * exporter is a GPU with its own VRAM).
+ */
+ DRM_PANTHOR_BO_IS_IMPORTED = (1 << 0),
+};
+
+/**
+ * struct drm_panthor_bo_query_info - Query BO info
+ */
+struct drm_panthor_bo_query_info {
+ /** @handle: Handle of the buffer object to query flags on. */
+ __u32 handle;
+
+ /**
+ * @extra_flags: Combination of enum drm_panthor_bo_extra_flags flags.
+ */
+ __u32 extra_flags;
+
+ /**
+ * @create_flags: Flags passed at creation time.
+ *
+ * Combination of enum drm_panthor_bo_flags flags.
+ * Will be zero if the buffer comes from a different driver.
+ */
+ __u32 create_flags;
+
+ /** @pad: Will be zero on return. */
+ __u32 pad;
+};
+
/**
* DRM_IOCTL_PANTHOR() - Build a Panthor IOCTL number
* @__access: Access type. Must be R, W or RW.
@@ -1171,6 +1226,8 @@ enum {
DRM_IOCTL_PANTHOR(WR, SET_USER_MMIO_OFFSET, set_user_mmio_offset),
DRM_IOCTL_PANTHOR_BO_SYNC =
DRM_IOCTL_PANTHOR(WR, BO_SYNC, bo_sync),
+ DRM_IOCTL_PANTHOR_BO_QUERY_INFO =
+ DRM_IOCTL_PANTHOR(WR, BO_QUERY_INFO, bo_query_info),
};
#if defined(__cplusplus)
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 06/13] drm/panthor: Add flag to map GEM object Write-Back Cacheable
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
` (4 preceding siblings ...)
2025-12-03 9:01 ` [PATCH v7 05/13] drm/panthor: Add an ioctl to query BO flags Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-04 15:44 ` Steven Price
2025-12-03 9:01 ` [PATCH v7 07/13] drm/panthor: Bump the driver version to 1.6 Boris Brezillon
` (6 subsequent siblings)
12 siblings, 1 reply; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, Loïc Molinari, kernel, Boris Brezillon
From: Loïc Molinari <loic.molinari@collabora.com>
Will be used by the UMD to optimize CPU accesses to buffers
that are frequently read by the CPU, or on which the access
pattern makes non-cacheable mappings inefficient.
Mapping buffers CPU-cached implies taking care of the CPU
cache maintenance in the UMD, unless the GPU is IO coherent.
v2:
- Add more to the commit message
- Tweak the doc
- Make sure we sync the section of the BO pointing to the CS
syncobj before we read its seqno
v3:
- Fix formatting/spelling issues
v4:
- Add Steve's R-b
v5:
- Drop Steve's R-b (changes in the ioctl semantics requiring
new review)
v6:
- Fix the uAPI doc
- Fix inverted logic in some comment
v7:
- No changes
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_drv.c | 7 ++++-
drivers/gpu/drm/panthor/panthor_gem.c | 37 +++++++++++++++++++++++--
drivers/gpu/drm/panthor/panthor_sched.c | 18 ++++++++++--
include/uapi/drm/panthor_drm.h | 9 ++++++
4 files changed, 66 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index db04f13217b9..1bb6a20c497a 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -902,7 +902,8 @@ static int panthor_ioctl_vm_destroy(struct drm_device *ddev, void *data,
return panthor_vm_pool_destroy_vm(pfile->vms, args->id);
}
-#define PANTHOR_BO_FLAGS DRM_PANTHOR_BO_NO_MMAP
+#define PANTHOR_BO_FLAGS (DRM_PANTHOR_BO_NO_MMAP | \
+ DRM_PANTHOR_BO_WB_MMAP)
static int panthor_ioctl_bo_create(struct drm_device *ddev, void *data,
struct drm_file *file)
@@ -921,6 +922,10 @@ static int panthor_ioctl_bo_create(struct drm_device *ddev, void *data,
goto out_dev_exit;
}
+ if ((args->flags & DRM_PANTHOR_BO_NO_MMAP) &&
+ (args->flags & DRM_PANTHOR_BO_WB_MMAP))
+ return -EINVAL;
+
if (args->exclusive_vm_id) {
vm = panthor_vm_pool_get_vm(pfile->vms, args->exclusive_vm_id);
if (!vm) {
diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
index 4be32fc1732b..7a9eb6010f6f 100644
--- a/drivers/gpu/drm/panthor/panthor_gem.c
+++ b/drivers/gpu/drm/panthor/panthor_gem.c
@@ -59,6 +59,39 @@ static void panthor_gem_debugfs_set_usage_flags(struct panthor_gem_object *bo, u
static void panthor_gem_debugfs_bo_init(struct panthor_gem_object *bo) {}
#endif
+static bool
+should_map_wc(struct panthor_gem_object *bo, struct panthor_vm *exclusive_vm)
+{
+ struct panthor_device *ptdev = container_of(bo->base.base.dev, struct panthor_device, base);
+
+ /* We can't do uncached mappings if the device is coherent,
+ * because the zeroing done by the shmem layer at page allocation
+ * time happens on a cached mapping which isn't CPU-flushed (at least
+ * not on Arm64 where the flush is deferred to PTE setup time, and
+ * only done conditionally based on the mapping permissions). We can't
+ * rely on dma_map_sgtable()/dma_sync_sgtable_for_xxx() either to flush
+ * those, because they are NOPed if dma_dev_coherent() returns true.
+ *
+ * FIXME: Note that this problem is going to pop up again when we
+ * decide to support mapping buffers with the NO_MMAP flag as
+ * non-shareable (AKA buffers accessed only by the GPU), because we
+ * need the same CPU flush to happen after page allocation, otherwise
+ * there's a risk of data leak or late corruption caused by a dirty
+ * cacheline being evicted. At this point we'll need a way to force
+ * CPU cache maintenance regardless of whether the device is coherent
+ * or not.
+ */
+ if (ptdev->coherent)
+ return false;
+
+ /* Cached mappings are explicitly requested, so no write-combine. */
+ if (bo->flags & DRM_PANTHOR_BO_WB_MMAP)
+ return false;
+
+ /* The default is write-combine. */
+ return true;
+}
+
static void panthor_gem_free_object(struct drm_gem_object *obj)
{
struct panthor_gem_object *bo = to_panthor_bo(obj);
@@ -145,6 +178,7 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
bo = to_panthor_bo(&obj->base);
kbo->obj = &obj->base;
bo->flags = bo_flags;
+ bo->base.map_wc = should_map_wc(bo, vm);
bo->exclusive_vm_root_gem = panthor_vm_root_gem(vm);
drm_gem_object_get(bo->exclusive_vm_root_gem);
bo->base.base.resv = bo->exclusive_vm_root_gem->resv;
@@ -345,7 +379,6 @@ static const struct drm_gem_object_funcs panthor_gem_funcs = {
*/
struct drm_gem_object *panthor_gem_create_object(struct drm_device *ddev, size_t size)
{
- struct panthor_device *ptdev = container_of(ddev, struct panthor_device, base);
struct panthor_gem_object *obj;
obj = kzalloc(sizeof(*obj), GFP_KERNEL);
@@ -353,7 +386,6 @@ struct drm_gem_object *panthor_gem_create_object(struct drm_device *ddev, size_t
return ERR_PTR(-ENOMEM);
obj->base.base.funcs = &panthor_gem_funcs;
- obj->base.map_wc = !ptdev->coherent;
mutex_init(&obj->label.lock);
panthor_gem_debugfs_bo_init(obj);
@@ -388,6 +420,7 @@ panthor_gem_create_with_handle(struct drm_file *file,
bo = to_panthor_bo(&shmem->base);
bo->flags = flags;
+ bo->base.map_wc = should_map_wc(bo, exclusive_vm);
if (exclusive_vm) {
bo->exclusive_vm_root_gem = panthor_vm_root_gem(exclusive_vm);
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 389d508b3848..ae69a5704756 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -863,8 +863,11 @@ panthor_queue_get_syncwait_obj(struct panthor_group *group, struct panthor_queue
struct iosys_map map;
int ret;
- if (queue->syncwait.kmap)
- return queue->syncwait.kmap + queue->syncwait.offset;
+ if (queue->syncwait.kmap) {
+ bo = container_of(queue->syncwait.obj,
+ struct panthor_gem_object, base.base);
+ goto out_sync;
+ }
bo = panthor_vm_get_bo_for_va(group->vm,
queue->syncwait.gpu_va,
@@ -881,6 +884,17 @@ panthor_queue_get_syncwait_obj(struct panthor_group *group, struct panthor_queue
if (drm_WARN_ON(&ptdev->base, !queue->syncwait.kmap))
goto err_put_syncwait_obj;
+out_sync:
+ /* Make sure the CPU caches are invalidated before the seqno is read.
+ * drm_gem_shmem_sync() is a NOP if map_wc=true, so no need to check
+ * it here.
+ */
+ panthor_gem_sync(&bo->base.base, queue->syncwait.offset,
+ queue->syncwait.sync64 ?
+ sizeof(struct panthor_syncobj_64b) :
+ sizeof(struct panthor_syncobj_32b),
+ DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE);
+
return queue->syncwait.kmap + queue->syncwait.offset;
err_put_syncwait_obj:
diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
index 39d5ce815742..e238c6264fa1 100644
--- a/include/uapi/drm/panthor_drm.h
+++ b/include/uapi/drm/panthor_drm.h
@@ -681,6 +681,15 @@ struct drm_panthor_vm_get_state {
enum drm_panthor_bo_flags {
/** @DRM_PANTHOR_BO_NO_MMAP: The buffer object will never be CPU-mapped in userspace. */
DRM_PANTHOR_BO_NO_MMAP = (1 << 0),
+
+ /**
+ * @DRM_PANTHOR_BO_WB_MMAP: Force "Write-Back Cacheable" CPU mapping.
+ *
+ * CPU map the buffer object in userspace by forcing the "Write-Back
+ * Cacheable" cacheability attribute. The mapping otherwise uses the
+ * "Non-Cacheable" attribute if the GPU is not IO coherent.
+ */
+ DRM_PANTHOR_BO_WB_MMAP = (1 << 1),
};
/**
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 07/13] drm/panthor: Bump the driver version to 1.6
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
` (5 preceding siblings ...)
2025-12-03 9:01 ` [PATCH v7 06/13] drm/panthor: Add flag to map GEM object Write-Back Cacheable Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 08/13] drm/panfrost: Provide a custom dma_buf implementation Boris Brezillon
` (5 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, kernel, Boris Brezillon
From: Faith Ekstrand <faith.ekstrand@collabora.com>
Bump the driver version to reflect the new cached-CPU mapping
capability.
v2:
- Quickly describe what the new version exposes in the commit message
v3:
- Add Steve's R-b
v4:
- No changes
v5:
- No changes
v6:
- No changes
v7:
- No changes
Signed-off-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
drivers/gpu/drm/panthor/panthor_drv.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index 1bb6a20c497a..80ebc0b3721c 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -1672,6 +1672,10 @@ static void panthor_debugfs_init(struct drm_minor *minor)
* - 1.3 - adds DRM_PANTHOR_GROUP_STATE_INNOCENT flag
* - 1.4 - adds DRM_IOCTL_PANTHOR_BO_SET_LABEL ioctl
* - 1.5 - adds DRM_PANTHOR_SET_USER_MMIO_OFFSET ioctl
+ * - 1.6 - adds DRM_PANTHOR_BO_WB_MMAP flag
+ * - adds DRM_IOCTL_PANTHOR_BO_SYNC ioctl
+ * - adds DRM_IOCTL_PANTHOR_BO_QUERY_INFO ioctl
+ * - adds drm_panthor_gpu_info::selected_coherency
*/
static const struct drm_driver panthor_drm_driver = {
.driver_features = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
@@ -1685,7 +1689,7 @@ static const struct drm_driver panthor_drm_driver = {
.name = "panthor",
.desc = "Panthor DRM driver",
.major = 1,
- .minor = 5,
+ .minor = 6,
.gem_create_object = panthor_gem_create_object,
.gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table,
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 08/13] drm/panfrost: Provide a custom dma_buf implementation
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
` (6 preceding siblings ...)
2025-12-03 9:01 ` [PATCH v7 07/13] drm/panthor: Bump the driver version to 1.6 Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 09/13] drm/panfrost: Expose the selected coherency protocol to the UMD Boris Brezillon
` (4 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, Boris Brezillon, kernel
Before we introduce cached CPU mappings, we want a dma_buf
implementation satisfying synchronization requests around CPU
accesses coming from a dma_buf exported by our driver. Let's
provide our own implementation relying on the default
gem_shmem_prime helpers designed for that purpose.
v5:
- New patch
v6:
- Collect R-b
v7:
- Hand-roll our own dma_buf boilerplate
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
drivers/gpu/drm/panfrost/panfrost_drv.c | 1 +
drivers/gpu/drm/panfrost/panfrost_gem.c | 123 ++++++++++++++++++++++++
drivers/gpu/drm/panfrost/panfrost_gem.h | 3 +
3 files changed, 127 insertions(+)
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 7d8c7c337606..cd6001a7faa2 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -852,6 +852,7 @@ static const struct drm_driver panfrost_drm_driver = {
.minor = 5,
.gem_create_object = panfrost_gem_create_object,
+ .gem_prime_import = panfrost_gem_prime_import,
.gem_prime_import_sg_table = panfrost_gem_prime_import_sg_table,
#ifdef CONFIG_DEBUG_FS
.debugfs_init = panfrost_debugfs_init,
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 8041b65c6609..4afd1a7f77d5 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -259,6 +259,128 @@ static size_t panfrost_gem_rss(struct drm_gem_object *obj)
return 0;
}
+static struct sg_table *
+panfrost_gem_prime_map_dma_buf(struct dma_buf_attachment *attach,
+ enum dma_data_direction dir)
+{
+ struct sg_table *sgt = drm_gem_map_dma_buf(attach, dir);
+
+ if (!IS_ERR(sgt))
+ attach->priv = sgt;
+
+ return sgt;
+}
+
+static void
+panfrost_gem_prime_unmap_dma_buf(struct dma_buf_attachment *attach,
+ struct sg_table *sgt,
+ enum dma_data_direction dir)
+{
+ attach->priv = NULL;
+ drm_gem_unmap_dma_buf(attach, sgt, dir);
+}
+
+static int
+panfrost_gem_prime_begin_cpu_access(struct dma_buf *dma_buf,
+ enum dma_data_direction dir)
+{
+ struct drm_gem_object *obj = dma_buf->priv;
+ struct drm_device *dev = obj->dev;
+ struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
+ struct dma_buf_attachment *attach;
+
+ dma_resv_lock(obj->resv, NULL);
+ if (shmem->sgt)
+ dma_sync_sgtable_for_cpu(dev->dev, shmem->sgt, dir);
+
+ if (shmem->vaddr)
+ invalidate_kernel_vmap_range(shmem->vaddr, shmem->base.size);
+
+ list_for_each_entry(attach, &dma_buf->attachments, node) {
+ struct sg_table *sgt = attach->priv;
+
+ if (sgt)
+ dma_sync_sgtable_for_cpu(attach->dev, sgt, dir);
+ }
+ dma_resv_unlock(obj->resv);
+
+ return 0;
+}
+
+static int
+panfrost_gem_prime_end_cpu_access(struct dma_buf *dma_buf,
+ enum dma_data_direction dir)
+{
+ struct drm_gem_object *obj = dma_buf->priv;
+ struct drm_device *dev = obj->dev;
+ struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
+ struct dma_buf_attachment *attach;
+
+ dma_resv_lock(obj->resv, NULL);
+ list_for_each_entry(attach, &dma_buf->attachments, node) {
+ struct sg_table *sgt = attach->priv;
+
+ if (sgt)
+ dma_sync_sgtable_for_device(attach->dev, sgt, dir);
+ }
+
+ if (shmem->vaddr)
+ flush_kernel_vmap_range(shmem->vaddr, shmem->base.size);
+
+ if (shmem->sgt)
+ dma_sync_sgtable_for_device(dev->dev, shmem->sgt, dir);
+
+ dma_resv_unlock(obj->resv);
+ return 0;
+}
+
+static const struct dma_buf_ops panfrost_dma_buf_ops = {
+ .attach = drm_gem_map_attach,
+ .detach = drm_gem_map_detach,
+ .map_dma_buf = panfrost_gem_prime_map_dma_buf,
+ .unmap_dma_buf = panfrost_gem_prime_unmap_dma_buf,
+ .release = drm_gem_dmabuf_release,
+ .mmap = drm_gem_dmabuf_mmap,
+ .vmap = drm_gem_dmabuf_vmap,
+ .vunmap = drm_gem_dmabuf_vunmap,
+ .begin_cpu_access = panfrost_gem_prime_begin_cpu_access,
+ .end_cpu_access = panfrost_gem_prime_end_cpu_access,
+};
+
+static struct dma_buf *
+panfrost_gem_prime_export(struct drm_gem_object *obj, int flags)
+{
+ struct drm_device *dev = obj->dev;
+ struct dma_buf_export_info exp_info = {
+ .exp_name = KBUILD_MODNAME,
+ .owner = THIS_MODULE,
+ .ops = &panfrost_dma_buf_ops,
+ .size = obj->size,
+ .flags = flags,
+ .priv = obj,
+ .resv = obj->resv,
+ };
+
+ return drm_gem_dmabuf_export(dev, &exp_info);
+}
+
+struct drm_gem_object *
+panfrost_gem_prime_import(struct drm_device *dev,
+ struct dma_buf *dma_buf)
+{
+ struct drm_gem_object *obj = dma_buf->priv;
+
+ if (dma_buf->ops == &panfrost_dma_buf_ops && obj->dev == dev) {
+ /* Importing dmabuf exported from our own gem increases
+ * refcount on gem itself instead of f_count of dmabuf.
+ */
+ drm_gem_object_get(obj);
+ return obj;
+ }
+
+ return drm_gem_prime_import(dev, dma_buf);
+}
+
static const struct drm_gem_object_funcs panfrost_gem_funcs = {
.free = panfrost_gem_free_object,
.open = panfrost_gem_open,
@@ -267,6 +389,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs = {
.pin = panfrost_gem_pin,
.unpin = drm_gem_shmem_object_unpin,
.get_sg_table = drm_gem_shmem_object_get_sg_table,
+ .export = panfrost_gem_prime_export,
.vmap = drm_gem_shmem_object_vmap,
.vunmap = drm_gem_shmem_object_vunmap,
.mmap = drm_gem_shmem_object_mmap,
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h
index 8de3e76f2717..7fec20339354 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.h
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.h
@@ -130,6 +130,9 @@ struct drm_gem_object *
panfrost_gem_prime_import_sg_table(struct drm_device *dev,
struct dma_buf_attachment *attach,
struct sg_table *sgt);
+struct drm_gem_object *
+panfrost_gem_prime_import(struct drm_device *dev,
+ struct dma_buf *dma_buf);
struct panfrost_gem_object *
panfrost_gem_create(struct drm_device *dev, size_t size, u32 flags);
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 09/13] drm/panfrost: Expose the selected coherency protocol to the UMD
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
` (7 preceding siblings ...)
2025-12-03 9:01 ` [PATCH v7 08/13] drm/panfrost: Provide a custom dma_buf implementation Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 10/13] drm/panfrost: Add a PANFROST_SYNC_BO ioctl Boris Brezillon
` (3 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, Boris Brezillon, kernel
Will be needed if we want to skip CPU cache maintenance operations when
the GPU can snoop CPU caches.
v2:
- New commit
v3:
- Fix the coherency values (enum instead of bitmask)
v4:
- Fix init/test on coherency_features
v5:
- No changes
v6:
- Collect R-b
v7:
- No changes
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
drivers/gpu/drm/panfrost/panfrost_drv.c | 1 +
drivers/gpu/drm/panfrost/panfrost_gpu.c | 26 +++++++++++++++++++---
drivers/gpu/drm/panfrost/panfrost_regs.h | 10 +++++++--
include/uapi/drm/panfrost_drm.h | 7 ++++++
5 files changed, 40 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index e61c4329fd07..0f3992412205 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -79,6 +79,7 @@ struct panfrost_features {
u32 thread_max_workgroup_sz;
u32 thread_max_barrier_sz;
u32 coherency_features;
+ u32 selected_coherency;
u32 afbc_features;
u32 texture_features[4];
u32 js_features[16];
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index cd6001a7faa2..d650cd138dad 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -95,6 +95,7 @@ static int panfrost_ioctl_get_param(struct drm_device *ddev, void *data, struct
PANFROST_FEATURE_ARRAY(JS_FEATURES, js_features, 15);
PANFROST_FEATURE(NR_CORE_GROUPS, nr_core_groups);
PANFROST_FEATURE(THREAD_TLS_ALLOC, thread_tls_alloc);
+ PANFROST_FEATURE(SELECTED_COHERENCY, selected_coherency);
case DRM_PANFROST_PARAM_SYSTEM_TIMESTAMP:
ret = panfrost_ioctl_query_timestamp(pfdev, ¶m->value);
diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
index 483d278eb154..7d555e63e21a 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
@@ -159,8 +159,8 @@ static void panfrost_gpu_init_quirks(struct panfrost_device *pfdev)
pfdev->features.revision >= 0x2000)
quirks |= JM_MAX_JOB_THROTTLE_LIMIT << JM_JOB_THROTTLE_LIMIT_SHIFT;
else if (panfrost_model_eq(pfdev, 0x6000) &&
- pfdev->features.coherency_features == COHERENCY_ACE)
- quirks |= (COHERENCY_ACE_LITE | COHERENCY_ACE) <<
+ pfdev->features.coherency_features == BIT(COHERENCY_ACE))
+ quirks |= (BIT(COHERENCY_ACE_LITE) | BIT(COHERENCY_ACE)) <<
JM_FORCE_COHERENCY_FEATURES_SHIFT;
if (panfrost_has_hw_feature(pfdev, HW_FEATURE_IDVS_GROUP_SIZE))
@@ -263,7 +263,27 @@ static int panfrost_gpu_init_features(struct panfrost_device *pfdev)
pfdev->features.max_threads = gpu_read(pfdev, GPU_THREAD_MAX_THREADS);
pfdev->features.thread_max_workgroup_sz = gpu_read(pfdev, GPU_THREAD_MAX_WORKGROUP_SIZE);
pfdev->features.thread_max_barrier_sz = gpu_read(pfdev, GPU_THREAD_MAX_BARRIER_SIZE);
- pfdev->features.coherency_features = gpu_read(pfdev, GPU_COHERENCY_FEATURES);
+
+ if (panfrost_has_hw_feature(pfdev, HW_FEATURE_COHERENCY_REG))
+ pfdev->features.coherency_features = gpu_read(pfdev, GPU_COHERENCY_FEATURES);
+ else
+ pfdev->features.coherency_features = BIT(COHERENCY_ACE_LITE);
+
+ BUILD_BUG_ON(COHERENCY_ACE_LITE != DRM_PANFROST_GPU_COHERENCY_ACE_LITE);
+ BUILD_BUG_ON(COHERENCY_ACE != DRM_PANFROST_GPU_COHERENCY_ACE);
+ BUILD_BUG_ON(COHERENCY_NONE != DRM_PANFROST_GPU_COHERENCY_NONE);
+
+ if (!pfdev->coherent) {
+ pfdev->features.selected_coherency = COHERENCY_NONE;
+ } else if (pfdev->features.coherency_features & BIT(COHERENCY_ACE)) {
+ pfdev->features.selected_coherency = COHERENCY_ACE;
+ } else if (pfdev->features.coherency_features & BIT(COHERENCY_ACE_LITE)) {
+ pfdev->features.selected_coherency = COHERENCY_ACE_LITE;
+ } else {
+ drm_WARN(&pfdev->base, true, "No known coherency protocol supported");
+ pfdev->features.selected_coherency = COHERENCY_NONE;
+ }
+
pfdev->features.afbc_features = gpu_read(pfdev, GPU_AFBC_FEATURES);
for (i = 0; i < 4; i++)
pfdev->features.texture_features[i] = gpu_read(pfdev, GPU_TEXTURE_FEATURES(i));
diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
index 2b8f1617b836..ee15f6bf6e6f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_regs.h
+++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
@@ -102,9 +102,15 @@
#define GPU_L2_PRESENT_LO 0x120 /* (RO) Level 2 cache present bitmap, low word */
#define GPU_L2_PRESENT_HI 0x124 /* (RO) Level 2 cache present bitmap, high word */
+/* GPU_COHERENCY_FEATURES is a bitmask of BIT(COHERENCY_xxx) values encoding the
+ * set of supported coherency protocols. GPU_COHERENCY_ENABLE is passed a
+ * COHERENCY_xxx value.
+ */
#define GPU_COHERENCY_FEATURES 0x300 /* (RO) Coherency features present */
-#define COHERENCY_ACE_LITE BIT(0)
-#define COHERENCY_ACE BIT(1)
+#define GPU_COHERENCY_ENABLE 0x304 /* (RW) Coherency protocol selection */
+#define COHERENCY_ACE_LITE 0
+#define COHERENCY_ACE 1
+#define COHERENCY_NONE 31
#define GPU_STACK_PRESENT_LO 0xE00 /* (RO) Core stack present bitmap, low word */
#define GPU_STACK_PRESENT_HI 0xE04 /* (RO) Core stack present bitmap, high word */
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index 1956431bb391..0c59714ae42b 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -228,6 +228,13 @@ enum drm_panfrost_param {
DRM_PANFROST_PARAM_SYSTEM_TIMESTAMP,
DRM_PANFROST_PARAM_SYSTEM_TIMESTAMP_FREQUENCY,
DRM_PANFROST_PARAM_ALLOWED_JM_CTX_PRIORITIES,
+ DRM_PANFROST_PARAM_SELECTED_COHERENCY,
+};
+
+enum drm_panfrost_gpu_coherency {
+ DRM_PANFROST_GPU_COHERENCY_ACE_LITE = 0,
+ DRM_PANFROST_GPU_COHERENCY_ACE = 1,
+ DRM_PANFROST_GPU_COHERENCY_NONE = 31,
};
struct drm_panfrost_get_param {
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 10/13] drm/panfrost: Add a PANFROST_SYNC_BO ioctl
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
` (8 preceding siblings ...)
2025-12-03 9:01 ` [PATCH v7 09/13] drm/panfrost: Expose the selected coherency protocol to the UMD Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-04 16:09 ` Steven Price
2025-12-03 9:01 ` [PATCH v7 11/13] drm/panfrost: Add an ioctl to query BO flags Boris Brezillon
` (2 subsequent siblings)
12 siblings, 1 reply; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, kernel, Boris Brezillon
From: Faith Ekstrand <faith.ekstrand@collabora.com>
This will be used by the UMD to synchronize CPU-cached mappings when
the UMD can't do it directly (no usermode cache maintenance instruction
on Arm32).
v2:
- Add more to the commit message
- Change the flags to better match the drm_gem_shmem_sync semantics
v3:
- Add Steve's R-b
v4:
- No changes
v5:
- Drop Steve's R-b (semantics changes requiring a new review)
v6:
- Bail out early in panfrost_ioctl_sync_bo() if op_count is zero
v7:
- Hand-roll our own bo_sync() helper
Signed-off-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panfrost/panfrost_drv.c | 51 +++++++++++++++
drivers/gpu/drm/panfrost/panfrost_gem.c | 84 +++++++++++++++++++++++++
drivers/gpu/drm/panfrost/panfrost_gem.h | 2 +
include/uapi/drm/panfrost_drm.h | 45 +++++++++++++
4 files changed, 182 insertions(+)
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index d650cd138dad..77b0ae5ef000 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -580,6 +580,56 @@ static int panfrost_ioctl_jm_ctx_destroy(struct drm_device *dev, void *data,
return panfrost_jm_ctx_destroy(file, args->handle);
}
+static int panfrost_ioctl_sync_bo(struct drm_device *ddev, void *data,
+ struct drm_file *file)
+{
+ struct drm_panfrost_sync_bo *args = data;
+ struct drm_panfrost_bo_sync_op *ops;
+ struct drm_gem_object *obj;
+ int ret;
+ u32 i;
+
+ if (args->pad)
+ return -EINVAL;
+
+ if (!args->op_count)
+ return 0;
+
+ ops = kvmalloc_array(args->op_count, sizeof(*ops), GFP_KERNEL);
+ if (!ops) {
+ DRM_DEBUG("Failed to allocate incoming BO sync ops array\n");
+ return -ENOMEM;
+ }
+
+ if (copy_from_user(ops, (void __user *)(uintptr_t)args->ops,
+ args->op_count * sizeof(*ops))) {
+ DRM_DEBUG("Failed to copy in BO sync ops\n");
+ ret = -EFAULT;
+ goto err_ops;
+ }
+
+ for (i = 0; i < args->op_count; i++) {
+ obj = drm_gem_object_lookup(file, ops[i].handle);
+ if (!obj) {
+ ret = -ENOENT;
+ goto err_ops;
+ }
+
+ ret = panfrost_gem_sync(obj, ops[i].type,
+ ops[i].offset, ops[i].size);
+
+ drm_gem_object_put(obj);
+
+ if (ret)
+ goto err_ops;
+ }
+
+err_ops:
+ kvfree(ops);
+
+ return ret;
+}
+
int panfrost_unstable_ioctl_check(void)
{
if (!unstable_ioctls)
@@ -649,6 +699,7 @@ static const struct drm_ioctl_desc panfrost_drm_driver_ioctls[] = {
PANFROST_IOCTL(SET_LABEL_BO, set_label_bo, DRM_RENDER_ALLOW),
PANFROST_IOCTL(JM_CTX_CREATE, jm_ctx_create, DRM_RENDER_ALLOW),
PANFROST_IOCTL(JM_CTX_DESTROY, jm_ctx_destroy, DRM_RENDER_ALLOW),
+ PANFROST_IOCTL(SYNC_BO, sync_bo, DRM_RENDER_ALLOW),
};
static void panfrost_gpu_show_fdinfo(struct panfrost_device *pfdev,
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 4afd1a7f77d5..8231ae04f54c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -489,6 +489,90 @@ panfrost_gem_set_label(struct drm_gem_object *obj, const char *label)
kfree_const(old_label);
}
+int
+panfrost_gem_sync(struct drm_gem_object *obj, u32 type, u32 offset, u32 size)
+{
+ struct panfrost_gem_object *bo = to_panfrost_bo(obj);
+ struct drm_gem_shmem_object *shmem = &bo->base;
+ const struct drm_device *dev = shmem->base.dev;
+ struct sg_table *sgt;
+ struct scatterlist *sgl;
+ unsigned int count;
+
+ /* Make sure the range is in bounds. */
+ if (offset + size < offset || offset + size > shmem->base.size)
+ return -EINVAL;
+
+ /* Disallow CPU-cache maintenance on imported buffers. */
+ if (drm_gem_is_imported(&shmem->base))
+ return -EINVAL;
+
+ switch (type) {
+ case PANFROST_BO_SYNC_CPU_CACHE_FLUSH:
+ case PANFROST_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE:
+ break;
+
+ default:
+ return -EINVAL;
+ }
+
+ /* Don't bother if it's WC-mapped */
+ if (shmem->map_wc)
+ return 0;
+
+ /* Nothing to do if the size is zero. */
+ if (size == 0)
+ return 0;
+
+ sgt = drm_gem_shmem_get_pages_sgt(shmem);
+ if (IS_ERR(sgt))
+ return PTR_ERR(sgt);
+
+ for_each_sgtable_dma_sg(sgt, sgl, count) {
+ if (size == 0)
+ break;
+
+ dma_addr_t paddr = sg_dma_address(sgl);
+ size_t len = sg_dma_len(sgl);
+
+ if (len <= offset) {
+ offset -= len;
+ continue;
+ }
+
+ paddr += offset;
+ len -= offset;
+ len = min_t(size_t, len, size);
+ size -= len;
+ offset = 0;
+
+ /* It's unclear whether dma_sync_xxx() is the right API to do CPU
+ * cache maintenance given an IOMMU can register their own
+ * implementation doing more than just CPU cache flushes/invalidation,
+ * and what we really care about here is CPU caches only, but that's
+ * the best we have that is both arch-agnostic and does at least the
+ * CPU cache maintenance on a <page,offset,size> tuple.
+ *
+ * Also, I wish we could do a single
+ *
+ * dma_sync_single_for_device(BIDIR)
+ *
+ * and get a flush+invalidate, but that's not how it's implemented
+ * in practice (at least on arm64), so we have to make it
+ *
+ * dma_sync_single_for_device(TO_DEVICE)
+ * dma_sync_single_for_cpu(FROM_DEVICE)
+ *
+ * for the flush+invalidate case.
+ */
+ dma_sync_single_for_device(dev->dev, paddr, len, DMA_TO_DEVICE);
+ if (type == PANFROST_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE)
+ dma_sync_single_for_cpu(dev->dev, paddr, len, DMA_FROM_DEVICE);
+ }
+
+ return 0;
+}
+
void
panfrost_gem_internal_set_label(struct drm_gem_object *obj, const char *label)
{
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h
index 7fec20339354..d61ffe1f6841 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.h
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.h
@@ -151,6 +151,8 @@ int panfrost_gem_shrinker_init(struct drm_device *dev);
void panfrost_gem_shrinker_cleanup(struct drm_device *dev);
void panfrost_gem_set_label(struct drm_gem_object *obj, const char *label);
+int panfrost_gem_sync(struct drm_gem_object *obj, u32 type,
+ u32 offset, u32 size);
void panfrost_gem_internal_set_label(struct drm_gem_object *obj, const char *label);
#ifdef CONFIG_DEBUG_FS
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index 0c59714ae42b..e194e087a0c8 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -24,6 +24,7 @@ extern "C" {
#define DRM_PANFROST_SET_LABEL_BO 0x09
#define DRM_PANFROST_JM_CTX_CREATE 0x0a
#define DRM_PANFROST_JM_CTX_DESTROY 0x0b
+#define DRM_PANFROST_SYNC_BO 0x0c
#define DRM_IOCTL_PANFROST_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_SUBMIT, struct drm_panfrost_submit)
#define DRM_IOCTL_PANFROST_WAIT_BO DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_WAIT_BO, struct drm_panfrost_wait_bo)
@@ -35,6 +36,7 @@ extern "C" {
#define DRM_IOCTL_PANFROST_SET_LABEL_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_SET_LABEL_BO, struct drm_panfrost_set_label_bo)
#define DRM_IOCTL_PANFROST_JM_CTX_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_JM_CTX_CREATE, struct drm_panfrost_jm_ctx_create)
#define DRM_IOCTL_PANFROST_JM_CTX_DESTROY DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_JM_CTX_DESTROY, struct drm_panfrost_jm_ctx_destroy)
+#define DRM_IOCTL_PANFROST_SYNC_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_SYNC_BO, struct drm_panfrost_sync_bo)
/*
* Unstable ioctl(s): only exposed when the unsafe unstable_ioctls module
@@ -308,6 +310,49 @@ struct drm_panfrost_set_label_bo {
__u64 label;
};
+/* Valid flags to pass to drm_panfrost_bo_sync_op */
+#define PANFROST_BO_SYNC_CPU_CACHE_FLUSH 0
+#define PANFROST_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE 1
+
+/**
+ * struct drm_panthor_bo_flush_map_op - BO map sync op
+ */
+struct drm_panfrost_bo_sync_op {
+ /** @handle: Handle of the buffer object to sync. */
+ __u32 handle;
+
+ /** @type: Type of sync operation. */
+ __u32 type;
+
+ /**
+ * @offset: Offset into the BO at which the sync range starts.
+ *
+ * This will be rounded down to the nearest cache line as needed.
+ */
+ __u32 offset;
+
+ /**
+ * @size: Size of the range to sync
+ *
+ * @size + @offset will be rounded up to the nearest cache line as
+ * needed.
+ */
+ __u32 size;
+};
+
+/**
+ * struct drm_panfrost_sync_bo - ioctl argument for syncing BO maps
+ */
+struct drm_panfrost_sync_bo {
+ /** Array of struct drm_panfrost_bo_sync_op */
+ __u64 ops;
+
+ /** Number of BO sync ops */
+ __u32 op_count;
+
+ __u32 pad;
+};
+
/* Definitions for coredump decoding in user space */
#define PANFROSTDUMP_MAJOR 1
#define PANFROSTDUMP_MINOR 0
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 11/13] drm/panfrost: Add an ioctl to query BO flags
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
` (9 preceding siblings ...)
2025-12-03 9:01 ` [PATCH v7 10/13] drm/panfrost: Add a PANFROST_SYNC_BO ioctl Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 12/13] drm/panfrost: Add flag to map GEM object Write-Back Cacheable Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 13/13] drm/panfrost: Bump the driver version to 1.6 Boris Brezillon
12 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, Boris Brezillon, kernel
This is useful when importing BOs, so we can know about cacheability
and flush the caches when needed.
v2:
- New commit
v3:
- Add Steve's R-b
v4:
- No changes
v5:
- No changes
v6:
- No changes
v7:
- No changes
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
drivers/gpu/drm/panfrost/panfrost_drv.c | 33 +++++++++++++++++++++++++
include/uapi/drm/panfrost_drm.h | 19 ++++++++++++++
2 files changed, 52 insertions(+)
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 77b0ae5ef000..b191795d88cd 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -630,6 +630,38 @@ static int panfrost_ioctl_sync_bo(struct drm_device *ddev, void *data,
return ret;
}
+static int panfrost_ioctl_query_bo_info(struct drm_device *dev, void *data,
+ struct drm_file *file_priv)
+{
+ struct drm_panfrost_query_bo_info *args = data;
+ struct drm_gem_object *gem_obj;
+ struct panfrost_gem_object *bo;
+
+ gem_obj = drm_gem_object_lookup(file_priv, args->handle);
+ if (!gem_obj) {
+ DRM_DEBUG("Failed to look up GEM BO %d\n", args->handle);
+ return -ENOENT;
+ }
+
+ bo = to_panfrost_bo(gem_obj);
+ args->pad = 0;
+ args->create_flags = 0;
+ args->extra_flags = 0;
+
+ if (drm_gem_is_imported(gem_obj)) {
+ args->extra_flags |= DRM_PANFROST_BO_IS_IMPORTED;
+ } else {
+ if (bo->noexec)
+ args->create_flags |= PANFROST_BO_NOEXEC;
+
+ if (bo->is_heap)
+ args->create_flags |= PANFROST_BO_HEAP;
+ }
+
+ drm_gem_object_put(gem_obj);
+ return 0;
+}
+
int panfrost_unstable_ioctl_check(void)
{
if (!unstable_ioctls)
@@ -700,6 +732,7 @@ static const struct drm_ioctl_desc panfrost_drm_driver_ioctls[] = {
PANFROST_IOCTL(JM_CTX_CREATE, jm_ctx_create, DRM_RENDER_ALLOW),
PANFROST_IOCTL(JM_CTX_DESTROY, jm_ctx_destroy, DRM_RENDER_ALLOW),
PANFROST_IOCTL(SYNC_BO, sync_bo, DRM_RENDER_ALLOW),
+ PANFROST_IOCTL(QUERY_BO_INFO, query_bo_info, DRM_RENDER_ALLOW),
};
static void panfrost_gpu_show_fdinfo(struct panfrost_device *pfdev,
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index e194e087a0c8..36ae48ea50d3 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -25,6 +25,7 @@ extern "C" {
#define DRM_PANFROST_JM_CTX_CREATE 0x0a
#define DRM_PANFROST_JM_CTX_DESTROY 0x0b
#define DRM_PANFROST_SYNC_BO 0x0c
+#define DRM_PANFROST_QUERY_BO_INFO 0x0d
#define DRM_IOCTL_PANFROST_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_SUBMIT, struct drm_panfrost_submit)
#define DRM_IOCTL_PANFROST_WAIT_BO DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_WAIT_BO, struct drm_panfrost_wait_bo)
@@ -37,6 +38,7 @@ extern "C" {
#define DRM_IOCTL_PANFROST_JM_CTX_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_JM_CTX_CREATE, struct drm_panfrost_jm_ctx_create)
#define DRM_IOCTL_PANFROST_JM_CTX_DESTROY DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_JM_CTX_DESTROY, struct drm_panfrost_jm_ctx_destroy)
#define DRM_IOCTL_PANFROST_SYNC_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_SYNC_BO, struct drm_panfrost_sync_bo)
+#define DRM_IOCTL_PANFROST_QUERY_BO_INFO DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_QUERY_BO_INFO, struct drm_panfrost_query_bo_info)
/*
* Unstable ioctl(s): only exposed when the unsafe unstable_ioctls module
@@ -353,6 +355,23 @@ struct drm_panfrost_sync_bo {
__u32 pad;
};
+/** BO comes from a different subsystem. */
+#define DRM_PANFROST_BO_IS_IMPORTED (1 << 0)
+
+struct drm_panfrost_query_bo_info {
+ /** Handle of the object being queried. */
+ __u32 handle;
+
+ /** Extra flags that are not coming from the BO_CREATE ioctl(). */
+ __u32 extra_flags;
+
+ /** Flags passed at creation time. */
+ __u32 create_flags;
+
+ /** Will be zero on return. */
+ __u32 pad;
+};
+
/* Definitions for coredump decoding in user space */
#define PANFROSTDUMP_MAJOR 1
#define PANFROSTDUMP_MINOR 0
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 12/13] drm/panfrost: Add flag to map GEM object Write-Back Cacheable
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
` (10 preceding siblings ...)
2025-12-03 9:01 ` [PATCH v7 11/13] drm/panfrost: Add an ioctl to query BO flags Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
2025-12-05 13:07 ` kernel test robot
2025-12-03 9:01 ` [PATCH v7 13/13] drm/panfrost: Bump the driver version to 1.6 Boris Brezillon
12 siblings, 1 reply; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, kernel, Boris Brezillon
From: Faith Ekstrand <faith.ekstrand@collabora.com>
Will be used by the UMD to optimize CPU accesses to buffers
that are frequently read by the CPU, or on which the access
pattern makes non-cacheable mappings inefficient.
Mapping buffers CPU-cached implies taking care of the CPU
cache maintenance in the UMD, unless the GPU is IO coherent.
v2:
- Add more to the commit message
v3:
- No changes
v4:
- Fix the map_wc test in panfrost_ioctl_query_bo_info()
v5:
- Drop Steve's R-b (enough has changed to justify a new review)
v6:
- Collect R-b
v7:
- No changes
Signed-off-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
drivers/gpu/drm/panfrost/panfrost_drv.c | 10 ++++++--
drivers/gpu/drm/panfrost/panfrost_gem.c | 33 +++++++++++++++++++++++++
drivers/gpu/drm/panfrost/panfrost_gem.h | 5 ++++
include/uapi/drm/panfrost_drm.h | 5 +++-
4 files changed, 50 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index b191795d88cd..43a453ee9614 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -126,6 +126,10 @@ static int panfrost_ioctl_get_param(struct drm_device *ddev, void *data, struct
return 0;
}
+#define PANFROST_BO_FLAGS (PANFROST_BO_NOEXEC | \
+ PANFROST_BO_HEAP | \
+ PANFROST_BO_WB_MMAP)
+
static int panfrost_ioctl_create_bo(struct drm_device *dev, void *data,
struct drm_file *file)
{
@@ -135,8 +139,7 @@ static int panfrost_ioctl_create_bo(struct drm_device *dev, void *data,
struct panfrost_gem_mapping *mapping;
int ret;
- if (!args->size || args->pad ||
- (args->flags & ~(PANFROST_BO_NOEXEC | PANFROST_BO_HEAP)))
+ if (!args->size || args->pad || (args->flags & ~PANFROST_BO_FLAGS))
return -EINVAL;
/* Heaps should never be executable */
@@ -656,6 +659,9 @@ static int panfrost_ioctl_query_bo_info(struct drm_device *dev, void *data,
if (bo->is_heap)
args->create_flags |= PANFROST_BO_HEAP;
+
+ if (!bo->base.map_wc)
+ args->create_flags |= PANFROST_BO_WB_MMAP;
}
drm_gem_object_put(gem_obj);
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 8231ae04f54c..6c468c328772 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -393,6 +393,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs = {
.vmap = drm_gem_shmem_object_vmap,
.vunmap = drm_gem_shmem_object_vunmap,
.mmap = drm_gem_shmem_object_mmap,
+ .export = drm_gem_prime_export,
.status = panfrost_gem_status,
.rss = panfrost_gem_rss,
.vm_ops = &drm_gem_shmem_vm_ops,
@@ -426,12 +427,42 @@ struct drm_gem_object *panfrost_gem_create_object(struct drm_device *dev, size_t
return &obj->base.base;
}
+static bool
+should_map_wc(struct panfrost_gem_object *bo)
+{
+ struct panfrost_device *pfdev = to_panfrost_device(bo->base.base.dev);
+
+ /* We can't do uncached mappings if the device is coherent,
+ * because the zeroing done by the shmem layer at page allocation
+ * time happens on a cached mapping which isn't CPU-flushed (at least
+ * not on Arm64 where the flush is deferred to PTE setup time, and
+ * only done conditionally based on the mapping permissions). We can't
+ * rely on dma_map_sgtable()/dma_sync_sgtable_for_xxx() either to flush
+ * those, because they are NOPed if dma_dev_coherent() returns true.
+ */
+ if (pfdev->coherent)
+ return false;
+
+ /* Cached mappings are explicitly requested, so no write-combine. */
+ if (bo->wb_mmap)
+ return false;
+
+ /* The default is write-combine. */
+ return true;
+}
+
struct panfrost_gem_object *
panfrost_gem_create(struct drm_device *dev, size_t size, u32 flags)
{
struct drm_gem_shmem_object *shmem;
struct panfrost_gem_object *bo;
+ /* The heap buffer is not supposed to be CPU-visible, so don't allow
+ * WB_MMAP on those.
+ */
+ if ((flags & PANFROST_BO_HEAP) && (flags & PANFROST_BO_WB_MMAP))
+ return ERR_PTR(-EINVAL);
+
/* Round up heap allocations to 2MB to keep fault handling simple */
if (flags & PANFROST_BO_HEAP)
size = roundup(size, SZ_2M);
@@ -443,6 +474,8 @@ panfrost_gem_create(struct drm_device *dev, size_t size, u32 flags)
bo = to_panfrost_bo(&shmem->base);
bo->noexec = !!(flags & PANFROST_BO_NOEXEC);
bo->is_heap = !!(flags & PANFROST_BO_HEAP);
+ bo->wb_mmap = !!(flags & PANFROST_BO_WB_MMAP);
+ bo->base.map_wc = should_map_wc(bo);
return bo;
}
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h
index d61ffe1f6841..3095c570ee5f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.h
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.h
@@ -98,6 +98,11 @@ struct panfrost_gem_object {
bool noexec :1;
bool is_heap :1;
+ /* On coherent devices, this reflects the creation flags, not the true
+ * cacheability attribute of the mapping.
+ */
+ bool wb_mmap :1;
+
#ifdef CONFIG_DEBUG_FS
struct panfrost_gem_debugfs debugfs;
#endif
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index 36ae48ea50d3..50d5337f35ef 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -124,9 +124,12 @@ struct drm_panfrost_wait_bo {
__s64 timeout_ns;
};
-/* Valid flags to pass to drm_panfrost_create_bo */
+/* Valid flags to pass to drm_panfrost_create_bo.
+ * PANFROST_BO_WB_MMAP can't be set if PANFROST_BO_HEAP is.
+ */
#define PANFROST_BO_NOEXEC 1
#define PANFROST_BO_HEAP 2
+#define PANFROST_BO_WB_MMAP 4
/**
* struct drm_panfrost_create_bo - ioctl argument for creating Panfrost BOs.
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v7 13/13] drm/panfrost: Bump the driver version to 1.6
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
` (11 preceding siblings ...)
2025-12-03 9:01 ` [PATCH v7 12/13] drm/panfrost: Add flag to map GEM object Write-Back Cacheable Boris Brezillon
@ 2025-12-03 9:01 ` Boris Brezillon
12 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2025-12-03 9:01 UTC (permalink / raw)
To: Steven Price
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, kernel, Boris Brezillon
From: Faith Ekstrand <faith.ekstrand@collabora.com>
Bump the driver version to reflect the new cached-CPU mapping
capability.
v2:
- Quickly describe what the new version exposes in the commit message
v3:
- Add Steve's R-b
v4:
- No changes
v5:
- No changes
v6:
- No changes
v7:
- No changes
Signed-off-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
drivers/gpu/drm/panfrost/panfrost_drv.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 43a453ee9614..712497729732 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -928,6 +928,9 @@ static void panfrost_debugfs_init(struct drm_minor *minor)
* - 1.4 - adds SET_LABEL_BO
* - 1.5 - adds JM_CTX_{CREATE,DESTROY} ioctls and extend SUBMIT to allow
* context creation with configurable priorities/affinity
+ * - 1.6 - adds PANFROST_BO_MAP_WB, PANFROST_IOCTL_SYNC_BO,
+ * PANFROST_IOCTL_QUERY_BO_INFO and
+ * DRM_PANFROST_PARAM_SELECTED_COHERENCY
*/
static const struct drm_driver panfrost_drm_driver = {
.driver_features = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ,
@@ -940,7 +943,7 @@ static const struct drm_driver panfrost_drm_driver = {
.name = "panfrost",
.desc = "panfrost DRM",
.major = 1,
- .minor = 5,
+ .minor = 6,
.gem_create_object = panfrost_gem_create_object,
.gem_prime_import = panfrost_gem_prime_import,
--
2.51.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v7 04/13] drm/panthor: Add a PANTHOR_BO_SYNC ioctl
2025-12-03 9:01 ` [PATCH v7 04/13] drm/panthor: Add a PANTHOR_BO_SYNC ioctl Boris Brezillon
@ 2025-12-04 15:42 ` Steven Price
0 siblings, 0 replies; 18+ messages in thread
From: Steven Price @ 2025-12-04 15:42 UTC (permalink / raw)
To: Boris Brezillon
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, kernel
On 03/12/2025 09:01, Boris Brezillon wrote:
> This will be used by the UMD to synchronize CPU-cached mappings when
> the UMD can't do it directly (no usermode cache maintenance instruction
> on Arm32).
>
> v2:
> - Change the flags so they better match the drm_gem_shmem_sync()
> semantics
>
> v3:
> - Add Steve's R-b
>
> v4:
> - No changes
>
> v5:
> - Drop Steve's R-b (the semantics changes call for a new review)
>
> v6:
> - Drop ret initialization in panthor_ioctl_bo_sync()
> - Bail out early in panthor_ioctl_bo_sync() if ops.count is zero
> - Drop unused PANTHOR_BO_SYNC_OP_FLAGS definition
>
> v7:
> - Hand-roll the sync logic (was previously provided by gem_shmem)
>
> Signed-off-by: Faith Ekstrand <faith.ekstrand@collabora.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
> ---
> drivers/gpu/drm/panthor/panthor_drv.c | 41 ++++++++++++-
> drivers/gpu/drm/panthor/panthor_gem.c | 85 +++++++++++++++++++++++++++
> drivers/gpu/drm/panthor/panthor_gem.h | 2 +
> include/uapi/drm/panthor_drm.h | 52 ++++++++++++++++
> 4 files changed, 179 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
> index d12ac4cb0ac4..cab19621917f 100644
> --- a/drivers/gpu/drm/panthor/panthor_drv.c
> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> @@ -177,7 +177,8 @@ panthor_get_uobj_array(const struct drm_panthor_obj_array *in, u32 min_stride,
> PANTHOR_UOBJ_DECL(struct drm_panthor_sync_op, timeline_value), \
> PANTHOR_UOBJ_DECL(struct drm_panthor_queue_submit, syncs), \
> PANTHOR_UOBJ_DECL(struct drm_panthor_queue_create, ringbuf_size), \
> - PANTHOR_UOBJ_DECL(struct drm_panthor_vm_bind_op, syncs))
> + PANTHOR_UOBJ_DECL(struct drm_panthor_vm_bind_op, syncs), \
> + PANTHOR_UOBJ_DECL(struct drm_panthor_bo_sync_op, size))
>
> /**
> * PANTHOR_UOBJ_SET() - Copy a kernel object to a user object.
> @@ -1396,6 +1397,43 @@ static int panthor_ioctl_set_user_mmio_offset(struct drm_device *ddev,
> return 0;
> }
>
> +static int panthor_ioctl_bo_sync(struct drm_device *ddev, void *data,
> + struct drm_file *file)
> +{
> + struct drm_panthor_bo_sync *args = data;
> + struct drm_panthor_bo_sync_op *ops;
> + struct drm_gem_object *obj;
> + int ret;
> +
> + if (!args->ops.count)
> + return 0;
> +
> + ret = PANTHOR_UOBJ_GET_ARRAY(ops, &args->ops);
> + if (ret)
> + return ret;
> +
> + for (u32 i = 0; i < args->ops.count; i++) {
> + obj = drm_gem_object_lookup(file, ops[i].handle);
> + if (!obj) {
> + ret = -ENOENT;
> + goto err_ops;
> + }
> +
> + ret = panthor_gem_sync(obj, ops[i].type, ops[i].offset,
> + ops[i].size);
> +
> + drm_gem_object_put(obj);
> +
> + if (ret)
> + goto err_ops;
> + }
> +
> +err_ops:
> + kvfree(ops);
> +
> + return ret;
> +}
> +
> static int
> panthor_open(struct drm_device *ddev, struct drm_file *file)
> {
> @@ -1470,6 +1508,7 @@ static const struct drm_ioctl_desc panthor_drm_driver_ioctls[] = {
> PANTHOR_IOCTL(GROUP_SUBMIT, group_submit, DRM_RENDER_ALLOW),
> PANTHOR_IOCTL(BO_SET_LABEL, bo_set_label, DRM_RENDER_ALLOW),
> PANTHOR_IOCTL(SET_USER_MMIO_OFFSET, set_user_mmio_offset, DRM_RENDER_ALLOW),
> + PANTHOR_IOCTL(BO_SYNC, bo_sync, DRM_RENDER_ALLOW),
> };
>
> static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
> diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
> index 173d42d65000..4be32fc1732b 100644
> --- a/drivers/gpu/drm/panthor/panthor_gem.c
> +++ b/drivers/gpu/drm/panthor/panthor_gem.c
> @@ -447,6 +447,91 @@ panthor_gem_kernel_bo_set_label(struct panthor_kernel_bo *bo, const char *label)
> panthor_gem_bo_set_label(bo->obj, str);
> }
>
> +int
> +panthor_gem_sync(struct drm_gem_object *obj, u32 type,
> + u64 offset, u64 size)
> +{
> + struct panthor_gem_object *bo = to_panthor_bo(obj);
> + struct drm_gem_shmem_object *shmem = &bo->base;
> + const struct drm_device *dev = shmem->base.dev;
> + struct sg_table *sgt;
> + struct scatterlist *sgl;
> + unsigned int count;
> +
> + /* Make sure the range is in bounds. */
> + if (offset + size < offset || offset + size > shmem->base.size)
> + return -EINVAL;
> +
> + /* Disallow CPU-cache maintenance on imported buffers. */
> + if (drm_gem_is_imported(&shmem->base))
> + return -EINVAL;
> +
> + switch (type) {
> + case DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH:
> + case DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE:
> + break;
> +
> + default:
> + return -EINVAL;
> + }
> +
> + /* Don't bother if it's WC-mapped */
> + if (shmem->map_wc)
> + return 0;
> +
> + /* Nothing to do if the size is zero. */
> + if (size == 0)
> + return 0;
> +
> + sgt = drm_gem_shmem_get_pages_sgt(shmem);
> + if (IS_ERR(sgt))
> + return PTR_ERR(sgt);
> +
> + for_each_sgtable_dma_sg(sgt, sgl, count) {
> + if (size == 0)
> + break;
> +
> + dma_addr_t paddr = sg_dma_address(sgl);
> + size_t len = sg_dma_len(sgl);
> +
> + if (len <= offset) {
> + offset -= len;
> + continue;
> + }
> +
> + paddr += offset;
> + len -= offset;
> + len = min_t(size_t, len, size);
> + size -= len;
> + offset = 0;
> +
> + /* It's unclear whether dma_sync_xxx() is the right API to do CPU
> + * cache maintenance given an IOMMU can register their own
> + * implementation doing more than just CPU cache flushes/invalidation,
> + * and what we really care about here is CPU caches only, but that's
> + * the best we have that is both arch-agnostic and does at least the
> + * CPU cache maintenance on a <page,offset,size> tuple.
> + *
> + * Also, I wish we could do a single
> + *
> + * dma_sync_single_for_device(BIDIR)
> + *
> + * and get a flush+invalidate, but that's not how it's implemented
> + * in practice (at least on arm64), so we have to make it
> + *
> + * dma_sync_single_for_device(TO_DEVICE)
> + * dma_sync_single_for_cpu(FROM_DEVICE)
> + *
> + * for the flush+invalidate case.
> + */
> + dma_sync_single_for_device(dev->dev, paddr, len, DMA_TO_DEVICE);
> + if (type == DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE)
> + dma_sync_single_for_cpu(dev->dev, paddr, len, DMA_FROM_DEVICE);
> + }
> +
> + return 0;
> +}
> +
> #ifdef CONFIG_DEBUG_FS
> struct gem_size_totals {
> size_t size;
> diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h
> index 91d1880f8a5d..bbf9ae75c360 100644
> --- a/drivers/gpu/drm/panthor/panthor_gem.h
> +++ b/drivers/gpu/drm/panthor/panthor_gem.h
> @@ -146,6 +146,8 @@ panthor_gem_create_with_handle(struct drm_file *file,
>
> void panthor_gem_bo_set_label(struct drm_gem_object *obj, const char *label);
> void panthor_gem_kernel_bo_set_label(struct panthor_kernel_bo *bo, const char *label);
> +int panthor_gem_sync(struct drm_gem_object *obj,
> + u32 type, u64 offset, u64 size);
>
> struct drm_gem_object *
> panthor_gem_prime_import(struct drm_device *dev,
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index 28cf9e878db6..9f810305db6e 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -144,6 +144,9 @@ enum drm_panthor_ioctl_id {
> * pgoff_t size.
> */
> DRM_PANTHOR_SET_USER_MMIO_OFFSET,
> +
> + /** @DRM_PANTHOR_BO_SYNC: Sync BO data to/from the device */
> + DRM_PANTHOR_BO_SYNC,
> };
>
> /**
> @@ -1073,6 +1076,53 @@ struct drm_panthor_set_user_mmio_offset {
> __u64 offset;
> };
>
> +/**
> + * enum drm_panthor_bo_sync_op_type - BO sync type
> + */
> +enum drm_panthor_bo_sync_op_type {
> + /** @DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH: Flush CPU caches. */
> + DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH = 0,
> +
> + /** @DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE: Flush and invalidate CPU caches. */
> + DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE = 1,
> +};
> +
> +/**
> + * struct drm_panthor_bo_sync_op - BO map sync op
> + */
> +struct drm_panthor_bo_sync_op {
> + /** @handle: Handle of the buffer object to sync. */
> + __u32 handle;
> +
> + /** @type: Type of operation. */
> + __u32 type;
> +
> + /**
> + * @offset: Offset into the BO at which the sync range starts.
> + *
> + * This will be rounded down to the nearest cache line as needed.
> + */
> + __u64 offset;
> +
> + /**
> + * @size: Size of the range to sync
> + *
> + * @size + @offset will be rounded up to the nearest cache line as
> + * needed.
> + */
> + __u64 size;
> +};
> +
> +/**
> + * struct drm_panthor_bo_sync - BO map sync request
> + */
> +struct drm_panthor_bo_sync {
> + /**
> + * @ops: Array of struct drm_panthor_bo_sync_op sync operations.
> + */
> + struct drm_panthor_obj_array ops;
> +};
> +
> /**
> * DRM_IOCTL_PANTHOR() - Build a Panthor IOCTL number
> * @__access: Access type. Must be R, W or RW.
> @@ -1119,6 +1169,8 @@ enum {
> DRM_IOCTL_PANTHOR(WR, BO_SET_LABEL, bo_set_label),
> DRM_IOCTL_PANTHOR_SET_USER_MMIO_OFFSET =
> DRM_IOCTL_PANTHOR(WR, SET_USER_MMIO_OFFSET, set_user_mmio_offset),
> + DRM_IOCTL_PANTHOR_BO_SYNC =
> + DRM_IOCTL_PANTHOR(WR, BO_SYNC, bo_sync),
> };
>
> #if defined(__cplusplus)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v7 06/13] drm/panthor: Add flag to map GEM object Write-Back Cacheable
2025-12-03 9:01 ` [PATCH v7 06/13] drm/panthor: Add flag to map GEM object Write-Back Cacheable Boris Brezillon
@ 2025-12-04 15:44 ` Steven Price
0 siblings, 0 replies; 18+ messages in thread
From: Steven Price @ 2025-12-04 15:44 UTC (permalink / raw)
To: Boris Brezillon
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, Loïc Molinari, kernel
On 03/12/2025 09:01, Boris Brezillon wrote:
> From: Loïc Molinari <loic.molinari@collabora.com>
>
> Will be used by the UMD to optimize CPU accesses to buffers
> that are frequently read by the CPU, or on which the access
> pattern makes non-cacheable mappings inefficient.
>
> Mapping buffers CPU-cached implies taking care of the CPU
> cache maintenance in the UMD, unless the GPU is IO coherent.
>
> v2:
> - Add more to the commit message
> - Tweak the doc
> - Make sure we sync the section of the BO pointing to the CS
> syncobj before we read its seqno
>
> v3:
> - Fix formatting/spelling issues
>
> v4:
> - Add Steve's R-b
>
> v5:
> - Drop Steve's R-b (changes in the ioctl semantics requiring
> new review)
>
> v6:
> - Fix the uAPI doc
> - Fix inverted logic in some comment
>
> v7:
> - No changes
>
> Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Readding my r-b that I gave on v6:
Reviewed-by: Steven Price <steven.price@arm.com>
Thanks,
Steve
> ---
> drivers/gpu/drm/panthor/panthor_drv.c | 7 ++++-
> drivers/gpu/drm/panthor/panthor_gem.c | 37 +++++++++++++++++++++++--
> drivers/gpu/drm/panthor/panthor_sched.c | 18 ++++++++++--
> include/uapi/drm/panthor_drm.h | 9 ++++++
> 4 files changed, 66 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
> index db04f13217b9..1bb6a20c497a 100644
> --- a/drivers/gpu/drm/panthor/panthor_drv.c
> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> @@ -902,7 +902,8 @@ static int panthor_ioctl_vm_destroy(struct drm_device *ddev, void *data,
> return panthor_vm_pool_destroy_vm(pfile->vms, args->id);
> }
>
> -#define PANTHOR_BO_FLAGS DRM_PANTHOR_BO_NO_MMAP
> +#define PANTHOR_BO_FLAGS (DRM_PANTHOR_BO_NO_MMAP | \
> + DRM_PANTHOR_BO_WB_MMAP)
>
> static int panthor_ioctl_bo_create(struct drm_device *ddev, void *data,
> struct drm_file *file)
> @@ -921,6 +922,10 @@ static int panthor_ioctl_bo_create(struct drm_device *ddev, void *data,
> goto out_dev_exit;
> }
>
> + if ((args->flags & DRM_PANTHOR_BO_NO_MMAP) &&
> + (args->flags & DRM_PANTHOR_BO_WB_MMAP))
> + return -EINVAL;
> +
> if (args->exclusive_vm_id) {
> vm = panthor_vm_pool_get_vm(pfile->vms, args->exclusive_vm_id);
> if (!vm) {
> diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
> index 4be32fc1732b..7a9eb6010f6f 100644
> --- a/drivers/gpu/drm/panthor/panthor_gem.c
> +++ b/drivers/gpu/drm/panthor/panthor_gem.c
> @@ -59,6 +59,39 @@ static void panthor_gem_debugfs_set_usage_flags(struct panthor_gem_object *bo, u
> static void panthor_gem_debugfs_bo_init(struct panthor_gem_object *bo) {}
> #endif
>
> +static bool
> +should_map_wc(struct panthor_gem_object *bo, struct panthor_vm *exclusive_vm)
> +{
> + struct panthor_device *ptdev = container_of(bo->base.base.dev, struct panthor_device, base);
> +
> + /* We can't do uncached mappings if the device is coherent,
> + * because the zeroing done by the shmem layer at page allocation
> + * time happens on a cached mapping which isn't CPU-flushed (at least
> + * not on Arm64 where the flush is deferred to PTE setup time, and
> + * only done conditionally based on the mapping permissions). We can't
> + * rely on dma_map_sgtable()/dma_sync_sgtable_for_xxx() either to flush
> + * those, because they are NOPed if dma_dev_coherent() returns true.
> + *
> + * FIXME: Note that this problem is going to pop up again when we
> + * decide to support mapping buffers with the NO_MMAP flag as
> + * non-shareable (AKA buffers accessed only by the GPU), because we
> + * need the same CPU flush to happen after page allocation, otherwise
> + * there's a risk of data leak or late corruption caused by a dirty
> + * cacheline being evicted. At this point we'll need a way to force
> + * CPU cache maintenance regardless of whether the device is coherent
> + * or not.
> + */
> + if (ptdev->coherent)
> + return false;
> +
> + /* Cached mappings are explicitly requested, so no write-combine. */
> + if (bo->flags & DRM_PANTHOR_BO_WB_MMAP)
> + return false;
> +
> + /* The default is write-combine. */
> + return true;
> +}
> +
> static void panthor_gem_free_object(struct drm_gem_object *obj)
> {
> struct panthor_gem_object *bo = to_panthor_bo(obj);
> @@ -145,6 +178,7 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
> bo = to_panthor_bo(&obj->base);
> kbo->obj = &obj->base;
> bo->flags = bo_flags;
> + bo->base.map_wc = should_map_wc(bo, vm);
> bo->exclusive_vm_root_gem = panthor_vm_root_gem(vm);
> drm_gem_object_get(bo->exclusive_vm_root_gem);
> bo->base.base.resv = bo->exclusive_vm_root_gem->resv;
> @@ -345,7 +379,6 @@ static const struct drm_gem_object_funcs panthor_gem_funcs = {
> */
> struct drm_gem_object *panthor_gem_create_object(struct drm_device *ddev, size_t size)
> {
> - struct panthor_device *ptdev = container_of(ddev, struct panthor_device, base);
> struct panthor_gem_object *obj;
>
> obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> @@ -353,7 +386,6 @@ struct drm_gem_object *panthor_gem_create_object(struct drm_device *ddev, size_t
> return ERR_PTR(-ENOMEM);
>
> obj->base.base.funcs = &panthor_gem_funcs;
> - obj->base.map_wc = !ptdev->coherent;
> mutex_init(&obj->label.lock);
>
> panthor_gem_debugfs_bo_init(obj);
> @@ -388,6 +420,7 @@ panthor_gem_create_with_handle(struct drm_file *file,
>
> bo = to_panthor_bo(&shmem->base);
> bo->flags = flags;
> + bo->base.map_wc = should_map_wc(bo, exclusive_vm);
>
> if (exclusive_vm) {
> bo->exclusive_vm_root_gem = panthor_vm_root_gem(exclusive_vm);
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index 389d508b3848..ae69a5704756 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -863,8 +863,11 @@ panthor_queue_get_syncwait_obj(struct panthor_group *group, struct panthor_queue
> struct iosys_map map;
> int ret;
>
> - if (queue->syncwait.kmap)
> - return queue->syncwait.kmap + queue->syncwait.offset;
> + if (queue->syncwait.kmap) {
> + bo = container_of(queue->syncwait.obj,
> + struct panthor_gem_object, base.base);
> + goto out_sync;
> + }
>
> bo = panthor_vm_get_bo_for_va(group->vm,
> queue->syncwait.gpu_va,
> @@ -881,6 +884,17 @@ panthor_queue_get_syncwait_obj(struct panthor_group *group, struct panthor_queue
> if (drm_WARN_ON(&ptdev->base, !queue->syncwait.kmap))
> goto err_put_syncwait_obj;
>
> +out_sync:
> + /* Make sure the CPU caches are invalidated before the seqno is read.
> + * drm_gem_shmem_sync() is a NOP if map_wc=true, so no need to check
> + * it here.
> + */
> + panthor_gem_sync(&bo->base.base, queue->syncwait.offset,
> + queue->syncwait.sync64 ?
> + sizeof(struct panthor_syncobj_64b) :
> + sizeof(struct panthor_syncobj_32b),
> + DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE);
> +
> return queue->syncwait.kmap + queue->syncwait.offset;
>
> err_put_syncwait_obj:
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index 39d5ce815742..e238c6264fa1 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -681,6 +681,15 @@ struct drm_panthor_vm_get_state {
> enum drm_panthor_bo_flags {
> /** @DRM_PANTHOR_BO_NO_MMAP: The buffer object will never be CPU-mapped in userspace. */
> DRM_PANTHOR_BO_NO_MMAP = (1 << 0),
> +
> + /**
> + * @DRM_PANTHOR_BO_WB_MMAP: Force "Write-Back Cacheable" CPU mapping.
> + *
> + * CPU map the buffer object in userspace by forcing the "Write-Back
> + * Cacheable" cacheability attribute. The mapping otherwise uses the
> + * "Non-Cacheable" attribute if the GPU is not IO coherent.
> + */
> + DRM_PANTHOR_BO_WB_MMAP = (1 << 1),
> };
>
> /**
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v7 10/13] drm/panfrost: Add a PANFROST_SYNC_BO ioctl
2025-12-03 9:01 ` [PATCH v7 10/13] drm/panfrost: Add a PANFROST_SYNC_BO ioctl Boris Brezillon
@ 2025-12-04 16:09 ` Steven Price
0 siblings, 0 replies; 18+ messages in thread
From: Steven Price @ 2025-12-04 16:09 UTC (permalink / raw)
To: Boris Brezillon
Cc: dri-devel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Faith Ekstrand, Thierry Reding,
Mikko Perttunen, Melissa Wen, Maíra Canal, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Frank Binns, Matt Coster,
Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
Sean Paul, Marijn Suijten, Alex Deucher, Christian König,
amd-gfx, kernel
On 03/12/2025 09:01, Boris Brezillon wrote:
> From: Faith Ekstrand <faith.ekstrand@collabora.com>
>
> This will be used by the UMD to synchronize CPU-cached mappings when
> the UMD can't do it directly (no usermode cache maintenance instruction
> on Arm32).
>
> v2:
> - Add more to the commit message
> - Change the flags to better match the drm_gem_shmem_sync semantics
>
> v3:
> - Add Steve's R-b
>
> v4:
> - No changes
>
> v5:
> - Drop Steve's R-b (semantics changes requiring a new review)
>
> v6:
> - Bail out early in panfrost_ioctl_sync_bo() if op_count is zero
>
> v7:
> - Hand-roll our own bo_sync() helper
>
> Signed-off-by: Faith Ekstrand <faith.ekstrand@collabora.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
> ---
> drivers/gpu/drm/panfrost/panfrost_drv.c | 51 +++++++++++++++
> drivers/gpu/drm/panfrost/panfrost_gem.c | 84 +++++++++++++++++++++++++
> drivers/gpu/drm/panfrost/panfrost_gem.h | 2 +
> include/uapi/drm/panfrost_drm.h | 45 +++++++++++++
> 4 files changed, 182 insertions(+)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index d650cd138dad..77b0ae5ef000 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -580,6 +580,56 @@ static int panfrost_ioctl_jm_ctx_destroy(struct drm_device *dev, void *data,
> return panfrost_jm_ctx_destroy(file, args->handle);
> }
>
> +static int panfrost_ioctl_sync_bo(struct drm_device *ddev, void *data,
> + struct drm_file *file)
> +{
> + struct drm_panfrost_sync_bo *args = data;
> + struct drm_panfrost_bo_sync_op *ops;
> + struct drm_gem_object *obj;
> + int ret;
> + u32 i;
> +
> + if (args->pad)
> + return -EINVAL;
> +
> + if (!args->op_count)
> + return 0;
> +
> + ops = kvmalloc_array(args->op_count, sizeof(*ops), GFP_KERNEL);
> + if (!ops) {
> + DRM_DEBUG("Failed to allocate incoming BO sync ops array\n");
> + return -ENOMEM;
> + }
> +
> + if (copy_from_user(ops, (void __user *)(uintptr_t)args->ops,
> + args->op_count * sizeof(*ops))) {
> + DRM_DEBUG("Failed to copy in BO sync ops\n");
> + ret = -EFAULT;
> + goto err_ops;
> + }
> +
> + for (i = 0; i < args->op_count; i++) {
> + obj = drm_gem_object_lookup(file, ops[i].handle);
> + if (!obj) {
> + ret = -ENOENT;
> + goto err_ops;
> + }
> +
> + ret = panfrost_gem_sync(obj, ops[i].type,
> + ops[i].offset, ops[i].size);
> +
> + drm_gem_object_put(obj);
> +
> + if (ret)
> + goto err_ops;
> + }
> +
> +err_ops:
> + kvfree(ops);
> +
> + return ret;
> +}
> +
> int panfrost_unstable_ioctl_check(void)
> {
> if (!unstable_ioctls)
> @@ -649,6 +699,7 @@ static const struct drm_ioctl_desc panfrost_drm_driver_ioctls[] = {
> PANFROST_IOCTL(SET_LABEL_BO, set_label_bo, DRM_RENDER_ALLOW),
> PANFROST_IOCTL(JM_CTX_CREATE, jm_ctx_create, DRM_RENDER_ALLOW),
> PANFROST_IOCTL(JM_CTX_DESTROY, jm_ctx_destroy, DRM_RENDER_ALLOW),
> + PANFROST_IOCTL(SYNC_BO, sync_bo, DRM_RENDER_ALLOW),
> };
>
> static void panfrost_gpu_show_fdinfo(struct panfrost_device *pfdev,
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
> index 4afd1a7f77d5..8231ae04f54c 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
> @@ -489,6 +489,90 @@ panfrost_gem_set_label(struct drm_gem_object *obj, const char *label)
> kfree_const(old_label);
> }
>
> +int
> +panfrost_gem_sync(struct drm_gem_object *obj, u32 type, u32 offset, u32 size)
> +{
> + struct panfrost_gem_object *bo = to_panfrost_bo(obj);
> + struct drm_gem_shmem_object *shmem = &bo->base;
> + const struct drm_device *dev = shmem->base.dev;
> + struct sg_table *sgt;
> + struct scatterlist *sgl;
> + unsigned int count;
> +
> + /* Make sure the range is in bounds. */
> + if (offset + size < offset || offset + size > shmem->base.size)
> + return -EINVAL;
> +
> + /* Disallow CPU-cache maintenance on imported buffers. */
> + if (drm_gem_is_imported(&shmem->base))
> + return -EINVAL;
> +
> + switch (type) {
> + case PANFROST_BO_SYNC_CPU_CACHE_FLUSH:
> + case PANFROST_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE:
> + break;
> +
> + default:
> + return -EINVAL;
> + }
> +
> + /* Don't bother if it's WC-mapped */
> + if (shmem->map_wc)
> + return 0;
> +
> + /* Nothing to do if the size is zero. */
> + if (size == 0)
> + return 0;
> +
> + sgt = drm_gem_shmem_get_pages_sgt(shmem);
> + if (IS_ERR(sgt))
> + return PTR_ERR(sgt);
> +
> + for_each_sgtable_dma_sg(sgt, sgl, count) {
> + if (size == 0)
> + break;
> +
> + dma_addr_t paddr = sg_dma_address(sgl);
> + size_t len = sg_dma_len(sgl);
> +
> + if (len <= offset) {
> + offset -= len;
> + continue;
> + }
> +
> + paddr += offset;
> + len -= offset;
> + len = min_t(size_t, len, size);
> + size -= len;
> + offset = 0;
> +
> + /* It's unclear whether dma_sync_xxx() is the right API to do CPU
> + * cache maintenance given an IOMMU can register their own
> + * implementation doing more than just CPU cache flushes/invalidation,
> + * and what we really care about here is CPU caches only, but that's
> + * the best we have that is both arch-agnostic and does at least the
> + * CPU cache maintenance on a <page,offset,size> tuple.
> + *
> + * Also, I wish we could do a single
> + *
> + * dma_sync_single_for_device(BIDIR)
> + *
> + * and get a flush+invalidate, but that's not how it's implemented
> + * in practice (at least on arm64), so we have to make it
> + *
> + * dma_sync_single_for_device(TO_DEVICE)
> + * dma_sync_single_for_cpu(FROM_DEVICE)
> + *
> + * for the flush+invalidate case.
> + */
> + dma_sync_single_for_device(dev->dev, paddr, len, DMA_TO_DEVICE);
> + if (type == PANFROST_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE)
> + dma_sync_single_for_cpu(dev->dev, paddr, len, DMA_FROM_DEVICE);
> + }
> +
> + return 0;
> +}
> +
> void
> panfrost_gem_internal_set_label(struct drm_gem_object *obj, const char *label)
> {
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h
> index 7fec20339354..d61ffe1f6841 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem.h
> @@ -151,6 +151,8 @@ int panfrost_gem_shrinker_init(struct drm_device *dev);
> void panfrost_gem_shrinker_cleanup(struct drm_device *dev);
>
> void panfrost_gem_set_label(struct drm_gem_object *obj, const char *label);
> +int panfrost_gem_sync(struct drm_gem_object *obj, u32 type,
> + u32 offset, u32 size);
> void panfrost_gem_internal_set_label(struct drm_gem_object *obj, const char *label);
>
> #ifdef CONFIG_DEBUG_FS
> diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
> index 0c59714ae42b..e194e087a0c8 100644
> --- a/include/uapi/drm/panfrost_drm.h
> +++ b/include/uapi/drm/panfrost_drm.h
> @@ -24,6 +24,7 @@ extern "C" {
> #define DRM_PANFROST_SET_LABEL_BO 0x09
> #define DRM_PANFROST_JM_CTX_CREATE 0x0a
> #define DRM_PANFROST_JM_CTX_DESTROY 0x0b
> +#define DRM_PANFROST_SYNC_BO 0x0c
>
> #define DRM_IOCTL_PANFROST_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_SUBMIT, struct drm_panfrost_submit)
> #define DRM_IOCTL_PANFROST_WAIT_BO DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_WAIT_BO, struct drm_panfrost_wait_bo)
> @@ -35,6 +36,7 @@ extern "C" {
> #define DRM_IOCTL_PANFROST_SET_LABEL_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_SET_LABEL_BO, struct drm_panfrost_set_label_bo)
> #define DRM_IOCTL_PANFROST_JM_CTX_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_JM_CTX_CREATE, struct drm_panfrost_jm_ctx_create)
> #define DRM_IOCTL_PANFROST_JM_CTX_DESTROY DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_JM_CTX_DESTROY, struct drm_panfrost_jm_ctx_destroy)
> +#define DRM_IOCTL_PANFROST_SYNC_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_SYNC_BO, struct drm_panfrost_sync_bo)
>
> /*
> * Unstable ioctl(s): only exposed when the unsafe unstable_ioctls module
> @@ -308,6 +310,49 @@ struct drm_panfrost_set_label_bo {
> __u64 label;
> };
>
> +/* Valid flags to pass to drm_panfrost_bo_sync_op */
> +#define PANFROST_BO_SYNC_CPU_CACHE_FLUSH 0
> +#define PANFROST_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE 1
> +
> +/**
> + * struct drm_panthor_bo_flush_map_op - BO map sync op
> + */
> +struct drm_panfrost_bo_sync_op {
> + /** @handle: Handle of the buffer object to sync. */
> + __u32 handle;
> +
> + /** @type: Type of sync operation. */
> + __u32 type;
> +
> + /**
> + * @offset: Offset into the BO at which the sync range starts.
> + *
> + * This will be rounded down to the nearest cache line as needed.
> + */
> + __u32 offset;
> +
> + /**
> + * @size: Size of the range to sync
> + *
> + * @size + @offset will be rounded up to the nearest cache line as
> + * needed.
> + */
> + __u32 size;
> +};
> +
> +/**
> + * struct drm_panfrost_sync_bo - ioctl argument for syncing BO maps
> + */
> +struct drm_panfrost_sync_bo {
> + /** Array of struct drm_panfrost_bo_sync_op */
> + __u64 ops;
> +
> + /** Number of BO sync ops */
> + __u32 op_count;
> +
> + __u32 pad;
> +};
> +
> /* Definitions for coredump decoding in user space */
> #define PANFROSTDUMP_MAJOR 1
> #define PANFROSTDUMP_MINOR 0
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v7 12/13] drm/panfrost: Add flag to map GEM object Write-Back Cacheable
2025-12-03 9:01 ` [PATCH v7 12/13] drm/panfrost: Add flag to map GEM object Write-Back Cacheable Boris Brezillon
@ 2025-12-05 13:07 ` kernel test robot
0 siblings, 0 replies; 18+ messages in thread
From: kernel test robot @ 2025-12-05 13:07 UTC (permalink / raw)
To: Boris Brezillon, Steven Price
Cc: oe-kbuild-all, dri-devel, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Faith Ekstrand,
Thierry Reding, Mikko Perttunen, Melissa Wen, Maíra Canal,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, Frank Binns,
Matt Coster, Rob Clark, Dmitry Baryshkov, Abhinav Kumar,
Jessica Zhang, Sean Paul, Marijn Suijten, Alex Deucher,
Christian König, amd-gfx, kernel, Boris Brezillon
Hi Boris,
kernel test robot noticed the following build warnings:
[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on drm/drm-next drm-tip/drm-tip linus/master next-20251205]
[cannot apply to drm-i915/for-linux-next drm-i915/for-linux-next-fixes daeinki-drm-exynos/exynos-drm-next v6.18]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Boris-Brezillon/drm-panthor-Provide-a-custom-dma_buf-implementation/20251203-174141
base: https://gitlab.freedesktop.org/drm/misc/kernel.git drm-misc-next
patch link: https://lore.kernel.org/r/20251203090141.227394-13-boris.brezillon%40collabora.com
patch subject: [PATCH v7 12/13] drm/panfrost: Add flag to map GEM object Write-Back Cacheable
config: sparc64-randconfig-r113-20251205 (https://download.01.org/0day-ci/archive/20251205/202512051903.QuHryKTr-lkp@intel.com/config)
compiler: sparc64-linux-gcc (GCC) 8.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251205/202512051903.QuHryKTr-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512051903.QuHryKTr-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
>> drivers/gpu/drm/panfrost/panfrost_gem.c:392:10: sparse: sparse: Initializer entry defined twice
drivers/gpu/drm/panfrost/panfrost_gem.c:396:10: sparse: also defined here
vim +392 drivers/gpu/drm/panfrost/panfrost_gem.c
06fdcb378400a75 Boris Brezillon 2025-12-03 383
f3ba91228e8e917 Rob Herring 2018-09-10 384 static const struct drm_gem_object_funcs panfrost_gem_funcs = {
f3ba91228e8e917 Rob Herring 2018-09-10 385 .free = panfrost_gem_free_object,
a5efb4c9a562aa1 Rob Herring 2019-07-01 386 .open = panfrost_gem_open,
a5efb4c9a562aa1 Rob Herring 2019-07-01 387 .close = panfrost_gem_close,
c7fbcb7149ff932 Thomas Zimmermann 2021-11-08 388 .print_info = drm_gem_shmem_object_print_info,
187d2929206e6b0 Rob Herring 2019-07-26 389 .pin = panfrost_gem_pin,
c7fbcb7149ff932 Thomas Zimmermann 2021-11-08 390 .unpin = drm_gem_shmem_object_unpin,
c7fbcb7149ff932 Thomas Zimmermann 2021-11-08 391 .get_sg_table = drm_gem_shmem_object_get_sg_table,
06fdcb378400a75 Boris Brezillon 2025-12-03 @392 .export = panfrost_gem_prime_export,
c7fbcb7149ff932 Thomas Zimmermann 2021-11-08 393 .vmap = drm_gem_shmem_object_vmap,
c7fbcb7149ff932 Thomas Zimmermann 2021-11-08 394 .vunmap = drm_gem_shmem_object_vunmap,
c7fbcb7149ff932 Thomas Zimmermann 2021-11-08 395 .mmap = drm_gem_shmem_object_mmap,
cf47bdd19762531 Faith Ekstrand 2025-12-03 396 .export = drm_gem_prime_export,
9ccdac7aa822f04 Adrián Larumbe 2023-09-29 397 .status = panfrost_gem_status,
407c0ad5b19a477 Adrián Larumbe 2023-09-29 398 .rss = panfrost_gem_rss,
d315bdbfebd517c Thomas Zimmermann 2022-02-09 399 .vm_ops = &drm_gem_shmem_vm_ops,
f3ba91228e8e917 Rob Herring 2018-09-10 400 };
f3ba91228e8e917 Rob Herring 2018-09-10 401
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-12-05 13:07 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-03 9:01 [PATCH v7 00/13] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 01/13] drm/panthor: Provide a custom dma_buf implementation Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 02/13] drm/panthor: Fix panthor_gpu_coherency_set() Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 03/13] drm/panthor: Expose the selected coherency protocol to the UMD Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 04/13] drm/panthor: Add a PANTHOR_BO_SYNC ioctl Boris Brezillon
2025-12-04 15:42 ` Steven Price
2025-12-03 9:01 ` [PATCH v7 05/13] drm/panthor: Add an ioctl to query BO flags Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 06/13] drm/panthor: Add flag to map GEM object Write-Back Cacheable Boris Brezillon
2025-12-04 15:44 ` Steven Price
2025-12-03 9:01 ` [PATCH v7 07/13] drm/panthor: Bump the driver version to 1.6 Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 08/13] drm/panfrost: Provide a custom dma_buf implementation Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 09/13] drm/panfrost: Expose the selected coherency protocol to the UMD Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 10/13] drm/panfrost: Add a PANFROST_SYNC_BO ioctl Boris Brezillon
2025-12-04 16:09 ` Steven Price
2025-12-03 9:01 ` [PATCH v7 11/13] drm/panfrost: Add an ioctl to query BO flags Boris Brezillon
2025-12-03 9:01 ` [PATCH v7 12/13] drm/panfrost: Add flag to map GEM object Write-Back Cacheable Boris Brezillon
2025-12-05 13:07 ` kernel test robot
2025-12-03 9:01 ` [PATCH v7 13/13] drm/panfrost: Bump the driver version to 1.6 Boris Brezillon
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.