From: Boris Brezillon <boris.brezillon@collabora.com>
To: Akash Goel <akash.goel@arm.com>
Cc: "Steven Price" <steven.price@arm.com>,
dri-devel@lists.freedesktop.org,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Faith Ekstrand" <faith.ekstrand@collabora.com>,
"Thierry Reding" <thierry.reding@gmail.com>,
"Mikko Perttunen" <mperttunen@nvidia.com>,
"Melissa Wen" <mwen@igalia.com>,
"Maíra Canal" <mcanal@igalia.com>,
"Lucas De Marchi" <lucas.demarchi@intel.com>,
"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
"Frank Binns" <frank.binns@imgtec.com>,
"Matt Coster" <matt.coster@imgtec.com>,
"Rob Clark" <robin.clark@oss.qualcomm.com>,
"Dmitry Baryshkov" <lumag@kernel.org>,
"Abhinav Kumar" <abhinav.kumar@linux.dev>,
"Jessica Zhang" <jessica.zhang@oss.qualcomm.com>,
"Sean Paul" <sean@poorly.run>,
"Marijn Suijten" <marijn.suijten@somainline.org>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Christian König" <christian.koenig@amd.com>,
amd-gfx@lists.freedesktop.org,
"Loïc Molinari" <loic.molinari@collabora.com>,
kernel@collabora.com
Subject: Re: [PATCH v5 09/16] drm/panthor: Add flag to map GEM object Write-Back Cacheable
Date: Mon, 3 Nov 2025 18:13:29 +0100
Message-ID: <20251103181329.21822c2d@fedora>
In-Reply-To: <d4695588-9371-4a75-9521-6d4cfc173608@arm.com>
On Mon, 3 Nov 2025 16:43:12 +0000
Akash Goel <akash.goel@arm.com> wrote:
> On 10/30/25 14:05, Boris Brezillon wrote:
> > From: Loïc Molinari <loic.molinari@collabora.com>
> >
> > Will be used by the UMD to optimize CPU accesses to buffers
> > that are frequently read by the CPU, or on which the access
> > pattern makes non-cacheable mappings inefficient.
> >
> > Mapping buffers CPU-cached implies taking care of the CPU
> > cache maintenance in the UMD, unless the GPU is IO coherent.
> >
> > v2:
> > - Add more to the commit message
> > - Tweak the doc
> > - Make sure we sync the section of the BO pointing to the CS
> >   syncobj before we read its seqno
> >
> > v3:
> > - Fix formatting/spelling issues
> >
> > v4:
> > - Add Steve's R-b
> >
> > v5:
> > - Drop Steve's R-b (changes in the ioctl semantics requiring
> >   new review)
> >
> > Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > ---
> >  drivers/gpu/drm/panthor/panthor_drv.c   |  7 ++++-
> >  drivers/gpu/drm/panthor/panthor_gem.c   | 37 +++++++++++++++++++++++--
> >  drivers/gpu/drm/panthor/panthor_sched.c | 18 ++++++++++--
> >  include/uapi/drm/panthor_drm.h          | 12 ++++++++
> >  4 files changed, 69 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
> > index c07fc5dcd4a1..4e915f5ef3fa 100644
> > --- a/drivers/gpu/drm/panthor/panthor_drv.c
> > +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> > @@ -900,7 +900,8 @@ static int panthor_ioctl_vm_destroy(struct drm_device *ddev, void *data,
> >  	return panthor_vm_pool_destroy_vm(pfile->vms, args->id);
> >  }
> >
> > -#define PANTHOR_BO_FLAGS		DRM_PANTHOR_BO_NO_MMAP
> > +#define PANTHOR_BO_FLAGS		(DRM_PANTHOR_BO_NO_MMAP | \
> > +					 DRM_PANTHOR_BO_WB_MMAP)
> >
> >  static int panthor_ioctl_bo_create(struct drm_device *ddev, void *data,
> >  				   struct drm_file *file)
> > @@ -919,6 +920,10 @@ static int panthor_ioctl_bo_create(struct drm_device *ddev, void *data,
> >  		goto out_dev_exit;
> >  	}
> >
> > +	if ((args->flags & DRM_PANTHOR_BO_NO_MMAP) &&
> > +	    (args->flags & DRM_PANTHOR_BO_WB_MMAP))
> > +		return -EINVAL;
> > +
> >  	if (args->exclusive_vm_id) {
> >  		vm = panthor_vm_pool_get_vm(pfile->vms, args->exclusive_vm_id);
> >  		if (!vm) {
> > diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
> > index 1b1e98c61b8c..479a779ee59d 100644
> > --- a/drivers/gpu/drm/panthor/panthor_gem.c
> > +++ b/drivers/gpu/drm/panthor/panthor_gem.c
> > @@ -58,6 +58,39 @@ static void panthor_gem_debugfs_set_usage_flags(struct panthor_gem_object *bo, u
> > static void panthor_gem_debugfs_bo_init(struct panthor_gem_object *bo) {}
> > #endif
> >
> > +static bool
> > +should_map_wc(struct panthor_gem_object *bo, struct panthor_vm *exclusive_vm)
> > +{
> > +	struct panthor_device *ptdev = container_of(bo->base.base.dev, struct panthor_device, base);
> > +
> > +	/* We can't do uncached mappings if the device is coherent,
> > +	 * because the zeroing done by the shmem layer at page allocation
> > +	 * time happens on a cached mapping which isn't CPU-flushed (at least
> > +	 * not on Arm64 where the flush is deferred to PTE setup time, and
> > +	 * only done conditionally based on the mapping permissions). We can't
> > +	 * rely on dma_map_sgtable()/dma_sync_sgtable_for_xxx() either to flush
> > +	 * those, because they are NOPed if dma_dev_coherent() returns true.
> > +	 *
> > +	 * FIXME: Note that this problem is going to pop up again when we
> > +	 * decide to support mapping buffers with the NO_MMAP flag as
> > +	 * non-shareable (AKA buffers accessed only by the GPU), because we
> > +	 * need the same CPU flush to happen after page allocation, otherwise
> > +	 * there's a risk of data leak or late corruption caused by a dirty
> > +	 * cacheline being evicted. At this point we'll need a way to force
> > +	 * CPU cache maintenance regardless of whether the device is coherent
> > +	 * or not.
> > +	 */
> > +	if (ptdev->coherent)
> > +		return false;
> > +
> > +	/* Cached mappings are explicitly requested, so no write-combine. */
> > +	if (bo->flags & DRM_PANTHOR_BO_WB_MMAP)
> > +		return false;
> > +
> > +	/* The default is write-combine. */
> > +	return true;
> > +}
> > +
> > static void panthor_gem_free_object(struct drm_gem_object *obj)
> > {
> >  	struct panthor_gem_object *bo = to_panthor_bo(obj);
> > @@ -152,6 +185,7 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
> >  	bo = to_panthor_bo(&obj->base);
> >  	kbo->obj = &obj->base;
> >  	bo->flags = bo_flags;
> > +	bo->base.map_wc = should_map_wc(bo, vm);
> >
> >  	if (vm == panthor_fw_vm(ptdev))
> >  		debug_flags |= PANTHOR_DEBUGFS_GEM_USAGE_FLAG_FW_MAPPED;
> > @@ -255,7 +289,6 @@ static const struct drm_gem_object_funcs panthor_gem_funcs = {
> >   */
> >  struct drm_gem_object *panthor_gem_create_object(struct drm_device *ddev, size_t size)
> >  {
> > -	struct panthor_device *ptdev = container_of(ddev, struct panthor_device, base);
> >  	struct panthor_gem_object *obj;
> >
> >  	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> > @@ -263,7 +296,6 @@ struct drm_gem_object *panthor_gem_create_object(struct drm_device *ddev, size_t
> >  		return ERR_PTR(-ENOMEM);
> >
> >  	obj->base.base.funcs = &panthor_gem_funcs;
> > -	obj->base.map_wc = !ptdev->coherent;
> >  	mutex_init(&obj->label.lock);
> >
> >  	panthor_gem_debugfs_bo_init(obj);
> > @@ -298,6 +330,7 @@ panthor_gem_create_with_handle(struct drm_file *file,
> >
> >  	bo = to_panthor_bo(&shmem->base);
> >  	bo->flags = flags;
> > +	bo->base.map_wc = should_map_wc(bo, exclusive_vm);
> >
> >  	if (exclusive_vm) {
> >  		bo->exclusive_vm_root_gem = panthor_vm_root_gem(exclusive_vm);
> > diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> > index f5e01cb16cfc..7d5da5717de2 100644
> > --- a/drivers/gpu/drm/panthor/panthor_sched.c
> > +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> > @@ -868,8 +868,11 @@ panthor_queue_get_syncwait_obj(struct panthor_group *group, struct panthor_queue
> >  	struct iosys_map map;
> >  	int ret;
> >
> > -	if (queue->syncwait.kmap)
> > -		return queue->syncwait.kmap + queue->syncwait.offset;
> > +	if (queue->syncwait.kmap) {
> > +		bo = container_of(queue->syncwait.obj,
> > +				  struct panthor_gem_object, base.base);
> > +		goto out_sync;
> > +	}
> >
> >  	bo = panthor_vm_get_bo_for_va(group->vm,
> >  				      queue->syncwait.gpu_va,
> > @@ -886,6 +889,17 @@ panthor_queue_get_syncwait_obj(struct panthor_group *group, struct panthor_queue
> >  	if (drm_WARN_ON(&ptdev->base, !queue->syncwait.kmap))
> >  		goto err_put_syncwait_obj;
> >
> > +out_sync:
> > +	/* Make sure the CPU caches are invalidated before the seqno is read.
> > +	 * drm_gem_shmem_sync() is a NOP if map_wc=false, so no need to check
> Sorry nitpick.
>
> IIUC, drm_gem_shmem_sync() would be a NOP if 'map_wc' is true.
Oops, will fix that.
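
For reference, here is a minimal sketch of the corrected statement. It
uses a mock object, not the real drm_gem_shmem_object, and the struct,
the function name and the boolean return convention are assumptions
made purely for illustration. It only models the early-out discussed
above: a write-combine (uncached) CPU mapping has no CPU cache lines to
maintain, so the sync is a NOP when map_wc is true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock of the only fields this sketch needs; NOT the real drm_gem_shmem_object. */
struct mock_shmem_object {
	bool map_wc;	/* true: CPU mapping is write-combine (uncached) */
	size_t size;	/* object size in bytes */
};

/*
 * Hypothetical model of the early-out: returns true when CPU cache
 * maintenance would actually be performed on [offset, offset + len),
 * false when the call would be a NOP.
 */
static bool mock_sync_needs_cache_maintenance(const struct mock_shmem_object *bo,
					       size_t offset, size_t len)
{
	/* The range must fit inside the object. */
	assert(offset <= bo->size && len <= bo->size - offset);

	/* Uncached mapping: nothing to flush or invalidate. */
	if (bo->map_wc)
		return false;

	/* Cached mapping: work is needed unless the range is empty. */
	return len > 0;
}
```

In this model, the syncwait path can call the helper unconditionally,
which is what the comment in the patch is trying to say.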
>
>
>
> > +	 * it here.
> > +	 */
> > +	drm_gem_shmem_sync(&bo->base, queue->syncwait.offset,
> > +			   queue->syncwait.sync64 ?
> > +				   sizeof(struct panthor_syncobj_64b) :
> > +				   sizeof(struct panthor_syncobj_32b),
> > +			   DRM_GEM_SHMEM_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE);
> > +
> >  	return queue->syncwait.kmap + queue->syncwait.offset;
> >
> > err_put_syncwait_obj:
> > diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> > index 7eec9f922183..57e2f5ffa03c 100644
> > --- a/include/uapi/drm/panthor_drm.h
> > +++ b/include/uapi/drm/panthor_drm.h
> > @@ -681,6 +681,18 @@ struct drm_panthor_vm_get_state {
> > enum drm_panthor_bo_flags {
> >  	/** @DRM_PANTHOR_BO_NO_MMAP: The buffer object will never be CPU-mapped in userspace. */
> >  	DRM_PANTHOR_BO_NO_MMAP = (1 << 0),
> > +
> > +	/**
> > +	 * @DRM_PANTHOR_BO_WB_MMAP: Force "Write-Back Cacheable" CPU mapping.
> > +	 *
> > +	 * CPU map the buffer object in userspace by forcing the "Write-Back
> > +	 * Cacheable" cacheability attribute. The mapping otherwise uses the
> > +	 * "Non-Cacheable" attribute if the GPU is not IO coherent.
> > +	 *
> > +	 * Can't be set if exclusive_vm_id=0 (only private BOs can be mapped
> > +	 * cacheable).
> Sorry Boris, I may have misinterpreted the code.
>
> As per the comment, DRM_PANTHOR_BO_WB_MMAP flag should be rejected if
> 'exclusive_vm' is NULL. But I don't see any check for 'exclusive_vm'
> pointer inside should_map_wc().
You're right, I had this behavior enforced at some point, and dropped
it after adding {begin,end}_cpu_access() implementations to panthor.
I'll revisit the comment or re-introduce the check in v6 based on how
the review process goes.
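
To make the two options concrete, here is a minimal, self-contained
sketch of what the flag validation could look like if the check were
re-introduced. This is not the driver code: the mock_ names, the
wb_requires_private switch and the bit used for WB_MMAP are assumptions
made for illustration (the quoted uAPI hunk only shows the value of
DRM_PANTHOR_BO_NO_MMAP), and the unknown-flag check is modeled on the
role the PANTHOR_BO_FLAGS mask appears to play.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* NO_MMAP's bit comes from the quoted uAPI hunk; WB_MMAP's bit is ASSUMED. */
#define MOCK_BO_NO_MMAP	(1u << 0)
#define MOCK_BO_WB_MMAP	(1u << 1)
#define MOCK_BO_FLAGS	(MOCK_BO_NO_MMAP | MOCK_BO_WB_MMAP)

static_assert((MOCK_BO_NO_MMAP & MOCK_BO_WB_MMAP) == 0, "flag bits must not overlap");

/*
 * Hypothetical validation helper. With wb_requires_private=false only the
 * NO_MMAP/WB_MMAP combination is rejected, as in the quoted hunk. With
 * wb_requires_private=true it also enforces what the uAPI comment says:
 * WB_MMAP is rejected when exclusive_vm_id is 0.
 */
static int mock_validate_bo_flags(uint32_t flags, uint32_t exclusive_vm_id,
				  bool wb_requires_private)
{
	/* Unknown flags are rejected. */
	if (flags & ~MOCK_BO_FLAGS)
		return -EINVAL;

	/* A buffer that is never CPU-mapped cannot ask for a cached CPU mapping. */
	if ((flags & MOCK_BO_NO_MMAP) && (flags & MOCK_BO_WB_MMAP))
		return -EINVAL;

	/* Optional: only private BOs may be mapped cacheable. */
	if (wb_requires_private && (flags & MOCK_BO_WB_MMAP) && !exclusive_vm_id)
		return -EINVAL;

	return 0;
}

/* Model of the should_map_wc() decision from the quoted patch. */
static bool mock_should_map_wc(uint32_t flags, bool coherent)
{
	/* Coherent device: never use an uncached mapping. */
	if (coherent)
		return false;

	/* Cached mapping explicitly requested. */
	if (flags & MOCK_BO_WB_MMAP)
		return false;

	/* Default: write-combine. */
	return true;
}
```

With wb_requires_private=false the model matches the quoted hunks,
where should_map_wc() never looks at the exclusive VM; flipping the
switch shows the stricter semantics the doc comment currently
describes.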
Thread overview: 35+ messages
2025-10-30 14:05 [PATCH v5 00/16] drm/panfrost, panthor: Cached maps and explicit flushing Boris Brezillon
2025-10-30 14:05 ` [PATCH v5 01/16] drm/prime: Simplify life of drivers needing custom dma_buf_ops Boris Brezillon
2025-10-30 14:25 ` Christian König
2025-10-30 14:35 ` Boris Brezillon
2025-10-30 14:05 ` [PATCH v5 02/16] drm/shmem: Provide a generic {begin, end}_cpu_access() implementation Boris Brezillon
2025-10-30 14:31 ` [PATCH v5 02/16] drm/shmem: Provide a generic {begin,end}_cpu_access() implementation Christian König
2025-11-04 8:08 ` Boris Brezillon
2025-11-03 20:34 ` [PATCH v5 02/16] drm/shmem: Provide a generic {begin, end}_cpu_access() implementation Akash Goel
2025-11-04 7:42 ` Boris Brezillon
2025-10-30 14:05 ` [PATCH v5 03/16] drm/shmem: Add a drm_gem_shmem_sync() helper Boris Brezillon
2025-11-14 15:02 ` Steven Price
2025-10-30 14:05 ` [PATCH v5 04/16] drm/panthor: Provide a custom dma_buf implementation Boris Brezillon
2025-11-14 15:02 ` Steven Price
2025-10-30 14:05 ` [PATCH v5 05/16] drm/panthor: Fix panthor_gpu_coherency_set() Boris Brezillon
2025-10-30 14:05 ` [PATCH v5 06/16] drm/panthor: Expose the selected coherency protocol to the UMD Boris Brezillon
2025-10-30 14:05 ` [PATCH v5 07/16] drm/panthor: Add a PANTHOR_BO_SYNC ioctl Boris Brezillon
2025-10-31 7:25 ` Marcin Ślusarz
2025-11-03 20:42 ` Akash Goel
2025-11-04 7:41 ` Boris Brezillon
2025-10-30 14:05 ` [PATCH v5 08/16] drm/panthor: Add an ioctl to query BO flags Boris Brezillon
2025-10-30 14:05 ` [PATCH v5 09/16] drm/panthor: Add flag to map GEM object Write-Back Cacheable Boris Brezillon
2025-11-03 16:43 ` Akash Goel
2025-11-03 17:13 ` Boris Brezillon [this message]
2025-10-30 14:05 ` [PATCH v5 10/16] drm/panthor: Bump the driver version to 1.6 Boris Brezillon
2025-10-30 14:05 ` [PATCH v5 11/16] drm/panfrost: Provide a custom dma_buf implementation Boris Brezillon
2025-11-14 16:17 ` Steven Price
2025-10-30 14:05 ` [PATCH v5 12/16] drm/panfrost: Expose the selected coherency protocol to the UMD Boris Brezillon
2025-11-14 16:19 ` Steven Price
2025-10-30 14:05 ` [PATCH v5 13/16] drm/panfrost: Add a PANFROST_SYNC_BO ioctl Boris Brezillon
2025-10-31 7:08 ` Marcin Ślusarz
2025-10-31 8:49 ` Boris Brezillon
2025-10-30 14:05 ` [PATCH v5 14/16] drm/panfrost: Add an ioctl to query BO flags Boris Brezillon
2025-10-30 14:05 ` [PATCH v5 15/16] drm/panfrost: Add flag to map GEM object Write-Back Cacheable Boris Brezillon
2025-11-14 16:22 ` Steven Price
2025-10-30 14:05 ` [PATCH v5 16/16] drm/panfrost: Bump the driver version to 1.6 Boris Brezillon