From: Boris Brezillon <boris.brezillon@collabora.com>
To: Akash Goel <akash.goel@arm.com>
Cc: "Steven Price" <steven.price@arm.com>,
dri-devel@lists.freedesktop.org,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Faith Ekstrand" <faith.ekstrand@collabora.com>,
"Thierry Reding" <thierry.reding@gmail.com>,
"Mikko Perttunen" <mperttunen@nvidia.com>,
"Melissa Wen" <mwen@igalia.com>,
"Maíra Canal" <mcanal@igalia.com>,
"Lucas De Marchi" <lucas.demarchi@intel.com>,
"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
"Frank Binns" <frank.binns@imgtec.com>,
"Matt Coster" <matt.coster@imgtec.com>,
"Rob Clark" <robin.clark@oss.qualcomm.com>,
"Dmitry Baryshkov" <lumag@kernel.org>,
"Abhinav Kumar" <abhinav.kumar@linux.dev>,
"Jessica Zhang" <jessica.zhang@oss.qualcomm.com>,
"Sean Paul" <sean@poorly.run>,
"Marijn Suijten" <marijn.suijten@somainline.org>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Christian König" <christian.koenig@amd.com>,
amd-gfx@lists.freedesktop.org,
"Loïc Molinari" <loic.molinari@collabora.com>,
kernel@collabora.com
Subject: Re: [PATCH v5 09/16] drm/panthor: Add flag to map GEM object Write-Back Cacheable
Date: Mon, 3 Nov 2025 18:13:29 +0100
Message-ID: <20251103181329.21822c2d@fedora>
In-Reply-To: <d4695588-9371-4a75-9521-6d4cfc173608@arm.com>

On Mon, 3 Nov 2025 16:43:12 +0000
Akash Goel <akash.goel@arm.com> wrote:
> On 10/30/25 14:05, Boris Brezillon wrote:
> > From: Loïc Molinari <loic.molinari@collabora.com>
> >
> > Will be used by the UMD to optimize CPU accesses to buffers
> > that are frequently read by the CPU, or on which the access
> > pattern makes non-cacheable mappings inefficient.
> >
> > Mapping buffers CPU-cached implies taking care of the CPU
> > cache maintenance in the UMD, unless the GPU is IO coherent.
> >
> > v2:
> > - Add more to the commit message
> > - Tweak the doc
> > - Make sure we sync the section of the BO pointing to the CS
> > syncobj before we read its seqno
> >
> > v3:
> > - Fix formatting/spelling issues
> >
> > v4:
> > - Add Steve's R-b
> >
> > v5:
> > - Drop Steve's R-b (changes in the ioctl semantics requiring
> > new review)
> >
> > Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > ---
> > drivers/gpu/drm/panthor/panthor_drv.c | 7 ++++-
> > drivers/gpu/drm/panthor/panthor_gem.c | 37 +++++++++++++++++++++++--
> > drivers/gpu/drm/panthor/panthor_sched.c | 18 ++++++++++--
> > include/uapi/drm/panthor_drm.h | 12 ++++++++
> > 4 files changed, 69 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
> > index c07fc5dcd4a1..4e915f5ef3fa 100644
> > --- a/drivers/gpu/drm/panthor/panthor_drv.c
> > +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> > @@ -900,7 +900,8 @@ static int panthor_ioctl_vm_destroy(struct drm_device *ddev, void *data,
> > return panthor_vm_pool_destroy_vm(pfile->vms, args->id);
> > }
> >
> > -#define PANTHOR_BO_FLAGS DRM_PANTHOR_BO_NO_MMAP
> > +#define PANTHOR_BO_FLAGS (DRM_PANTHOR_BO_NO_MMAP | \
> > + DRM_PANTHOR_BO_WB_MMAP)
> >
> > static int panthor_ioctl_bo_create(struct drm_device *ddev, void *data,
> > struct drm_file *file)
> > @@ -919,6 +920,10 @@ static int panthor_ioctl_bo_create(struct drm_device *ddev, void *data,
> > goto out_dev_exit;
> > }
> >
> > + if ((args->flags & DRM_PANTHOR_BO_NO_MMAP) &&
> > + (args->flags & DRM_PANTHOR_BO_WB_MMAP))
> > + return -EINVAL;
> > +
> > if (args->exclusive_vm_id) {
> > vm = panthor_vm_pool_get_vm(pfile->vms, args->exclusive_vm_id);
> > if (!vm) {
> > diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
> > index 1b1e98c61b8c..479a779ee59d 100644
> > --- a/drivers/gpu/drm/panthor/panthor_gem.c
> > +++ b/drivers/gpu/drm/panthor/panthor_gem.c
> > @@ -58,6 +58,39 @@ static void panthor_gem_debugfs_set_usage_flags(struct panthor_gem_object *bo, u
> > static void panthor_gem_debugfs_bo_init(struct panthor_gem_object *bo) {}
> > #endif
> >
> > +static bool
> > +should_map_wc(struct panthor_gem_object *bo, struct panthor_vm *exclusive_vm)
> > +{
> > + struct panthor_device *ptdev = container_of(bo->base.base.dev, struct panthor_device, base);
> > +
> > + /* We can't do uncached mappings if the device is coherent,
> > + * because the zeroing done by the shmem layer at page allocation
> > + * time happens on a cached mapping which isn't CPU-flushed (at least
> > + * not on Arm64 where the flush is deferred to PTE setup time, and
> > + * only done conditionally based on the mapping permissions). We can't
> > + * rely on dma_map_sgtable()/dma_sync_sgtable_for_xxx() either to flush
> > + * those, because they are NOPed if dma_dev_coherent() returns true.
> > + *
> > + * FIXME: Note that this problem is going to pop up again when we
> > + * decide to support mapping buffers with the NO_MMAP flag as
> > + * non-shareable (AKA buffers accessed only by the GPU), because we
> > + * need the same CPU flush to happen after page allocation, otherwise
> > + * there's a risk of data leak or late corruption caused by a dirty
> > + * cacheline being evicted. At this point we'll need a way to force
> > + * CPU cache maintenance regardless of whether the device is coherent
> > + * or not.
> > + */
> > + if (ptdev->coherent)
> > + return false;
> > +
> > + /* Cached mappings are explicitly requested, so no write-combine. */
> > + if (bo->flags & DRM_PANTHOR_BO_WB_MMAP)
> > + return false;
> > +
> > + /* The default is write-combine. */
> > + return true;
> > +}
> > +
> > static void panthor_gem_free_object(struct drm_gem_object *obj)
> > {
> > struct panthor_gem_object *bo = to_panthor_bo(obj);
> > @@ -152,6 +185,7 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
> > bo = to_panthor_bo(&obj->base);
> > kbo->obj = &obj->base;
> > bo->flags = bo_flags;
> > + bo->base.map_wc = should_map_wc(bo, vm);
> >
> > if (vm == panthor_fw_vm(ptdev))
> > debug_flags |= PANTHOR_DEBUGFS_GEM_USAGE_FLAG_FW_MAPPED;
> > @@ -255,7 +289,6 @@ static const struct drm_gem_object_funcs panthor_gem_funcs = {
> > */
> > struct drm_gem_object *panthor_gem_create_object(struct drm_device *ddev, size_t size)
> > {
> > - struct panthor_device *ptdev = container_of(ddev, struct panthor_device, base);
> > struct panthor_gem_object *obj;
> >
> > obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> > @@ -263,7 +296,6 @@ struct drm_gem_object *panthor_gem_create_object(struct drm_device *ddev, size_t
> > return ERR_PTR(-ENOMEM);
> >
> > obj->base.base.funcs = &panthor_gem_funcs;
> > - obj->base.map_wc = !ptdev->coherent;
> > mutex_init(&obj->label.lock);
> >
> > panthor_gem_debugfs_bo_init(obj);
> > @@ -298,6 +330,7 @@ panthor_gem_create_with_handle(struct drm_file *file,
> >
> > bo = to_panthor_bo(&shmem->base);
> > bo->flags = flags;
> > + bo->base.map_wc = should_map_wc(bo, exclusive_vm);
> >
> > if (exclusive_vm) {
> > bo->exclusive_vm_root_gem = panthor_vm_root_gem(exclusive_vm);
> > diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> > index f5e01cb16cfc..7d5da5717de2 100644
> > --- a/drivers/gpu/drm/panthor/panthor_sched.c
> > +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> > @@ -868,8 +868,11 @@ panthor_queue_get_syncwait_obj(struct panthor_group *group, struct panthor_queue
> > struct iosys_map map;
> > int ret;
> >
> > - if (queue->syncwait.kmap)
> > - return queue->syncwait.kmap + queue->syncwait.offset;
> > + if (queue->syncwait.kmap) {
> > + bo = container_of(queue->syncwait.obj,
> > + struct panthor_gem_object, base.base);
> > + goto out_sync;
> > + }
> >
> > bo = panthor_vm_get_bo_for_va(group->vm,
> > queue->syncwait.gpu_va,
> > @@ -886,6 +889,17 @@ panthor_queue_get_syncwait_obj(struct panthor_group *group, struct panthor_queue
> > if (drm_WARN_ON(&ptdev->base, !queue->syncwait.kmap))
> > goto err_put_syncwait_obj;
> >
> > +out_sync:
> > + /* Make sure the CPU caches are invalidated before the seqno is read.
> > + * drm_gem_shmem_sync() is a NOP if map_wc=false, so no need to check
>
> Sorry nitpick.
>
> IIUC, drm_gem_shmem_sync() would be a NOP if 'map_wc' is true.

Oops, will fix that.

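To make the corrected wording concrete, here is a minimal userspace model of the decision made by should_map_wc() in the quoted hunk, together with the point raised above: the sync only has work to do when the mapping is cached (map_wc=false). This is a sketch, not driver code. The MODEL_* names and both helper functions are hypothetical, and the bit value used for the WB_MMAP flag is an assumption, since the quoted uAPI hunk does not show it.

```c
#include <stdbool.h>
#include <stdint.h>

/* NO_MMAP's value is taken from the quoted uAPI hunk. */
#define MODEL_BO_NO_MMAP (1u << 0)
/* ASSUMPTION: the WB_MMAP bit value is not visible in the quoted hunk. */
#define MODEL_BO_WB_MMAP (1u << 1)

/*
 * Hypothetical stand-in for should_map_wc(): mirrors the three steps
 * of the quoted function, using plain values instead of driver structs.
 */
static bool model_should_map_wc(bool coherent, uint32_t flags)
{
	/* Coherent device: uncached mappings are not used. */
	if (coherent)
		return false;

	/* Cached mapping explicitly requested: no write-combine. */
	if (flags & MODEL_BO_WB_MMAP)
		return false;

	/* The default is write-combine. */
	return true;
}

/*
 * Hypothetical helper modelling only the statement made in this thread:
 * the sync is a NOP when map_wc is true, because a write-combine mapping
 * has no CPU cache lines to flush or invalidate.
 */
static bool model_sync_is_nop(bool map_wc)
{
	return map_wc;
}
```

With this model, a default BO on a non-coherent device is write-combine and the sync is skipped, while a WB_MMAP BO on the same device is cached and the sync does real work before the seqno is read.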
>
>
>
> > + * it here.
> > + */
> > + drm_gem_shmem_sync(&bo->base, queue->syncwait.offset,
> > + queue->syncwait.sync64 ?
> > + sizeof(struct panthor_syncobj_64b) :
> > + sizeof(struct panthor_syncobj_32b),
> > + DRM_GEM_SHMEM_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE);
> > +
> > return queue->syncwait.kmap + queue->syncwait.offset;
> >
> > err_put_syncwait_obj:
> > diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> > index 7eec9f922183..57e2f5ffa03c 100644
> > --- a/include/uapi/drm/panthor_drm.h
> > +++ b/include/uapi/drm/panthor_drm.h
> > @@ -681,6 +681,18 @@ struct drm_panthor_vm_get_state {
> > enum drm_panthor_bo_flags {
> > /** @DRM_PANTHOR_BO_NO_MMAP: The buffer object will never be CPU-mapped in userspace. */
> > DRM_PANTHOR_BO_NO_MMAP = (1 << 0),
> > +
> > + /**
> > + * @DRM_PANTHOR_BO_WB_MMAP: Force "Write-Back Cacheable" CPU mapping.
> > + *
> > + * CPU map the buffer object in userspace by forcing the "Write-Back
> > + * Cacheable" cacheability attribute. The mapping otherwise uses the
> > + * "Non-Cacheable" attribute if the GPU is not IO coherent.
> > + *
> > + * Can't be set if exclusive_vm_id=0 (only private BOs can be mapped
> > + * cacheable).
>
> Sorry Boris, I may have misinterpreted the code.
>
> As per the comment, DRM_PANTHOR_BO_WB_MMAP flag should be rejected if
> 'exclusive_vm' is NULL. But I don't see any check for 'exclusive_vm'
> pointer inside should_map_wc().

You're right, I had this behavior enforced at some point, and dropped
it after adding {begin,end}_cpu_access() implementations to panthor.
I'll revisit the comment or re-introduce the check in v6 based on how
the review process goes.
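
For reference, if the check were re-introduced, the flag validation could look roughly like the userspace sketch below. This is one possible shape under stated assumptions, not what v6 will contain: the MODEL_* names and model_check_bo_flags() are hypothetical, the WB_MMAP bit value is assumed (the quoted hunk does not show it), and the second test is only what the uAPI comment currently documents, not something v5 enforces.

```c
#include <errno.h>
#include <stdint.h>

/* NO_MMAP's value is taken from the quoted uAPI hunk. */
#define MODEL_BO_NO_MMAP (1u << 0)
/* ASSUMPTION: the WB_MMAP bit value is not visible in the quoted hunk. */
#define MODEL_BO_WB_MMAP (1u << 1)

/* Hypothetical helper: returns 0 if the flag combination is accepted. */
static int model_check_bo_flags(uint32_t flags, uint32_t exclusive_vm_id)
{
	/* Present in v5: a BO cannot be both never-mapped and mapped cached. */
	if ((flags & MODEL_BO_NO_MMAP) && (flags & MODEL_BO_WB_MMAP))
		return -EINVAL;

	/*
	 * HYPOTHETICAL: the check the uAPI comment describes but v5 lacks.
	 * Only private BOs (exclusive_vm_id != 0) may be mapped cacheable.
	 */
	if ((flags & MODEL_BO_WB_MMAP) && exclusive_vm_id == 0)
		return -EINVAL;

	return 0;
}
```

The alternative discussed above is to leave the code as it is and drop the exclusive_vm_id sentence from the uAPI comment instead.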