Intel-XE Archive on lore.kernel.org
From: Matthew Brost <matthew.brost@intel.com>
To: "Gupta, Varun" <varun.gupta@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <himal.prasad.ghimiray@intel.com>
Subject: Re: [PATCH] drm/xe: Skip BO migration when platform supports CPU atomics on VRAM
Date: Tue, 24 Feb 2026 22:58:20 -0800	[thread overview]
Message-ID: <aZ6djEEAx/ktco04@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <9ad9bac0-1fe0-4dc2-b767-b2285653d4a4@intel.com>

On Wed, Feb 25, 2026 at 11:47:17AM +0530, Gupta, Varun wrote:
> 
> On 24-Feb-26 10:33 PM, Matthew Brost wrote:
> > On Tue, Feb 24, 2026 at 02:57:58PM +0530, Varun Gupta wrote:
> > 
> > Are you sure about this? CPUs supporting atomics (i.e., CPU instructions
> > with a LOCK prefix) over PCIe is news to me. Why do you think this is
> > supported on Xe2? Also, wouldn’t the CPU factor in here too if it were
> > supported?
> > 
> > If this somehow works—which I doubt—I think we’d need a test case where
> > the CPU issues atomics and the GPU does too. Then verify the result,
> > e.g., both sides perform 10,000 atomic increments on a VRAM page and we
> > confirm the final value is 20,000. Maybe we already have an IGT madvise
> > test for this; I’m not sure.
> > 
> > Matt
> 
> Thanks for the review.
> Before sending the patch, I only tested whether CPU atomics on VRAM were
> self-consistent. That possibly worked only because the root port
> serialized the individual read and write transactions rather than
> performing a true atomic read-modify-write.
> After your suggestion, I wrote a simple subtest running concurrent
> CPU+GPU atomics and observed some lost updates (around 150-200 out of
> 20000 operations).
> Do you see any other approach that could make this work, or should I drop
> this patch entirely?

I don't see how it is possible to make this work over PCIe without
dynamically removing mappings on one side or the other to ensure
exclusive access (which is what the current implementation does).

I'd just drop the patch.

Matt

> The has_system_atomics_on_vram flag could be kept as dead infrastructure
> for future CXL-attached devices, where host-device coherency actually
> works.
> Let me know your thoughts.
> 
> Thanks,
> Varun
> 
> > > should_migrate_to_smem() forced any BO marked ATOMIC_GLOBAL
> > > or ATOMIC_CPU out of VRAM on a CPU fault, regardless of whether the
> > > platform actually requires it.
> > > 
> > > Introduce has_system_atomics_on_vram to the device capability flags.
> > > should_migrate_to_smem() now gates migration on this flag, eliminating
> > > redundant evictions on capable hardware.
> > > 
> > > Signed-off-by: Varun Gupta <varun.gupta@intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_bo.c           | 16 +++++++++-------
> > >   drivers/gpu/drm/xe/xe_device_types.h |  2 ++
> > >   drivers/gpu/drm/xe/xe_pci.c          |  5 ++++-
> > >   3 files changed, 15 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> > > index d6c2cb959cdd..848467ca6700 100644
> > > --- a/drivers/gpu/drm/xe/xe_bo.c
> > > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > > @@ -1771,14 +1771,15 @@ static void xe_gem_object_close(struct drm_gem_object *obj,
> > >   static bool should_migrate_to_smem(struct xe_bo *bo)
> > >   {
> > > +	struct xe_device *xe = xe_bo_device(bo);
> > > +
> > >   	/*
> > > -	 * NOTE: The following atomic checks are platform-specific. For example,
> > > -	 * if a device supports CXL atomics, these may not be necessary or
> > > -	 * may behave differently.
> > > +	 * Migrate to system memory only if the platform does not support
> > > +	 * CPU atomics on VRAM.
> > >   	 */
> > > -
> > > -	return bo->attr.atomic_access == DRM_XE_ATOMIC_GLOBAL ||
> > > -	       bo->attr.atomic_access == DRM_XE_ATOMIC_CPU;
> > > +	return (bo->attr.atomic_access == DRM_XE_ATOMIC_GLOBAL ||
> > > +		bo->attr.atomic_access == DRM_XE_ATOMIC_CPU) &&
> > > +	       !xe->info.has_system_atomics_on_vram;
> > >   }
> > >   static int xe_bo_wait_usage_kernel(struct xe_bo *bo, struct ttm_operation_ctx *ctx)
> > > @@ -1804,6 +1805,7 @@ static int xe_bo_fault_migrate(struct xe_bo *bo, struct ttm_operation_ctx *ctx,
> > >   			       struct drm_exec *exec)
> > >   {
> > >   	struct ttm_buffer_object *tbo = &bo->ttm;
> > > +	struct xe_device *xe = xe_bo_device(bo);
> > >   	int err = 0;
> > >   	if (ttm_manager_type(tbo->bdev, tbo->resource->mem_type)->use_tt) {
> > > @@ -1811,7 +1813,7 @@ static int xe_bo_fault_migrate(struct xe_bo *bo, struct ttm_operation_ctx *ctx,
> > >   		if (!err)
> > >   			err = ttm_bo_populate(&bo->ttm, ctx);
> > >   	} else if (should_migrate_to_smem(bo)) {
> > > -		xe_assert(xe_bo_device(bo), bo->flags & XE_BO_FLAG_SYSTEM);
> > > +		xe_assert(xe, bo->flags & XE_BO_FLAG_SYSTEM);
> > >   		err = xe_bo_migrate(bo, XE_PL_TT, ctx, exec);
> > >   	}
> > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > > index 8f3ef836541e..2fe643c179fe 100644
> > > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > > @@ -205,6 +205,8 @@ struct xe_device {
> > >   		u8 has_usm:1;
> > >   		/** @info.has_64bit_timestamp: Device supports 64-bit timestamps */
> > >   		u8 has_64bit_timestamp:1;
> > > +		/** @info.has_system_atomics_on_vram: Supports CPU atomics on VRAM */
> > > +		u8 has_system_atomics_on_vram:1;
> > >   		/** @info.is_dgfx: is discrete device */
> > >   		u8 is_dgfx:1;
> > >   		/** @info.needs_scratch: needs scratch page for oob prefetch to work */
> > > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > > index 56a768f2cfca..f2d86e08e190 100644
> > > --- a/drivers/gpu/drm/xe/xe_pci.c
> > > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > > @@ -925,8 +925,11 @@ static int xe_info_init(struct xe_device *xe,
> > >   	xe->info.has_asid = graphics_desc->has_asid;
> > >   	xe->info.has_atomic_enable_pte_bit = graphics_desc->has_atomic_enable_pte_bit;
> > > -	if (xe->info.platform != XE_PVC)
> > > +	if (xe->info.platform != XE_PVC) {
> > >   		xe->info.has_device_atomics_on_smem = 1;
> > > +		/* PVC doesn't support CPU atomics on VRAM, Xe2+ platforms do */
> > > +		xe->info.has_system_atomics_on_vram = 1;
> > > +	}
> > >   	xe->info.has_range_tlb_inval = graphics_desc->has_range_tlb_inval;
> > >   	xe->info.has_ctx_tlb_inval = graphics_desc->has_ctx_tlb_inval;
> > > -- 
> > > 2.43.0
> > > 

Thread overview: 5+ messages
2026-02-24  9:27 [PATCH] drm/xe: Skip BO migration when platform supports CPU atomics on VRAM Varun Gupta
2026-02-24  9:35 ` ✓ CI.KUnit: success for " Patchwork
2026-02-24 17:03 ` [PATCH] " Matthew Brost
2026-02-25  6:17   ` Gupta, Varun
2026-02-25  6:58     ` Matthew Brost [this message]
