From: "Gupta, Varun" <varun.gupta@intel.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <himal.prasad.ghimiray@intel.com>
Subject: Re: [PATCH] drm/xe: Skip BO migration when platform supports CPU atomics on VRAM
Date: Wed, 25 Feb 2026 11:47:17 +0530 [thread overview]
Message-ID: <9ad9bac0-1fe0-4dc2-b767-b2285653d4a4@intel.com> (raw)
In-Reply-To: <aZ3Z/fGBaTBtyd2K@lstrano-desk.jf.intel.com>
On 24-Feb-26 10:33 PM, Matthew Brost wrote:
> On Tue, Feb 24, 2026 at 02:57:58PM +0530, Varun Gupta wrote:
>
> Are you sure about this? CPUs supporting atomics (i.e., CPU instructions
> with a LOCK prefix) over PCIe is news to me. Why do you think this is
> supported on Xe2? Also, wouldn’t the CPU factor in here too if it were
> supported?
>
> If this somehow works—which I doubt—I think we’d need a test case where
> the CPU issues atomics and the GPU does too. Then verify the result,
> e.g., both sides perform 10,000 atomic increments on a VRAM page and we
> confirm the final value is 20,000. Maybe we already have an IGT madvise
> test for this; I’m not sure.
>
> Matt
Thanks for the review.
Before sending the patch, I only tested whether CPU atomics on VRAM were
self-consistent. That likely appeared to work only because the root port
serialized the separate read and write transactions, rather than
performing a true atomic operation.
Following your suggestion, I wrote a simple subtest that runs concurrent
CPU+GPU atomics, and I observed lost updates (around 150-200 out of
20000 operations).
Do you see any other approach that could make this work, or should I
drop this patch entirely?
The has_system_atomics_on_vram flag could be kept as currently unused
infrastructure for future CXL-attached devices, where host-device
coherency actually works.
Let me know your thoughts.
Thanks,
Varun
>> should_migrate_to_smem() forced any BO marked ATOMIC_GLOBAL
>> or ATOMIC_CPU out of VRAM on a CPU fault, regardless of whether the
>> platform actually requires it.
>>
>> Introduce has_system_atomics_on_vram to the device capability flags.
>> should_migrate_to_smem() now gates migration on this flag, eliminating
>> redundant evictions on capable hardware.
>>
>> Signed-off-by: Varun Gupta <varun.gupta@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_bo.c | 16 +++++++++-------
>> drivers/gpu/drm/xe/xe_device_types.h | 2 ++
>> drivers/gpu/drm/xe/xe_pci.c | 5 ++++-
>> 3 files changed, 15 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
>> index d6c2cb959cdd..848467ca6700 100644
>> --- a/drivers/gpu/drm/xe/xe_bo.c
>> +++ b/drivers/gpu/drm/xe/xe_bo.c
>> @@ -1771,14 +1771,15 @@ static void xe_gem_object_close(struct drm_gem_object *obj,
>>
>> static bool should_migrate_to_smem(struct xe_bo *bo)
>> {
>> + struct xe_device *xe = xe_bo_device(bo);
>> +
>> /*
>> - * NOTE: The following atomic checks are platform-specific. For example,
>> - * if a device supports CXL atomics, these may not be necessary or
>> - * may behave differently.
>> + * Migrate to system memory only if the platform does not support
>> + * CPU atomics on VRAM.
>> */
>> -
>> - return bo->attr.atomic_access == DRM_XE_ATOMIC_GLOBAL ||
>> - bo->attr.atomic_access == DRM_XE_ATOMIC_CPU;
>> + return (bo->attr.atomic_access == DRM_XE_ATOMIC_GLOBAL ||
>> + bo->attr.atomic_access == DRM_XE_ATOMIC_CPU) &&
>> + !xe->info.has_system_atomics_on_vram;
>> }
>>
>> static int xe_bo_wait_usage_kernel(struct xe_bo *bo, struct ttm_operation_ctx *ctx)
>> @@ -1804,6 +1805,7 @@ static int xe_bo_fault_migrate(struct xe_bo *bo, struct ttm_operation_ctx *ctx,
>> struct drm_exec *exec)
>> {
>> struct ttm_buffer_object *tbo = &bo->ttm;
>> + struct xe_device *xe = xe_bo_device(bo);
>> int err = 0;
>>
>> if (ttm_manager_type(tbo->bdev, tbo->resource->mem_type)->use_tt) {
>> @@ -1811,7 +1813,7 @@ static int xe_bo_fault_migrate(struct xe_bo *bo, struct ttm_operation_ctx *ctx,
>> if (!err)
>> err = ttm_bo_populate(&bo->ttm, ctx);
>> } else if (should_migrate_to_smem(bo)) {
>> - xe_assert(xe_bo_device(bo), bo->flags & XE_BO_FLAG_SYSTEM);
>> + xe_assert(xe, bo->flags & XE_BO_FLAG_SYSTEM);
>> err = xe_bo_migrate(bo, XE_PL_TT, ctx, exec);
>> }
>>
>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>> index 8f3ef836541e..2fe643c179fe 100644
>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>> @@ -205,6 +205,8 @@ struct xe_device {
>> u8 has_usm:1;
>> /** @info.has_64bit_timestamp: Device supports 64-bit timestamps */
>> u8 has_64bit_timestamp:1;
>> + /** @info.has_system_atomics_on_vram: Supports CPU atomics on VRAM */
>> + u8 has_system_atomics_on_vram:1;
>> /** @info.is_dgfx: is discrete device */
>> u8 is_dgfx:1;
>> /** @info.needs_scratch: needs scratch page for oob prefetch to work */
>> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>> index 56a768f2cfca..f2d86e08e190 100644
>> --- a/drivers/gpu/drm/xe/xe_pci.c
>> +++ b/drivers/gpu/drm/xe/xe_pci.c
>> @@ -925,8 +925,11 @@ static int xe_info_init(struct xe_device *xe,
>>
>> xe->info.has_asid = graphics_desc->has_asid;
>> xe->info.has_atomic_enable_pte_bit = graphics_desc->has_atomic_enable_pte_bit;
>> - if (xe->info.platform != XE_PVC)
>> + if (xe->info.platform != XE_PVC) {
>> xe->info.has_device_atomics_on_smem = 1;
>> + /* PVC doesn't support CPU atomics on VRAM, Xe2+ platforms do */
>> + xe->info.has_system_atomics_on_vram = 1;
>> + }
>>
>> xe->info.has_range_tlb_inval = graphics_desc->has_range_tlb_inval;
>> xe->info.has_ctx_tlb_inval = graphics_desc->has_ctx_tlb_inval;
>> --
>> 2.43.0
>>
Thread overview: 5+ messages
2026-02-24 9:27 [PATCH] drm/xe: Skip BO migration when platform supports CPU atomics on VRAM Varun Gupta
2026-02-24 9:35 ` ✓ CI.KUnit: success for " Patchwork
2026-02-24 17:03 ` [PATCH] " Matthew Brost
2026-02-25 6:17 ` Gupta, Varun [this message]
2026-02-25 6:58 ` Matthew Brost