Intel-XE Archive on lore.kernel.org
From: Matthew Auld <matthew.auld@intel.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Tejas Upadhyay" <tejas.upadhyay@intel.com>,
	intel-xe@lists.freedesktop.org
Subject: Re: [PATCH V3 3/4] drm/xe/xe3p_lpg: Enable L2 flush optimization feature
Date: Fri, 20 Feb 2026 15:10:46 +0000	[thread overview]
Message-ID: <b8f03ad0-8b2b-4e8a-aab8-565792d23475@intel.com> (raw)
In-Reply-To: <6e37cefc514a1a8f2ade91602441a580c96797cd.camel@linux.intel.com>

On 20/02/2026 12:58, Thomas Hellström wrote:
> On Fri, 2026-02-20 at 12:06 +0000, Matthew Auld wrote:
>> On 20/02/2026 11:50, Thomas Hellström wrote:
>>> On Fri, 2026-02-20 at 11:46 +0000, Matthew Auld wrote:
>>>> On 20/02/2026 10:16, Tejas Upadhyay wrote:
>>>>> When set, the L2 flush optimization feature will control
>>>>> whether L2 is in Persistent or Transient mode through
>>>>> monitoring of media activity.
>>>>>
>>>>> To enable the L2 flush optimization, include the new feature flag
>>>>> GUC_CTL_ENABLE_L2FLUSH_OPT for Novalake platforms when the GT is
>>>>> detected as media type.
>>>>>
>>>>> Also, restrict userptr, svm and dmabuf mappings to be
>>>>> either 2WAY or XA+1WAY.
>>>>>
>>>>> V2(MattA): validate dma-buf bos and madvise pat-index
>>>
>>> Question: Assuming that on *faulting* VMs we always perform a TLB
>>> flush on unbind, can we eliminate the PAT restrictions on those? That
>>> would actually then include all SVM maps.
>>
>> With unbind do you mean notifier invalidation flow on svm side? Or
>> actual vma unbind?
> 
> Both actually, or to rephrase, before we release any memory back to
> system we have removed its GPU mappings and flushed TLB?

Hmm, yeah, I think that must be true. We should already have the 
necessary TLB invalidations in all the right places for faulting VMs to 
be well behaved today. So if memory is going to be released then the 
mapping must already have been invalidated, and the invalidation will 
already take care to do a PPC flush. So from a security angle there 
shouldn't be gaps when binding into a faulting VM, I think.

So with the faulting VM stuff this can go with LR, right, but where we 
don't have to forcefully pre-empt it? So the workload is still running 
but happy to trigger a fault, if needed. Are we assuming that zapping 
the PTE + TLB flush ensures that the GPU is not still accessing the 
physical memory that the PTE was pointing at, like say it has already 
fetched that memory and is still in the middle of some op? Or does it 
not work like this at all?

> 
> /Thomas
> 
> 
> 
>>
>>>
>>> /Thomas
>>>    
>>>
>>>>>
>>>>> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
>>>>> ---
>>>>>     drivers/gpu/drm/xe/xe_guc.c        |  3 +++
>>>>>     drivers/gpu/drm/xe/xe_guc_fwif.h   |  1 +
>>>>>     drivers/gpu/drm/xe/xe_vm.c         |  9 +++++++++
>>>>>     drivers/gpu/drm/xe/xe_vm_madvise.c | 18 ++++++++++++++++++
>>>>>     4 files changed, 31 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
>>>>> index cbbb4d665b8f..97c33c3dd520 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_guc.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_guc.c
>>>>> @@ -98,6 +98,9 @@ static u32 guc_ctl_feature_flags(struct xe_guc *guc)
>>>>>  	if (xe_guc_using_main_gamctrl_queues(guc))
>>>>>  		flags |= GUC_CTL_MAIN_GAMCTRL_QUEUES;
>>>>>
>>>>> +	if (GRAPHICS_VER(xe) >= 35 && !IS_DGFX(xe) && xe_gt_is_media_type(guc_to_gt(guc)))
>>>>> +		flags |= GUC_CTL_ENABLE_L2FLUSH_OPT;
>>>>> +
>>>>>  	return flags;
>>>>>  }
>>>>>     
>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
>>>>> index a33ea288b907..39ff7b3e960b 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
>>>>> +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
>>>>> @@ -67,6 +67,7 @@ struct guc_update_exec_queue_policy {
>>>>>     #define   GUC_CTL_ENABLE_PSMI_LOGGING	BIT(7)
>>>>>     #define   GUC_CTL_MAIN_GAMCTRL_QUEUES	BIT(9)
>>>>>     #define   GUC_CTL_DISABLE_SCHEDULER	BIT(14)
>>>>> +#define   GUC_CTL_ENABLE_L2FLUSH_OPT	BIT(15)
>>>>>     
>>>>>     #define GUC_CTL_DEBUG			3
>>>>>     #define   GUC_LOG_VERBOSITY		REG_GENMASK(1, 0)
>>>>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>>>>> index c06fd250e037..e2e4c9648d05 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_vm.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>>>>> @@ -3474,6 +3474,11 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe, struct xe_vm *vm,
>>>>>  				 op == DRM_XE_VM_BIND_OP_MAP_USERPTR) ||
>>>>>  		    XE_IOCTL_DBG(xe, coh_mode == XE_COH_NONE &&
>>>>>  				 op == DRM_XE_VM_BIND_OP_MAP_USERPTR) ||
>>>>> +		    XE_IOCTL_DBG(xe, xe_device_is_l2_flush_optimized(xe) &&
>>>>> +				 (op == DRM_XE_VM_BIND_OP_MAP_USERPTR ||
>>>>> +				  /* svm */
>>>>> +				  op == (DRM_XE_VM_BIND_OP_MAP && is_cpu_addr_mirror)) &&
>>>>
>>>> op == (DRM_XE_VM_BIND_OP_MAP && is_cpu_addr_mirror)
>>>>
>>>> I think you meant (op == DRM_XE_VM_BIND_OP_MAP) && is_cpu ?
>>>>
>>>> But maybe just drop the op check. Keeping the check consistent for
>>>> bind/unbind matches existing uapi behaviour?
>>>>
>>>>> +				 (pat_index != 19 || coh_mode != XE_COH_2WAY)) ||
>>>>>  		    XE_IOCTL_DBG(xe, comp_en &&
>>>>>  				 op == DRM_XE_VM_BIND_OP_MAP_USERPTR) ||
>>>>>  		    XE_IOCTL_DBG(xe, op == DRM_XE_VM_BIND_OP_MAP_USERPTR &&
>>>>> @@ -3608,6 +3613,10 @@ static int xe_vm_bind_ioctl_validate_bo(struct xe_device *xe, struct xe_bo *bo,
>>>>>  	if (XE_IOCTL_DBG(xe, bo->ttm.base.import_attach && comp_en))
>>>>>  		return -EINVAL;
>>>>>
>>>>> +	if (XE_IOCTL_DBG(xe, bo->ttm.base.import_attach && xe_device_is_l2_flush_optimized(xe) &&
>>>>> +			 (pat_index != 19 || coh_mode != XE_COH_2WAY)))
>>>>> +		return -EINVAL;
>>>>> +
>>>>>  	/* If a BO is protected it can only be mapped if the key is still valid */
>>>>>  	if ((bind_flags & DRM_XE_VM_BIND_FLAG_CHECK_PXP) && xe_bo_is_protected(bo) &&
>>>>>  	    op != DRM_XE_VM_BIND_OP_UNMAP && op != DRM_XE_VM_BIND_OP_UNMAP_ALL)
>>>>> diff --git a/drivers/gpu/drm/xe/xe_vm_madvise.c b/drivers/gpu/drm/xe/xe_vm_madvise.c
>>>>> index 1a1ad8c07d49..2a35dbeba2d8 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_vm_madvise.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_vm_madvise.c
>>>>> @@ -411,6 +411,7 @@ int xe_vm_madvise_ioctl(struct drm_device *dev, void *data, struct drm_file *fil
>>>>>  	struct xe_vmas_in_madvise_range madvise_range = {.addr = args->start,
>>>>>  							 .range = args->range, };
>>>>>  	struct xe_madvise_details details;
>>>>> +	u16 pat_index, coh_mode;
>>>>>  	struct xe_vm *vm;
>>>>>  	struct drm_exec exec;
>>>>>  	int err, attr_type;
>>>>> @@ -447,6 +448,15 @@ int xe_vm_madvise_ioctl(struct drm_device *dev, void *data, struct drm_file *fil
>>>>>  	if (err || !madvise_range.num_vmas)
>>>>>  		goto madv_fini;
>>>>>
>>>>> +	pat_index = array_index_nospec(args->pat_index.val, xe->pat.n_entries);
>>>>
>>>> This needs to be conditional on DRM_XE_MEM_RANGE_ATTR_PAT. This is a
>>>> union underneath, so pat_index.val is not actually a pat_index for the
>>>> other madv types, but just some other random data.
>>>>
>>>>> +	coh_mode = xe_pat_index_get_coh_mode(xe, pat_index);
>>>>> +	if (XE_IOCTL_DBG(xe, madvise_range.has_svm_userptr_vmas &&
>>>>> +			 xe_device_is_l2_flush_optimized(xe) &&
>>>>> +			 (pat_index != 19 || coh_mode != XE_COH_2WAY))) {
>>>>> +		err = -EINVAL;
>>>>> +		goto madv_fini;
>>>>> +	}
>>>>> +
>>>>>  	if (madvise_range.has_bo_vmas) {
>>>>>  		if (args->type == DRM_XE_MEM_RANGE_ATTR_ATOMIC) {
>>>>>  			if (!check_bo_args_are_sane(vm, madvise_range.vmas,
>>>>> @@ -464,6 +474,14 @@ int xe_vm_madvise_ioctl(struct drm_device *dev, void *data, struct drm_file *fil
>>>>>
>>>>>  				if (!bo)
>>>>>  					continue;
>>>>> +
>>>>> +				if (XE_IOCTL_DBG(xe, bo->ttm.base.import_attach &&
>>>>> +						 xe_device_is_l2_flush_optimized(xe) &&
>>>>> +						 (pat_index != 19 || coh_mode != XE_COH_2WAY))) {
>>>>> +					err = -EINVAL;
>>>>> +					goto err_fini;
>>>>> +				}
>>>>> +
>>>>>  				err = drm_exec_lock_obj(&exec, &bo->ttm.base);
>>>>>  				drm_exec_retry_on_contention(&exec);
>>>>>  				if (err)


Thread overview: 15+ messages
2026-02-20 10:16 [PATCH V3 0/4] drm/xe/xe3p_lpg: L2 flush optimization Tejas Upadhyay
2026-02-20 10:16 ` [PATCH V3 1/4] drm/xe/xe3p_lpg: flush shrinker bo cachelines manually Tejas Upadhyay
2026-02-20 10:16 ` [PATCH V3 2/4] drm/xe/pat: define coh_mode 2way Tejas Upadhyay
2026-02-20 10:25   ` Matthew Auld
2026-02-20 10:16 ` [PATCH V3 3/4] drm/xe/xe3p_lpg: Enable L2 flush optimization feature Tejas Upadhyay
2026-02-20 11:46   ` Matthew Auld
2026-02-20 11:50     ` Thomas Hellström
2026-02-20 12:06       ` Matthew Auld
2026-02-20 12:58         ` Thomas Hellström
2026-02-20 13:11           ` Upadhyay, Tejas
2026-02-20 15:10           ` Matthew Auld [this message]
2026-02-20 10:16 ` [PATCH V3 4/4] drm/xe/xe3p: Skip TD flush Tejas Upadhyay
2026-02-20 10:23 ` ✓ CI.KUnit: success for drm/xe/xe3p_lpg: L2 flush optimization (rev4) Patchwork
2026-02-20 11:02 ` ✓ Xe.CI.BAT: " Patchwork
2026-02-20 22:24 ` ✗ Xe.CI.FULL: failure " Patchwork
