Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Auld <matthew.auld@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>,
	"Souza, Jose" <jose.souza@intel.com>
Cc: "Upadhyay, Tejas" <tejas.upadhyay@intel.com>,
	"Mrozek, Michal" <michal.mrozek@intel.com>,
	"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
	"Brost, Matthew" <matthew.brost@intel.com>,
	"thomas.hellstrom@linux.intel.com"
	<thomas.hellstrom@linux.intel.com>
Subject: Re: [PATCH 1/3] drm/xe/xe3p_lpg: flush userptr/shrinker bo cachelines manually
Date: Fri, 13 Feb 2026 17:31:48 +0000	[thread overview]
Message-ID: <a62d59e4-c77d-4e28-8227-be253733dd7b@intel.com> (raw)
In-Reply-To: <20260213171638.GC52346@mdroper-desk1.amr.corp.intel.com>

On 13/02/2026 17:16, Matt Roper wrote:
> On Fri, Feb 13, 2026 at 04:48:39PM +0000, Souza, Jose wrote:
>> On Fri, 2026-02-13 at 16:23 +0000, Upadhyay, Tejas wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Roper, Matthew D <matthew.d.roper@intel.com>
>>>> Sent: 12 February 2026 02:41
>>>> To: Upadhyay, Tejas <tejas.upadhyay@intel.com>
>>>> Cc: Brost, Matthew <matthew.brost@intel.com>; intel-
>>>> xe@lists.freedesktop.org; Auld, Matthew <matthew.auld@intel.com>;
>>>> thomas.hellstrom@linux.intel.com
>>>> Subject: Re: [PATCH 1/3] drm/xe/xe3p_lpg: flush userptr/shrinker bo
>>>> cachelines manually
>>>>
>>>> On Wed, Feb 11, 2026 at 07:06:05PM +0000, Upadhyay, Tejas wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Brost, Matthew <matthew.brost@intel.com>
>>>>>> Sent: 11 February 2026 05:32
>>>>>> To: Roper, Matthew D <matthew.d.roper@intel.com>
>>>>>> Cc: Upadhyay, Tejas <tejas.upadhyay@intel.com>; intel-
>>>>>> xe@lists.freedesktop.org; Auld, Matthew
>>>>>> <matthew.auld@intel.com>;
>>>>>> thomas.hellstrom@linux.intel.com
>>>>>> Subject: Re: [PATCH 1/3] drm/xe/xe3p_lpg: flush
>>>>>> userptr/shrinker bo
>>>>>> cachelines manually
>>>>>>
>>>>>> On Tue, Feb 10, 2026 at 01:05:25PM -0800, Matt Roper wrote:
>>>>>>> On Tue, Feb 10, 2026 at 06:21:22PM +0530, Tejas Upadhyay
>>>>>>> wrote:
>>>>>>>> "eXtended Architecture" (XA) tagged memory—memory shared
>>>> between
>>>>>> the
>>>>>>>> CPU and GPU
>>>>>>>
>>>>>>> I'm pretty sure this expansion of "XA" is wrong; where are
>>>>>>> you
>>>>>>> seeing this definition?  Everything in the bspec indicates
>>>>>>> that XA
>>>>>>> means "wb
>>>>>>> - transient app" (similar to how "XD" is 'wb - transient
>>>>>>> display").
>>>>>>> I'm not sure why exactly they picked "X" to refer to
>>>>>>> transient in
>>>>>>> both of these cases, but I've never seen any documentation
>>>>>>> that
>>>>>>> refers to it as "extended."
>>>>>>>
>>>>>>>> is treated differently from other GPU memory when the Media
>>>>>>>> engine is
>>>>>> power-gated.
>>>>>>>>
>>>>>>>> XA is *always* flushed, like at the end-of-submssion (and
>>>>>>>> maybe
>>>>>>>> other
>>>>>>>
>>>>>>> I assume you're referring to the fact that the driver
>>>>>>> performs
>>>>>>> flushes at the end of submission (via PIPE_CONTROL or
>>>>>>> MI_FLUSH_DW), and that depending on other state/optimizations
>>>>>>> in
>>>>>>> the system, those flushes may flush the entire device cache,
>>>>>>> or
>>>>>>> may only flush the subset of cache data that is not marked as
>>>>>>> transient.  The way you worded this was confusing since it
>>>>>>> makes
>>>>>>> it sound like cache flushes happen automatically somewhere in
>>>> hardware/firmware.
>>>>>>>
>>>>>>>> places), just that internally as an optimisation hw doesn't
>>>>>>>> need
>>>>>>>> to make that a full flush (which will also include XA) when
>>>>>>>> Media is off/powergated, since it doesn't need to worry
>>>>>>>> about GT
>>>>>>>> caches vs Media coherency, and only CPU vs GPU coherency,
>>>>>>>> so can
>>>>>>>> make that flush a targeted XA flush, since stuff tagged
>>>>>>>> with XA
>>>>>>>> now means it's shared with the CPU. The main implication is
>>>>>>>> that
>>>>>>>> we now need to somehow flush non-XA before freeing system
>>>>>>>> memory
>>>>>>>> pages, otherwise dirty cachelines could be flushed after
>>>>>>>> the
>>>>>>>> free (like if Media suddenly turns on and does a full
>>>>>>>> flush)
>>>>>>>
>>>>>>> This description seems really confusing.  My understanding is
>>>>>>> that
>>>>>>> marking something as wb-transient-app indicates that it might
>>>>>>> be
>>>>>>> accessed by something other than our graphics/media IP (i.e.,
>>>>>>> accessed from the CPU, exported to another device, etc.), so
>>>>>>> transient data truly does need to be flushed at the points in
>>>>>>> the
>>>>>>> driver where a flush typically happens.
>>>>>>>
>>>>>>> However when something is _not_ transient, then either:
>>>>>>>   - it's "private" to the GPU and only our graphics/media IP
>>>>>>> will be
>>>>>>>     accessing it
>>>>>>>   - it's bound with a coherent PAT index so that outside
>>>>>>> observers like
>>>>>>>     the CPU can snoop the device cache, even when the cache
>>>>>>> hasn't been
>>>>>>>     flushed
>>>>>>>
>>>>>>> If media is not active, then there's really no need to
>>>>>>> include
>>>>>>> non-transient data when an device cache flush happens since
>>>>>>> there's no real need for the data to get to RAM.  So that
>>>>>>> enables
>>>>>>> an optimization (which comes in your next patch), that allows
>>>>>>> flushes to only operate on the subset of the device cache
>>>>>>> tagged as
>>>> "transient" if media is idle.
>>>>>
>>>>> But what If we have stale non-XA marked pages for userptr, and
>>>>> that
>>>>> object moves out and at the same time media comes back, will end
>>>>> up in
>>>>> full flush and flush the stale entry to RAM.
>>>>
>>>> What makes userptr special here?  During general, active usage,
>>>> userptr would
>>>> be data that's accessible by the CPU, so it needs to either be
>>>> transient (so CPU
>>>> can see the data in RAM after explicit flushes) or it needs to be
>>>> using a
>>>> coherent PAT (so that the CPU can just snoop the GPU cache).  If
>>>> you marked
>>>> userptr as both non-XA and non-coherent, then that sounds likely to
>>>> be a
>>>> userspace bug (and probably something we can catch and reject as an
>>>> invalid
>>>> case on any Xe3p or later platforms that support this) since the
>>>> CPU wouldn't
>>>> have any reliable way of seeing GPU updates.
>>>
>>> Right. FYI @Mrozek, Michal @Souza, Jose
>>> For userptr, as explained above, it needs to be either coherent or XA
>>> pat index, or else KMD will reject as invalid case.
>>>
>>>
>> The coherency restriction is already in the uAPI:
>>
>> "Note: For userptr and externally imported dma-buf the kernel expects
>> either 1WAY or 2WAY for the @pat_index."
>>
>> Using 1 way is enough as Xe KMD does a PIPE_CONTROL flushing GPU caches
>> at the end of batch buffers.
> 
> But isn't that what we're discussing here?  1-way *won't* necessarily be
> enough anymore because PIPE_CONTROL instructions don't flush the entire
> cache anymore.  Whenever the GuC determines that media is inactive and
> activates the optimization, PIPE_CONTROL, MI_FLUSH_DW, etc. change
> behavior to only flush out the subset of data that was marked as
> app-transient; anything not marked that way doesn't get flushed now.  So
> there's a new requirement here that you ensure you're using an XA PAT
> index, or you switch to use 2-way coherency which will allow the CPU to
> snoop the GPU's caches.

That exactly matches my understanding also.

> 
> 
> Matt
> 
>>
>>>
>>>>
>>>> If something happens that changes the GTT mapping of an object,
>>>> then
>>>> doesn't that already trigger a TLB invalidation when necessary in
>>>> the driver
>>>> today?  It was my understanding that "heavy" TLB invalidations wait
>>>> for data
>>>> values to be globally observable before starting, so I think that
>>>> would ensure
>>>> that any non-XA data makes it to RAM before any binding changes,
>>>> object,
>>>> destruction, etc.?  Is there something special about userptr that
>>>> makes that
>>>> case more of a problem?
>>>>
>>>> I just found bspec page 74635 which gives an overview of the
>>>> various flush
>>>> and invalidate cases, and I don't see anything there that makes it
>>>> obvious to
>>>> me that userptr would be special.
>>>>
>>>>
>>>>>
>>>>>>>
>>>>>>> As you said, we eventually do want to force a flush of the
>>>>>>> non-transient data as well once we're freeing the underlying
>>>>>>> pages.
>>>>>>> So how do we do that?  It's not clear to me how the changes
>>>>>>> below
>>>>>>> are accomplishing that.  Is there a way to explicitly request
>>>>>>> a
>>>>>>> full device cache flush (ignoring the transient vs non-
>>>>>>> transient tagging)?
>>>>>>> Since the GuC handles the optimization in the next patch
>>>>>>> (toggling
>>>>>>> whether flushes are full flushes vs non-transient flushes
>>>>>>> depending on whether media is active), I thought there might
>>>>>>> be
>>>>>>> some kind of GuC interface to request "please do one full
>>>>>>> flush now, even
>>>> if media is idle."
>>>>>>>
>>>>>>
>>>>>> I’m not an expert here by any means, but everything above from
>>>>>> Matt
>>>>>> seems like valid concerns. Thomas also raised some concerns in
>>>>>> the
>>>>>> two previous revisions; again I’m not an expert, but reading
>>>>>> through
>>>>>> those, it doesn’t really seem like he received proper answers
>>>>>> to his
>>>> questions.
>>>>>
>>>>> Its forcing flush via tlb invalidation PPC flag under
>>>>> xe_invalidate_vma( ).
>>>>
>>>> By the way, what is "PPC?"  It seems like it's another new synonym
>>>> for the
>>>> device cache?  It's already really confusing that some of our
>>>> hardware docs use
>>>> a mix of both "L2" and "L3" to refer to the same device cache for
>>>> historical
>>>> reasons...
>>>>
>>>>
>>>> Matt
>>>>
>>>>>
>>>>>>
>>>>>> A couple of comments below.
>>>>>>
>>>>>>>
>>>>>>> Matt
>>>>>>>
>>>>>>>>
>>>>>>>> V2(MattA): Expand commit description
>>>>>>>>
>>>>>>>> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
>>>>>>>> ---
>>>>>>>>   drivers/gpu/drm/xe/xe_bo.c      |  3 ++-
>>>>>>>>   drivers/gpu/drm/xe/xe_device.c  | 23
>>>>>>>> +++++++++++++++++++++++
>>>>>>>> drivers/gpu/drm/xe/xe_device.h  |  1 +
>>>>>>>> drivers/gpu/drm/xe/xe_userptr.c |  3 ++-
>>>>>>>>   4 files changed, 28 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_bo.c
>>>>>>>> b/drivers/gpu/drm/xe/xe_bo.c index
>>>>>>>> e9180b01a4e4..4455886b211e
>>>>>>>> 100644
>>>>>>>> --- a/drivers/gpu/drm/xe/xe_bo.c
>>>>>>>> +++ b/drivers/gpu/drm/xe/xe_bo.c
>>>>>>>> @@ -689,7 +689,8 @@ static int xe_bo_trigger_rebind(struct
>>>>>>>> xe_device *xe, struct xe_bo *bo,
>>>>>>>>
>>>>>>>>   		if (!xe_vm_in_fault_mode(vm)) {
>>>>>>>>   			drm_gpuvm_bo_evict(vm_bo, true);
>>>>>>>> -			continue;
>>>>>>>> +			if
>>>>>>>> (!xe_device_needs_cache_flush(xe))
>>>>>>>> +				continue;
>>>
>>> Matt R,
>>> This flush will be still needed as there can be non-xa buffers which
>>> can be evicted while media was off and stale entries can be flushed
>>> when media comes back on. Which was not case earlier as full flush
>>> was happening at regular sync points and that’s where this feature is
>>> bringing optimization now.
>>>
>>> Tejas
>>>
>>>>>>
>>>>>> This will trigger a TLB invalidation (and I assume a cache
>>>>>> flush)
>>>>>> every time we move or free memory in the 3D stack if it has a
>>>>>> binding. It also performs a synchronous wait on the BO being
>>>>>> idle.
>>>>>> Both of these are very expensive operations. I can’t imagine
>>>>>> the
>>>>>> granularity we want here is to do this on every move/free with
>>>>>> bindings.
>>>>>>
>>>>>> Also, for LR compute with preempt fences, we would trigger the
>>>>>> preempt fences during the wait, so a TLB invalidation after
>>>>>> this
>>>>>> seems unnecessary, though perhaps the cache flush is still
>>>>>> required?
>>>>>>
>>>>>> I think this needs a bit more explanation, because without
>>>>>> knowing a
>>>>>> lot about the exact requirements, the implementation does not
>>>>>> look
>>>> correct.
>>>>>
>>>>> The thing is that we are trying to solve problem with userptr
>>>>> with non-XA
>>>> pat, consider if that BO got moved while media is not active. As
>>>> soon as media
>>>> will come back active, stale cached entries of that object will be
>>>> flushed as part
>>>> of full flush , which may corrupt things.
>>>>> There was thinking that with this patch we would at least solve
>>>>> the problem
>>>> of corruption and later when page_reclamation feature comes in will
>>>> help in
>>>> performance as well. But now when page reclamation feature is
>>>> merged earlier
>>>> and it tightly coupled with bind/unbind some cases like discussed
>>>> above
>>>> (which are not doing unbind immediately on move/free) are missed in
>>>> reclamation.
>>>>>
>>>>> So thought was to let this solution go in with little perf hit
>>>>> and discuss with
>>>> page reclamation owner to come with cleaner solution together.
>>>>>
>>>>> Tejas
>>>>>>
>>>>>>>>   		}
>>>>>>>>
>>>>>>>>   		if (!idle) {
>>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_device.c
>>>>>>>> b/drivers/gpu/drm/xe/xe_device.c index
>>>>>>>> 743c18e0c580..da2abed94bc0
>>>>>>>> 100644
>>>>>>>> --- a/drivers/gpu/drm/xe/xe_device.c
>>>>>>>> +++ b/drivers/gpu/drm/xe/xe_device.c
>>>>>>>> @@ -1097,6 +1097,29 @@ static void tdf_request_sync(struct
>>>>>>>> xe_device
>>>>>> *xe)
>>>>>>>>   	}
>>>>>>>>   }
>>>>>>>>
>>>>>>>> +/**
>>>>>>>> + * xe_device_needs_cache_flush - Whether the cache needs
>>>>>>>> to be
>>>>>>>> +flushed
>>>>>>>> + * @xe: The device to check.
>>>>>>>> + *
>>>>>>>> + * Return: true if the device needs cache flush, false
>>>>>>>> otherwise.
>>>>>>>> + */
>>>>>>>> +bool xe_device_needs_cache_flush(struct xe_device *xe) {
>>>>>>>> +	/* XA is *always* flushed, like at the end-of-
>>>>>>>> submssion (and
>>>>>>>> +maybe
>>>>>> other
>>>>>>>> +	 * places), just that internally as an
>>>>>>>> optimisation hw doesn't
>>>>>>>> +need to
>>>>>> make
>>>>>>>> +	 * that a full flush (which will also include XA)
>>>>>>>> when Media is
>>>>>>>> +	 * off/powergated, since it doesn't need to worry
>>>>>>>> about GT
>>>>>>>> +caches vs
>>>>>> Media
>>>>>>>> +	 * coherency, and only CPU vs GPU coherency, so
>>>>>>>> can make
>>>> that
>>>>>>>> +flush
>>>>>> a
>>>>>>>> +	 * targeted XA flush, since stuff tagged with XA
>>>>>>>> now means
>>>>>>>> +it's shared
>>>>>> with
>>>>>>>> +	 * the CPU. The main implication is that we now
>>>>>>>> need to
>>>>>>>> +somehow
>>>>>> flush non-XA before
>>>>>>>> +	 * freeing system memory pages, otherwise dirty
>>>>>>>> cachelines
>>>>>>>> +could be
>>>>>> flushed after the free
>>>>>>>> +	 * (like if Media suddenly turns on and does a
>>>>>>>> full flush)
>>>>>>>> +	 */
>>>>>>>> +	if (GRAPHICS_VER(xe) >= 35 && !IS_DGFX(xe))
>>>>>>>> +		return true;
>>>>>>>> +	return false;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>   void xe_device_l2_flush(struct xe_device *xe)  {
>>>>>>>>   	struct xe_gt *gt;
>>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_device.h
>>>>>>>> b/drivers/gpu/drm/xe/xe_device.h index
>>>>>>>> 39464650533b..baf386e0e037
>>>>>>>> 100644
>>>>>>>> --- a/drivers/gpu/drm/xe/xe_device.h
>>>>>>>> +++ b/drivers/gpu/drm/xe/xe_device.h
>>>>>>>> @@ -184,6 +184,7 @@ void xe_device_snapshot_print(struct
>>>>>>>> xe_device *xe, struct drm_printer *p);
>>>>>>>>   u64 xe_device_canonicalize_addr(struct xe_device *xe, u64
>>>>>>>> address);
>>>>>>>>   u64 xe_device_uncanonicalize_addr(struct xe_device *xe,
>>>>>>>> u64
>>>>>>>> address);
>>>>>>>>
>>>>>>>> +bool xe_device_needs_cache_flush(struct xe_device *xe);
>>>>>>>>   void xe_device_td_flush(struct xe_device *xe);  void
>>>>>>>> xe_device_l2_flush(struct xe_device *xe);
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_userptr.c
>>>>>>>> b/drivers/gpu/drm/xe/xe_userptr.c index
>>>>>>>> e120323c43bc..b435ea7f9b66
>>>>>>>> 100644
>>>>>>>> --- a/drivers/gpu/drm/xe/xe_userptr.c
>>>>>>>> +++ b/drivers/gpu/drm/xe/xe_userptr.c
>>>>>>>> @@ -114,7 +114,8 @@ static void
>>>>>>>> __vma_userptr_invalidate(struct
>>>>>>>> xe_vm
>>>>>> *vm, struct xe_userptr_vma *uv
>>>>>>>>   				    false,
>>>>>>>> MAX_SCHEDULE_TIMEOUT);
>>>>>>>>   	XE_WARN_ON(err <= 0);
>>>>>>>>
>>>>>>>> -	if (xe_vm_in_fault_mode(vm) && userptr-
>>>>>>>>> initial_bind) {
>>>>>>>> +	if ((xe_vm_in_fault_mode(vm) ||
>>>>>>>> +xe_device_needs_cache_flush(vm-
>>>>>>> xe)) &&
>>>>>>>> +	    userptr->initial_bind) {
>>>>>>
>>>>>> Same concern with the LR preempt fence as above — the hardware
>>>>>> will
>>>>>> be interrupted via preempt fences, so it doesn’t seem necessary
>>>>>> to
>>>>>> invalidate the TLBs but perhaps we need a cflush and TLB
>>>>>> invalidation is the mechanism for that too?
>>>>>>
>>>>>> Matt
>>>>>>
>>>>>>>>   		err = xe_vm_invalidate_vma(vma);
>>>>>>>>   		XE_WARN_ON(err);
>>>>>>>>   	}
>>>>>>>> --
>>>>>>>> 2.52.0
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Matt Roper
>>>>>>> Graphics Software Engineer
>>>>>>> Linux GPU Platform Enablement
>>>>>>> Intel Corporation
>>>>
>>>> --
>>>> Matt Roper
>>>> Graphics Software Engineer
>>>> Linux GPU Platform Enablement
>>>> Intel Corporation
> 


  parent reply	other threads:[~2026-02-13 17:31 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-10 12:51 [PATCH 0/3] drm/xe/xe3p_lpg: L2 flush optimization Tejas Upadhyay
2026-02-10 12:51 ` [PATCH 1/3] drm/xe/xe3p_lpg: flush userptr/shrinker bo cachelines manually Tejas Upadhyay
2026-02-10 21:05   ` Matt Roper
2026-02-11  0:02     ` Matthew Brost
2026-02-11 19:06       ` Upadhyay, Tejas
2026-02-11 21:11         ` Matt Roper
2026-02-12  9:53           ` Matthew Auld
2026-02-13 11:17             ` Upadhyay, Tejas
2026-02-13 13:27               ` Matthew Auld
2026-02-13 13:30                 ` Souza, Jose
2026-02-13 16:23           ` Upadhyay, Tejas
2026-02-13 16:48             ` Souza, Jose
2026-02-13 17:16               ` Matt Roper
2026-02-13 17:31                 ` Souza, Jose
2026-02-13 17:31                 ` Matthew Auld [this message]
2026-02-16 10:23                   ` Thomas Hellström
2026-02-16 10:58                     ` Matthew Auld
2026-02-16 12:07                       ` Thomas Hellström
2026-02-16 14:55                         ` Matthew Auld
2026-02-16 15:38                           ` Thomas Hellström
2026-02-16 16:41                             ` Matthew Auld
2026-02-17  6:19                               ` Upadhyay, Tejas
2026-02-17  9:53                                 ` Thomas Hellström
2026-02-17 17:04                               ` Thomas Hellström
2026-02-17 18:41                                 ` Matthew Auld
2026-02-16 10:56             ` Thomas Hellström
2026-02-16 11:26               ` Upadhyay, Tejas
2026-02-13 17:29           ` Matthew Auld
2026-02-10 12:51 ` [PATCH 2/3] drm/xe/xe3p_lpg: Enable L2 flush optimization feature Tejas Upadhyay
2026-02-10 12:51 ` [PATCH 3/3] drm/xe/xe3p: Skip TD flush Tejas Upadhyay
2026-02-10 21:22   ` Matt Roper
2026-02-13 11:06     ` Upadhyay, Tejas
2026-02-10 13:35 ` ✗ CI.checkpatch: warning for drm/xe/xe3p_lpg: L2 flush optimization (rev2) Patchwork
2026-02-10 13:36 ` ✓ CI.KUnit: success " Patchwork
2026-02-10 14:33 ` ✗ Xe.CI.BAT: failure " Patchwork
2026-02-10 18:06 ` ✗ Xe.CI.FULL: " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2025-11-25  9:43 [PATCH 0/3] drm/xe/xe3p_lpg: L2 flush optimization Tejas Upadhyay
2025-11-25  9:43 ` [PATCH 1/3] drm/xe/xe3p_lpg: flush userptr/shrinker bo cachelines manually Tejas Upadhyay
2025-11-25 10:17   ` Matthew Auld
2025-11-25 13:39     ` Souza, Jose
2025-11-25 15:06   ` Thomas Hellström
2025-11-25 15:31     ` Upadhyay, Tejas
2025-11-26 10:26       ` Thomas Hellström

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a62d59e4-c77d-4e28-8227-be253733dd7b@intel.com \
    --to=matthew.auld@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=jose.souza@intel.com \
    --cc=matthew.brost@intel.com \
    --cc=matthew.d.roper@intel.com \
    --cc=michal.mrozek@intel.com \
    --cc=tejas.upadhyay@intel.com \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox