From: "Souza, Jose" <jose.souza@intel.com>
To: "Upadhyay, Tejas" <tejas.upadhyay@intel.com>,
"Roper, Matthew D" <matthew.d.roper@intel.com>,
"Auld, Matthew" <matthew.auld@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
"Brost, Matthew" <matthew.brost@intel.com>,
"thomas.hellstrom@linux.intel.com"
<thomas.hellstrom@linux.intel.com>,
"Mrozek, Michal" <michal.mrozek@intel.com>
Subject: Re: [PATCH 1/3] drm/xe/xe3p_lpg: flush userptr/shrinker bo cachelines manually
Date: Fri, 13 Feb 2026 13:30:33 +0000 [thread overview]
Message-ID: <063e842a088f7498decf71deddda5e4a29e457ca.camel@intel.com> (raw)
In-Reply-To: <13ef6a14-ce90-4bd9-bfb0-abbf182a76a1@intel.com>
On Fri, 2026-02-13 at 13:27 +0000, Matthew Auld wrote:
> On 13/02/2026 11:17, Upadhyay, Tejas wrote:
> > + Michal
> >
> > > -----Original Message-----
> > > From: Auld, Matthew <matthew.auld@intel.com>
> > > Sent: 12 February 2026 15:24
> > > To: Roper, Matthew D <matthew.d.roper@intel.com>; Upadhyay, Tejas
> > > <tejas.upadhyay@intel.com>
> > > Cc: Brost, Matthew <matthew.brost@intel.com>; intel-
> > > xe@lists.freedesktop.org; thomas.hellstrom@linux.intel.com
> > > Subject: Re: [PATCH 1/3] drm/xe/xe3p_lpg: flush userptr/shrinker
> > > bo
> > > cachelines manually
> > >
> > > On 11/02/2026 21:11, Matt Roper wrote:
> > > > On Wed, Feb 11, 2026 at 07:06:05PM +0000, Upadhyay, Tejas
> > > > wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Brost, Matthew <matthew.brost@intel.com>
> > > > > > Sent: 11 February 2026 05:32
> > > > > > To: Roper, Matthew D <matthew.d.roper@intel.com>
> > > > > > Cc: Upadhyay, Tejas <tejas.upadhyay@intel.com>; intel-
> > > > > > xe@lists.freedesktop.org; Auld, Matthew
> > > > > > <matthew.auld@intel.com>;
> > > > > > thomas.hellstrom@linux.intel.com
> > > > > > Subject: Re: [PATCH 1/3] drm/xe/xe3p_lpg: flush
> > > > > > userptr/shrinker bo
> > > > > > cachelines manually
> > > > > >
> > > > > > On Tue, Feb 10, 2026 at 01:05:25PM -0800, Matt Roper wrote:
> > > > > > > On Tue, Feb 10, 2026 at 06:21:22PM +0530, Tejas Upadhyay
> > > > > > > wrote:
> > > > > > > > "eXtended Architecture" (XA) tagged memory—memory
> > > > > > > > shared
> > > between
> > > > > > the
> > > > > > > > CPU and GPU
> > > > > > >
> > > > > > > I'm pretty sure this expansion of "XA" is wrong; where
> > > > > > > are you
> > > > > > > seeing this definition? Everything in the bspec
> > > > > > > indicates that XA
> > > > > > > means "wb
> > > > > > > - transient app" (similar to how "XD" is 'wb - transient
> > > > > > > display").
> > > > > > > I'm not sure why exactly they picked "X" to refer to
> > > > > > > transient in
> > > > > > > both of these cases, but I've never seen any
> > > > > > > documentation that
> > > > > > > refers to it as "extended."
> > > > > > >
> > > > > > > > is treated differently from other GPU memory when the
> > > > > > > > Media engine
> > > > > > > > is
> > > > > > power-gated.
> > > > > > > >
> > > > > > > > XA is *always* flushed, like at the end-of-submssion
> > > > > > > > (and maybe
> > > > > > > > other
> > > > > > >
> > > > > > > I assume you're referring to the fact that the driver
> > > > > > > performs
> > > > > > > flushes at the end of submission (via PIPE_CONTROL or
> > > > > > > MI_FLUSH_DW),
> > > > > > > and that depending on other state/optimizations in the
> > > > > > > system,
> > > > > > > those flushes may flush the entire device cache, or may
> > > > > > > only flush
> > > > > > > the subset of cache data that is not marked as
> > > > > > > transient. The way
> > > > > > > you worded this was confusing since it makes it sound
> > > > > > > like cache
> > > > > > > flushes happen automatically somewhere in
> > > > > > > hardware/firmware.
> > > > > > >
> > > > > > > > places), just that internally as an optimisation hw
> > > > > > > > doesn't need
> > > > > > > > to make that a full flush (which will also include XA)
> > > > > > > > when Media
> > > > > > > > is off/powergated, since it doesn't need to worry about
> > > > > > > > GT caches
> > > > > > > > vs Media coherency, and only CPU vs GPU coherency, so
> > > > > > > > can make
> > > > > > > > that flush a targeted XA flush, since stuff tagged with
> > > > > > > > XA now
> > > > > > > > means it's shared with the CPU. The main implication is
> > > > > > > > that we
> > > > > > > > now need to somehow flush non-XA before freeing system
> > > > > > > > memory
> > > > > > > > pages, otherwise dirty cachelines could be flushed
> > > > > > > > after the free
> > > > > > > > (like if Media suddenly turns on and does a full flush)
> > > > > > >
> > > > > > > This description seems really confusing. My
> > > > > > > understanding is that
> > > > > > > marking something as wb-transient-app indicates that it
> > > > > > > might be
> > > > > > > accessed by something other than our graphics/media IP
> > > > > > > (i.e.,
> > > > > > > accessed from the CPU, exported to another device, etc.),
> > > > > > > so
> > > > > > > transient data truly does need to be flushed at the
> > > > > > > points in the
> > > > > > > driver where a flush typically happens.
> > > > > > >
> > > > > > > However when something is _not_ transient, then either:
> > > > > > > - it's "private" to the GPU and only our
> > > > > > > graphics/media IP will be
> > > > > > > accessing it
> > > > > > > - it's bound with a coherent PAT index so that outside
> > > > > > > observers like
> > > > > > > the CPU can snoop the device cache, even when the
> > > > > > > cache hasn't been
> > > > > > > flushed
> > > > > > >
> > > > > > > If media is not active, then there's really no need to
> > > > > > > include
> > > > > > > non-transient data when an device cache flush happens
> > > > > > > since there's
> > > > > > > no real need for the data to get to RAM. So that enables
> > > > > > > an
> > > > > > > optimization (which comes in your next patch), that
> > > > > > > allows flushes
> > > > > > > to only operate on the subset of the device cache tagged
> > > > > > > as "transient" if
> > > media is idle.
> > > > >
> > > > > But what If we have stale non-XA marked pages for userptr,
> > > > > and that
> > > > > object moves out and at the same time media comes back, will
> > > > > end up
> > > > > in full flush and flush the stale entry to RAM.
> > > >
> > > > What makes userptr special here? During general, active usage,
> > > > userptr would be data that's accessible by the CPU, so it needs
> > > > to
> > > > either be transient (so CPU can see the data in RAM after
> > > > explicit
> > > > flushes) or it needs to be using a coherent PAT (so that the
> > > > CPU can
> > > > just snoop the GPU cache). If you marked userptr as both non-
> > > > XA and
> > > > non-coherent, then that sounds likely to be a userspace bug
> > > > (and
> > > > probably something we can catch and reject as an invalid case
> > > > on any
> > > > Xe3p or later platforms that support this) since the CPU
> > > > wouldn't have
> > > > any reliable way of seeing GPU updates.
> > > >
> > > > If something happens that changes the GTT mapping of an object,
> > > > then
> > > > doesn't that already trigger a TLB invalidation when necessary
> > > > in the
> > > > driver today? It was my understanding that "heavy" TLB
> > > > invalidations
> > > > wait for data values to be globally observable before starting,
> > > > so I
> > > > think that would ensure that any non-XA data makes it to RAM
> > > > before
> > > > any binding changes, object, destruction, etc.? Is there
> > > > something
> > > > special about userptr that makes that case more of a problem?
> > > >
> > > > I just found bspec page 74635 which gives an overview of the
> > > > various
> > > > flush and invalidate cases, and I don't see anything there that
> > > > makes
> > > > it obvious to me that userptr would be special.
> > > >
> > > >
> > > > >
> > > > > > >
> > > > > > > As you said, we eventually do want to force a flush of
> > > > > > > the
> > > > > > > non-transient data as well once we're freeing the
> > > > > > > underlying pages.
> > > > > > > So how do we do that? It's not clear to me how the
> > > > > > > changes below
> > > > > > > are accomplishing that. Is there a way to explicitly
> > > > > > > request a
> > > > > > > full device cache flush (ignoring the transient vs non-
> > > > > > > transient tagging)?
> > > > > > > Since the GuC handles the optimization in the next patch
> > > > > > > (toggling
> > > > > > > whether flushes are full flushes vs non-transient flushes
> > > > > > > depending
> > > > > > > on whether media is active), I thought there might be
> > > > > > > some kind of
> > > > > > > GuC interface to request "please do one full flush now,
> > > > > > > even if media is
> > > idle."
> > > > > > >
> > > > > >
> > > > > > I’m not an expert here by any means, but everything above
> > > > > > from Matt
> > > > > > seems like valid concerns. Thomas also raised some concerns
> > > > > > in the
> > > > > > two previous revisions; again I’m not an expert, but
> > > > > > reading through
> > > > > > those, it doesn’t really seem like he received proper
> > > > > > answers to his
> > > questions.
> > > > >
> > > > > Its forcing flush via tlb invalidation PPC flag under
> > > > > xe_invalidate_vma( ).
> > > >
> > > > By the way, what is "PPC?" It seems like it's another new
> > > > synonym for
> > > > the device cache? It's already really confusing that some of
> > > > our
> > > > hardware docs use a mix of both "L2" and "L3" to refer to the
> > > > same
> > > > device cache for historical reasons...
> > >
> > > Private-physical-cache. It's just what hw side calls the device
> > > side
> > > l2/l3 on newer igpu (I think LNL+). I assume it is quite
> > > different from the
> > > implemetation on dgpu, from HW pov, which is maybe why is has a
> > > special
> > > name. On dgpu they just refer to it as plain l2/l3, not PPC. On
> > > dgpu there is
> > > also additional SMRO (system-memory-read-only) device cache,
> > > which I
> > > assume is caching reads over pci to system memory from GPU side,
> > > but that
> > > gets flushed at the usual places, like end of submission etc.
> > > Setting the PPC bit
> > > on Guc TLB inval will only flush SMRO on dgpu.
> >
> > I had conversation with Michal and also had confirmation from Jose
> > about their usage of pat_index for userptr, just summarising it
> > below :
> >
> > UMD Compute, with userptr :
> > - Uses pat_index 19 (which is App-transient(XA) and 1Way-coh)
> > - Flushing : will be taken care by app-transient ability in HW if
> > media is off, if media is on full flush anyways will happen.
> >
> > Mesa, with userptr:
> > - Use 2way-coh
> > - Flushing : 2 way Coherency will take care
Mesa uses 1 way coherency, 2 way coherencey has a larger impact in iGPU
not backed by LLC.
> >
> > With above understanding it seems we can drop this patch as its not
> > needed, instead we need to have new patch to validate if right pat-
> > index coherency is used and if not used reject the request.
>
> I think that only resolves the userptr side. We still need something
> for
> eviction/shriker path with normal BOs, which this patch was also
> trying
> to address, right?
>
> >
> > Tejas
> > >
> > > >
> > > >
> > > > Matt
> > > >
> > > > >
> > > > > >
> > > > > > A couple of comments below.
> > > > > >
> > > > > > >
> > > > > > > Matt
> > > > > > >
> > > > > > > >
> > > > > > > > V2(MattA): Expand commit description
> > > > > > > >
> > > > > > > > Signed-off-by: Tejas Upadhyay
> > > > > > > > <tejas.upadhyay@intel.com>
> > > > > > > > ---
> > > > > > > > drivers/gpu/drm/xe/xe_bo.c | 3 ++-
> > > > > > > > drivers/gpu/drm/xe/xe_device.c | 23
> > > > > > > > +++++++++++++++++++++++
> > > > > > > > drivers/gpu/drm/xe/xe_device.h | 1 +
> > > > > > > > drivers/gpu/drm/xe/xe_userptr.c | 3 ++-
> > > > > > > > 4 files changed, 28 insertions(+), 2 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_bo.c
> > > > > > > > b/drivers/gpu/drm/xe/xe_bo.c index
> > > > > > > > e9180b01a4e4..4455886b211e
> > > > > > > > 100644
> > > > > > > > --- a/drivers/gpu/drm/xe/xe_bo.c
> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > > > > > > > @@ -689,7 +689,8 @@ static int
> > > > > > > > xe_bo_trigger_rebind(struct
> > > > > > > > xe_device *xe, struct xe_bo *bo,
> > > > > > > >
> > > > > > > > if (!xe_vm_in_fault_mode(vm)) {
> > > > > > > > drm_gpuvm_bo_evict(vm_bo,
> > > > > > > > true);
> > > > > > > > - continue;
> > > > > > > > + if
> > > > > > > > (!xe_device_needs_cache_flush(xe))
> > > > > > > > + continue;
> > > > > >
> > > > > > This will trigger a TLB invalidation (and I assume a cache
> > > > > > flush)
> > > > > > every time we move or free memory in the 3D stack if it has
> > > > > > a
> > > > > > binding. It also performs a synchronous wait on the BO
> > > > > > being idle.
> > > > > > Both of these are very expensive operations. I can’t
> > > > > > imagine the
> > > > > > granularity we want here is to do this on every move/free
> > > > > > with bindings.
> > > > > >
> > > > > > Also, for LR compute with preempt fences, we would trigger
> > > > > > the
> > > > > > preempt fences during the wait, so a TLB invalidation after
> > > > > > this
> > > > > > seems unnecessary, though perhaps the cache flush is still
> > > > > > required?
> > > > > >
> > > > > > I think this needs a bit more explanation, because without
> > > > > > knowing a
> > > > > > lot about the exact requirements, the implementation does
> > > > > > not look
> > > correct.
> > > > >
> > > > > The thing is that we are trying to solve problem with userptr
> > > > > with non-XA
> > > pat, consider if that BO got moved while media is not active. As
> > > soon as media
> > > will come back active, stale cached entries of that object will
> > > be flushed as part
> > > of full flush , which may corrupt things.
> > > > > There was thinking that with this patch we would at least
> > > > > solve the problem
> > > of corruption and later when page_reclamation feature comes in
> > > will help in
> > > performance as well. But now when page reclamation feature is
> > > merged earlier
> > > and it tightly coupled with bind/unbind some cases like discussed
> > > above
> > > (which are not doing unbind immediately on move/free) are missed
> > > in
> > > reclamation.
> > > > >
> > > > > So thought was to let this solution go in with little perf
> > > > > hit and discuss with
> > > page reclamation owner to come with cleaner solution together.
> > > > >
> > > > > Tejas
> > > > > >
> > > > > > > > }
> > > > > > > >
> > > > > > > > if (!idle) {
> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_device.c
> > > > > > > > b/drivers/gpu/drm/xe/xe_device.c index
> > > 743c18e0c580..da2abed94bc0
> > > > > > > > 100644
> > > > > > > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > > > > > > @@ -1097,6 +1097,29 @@ static void
> > > > > > > > tdf_request_sync(struct
> > > > > > > > xe_device
> > > > > > *xe)
> > > > > > > > }
> > > > > > > > }
> > > > > > > >
> > > > > > > > +/**
> > > > > > > > + * xe_device_needs_cache_flush - Whether the cache
> > > > > > > > needs to be
> > > > > > > > +flushed
> > > > > > > > + * @xe: The device to check.
> > > > > > > > + *
> > > > > > > > + * Return: true if the device needs cache flush, false
> > > > > > > > otherwise.
> > > > > > > > + */
> > > > > > > > +bool xe_device_needs_cache_flush(struct xe_device *xe)
> > > > > > > > {
> > > > > > > > + /* XA is *always* flushed, like at the end-of-
> > > > > > > > submssion (and
> > > > > > > > +maybe
> > > > > > other
> > > > > > > > + * places), just that internally as an
> > > > > > > > optimisation hw doesn't
> > > > > > > > +need to
> > > > > > make
> > > > > > > > + * that a full flush (which will also include
> > > > > > > > XA) when Media is
> > > > > > > > + * off/powergated, since it doesn't need to
> > > > > > > > worry about GT
> > > > > > > > +caches vs
> > > > > > Media
> > > > > > > > + * coherency, and only CPU vs GPU coherency,
> > > > > > > > so can make that
> > > > > > > > +flush
> > > > > > a
> > > > > > > > + * targeted XA flush, since stuff tagged with
> > > > > > > > XA now means it's
> > > > > > > > +shared
> > > > > > with
> > > > > > > > + * the CPU. The main implication is that we
> > > > > > > > now need to somehow
> > > > > > flush non-XA before
> > > > > > > > + * freeing system memory pages, otherwise
> > > > > > > > dirty cachelines could
> > > > > > > > +be
> > > > > > flushed after the free
> > > > > > > > + * (like if Media suddenly turns on and does a
> > > > > > > > full flush)
> > > > > > > > + */
> > > > > > > > + if (GRAPHICS_VER(xe) >= 35 && !IS_DGFX(xe))
> > > > > > > > + return true;
> > > > > > > > + return false;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > void xe_device_l2_flush(struct xe_device *xe) {
> > > > > > > > struct xe_gt *gt;
> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_device.h
> > > > > > > > b/drivers/gpu/drm/xe/xe_device.h index
> > > 39464650533b..baf386e0e037
> > > > > > > > 100644
> > > > > > > > --- a/drivers/gpu/drm/xe/xe_device.h
> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_device.h
> > > > > > > > @@ -184,6 +184,7 @@ void
> > > > > > > > xe_device_snapshot_print(struct
> > > xe_device
> > > > > > > > *xe, struct drm_printer *p);
> > > > > > > > u64 xe_device_canonicalize_addr(struct xe_device
> > > > > > > > *xe, u64 address);
> > > > > > > > u64 xe_device_uncanonicalize_addr(struct xe_device
> > > > > > > > *xe, u64
> > > > > > > > address);
> > > > > > > >
> > > > > > > > +bool xe_device_needs_cache_flush(struct xe_device
> > > > > > > > *xe);
> > > > > > > > void xe_device_td_flush(struct xe_device *xe); void
> > > > > > > > xe_device_l2_flush(struct xe_device *xe);
> > > > > > > >
> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_userptr.c
> > > > > > > > b/drivers/gpu/drm/xe/xe_userptr.c index
> > > e120323c43bc..b435ea7f9b66
> > > > > > > > 100644
> > > > > > > > --- a/drivers/gpu/drm/xe/xe_userptr.c
> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_userptr.c
> > > > > > > > @@ -114,7 +114,8 @@ static void
> > > > > > > > __vma_userptr_invalidate(struct
> > > > > > > > xe_vm
> > > > > > *vm, struct xe_userptr_vma *uv
> > > > > > > > false,
> > > > > > > > MAX_SCHEDULE_TIMEOUT);
> > > > > > > > XE_WARN_ON(err <= 0);
> > > > > > > >
> > > > > > > > - if (xe_vm_in_fault_mode(vm) && userptr-
> > > > > > > > >initial_bind) {
> > > > > > > > + if ((xe_vm_in_fault_mode(vm) ||
> > > > > > > > xe_device_needs_cache_flush(vm-
> > > > > > > xe)) &&
> > > > > > > > + userptr->initial_bind) {
> > > > > >
> > > > > > Same concern with the LR preempt fence as above — the
> > > > > > hardware will
> > > > > > be interrupted via preempt fences, so it doesn’t seem
> > > > > > necessary to
> > > > > > invalidate the TLBs but perhaps we need a cflush and TLB
> > > > > > invalidation is the mechanism for that too?
> > > > > >
> > > > > > Matt
> > > > > >
> > > > > > > > err = xe_vm_invalidate_vma(vma);
> > > > > > > > XE_WARN_ON(err);
> > > > > > > > }
> > > > > > > > --
> > > > > > > > 2.52.0
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Matt Roper
> > > > > > > Graphics Software Engineer
> > > > > > > Linux GPU Platform Enablement
> > > > > > > Intel Corporation
> > > >
> >
next prev parent reply other threads:[~2026-02-13 13:30 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-10 12:51 [PATCH 0/3] drm/xe/xe3p_lpg: L2 flush optimization Tejas Upadhyay
2026-02-10 12:51 ` [PATCH 1/3] drm/xe/xe3p_lpg: flush userptr/shrinker bo cachelines manually Tejas Upadhyay
2026-02-10 21:05 ` Matt Roper
2026-02-11 0:02 ` Matthew Brost
2026-02-11 19:06 ` Upadhyay, Tejas
2026-02-11 21:11 ` Matt Roper
2026-02-12 9:53 ` Matthew Auld
2026-02-13 11:17 ` Upadhyay, Tejas
2026-02-13 13:27 ` Matthew Auld
2026-02-13 13:30 ` Souza, Jose [this message]
2026-02-13 16:23 ` Upadhyay, Tejas
2026-02-13 16:48 ` Souza, Jose
2026-02-13 17:16 ` Matt Roper
2026-02-13 17:31 ` Souza, Jose
2026-02-13 17:31 ` Matthew Auld
2026-02-16 10:23 ` Thomas Hellström
2026-02-16 10:58 ` Matthew Auld
2026-02-16 12:07 ` Thomas Hellström
2026-02-16 14:55 ` Matthew Auld
2026-02-16 15:38 ` Thomas Hellström
2026-02-16 16:41 ` Matthew Auld
2026-02-17 6:19 ` Upadhyay, Tejas
2026-02-17 9:53 ` Thomas Hellström
2026-02-17 17:04 ` Thomas Hellström
2026-02-17 18:41 ` Matthew Auld
2026-02-16 10:56 ` Thomas Hellström
2026-02-16 11:26 ` Upadhyay, Tejas
2026-02-13 17:29 ` Matthew Auld
2026-02-10 12:51 ` [PATCH 2/3] drm/xe/xe3p_lpg: Enable L2 flush optimization feature Tejas Upadhyay
2026-02-10 12:51 ` [PATCH 3/3] drm/xe/xe3p: Skip TD flush Tejas Upadhyay
2026-02-10 21:22 ` Matt Roper
2026-02-13 11:06 ` Upadhyay, Tejas
2026-02-10 13:35 ` ✗ CI.checkpatch: warning for drm/xe/xe3p_lpg: L2 flush optimization (rev2) Patchwork
2026-02-10 13:36 ` ✓ CI.KUnit: success " Patchwork
2026-02-10 14:33 ` ✗ Xe.CI.BAT: failure " Patchwork
2026-02-10 18:06 ` ✗ Xe.CI.FULL: " Patchwork
-- strict thread matches above, loose matches on Subject: below --
2025-11-25 9:43 [PATCH 0/3] drm/xe/xe3p_lpg: L2 flush optimization Tejas Upadhyay
2025-11-25 9:43 ` [PATCH 1/3] drm/xe/xe3p_lpg: flush userptr/shrinker bo cachelines manually Tejas Upadhyay
2025-11-25 10:17 ` Matthew Auld
2025-11-25 13:39 ` Souza, Jose
2025-11-25 15:06 ` Thomas Hellström
2025-11-25 15:31 ` Upadhyay, Tejas
2025-11-26 10:26 ` Thomas Hellström
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=063e842a088f7498decf71deddda5e4a29e457ca.camel@intel.com \
--to=jose.souza@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=matthew.auld@intel.com \
--cc=matthew.brost@intel.com \
--cc=matthew.d.roper@intel.com \
--cc=michal.mrozek@intel.com \
--cc=tejas.upadhyay@intel.com \
--cc=thomas.hellstrom@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox