public inbox for intel-xe@lists.freedesktop.org
From: Matthew Brost <matthew.brost@intel.com>
To: "Summers, Stuart" <stuart.summers@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH v3 10/11] drm/xe: Add context-based invalidation to GuC TLB invalidation backend
Date: Wed, 14 Jan 2026 08:09:21 -0800	[thread overview]
Message-ID: <aWe/sZUWl07L+JSR@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <976ce0e45b400122953ede51994d0d2a32782173.camel@intel.com>

On Tue, Jan 13, 2026 at 03:36:59PM -0700, Summers, Stuart wrote:
> On Mon, 2026-01-12 at 17:34 -0800, Matthew Brost wrote:
> > On Mon, Jan 12, 2026 at 06:28:01PM -0700, Summers, Stuart wrote:
> > > On Mon, 2026-01-12 at 15:27 -0800, Matthew Brost wrote:
> > > > Introduce context-based invalidation support to the GuC TLB
> > > > invalidation backend. This is implemented by iterating over each
> > > > exec queue per GT within a VM, skipping inactive queues, and
> > > > issuing a context-based (GuC ID) H2G TLB invalidation. All H2G
> > > > messages, except the final one, are sent with an invalid seqno,
> > > > which the G2H handler drops to ensure the TLB invalidation fence
> > > > is only signaled once all H2G messages are completed.
> > > > 
> > > > A watermark mechanism is also added to switch between
> > > > context-based TLB invalidations and full device-wide
> > > > invalidations, as the return on investment for context-based
> > > > invalidation diminishes when many exec queues are mapped.
> > > > 
> > > > v2:
> > > >  - Fix checkpatch warnings
> > > > v3:
> > > >  - Rebase on PRL
> > > >  - Use ref counting to avoid racing with deregisters
> > > > 
> > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/xe/xe_device_types.h  |   2 +
> > > >  drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 145 +++++++++++++++++++++++++-
> > > >  drivers/gpu/drm/xe/xe_pci.c           |   1 +
> > > >  drivers/gpu/drm/xe/xe_pci_types.h     |   1 +
> > > >  4 files changed, 145 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > > > index 8db870aaa382..b51acff4edcd 100644
> > > > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > > > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > > > @@ -358,6 +358,8 @@ struct xe_device {
> > > >                 u8 has_pre_prod_wa:1;
> > > >                 /** @info.has_pxp: Device has PXP support */
> > > >                 u8 has_pxp:1;
> > > > +               /** @info.has_ctx_tlb_inval: Has context based TLB invalidations */
> > > > +               u8 has_ctx_tlb_inval:1;
> > > >                 /** @info.has_range_tlb_inval: Has range based TLB invalidations */
> > > >                 u8 has_range_tlb_inval:1;
> > > >                 /** @info.has_soc_remapper_sysctrl: Has SoC remapper system controller */
> > > > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > > index 070d2e2cb7c9..328eced5f692 100644
> > > > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > > @@ -6,15 +6,19 @@
> > > >  #include "abi/guc_actions_abi.h"
> > > >  
> > > >  #include "xe_device.h"
> > > > +#include "xe_exec_queue.h"
> > > > +#include "xe_exec_queue_types.h"
> > > >  #include "xe_gt_stats.h"
> > > >  #include "xe_gt_types.h"
> > > >  #include "xe_guc.h"
> > > >  #include "xe_guc_ct.h"
> > > > +#include "xe_guc_exec_queue_types.h"
> > > >  #include "xe_guc_tlb_inval.h"
> > > >  #include "xe_force_wake.h"
> > > >  #include "xe_mmio.h"
> > > >  #include "xe_sa.h"
> > > >  #include "xe_tlb_inval.h"
> > > > +#include "xe_vm.h"
> > > >  
> > > >  #include "regs/xe_guc_regs.h"
> > > >  
> > > > @@ -156,10 +160,16 @@ static int send_tlb_inval_ppgtt(struct xe_guc *guc, u32 seqno, u64 start,
> > > >  {
> > > >  #define MAX_TLB_INVALIDATION_LEN       7
> > > >         struct xe_gt *gt = guc_to_gt(guc);
> > > > +       struct xe_device *xe = guc_to_xe(guc);
> > > >         u32 action[MAX_TLB_INVALIDATION_LEN];
> > > >         u64 length = end - start;
> > > >         int len = 0, err;
> > > >  
> > > > +       xe_gt_assert(gt, (type == XE_GUC_TLB_INVAL_PAGE_SELECTIVE &&
> > > > +                         !xe->info.has_ctx_tlb_inval) ||
> > > > +                    (type == XE_GUC_TLB_INVAL_PAGE_SELECTIVE_CTX &&
> > > > +                     xe->info.has_ctx_tlb_inval));
> > > > +
> > > >         action[len++] = XE_GUC_ACTION_TLB_INVALIDATION;
> > > >         action[len++] = !prl_sa ? seqno : TLB_INVALIDATION_SEQNO_INVALID;
> > > >         if (!gt_to_xe(gt)->info.has_range_tlb_inval ||
> > > > @@ -168,9 +178,11 @@ static int send_tlb_inval_ppgtt(struct xe_guc *guc, u32 seqno, u64 start,
> > > >         } else {
> > > >                 u64 normalize_len = normalize_invalidation_range(gt, &start,
> > > >                                                                  &end);
> > > > +               bool need_flush = !prl_sa &&
> > > > +                       seqno != TLB_INVALIDATION_SEQNO_INVALID;
> > > >  
> > > >                 /* Flush on NULL case, Media is not required to modify flush due to no PPC so NOP */
> > > > -               action[len++] = MAKE_INVAL_OP_FLUSH(type, !prl_sa);
> > > > +               action[len++] = MAKE_INVAL_OP_FLUSH(type, need_flush);
> > > >                 action[len++] = id;
> > > >                 action[len++] = lower_32_bits(start);
> > > >                 action[len++] = upper_32_bits(start);
> > > > @@ -181,8 +193,10 @@ static int send_tlb_inval_ppgtt(struct xe_guc *guc, u32 seqno, u64 start,
> > > >  #undef MAX_TLB_INVALIDATION_LEN
> > > >  
> > > >         err = send_tlb_inval(guc, action, len);
> > > > -       if (!err && prl_sa)
> > > > +       if (!err && prl_sa) {
> > > > +               xe_gt_assert(gt, seqno != TLB_INVALIDATION_SEQNO_INVALID);
> > > >                 err = send_page_reclaim(guc, seqno, xe_sa_bo_gpu_addr(prl_sa));
> > > > +       }
> > > >         return err;
> > > >  }
> > > >  
> > > > @@ -201,6 +215,114 @@ static int send_tlb_inval_asid_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> > > >                                      XE_GUC_TLB_INVAL_PAGE_SELECTIVE, prl_sa);
> > > >  }
> > > >  
> > > > +static bool queue_mapped_in_guc(struct xe_guc *guc, struct xe_exec_queue *q)
> > > > +{
> > > > +       return q->gt == guc_to_gt(guc);
> > > > +}
> > > > +
> > > > +static int send_tlb_inval_ctx_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> > > > +                                   u64 start, u64 end, u32 asid,
> > > > +                                   struct drm_suballoc *prl_sa)
> > > > +{
> > > > +       struct xe_guc *guc = tlb_inval->private;
> > > > +       struct xe_device *xe = guc_to_xe(guc);
> > > > +       struct xe_exec_queue *q, *next, *last_q = NULL;
> > > > +       struct xe_vm *vm;
> > > > +       LIST_HEAD(tlb_inval_list);
> > > > +       int err = 0;
> > > > +
> > > > +       lockdep_assert_held(&tlb_inval->seqno_lock);
> > > > +
> > > > +       if (xe->info.force_execlist)
> > > > +               return -ECANCELED;
> > > > +
> > > > +       vm = xe_device_asid_to_vm(xe, asid);
> > > > +       if (IS_ERR(vm))
> > > > +               return PTR_ERR(vm);
> > > > +
> > > > +       down_read(&vm->exec_queues.lock);
> > > > +
> > > > +       /*
> > > > +        * XXX: Randomly picking a threshold for now. This will need to be
> > > > +        * tuned based on expected UMD queue counts and performance profiling.
> > > > +        */
> > > > +#define EXEC_QUEUE_COUNT_FULL_THRESHOLD        8
> > > > +       if (vm->exec_queues.count[guc_to_gt(guc)->info.id] >=
> > > > +           EXEC_QUEUE_COUNT_FULL_THRESHOLD) {
> > > > +               u32 action[] = {
> > > > +                       XE_GUC_ACTION_TLB_INVALIDATION,
> > > > +                       seqno,
> > > > +                       MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL),
> > > > +               };
> > > > +
> > > > +               err = send_tlb_inval(guc, action, ARRAY_SIZE(action));
> > > > +               goto err_unlock;
> > > > +       }
> > > > +#undef EXEC_QUEUE_COUNT_FULL_THRESHOLD
> > > > +
> > > > +       /*
> > > > +        * Move exec queues to a temporary list to issue invalidations. The exec
> > > > +        * queue must be mapped in the current GuC, active, and a reference must
> > > > +        * be taken to prevent concurrent deregistrations.
> > > > +        */
> > > > +       list_for_each_entry_safe(q, next, &vm->exec_queues.list,
> > > > +                                vm_exec_queue_link)
> > > 
> > > Nitpick: I'd prefer braces around the if here so we aren't nesting
> > > the multi-line condition within this list_for_each loop.
> > > 
> > 
> > Sure. CI and local testing are showing this loop explodes too. I'm
> > guessing two GTs are modifying the list at the same time, as I'm just
> > using a read lock here. I think vm->exec_queues.list needs to be per
> > GT actually, or we should just use a mutex to protect the list.
> > 
> > > > +               if (queue_mapped_in_guc(guc, q) && q->ops->active(q) &&
> > > > +                   xe_exec_queue_get_unless_zero(q)) {
> > > > +                       last_q = q;
> > > > +                       list_move_tail(&q->vm_exec_queue_link, &tlb_inval_list);
> > > > +               }
> > > > +
> > > > +       if (!last_q) {
> > > > +               /*
> > > > +                * We can't break fence ordering for TLB invalidation jobs, if
> > > > +                * TLB invalidations are inflight issue a dummy invalidation to
> > > > +                * maintain ordering. Nor can we move safely the seqno_recv when
> > > > +                * returning -ECANCELED if TLB invalidations are in flight. Use
> > > > +                * GGTT invalidation as dummy invalidation given ASID
> > > > +                * invalidations are unsupported here.
> > > > +                */
> > > > +               if (xe_tlb_inval_idle(tlb_inval))
> > > > +                       err = -ECANCELED;
> > > > +               else
> > > > +                       err = send_tlb_inval_ggtt(tlb_inval, seqno);
> > > 
> > > Can you give a little more context on this line actually? So this is
> > > going to send down a 0x3 type invalidation to GuC (TLB_INVAL_GUC).
> > > GuC is still doing an invalidation to hardware when we write this. Is
> > > that the expectation with this?
> > > 
> > 
> > Yes, I think I explain this in the comment above. We are issuing a
> > dummy invalidation to maintain ordering when required; this would be
> > GuC TLBs. Not ideal, but we can't break dma-fencing ordering, and this
> > case should be exceedingly rare.
> 
> Yeah the reasoning is clear to me. It just wasn't clear what would
> happen with this GuC invalidation. Looking through the GuC source, I
> see this is sending with Granularity of 0x1. This is "All mappings
> within ASID and VF" according to bspec. We aren't actually passing an
> ASID here and it looks like GuC isn't programming anything into that
> field in the descriptor. So we'd have to get unlucky with a 0 (I
> believe - can't find where they're clearing this at a glance) ASID
> which seems unlikely.
> 
> So from the hardware side, there will be a lookup and failure to find
> the ASID and the transaction should be dropped.
> 
> I think on current hardware this is ok and the invalidation still
> completes at the GuC level (GuC receives the ack from hardware) and we
> shouldn't get an invalidation timeout in the KMD. But there's always a
> chance this could change in the future.
> 
> Can you add a comment here indicating this? Basically something like:
> Expectation is that hardware will drop this request since GuC submits
> an ASID based request without an ASID.

This is a GuC GGTT TLB invalidation - not a full PPGTT or PASID
invalidation. For example, when we create or destroy a GGTT mapping,
this is the type of invalidation we issue.

I picked a GuC GGTT invalidation as I figured it would be less costly
than a full PPGTT invalidation, but that is just a guess. Are ASID-based
invalidations available on platforms that want context-based
invalidations? My guess is no, and I may have even confirmed that at one
point by looking at the GuC source.

I already have a comment above indicating this is a GGTT-based
invalidation.

Matt

> 
> Thanks,
> Stuart
> 
> > 
> > > > +               goto err_unlock;
> > > > +       }
> > > > +
> > > > +       list_for_each_entry_safe(q, next, &tlb_inval_list, vm_exec_queue_link) {
> > > > +               struct drm_suballoc *__prl_sa = NULL;
> > > > +               int __seqno = TLB_INVALIDATION_SEQNO_INVALID;
> > > > +               u32 type = XE_GUC_TLB_INVAL_PAGE_SELECTIVE_CTX;
> > > > +
> > > > +               xe_assert(xe, q->vm == vm);
> > > > +
> > > > +               if (err)
> > > > +                       goto unref;
> > > > +
> > > > +               if (last_q == q) {
> > > > +                       __prl_sa = prl_sa;
> > > > +                       __seqno = seqno;
> > > > +               }
> > > > +
> > > > +               err = send_tlb_inval_ppgtt(guc, __seqno, start, end,
> > > > +                                          q->guc->id, type, __prl_sa);
> > > > +
> > > > +unref:
> > > > +               /*
> > > > +                * Must always return exec queue to original list / drop
> > > > +                * reference
> > > > +                */
> > > > +               xe_exec_queue_put(q);
> > > > +               list_move_tail(&q->vm_exec_queue_link, &vm->exec_queues.list);
> > > 
> > > Will get back on this one... need to spend a little more time and do
> > > some testing.
> > > 
> > 
> > I probably should flip the put for clarity, but this is in fact safe,
> > as the queue's memory can't disappear while vm->exec_queues.lock is
> > held.
> > 
> > Matt
> > 
> > > Thanks,
> > > Stuart
> > > 
> > > > +       }
> > > > +
> > > > +err_unlock:
> > > > +       up_read(&vm->exec_queues.lock);
> > > > +       xe_vm_put(vm);
> > > > +
> > > > +       return err;
> > > > +}
> > > > +
> > > >  static bool tlb_inval_initialized(struct xe_tlb_inval *tlb_inval)
> > > >  {
> > > >         struct xe_guc *guc = tlb_inval->private;
> > > > @@ -228,7 +350,7 @@ static long tlb_inval_timeout_delay(struct xe_tlb_inval *tlb_inval)
> > > >         return hw_tlb_timeout + 2 * delay;
> > > >  }
> > > >  
> > > > -static const struct xe_tlb_inval_ops guc_tlb_inval_ops = {
> > > > +static const struct xe_tlb_inval_ops guc_tlb_inval_asid_ops = {
> > > >         .all = send_tlb_inval_all,
> > > >         .ggtt = send_tlb_inval_ggtt,
> > > >         .ppgtt = send_tlb_inval_asid_ppgtt,
> > > > @@ -237,6 +359,15 @@ static const struct xe_tlb_inval_ops guc_tlb_inval_ops = {
> > > >         .timeout_delay = tlb_inval_timeout_delay,
> > > >  };
> > > >  
> > > > +static const struct xe_tlb_inval_ops guc_tlb_inval_ctx_ops = {
> > > > +       .ggtt = send_tlb_inval_ggtt,
> > > > +       .all = send_tlb_inval_all,
> > > > +       .ppgtt = send_tlb_inval_ctx_ppgtt,
> > > > +       .initialized = tlb_inval_initialized,
> > > > +       .flush = tlb_inval_flush,
> > > > +       .timeout_delay = tlb_inval_timeout_delay,
> > > > +};
> > > > +
> > > >  /**
> > > >   * xe_guc_tlb_inval_init_early() - Init GuC TLB invalidation early
> > > >   * @guc: GuC object
> > > > @@ -248,8 +379,14 @@ static const struct xe_tlb_inval_ops guc_tlb_inval_ops = {
> > > >  void xe_guc_tlb_inval_init_early(struct xe_guc *guc,
> > > >                                  struct xe_tlb_inval *tlb_inval)
> > > >  {
> > > > +       struct xe_device *xe = guc_to_xe(guc);
> > > > +
> > > >         tlb_inval->private = guc;
> > > > -       tlb_inval->ops = &guc_tlb_inval_ops;
> > > > +
> > > > +       if (xe->info.has_ctx_tlb_inval)
> > > > +               tlb_inval->ops = &guc_tlb_inval_ctx_ops;
> > > > +       else
> > > > +               tlb_inval->ops = &guc_tlb_inval_asid_ops;
> > > >  }
> > > >  
> > > >  /**
> > > > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > > > index 91e0553a8163..6ea1199f703e 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pci.c
> > > > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > > > @@ -889,6 +889,7 @@ static int xe_info_init(struct xe_device *xe,
> > > >                 xe->info.has_device_atomics_on_smem = 1;
> > > >  
> > > >         xe->info.has_range_tlb_inval = graphics_desc->has_range_tlb_inval;
> > > > +       xe->info.has_ctx_tlb_inval = graphics_desc->has_ctx_tlb_inval;
> > > >         xe->info.has_usm = graphics_desc->has_usm;
> > > >         xe->info.has_64bit_timestamp = graphics_desc->has_64bit_timestamp;
> > > >  
> > > > diff --git a/drivers/gpu/drm/xe/xe_pci_types.h b/drivers/gpu/drm/xe/xe_pci_types.h
> > > > index 5f20f56571d1..000b54cbcd0e 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pci_types.h
> > > > +++ b/drivers/gpu/drm/xe/xe_pci_types.h
> > > > @@ -71,6 +71,7 @@ struct xe_graphics_desc {
> > > >         u8 has_atomic_enable_pte_bit:1;
> > > >         u8 has_indirect_ring_state:1;
> > > >         u8 has_range_tlb_inval:1;
> > > > +       u8 has_ctx_tlb_inval:1;
> > > >         u8 has_usm:1;
> > > >         u8 has_64bit_timestamp:1;
> > > >  };
> > > 
> 


Thread overview: 24+ messages
2026-01-12 23:27 [PATCH v3 00/11] Context based TLB invalidations Matthew Brost
2026-01-12 23:27 ` [PATCH v3 01/11] drm/xe: Add normalize_invalidation_range Matthew Brost
2026-01-12 23:27 ` [PATCH v3 02/11] drm/xe: Make usm.asid_to_vm allocation use GFP_NOWAIT Matthew Brost
2026-01-12 23:27 ` [PATCH v3 03/11] drm/xe: Add xe_device_asid_to_vm helper Matthew Brost
2026-01-13  1:01   ` Summers, Stuart
2026-01-13  1:36     ` Matthew Brost
2026-01-12 23:27 ` [PATCH v3 04/11] drm/xe: Add vm to exec queues association Matthew Brost
2026-01-13  1:04   ` Summers, Stuart
2026-01-13  1:30     ` Matthew Brost
2026-01-13 21:55       ` Summers, Stuart
2026-01-12 23:27 ` [PATCH v3 05/11] drm/xe: Taint TLB invalidation seqno lock with GFP_KERNEL Matthew Brost
2026-01-12 23:27 ` [PATCH v3 06/11] drm/xe: Rename send_tlb_inval_ppgtt to send_tlb_inval_asid_ppgtt Matthew Brost
2026-01-12 23:27 ` [PATCH v3 07/11] drm/xe: Add send_tlb_inval_ppgtt helper Matthew Brost
2026-01-12 23:27 ` [PATCH v3 08/11] drm/xe: Add xe_tlb_inval_idle helper Matthew Brost
2026-01-12 23:27 ` [PATCH v3 09/11] drm/xe: Add exec queue active vfunc Matthew Brost
2026-01-12 23:27 ` [PATCH v3 10/11] drm/xe: Add context-based invalidation to GuC TLB invalidation backend Matthew Brost
2026-01-13  1:28   ` Summers, Stuart
2026-01-13  1:34     ` Matthew Brost
2026-01-13 22:36       ` Summers, Stuart
2026-01-14 16:09         ` Matthew Brost [this message]
2026-01-14 23:45           ` Summers, Stuart
2026-01-12 23:27 ` [PATCH v3 11/11] drm/xe: Enable context TLB invalidations for CI Matthew Brost
2026-01-12 23:37 ` ✓ CI.KUnit: success for Context based TLB invalidations (rev3) Patchwork
2026-01-13  0:17 ` ✗ Xe.CI.BAT: failure " Patchwork
