From: "Summers, Stuart" <stuart.summers@intel.com>
To: "Brost, Matthew" <matthew.brost@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH v3 10/11] drm/xe: Add context-based invalidation to GuC TLB invalidation backend
Date: Tue, 13 Jan 2026 22:36:59 +0000 [thread overview]
Message-ID: <976ce0e45b400122953ede51994d0d2a32782173.camel@intel.com> (raw)
In-Reply-To: <aWWhQpD6GysgmJYp@lstrano-desk.jf.intel.com>
On Mon, 2026-01-12 at 17:34 -0800, Matthew Brost wrote:
> On Mon, Jan 12, 2026 at 06:28:01PM -0700, Summers, Stuart wrote:
> > On Mon, 2026-01-12 at 15:27 -0800, Matthew Brost wrote:
> > > Introduce context-based invalidation support to the GuC TLB
> > > invalidation backend. This is implemented by iterating over each
> > > exec queue per GT within a VM, skipping inactive queues, and issuing
> > > a context-based (GuC ID) H2G TLB invalidation. All H2G messages,
> > > except the final one, are sent with an invalid seqno, which the G2H
> > > handler drops to ensure the TLB invalidation fence is only signaled
> > > once all H2G messages are completed.
> > >
> > > A watermark mechanism is also added to switch between context-based
> > > TLB invalidations and full device-wide invalidations, as the return
> > > on investment for context-based invalidation diminishes when many
> > > exec queues are mapped.
> > >
> > > v2:
> > > - Fix checkpatch warnings
> > > v3:
> > > - Rebase on PRL
> > > - Use ref counting to avoid racing with deregisters
> > >
> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_device_types.h | 2 +
> > > drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 145 +++++++++++++++++++++++++-
> > > drivers/gpu/drm/xe/xe_pci.c | 1 +
> > > drivers/gpu/drm/xe/xe_pci_types.h | 1 +
> > > 4 files changed, 145 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > > index 8db870aaa382..b51acff4edcd 100644
> > > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > > @@ -358,6 +358,8 @@ struct xe_device {
> > > u8 has_pre_prod_wa:1;
> > > /** @info.has_pxp: Device has PXP support */
> > > u8 has_pxp:1;
> > > + /** @info.has_ctx_tlb_inval: Has context based TLB invalidations */
> > > + u8 has_ctx_tlb_inval:1;
> > > /** @info.has_range_tlb_inval: Has range based TLB invalidations */
> > > u8 has_range_tlb_inval:1;
> > > /** @info.has_soc_remapper_sysctrl: Has SoC remapper system controller */
> > > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > index 070d2e2cb7c9..328eced5f692 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > @@ -6,15 +6,19 @@
> > > #include "abi/guc_actions_abi.h"
> > >
> > > #include "xe_device.h"
> > > +#include "xe_exec_queue.h"
> > > +#include "xe_exec_queue_types.h"
> > > #include "xe_gt_stats.h"
> > > #include "xe_gt_types.h"
> > > #include "xe_guc.h"
> > > #include "xe_guc_ct.h"
> > > +#include "xe_guc_exec_queue_types.h"
> > > #include "xe_guc_tlb_inval.h"
> > > #include "xe_force_wake.h"
> > > #include "xe_mmio.h"
> > > #include "xe_sa.h"
> > > #include "xe_tlb_inval.h"
> > > +#include "xe_vm.h"
> > >
> > > #include "regs/xe_guc_regs.h"
> > >
> > > @@ -156,10 +160,16 @@ static int send_tlb_inval_ppgtt(struct xe_guc *guc, u32 seqno, u64 start,
> > > {
> > > #define MAX_TLB_INVALIDATION_LEN 7
> > > struct xe_gt *gt = guc_to_gt(guc);
> > > + struct xe_device *xe = guc_to_xe(guc);
> > > u32 action[MAX_TLB_INVALIDATION_LEN];
> > > u64 length = end - start;
> > > int len = 0, err;
> > >
> > > + xe_gt_assert(gt, (type == XE_GUC_TLB_INVAL_PAGE_SELECTIVE &&
> > > + !xe->info.has_ctx_tlb_inval) ||
> > > + (type == XE_GUC_TLB_INVAL_PAGE_SELECTIVE_CTX &&
> > > + xe->info.has_ctx_tlb_inval));
> > > +
> > > action[len++] = XE_GUC_ACTION_TLB_INVALIDATION;
> > > action[len++] = !prl_sa ? seqno : TLB_INVALIDATION_SEQNO_INVALID;
> > > if (!gt_to_xe(gt)->info.has_range_tlb_inval ||
> > > @@ -168,9 +178,11 @@ static int send_tlb_inval_ppgtt(struct xe_guc *guc, u32 seqno, u64 start,
> > > } else {
> > > u64 normalize_len = normalize_invalidation_range(gt, &start, &end);
> > > + bool need_flush = !prl_sa &&
> > > + seqno != TLB_INVALIDATION_SEQNO_INVALID;
> > >
> > > /* Flush on NULL case, Media is not required to modify flush due to no PPC so NOP */
> > > - action[len++] = MAKE_INVAL_OP_FLUSH(type, !prl_sa);
> > > + action[len++] = MAKE_INVAL_OP_FLUSH(type, need_flush);
> > > action[len++] = id;
> > > action[len++] = lower_32_bits(start);
> > > action[len++] = upper_32_bits(start);
> > > @@ -181,8 +193,10 @@ static int send_tlb_inval_ppgtt(struct xe_guc *guc, u32 seqno, u64 start,
> > > #undef MAX_TLB_INVALIDATION_LEN
> > >
> > > err = send_tlb_inval(guc, action, len);
> > > - if (!err && prl_sa)
> > > + if (!err && prl_sa) {
> > > + xe_gt_assert(gt, seqno != TLB_INVALIDATION_SEQNO_INVALID);
> > > err = send_page_reclaim(guc, seqno, xe_sa_bo_gpu_addr(prl_sa));
> > > + }
> > > return err;
> > > }
> > >
> > > @@ -201,6 +215,114 @@ static int send_tlb_inval_asid_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> > > XE_GUC_TLB_INVAL_PAGE_SELECTIVE, prl_sa);
> > > }
> > >
> > > +static bool queue_mapped_in_guc(struct xe_guc *guc, struct xe_exec_queue *q)
> > > +{
> > > + return q->gt == guc_to_gt(guc);
> > > +}
> > > +
> > > +static int send_tlb_inval_ctx_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> > > + u64 start, u64 end, u32 asid,
> > > + struct drm_suballoc *prl_sa)
> > > +{
> > > + struct xe_guc *guc = tlb_inval->private;
> > > + struct xe_device *xe = guc_to_xe(guc);
> > > + struct xe_exec_queue *q, *next, *last_q = NULL;
> > > + struct xe_vm *vm;
> > > + LIST_HEAD(tlb_inval_list);
> > > + int err = 0;
> > > +
> > > + lockdep_assert_held(&tlb_inval->seqno_lock);
> > > +
> > > + if (xe->info.force_execlist)
> > > + return -ECANCELED;
> > > +
> > > + vm = xe_device_asid_to_vm(xe, asid);
> > > + if (IS_ERR(vm))
> > > + return PTR_ERR(vm);
> > > +
> > > + down_read(&vm->exec_queues.lock);
> > > +
> > > + /*
> > > + * XXX: Randomly picking a threshold for now. This will need to be
> > > + * tuned based on expected UMD queue counts and performance profiling.
> > > + */
> > > +#define EXEC_QUEUE_COUNT_FULL_THRESHOLD 8
> > > + if (vm->exec_queues.count[guc_to_gt(guc)->info.id] >=
> > > + EXEC_QUEUE_COUNT_FULL_THRESHOLD) {
> > > + u32 action[] = {
> > > + XE_GUC_ACTION_TLB_INVALIDATION,
> > > + seqno,
> > > + MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL),
> > > + };
> > > +
> > > + err = send_tlb_inval(guc, action, ARRAY_SIZE(action));
> > > + goto err_unlock;
> > > + }
> > > +#undef EXEC_QUEUE_COUNT_FULL_THRESHOLD
> > > +
> > > + /*
> > > + * Move exec queues to a temporary list to issue invalidations. The exec
> > > + * queue must be mapped in the current GuC, active, and a reference must
> > > + * be taken to prevent concurrent deregistrations.
> > > + */
> > > + list_for_each_entry_safe(q, next, &vm->exec_queues.list, vm_exec_queue_link)
> >
> > Nitpick: I'd prefer braces around the if here so we aren't nesting
> > the multi-line condition within this list_for_each loop.
> >
>
> Sure. CI and local testing are showing this loop explodes too. My
> guess is two GTs are modifying the list at the same time, since I'm
> only taking a read lock here. I think vm->exec_queues.list needs to be
> per-GT, or we should just use a mutex to protect the list.
>
> > > + if (queue_mapped_in_guc(guc, q) && q->ops->active(q) &&
> > > + xe_exec_queue_get_unless_zero(q)) {
> > > + last_q = q;
> > > + list_move_tail(&q->vm_exec_queue_link, &tlb_inval_list);
> > > + }
> > > +
> > > + if (!last_q) {
> > > + /*
> > > + * We can't break fence ordering for TLB invalidation jobs, so if
> > > + * TLB invalidations are in flight, issue a dummy invalidation to
> > > + * maintain ordering. Nor can we safely move the seqno_recv when
> > > + * returning -ECANCELED if TLB invalidations are in flight. Use
> > > + * GGTT invalidation as the dummy invalidation given ASID
> > > + * invalidations are unsupported here.
> > > + */
> > > + if (xe_tlb_inval_idle(tlb_inval))
> > > + err = -ECANCELED;
> > > + else
> > > + err = send_tlb_inval_ggtt(tlb_inval, seqno);
> >
> > Can you give a little more context on this line actually? So this
> > is
> > going to send down a 0x3 type invalidation to GuC (TLB_INVAL_GUC).
> > GuC
> > is still doing an invalidation to hardware when we write this. Is
> > that
> > the expectation with this?
> >
>
> Yes, I think I explain this in the comment above. We are issuing a
> dummy invalidation to maintain ordering when required; this would be
> GuC TLBs. Not ideal, but we can't break dma-fence ordering, and this
> case should be exceedingly rare.
Yeah, the reasoning is clear to me. It just wasn't clear what would
happen with this GuC invalidation. Looking through the GuC source, I
see this is sent with a Granularity of 0x1, which is "All mappings
within ASID and VF" according to the bspec. We aren't actually passing
an ASID here, and it looks like GuC isn't programming anything into
that field in the descriptor. So we'd have to get unlucky with an ASID
of 0 (I believe - I can't find where they're clearing this field at a
glance), which seems unlikely.
So from the hardware side, there will be a lookup and failure to find
the ASID and the transaction should be dropped.
I think on current hardware this is ok and the invalidation still
completes at the GuC level (GuC receives the ack from hardware) and we
shouldn't get an invalidation timeout in the KMD. But there's always a
chance this could change in the future.
Can you add a comment here indicating this? Basically something like:
the expectation is that hardware will drop this request since GuC
submits an ASID-based request without an ASID.
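For concreteness, roughly the comment I have in mind (wording is just a suggestion, sketched against this hunk):

```c
	/*
	 * Dummy invalidation to preserve fence ordering. The GuC issues
	 * this as an ASID-granularity request without programming an
	 * ASID into the descriptor, so the expectation is that hardware
	 * misses the ASID lookup and drops the transaction while GuC
	 * still acks the invalidation, i.e. no KMD timeout. This
	 * behavior could change on future hardware.
	 */
	err = send_tlb_inval_ggtt(tlb_inval, seqno);
```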
Thanks,
Stuart
>
> > > + goto err_unlock;
> > > + }
> > > +
> > > + list_for_each_entry_safe(q, next, &tlb_inval_list, vm_exec_queue_link) {
> > > + struct drm_suballoc *__prl_sa = NULL;
> > > + int __seqno = TLB_INVALIDATION_SEQNO_INVALID;
> > > + u32 type = XE_GUC_TLB_INVAL_PAGE_SELECTIVE_CTX;
> > > +
> > > + xe_assert(xe, q->vm == vm);
> > > +
> > > + if (err)
> > > + goto unref;
> > > +
> > > + if (last_q == q) {
> > > + __prl_sa = prl_sa;
> > > + __seqno = seqno;
> > > + }
> > > +
> > > + err = send_tlb_inval_ppgtt(guc, __seqno, start, end,
> > > + q->guc->id, type, __prl_sa);
> > > +
> > > +unref:
> > > + /*
> > > + * Must always return exec queue to original list / drop
> > > + * reference
> > > + */
> > > + xe_exec_queue_put(q);
> > > + list_move_tail(&q->vm_exec_queue_link, &vm->exec_queues.list);
> >
> > Will get back on this one... need to spend a little more time and
> > do
> > some testing.
> >
>
> I probably should flip the put for clarity, but this is in fact safe
> as the queue's memory can't disappear while vm->exec_queues.lock is
> held.
>
> Matt
>
> > Thanks,
> > Stuart
> >
> > > + }
> > > +
> > > +err_unlock:
> > > + up_read(&vm->exec_queues.lock);
> > > + xe_vm_put(vm);
> > > +
> > > + return err;
> > > +}
> > > +
> > > static bool tlb_inval_initialized(struct xe_tlb_inval *tlb_inval)
> > > {
> > > struct xe_guc *guc = tlb_inval->private;
> > > @@ -228,7 +350,7 @@ static long tlb_inval_timeout_delay(struct xe_tlb_inval *tlb_inval)
> > > return hw_tlb_timeout + 2 * delay;
> > > }
> > >
> > > -static const struct xe_tlb_inval_ops guc_tlb_inval_ops = {
> > > +static const struct xe_tlb_inval_ops guc_tlb_inval_asid_ops = {
> > > .all = send_tlb_inval_all,
> > > .ggtt = send_tlb_inval_ggtt,
> > > .ppgtt = send_tlb_inval_asid_ppgtt,
> > > @@ -237,6 +359,15 @@ static const struct xe_tlb_inval_ops guc_tlb_inval_ops = {
> > > .timeout_delay = tlb_inval_timeout_delay,
> > > };
> > >
> > > +static const struct xe_tlb_inval_ops guc_tlb_inval_ctx_ops = {
> > > + .ggtt = send_tlb_inval_ggtt,
> > > + .all = send_tlb_inval_all,
> > > + .ppgtt = send_tlb_inval_ctx_ppgtt,
> > > + .initialized = tlb_inval_initialized,
> > > + .flush = tlb_inval_flush,
> > > + .timeout_delay = tlb_inval_timeout_delay,
> > > +};
> > > +
> > > /**
> > > * xe_guc_tlb_inval_init_early() - Init GuC TLB invalidation early
> > > * @guc: GuC object
> > > @@ -248,8 +379,14 @@ static const struct xe_tlb_inval_ops guc_tlb_inval_ops = {
> > > void xe_guc_tlb_inval_init_early(struct xe_guc *guc,
> > > struct xe_tlb_inval *tlb_inval)
> > > {
> > > + struct xe_device *xe = guc_to_xe(guc);
> > > +
> > > tlb_inval->private = guc;
> > > - tlb_inval->ops = &guc_tlb_inval_ops;
> > > +
> > > + if (xe->info.has_ctx_tlb_inval)
> > > + tlb_inval->ops = &guc_tlb_inval_ctx_ops;
> > > + else
> > > + tlb_inval->ops = &guc_tlb_inval_asid_ops;
> > > }
> > >
> > > /**
> > > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > > index 91e0553a8163..6ea1199f703e 100644
> > > --- a/drivers/gpu/drm/xe/xe_pci.c
> > > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > > @@ -889,6 +889,7 @@ static int xe_info_init(struct xe_device *xe,
> > > xe->info.has_device_atomics_on_smem = 1;
> > >
> > > xe->info.has_range_tlb_inval = graphics_desc->has_range_tlb_inval;
> > > + xe->info.has_ctx_tlb_inval = graphics_desc->has_ctx_tlb_inval;
> > > xe->info.has_usm = graphics_desc->has_usm;
> > > xe->info.has_64bit_timestamp = graphics_desc->has_64bit_timestamp;
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_pci_types.h b/drivers/gpu/drm/xe/xe_pci_types.h
> > > index 5f20f56571d1..000b54cbcd0e 100644
> > > --- a/drivers/gpu/drm/xe/xe_pci_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_pci_types.h
> > > @@ -71,6 +71,7 @@ struct xe_graphics_desc {
> > > u8 has_atomic_enable_pte_bit:1;
> > > u8 has_indirect_ring_state:1;
> > > u8 has_range_tlb_inval:1;
> > > + u8 has_ctx_tlb_inval:1;
> > > u8 has_usm:1;
> > > u8 has_64bit_timestamp:1;
> > > };
> >
Thread overview: 24+ messages
2026-01-12 23:27 [PATCH v3 00/11] Context based TLB invalidations Matthew Brost
2026-01-12 23:27 ` [PATCH v3 01/11] drm/xe: Add normalize_invalidation_range Matthew Brost
2026-01-12 23:27 ` [PATCH v3 02/11] drm/xe: Make usm.asid_to_vm allocation use GFP_NOWAIT Matthew Brost
2026-01-12 23:27 ` [PATCH v3 03/11] drm/xe: Add xe_device_asid_to_vm helper Matthew Brost
2026-01-13 1:01 ` Summers, Stuart
2026-01-13 1:36 ` Matthew Brost
2026-01-12 23:27 ` [PATCH v3 04/11] drm/xe: Add vm to exec queues association Matthew Brost
2026-01-13 1:04 ` Summers, Stuart
2026-01-13 1:30 ` Matthew Brost
2026-01-13 21:55 ` Summers, Stuart
2026-01-12 23:27 ` [PATCH v3 05/11] drm/xe: Taint TLB invalidation seqno lock with GFP_KERNEL Matthew Brost
2026-01-12 23:27 ` [PATCH v3 06/11] drm/xe: Rename send_tlb_inval_ppgtt to send_tlb_inval_asid_ppgtt Matthew Brost
2026-01-12 23:27 ` [PATCH v3 07/11] drm/xe: Add send_tlb_inval_ppgtt helper Matthew Brost
2026-01-12 23:27 ` [PATCH v3 08/11] drm/xe: Add xe_tlb_inval_idle helper Matthew Brost
2026-01-12 23:27 ` [PATCH v3 09/11] drm/xe: Add exec queue active vfunc Matthew Brost
2026-01-12 23:27 ` [PATCH v3 10/11] drm/xe: Add context-based invalidation to GuC TLB invalidation backend Matthew Brost
2026-01-13 1:28 ` Summers, Stuart
2026-01-13 1:34 ` Matthew Brost
2026-01-13 22:36 ` Summers, Stuart [this message]
2026-01-14 16:09 ` Matthew Brost
2026-01-14 23:45 ` Summers, Stuart
2026-01-12 23:27 ` [PATCH v3 11/11] drm/xe: Enable context TLB invalidations for CI Matthew Brost
2026-01-12 23:37 ` ✓ CI.KUnit: success for Context based TLB invalidations (rev3) Patchwork
2026-01-13 0:17 ` ✗ Xe.CI.BAT: failure " Patchwork