From: "Summers, Stuart" <stuart.summers@intel.com>
To: "Brost, Matthew" <matthew.brost@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
"Vishwanathapura,
Niranjana" <niranjana.vishwanathapura@intel.com>,
"Cavitt, Jonathan" <jonathan.cavitt@intel.com>
Subject: Re: [PATCH] drm/xe: Add min and max context TLB invalidation sizes
Date: Mon, 23 Mar 2026 19:18:54 +0000 [thread overview]
Message-ID: <8e8d66b91bcb33072e27537ec4b856f68247fa5b.camel@intel.com> (raw)
In-Reply-To: <acF3Gp800SnB/g5D@lstrano-desk.jf.intel.com>
On Mon, 2026-03-23 at 10:23 -0700, Matthew Brost wrote:
> On Fri, Mar 20, 2026 at 08:46:30PM +0000, Stuart Summers wrote:
> > Allow platform-defined TLB invalidation min and max lengths.
> >
> > This gives finer-grained control over which invalidations we
> > decide to send to the GuC. The min size is essentially a round
> > up. The max allows us to switch to a full invalidation.
> >
> > The expectation here is that the GuC will translate the full
> > invalidation in this instance into a series of per-context
> > invalidations. These are then issued with no H2G or G2H
> > messages and therefore should be quicker than splitting
> > the invalidations from the KMD into max-size chunks and
> > sending them separately.
> >
> > v2: Add proper defaults for min/max if not set in the device
> > structures
> > v3: Add coverage for pow-of-2 out of bounds cases
> >
> > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> > Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_device_types.h  |  4 +++
> >  drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 39 +++++++++++++++++----------
> >  drivers/gpu/drm/xe/xe_pci.c           |  3 +++
> >  drivers/gpu/drm/xe/xe_pci_types.h     |  2 ++
> >  4 files changed, 34 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > index 615218d775b1..0c4168fe2ffb 100644
> > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > @@ -137,6 +137,10 @@ struct xe_device {
> >  		u8 vm_max_level;
> >  		/** @info.va_bits: Maximum bits of a virtual address */
> >  		u8 va_bits;
> > +		/** @info.min_tlb_inval_size: Minimum size of context based TLB invalidations */
> > +		u64 min_tlb_inval_size;
> > +		/** @info.max_tlb_inval_size: Maximum size of context based TLB invalidations */
> > +		u64 max_tlb_inval_size;
> >
> > /*
> > * Keep all flags below alphabetically sorted
> > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > index ced58f46f846..e9e0be94ceef 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > @@ -115,14 +115,23 @@ static int send_page_reclaim(struct xe_guc *guc, u32 seqno,
> >  			       G2H_LEN_DW_PAGE_RECLAMATION, 1);
> >  }
> >  
> > +/*
> > + * Ensure that roundup_pow_of_two(length) doesn't overflow.
> > + * Note that roundup_pow_of_two() operates on unsigned long,
> > + * not on u64.
> > + */
> > +#define MAX_RANGE_TLB_INVALIDATION_LENGTH (rounddown_pow_of_two(ULONG_MAX))
> > +
> >  static u64 normalize_invalidation_range(struct xe_gt *gt, u64 *start, u64 *end)
> >  {
> > +	struct xe_device *xe = gt_to_xe(gt);
> >  	u64 orig_start = *start;
> >  	u64 length = *end - *start;
> >  	u64 align;
> >  
> > -	if (length < SZ_4K)
> > -		length = SZ_4K;
> > +	xe_gt_assert(gt, length <= MAX_RANGE_TLB_INVALIDATION_LENGTH);
> > +
> > +	length = max_t(u64, xe->info.min_tlb_inval_size, length);
> >  
> >  	align = roundup_pow_of_two(length);
> >  	*start = ALIGN_DOWN(*start, align);
> > @@ -147,13 +156,6 @@ static u64 normalize_invalidation_range(struct xe_gt *gt, u64 *start, u64 *end)
> >  	return length;
> >  }
> >
> > -/*
> > - * Ensure that roundup_pow_of_two(length) doesn't overflow.
> > - * Note that roundup_pow_of_two() operates on unsigned long,
> > - * not on u64.
> > - */
> > -#define MAX_RANGE_TLB_INVALIDATION_LENGTH (rounddown_pow_of_two(ULONG_MAX))
> > -
> >  static int send_tlb_inval_ppgtt(struct xe_guc *guc, u32 seqno, u64 start,
> >  				u64 end, u32 id, u32 type,
> >  				struct drm_suballoc *prl_sa)
> > @@ -162,8 +164,20 @@ static int send_tlb_inval_ppgtt(struct xe_guc *guc, u32 seqno, u64 start,
> >  	struct xe_gt *gt = guc_to_gt(guc);
> >  	struct xe_device *xe = guc_to_xe(guc);
> >  	u32 action[MAX_TLB_INVALIDATION_LEN];
> > -	u64 length = end - start;
> > +	u64 normalize_len, length = end - start;
> >  	int len = 0, err;
> > +	bool do_full_inval = false;
> > +
> > +	if (!xe->info.has_range_tlb_inval ||
> > +	    length > MAX_RANGE_TLB_INVALIDATION_LENGTH) {
> > +		do_full_inval = true;
> > +	} else {
> > +		normalize_len = normalize_invalidation_range(gt, &start,
> > +							     &end);
> > +
> > +		if (normalize_len > xe->info.max_tlb_inval_size)
> > +			do_full_inval = true;
> > +	}
>
> I suggested this in the last rev: can this logic be moved to
> send_tlb_inval_asid_ppgtt / send_tlb_inval_ctx_ppgtt?
>
> For send_tlb_inval_asid_ppgtt it doesn't really matter, as
> send_tlb_inval_ppgtt is called once.
>
> But consider send_tlb_inval_ctx_ppgtt, where send_tlb_inval_ppgtt is
> called multiple times and each call fails the
> normalize_invalidation_range step (i.e., we set do_full_inval). We
> only need to issue one full invalidation, not multiple.
Yeah, it's a good suggestion. I'll split this out in the next rev.
And I'll respond to those other comments in the earlier rev.

Thanks,
Stuart
>
> So likely want to hook in early in existing if statement in
> send_tlb_inval_ctx_ppgtt.
>
> 244 #define EXEC_QUEUE_COUNT_FULL_THRESHOLD 8
> 245         if (vm->exec_queues.count[id] >= EXEC_QUEUE_COUNT_FULL_THRESHOLD) {
> 246                 u32 action[] = {
> 247                         XE_GUC_ACTION_TLB_INVALIDATION,
> 248                         seqno,
> 249                         MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL),
> 250                 };
> 251
> 252                 err = send_tlb_inval(guc, action, ARRAY_SIZE(action));
> 253                 goto err_unlock;
> 254         }
> 255 #undef EXEC_QUEUE_COUNT_FULL_THRESHOLD
>
> Matt
>
> >
> >  	xe_gt_assert(gt, (type == XE_GUC_TLB_INVAL_PAGE_SELECTIVE &&
> >  			  !xe->info.has_ctx_tlb_inval) ||
> > @@ -172,12 +186,9 @@ static int send_tlb_inval_ppgtt(struct xe_guc *guc, u32 seqno, u64 start,
> >  
> >  	action[len++] = XE_GUC_ACTION_TLB_INVALIDATION;
> >  	action[len++] = !prl_sa ? seqno : TLB_INVALIDATION_SEQNO_INVALID;
> > -	if (!gt_to_xe(gt)->info.has_range_tlb_inval ||
> > -	    length > MAX_RANGE_TLB_INVALIDATION_LENGTH) {
> > +	if (do_full_inval) {
> >  		action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL);
> >  	} else {
> > -		u64 normalize_len = normalize_invalidation_range(gt, &start,
> > -								 &end);
> >  		bool need_flush = !prl_sa &&
> >  				  seqno != TLB_INVALIDATION_SEQNO_INVALID;
> >
> > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > index 189e2a1c29f9..5e02f9ab625b 100644
> > --- a/drivers/gpu/drm/xe/xe_pci.c
> > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > @@ -743,6 +743,9 @@ static int xe_info_init_early(struct xe_device *xe,
> >  	xe->info.vm_max_level = desc->vm_max_level;
> >  	xe->info.vram_flags = desc->vram_flags;
> >  
> > +	xe->info.min_tlb_inval_size = desc->min_tlb_inval_size ?: SZ_4K;
> > +	xe->info.max_tlb_inval_size = desc->max_tlb_inval_size ?: SZ_1G;
> > +
> >  	xe->info.is_dgfx = desc->is_dgfx;
> >  	xe->info.has_cached_pt = desc->has_cached_pt;
> >  	xe->info.has_fan_control = desc->has_fan_control;
> > diff --git a/drivers/gpu/drm/xe/xe_pci_types.h b/drivers/gpu/drm/xe/xe_pci_types.h
> > index 8eee4fb1c57c..cd9d3ad96fe0 100644
> > --- a/drivers/gpu/drm/xe/xe_pci_types.h
> > +++ b/drivers/gpu/drm/xe/xe_pci_types.h
> > @@ -34,6 +34,8 @@ struct xe_device_desc {
> > u8 va_bits;
> > u8 vm_max_level;
> > u8 vram_flags;
> > + u64 min_tlb_inval_size;
> > + u64 max_tlb_inval_size;
> >
> > u8 require_force_probe:1;
> > u8 is_dgfx:1;
> > --
> > 2.43.0
> >
2026-03-20 20:46 [PATCH] drm/xe: Add min and max context TLB invalidation sizes Stuart Summers
2026-03-20 20:49 ` Summers, Stuart
2026-03-20 20:59 ` Cavitt, Jonathan
2026-03-20 20:53 ` ✓ CI.KUnit: success for drm/xe: Add min and max context TLB invalidation sizes (rev3) Patchwork
2026-03-20 21:31 ` ✓ Xe.CI.BAT: " Patchwork
2026-03-21 22:09 ` ✗ Xe.CI.FULL: failure " Patchwork
2026-03-23 17:23 ` [PATCH] drm/xe: Add min and max context TLB invalidation sizes Matthew Brost
2026-03-23 19:18 ` Summers, Stuart [this message]
-- strict thread matches above, loose matches on Subject: below --
2026-03-19 21:05 Stuart Summers
2026-03-19 21:11 ` Summers, Stuart
2026-03-19 21:36 ` Cavitt, Jonathan
2026-03-19 21:51 ` Summers, Stuart
2026-03-22 5:56 ` Matthew Brost
2026-03-23 19:22 ` Summers, Stuart
2026-03-17 19:50 Stuart Summers