From: "Summers, Stuart" <stuart.summers@intel.com>
To: "Brost, Matthew" <matthew.brost@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
	"Vishwanathapura,
	Niranjana" <niranjana.vishwanathapura@intel.com>
Subject: Re: [PATCH] drm/xe: Add min and max context TLB invalidation sizes
Date: Mon, 23 Mar 2026 19:22:57 +0000	[thread overview]
Message-ID: <0c29f8cdcc886d35ff38775b858d9ef9c45aa175.camel@intel.com> (raw)
In-Reply-To: <ab+EqQFFVnf0MeJJ@lstrano-desk.jf.intel.com>

On Sat, 2026-03-21 at 22:56 -0700, Matthew Brost wrote:
> On Thu, Mar 19, 2026 at 09:05:32PM +0000, Stuart Summers wrote:
> > Allow platform-defined TLB invalidation min and max lengths.
> > 
> > This gives finer-grained control over which invalidations we
> > decide to send to GuC. The min size is essentially a round
> > up. The max allows us to switch to a full invalidation.
> > 
> > The expectation here is that GuC will translate the full
> > invalidation in this instance into a series of per-context
> > invalidations. These are then issued with no H2G or G2H
> > messages and therefore should be quicker than splitting
> > the invalidations into max-size chunks in the KMD and
> > sending them separately.
> > 
> > v2: Add proper defaults for min/max if not set in the device
> >     structures
> > 
> > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_device_types.h  |  4 ++++
> >  drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 20 +++++++-------------
> >  drivers/gpu/drm/xe/xe_pci.c           |  3 +++
> >  drivers/gpu/drm/xe/xe_pci_types.h     |  2 ++
> >  4 files changed, 16 insertions(+), 13 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_device_types.h
> > b/drivers/gpu/drm/xe/xe_device_types.h
> > index 615218d775b1..0c4168fe2ffb 100644
> > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > @@ -137,6 +137,10 @@ struct xe_device {
> >                 u8 vm_max_level;
> >                 /** @info.va_bits: Maximum bits of a virtual
> > address */
> >                 u8 va_bits;
> > +               /** @info.min_tlb_inval_size: Minimum size of
> > context based TLB invalidations */
> > +               u64 min_tlb_inval_size;
> > +               /** @info.max_tlb_inval_size: Maximum size of
> > context based TLB invalidations */
> > +               u64 max_tlb_inval_size;
> >  
> >                 /*
> >                  * Keep all flags below alphabetically sorted
> > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > index eb40528976ca..7512f889a97a 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > @@ -133,12 +133,12 @@ static int send_page_reclaim(struct xe_guc
> > *guc, u32 seqno,
> >  
> >  static u64 normalize_invalidation_range(struct xe_gt *gt, u64
> > *start, u64 *end)
> >  {
> > +       struct xe_device *xe = gt_to_xe(gt);
> >         u64 orig_start = *start;
> >         u64 length = *end - *start;
> >         u64 align;
> >  
> > -       if (length < SZ_4K)
> > -               length = SZ_4K;
> > +       length = max_t(u64, xe->info.min_tlb_inval_size, length);
> >  
> >         align = roundup_pow_of_two(length);
> >         *start = ALIGN_DOWN(*start, align);
> > @@ -163,13 +163,6 @@ static u64 normalize_invalidation_range(struct
> > xe_gt *gt, u64 *start, u64 *end)
> >         return length;
> >  }
> >  
> > -/*
> > - * Ensure that roundup_pow_of_two(length) doesn't overflow.
> > - * Note that roundup_pow_of_two() operates on unsigned long,
> > - * not on u64.
> > - */
> > -#define MAX_RANGE_TLB_INVALIDATION_LENGTH
> > (rounddown_pow_of_two(ULONG_MAX))
> > -
> >  static int send_tlb_inval_ppgtt(struct xe_guc *guc, u32 seqno, u64
> > start,
> >                                 u64 end, u32 id, u32 type,
> >                                 struct drm_suballoc *prl_sa)
> > @@ -178,9 +171,12 @@ static int send_tlb_inval_ppgtt(struct xe_guc
> > *guc, u32 seqno, u64 start,
> >         struct xe_gt *gt = guc_to_gt(guc);
> >         struct xe_device *xe = guc_to_xe(guc);
> >         u32 action[MAX_TLB_INVALIDATION_LEN];
> > -       u64 length = end - start;
> > +       u64 normalize_len;
> >         int len = 0, err;
> >  
> > +       normalize_len = normalize_invalidation_range(gt, &start,
> > +                                                    &end);
> > +
> 
> Maybe not related, but is there any way we can pull
> normalize_invalidation_range() out of send_tlb_inval_ppgtt()?
> Context-based TLB invalidations call send_tlb_inval_ppgtt() multiple
> times, but the normalization step only needs to happen once. I’d
> include this as a “since we’re here” request.

Yeah, seems like a good change to make. Let me add that in the next rev.

> 
> >         xe_gt_assert(gt, (type == XE_GUC_TLB_INVAL_PAGE_SELECTIVE
> > &&
> >                           !xe->info.has_ctx_tlb_inval) ||
> >                      (type == XE_GUC_TLB_INVAL_PAGE_SELECTIVE_CTX
> > &&
> > @@ -189,11 +185,9 @@ static int send_tlb_inval_ppgtt(struct xe_guc
> > *guc, u32 seqno, u64 start,
> >         action[len++] = XE_GUC_ACTION_TLB_INVALIDATION;
> >         action[len++] = !prl_sa ? seqno :
> > TLB_INVALIDATION_SEQNO_INVALID;
> >         if (!gt_to_xe(gt)->info.has_range_tlb_inval ||
> > -           length > MAX_RANGE_TLB_INVALIDATION_LENGTH) {
> > +           normalize_len > xe->info.max_tlb_inval_size) {
> >                 action[len++] =
> > MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL);
> >         } else {
> > -               u64 normalize_len =
> > normalize_invalidation_range(gt, &start,
> > -                                                               
> > &end);
> >                 bool need_flush = !prl_sa &&
> >                         seqno != TLB_INVALIDATION_SEQNO_INVALID;
> >  
> > diff --git a/drivers/gpu/drm/xe/xe_pci.c
> > b/drivers/gpu/drm/xe/xe_pci.c
> > index 189e2a1c29f9..5e02f9ab625b 100644
> > --- a/drivers/gpu/drm/xe/xe_pci.c
> > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > @@ -743,6 +743,9 @@ static int xe_info_init_early(struct xe_device
> > *xe,
> >         xe->info.vm_max_level = desc->vm_max_level;
> >         xe->info.vram_flags = desc->vram_flags;
> >  
> > +       xe->info.min_tlb_inval_size = desc->min_tlb_inval_size ?:
> > SZ_4K;
> > +       xe->info.max_tlb_inval_size = desc->max_tlb_inval_size ?:
> > SZ_1G;
> 
> How did you arrive at SZ_1G? Previously this was
> rounddown_pow_of_two(ULONG_MAX), which was entirely meant to prevent
> a
> software infinite loop, not because of a hardware restriction.

Not a hardware requirement, I agree. This was just a middle-ground
value I had chosen. I think on initial bind we're invalidating
basically all 0xf's, so it would hit that upper case. We could set the
default to that? I was thinking that if we go much beyond 1G, I don't
know that we're really gaining much from a range-based invalidation at
that point. Maybe we should at least account for the possibility of 1G
page sizes, though, and bump that to something like 4 or 8G?

> 
> > +
> >         xe->info.is_dgfx = desc->is_dgfx;
> >         xe->info.has_cached_pt = desc->has_cached_pt;
> >         xe->info.has_fan_control = desc->has_fan_control;
> > diff --git a/drivers/gpu/drm/xe/xe_pci_types.h
> > b/drivers/gpu/drm/xe/xe_pci_types.h
> > index 8eee4fb1c57c..cd9d3ad96fe0 100644
> > --- a/drivers/gpu/drm/xe/xe_pci_types.h
> > +++ b/drivers/gpu/drm/xe/xe_pci_types.h
> > @@ -34,6 +34,8 @@ struct xe_device_desc {
> >         u8 va_bits;
> >         u8 vm_max_level;
> >         u8 vram_flags;
> > +       u64 min_tlb_inval_size;
> > +       u64 max_tlb_inval_size;
> 
> +1 on these bounds - this looks right and needed.

Thanks!
Stuart

> 
> Matt
> 
> >  
> >         u8 require_force_probe:1;
> >         u8 is_dgfx:1;
> > -- 
> > 2.43.0
> > 

