Re: [Intel-gfx] [PATCH v8 3/7] drm/i915: Define and use GuC and CTB TLB invalidation routines

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Andi Shyti <andi.shyti@linux.intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: janusz.krzysztofik@intel.com, matthew.d.roper@intel.com,
	intel-gfx@lists.freedesktop.org,
	Jonathan Cavitt <jonathan.cavitt@intel.com>,
	chris.p.wilson@linux.intel.com, nirmoy.das@intel.com
Subject: Re: [Intel-gfx] [PATCH v8 3/7] drm/i915: Define and use GuC and CTB TLB invalidation routines
Date: Mon, 9 Oct 2023 13:40:54 +0200	[thread overview]
Message-ID: <ZSPmxlfTx2oQ2Ns4@ashyti-mobl2.lan> (raw)
In-Reply-To: <d6051b4f-dc0a-2df7-71ca-5bf9ec209d27@linux.intel.com>

Hi,

...

> > @@ -131,11 +132,23 @@ void intel_gt_invalidate_tlb_full(struct intel_gt *gt, u32 seqno)
> >   		return;
> >   	with_intel_gt_pm_if_awake(gt, wakeref) {
> > +		struct intel_guc *guc = &gt->uc.guc;
> > +
> >   		mutex_lock(&gt->tlb.invalidate_lock);
> >   		if (tlb_seqno_passed(gt, seqno))
> >   			goto unlock;
> > -		mmio_invalidate_full(gt);
> > +		if (HAS_GUC_TLB_INVALIDATION(gt->i915)) {
> > +			/*
> > +			 * Only perform GuC TLB invalidation if GuC is ready.
> > +			 * If GuC is not ready, then there are no TLBs to
> > +			 * invalidate.  Ergo, skip invalidation.
> > +			 */
> > +			if (intel_guc_is_ready(guc))
> > +				intel_guc_invalidate_tlb_engines(guc);
> 
> What was the answer to John's question on why it is okay to just skip and
> not maybe fall back to mmio?

this maybe can be written as:

	if (HAS_GUC_TLB_INVALIDATION(gt->i915) &&
	    intel_guc_is_ready(guc))
		intel_guc_invalidate_tlb_engines(guc);
	else
		mmio_invalidate_full(gt);

> > +		} else {
> > +			mmio_invalidate_full(gt);
> > +		}
> >   		write_seqcount_invalidate(&gt->tlb.seqno);
> >   unlock:

...

> > +	/*
> > +	 * The full GT reset will have cleared the TLB caches and flushed the
> > +	 * G2H message queue; we can release all the blocked waiters.
> > +	 *
> > +	 * This is safe to do unlocked because the xarray is not dependent
> > +	 * on the GT reset, and there's a separate execution path for TLB
> > +	 * invalidations on GT reset, and there's a large window of time
> > +	 * between the GT reset and GuC becoming available.
> > +	 */
> > +	xa_for_each(&guc->tlb_lookup, i, wait)
> > +		wake_up(&wait->wq);
> 
> If you are confident there can be no failures to wake up someone, who maybe
> just added themselves to the xarray (via put pages for instance), while
> reset in ongoing. Or even removed themselves after say timing out the wait
> and so freed their entry...

I guess yuo are suggesting here to spinlock around this. The
reset is protected by the uncore->spinlock, but I don't really
see it colliding with reset, to be honest.

> >   }
> >   static void guc_cancel_context_requests(struct intel_context *ce)
> > @@ -1948,6 +1962,50 @@ void intel_guc_submission_reset_finish(struct intel_guc *guc)
> >   static void destroyed_worker_func(struct work_struct *w);
> >   static void reset_fail_worker_func(struct work_struct *w);
> > +static int init_tlb_lookup(struct intel_guc *guc)
> > +{
> > +	struct intel_guc_tlb_wait *wait;
> > +	int err;
> > +
> > +	if (!HAS_GUC_TLB_INVALIDATION(guc_to_gt(guc)->i915))
> > +		return 0;
> > +
> > +	xa_init_flags(&guc->tlb_lookup, XA_FLAGS_ALLOC);
> > +
> > +	wait = kzalloc(sizeof(*wait), GFP_KERNEL);
> > +	if (!wait)
> > +		return -ENOMEM;
> > +
> > +	init_waitqueue_head(&wait->wq);
> > +
> > +	/* Preallocate a shared id for use under memory pressure. */
> > +	err = xa_alloc_cyclic_irq(&guc->tlb_lookup, &guc->serial_slot, wait,
> > +				  xa_limit_32b, &guc->next_seqno, GFP_KERNEL);
> > +	/* Only error if out of memory, not when busy (list full)*/
> > +	if (err == -ENOMEM) {
> > +		kfree(wait);
> > +		return err;
> > +	}
> 
> I agreed with John here that only looking at ENOMEM reads odd and I did not
> see that answered. Did I miss it?

xa_alloc_cyclic_irq() can also fail with -EBUSY... so that I
think this is a matter...

> Otherwise, I _know_ it is not likely to get any other error having *just*
> created a new xarray, but still, why not simply catch all? It is not like
> the slot fallback code at runtime would handle guc->serial_slot being
> empty?! It appears it would just explode in guc_send_invalidate_tlb if it
> would hit it..

... of what we want to do for such errors. E.g. Jonathan decided
here not to fail, but ignore the error. Should we fail for every
error?

> > +
> > +	return 0;
> > +}
> > +
> > +static void fini_tlb_lookup(struct intel_guc *guc)
> > +{
> > +	struct intel_guc_tlb_wait *wait;
> > +
> > +	if (!HAS_GUC_TLB_INVALIDATION(guc_to_gt(guc)->i915))
> > +		return;
> > +
> > +	wait = xa_load(&guc->tlb_lookup, guc->serial_slot);
> > +	if (wait) {
> > +		guc_dbg(guc, "fini_tlb_lookup: Unexpected item in tlb_lookup\n");
> 
> Hm wait, why is this unexpected when init_tlb_lookup() pre-allocated that
> entry? Who frees it? guc_send_invalidate_tlb() does not appear to.

I think this links with my answer above, right? With th
refactoring of the if's for not skipping tlb invalidation.

> > +		kfree(wait);
> > +	}
> > +
> > +	xa_destroy(&guc->tlb_lookup);
> > +}
> > +
> >   /*
> >    * Set up the memory resources to be shared with the GuC (via the GGTT)
> >    * at firmware loading time.

...

> > +int intel_guc_tlb_invalidation_done(struct intel_guc *guc, u32 size, u32 len, u32 seqno)
> > +{
> > +	/* Check for underflow */
> > +	if (unlikely(len < 1 || len > size))
> > +		return -EPROTO;
> 
> These check are not valid for any message/action type ct_process_request()
> can receive?

You mean discriminating by payload? Jonathan... you konw the
details here?

next prev parent reply	other threads:[~2023-10-09 11:41 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-06 18:20 [Intel-gfx] [PATCH v8 0/7] drm/i915: Define and use GuC and CTB TLB invalidation routines Jonathan Cavitt
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 1/7] drm/i915: Add GuC TLB Invalidation device info flags Jonathan Cavitt
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 2/7] drm/i915/guc: Add CT size delay helper Jonathan Cavitt
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 3/7] drm/i915: Define and use GuC and CTB TLB invalidation routines Jonathan Cavitt
2023-10-06 21:56   ` kernel test robot
2023-10-09  8:56   ` Tvrtko Ursulin
2023-10-09 11:40     ` Andi Shyti [this message]
2023-10-09 12:49       ` Tvrtko Ursulin
2023-10-09 15:02     ` Cavitt, Jonathan
2023-10-09 19:14     ` John Harrison
2023-10-10  9:00       ` Tvrtko Ursulin
2023-10-09 12:12   ` Nirmoy Das
2023-10-09 12:54     ` Tvrtko Ursulin
2023-10-09 13:24       ` Nirmoy Das
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 4/7] drm/i915: No TLB invalidation on suspended GT Jonathan Cavitt
2023-10-09  9:00   ` Tvrtko Ursulin
2023-10-09  9:31     ` Jani Nikula
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 5/7] drm/i915: No TLB invalidation on wedged GT Jonathan Cavitt
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 6/7] drm/i915/gt: Increase sleep in gt_tlb selftest sanitycheck Jonathan Cavitt
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 7/7] drm/i915: Enable GuC TLB invalidations for MTL Jonathan Cavitt
2023-10-06 21:27 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Define and use GuC and CTB TLB invalidation routines Patchwork
2023-10-06 21:27 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-10-06 21:45 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2023-10-07  6:43 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Define and use GuC and CTB TLB invalidation routines (rev2) Patchwork
2023-10-07  6:43 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-10-07  6:57 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-10-07 16:47 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2023-10-09 11:52 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Define and use GuC and CTB TLB invalidation routines (rev3) Patchwork
2023-10-09 11:52 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-10-09 12:05 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-10-09 14:40 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZSPmxlfTx2oQ2Ns4@ashyti-mobl2.lan \
    --to=andi.shyti@linux.intel.com \
    --cc=chris.p.wilson@linux.intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=janusz.krzysztofik@intel.com \
    --cc=jonathan.cavitt@intel.com \
    --cc=matthew.d.roper@intel.com \
    --cc=nirmoy.das@intel.com \
    --cc=tvrtko.ursulin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.