public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Andi Shyti <andi.shyti@linux.intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: janusz.krzysztofik@intel.com, matthew.d.roper@intel.com,
	intel-gfx@lists.freedesktop.org,
	Jonathan Cavitt <jonathan.cavitt@intel.com>,
	chris.p.wilson@linux.intel.com, nirmoy.das@intel.com
Subject: Re: [Intel-gfx] [PATCH v8 3/7] drm/i915: Define and use GuC and CTB TLB invalidation routines
Date: Mon, 9 Oct 2023 13:40:54 +0200	[thread overview]
Message-ID: <ZSPmxlfTx2oQ2Ns4@ashyti-mobl2.lan> (raw)
In-Reply-To: <d6051b4f-dc0a-2df7-71ca-5bf9ec209d27@linux.intel.com>

Hi,

...

> > @@ -131,11 +132,23 @@ void intel_gt_invalidate_tlb_full(struct intel_gt *gt, u32 seqno)
> >   		return;
> >   	with_intel_gt_pm_if_awake(gt, wakeref) {
> > +		struct intel_guc *guc = &gt->uc.guc;
> > +
> >   		mutex_lock(&gt->tlb.invalidate_lock);
> >   		if (tlb_seqno_passed(gt, seqno))
> >   			goto unlock;
> > -		mmio_invalidate_full(gt);
> > +		if (HAS_GUC_TLB_INVALIDATION(gt->i915)) {
> > +			/*
> > +			 * Only perform GuC TLB invalidation if GuC is ready.
> > +			 * If GuC is not ready, then there are no TLBs to
> > +			 * invalidate.  Ergo, skip invalidation.
> > +			 */
> > +			if (intel_guc_is_ready(guc))
> > +				intel_guc_invalidate_tlb_engines(guc);
> 
> What was the answer to John's question on why it is okay to just skip and
> not maybe fall back to mmio?

this maybe can be written as:

	if (HAS_GUC_TLB_INVALIDATION(gt->i915) &&
	    intel_guc_is_ready(guc))
		intel_guc_invalidate_tlb_engines(guc);
	else
		mmio_invalidate_full(gt);

> > +		} else {
> > +			mmio_invalidate_full(gt);
> > +		}
> >   		write_seqcount_invalidate(&gt->tlb.seqno);
> >   unlock:

...

> > +	/*
> > +	 * The full GT reset will have cleared the TLB caches and flushed the
> > +	 * G2H message queue; we can release all the blocked waiters.
> > +	 *
> > +	 * This is safe to do unlocked because the xarray is not dependent
> > +	 * on the GT reset, and there's a separate execution path for TLB
> > +	 * invalidations on GT reset, and there's a large window of time
> > +	 * between the GT reset and GuC becoming available.
> > +	 */
> > +	xa_for_each(&guc->tlb_lookup, i, wait)
> > +		wake_up(&wait->wq);
> 
> If you are confident there can be no failures to wake up someone, who maybe
> just added themselves to the xarray (via put pages for instance), while
> reset in ongoing. Or even removed themselves after say timing out the wait
> and so freed their entry...

I guess yuo are suggesting here to spinlock around this. The
reset is protected by the uncore->spinlock, but I don't really
see it colliding with reset, to be honest.

> >   }
> >   static void guc_cancel_context_requests(struct intel_context *ce)
> > @@ -1948,6 +1962,50 @@ void intel_guc_submission_reset_finish(struct intel_guc *guc)
> >   static void destroyed_worker_func(struct work_struct *w);
> >   static void reset_fail_worker_func(struct work_struct *w);
> > +static int init_tlb_lookup(struct intel_guc *guc)
> > +{
> > +	struct intel_guc_tlb_wait *wait;
> > +	int err;
> > +
> > +	if (!HAS_GUC_TLB_INVALIDATION(guc_to_gt(guc)->i915))
> > +		return 0;
> > +
> > +	xa_init_flags(&guc->tlb_lookup, XA_FLAGS_ALLOC);
> > +
> > +	wait = kzalloc(sizeof(*wait), GFP_KERNEL);
> > +	if (!wait)
> > +		return -ENOMEM;
> > +
> > +	init_waitqueue_head(&wait->wq);
> > +
> > +	/* Preallocate a shared id for use under memory pressure. */
> > +	err = xa_alloc_cyclic_irq(&guc->tlb_lookup, &guc->serial_slot, wait,
> > +				  xa_limit_32b, &guc->next_seqno, GFP_KERNEL);
> > +	/* Only error if out of memory, not when busy (list full)*/
> > +	if (err == -ENOMEM) {
> > +		kfree(wait);
> > +		return err;
> > +	}
> 
> I agreed with John here that only looking at ENOMEM reads odd and I did not
> see that answered. Did I miss it?

xa_alloc_cyclic_irq() can also fail with -EBUSY... so that I
think this is a matter...

> Otherwise, I _know_ it is not likely to get any other error having *just*
> created a new xarray, but still, why not simply catch all? It is not like
> the slot fallback code at runtime would handle guc->serial_slot being
> empty?! It appears it would just explode in guc_send_invalidate_tlb if it
> would hit it..

... of what we want to do for such errors. E.g. Jonathan decided
here not to fail, but ignore the error. Should we fail for every
error?

> > +
> > +	return 0;
> > +}
> > +
> > +static void fini_tlb_lookup(struct intel_guc *guc)
> > +{
> > +	struct intel_guc_tlb_wait *wait;
> > +
> > +	if (!HAS_GUC_TLB_INVALIDATION(guc_to_gt(guc)->i915))
> > +		return;
> > +
> > +	wait = xa_load(&guc->tlb_lookup, guc->serial_slot);
> > +	if (wait) {
> > +		guc_dbg(guc, "fini_tlb_lookup: Unexpected item in tlb_lookup\n");
> 
> Hm wait, why is this unexpected when init_tlb_lookup() pre-allocated that
> entry? Who frees it? guc_send_invalidate_tlb() does not appear to.

I think this links with my answer above, right? With th
refactoring of the if's for not skipping tlb invalidation.

> > +		kfree(wait);
> > +	}
> > +
> > +	xa_destroy(&guc->tlb_lookup);
> > +}
> > +
> >   /*
> >    * Set up the memory resources to be shared with the GuC (via the GGTT)
> >    * at firmware loading time.

...

> > +int intel_guc_tlb_invalidation_done(struct intel_guc *guc, u32 size, u32 len, u32 seqno)
> > +{
> > +	/* Check for underflow */
> > +	if (unlikely(len < 1 || len > size))
> > +		return -EPROTO;
> 
> These check are not valid for any message/action type ct_process_request()
> can receive?

You mean discriminating by payload? Jonathan... you konw the
details here?

  reply	other threads:[~2023-10-09 11:41 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-06 18:20 [Intel-gfx] [PATCH v8 0/7] drm/i915: Define and use GuC and CTB TLB invalidation routines Jonathan Cavitt
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 1/7] drm/i915: Add GuC TLB Invalidation device info flags Jonathan Cavitt
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 2/7] drm/i915/guc: Add CT size delay helper Jonathan Cavitt
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 3/7] drm/i915: Define and use GuC and CTB TLB invalidation routines Jonathan Cavitt
2023-10-09  8:56   ` Tvrtko Ursulin
2023-10-09 11:40     ` Andi Shyti [this message]
2023-10-09 12:49       ` Tvrtko Ursulin
2023-10-09 15:02     ` Cavitt, Jonathan
2023-10-09 19:14     ` John Harrison
2023-10-10  9:00       ` Tvrtko Ursulin
2023-10-09 12:12   ` Nirmoy Das
2023-10-09 12:54     ` Tvrtko Ursulin
2023-10-09 13:24       ` Nirmoy Das
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 4/7] drm/i915: No TLB invalidation on suspended GT Jonathan Cavitt
2023-10-09  9:00   ` Tvrtko Ursulin
2023-10-09  9:31     ` Jani Nikula
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 5/7] drm/i915: No TLB invalidation on wedged GT Jonathan Cavitt
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 6/7] drm/i915/gt: Increase sleep in gt_tlb selftest sanitycheck Jonathan Cavitt
2023-10-06 18:20 ` [Intel-gfx] [PATCH v8 7/7] drm/i915: Enable GuC TLB invalidations for MTL Jonathan Cavitt
2023-10-06 21:27 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Define and use GuC and CTB TLB invalidation routines Patchwork
2023-10-06 21:27 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-10-06 21:45 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2023-10-07  6:43 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Define and use GuC and CTB TLB invalidation routines (rev2) Patchwork
2023-10-07  6:43 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-10-07  6:57 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-10-07 16:47 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2023-10-09 11:52 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Define and use GuC and CTB TLB invalidation routines (rev3) Patchwork
2023-10-09 11:52 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-10-09 12:05 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-10-09 14:40 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZSPmxlfTx2oQ2Ns4@ashyti-mobl2.lan \
    --to=andi.shyti@linux.intel.com \
    --cc=chris.p.wilson@linux.intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=janusz.krzysztofik@intel.com \
    --cc=jonathan.cavitt@intel.com \
    --cc=matthew.d.roper@intel.com \
    --cc=nirmoy.das@intel.com \
    --cc=tvrtko.ursulin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox