From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Cc: stable@vger.kernel.org, Chris Wilson <chris@chris-wilson.co.uk>
Subject: Re: [Intel-gfx] [PATCH] drm/i915/gt: Delay execlist processing for tgl
Date: Fri, 16 Oct 2020 10:07:07 +0300
Message-ID: <87eelysl7o.fsf@gaia.fi.intel.com>
In-Reply-To: <20201015195023.32346-1-chris@chris-wilson.co.uk>

Chris Wilson <chris@chris-wilson.co.uk> writes:

> When running gem_exec_nop, it floods the system with many requests (with
> the goal of userspace submitting faster than the HW can process a single
> empty batch). This causes the driver to continually resubmit new
> requests onto the end of an active context, a flood of lite-restore
> preemptions. If we time this just right, Tigerlake hangs.
>
> Inserting a small delay between the processing of CS events and
> submitting the next context, prevents the hang. Naturally it does not
> occur with debugging enabled. The suspicion then is that this is related
> to the issues with the CS event buffer, and inserting an mmio read of
> the CS pointer status appears to be very successful in preventing the
> hang. Other registers, or uncached reads, or plain mb, do not prevent
> the hang, suggesting that register is key -- but that the hang can be
> prevented by a simple udelay, suggests it is just a timing issue like
> that encountered by commit 233c1ae3c83f ("drm/i915/gt: Wait for CSB
> entries on Tigerlake"). Also note that the hang is not prevented by
> applying CTX_DESC_FORCE_RESTORE, or by inserting a delay on the GPU
> between requests.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Bruce Chang <yu.bruce.chang@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: stable@vger.kernel.org

Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 6170f6874f52..d15d561152ba 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -2711,6 +2711,9 @@ static void process_csb(struct intel_engine_cs *engine)
>  			smp_wmb(); /* complete the seqlock */
>  			WRITE_ONCE(execlists->active, execlists->inflight);
>  
> +			/* Magic delay for tgl */
> +			ENGINE_POSTING_READ(engine, RING_CONTEXT_STATUS_PTR);
> +
>  			WRITE_ONCE(execlists->pending[0], NULL);
>  		} else {
>  			if (GEM_WARN_ON(!*execlists->active)) {
> -- 
> 2.20.1
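
For readers outside the driver, the ordering the hunk above enforces can be
sketched in plain user-space C. This is only an illustration under stated
assumptions, not the i915 code: fake_status_ptr, posting_read(),
struct fake_execlists and promote_pending() are hypothetical stand-ins, and a
volatile load from ordinary memory does not reproduce the latency of a real
mmio read. It shows only that the read-and-discard sits between publishing
the new active set and clearing pending[0].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device register; the real one is mmio. */
static volatile uint32_t fake_status_ptr = 0x0605;

/* Counts reads so the sketch can be checked; not part of any real driver. */
static unsigned int reads_issued;

/*
 * Read the "register" and throw the value away. On real hardware the
 * CPU has to wait for the device to answer, which is where the delay
 * comes from. Here it is just a volatile load.
 */
static inline void posting_read(const volatile uint32_t *reg)
{
	(void)*reg;
	reads_issued++;
}

/* Hypothetical, heavily reduced model of the execlists bookkeeping. */
struct fake_execlists {
	const void *active;
	const void *inflight;
	const void *pending0;
};

/*
 * Mirror the order in the patched hunk:
 *   1. publish the new active set,
 *   2. issue the posting read (the "magic delay"),
 *   3. only then mark pending[0] as consumed.
 */
static void promote_pending(struct fake_execlists *el)
{
	el->active = el->inflight;
	posting_read(&fake_status_ptr);
	el->pending0 = NULL;
}
```

The point of the sketch is the placement: anything that waits for pending[0]
to become NULL before submitting again cannot proceed until the read has
returned.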


