public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: Tomas Elf <tomas.elf@intel.com>
Cc: Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.
Date: Tue, 16 Jun 2015 15:44:45 +0200	[thread overview]
Message-ID: <20150616134445.GW23637@phenom.ffwll.local> (raw)
In-Reply-To: <1433783009-17251-2-git-send-email-tomas.elf@intel.com>

On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
> When submitting semaphores in execlist mode the hang checker crashes in this
> function because it is only runnable in ring submission mode. The reason this
> is of particular interest to the TDR patch series is because we use semaphores
> as a mean to induce hangs during testing (which is the recommended way to
> induce hangs for gen8+). It's not clear how this is supposed to work in
> execlist mode since:
> 
> 1. This function requires a ring buffer.
> 
> 2. Retrieving a ring buffer in execlist mode requires us to retrieve the
> corresponding context, which we get from a request.
> 
> 3. Retieving a request from the hang checker is not straight-forward since that
> requires us to grab the struct_mutex in order to synchronize against the
> request retirement thread.
> 
> 4. Grabbing the struct_mutex from the hang checker is nothing that we will do
> since that puts us at risk of deadlock since a hung thread might be holding the
> struct_mutex already.
> 
> Therefore it's not obvious how we're supposed to deal with this. For now, we're
> doing an early exit from this function, which avoids any kernel panic situation
> when running our own internal TDR ULT.
> 
> Signed-off-by: Tomas Elf <tomas.elf@intel.com>

We should have a Testcase: line here which mentions the igt testcase which
provoke this bug. Or we need to fill this gap asap.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_irq.c |   20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 46bcbff..40c44fc 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2698,6 +2698,26 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
>  	u64 offset = 0;
>  	int i, backwards;
>  
> +	/*
> +	 * This function does not support execlist mode - any attempt to
> +	 * proceed further into this function will result in a kernel panic
> +	 * when dereferencing ring->buffer, which is not set up in execlist
> +	 * mode.
> +	 *
> +	 * The correct way of doing it would be to derive the currently
> +	 * executing ring buffer from the current context, which is derived
> +	 * from the currently running request. Unfortunately, to get the
> +	 * current request we would have to grab the struct_mutex before doing
> +	 * anything else, which would be ill-advised since some other thread
> +	 * might have grabbed it already and managed to hang itself, causing
> +	 * the hang checker to deadlock.
> +	 *
> +	 * Therefore, this function does not support execlist mode in its
> +	 * current form. Just return NULL and move on.
> +	 */
> +	if (i915.enable_execlists)
> +		return NULL;
> +
>  	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
>  	if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
>  		return NULL;
> -- 
> 1.7.9.5
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2015-06-16 13:42 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
2015-06-08 17:03 ` [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode Tomas Elf
2015-06-08 17:36   ` Chris Wilson
2015-06-09 11:02     ` Tomas Elf
2015-06-16 13:44   ` Daniel Vetter [this message]
2015-06-16 15:46     ` Tomas Elf
2015-06-16 16:50       ` Chris Wilson
2015-06-16 17:07         ` Tomas Elf
2015-06-17 11:43       ` Daniel Vetter
2015-06-08 17:03 ` [RFC 02/11] drm/i915: Introduce uevent for full GPU reset Tomas Elf
2015-06-16 13:43   ` Daniel Vetter
2015-06-16 15:43     ` Tomas Elf
2015-06-16 16:55       ` Chris Wilson
2015-06-16 17:32         ` Tomas Elf
2015-06-16 19:33           ` Chris Wilson
2015-06-17 11:49             ` Daniel Vetter
2015-06-17 12:51               ` Chris Wilson
2015-06-08 17:03 ` [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset Tomas Elf
2015-06-08 17:33   ` Chris Wilson
2015-06-09 11:06     ` Tomas Elf
2015-06-16 13:48     ` Daniel Vetter
2015-06-16 13:54       ` Chris Wilson
2015-06-16 15:55         ` Daniel Vetter
2015-06-18 11:12         ` Dave Gordon
2015-06-11  9:14   ` Dave Gordon
2015-06-16 13:49   ` Daniel Vetter
2015-06-16 15:54     ` Tomas Elf
2015-06-17 11:51       ` Daniel Vetter
2015-06-08 17:03 ` [RFC 04/11] drm/i915: Adding TDR / per-engine reset support for gen8 Tomas Elf
2015-06-08 17:03 ` [RFC 05/11] drm/i915: Extending i915_gem_check_wedge to check engine reset in progress Tomas Elf
2015-06-08 17:24   ` Chris Wilson
2015-06-09 11:08     ` Tomas Elf
2015-06-09 11:11   ` Chris Wilson
2015-06-08 17:03 ` [RFC 06/11] drm/i915: Disable warnings for TDR interruptions in the display driver Tomas Elf
2015-06-08 17:53   ` Chris Wilson
2015-06-08 17:03 ` [RFC 07/11] drm/i915: Reinstate hang recovery work queue Tomas Elf
2015-06-08 17:03 ` [RFC 08/11] drm/i915: Watchdog timeout support for gen8 Tomas Elf
2015-06-08 17:03 ` [RFC 09/11] drm/i915: Fake lost context interrupts through forced CSB check Tomas Elf
2015-06-08 17:03 ` [RFC 10/11] drm/i915: Debugfs interface for per-engine hang recovery Tomas Elf
2015-06-08 17:45   ` Chris Wilson
2015-06-09 11:18     ` Tomas Elf
2015-06-09 12:27       ` Chris Wilson
2015-06-09 17:28         ` Tomas Elf
2015-06-11  9:32     ` Dave Gordon
2015-06-08 17:03 ` [RFC 11/11] drm/i915: TDR/watchdog trace points Tomas Elf
2015-06-23 10:05 ` [RFC 00/11] TDR/watchdog timeout support for gen8 Daniel Vetter
2015-06-23 10:47   ` Tomas Elf
2015-06-23 11:38     ` Daniel Vetter
2015-06-23 14:06       ` Tomas Elf
2015-06-23 15:20         ` Daniel Vetter
2015-06-23 15:35           ` Daniel Vetter
2015-06-25 10:38             ` Tomas Elf
2015-07-03 11:15 ` Mika Kuoppala
2015-07-03 17:41   ` Tomas Elf
2015-07-09 18:47 ` Chris Wilson
2015-07-10 15:24   ` Tomas Elf
2015-07-10 15:48     ` Tomas Elf
2015-07-11 18:15       ` Chris Wilson
2015-07-11 18:22     ` Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150616134445.GW23637@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=Intel-GFX@Lists.FreeDesktop.Org \
    --cc=tomas.elf@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox