From: Jiri Slaby <jirislaby@gmail.com>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jiri Slaby <jslaby@suse.cz>, Keith Packard <keithp@keithp.com>,
	LKML <linux-kernel@vger.kernel.org>,
	dri-devel@lists.freedesktop.org
Subject: Re: g33: GPU hangs
Date: Wed, 18 Jan 2012 10:31:42 +0100
Message-ID: <4F16917E.1090401@gmail.com>
In-Reply-To: <d08817$2d9rsk@azsmga001.ch.intel.com>

On 12/01/2011 01:47 PM, Chris Wilson wrote:
> On Thu, 01 Dec 2011 13:30:18 +0100, Jiri Slaby <jslaby@suse.cz> wrote:
>> Hi,
>>
>> my GPU hung both yesterday and today. Each hang happened when I opened
>> the Google front page in Firefox.
>>
>> I'm running 3.2.0-rc3-next-20111130. Given it happened twice in the past
>> 24 hours, it looks like a regression from next-20111124. Or is this a
>> userspace issue (I might have updated some packages)?
>>
>> i915_error_state dumps from the two hangs are here:
>> http://www.fi.muni.cz/~xslaby/sklad/panics/915_error_state_0
>> http://www.fi.muni.cz/~xslaby/sklad/panics/915_error_state_second
> 
> Both error states contain the same bug: a fence register in conflict
> with the command stream. The batch is using the buffer at 0x03d00000
> as an untiled 40x40 rgba buffer with pitch 192. However, a fence
> register is programmed to
>   fence[3] = 03d00001
>     valid, x-tiled, pitch: 512, start: 0x03d00000, size: 1048576
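
As a rough illustration of how the decoded line above follows from the raw
value 03d00001, here is a minimal sketch in C. The bit layout used (bit 0
valid, bits 4-6 log2 of the pitch in tile widths, bits 8-11 log2 of the size
in MiB, bit 12 Y-tiling, bits 20-27 start address) is an assumption about the
gen3 fence format, and the 128-byte Y tile width is likewise assumed. The
names struct fence_info and decode_gen3_fence() are hypothetical and are not
taken from the driver.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields of one fence register. */
struct fence_info {
	int valid;       /* bit 0: fence is enabled */
	int y_tiled;     /* bit 12: 1 = Y-tiled, 0 = X-tiled */
	uint32_t pitch;  /* pitch in bytes */
	uint32_t start;  /* start address of the fenced region */
	uint32_t size;   /* size of the fenced region in bytes */
};

/*
 * Decode a fence register value under the ASSUMED gen3 layout described
 * in the text above. This is a sketch for reading error-state dumps, not
 * a copy of the kernel's register definitions.
 */
static struct fence_info decode_gen3_fence(uint32_t reg)
{
	struct fence_info f;
	uint32_t tile_width;

	f.valid = (int)(reg & 1u);
	f.y_tiled = (int)((reg >> 12) & 1u);

	/* Assumption: X tiles are 512 bytes wide, Y tiles 128 bytes wide. */
	tile_width = f.y_tiled ? 128u : 512u;

	/* The pitch field stores log2(pitch / tile_width). */
	f.pitch = tile_width << ((reg >> 4) & 0x7u);

	/* The start address is 1 MiB aligned and occupies bits 20-27. */
	f.start = reg & 0x0ff00000u;

	/* The size field stores log2(size / 1 MiB). */
	f.size = (1u << 20) << ((reg >> 8) & 0xfu);

	return f;
}
```

Under these assumptions, decode_gen3_fence(0x03d00001) gives valid = 1,
X-tiled, pitch 512, start 0x03d00000 and size 1048576, which agrees with the
decoded line quoted above.
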
> 
> Note also that the buffer is not listed as currently active, so
> presumably we reused the buffer as tiled (and so reprogrammed the
> fence register) before the GPU retired the batch. That sounds eerily
> similar to this bug:
> 
> From 2b76187d2f5fc2352e391914b1828f91f93bb356 Mon Sep 17 00:00:00 2001
> From: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Tue, 29 Nov 2011 15:12:16 +0000
> Subject: [PATCH] drm/i915: Only clear the GPU domains upon a successful
>  finish

Hi, do you plan to push this patch upstream? Or am I supposed to not use
it anymore?

> By clearing the GPU read domains before waiting upon the buffer, we run
> the risk of the wait being interrupted and the domains prematurely
> cleared. The next time we attempt to wait upon the buffer (after
> userspace handles the signal), we believe that the buffer is idle and so
> skip the wait.
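
The ordering problem described above can be modelled outside the kernel. The
following is a toy sketch in C, not driver code: struct toy_obj,
toy_wait_rendering(), TOY_GPU_DOMAINS and TOY_ERESTARTSYS are all
hypothetical stand-ins, and the early return when no GPU read domain is set
is an assumption about how the "skip the wait" path is reached.

```c
#include <stdbool.h>

#define TOY_GPU_DOMAINS 0x3eu  /* stand-in for I915_GEM_GPU_DOMAINS */
#define TOY_ERESTARTSYS 512    /* stand-in for an interrupted wait */

/* Hypothetical, heavily reduced model of a GEM object. */
struct toy_obj {
	unsigned int read_domains; /* domains the object may be read from */
	bool gpu_busy;             /* GPU has not yet retired the batch */
	int pending_signals;       /* signals that will interrupt a wait */
};

/* Model of the wait: a pending signal interrupts it before the GPU idles. */
static int toy_wait_rendering(struct toy_obj *obj)
{
	if (obj->pending_signals > 0) {
		obj->pending_signals--;
		return -TOY_ERESTARTSYS;
	}
	obj->gpu_busy = false;
	return 0;
}

/* Old ordering: clear the domains first, then wait. */
static int toy_finish_gpu_old(struct toy_obj *obj)
{
	/* Assumed early exit: no GPU domains means "already idle". */
	if ((obj->read_domains & TOY_GPU_DOMAINS) == 0)
		return 0;

	obj->read_domains &= ~TOY_GPU_DOMAINS;
	return toy_wait_rendering(obj);
}

/* Patched ordering: clear the domains only after a successful wait. */
static int toy_finish_gpu_new(struct toy_obj *obj)
{
	int ret;

	if ((obj->read_domains & TOY_GPU_DOMAINS) == 0)
		return 0;

	ret = toy_wait_rendering(obj);
	if (ret)
		return ret;

	obj->read_domains &= ~TOY_GPU_DOMAINS;
	return 0;
}
```

In this model, with one pending signal and a busy GPU, the first call to
either function returns the interrupted-wait error. A retry of
toy_finish_gpu_old() then returns 0 while gpu_busy is still true, because the
domains were already cleared. A retry of toy_finish_gpu_new() waits again and
returns 0 only once gpu_busy is false.
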
> 
> There are a number of bugs across all generations which show signs of an
> overly hasty reuse of active buffers.
> 
> Such as:
> 
>   https://bugs.freedesktop.org/show_bug.cgi?id=29046
>   https://bugs.freedesktop.org/show_bug.cgi?id=35863
>   https://bugs.freedesktop.org/show_bug.cgi?id=38952
>   https://bugs.freedesktop.org/show_bug.cgi?id=40282
>   https://bugs.freedesktop.org/show_bug.cgi?id=41098
>   https://bugs.freedesktop.org/show_bug.cgi?id=41102
>   https://bugs.freedesktop.org/show_bug.cgi?id=41284
>   https://bugs.freedesktop.org/show_bug.cgi?id=42141
> 
> A couple of those pre-date i915_gem_object_finish_gpu(), so may be
> unrelated (such as a wild write from a userspace command buffer), but
> this does look like a convincing cause for most of those bugs.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: stable@kernel.org
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c |    7 +++++--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d560175..036bc58 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3087,10 +3087,13 @@ i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj)
>  			return ret;
>  	}
>  
> +	ret = i915_gem_object_wait_rendering(obj);
> +	if (ret)
> +		return ret;
> +
>  	/* Ensure that we invalidate the GPU's caches and TLBs. */
>  	obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> -
> -	return i915_gem_object_wait_rendering(obj);
> +	return 0;
>  }
>  
>  /**


-- 
js

