From: Daniel Vetter <daniel@ffwll.ch>
To: Eric Anholt <eric@anholt.net>
Cc: intel-gfx@lists.freedesktop.org, Ben Widawsky <ben@bwidawsk.net>
Subject: Re: I've got the RC6 bug
Date: Wed, 18 Jan 2012 21:09:37 +0100
Message-ID: <20120118200937.GE4002@phenom.ffwll.local>
In-Reply-To: <87lip4ew65.fsf@eliezer.anholt.net>

On Wed, Jan 18, 2012 at 09:51:30AM -0800, Eric Anholt wrote:
> On Wed, 18 Jan 2012 11:17:52 +0000, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > On Wed, 18 Jan 2012 01:24:26 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
> > > On Wed, Jan 18, 2012 at 01:16:02AM +0100, CC wrote:
> > > > I attached the error state.
> > > 
> > > Nice one, your gpu seems to have simply disappeared. And the ringbuffer
> > > contains a rather peculiar cmd sequence. Putting Chris (maybe he
> > > recognizes the pattern) and Ben (he's got a patch in the works to dump a
> > > debug register that might be interesting here) on cc. It's too late atm
> > > for me to think about this some more.
> > 
> > Not simply disappeared, someone clobbered it with an extremely large
> > hammer. The GPU was killed by a stray write to address 0 which took out
> > the render ring buffer and its hws page. So my first thought is a
> > missing relocation, and i965g springs to mind.
> > -Chris
> 
> At one point there was a bug in Mesa that wrote to 0:
> 
> commit dfada714f8db3deea2fea3583c3c166a78db1117
> Author: Eric Anholt <eric@anholt.net>
> Date:   Fri Jun 17 18:20:36 2011 -0700
> 
>     i965/gen6: Use an BO instead of writing to address 0 for PIPE_CONTROL W/A.
>     
>     This was spectacularly unsafe.  On my system, address 0 happens to be
>     the hardware status page for the render ring, and the first quadword
>     of that happens to contain nothing we ever look at, but I sure didn't
>     look forward to having to debug some day when, for example, the kernel
>     happened to bind the ringbuffer before binding the hwsp.

Unfortunately the error_state contains more garbage than just one stray
write to address 0. So yeah, if this is due to the i965g gallium driver,
that would explain things - otherwise I'm hoping for Ben's reworked gt fifo
patch. The CS regs are all 0, indicating that the gpu isn't getting out of
deep sleep anymore.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

Thread overview: 9+ messages
2012-01-16 16:18 I've got the RC6 bug CC
2012-01-16 16:36 ` Daniel Vetter
2012-01-18  0:16   ` CC
2012-01-18  0:24     ` Daniel Vetter
2012-01-18 11:17       ` Chris Wilson
2012-01-18 17:51         ` Eric Anholt
2012-01-18 20:09           ` Daniel Vetter [this message]
2012-01-20 10:30       ` Daniel Vetter
2012-01-20 10:46         ` Daniel Vetter
