public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Chris Wilson <chris@chris-wilson.co.uk>
To: Jan Niehusmann <jan@gondor.com>
Cc: linux-kernel@vger.kernel.org, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] intel-gtt: fix memory corruption with GM965 and >4GB RAM
Date: Fri, 25 Feb 2011 22:18:16 +0000	[thread overview]
Message-ID: <b9dded$i2hik6@orsmga002.jf.intel.com> (raw)
In-Reply-To: <20110225211646.GA6837@x61s.reliablesolutions.de>

On Fri, 25 Feb 2011 22:16:46 +0100, Jan Niehusmann <jan@gondor.com> wrote:
> Hi Chris,
> 
> On Fri, Feb 25, 2011 at 08:22:53PM +0000, Chris Wilson wrote:
> > On Fri, 25 Feb 2011 13:30:56 +0100, Jan Niehusmann <jan@gondor.com> wrote:
> > > Further investigation revealed that the corrupted address is
> > > (dev_priv->status_page_dmah->busaddr & 0xffffffff), ie. the beginning of
> > > the hardware status page of the i965 graphics card, cut to 32 bits.
> > 
> > 965GM explicitly supports 36bits of addressing in the PTE. The only
> > exception is that general state (part of the 3D engine) must be located in
> > the lower 4GiB.
> 
> I'm not claiming that 965GM doesn't do 36 bits. In fact I actually see
> activity in /sys/kernel/debug/dri/64/i915_gem_hws, and everything seems
> to be working well, when the status page is above 4GB - if one ignores
> the tiny detail that the wrong memory location gets overwritten,
> sometimes...
> 
> > Simply ignoring the upper 4bits is the wrong approach and means that the
> > PTE then point to random pages, and completely irrelevant to the physical
> > address used in the hardware status page address register.
> 
> Doesn't setting DMA_BIT_MASK(32) only change the region DMA memory is
> allocated from? I made that change just to make sure one gets addresses
> which are safe even if the chipset sometimes ignores address bit 32. The
> only negative impact I could think of is the allocation may fail if no
> appropriate memory is available. Am I wrong?

I just thought Daniel had wired up the dma_mask_size differently and
didn't realise it also did pci_set_dma_mask on the same pci dev. So
our patches were morally equivalent. ;-)

> 
> > I have been considering:
> 
> > +       if (IS_BRROADWATER(dev) || IS_CRESTLINE(dev))
> > +               dma_set_coherent_mask(&dev->pdev->dev, DMA_BIT_MASK(32));
> 
> > to prevent hitting the erratum.
> 
> So is there a known erratum about these chips? I didn't find errata
> documents online, but I only did a short google search and may have
> missed them.

Hah. I wish our documentation were that organised. If you grab the PRM
for gen4 from intellinuxgraphics.org, you should find it mentioned in the
state descriptions for the 3D pipeline.

> > However your bug looks to be:
> 
> > -       if (INTEL_INFO(dev)->gen >= 4)
> > -               dev_priv->dma_status_page |= (dev_priv->dma_status_page >> 28) &
> > -                                            0xf0;
> > +       if (INTEL_INFO(dev)->gen >= 4) /* 36-bit addressing */
> > +               dev_priv->dma_status_page |=
> > +                       (dev_priv->status_page_dmah->busaddr >> 28) & 0xf0;
> 
> Don't think so. dev_priv->dma_status_page gets initialized to
> dev_priv->status_page_dmah->busaddr a few lines above, and it's 64 bit,
> so that diff doesn't change the result of the computation.

And here I was working on the assumption that the code to program a 32bit
register would indeed create a 32bit value.

So, I'm happy to use your patch to workaround the known erratum. I just
wish I had an explanation as to what is actually causing the corruption.
What I want to make sure is that we don't paper over a real bug by
thinking it is yet another silicon issue.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

  reply	other threads:[~2011-02-25 22:18 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20110223233022.GA3439@x61s.reliablesolutions.de>
2011-02-25 12:30 ` [PATCH] intel-gtt: fix memory corruption with GM965 and >4GB RAM Jan Niehusmann
2011-02-25 20:22   ` Chris Wilson
2011-02-25 21:16     ` Jan Niehusmann
2011-02-25 22:18       ` Chris Wilson [this message]
2011-02-25 23:05         ` Daniel Vetter
2011-02-28  6:46           ` [Intel-gfx] " Zhenyu Wang
2011-02-28 19:57             ` Daniel Vetter
2011-03-01 22:24           ` [PATCH] drm/i915: " Jan Niehusmann
2011-03-01 22:32             ` Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='b9dded$i2hik6@orsmga002.jf.intel.com' \
    --to=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jan@gondor.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox