From: "Michel Dänzer" <michel@daenzer.net>
To: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Thomas Hellstrom <thellstrom@vmware.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"dri-devel@lists.freedesktop.org"
<dri-devel@lists.freedesktop.org>
Subject: Re: Radeon RS780 - BUG: unable to handle kernel NULL pointer dereference
Date: Tue, 09 Nov 2010 11:52:27 +0100
Message-ID: <1289299947.10682.68.camel@thor.local>
In-Reply-To: <20101109103737.GA1767@arch.trippelsdorf.de>
On Tue, 2010-11-09 at 11:37 +0100, Markus Trippelsdorf wrote:
> On Tue, Nov 09, 2010 at 11:32:57AM +0100, Michel Dänzer wrote:
> > On Tue, 2010-11-09 at 11:07 +0100, Thomas Hellstrom wrote:
> > > On 11/09/2010 10:53 AM, Thomas Hellstrom wrote:
> > > > On 11/09/2010 10:29 AM, Markus Trippelsdorf wrote:
> > > >> OK I've found the buggy commit by bisection:
> > > >>
> > > >> e376573f7267390f4e1bdc552564b6fb913bce76 is the first bad commit
> > > >> commit e376573f7267390f4e1bdc552564b6fb913bce76
> > > >> Author: Michel Dänzer<daenzer@vmware.com>
> > > >> Date: Thu Jul 8 12:43:28 2010 +1000
> > > >>
> > > >> drm/radeon: fall back to GTT if bo creation/validation in VRAM
> > > >> fails.
> > > >>
> > > >> This fixes a problem where on low VRAM cards we'd run out of
> > > >> space for validation.
> > > >>
> > > >> [airlied: Tested on my M7, Thinkpad T42, compiz works with no
> > > >> problems.]
> > > >>
> > > >> Signed-off-by: Michel Dänzer<daenzer@vmware.com>
> > > >> Cc: stable@kernel.org
> > > >> Signed-off-by: Dave Airlie<airlied@redhat.com>
> > > >>
> > > >> Please note that this is an old commit from 2.6.36-rc. When I revert
> > > >> it the kernel no longer crashes. Instead I see the following in my
> > > >> dmesg:
> > > >>
> > > >
> > > > Hmm, so this sounds like something in the Radeon eviction error path
> > > > is causing corruption.
> > > > I had a similar problem with vmwgfx, when I tried to unref a BO
> > > > _after_ ttm_bo_init() failed.
> > > > ttm_bo_init() is really supposed to call unref itself for various
> > > > reasons, so calling unref() or kfree() after a failed ttm_bo_init()
> > > > will cause corruption.
> > > >
> > > > In any case, the error below also suggests something is a bit fragile
> > > > in the Radeon driver:
> > > >
> > > > First, an accelerated eviction may fail, like in the message below,
> > > > but then there must always be a backup plan, like unaccelerated
> > > > eviction to system. On BO creation, there are a number of placement
> > > > strategies, but if all else fails, it should be possible to initially
> > > > place the BO in system memory.
> > > >
> > > > Second, if BO validation fails during a command submission, due to
> > > > insufficient VRAM / TT, then the driver should retry the complete
> > > > validation cycle after first blocking all other validators and then
> > > > evicting everything not pinned, to avoid failures due to fragmentation.
> > > >
> > > > /Thomas
> > > >
> > >
> > > Indeed, it seems like the commit you mention just retries ttm_bo_init()
> > > after it previously failed. At that point the bo has been destroyed, so
> > > that is probably what's causing the BUG you are seeing.
> > >
> > > Admittedly, ttm_bo_init() calling unref on failure is not properly
> > > documented in the function description. The reason for doing so is to
> > > have a single path for freeing all BO resources already allocated at the
> > > point of failure.
> >
> > Does the patch below fix the problem?
>
> Yes, indeed. I was just about to send the same patch to the list.
>
> Thanks.
Thank you for testing and confirming the fix, and thanks to Thomas for
analysing the problem.
I've submitted the fix to Dave with your Tested-by: added.
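
For readers following the analysis quoted above, here is a minimal,
self-contained sketch of the ownership rule Thomas describes: an init
function that destroys the object itself on failure, so a caller that
wants to retry with a different placement has to start from a fresh
allocation. This is not the patch referred to in this thread, and every
name in it (fake_bo, fake_bo_init, fake_bo_create, vram_free,
live_objects) is hypothetical; it only models the contract, not the
real TTM or radeon code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the real TTM or radeon API. */
enum placement { PLACE_VRAM, PLACE_GTT };

struct fake_bo {
    enum placement domain;
    size_t size;
};

static int live_objects;        /* allocations not yet destroyed */
static size_t vram_free = 4096; /* pretend VRAM budget in bytes */

static struct fake_bo *fake_bo_alloc(void)
{
    struct fake_bo *bo = calloc(1, sizeof(*bo));

    if (bo)
        live_objects++;
    return bo;
}

static void fake_bo_destroy(struct fake_bo *bo)
{
    live_objects--;
    free(bo);
}

/*
 * Models the contract discussed in the thread: on failure the init
 * function destroys the object itself, so the caller must not free
 * or reuse `bo` afterwards.
 */
static int fake_bo_init(struct fake_bo *bo, size_t size,
                        enum placement domain)
{
    if (domain == PLACE_VRAM && size > vram_free) {
        fake_bo_destroy(bo); /* single cleanup path on failure */
        return -1;
    }
    bo->size = size;
    bo->domain = domain;
    if (domain == PLACE_VRAM)
        vram_free -= size;
    return 0;
}

/* Try VRAM first, then fall back to GTT with a freshly allocated object. */
static struct fake_bo *fake_bo_create(size_t size)
{
    const enum placement order[] = { PLACE_VRAM, PLACE_GTT };

    assert(size > 0);
    for (size_t i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
        struct fake_bo *bo = fake_bo_alloc();

        if (!bo)
            return NULL;
        if (fake_bo_init(bo, size, order[i]) == 0)
            return bo;
        /* bo was destroyed by fake_bo_init(); do not touch it here. */
    }
    return NULL;
}
```

The buggy pattern would be to call fake_bo_init() a second time on the
same pointer after the first call failed; in this model that would be a
use-after-free, which is the kind of corruption described above.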
--
Earthling Michel Dänzer | http://www.vmware.com
Libre software enthusiast | Debian, X and DRI developer