public inbox for linux-kernel@vger.kernel.org
From: "Michel Dänzer" <michel@daenzer.net>
To: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Markus Trippelsdorf <markus@trippelsdorf.de>
Subject: Re: Radeon RS780 - BUG: unable to handle kernel NULL pointer dereference
Date: Tue, 09 Nov 2010 11:32:57 +0100
Message-ID: <1289298777.10682.63.camel@thor.local>
In-Reply-To: <4CD91D58.7080508@vmware.com>

On Tue, 2010-11-09 at 11:07 +0100, Thomas Hellstrom wrote:
> On 11/09/2010 10:53 AM, Thomas Hellstrom wrote:
> > On 11/09/2010 10:29 AM, Markus Trippelsdorf wrote:
> >> OK I've found the buggy commit by bisection:
> >>
> >> e376573f7267390f4e1bdc552564b6fb913bce76 is the first bad commit
> >> commit e376573f7267390f4e1bdc552564b6fb913bce76
> >> Author: Michel Dänzer<daenzer@vmware.com>
> >> Date:   Thu Jul 8 12:43:28 2010 +1000
> >>
> >>      drm/radeon: fall back to GTT if bo creation/validation in VRAM fails.
> >>
> >>      This fixes a problem where on low VRAM cards we'd run out of space for validation.
> >>
> >>      [airlied: Tested on my M7, Thinkpad T42, compiz works with no problems.]
> >>
> >>      Signed-off-by: Michel Dänzer<daenzer@vmware.com>
> >>      Cc: stable@kernel.org
> >>      Signed-off-by: Dave Airlie<airlied@redhat.com>
> >>
> >> Please note that this is an old commit from 2.6.36-rc. When I revert it,
> >> the kernel no longer crashes. Instead I see the following in my dmesg:
> >>
> >
> > Hmm, so this sounds like something in the Radeon eviction error path 
> > is causing corruption.
> > I had a similar problem with vmwgfx, when I tried to unref a BO 
> > _after_ ttm_bo_init() failed.
> > ttm_bo_init() is really supposed to call unref itself for various 
> > reasons,  so calling unref() or kfree() after a failed ttm_bo_init() 
> > will cause corruption.
> >
> > In any case, the error below also suggests something is a bit fragile 
> > in the Radeon driver:
> >
> > First, an accelerated eviction may fail, like in the message below, 
> > but then there must always be a backup plan, like unaccelerated 
> > eviction to system. On BO creation, there are a number of placement 
> > strategies, but if all else fails, it should be possible to initially 
> > place the BO in system memory.
> >
> > Second, if bo validation fails during a command submission, due to 
> > insufficient VRAM / TT, then the driver should retry the complete 
> > validation cycle after first blocking all other validators and then 
> > evicting everything not pinned, to avoid failures due to fragmentation.
> >
> > /Thomas
> >
> 
> Indeed, it seems like the commit you mention just retries ttm_bo_init() 
> after it previously failed. At that point the bo has been destroyed, so 
> that is probably what's causing the BUG you are seeing.
> 
> Admittedly, ttm_bo_init() calling unref on failure is not properly 
> documented in the function description. The reason for doing so is to 
> have a single path for freeing all BO resources already allocated at the 
> point of failure.
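
The ownership rule described above can be sketched in plain userspace C.
This is only an illustration under stated assumptions: toy_bo,
toy_bo_init() and toy_bo_create() are hypothetical stand-ins, not the real
TTM or radeon API, and the init function is rigged so that VRAM placement
always fails. The point it tries to show is that when the init function
frees the object on its own error path, the caller has to allocate again
before retrying.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer object; not the real struct radeon_bo. */
struct toy_bo {
	int domain;
};

enum { TOY_DOMAIN_VRAM = 1, TOY_DOMAIN_GTT = 2 };

/*
 * Hypothetical stand-in for ttm_bo_init(): on failure it frees the object
 * itself, so the caller must not touch or free the pointer afterwards.
 * VRAM placement always fails here, to force the fallback path.
 */
static int toy_bo_init(struct toy_bo *bo, int domain)
{
	if (domain == TOY_DOMAIN_VRAM) {
		free(bo);	/* init owns cleanup on the error path */
		return -ENOMEM;
	}
	bo->domain = domain;
	return 0;
}

/* Create an object, falling back from VRAM to GTT if init fails. */
int toy_bo_create(int domain, struct toy_bo **out)
{
	struct toy_bo *bo;
	int r;

	assert(out != NULL);
	*out = NULL;

retry:
	/* Allocate after the retry label: a failed init already freed bo. */
	bo = calloc(1, sizeof(*bo));
	if (bo == NULL)
		return -ENOMEM;

	r = toy_bo_init(bo, domain);
	if (r != 0) {
		/* bo is dangling at this point; do not dereference or free it. */
		if (domain == TOY_DOMAIN_VRAM) {
			domain = TOY_DOMAIN_GTT;
			goto retry;
		}
		return r;
	}

	*out = bo;
	return 0;
}
```

Calling toy_bo_create(TOY_DOMAIN_VRAM, &bo) in this sketch fails once in
toy_bo_init(), allocates a fresh object, and succeeds with the GTT domain.
If the retry label sat after the allocation instead, the second
toy_bo_init() call would operate on freed memory.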

Does the patch below fix the problem?


commit e224472eedbda391ddb6d8b88f26e82e1c3b036b
Author: Michel Dänzer <daenzer@vmware.com>
Date:   Tue Nov 9 11:30:41 2010 +0100

    drm/radeon/kms: Fix retrying ttm_bo_init() after it failed once.
    
    If ttm_bo_init() returns failure, it already destroyed the BO, so we need to
    retry from scratch.
    
    Signed-off-by: Michel Dänzer <daenzer@vmware.com>
    Cc: stable@kernel.org

diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 1b9004e..bbe92d5 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -102,6 +102,8 @@ int radeon_bo_create(struct radeon_device *rdev, struct drm_gem_object *gobj,
 		type = ttm_bo_type_device;
 	}
 	*bo_ptr = NULL;
+
+retry:
 	bo = kzalloc(sizeof(struct radeon_bo), GFP_KERNEL);
 	if (bo == NULL)
 		return -ENOMEM;
@@ -109,8 +111,6 @@ int radeon_bo_create(struct radeon_device *rdev, struct drm_gem_object *gobj,
 	bo->gobj = gobj;
 	bo->surface_reg = -1;
 	INIT_LIST_HEAD(&bo->list);
-
-retry:
 	radeon_ttm_placement_from_domain(bo, domain);
 	/* Kernel allocation are uninterruptible */
 	mutex_lock(&rdev->vram_mutex);


-- 
Earthling Michel Dänzer           |                http://www.vmware.com
Libre software enthusiast         |          Debian, X and DRI developer
