From mboxrd@z Thu Jan 1 00:00:00 1970 From: Simon Kitching Subject: Re: radeon error on resume: "cp failed to get scratch reg" - anyone interested? Date: Thu, 20 Sep 2012 08:54:20 +0200 Message-ID: <505ABD9C.6080804@vonos.net> References: <50572040.2020107@vonos.net> <1348045973.3218.16.camel@thor.local> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1"; Format="flowed" Content-Transfer-Encoding: quoted-printable Return-path: Received: from dave.websupport.sk (dave.websupport.sk [195.210.29.69]) by gabe.freedesktop.org (Postfix) with ESMTP id 977BCA0902 for ; Wed, 19 Sep 2012 23:54:24 -0700 (PDT) In-Reply-To: <1348045973.3218.16.camel@thor.local> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org Errors-To: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org To: =?ISO-8859-1?Q?Michel_D=E4nzer?= Cc: dri-devel@lists.freedesktop.org List-Id: dri-devel@lists.freedesktop.org On 19/09/12 11:12, Michel D=E4nzer wrote: > On Mon, 2012-09-17 at 15:06 +0200, Simon Kitching wrote: >> Hi, >> >> I've noticed that on resume from suspend, dmesg reports: >> >> [21895.997536] [drm] radeon: 1 quad pipes, 2 z pipes initialized. >> [21896.012072] [drm] PCIE GART of 512M enabled (table at >> 0x0000000000040000). >> [21896.012082] radeon 0000:01:00.0: WB enabled >> [21896.012085] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr >> 0x0000000010000000 and cpu addr 0xffccb000 >> [21896.012138] [drm] radeon: ring at 0x0000000010001000 >> [21896.012157] [drm:r100_ring_test] *ERROR* radeon: cp failed to get >> scratch reg (-22). >> [21896.012158] [drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22). >> [21896.012160] radeon 0000:01:00.0: failed initializing CP (-22). >> >> Graphics still seems to work fine though. >> >> Is this info of interest to anyone? I'm happy to do additional testing >> if that is useful. > Does the patch below fix the failure to get a scratch register? > > > From 6780dca112752e1fc4e7d410441b39874565b754 Mon Sep 17 00:00:00 2001 > From: =3D?UTF-8?q?Michel=3D20D=3DC3=3DA4nzer?=3D > Date: Wed, 19 Sep 2012 11:09:14 +0200 > Subject: [PATCH] drm/radeon: Fix scratch register leak in IB test. > MIME-Version: 1.0 > Content-Type: text/plain; charset=3DUTF-8 > Content-Transfer-Encoding: 8bit > > Restructure the code to jump out via labels instead of directly returning > early. > --- > drivers/gpu/drm/radeon/r100.c | 12 ++++++------ > drivers/gpu/drm/radeon/r600.c | 12 ++++++------ > 2 files changed, 12 insertions(+), 12 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c > index 8acb34f..819c352 100644 > --- a/drivers/gpu/drm/radeon/r100.c > +++ b/drivers/gpu/drm/radeon/r100.c > @@ -3818,7 +3818,7 @@ int r100_ib_test(struct radeon_device *rdev, struct= radeon_ring *ring) > WREG32(scratch, 0xCAFEDEAD); > r =3D radeon_ib_get(rdev, RADEON_RING_TYPE_GFX_INDEX, &ib, 256); > if (r) { > - return r; > + goto free_scratch; > } > ib.ptr[0] =3D PACKET0(scratch, 0); > ib.ptr[1] =3D 0xDEADBEEF; > @@ -3831,13 +3831,11 @@ int r100_ib_test(struct radeon_device *rdev, stru= ct radeon_ring *ring) > ib.length_dw =3D 8; > r =3D radeon_ib_schedule(rdev, &ib, NULL); > if (r) { > - radeon_scratch_free(rdev, scratch); > - radeon_ib_free(rdev, &ib); > - return r; > + goto free_ib; > } > r =3D radeon_fence_wait(ib.fence, false); > if (r) { > - return r; > + goto free_ib; > } > for (i =3D 0; i < rdev->usec_timeout; i++) { > tmp =3D RREG32(scratch); > @@ -3853,8 +3851,10 @@ int r100_ib_test(struct radeon_device *rdev, struc= t radeon_ring *ring) > scratch, tmp); > r =3D -EINVAL; > } > - radeon_scratch_free(rdev, scratch); > +free_ib: > radeon_ib_free(rdev, &ib); > +free_scratch: > + radeon_scratch_free(rdev, scratch); > return r; > } > = > diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c > index d79c639..38b546e 100644 > --- a/drivers/gpu/drm/radeon/r600.c > +++ b/drivers/gpu/drm/radeon/r600.c > @@ -2638,7 +2638,7 @@ int r600_ib_test(struct radeon_device *rdev, struct= radeon_ring *ring) > r =3D radeon_ib_get(rdev, ring->idx, &ib, 256); > if (r) { > DRM_ERROR("radeon: failed to get ib (%d).\n", r); > - return r; > + goto free_scratch; > } > ib.ptr[0] =3D PACKET3(PACKET3_SET_CONFIG_REG, 1); > ib.ptr[1] =3D ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2); > @@ -2646,15 +2646,13 @@ int r600_ib_test(struct radeon_device *rdev, stru= ct radeon_ring *ring) > ib.length_dw =3D 3; > r =3D radeon_ib_schedule(rdev, &ib, NULL); > if (r) { > - radeon_scratch_free(rdev, scratch); > - radeon_ib_free(rdev, &ib); > DRM_ERROR("radeon: failed to schedule ib (%d).\n", r); > - return r; > + goto free_ib; > } > r =3D radeon_fence_wait(ib.fence, false); > if (r) { > DRM_ERROR("radeon: fence wait failed (%d).\n", r); > - return r; > + goto free_ib; > } > for (i =3D 0; i < rdev->usec_timeout; i++) { > tmp =3D RREG32(scratch); > @@ -2669,8 +2667,10 @@ int r600_ib_test(struct radeon_device *rdev, struc= t radeon_ring *ring) > scratch, tmp); > r =3D -EINVAL; > } > - radeon_scratch_free(rdev, scratch); > +free_ib: > radeon_ib_free(rdev, &ib); > +free_scratch: > + radeon_scratch_free(rdev, scratch); > return r; > } > = Sadly the patch does not fix the issue. However I agree that this patch = is a good thing - the original code does clearly leak scratch registers = in some failure conditions. Maybe this patch could also add the DRM_ERROR calls for these error = cases from the r600 version into the r100 version? Note that on failure of radeon_ib_schedule, the order of resource = releases *was* radeon_scratch_free radeon_ib_free and with this patch is: radeon_ib_free radeon_scratch_free This change looks correct to me, ie post-patch the frees are in reverse = order to original allocation. If you do submit this patch, please add "reviewed-by = skitching@vonos.net" or "tested-by" or whatever you think the = appropriate tag is. Thanks to this patch pointing me to the appropriate area, I have = actually found the real cause of the scratch-register leak, and will = post a candidate patch today. Regards,Simon