From: Jochen Rollwagen
Subject: possible regression Radeon RV280 (R3xx/R4xx ?) card freeze, re-apply old patch ?
Date: Fri, 08 Nov 2013 08:35:32 +0100
Message-ID: <527C9444.4010505@t-online.de>
In-Reply-To: <526A6C61.1030405@t-online.de>
To: dri-devel@lists.freedesktop.org

Hello there,

I *think* I have found a regression (card/system freeze in AGP mode) that must have been in the drm code for quite some time (since the switch to KMS drivers), and possibly also a solution (re-applying an old patch from pre-KMS days). Affected seem to be older cards (actually, very old cards :-) before R600. I mailed this to the ati driver mailing list, but was told that this is a kernel/drm subject now, so I am forwarding the mail exchange to this list. Details are below; the messages are in reverse chronological order, so start reading from the end.

Could somebody give me a hint on how to re-apply the old patch, or tell me whether the info I found is valid? The next step I would take is to insert some diagnostic messages in radeon_vram_location (see below) and build a new kernel.
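
A minimal sketch of what such a diagnostic could report, written as a standalone C program fragment rather than kernel code. The struct and both function names are hypothetical stand-ins (the real driver keeps comparable values in struct radeon_mc, and would log with DRM_INFO rather than snprintf); the check assumes the VRAM size is a power of two.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of the values worth logging; not the real
 * struct radeon_mc, only a stand-in for illustration. */
struct mc_snapshot {
    uint64_t aper_base;   /* PCI aperture base address */
    uint64_t vram_start;  /* where the driver placed VRAM */
    uint64_t vram_size;   /* VRAM size in bytes, assumed power of two */
};

/* Returns 1 if vram_start is a multiple of vram_size, else 0. */
int vram_start_is_aligned(const struct mc_snapshot *mc)
{
    assert(mc->vram_size != 0 && (mc->vram_size & (mc->vram_size - 1)) == 0);
    return (mc->vram_start & (mc->vram_size - 1)) == 0;
}

/* Formats one diagnostic line into buf; returns snprintf's result. */
int format_vram_diag(char *buf, size_t len, const struct mc_snapshot *mc)
{
    return snprintf(buf, len,
                    "aper_base=0x%08" PRIx64 " vram_start=0x%08" PRIx64
                    " vram_size=%" PRIu64 "M aligned=%s",
                    mc->aper_base, mc->vram_start,
                    mc->vram_size >> 20,
                    vram_start_is_aligned(mc) ? "yes" : "no");
}
```

Logging the alignment result next to the raw addresses would show directly whether the "VRAM start aligned on VRAM size" condition discussed below holds on a given machine.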

Cheers

Jochen


-------- Original Message --------
Subject: Fwd: Fwd: Fwd: Re: regression on RV280 card freeze, patch not applicable any more
Date: Fri, 25 Oct 2013 15:04:33 +0200
From: Jochen Rollwagen <joro-2013@t-online.de>
To: xorg-driver-ati@lists.x.org

more info (and possible solution):

The comment on radeon_vram_location() in radeon_device.c says:

 * Note: GTT start, end, size should be initialized before calling this
 * function on AGP platform.
 *
 * Note: We don't explicitly enforce VRAM start to be aligned on VRAM size,
 * this shouldn't be a problem as we are using the PCI aperture as a reference.
 * Otherwise this would be needed for rv280, all r3xx, and all r4xx, but
 * not IGP.
 *

So does this mean I just have to re-apply the old patch I found? struct radeon_mc in radeon.h has an aper_base member, which could be aligned to the VRAM size using the code snippet below.

Cheers

Jochen


-------- Original Message --------
Subject: Fwd: Fwd: Re: regression on RV280 card freeze, patch not applicable any more
Date: Fri, 25 Oct 2013 11:31:32 +0200
From: Jochen Rollwagen <joro-2013@t-online.de>
To: xorg-driver-ati@lists.x.org

I've done some more research and found the following:

- There is a follow-on patch ("Extend the alignment workaround to post-rv280 chips as well") to the one indicated below (http://cgit.freedesktop.org/~agd5f/xf86-video-ati/commit/?id=b2145aea36bb035bff048366c607b967d70fff49). It applies not only to RV280 but to "rv280, all r3xx, and all r4xx, but not IGP".

- The affected code seems to be (IMHO) in drivers/gpu/drm/radeon/: the (Radeon?) register RADEON_CONFIG_APER_0_BASE is defined in radeon_reg.h but never used in the driver:

radeon_reg.h:#define RADEON_CONFIG_APER_0_BASE          0x0100

In r100.c there is:

static u32 r100_get_accessible_vram(struct radeon_device *rdev)
{
    u32 aper_size;
    u8 byte;

    aper_size = RREG32(RADEON_CONFIG_APER_SIZE);

    /* Set HDP_APER_CNTL only on cards that are known not to be broken,
     * that is has the 2nd generation multifunction PCI interface
     */
    if (rdev->family == CHIP_RV280 ||
        rdev->family >= CHIP_RV350) {
        WREG32_P(RADEON_HOST_PATH_CNTL, RADEON_HDP_APER_CNTL,
                 ~RADEON_HDP_APER_CNTL);
        DRM_INFO("Generation 2 PCI interface, using max accessible memory\n");
        return aper_size * 2;
    }

That's the code executed on my machine according to dmesg. What seems to be missing (from the original patch, which no longer applies because of the driver reorganization) is:

CARD32 aper0_base = INREG(RADEON_CONFIG_APER_0_BASE);
aper0_base &= ~(mem_size - 1);
info->mc_fb_location = (aper0_base >> 16);
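
The masking step in that snippet can be tried outside the driver. The following is a minimal standalone sketch, not the driver's code: both function names are hypothetical, and it assumes mem_size is a power of two, which is what makes the mask ~(mem_size - 1) round the base down to a multiple of the VRAM size.

```c
#include <assert.h>
#include <stdint.h>

/* Round base down to a multiple of size. size must be a power of two,
 * which is what makes the mask ~(size - 1) valid. */
uint32_t align_down_pow2(uint32_t base, uint32_t size)
{
    assert(size != 0 && (size & (size - 1u)) == 0);
    return base & ~(size - 1u);
}

/* Hypothetical helper: the aligned aperture base reduced to its upper
 * 16 bits, as in the last line of the quoted snippet. */
uint32_t fb_location_from_aper(uint32_t aper0_base, uint32_t mem_size)
{
    return align_down_pow2(aper0_base, mem_size) >> 16;
}
```

For example, with a made-up aperture base of 0x9A000000 and 64 MB (0x04000000) of VRAM, align_down_pow2 yields 0x98000000 and fb_location_from_aper yields 0x9800.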

The patch that seems to have removed/overridden this code is:
http://www.mail-archive.com/dri-devel@lists.sourceforge.net/msg41307.html

According to that patch, it was "booted on PCI r100, PCIE rv370, IGP rs400". So IMHO this could be a classic regression for an AGP RV280 card (like mine), and it might explain why PCI mode works. This is additionally corroborated by this post (http://comments.gmane.org/gmane.comp.freedesktop.xorg/5429):

"* The above doesn't necessarily work. For example, I've seen machines
* with 128Mb configured as 2x64Mb apertures. I'm now _always_ setting
* RADEON_HOST_PATH_CNTL.

OUTREGP (RADEON_HOST_PATH_CNTL, RADEON_HDP_APER_CNTL, ~RADEON_HDP_APER_CNTL);

(which was previously done only on some chip families). I _think_ this is not correct on all cards as the apertures may not be configured correctly (and X doesn't set them up neither, if those correspond to the RADEON_CONFIG_APER registers)"

Could a Radeon guru confirm this, or am I totally lost?

Cheers

Jochen

-------- Original Message --------
Subject: Fwd: Re: regression on RV280 card freeze, patch not applicable any more
Date: Fri, 18 Oct 2013 15:32:18 +0200
From: Jochen Rollwagen <joro-2013@t-online.de>
To: xorg-driver-ati@lists.x.org


Sorry about that.

Anyway, I checked drivers/gpu/drm/radeon and
drivers/char/agp/uninorth-agp.c and can't seem to find the patch
indicated below. Might it have gone missing :-) ?


On 08.10.2013 18:41, Michel Dänzer wrote:
> [ Please always follow up to the mailing list ]
>
> On Tue, 2013-10-08 at 14:53 +0200, Jochen Rollwagen wrote:
>> On 08.10.2013 10:03, Michel Dänzer wrote:
>>> On Sat, 2013-10-05 at 15:13 +0200, Jochen Rollwagen wrote:
>>>> I’m running a RV280 based Radeon 9200 card (I know, an ancient card)
>>>> in a Mac Mini G4 (powerpc-architecture) with Ubuntu Precise and the
>>>> latest 3.4.64-kernel/ati driver and get lockups when trying to run the
>>>> card in AGP mode (KMS enabled). The lockups happen when resetting the
>>>> card (that’s what I can infer from the oops-screen).
>>> It's the other way around: The kernel radeon driver resets the card to
>>> try and get it running again after a lockup.
>>>
>>>> PCI mode works. After researching I found an old bug that was fixed
>>>> back in 2006 (https://bugs.freedesktop.org/show_bug.cgi?id=6011) that
>>>> looks like the freeze I experience (since PCI mode – which allocates
>>>> 64 MB of memory - works and AGP mode which by default allocates 256 MB
>>>> doesn’t). The card has 64 MB memory.
>>>>
>>>> So the first question is, could this be the problem that causes the
>>>> lockups ?
>>> Not really. The GART and VRAM memory apertures aren't directly related,
>>> and the fix for the bug above should still be incorporated in the
>>> current radeon KMS code.
>>>
>>> Does radeon.agpmode=1 or radeon.agpmode=4 work?
>>>
>> Thank you for your reply. First, none of the agpmodes work, they just
>> take more or less time to lockup the card (1 - slowest, 4 fastest).
>> Secondly, if you write that the fix "should be incorporated in the
>> current code", i'm somewhat lost because it definitely isn't there.
> It's in the kernel now.
>
Well........no. I checked the 3.4.64 kernel sources after my last mail
and the code isn't in the drivers/gpu/drm/radeon sources. But of course
I might have overlooked something.








