From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?ISO-8859-1?Q?Christian_K=F6nig?= Subject: Re: [PATCH] drm/radeon: make 64bit fences more robust Date: Tue, 11 Sep 2012 12:11:45 +0200 Message-ID: <504F0E61.9030900@vodafone.de> References: <1347268383-4150-1-git-send-email-deathsimple@vodafone.de> <1347275537.30263.368.camel@thor.local> <504DD6D0.1090307@vodafone.de> <1347291510.30263.394.camel@thor.local> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------010207060004050309070002" Return-path: Received: from outgoing.email.vodafone.de (outgoing.email.vodafone.de [139.7.28.128]) by gabe.freedesktop.org (Postfix) with ESMTP id 20FA6A029F for ; Tue, 11 Sep 2012 03:11:49 -0700 (PDT) In-Reply-To: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org Errors-To: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org To: Jerome Glisse Cc: =?ISO-8859-1?Q?Michel_D=E4nzer?= , dri-devel@lists.freedesktop.org List-Id: dri-devel@lists.freedesktop.org This is a multi-part message in MIME format. --------------010207060004050309070002 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: quoted-printable On 10.09.2012 18:07, Jerome Glisse wrote: > On Mon, Sep 10, 2012 at 11:52 AM, Jerome Glisse wr= ote: >> On Mon, Sep 10, 2012 at 11:38 AM, Michel D=E4nzer = wrote: >>> On Mon, 2012-09-10 at 14:02 +0200, Christian K=F6nig wrote: >>>> On 10.09.2012 13:12, Michel D=E4nzer wrote: >>>>> On Mon, 2012-09-10 at 11:13 +0200, Christian K=F6nig wrote: >>>>>> Only increase the higher 32bits if we really detect a wrap around. >>>>>> >>>>>> Fixes: >>>>>> https://bugs.freedesktop.org/show_bug.cgi?id=3D54129 >>>>>> https://bugs.freedesktop.org/show_bug.cgi?id=3D54662 >>>>>> >>>>>> Possible fixes: >>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=3D846505 >>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=3D845639 >>>>>> >>>>>> Signed-off-by: Christian K=F6nig >>>>>> Cc: stable@vger.kernel.org >>>>>> --- >>>>>> drivers/gpu/drm/radeon/radeon_fence.c | 6 +++--- >>>>>> 1 file changed, 3 insertions(+), 3 deletions(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/d= rm/radeon/radeon_fence.c >>>>>> index 7b737b9..4781e13 100644 >>>>>> --- a/drivers/gpu/drm/radeon/radeon_fence.c >>>>>> +++ b/drivers/gpu/drm/radeon/radeon_fence.c >>>>>> @@ -160,7 +160,7 @@ void radeon_fence_process(struct radeon_device= *rdev, int ring) >>>>>> do { >>>>>> seq =3D radeon_fence_read(rdev, ring); >>>>>> seq |=3D last_seq & 0xffffffff00000000LL; >>>>>> - if (seq < last_seq) { >>>>>> + if (seq < (last_seq - 0x80000000LL)) { >>>>>> seq +=3D 0x100000000LL; >>>>>> } >>>>> Can you provide a bit more explanation for this change? In particul= ar, >>>>> how could the code previously detect a wraparound when there was no= ne, >>>>> and why is this the proper fix? >>>> Honestly I also don't really understand how this bug happened in the >>>> first place. >>>> >>>> We extend the 32bit fences supported by hardware by testing if a >>>> previously read fence value is smaller than the value we read now: >>>> >>>>> if (seq < last_seq) { >>>> But the problem seems to be that on some systems we do get fence val= ues >>>> that are decreasing, e.g. instead of 5, 6, 7, 8 we get 5, 7, 6, 8 (o= r >>>> maybe 5, 6, 0, 7, 8 because somebody accidentally overwrites the fen= ce >>>> value). >>> Maybe some kind of race involving radeon_fence_write()? >>> >>> >>>> It might be related to a hardware bug, or the algorithm is flawed in= a >>>> way I currently don't see. Anyway the old code we had wasn't so pick= y >>>> about such problems and the patch just tries to make the current cod= e as >>>> robust as the old code was, which indeed seems to solve the problems= we see. >>>> >>>> The wrap around detection still works (tested by setting the initial >>>> fence value to 0xfffffff0 and letting it wrap around shortly after >>>> start), so I think it we can safely commit this. >>> Without knowing exactly what kind of hardware fence value pattern cau= sed >>> the problem, we can't be sure that the wraparound handling will work >>> reliably, or that the values going backwards won't cause other proble= ms. >>> I think it would be good to get more real-world data on that. >>> >> As i said in my email this patch just postpone the issue to last_fence >>> =3D 0x1 8000 0001 if fence value we read back is sometimes randomly 0= . >> If we received fence value out of order (which i highly doubt as old >> code would have had same issue thought on smaller scale) then if fence >> value 0x1 8000 0001 is received before fence value 0x1 8000 0000 we >> are right back to all future fence considered as signalled (again this >> will take month of uptime). > Actually thinking back about it if fence are just received out of > order then this patch corner case is if we received 0x1 ffff ffff > after receiving 0x1 0000 0000, what will happen is that the 0x1 0000 > 0000 is the wrap over that will trigger upper 32bits to be incremented > so fence become 0x2 0000 0000 then we got 0xffff ffff which with | > become 0x2 ffff ffff then we get next fence value 0x0000 0001 and > again we increment upper 32bits so last seq become 0x3 0000 0001. Good point. > Again this will happen after month of uptime and all it does is > decrement the amount of uptime for which 64bit fence are fine ie at > worst we over increment by 0x2 0000 0000 instead of 0x1 0000 0000 on > wrap around. How about this idea: Instead of increasing the upper 32bits we just use=20 the upper 32bits of the last emitted fence value? E.g. see the attached patch. That both should handle random zero and out=20 of order values more gracefully. Additionally I think that the reason we haven't had this before is that=20 this corruption might only happens on hw (re-)initialisation, e.g. boot=20 and resume. Currently I'm hacking together a small test app that just emits an IB=20 with some NOP instructions, if I'm not completely wrong that should=20 gives us a very high fence rate, so we might be able to actually test=20 the wrap around a bit more. Christian. > Cheers, > Jerome > >> All this probably lead to questioning the usefulness of 64bits fence. >> >> Cheers, >> Jerome --------------010207060004050309070002 Content-Type: text/x-patch; name="0001-drm-radeon-make-64bit-fences-more-robust-v2.patch" Content-Disposition: attachment; filename*0="0001-drm-radeon-make-64bit-fences-more-robust-v2.patch" Content-Transfer-Encoding: quoted-printable >>From 8737d17a45e04d7c111abb5e79e48577b224fae6 Mon Sep 17 00:00:00 2001 From: =3D?UTF-8?q?Christian=3D20K=3DC3=3DB6nig?=3D Date: Sun, 9 Sep 2012 11:45:19 +0200 Subject: [PATCH] drm/radeon: make 64bit fences more robust v2 MIME-Version: 1.0 Content-Type: text/plain; charset=3DUTF-8 Content-Transfer-Encoding: 8bit Only increase the higher 32bits if we really detect a wrap around. v2: instead of increasing the higher 32bits just use the higher 32bits from the last emitted fence. Signed-off-by: Christian K=C3=B6nig Cc: stable@vger.kernel.org --- drivers/gpu/drm/radeon/radeon_fence.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/rade= on/radeon_fence.c index 7b737b9..a263513 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -161,10 +161,12 @@ void radeon_fence_process(struct radeon_device *rde= v, int ring) seq =3D radeon_fence_read(rdev, ring); seq |=3D last_seq & 0xffffffff00000000LL; if (seq < last_seq) { - seq +=3D 0x100000000LL; + seq &=3D 0xffffffff; + seq |=3D rdev->fence_drv[ring].sync_seq[ring] & + 0xffffffff00000000LL; } =20 - if (seq =3D=3D last_seq) { + if (seq <=3D last_seq) { break; } /* If we loop over we don't want to return without --=20 1.7.9.5 --------------010207060004050309070002 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel --------------010207060004050309070002--