From mboxrd@z Thu Jan  1 00:00:00 1970
From: =?ISO-8859-1?Q?Christian_K=F6nig?= <deathsimple@vodafone.de>
Subject: Re: RFC: Radeon multi ring support branch
Date: Fri, 18 Nov 2011 13:19:08 +0100
Message-ID: <4EC64D3C.6060004@vodafone.de>
References: <4EABF8EC.3020701@vodafone.de>
	<20111115193212.GB2263@homer.localdomain>
	<4EC2F378.9090801@vodafone.de>
	<CAH3drwboczTeB7hS+pNw6ZWTa-acuXN_G_4pLZy5QK7Bgi2NgQ@mail.gmail.com>
	<4EC53551.8070005@vodafone.de>
	<CAH3drwYybHaP2Vh5JNdYR0FP0v36xud830Pi+wepzw56xHBByQ@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
Content-Transfer-Encoding: quoted-printable
Return-path: <dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org>
Received: from outgoing.email.vodafone.de (outgoing.email.vodafone.de
	[139.7.28.128])
	by gabe.freedesktop.org (Postfix) with ESMTP id 5C5969E9FF
	for <dri-devel@lists.freedesktop.org>;
	Fri, 18 Nov 2011 04:19:16 -0800 (PST)
In-Reply-To: <CAH3drwYybHaP2Vh5JNdYR0FP0v36xud830Pi+wepzw56xHBByQ@mail.gmail.com>
List-Unsubscribe: <http://lists.freedesktop.org/mailman/options/dri-devel>,
	<mailto:dri-devel-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <http://lists.freedesktop.org/archives/dri-devel>
List-Post: <mailto:dri-devel@lists.freedesktop.org>
List-Help: <mailto:dri-devel-request@lists.freedesktop.org?subject=help>
List-Subscribe: <http://lists.freedesktop.org/mailman/listinfo/dri-devel>,
	<mailto:dri-devel-request@lists.freedesktop.org?subject=subscribe>
Sender: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org
Errors-To: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org
To: Jerome Glisse <j.glisse@gmail.com>
Cc: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org

On 17.11.2011 17:58, Jerome Glisse wrote:
> 2011/11/17 Christian K=F6nig<deathsimple@vodafone.de>:
>> On 16.11.2011 01:24, Jerome Glisse wrote:
>>> Well as we don't specify on which value semaphore should wait on, i am
>>> prety sure the first ring to increment the semaphore will unblock all
>>> waiter. So if you have ring1 that want to wait on ring2 and ring3 as so=
on as
>>> ring2 or ring3 is done ring1 will go one while either ring2 or ring3 mi=
ght
>>> not be done. I will test that tomorrow but from doc i have it seems so.=
 Thus
>>> it will be broken with more than one ring, that would mean you have to
>>> allocate one semaphore for each ring couple you want to synchronize. No=
te
>>> that the usual case will likely be sync btw 2 ring.
>> Good point, but I played with it a bit more today and it is just behavin=
g as
>> I thought it would be. A single signal command will just unblock a single
>> waiter, even if there are multiple waiters currently for this semaphore,=
 the
>> only thing you can't tell is which waiter will come first.
> I am not talking about multiple waiter but about single waiter on multilp=
le
> signaler. Current implementation allocate one and only one semaphore
> not matter how much ring you want to synchronize with. Which means
> the single waiter will be unblock once only a single ring signal, which is
> broken if you want to wait for several rings.
Wait a moment, having a single semaphore doesn't means that there is =

only a single waiter, just take a look at the code again:

         for (i =3D 0; i < RADEON_NUM_RINGS; ++i) {
                 /* no need to sync to our own or unused rings */
                 if (i =3D=3D p->ring || !p->sync_to_ring[i])
                         continue;

                 if (!p->ib->fence->semaphore) {
                         r =3D radeon_semaphore_create(p->rdev, =

&p->ib->fence->semaphore);
                         if (r)
                                 return r;
                 }

                 radeon_semaphore_emit_signal(p->rdev, i, =

p->ib->fence->semaphore);
                 radeon_semaphore_emit_wait(p->rdev, p->ring, =

p->ib->fence->semaphore); <--------
         }

It is allocating a single semaphore, thats correct, but at the same time =

it emits multiple wait commands. So if we want to synchronize with just =

one ring, we only wait once on the semaphore, but if we want to =

synchronize with N rings we also wait N times on the same semaphore. =

Since we don't really care about the order in which the signal =

operations arrive that is pretty much ok, because when execution =

continues after the last wait command it has been made sure that every =

signal operation has been fired. The extended semaphore test in my =

repository is also checking for this condition, and it really seems to =

work fine.

>> I should also note that the current algorithm will just emit multiple wa=
it
>> operations to a single ring, and spread the signal operations to all oth=
er
>> rings we are interested in. That isn't very efficient, but should indeed
>> work quite fine.
> Well the cs patch you posted don't do that. It create one semaphore
> and emit a wait with that semaphore and emit signal to all ring. That
> was my point. But i agree that creating a semaphore for each ring
> couple you want to wait for will work.
>
>>> After retesting the first patch  drm/radeon: fix debugfs handling is NAK
>>> a complete no go.
>>>
>>> Issue is that radeon_debugfs_cleanup is call after rdev is free. This
>>> is why i used a static array. I forgot about that, i should have put a
>>> comment. I guess you built your kernel without debugfs or that you
>>> didn't tested to reload the module.
>> Mhm, I have tested it, seen the crash, and didn't thought that this is a
>> problem. Don't ask me why I can't understand it myself right now.
>>
>> Anyway, I moved the unregistering of the files into a separate function,
>> which is now called from radeon_device_fini instead of
>> radeon_debugfs_cleanup. That seems to work fine, at least if I haven't
>> missed something else.
> Please don't touch the debugfs stuff, it's fine the way it is. No need to
> change anything there.
At least for me there is need to change something, because using a =

static variable causes the debugfs files to only appear on the first =

card in the system. And in my configuration the first card is always the =

onboard APU, so I wondered for half a day why a locked up IB doesn't =

contains any data, until i finally recognized that I'm looking at the =

wrong GPU in the system.

>> I also merged your indention fixes and the fix for the never allocated
>> semaphores and pushed the result into my public repository
>> (http://cgit.freedesktop.org/~deathsimple/linux), so please take another
>> look at it.
>>
> I will take a look
>
Thanks, but please be aware that I'm going to also integrate Alex =

patches ontop of it today and also give it another test round, just to =

make sure that I'm not doing anything stupid with the debugfs code. So =

if you haven't done already please wait a bit more.

Thanks,
Christian.