From mboxrd@z Thu Jan  1 00:00:00 1970
From: =?ISO-8859-1?Q?Christian_K=F6nig?= <deathsimple@vodafone.de>
Subject: Re: [PATCH] drm/radeon: add an exclusive lock for GPU reset
Date: Tue, 03 Jul 2012 11:26:43 +0200
Message-ID: <4FF2BAD3.7030402@vodafone.de>
References: <1341243563-2559-1-git-send-email-j.glisse@gmail.com>
	<4FF1CBB5.2060208@vodafone.de>
	<CAH3drwa9f70aAhTqvo=+G0jv8tOyCuGoesoBQARi5jWAjSwJ1A@mail.gmail.com>
	<4FF1D4EF.3040806@vodafone.de>
	<CAH3drwZUfMwFgt6SWZ98OH2Nvhs6QVbcF+TmBuaq9p4HYN4tgA@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
Content-Transfer-Encoding: quoted-printable
Return-path: <dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org>
Received: from outgoing.email.vodafone.de (outgoing.email.vodafone.de
	[139.7.28.128])
	by gabe.freedesktop.org (Postfix) with ESMTP id BA2C89E804
	for <dri-devel@lists.freedesktop.org>;
	Tue,  3 Jul 2012 02:26:47 -0700 (PDT)
In-Reply-To: <CAH3drwZUfMwFgt6SWZ98OH2Nvhs6QVbcF+TmBuaq9p4HYN4tgA@mail.gmail.com>
List-Unsubscribe: <http://lists.freedesktop.org/mailman/options/dri-devel>,
	<mailto:dri-devel-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <http://lists.freedesktop.org/archives/dri-devel>
List-Post: <mailto:dri-devel@lists.freedesktop.org>
List-Help: <mailto:dri-devel-request@lists.freedesktop.org?subject=help>
List-Subscribe: <http://lists.freedesktop.org/mailman/listinfo/dri-devel>,
	<mailto:dri-devel-request@lists.freedesktop.org?subject=subscribe>
Sender: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org
Errors-To: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org
To: Jerome Glisse <j.glisse@gmail.com>
Cc: Jerome Glisse <jglisse@redhat.com>, dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org

On 02.07.2012 19:27, Jerome Glisse wrote:
> On Mon, Jul 2, 2012 at 1:05 PM, Christian K=F6nig <deathsimple@vodafone.d=
e> wrote:
>> On 02.07.2012 18:41, Jerome Glisse wrote:
>>> On Mon, Jul 2, 2012 at 12:26 PM, Christian K=F6nig
>>> <deathsimple@vodafone.de> wrote:
>>>> On 02.07.2012 17:39, j.glisse@gmail.com wrote:
>>>>> From: Jerome Glisse <jglisse@redhat.com>
>>>>>
>>>>> GPU reset need to be exclusive, one happening at a time. For this
>>>>> add a rw semaphore so that any path that trigger GPU activities
>>>>> have to take the semaphore as a reader thus allowing concurency.
>>>>>
>>>>> The GPU reset path take the semaphore as a writer ensuring that
>>>>> no concurrent reset take place.
>>>>>
>>>>> Signed-off-by: Jerome Glisse <jglisse@redhat.com>
>>>> NAK, that isn't as bad as the cs mutex was but still to complicated. I=
t's
>>>> just too far up in the call stack, e.g. it tries to catch ioctl
>>>> operations,
>>>> instead of catching the underlying hardware operation which is caused =
by
>>>> the
>>>> ioctl/ttm/etc...
>>>>
>>>> Why not just take the ring look as I suggested?
>>>>
>>>>
>>> No we can't use ring lock, we need to protect suspend/resume path and
>>> we need an exclusive lock for that so we need a reset mutex at the
>>> very least. But instead of having a reset mutex i prefer using a rw
>>> lock so that we can stop ioctl until a reset goes through an let
>>> pending ioctl take proper action. Think about multiple context trying
>>> to reset GPU ...
>>>
>>> Really this is the best option, the rw locking wont induce any lock
>>> contention execept in gpu reset case which is anyway bad news.
>> Why? That makes no sense to me. Well I don't want to prevent lock
>> contention, but understand why we should add locking at the ioctl level.
>> That violates locking rule number one "lock data instead of code" (or in=
 our
>> case "lock hardware access instead of code path") and it really is the
>> reason why we ended up with the cs_mutex protecting practically everythi=
ng.
>>
>> Multiple context trying to reset the GPU should be pretty fine, current =
it
>> would just reset the GPU twice, but in the future asic_reset should be m=
uch
>> more fine grained and only reset the parts of the GPU which really needs=
 an
>> reset.
>>
>> Cheers,
>> Christian.
> No multiple reset is not fine, try it your self and you will see all
> kind of oops (strongly advise you to sync you hd before stress testing
> that). Yes we need to protect code path because suspend/resume code
> path is special one it touch many data in the device structure. GPU
> reset is a rare event or should be a rare event.
Yeah, but I thought that fixing those oops as the second step. I see the =

fact that suspend/resume is unpinning all the ttm memory and then =

pinning it again as a bug that needs to be fixed. Or as an alternative =

we could split the suspend/resume calls into suspend/disable and =

resume/enable calls, where we only call disable/enable in the gpu_reset =

path rather than a complete suspend/resume (not really sure about that).

Also a GPU reset isn't such a rare event, currently it just occurs when =

userspace is doing something bad, for example submitting an invalid =

shader or stuff like that. But with VM and IO protection coming into the =

driver we are going to need a GPU reset when there is an protection =

fault, and with that it is really desirable to just reset the hardware =

in a way where we can say: This IB was faulty skip over it and resume =

with whatever is after it on the ring.

And todo that we need to keep the auxiliary data like sub allocator =

memory, blitting shader bo, and especially vm page tables at the same =

place in hardware memory.

> I stress it we need at very least a mutex to protect gpu reset and i
> will stand on that position because there is no other way around.
> Using rw lock have a bonus of allowing proper handling of gpu reset
> failure and that what the patch i sent to linus fix tree is about, so
> to make drm next merge properly while preserving proper behavior in
> gpu reset failure the rw semaphore is the best option.
Oh well, I'm not arguing that we don't need a look here. I'm just =

questioning to put it at the ioctl level (e.g. the driver entry from =

userspace), that wasn't such a good idea with the cs_mutex and doesn't =

seems like a good idea now. Instead we should place the looking between =

the ioctl/ttm and the actual hardware submission and that brings it =

pretty close (if not identically) to the ring mutex.

Cheers,
Christian.