From mboxrd@z Thu Jan  1 00:00:00 1970
From: =?ISO-8859-1?Q?Christian_K=F6nig?= <deathsimple@vodafone.de>
Subject: Re: Include request for reset-rework branch.
Date: Tue, 01 May 2012 15:38:07 +0200
Message-ID: <4F9FE73F.9040508@vodafone.de>
References: <1335797464-14317-1-git-send-email-deathsimple@vodafone.de>
	<CAH3drwatHbcKNM32AAHkTTckAyy9LBE2LDL=TcHE545rEzHh+g@mail.gmail.com>
	<CAH3drwa5kFUyXJ31T7XC1Ux4nJ_rRHxsnPS16UX=4jS5jWVL8A@mail.gmail.com>
	<4F9EB1B7.2030308@vodafone.de>
	<CAH3drwZi54C2avaTVjMNCY-HAObiMCg1iOjfSthqVdtjRf+1_w@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
Content-Transfer-Encoding: quoted-printable
Return-path: <dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org>
Received: from outgoing.email.vodafone.de (outgoing.email.vodafone.de
	[139.7.28.128])
	by gabe.freedesktop.org (Postfix) with ESMTP id 7E2DBA026E
	for <dri-devel@lists.freedesktop.org>;
	Tue,  1 May 2012 06:38:11 -0700 (PDT)
In-Reply-To: <CAH3drwZi54C2avaTVjMNCY-HAObiMCg1iOjfSthqVdtjRf+1_w@mail.gmail.com>
List-Unsubscribe: <http://lists.freedesktop.org/mailman/options/dri-devel>,
	<mailto:dri-devel-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <http://lists.freedesktop.org/archives/dri-devel>
List-Post: <mailto:dri-devel@lists.freedesktop.org>
List-Help: <mailto:dri-devel-request@lists.freedesktop.org?subject=help>
List-Subscribe: <http://lists.freedesktop.org/mailman/listinfo/dri-devel>,
	<mailto:dri-devel-request@lists.freedesktop.org?subject=subscribe>
Sender: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org
Errors-To: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org
To: Jerome Glisse <j.glisse@gmail.com>
Cc: airlied@redhat.com, dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org

On 30.04.2012 18:26, Jerome Glisse wrote:
> On Mon, Apr 30, 2012 at 11:37 AM, Christian K=F6nig
> <deathsimple@vodafone.de>  wrote:
>> On 30.04.2012 17:12, Jerome Glisse wrote:
>>> On Mon, Apr 30, 2012 at 11:12 AM, Jerome Glisse<j.glisse@gmail.com>
>>>   wrote:
>>>> On Mon, Apr 30, 2012 at 10:50 AM, Christian K=F6nig
>>>> <deathsimple@vodafone.de>    wrote:
>>>>> Hi Dave,
>>>>>
>>>>> if nobody has a last moment concern please include the following patc=
hes
>>>>> in drm-next.
>>>>>
>>>>> Except for some minor fixes they have already been on the list for qu=
ite
>>>>> some time,
>>>>> but I intentional left out the debugfs related patches cause we haven=
't
>>>>> finished the
>>>>> discussion about them yet.
>>>>>
>>>>> If you prefer to merge them directly, I also made them available as
>>>>> reset-rework
>>>>> branch here: git://people.freedesktop.org/~deathsimple/linux
>>>>>
>>>>> Cheers,
>>>>> Christian.
>>>>>
>>>> I am not completely ok, i am against patch 5. I need sometime to review
>>>> it.
>>>>
>>>> Cheers,
>>>> Jerome
>>> Sorr mean patch 7
>> I just started to wonder :) what's wrong with patch 7?
>>
>> Please keep in mind that implementing proper locking in the lower level
>> objects allows us to remove quite some locking in the upper layers.
>>
>> By the way, do you mind when I dig into the whole locking stuff of the
>> kernel driver a bit more? There seems to be allot of possibilities to
>> cleanup and simplify the overall driver.
>>
>> Cheers,
>> Christian.
> Well when it comes to lock, i have some kind of an idea i wanted to
> work for a while. GPU is transaction based in a sense, we feed rings
> and it works on them, 90% of the work is cs ioctl so we should be able
> to have one and only one lock (ignoring the whole modesetting path
> here for which i believe one lock too is enough). Things like PM would
> need to take all lock but hey i believe that's a good thing as my
> understanding is PM we do right now need most of the GPU idle.
>
> So here is what we have
> ih spinlock (can be ignored ie left as is)
> irq spinlock (can be ignored ie left as is)
> blit mutex
> pm mutex
> cs mutex
> dc_hw_i2c mutex
> vram mutex
> ib mutex
> rings[] lock
>
> So the real issue is ttm calling back into the driver. So the idea i
> had is to have a work thread that is the only one allowed to mess with
> the GPU. The work thread would use some locking in which it has
> preference over the writer (writer being cs ioctl or ttm call back or
> anything else that needs to schedule GPU work). This would require
> only two lock. I am actually thinking of having 2 list one where
> writer add there "transaction" and one where the worker empty the
> transaction, something like:
>
> cs_ioct
> {
> sa_alloc_for_trans
> ....
> lock(trans_lock)
> list_add_tail(trans, trans_temp_list)
> unlock(trans_lock)
> }
>
> workeur
> {
> lock(trans_lock)
> list_splice_tail(trans_list, trans_temp_list)
> unlock(trans_lock)
> // schedule ib
> ....
> // process fence&  semaphore
> }
>
> So there would be one transaction lock and one lock for the
> transaction memory allocation (ib, semaphore and alike) The workeur
> would also be responsible for GPU reset, there wouldn't be any ring
> lock as i believe one worker is enough for all rings.
>
> For transaction the only issue really is ttm, cs is easy it's an
> ib+fence+semaphore, ttm can be more complex, it can be bo move, bind,
> unbind, ... Anyway it's doable and it's design i had in mind for a
> while.
Well, that sounds like the direction to go, but I would like to even =

avoid the submission thread, since the GPU is working on the rings =

asynchronously anyway.

The locking problems we are currently seeing are more a result of =

abusing global variables for local data. or locking to much code with to =

few primitives, instead of just having to much locking primitives over =

all. For example, I really can't see why we have a blit mutex for the =

r600 shader blit code? Also retrospective having one mutex per ring does =

sound a bit superfluous.

To sum my ideas up:

1. I suggest that memory management is self containing, e.g. you can =

request small amounts of memory for IBs, fences, semaphores, blitting =

vertex buffer etc.. without worrying about others doing that at the same =

time as you. That really sounds like your "lock for the transaction =

memory allocation", and I'm pretty sure that I've extended the SA so far =

to play that role pretty well.

2. Have exactly ONE ring submission mutex. This mutex is taken right =

before an job (and it shouldn't matter if that's a ttm move/blit or an =

IB) is pushed unto a ring. And it is strictly not allowed to allocate =

more SA memory, call into TTM etc.. while this mutex is held. Everything =

that's necessary to submit a job must happen before grabbing it and =

stored in thread local memory, e.g. on the stack or a kmalloc allocated =

patch of memory.

3. Protect data, not code! I.e. have locks to protect one data =

structures and not just say: Those two things shouldn't happen at the =

same time acquire that and this lock.

Over all it sounds like we are playing with similar ideas, I would =

suggest that I just hack together some patches, probably starting with =

the ring submission and r600 blit mutex and then we take a look again =

where this leads us.

Cheers,
Christian.