From mboxrd@z Thu Jan  1 00:00:00 1970
From: =?windows-1252?Q?Christian_K=F6nig?= <christian.koenig@amd.com>
Subject: Re: Question on UAPI for fences
Date: Fri, 12 Sep 2014 17:58:09 +0200
Message-ID: <54131811.4050509@amd.com>
References: <5412F3CA.9060306@amd.com>
 <CAKMK7uHr0Eu0nWn1TzWSpi6fEudyGoTqWOEVR-Jo_f9eZXk9mA@mail.gmail.com>
 <CAKMK7uHgrT-j3qo9hxZcLMFo0Dzr-KCmT+G6VzO+OVrS8D8_SQ@mail.gmail.com>
 <20140912145048.GA4139@gmail.com>
 <CADnq5_N1xPr+zhTGKA4HpQcJffTAS-NcVidWPE31j45H_bVRKw@mail.gmail.com>
 <20140912153346.GB4139@gmail.com> <54131481.4040905@amd.com>
 <20140912154831.GC4139@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="windows-1252"; Format="flowed"
Content-Transfer-Encoding: quoted-printable
Return-path: <dri-devel-bounces@lists.freedesktop.org>
Received: from na01-bn1-obe.outbound.protection.outlook.com
 (mail-bn1bon0143.outbound.protection.outlook.com [157.56.111.143])
 by gabe.freedesktop.org (Postfix) with ESMTP id C121A6E1ED
 for <dri-devel@lists.freedesktop.org>; Fri, 12 Sep 2014 09:13:17 -0700 (PDT)
In-Reply-To: <20140912154831.GC4139@gmail.com>
List-Unsubscribe: <http://lists.freedesktop.org/mailman/options/dri-devel>,
 <mailto:dri-devel-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <http://lists.freedesktop.org/archives/dri-devel>
List-Post: <mailto:dri-devel@lists.freedesktop.org>
List-Help: <mailto:dri-devel-request@lists.freedesktop.org?subject=help>
List-Subscribe: <http://lists.freedesktop.org/mailman/listinfo/dri-devel>,
 <mailto:dri-devel-request@lists.freedesktop.org?subject=subscribe>
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
To: Jerome Glisse <j.glisse@gmail.com>
Cc: Maarten Lankhorst <m.b.lankhorst@gmail.com>, Zach Pfeffer <zpfeffer@audience.com>, "dri-devel@lists.freedesktop.org" <dri-devel@lists.freedesktop.org>, "linaro-mm-sig@lists.linaro.org" <linaro-mm-sig@lists.linaro.org>, gpudriverdevsupport@amd.com, John Harrison <john.c.harrison@intel.com>
List-Id: dri-devel@lists.freedesktop.org

On 12.09.2014 at 17:48, Jerome Glisse wrote:
> On Fri, Sep 12, 2014 at 05:42:57PM +0200, Christian König wrote:
>> On 12.09.2014 at 17:33, Jerome Glisse wrote:
>>> On Fri, Sep 12, 2014 at 11:25:12AM -0400, Alex Deucher wrote:
>>>> On Fri, Sep 12, 2014 at 10:50 AM, Jerome Glisse <j.glisse@gmail.com> wrote:
>>>>> On Fri, Sep 12, 2014 at 04:43:44PM +0200, Daniel Vetter wrote:
>>>>>> On Fri, Sep 12, 2014 at 4:09 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>> On Fri, Sep 12, 2014 at 03:23:22PM +0200, Christian König wrote:
>>>>>>>> Hello everyone,
>>>>>>>>
>>>>>>>> to allow concurrent buffer access by different engines beyond the
>>>>>>>> multiple readers/single writer model that we currently use in radeon
>>>>>>>> and other drivers we need some kind of synchronization object exposed
>>>>>>>> to userspace.
>>>>>>>>
>>>>>>>> My initial patch set for this used (or rather abused) zero sized GEM
>>>>>>>> buffers as fence handles. This obviously isn't the best way of doing
>>>>>>>> this (too much overhead, rather ugly etc...), Jerome commented on this
>>>>>>>> accordingly.
>>>>>>>>
>>>>>>>> So what should a driver expose instead? Android sync points? Something
>>>>>>>> else?
>>>>>>> I think actually exposing the struct fence objects as a fd, using
>>>>>>> android syncpts (or at least something compatible to it) is the way to
>>>>>>> go. Problem is that it's super-hard to get the android guys out of
>>>>>>> hiding for this :(
>>>>>>>
>>>>>>> Adding a bunch of people in the hopes that something sticks.
>>>>>> More people.
>>>>> Just to re-iterate, exposing such thing while still using command stream
>>>>> ioctl that use implicit synchronization is a waste and you can only get
>>>>> the lowest common denominator which is implicit synchronization. So i do
>>>>> not see the point of such api if you are not also adding a new cs ioctl
>>>>> with explicit contract that it does not do any kind of synchronization
>>>>> (it could be almost the exact same code modulo the do not wait for
>>>>> previous cmd to complete).
>>>> Our thinking was to allow explicit sync from a single process, but
>>>> implicitly sync between processes.
>>> This is a BIG NAK if you are using the same ioctl as it would mean you are
>>> changing userspace API, well at least userspace expectation. Adding a new
>>> cs flag might do the trick but it should not be about inter-process, or any
>>> thing special, it's just implicit sync or no synchronization. Converting
>>> userspace is not that much of a big deal either, it can be broken into
>>> several step. Like mesa use explicit synchronization all time but ddx use
>>> implicit.
>> The thinking here is that we need to be backward compatible for DRI2/3 and
>> support all kind of different use cases like old DDX and new Mesa, or old
>> Mesa and new DDX etc...
>>
>> So for my prototype if the kernel sees any access of a BO from two different
>> clients it falls back to the old behavior of implicit synchronization of
>> access to the same buffer object. That might not be the fastest approach,
>> but is as far as I can see conservative and so should work under all
>> conditions.
>>
>> Apart from that the planning so far was that we just hide this feature
>> behind a couple of command submission flags and new chunks.
> Just to reproduce IRC discussion, i think it's a lot simpler and not that
> complex. For explicit cs ioctl you do not wait for any previous fence of
> any of the buffer referenced in the cs ioctl, but you still associate a
> new fence with all the buffer object referenced in the cs ioctl. So if the
> next ioctl is an implicit sync ioctl it will wait properly and synchronize
> properly with previous explicit cs ioctl. Hence you can easily have a mix
> in userspace thing is you only get benefit once enough of your userspace
> is using explicit.

Yes, that's exactly what my patches currently implement.

The only difference is that I currently implemented it as a per-BO flag
for the command submission, but that was just for testing. Having a
single flag to switch between implicit and explicit synchronization for
the whole CS IOCTL would do equally well.
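
To make the semantics concrete, here is a minimal sketch of the
implicit/explicit behaviour discussed above. It is only a userspace
model under stated assumptions, not the radeon code or any real UAPI:
the names sketch_bo, sketch_dev, sketch_cs_submit and the flag
SKETCH_CS_EXPLICIT_SYNC are all hypothetical, and a fence is reduced to
a sequence number on a single timeline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: a fence is just a sequence number, 0 means "none". */
struct sketch_bo {
	uint64_t last_fence;	/* fence of the last CS that referenced this BO */
};

struct sketch_dev {
	uint64_t last_seqno;	/* last fence value handed out */
};

/* Hypothetical per-submission flag, standing in for "one flag per CS". */
#define SKETCH_CS_EXPLICIT_SYNC (1u << 0)

/*
 * Submit one command stream referencing nbos buffers.
 * Returns the fence associated with this submission and stores in
 * *waited_on the newest fence the submission had to wait for (0 if none).
 */
static uint64_t sketch_cs_submit(struct sketch_dev *dev,
				 struct sketch_bo **bos, size_t nbos,
				 uint32_t flags, uint64_t *waited_on)
{
	uint64_t wait = 0;
	uint64_t fence;
	size_t i;

	assert(dev && waited_on);

	/* Implicit sync waits for the last fence of every referenced BO;
	 * explicit sync skips that wait entirely. */
	if (!(flags & SKETCH_CS_EXPLICIT_SYNC)) {
		for (i = 0; i < nbos; i++)
			if (bos[i]->last_fence > wait)
				wait = bos[i]->last_fence;
	}
	*waited_on = wait;

	/* In both modes the new fence is attached to every referenced BO,
	 * so a later implicit submission still synchronizes with this one. */
	fence = ++dev->last_seqno;
	for (i = 0; i < nbos; i++)
		bos[i]->last_fence = fence;

	return fence;
}
```

With this model, two explicit submissions on the same BO do not wait
for each other, while a following implicit submission waits for the
fence left behind by the last explicit one. That is the mixing
behaviour Jerome describes.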

> Note that you still need a way to have explicit cs ioctl to wait on a
> previous "explicit" fence so you need some api to expose fence per cs
> submission.

Exactly, that's what this mail thread is all about.

As Daniel correctly noted, you need some way to get a fence as the
result of a command submission, as well as a way to pass in a list
of fences to wait for before beginning a command submission.
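
Purely as an illustration of that shape (a fence out, a list of fences
in), a sketch could look like the following. Everything here is an
assumption made for the example: sketch_timeline, sketch_cs_args,
sketch_cs_submit and sketch_signal_up_to are hypothetical names, fences
are plain sequence numbers, and an unsatisfied dependency is simply
reported with -1 where a real driver would queue or block instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single timeline: fences are increasing sequence numbers. */
struct sketch_timeline {
	uint64_t signaled;	/* every fence <= signaled has completed */
	uint64_t emitted;	/* last fence handed out */
};

/* Hypothetical submission arguments: dependencies in, one fence out. */
struct sketch_cs_args {
	const uint64_t *wait_fences;	/* in: fences that must signal first */
	size_t num_wait_fences;		/* in: number of entries above */
	uint64_t out_fence;		/* out: fence for this submission */
};

/* True once every fence in the wait list has signaled. */
static bool sketch_deps_signaled(const struct sketch_timeline *tl,
				 const struct sketch_cs_args *args)
{
	size_t i;

	for (i = 0; i < args->num_wait_fences; i++)
		if (args->wait_fences[i] > tl->signaled)
			return false;
	return true;
}

/* Returns 0 and fills out_fence, or -1 if a dependency is still pending. */
static int sketch_cs_submit(struct sketch_timeline *tl,
			    struct sketch_cs_args *args)
{
	assert(tl && args);

	if (!sketch_deps_signaled(tl, args))
		return -1;

	args->out_fence = ++tl->emitted;
	return 0;
}

/* Model the hardware completing work up to and including the given fence. */
static void sketch_signal_up_to(struct sketch_timeline *tl, uint64_t fence)
{
	if (fence > tl->signaled && fence <= tl->emitted)
		tl->signaled = fence;
}
```

A second submission that lists the first submission's out_fence as a
dependency is rejected until sketch_signal_up_to() marks that fence as
completed, and goes through afterwards.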

At least it looks like we are all on the same general line here; it's
just that nobody has a good idea what the details should look like.

Regards,
Christian.

>
> Cheers,
> Jérôme
>
>> Regards,
>> Christian.
>>
>>> Cheers,
>>> Jérôme
>>>
>>>> Alex
>>>>
>>>>> Also one thing that the Android sync point does not have, AFAICT, is a
>>>>> way to schedule synchronization as part of a cs ioctl so cpu never have
>>>>> to be involve for cmd stream that deal only one gpu (assuming the driver
>>>>> and hw can do such trick).
>>>>>
>>>>> Cheers,
>>>>> Jérôme
>>>>>
>>>>>> -Daniel
>>>>>> --
>>>>>> Daniel Vetter
>>>>>> Software Engineer, Intel Corporation
>>>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch