From: Jerome Glisse
Subject: Re: Question on UAPI for fences
Date: Fri, 12 Sep 2014 11:48:31 -0400
Message-ID: <20140912154831.GC4139@gmail.com>
References: <5412F3CA.9060306@amd.com> <20140912145048.GA4139@gmail.com> <20140912153346.GB4139@gmail.com> <54131481.4040905@amd.com>
In-Reply-To: <54131481.4040905@amd.com>
To: Christian König
Cc: Maarten Lankhorst, Zach Pfeffer, "dri-devel@lists.freedesktop.org", "linaro-mm-sig@lists.linaro.org", gpudriverdevsupport@amd.com, John Harrison
List-Id: dri-devel@lists.freedesktop.org

On Fri, Sep 12, 2014 at 05:42:57PM +0200, Christian König wrote:
> On 12.09.2014 at 17:33, Jerome Glisse wrote:
> >On Fri, Sep 12, 2014 at 11:25:12AM -0400, Alex Deucher wrote:
> >>On Fri, Sep 12, 2014 at 10:50 AM, Jerome Glisse wrote:
> >>>On Fri, Sep 12, 2014 at 04:43:44PM +0200, Daniel Vetter wrote:
> >>>>On Fri, Sep 12, 2014 at 4:09 PM, Daniel Vetter wrote:
> >>>>>On Fri, Sep 12, 2014 at 03:23:22PM +0200, Christian König wrote:
> >>>>>>Hello everyone,
> >>>>>>
> >>>>>>to allow concurrent buffer access by different engines beyond the multiple
> >>>>>>readers/single writer model that we currently use in radeon and other
> >>>>>>drivers we need some kind of synchronization object exposed to userspace.
> >>>>>>
> >>>>>>My initial patch set for this used (or rather abused) zero sized GEM buffers
> >>>>>>as fence handles. This obviously isn't the best way of doing this (too
> >>>>>>much overhead, rather ugly etc...), Jerome commented on this accordingly.
> >>>>>>
> >>>>>>So what should a driver expose instead? Android sync points? Something else?
> >>>>>I think actually exposing the struct fence objects as a fd, using android
> >>>>>syncpts (or at least something compatible to it) is the way to go. Problem
> >>>>>is that it's super-hard to get the android guys out of hiding for this :(
> >>>>>
> >>>>>Adding a bunch of people in the hopes that something sticks.
> >>>>More people.
> >>>Just to re-iterate, exposing such a thing while still using command stream
> >>>ioctls that use implicit synchronization is a waste and you can only get
> >>>the lowest common denominator which is implicit synchronization. So I do
> >>>not see the point of such an api if you are not also adding a new cs ioctl
> >>>with an explicit contract that it does not do any kind of synchronization
> >>>(it could be almost the exact same code modulo the do not wait for
> >>>previous cmd to complete).
> >>Our thinking was to allow explicit sync from a single process, but
> >>implicitly sync between processes.
> >This is a BIG NAK if you are using the same ioctl as it would mean you are
> >changing userspace API, well at least userspace expectation. Adding a new
> >cs flag might do the trick but it should not be about inter-process, or
> >anything special, it's just implicit sync or no synchronization. Converting
> >userspace is not that much of a big deal either, it can be broken into
> >several steps. Like mesa uses explicit synchronization all the time but ddx
> >uses implicit.
>
> The thinking here is that we need to be backward compatible for DRI2/3 and
> support all kinds of different use cases like old DDX and new Mesa, or old
> Mesa and new DDX etc...
>
> So for my prototype if the kernel sees any access of a BO from two different
> clients it falls back to the old behavior of implicit synchronization of
> access to the same buffer object. That might not be the fastest approach,
> but is as far as I can see conservative and so should work under all
> conditions.
>
> Apart from that the planning so far was that we just hide this feature
> behind a couple of command submission flags and new chunks.

Just to reproduce the IRC discussion, I think it's a lot simpler than that.
For an explicit cs ioctl you do not wait for any previous fence of any of
the buffers referenced in the cs ioctl, but you still associate a new fence
with all the buffer objects referenced in the cs ioctl. So if the next ioctl
is an implicit sync ioctl it will wait properly and synchronize properly
with the previous explicit cs ioctl. Hence you can easily have a mix in
userspace; the thing is, you only get the benefit once enough of your
userspace is using explicit sync.

Note that you still need a way for an explicit cs ioctl to wait on a
previous "explicit" fence, so you need some api to expose a fence per cs
submission.

Cheers,
Jérôme

>
> Regards,
> Christian.
>
> >
> >Cheers,
> >Jérôme
> >
> >>Alex
> >>
> >>>Also one thing that the Android sync point does not have, AFAICT, is a
> >>>way to schedule synchronization as part of a cs ioctl so the cpu never has
> >>>to be involved for a cmd stream that deals with only one gpu (assuming the
> >>>driver and hw can do such a trick).
> >>>
> >>>Cheers,
> >>>Jérôme
> >>>
> >>>>-Daniel
> >>>>--
> >>>>Daniel Vetter
> >>>>Software Engineer, Intel Corporation
> >>>>+41 (0) 79 365 57 48 - http://blog.ffwll.ch
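
For readers of the archive, the scheme proposed in the reply above (an explicit cs ioctl waits on nothing by default but still attaches its new fence to every referenced buffer, while an implicit cs ioctl waits on whatever fence each buffer last received) can be modelled roughly as follows. This is only a toy sketch in Python, not kernel code and not the radeon implementation: `Fence`, `BufferObject`, `ToyDevice`, `submit`, and the `explicit` / `wait_on` parameters are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Fence:
    """Hypothetical per-submission fence, identified by a sequence number."""
    seqno: int


@dataclass
class BufferObject:
    """Hypothetical buffer object that remembers the last fence attached to it."""
    name: str
    last_fence: Optional[Fence] = None


class ToyDevice:
    """Toy model of a command submission path (all names are assumptions)."""

    def __init__(self) -> None:
        self._next_seqno = 1
        # Log of (submission seqno, seqno of the fence it waited on).
        self.waits: List[Tuple[int, int]] = []

    def submit(self, buffers: List[BufferObject], explicit: bool = False,
               wait_on: Iterable[Fence] = ()) -> Fence:
        fence = Fence(self._next_seqno)
        self._next_seqno += 1

        if explicit:
            # Explicit submission: ignore the buffers' fences and wait only
            # on the fences the caller handed in.
            deps = list(wait_on)
        else:
            # Implicit submission: wait on the last fence of every buffer.
            deps = [bo.last_fence for bo in buffers if bo.last_fence is not None]

        seen = set()
        for dep in deps:
            if dep.seqno not in seen:  # record each dependency only once
                seen.add(dep.seqno)
                self.waits.append((fence.seqno, dep.seqno))

        # In both modes the new fence is attached to every referenced buffer,
        # so a later implicit submission still synchronizes correctly.
        for bo in buffers:
            bo.last_fence = fence
        return fence


dev = ToyDevice()
a, b = BufferObject("a"), BufferObject("b")
f1 = dev.submit([a], explicit=True)                 # waits on nothing
f2 = dev.submit([a, b], explicit=True)              # waits on nothing
f3 = dev.submit([b], explicit=True, wait_on=[f1])   # waits only on f1
f4 = dev.submit([a, b])                             # implicit: waits on f2 and f3
print(dev.waits)
```

The `wait_on` argument stands in for the "api to expose a fence per cs submission" mentioned in the reply: without some handle returned from `submit`, an explicit submission would have no way to name the earlier work it depends on.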