From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tiago Vignatti Subject: Re: [PATCH] dma-buf: Update docs for SYNC ioctl Date: Mon, 28 Mar 2016 16:42:13 -0300 Message-ID: <56F98915.1030200@intel.com> References: <1458546705-3564-1-git-send-email-daniel.vetter@ffwll.ch> <20160321171405.GP28483@phenom.ffwll.local> <20160323115659.GF21717@nuc-i3427.alporthouse.com> <20160323154223.GJ21717@nuc-i3427.alporthouse.com> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20160323154223.GJ21717@nuc-i3427.alporthouse.com> Sender: linux-media-owner@vger.kernel.org To: Chris Wilson , David Herrmann , Daniel Vetter , Sumit Semwal , Daniel Vetter , DRI Development , =?UTF-8?Q?St=c3=a9phane_Marchesin?= , Daniel Vetter , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Intel Graphics Development , devel@driverdev.osuosl.org, Hans Verkuil List-Id: intel-gfx@lists.freedesktop.org On 03/23/2016 12:42 PM, Chris Wilson wrote: > On Wed, Mar 23, 2016 at 04:32:59PM +0100, David Herrmann wrote: >> Hi >> >> On Wed, Mar 23, 2016 at 12:56 PM, Chris Wilson wrote: >>> On Wed, Mar 23, 2016 at 12:30:42PM +0100, David Herrmann wrote: >>>> My question was rather about why we do this? Semantics for EINTR are >>>> well defined, and with SA_RESTART (default on linux) user-space can >>>> ignore it. However, looping on EAGAIN is very uncommon, and it is not >>>> at all clear why it is needed? >>>> >>>> Returning an error to user-space makes sense if user-space has a >>>> reason to react to it. I fail to see how EAGAIN on a cache-flush/sync >>>> operation helps user-space at all? As someone without insight into the >>>> driver implementation, it is hard to tell why.. Any hints? >>> >>> The reason we return EAGAIN is to workaround a deadlock we face when >>> blocking on the GPU holding the struct_mutex (inside the client's >>> process), but the GPU is dead. As our locking is very, very coarse we >>> cannot restart the GPU without acquiring the struct_mutex being held by >>> the client so we wake the client up and tell them the resource they are >>> waiting on (the flush of the object from the GPU into the CPU domain) is >>> temporarily unavailable. If they try to immediately wait upon the ioctl >>> again, they are blocked waiting for the reset to occur before they may >>> complete their flush. There are a few other possible deadlocks that are >>> also avoided with EAGAIN (again, the issue is more or less the lack of >>> fine grained locking). >> >> ...so you hijacked EAGAIN for all DRM ioctls just for a driver >> workaround? > > No, we utilized the fact that EAGAIN was already enshrined by libdrm as > the defacto mechanism for repeating the ioctl in order to repeat the > ioctl for a driver workaround. Do we have an agreement here after all? David? I need to know whether this fixup is okay to go cause I'll need to submit to Chrome OS then. Best Regards, Tiago