From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [RFC] drm/i915: Add sync framework support to execbuff IOCTL
Date: Mon, 06 Jul 2015 16:21:28 +0100 [thread overview]
Message-ID: <559A9CF8.9010601@linux.intel.com> (raw)
In-Reply-To: <20150706151202.GS2156@phenom.ffwll.local>
On 07/06/2015 04:12 PM, Daniel Vetter wrote:
> On Mon, Jul 06, 2015 at 03:46:49PM +0100, Tvrtko Ursulin wrote:
>> On 07/06/2015 03:26 PM, John Harrison wrote:
>>> On 06/07/2015 14:59, Daniel Vetter wrote:
>>>> On Mon, Jul 06, 2015 at 01:58:25PM +0100, John Harrison wrote:
>>>>> On 06/07/2015 10:29, Daniel Vetter wrote:
>>>>>> On Fri, Jul 03, 2015 at 12:17:33PM +0100, Tvrtko Ursulin wrote:
>>>>>>> On 07/02/2015 04:55 PM, Chris Wilson wrote:
>>>>>>>> It would be nice if we could reuse one seqno both for
>>>>>>>> internal/external
>>>>>>>> fences. If you need to expose a fence ordering within a timeline
>>>>>>>> that is
>>>>>>>> based on the creation stamp rather than execution stamp, it seems
>>>>>>>> like
>>>>>>>> we could just add such a stamp when creating the sync_pt and not
>>>>>>>> worry
>>>>>>>> about its relationship to the execution seqno.
>>>>>>>>
>>>>>>>> Doing so does expose that requests are reordered to userspace
>>>>>>>> since the
>>>>>>>> signalling timeline is not the same as userspace's ordered
>>>>>>>> timeline. Not
>>>>>>>> sure if that is a problem or not.
>>>>>>>>
>>>>>>>> Afaict the sync uapi is based on waiting for all of a set of
>>>>>>>> fences to
>>>>>>>> retire. It doesn't seem to rely on fence ordering (that is knowing
>>>>>>>> that
>>>>>>>> fence A will signal before fence B so it need only wait on fence B).
>>>>>>>>
>>>>>>>> Here's hoping that we can have both simplicity and efficiency...
>>>>>>> Jumping in with not even perfect understanding of everything here -
>>>>>>> but
>>>>>>> timeline business has always been confusing me. There is nothing in
>>>>>>> the
>>>>>>> uapi which needs it afaics and iirc there was some discussion at
>>>>>>> the time
>>>>>>> Jesse floated his patches that it can be removed. Based on that when I
>>>>>>> squashed his patches and ported them on top of John's request to fence
>>>>>>> conversion it ended up something like the below (manually edited a
>>>>>>> bit to
>>>>>>> be less noisy and some prep patches omitted):
>>>>>>>
>>>>>>> This implements the ioctl based uapi and indeed seqnos are not
>>>>>>> actually
>>>>>>> used in waits. So is this insufficient for some reason? (Other that it
>>>>>>> does not implement the input fence side of things.)
>>>>>> Yeah android syncpt on top of struct fence embedded int i915 request is
>>>>>> what I'd have expected.
>>>>> The thing I'm not happy with in that plan is that it leaves the kernel
>>>>> driver at the mercy of user land applications. If we return a fence
>>>>> object
>>>>> to user land via a file descriptor (or indeed any other mechanism)
>>>>> then that
>>>>> fence object must be locked until user land closes the file. If the
>>>>> fence
>>>>> object is the one embedded within our request structure then that
>>>>> means user
>>>>> land is effectively locking our request structure. Given that more
>>>>> and more
>>>>> stuff is being attached to the request, that could be a fair bit of
>>>>> memory
>>>>> tied up that we can do nothing about. E.g. if a rogue/buggy application
>>>>> requests a fence be returned for every batch buffer submitted but never
>>>>> closes them. Whereas, if we go the route of a separate fence object
>>>>> specifically for user land then they can leak them like a sieve and
>>>>> we won't
>>>>> really care so much.
>>>> Userspace can exhaust kernel allocations, that's nothing new. And if we
>>>> keep it userspace simply needs to leak a few more fence fds than if
>>>> there's a bit more data attached to it.
>>>>
>>>> The solution to this problem is to have a mem cgroup limit set. No
>>>> need to
>>>> complicate our kernel code.
>>>
>>> There is still the extra complication that request unreferencing cannot
>>> require any kind of mutex lock if we are allowing it to happen from
>>> outside of the driver. That means the unreference callback must move the
>>> request to a 'please clean me later' list, schedule a worker thread to
>>> run, and thus do the clean up asynchronously.
>>
>> For this particular issue my solution was to extend the sync_fence
>> constructor to take a mutex and store it inside the object. Then at
>> destruction time, which happens at sync_fd->f_ops->release() time, it is
>> just a matter of calling kref_put_mutex instead of kref_put.
>>
>> Seemed to work under some quick testing but that is as much as I did back
>> then.
>
> The problem is that it doesn't scale since everyone wants some other kind
> of mutex to serialize the final kref_put. If something is supposed to be
> cross-subsystem/driver (which is the case for fences) then we really can't
> do that kind of leaky locking design. Imo we should have a kref_put_mutex
> considered harmful sign somewhere ...
I get the argument about everything wanting to add their own not
scaling, but don't tie with the leaky comment? Also mutex is a pretty
standard thing, especially since kref_put_mutex. :D If you look at it
from that angle it kind of just exposes to the super class what the base
class can do.
> If you have weak references somewhere and need to prevent the object from
> disappearing untimely while chasing that weak reference then imo the
> better design pattern is to use kref_get_unless_zero. If you need the
> serialization the mutex provides for some other reason (someone is only
> hodling the mutex instead of grabbing a proper refernce when they really
> should grab one) then your refcounting scheme probably needs another kind
> of fixup patch.
I don't see how weak references can work since if the request goes
information is lost, unless stored somewhere else.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2015-07-06 15:21 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-02 11:09 [RFC] drm/i915: Add sync framework support to execbuff IOCTL John.C.Harrison
2015-07-02 11:54 ` Chris Wilson
2015-07-02 12:02 ` Chris Wilson
2015-07-02 13:01 ` John Harrison
2015-07-02 13:22 ` Chris Wilson
2015-07-02 15:43 ` John Harrison
2015-07-02 15:55 ` Chris Wilson
2015-07-03 11:17 ` Tvrtko Ursulin
2015-07-06 9:29 ` Daniel Vetter
2015-07-06 12:58 ` John Harrison
2015-07-06 13:59 ` Daniel Vetter
2015-07-06 14:26 ` John Harrison
2015-07-06 14:41 ` Daniel Vetter
2015-07-06 14:46 ` Tvrtko Ursulin
2015-07-06 15:12 ` Daniel Vetter
2015-07-06 15:21 ` Tvrtko Ursulin [this message]
2015-07-06 15:37 ` Daniel Vetter
2015-07-06 16:34 ` Tvrtko Ursulin
2015-07-06 17:58 ` Daniel Vetter
2015-07-07 9:15 ` Tvrtko Ursulin
2015-07-29 21:19 ` Jesse Barnes
2015-07-30 11:36 ` John Harrison
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=559A9CF8.9010601@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=Intel-GFX@Lists.FreeDesktop.Org \
--cc=daniel@ffwll.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox