From: John Harrison <John.C.Harrison@Intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [RFC 3/9] drm/i915: Convert requests to use struct fence
Date: Tue, 28 Jul 2015 11:10:21 +0100 [thread overview]
Message-ID: <55B7550D.6080001@Intel.com> (raw)
In-Reply-To: <55AFA806.6010502@linux.intel.com>
On 22/07/2015 15:26, Tvrtko Ursulin wrote:
>
> Hi,
>
> On 07/17/2015 03:31 PM, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> There is a construct in the linux kernel called 'struct fence' that
>> is intended
>> to keep track of work that is executed on hardware. I.e. it solves
>> the basic
>> problem that the drivers 'struct drm_i915_gem_request' is trying to
>> address. The
>> request structure does quite a lot more than simply track the
>> execution progress
>> so is very definitely still required. However, the basic completion
>> status side
>> could be updated to use the ready made fence implementation and gain
>> all the
>> advantages that provides.
>>
>> This patch makes the first step of integrating a struct fence into
>> the request.
>> It replaces the explicit reference count with that of the fence. It also
>> replaces the 'is completed' test with the fence's equivalent.
>> Currently, that
>> simply chains on to the original request implementation. A future
>> patch will
>> improve this.
>>
>> For: VIZ-5190
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_drv.h | 45 +++++++++++++------------
>> drivers/gpu/drm/i915/i915_gem.c | 58
>> ++++++++++++++++++++++++++++++---
>> drivers/gpu/drm/i915/intel_lrc.c | 1 +
>> drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
>> drivers/gpu/drm/i915/intel_ringbuffer.h | 3 ++
>> 5 files changed, 80 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index cf6761c..79d346c 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -50,6 +50,7 @@
>> #include <linux/intel-iommu.h>
>> #include <linux/kref.h>
>> #include <linux/pm_qos.h>
>> +#include <linux/fence.h>
>>
>> /* General customization:
>> */
>> @@ -2150,7 +2151,17 @@ void i915_gem_track_fb(struct
>> drm_i915_gem_object *old,
>> * initial reference taken using kref_init
>> */
>> struct drm_i915_gem_request {
>> - struct kref ref;
>> + /**
>> + * Underlying object for implementing the signal/wait stuff.
>> + * NB: Never call fence_later() or return this fence object to user
>> + * land! Due to lazy allocation, scheduler re-ordering,
>> pre-emption,
>> + * etc., there is no guarantee at all about the validity or
>> + * sequentiality of the fence's seqno! It is also unsafe to let
>> + * anything outside of the i915 driver get hold of the fence object
>> + * as the clean up when decrementing the reference count requires
>> + * holding the driver mutex lock.
>> + */
>> + struct fence fence;
>>
>> /** On Which ring this request was generated */
>> struct drm_i915_private *i915;
>> @@ -2227,7 +2238,13 @@ int i915_gem_request_alloc(struct
>> intel_engine_cs *ring,
>> struct intel_context *ctx,
>> struct drm_i915_gem_request **req_out);
>> void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>> -void i915_gem_request_free(struct kref *req_ref);
>> +
>> +static inline bool i915_gem_request_completed(struct
>> drm_i915_gem_request *req,
>> + bool lazy_coherency)
>> +{
>> + return fence_is_signaled(&req->fence);
>> +}
>> +
>> int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
>> struct drm_file *file);
>>
>> @@ -2247,7 +2264,7 @@ static inline struct drm_i915_gem_request *
>> i915_gem_request_reference(struct drm_i915_gem_request *req)
>> {
>> if (req)
>> - kref_get(&req->ref);
>> + fence_get(&req->fence);
>> return req;
>> }
>>
>> @@ -2255,7 +2272,7 @@ static inline void
>> i915_gem_request_unreference(struct drm_i915_gem_request *req)
>> {
>> WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
>> - kref_put(&req->ref, i915_gem_request_free);
>> + fence_put(&req->fence);
>> }
>>
>> static inline void
>> @@ -2267,7 +2284,7 @@ i915_gem_request_unreference__unlocked(struct
>> drm_i915_gem_request *req)
>> return;
>>
>> dev = req->ring->dev;
>> - if (kref_put_mutex(&req->ref, i915_gem_request_free,
>> &dev->struct_mutex))
>> + if (kref_put_mutex(&req->fence.refcount, fence_release,
>> &dev->struct_mutex))
>> mutex_unlock(&dev->struct_mutex);
>
> Would it be nicer to add fence_put_mutex(struct fence *, struct mutex
> *) for this? It would avoid the layering violation of requests peeking
> into fence implementation details.
Maarten pointed out that adding 'fence_put_mutex()' is breaking the
fence ABI itself. It requires users of any random fence to know and
worry about what mutex objects that fence's internal implementation
might require.
This is a bit more hacky from our point of view but it is only a
temporary measure until the mutex lock is not required for
dereferencing. At that point all the nasty stuff disappears completely.
>
>> }
>>
>> @@ -2284,12 +2301,6 @@ static inline void
>> i915_gem_request_assign(struct drm_i915_gem_request **pdst,
>> }
>>
>> /*
>> - * XXX: i915_gem_request_completed should be here but currently
>> needs the
>> - * definition of i915_seqno_passed() which is below. It will be
>> moved in
>> - * a later patch when the call to i915_seqno_passed() is obsoleted...
>> - */
>> -
>> -/*
>> * A command that requires special handling by the command parser.
>> */
>> struct drm_i915_cmd_descriptor {
>> @@ -2851,18 +2862,6 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
>> return (int32_t)(seq1 - seq2) >= 0;
>> }
>>
>> -static inline bool i915_gem_request_completed(struct
>> drm_i915_gem_request *req,
>> - bool lazy_coherency)
>> -{
>> - u32 seqno;
>> -
>> - BUG_ON(req == NULL);
>> -
>> - seqno = req->ring->get_seqno(req->ring, lazy_coherency);
>> -
>> - return i915_seqno_passed(seqno, req->seqno);
>> -}
>> -
>> int __must_check i915_gem_get_seqno(struct drm_device *dev, u32
>> *seqno);
>> int __must_check i915_gem_set_seqno(struct drm_device *dev, u32
>> seqno);
>> int __must_check i915_gem_object_get_fence(struct
>> drm_i915_gem_object *obj);
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c
>> b/drivers/gpu/drm/i915/i915_gem.c
>> index d9f2701..888bb72 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2616,12 +2616,14 @@ static void i915_set_reset_status(struct
>> drm_i915_private *dev_priv,
>> }
>> }
>>
>> -void i915_gem_request_free(struct kref *req_ref)
>> +static void i915_gem_request_free(struct fence *req_fence)
>> {
>> - struct drm_i915_gem_request *req = container_of(req_ref,
>> - typeof(*req), ref);
>> + struct drm_i915_gem_request *req = container_of(req_fence,
>> + typeof(*req), fence);
>> struct intel_context *ctx = req->ctx;
>>
>> + BUG_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
>
> It would be nicer (for the user experience, even if only the developer
> working on the code) to WARN and leak.
Habit. Daniel hasn't quite broken my fingers of their BUG_ON preference.
>
>> +
>> if (req->file_priv)
>> i915_gem_request_remove_from_client(req);
>>
>> @@ -2637,6 +2639,47 @@ void i915_gem_request_free(struct kref *req_ref)
>> kmem_cache_free(req->i915->requests, req);
>> }
>>
>> +static const char *i915_gem_request_get_driver_name(struct fence
>> *req_fence)
>> +{
>> + return "i915_request";
>> +}
>
> I think this becomes kind of ABI once added so we need to make sure
> the best name is chosen to start with. I couldn't immediately figure
> out why not just "i915"?
The thought was that we might start using fences for other purposes at
some point in the future. At which point it is helpful to differentiate
them. If nothing else, a previous iteration of the native sync
implementation was using different fence objects due to worries about
allowing user land to get its grubby mitts on important driver internal
structures.
>
>> +
>> +static const char *i915_gem_request_get_timeline_name(struct fence
>> *req_fence)
>> +{
>> + struct drm_i915_gem_request *req = container_of(req_fence,
>> + typeof(*req), fence);
>> + return req->ring->name;
>> +}
>> +
>> +static bool i915_gem_request_enable_signaling(struct fence *req_fence)
>> +{
>> + /* Interrupt driven fences are not implemented yet.*/
>> + WARN(true, "This should not be called!");
>> + return true;
>
> I suppose WARN is not really needed in the interim patch. Would return
> false work?
The point is to keep the driver 'safe' if that patch is viewed as stand
alone. I.e., if the interrupt follow up does not get merged immediately
after (or not at all) then this prevents people accidentally trying to
use an unsupported interface without first implementing it.
>
>> +}
>> +
>> +static bool i915_gem_request_is_completed(struct fence *req_fence)
>> +{
>> + struct drm_i915_gem_request *req = container_of(req_fence,
>> + typeof(*req), fence);
>> + u32 seqno;
>> +
>> + BUG_ON(req == NULL);
>
> Hm, I don't think container_of can return NULL in a meaningful way.
Hmm, historical code. The request used to be a direct parameter.
>
>> +
>> + seqno = req->ring->get_seqno(req->ring, false/*lazy_coherency*/);
>> +
>> + return i915_seqno_passed(seqno, req->seqno);
>> +}
>> +
>> +static const struct fence_ops i915_gem_request_fops = {
>> + .get_driver_name = i915_gem_request_get_driver_name,
>> + .get_timeline_name = i915_gem_request_get_timeline_name,
>> + .enable_signaling = i915_gem_request_enable_signaling,
>> + .signaled = i915_gem_request_is_completed,
>> + .wait = fence_default_wait,
>> + .release = i915_gem_request_free,
>> +};
>> +
>> int i915_gem_request_alloc(struct intel_engine_cs *ring,
>> struct intel_context *ctx,
>> struct drm_i915_gem_request **req_out)
>> @@ -2658,7 +2701,6 @@ int i915_gem_request_alloc(struct
>> intel_engine_cs *ring,
>> if (ret)
>> goto err;
>>
>> - kref_init(&req->ref);
>> req->i915 = dev_priv;
>> req->ring = ring;
>> req->ctx = ctx;
>> @@ -2673,6 +2715,8 @@ int i915_gem_request_alloc(struct
>> intel_engine_cs *ring,
>> goto err;
>> }
>>
>> + fence_init(&req->fence, &i915_gem_request_fops,
>> &ring->fence_lock, ring->fence_context, req->seqno);
>> +
>> /*
>> * Reserve space in the ring buffer for all the commands
>> required to
>> * eventually emit this request. This is to guarantee that the
>> @@ -5021,7 +5065,7 @@ i915_gem_init_hw(struct drm_device *dev)
>> {
>> struct drm_i915_private *dev_priv = dev->dev_private;
>> struct intel_engine_cs *ring;
>> - int ret, i, j;
>> + int ret, i, j, fence_base;
>>
>> if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
>> return -EIO;
>> @@ -5073,12 +5117,16 @@ i915_gem_init_hw(struct drm_device *dev)
>> goto out;
>> }
>>
>> + fence_base = fence_context_alloc(I915_NUM_RINGS);
>> +
>> /* Now it is safe to go back round and do everything else: */
>> for_each_ring(ring, dev_priv, i) {
>> struct drm_i915_gem_request *req;
>>
>> WARN_ON(!ring->default_context);
>>
>> + ring->fence_context = fence_base + i;
>
> Could you store fence_base in dev_priv and then ring->init_hw could
> set up the fence_context on its own?
Probably. It seemed to make more sense to me to keep the fence
allocation all in one place rather than splitting it in half and
distributing it. It gets removed again with the per context per ring
timelines anyway.
>
> Regards,
>
> Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2015-07-28 10:10 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-17 14:31 [RFC 0/9] Convert requests to use struct fence John.C.Harrison
2015-07-17 14:31 ` [RFC 1/9] staging/android/sync: Support sync points created from dma-fences John.C.Harrison
2015-07-17 14:44 ` Tvrtko Ursulin
2015-07-17 14:31 ` [RFC 2/9] android: add sync_fence_create_dma John.C.Harrison
2015-07-17 14:31 ` [RFC 3/9] drm/i915: Convert requests to use struct fence John.C.Harrison
2015-07-21 7:05 ` Daniel Vetter
2015-07-28 10:01 ` John Harrison
2015-07-22 14:26 ` Tvrtko Ursulin
2015-07-28 10:10 ` John Harrison [this message]
2015-08-03 9:17 ` Tvrtko Ursulin
2015-07-22 14:45 ` Tvrtko Ursulin
2015-07-28 10:18 ` John Harrison
2015-08-03 9:18 ` Tvrtko Ursulin
2015-07-17 14:31 ` [RFC 4/9] drm/i915: Removed now redudant parameter to i915_gem_request_completed() John.C.Harrison
2015-07-17 14:31 ` [RFC 5/9] drm/i915: Add per context timelines to fence object John.C.Harrison
2015-07-23 13:50 ` Tvrtko Ursulin
2015-10-28 12:59 ` John Harrison
2015-11-17 13:54 ` Daniel Vetter
2015-07-17 14:31 ` [RFC 6/9] drm/i915: Delay the freeing of requests until retire time John.C.Harrison
2015-07-23 14:25 ` Tvrtko Ursulin
2015-10-28 13:00 ` John Harrison
2015-10-28 13:42 ` Tvrtko Ursulin
2015-07-17 14:31 ` [RFC 7/9] drm/i915: Interrupt driven fences John.C.Harrison
2015-07-20 9:09 ` Maarten Lankhorst
2015-07-21 7:19 ` Daniel Vetter
2015-07-27 11:33 ` Tvrtko Ursulin
2015-10-28 13:00 ` John Harrison
2015-07-27 13:20 ` Tvrtko Ursulin
2015-07-27 14:00 ` Daniel Vetter
2015-08-03 9:20 ` Tvrtko Ursulin
2015-08-05 8:05 ` Daniel Vetter
2015-08-05 11:05 ` Maarten Lankhorst
2015-07-17 14:31 ` [RFC 8/9] drm/i915: Updated request structure tracing John.C.Harrison
2015-07-17 14:31 ` [RFC 9/9] drm/i915: Add sync framework support to execbuff IOCTL John.C.Harrison
2015-07-27 13:00 ` Tvrtko Ursulin
2015-10-28 13:01 ` John Harrison
2015-10-28 14:31 ` Tvrtko Ursulin
2015-11-17 13:59 ` Daniel Vetter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55B7550D.6080001@Intel.com \
--to=john.c.harrison@intel.com \
--cc=Intel-GFX@Lists.FreeDesktop.Org \
--cc=tvrtko.ursulin@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox