public inbox for intel-gfx@lists.freedesktop.org
From: John Harrison <John.C.Harrison@Intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
Date: Wed, 24 Jun 2015 18:05:08 +0100
Message-ID: <558AE344.90006@Intel.com>
In-Reply-To: <20150624124533.GG25769@phenom.ffwll.local>

On 24/06/2015 13:45, Daniel Vetter wrote:
> On Wed, Jun 24, 2015 at 01:18:48PM +0100, John Harrison wrote:
>> On 23/06/2015 21:00, Daniel Vetter wrote:
>>> On Tue, Jun 23, 2015 at 04:43:24PM +0100, John Harrison wrote:
>>>> On 23/06/2015 14:24, Daniel Vetter wrote:
>>>>> On Tue, Jun 23, 2015 at 12:38:01PM +0100, John Harrison wrote:
>>>>>> On 22/06/2015 21:12, Daniel Vetter wrote:
>>>>>>> On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@Intel.com wrote:
>>>>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>>>>
>>>>>>>> It is a bad idea for i915_add_request() to fail. The work will already have been
>>>>>>>> sent to the ring and will be processed, but there will not be any tracking or
>>>>>>>> management of that work.
>>>>>>>>
>>>>>>>> The only way the add request call can fail is if it can't write its epilogue
>>>>>>>> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
>>>>>>>> reasons for that are mostly down to running out of ring buffer space and the
>>>>>>>> problems associated with trying to get some more. This patch prevents that
>>>>>>>> situation from happening in the first place.
>>>>>>>>
>>>>>>>> When a request is created, it marks sufficient space as reserved for the
>>>>>>>> epilogue commands, thus guaranteeing that by the time the epilogue is written,
>>>>>>>> there will be plenty of space for it. Note that a ring_begin() call is required
>>>>>>>> to actually reserve the space (and do any potential waiting). However, that is
>>>>>>>> not currently done at request creation time. This is because the ring_begin()
>>>>>>>> code can allocate a request. Hence calling begin() from the request allocation
>>>>>>>> code would lead to infinite recursion! Later patches in this series remove the
>>>>>>>> need for begin() to do the allocate. At that point, it becomes safe for the
>>>>>>>> allocate to call begin() and really reserve the space.
>>>>>>>>
>>>>>>>> Until then, there is a potential for insufficient space to be available at the
>>>>>>>> point of calling i915_add_request(). However, that would only be in the case
>>>>>>>> where the request was created and immediately submitted without ever calling
>>>>>>>> ring_begin() and adding any work to that request. Which should never happen. And
>>>>>>>> even if it does, and if that request happens to fall down the tiny window of
>>>>>>>> opportunity for failing due to being out of ring space then does it really
>>>>>>>> matter because the request wasn't doing anything in the first place?
>>>>>>>>
>>>>>>>> v2: Updated the 'reserved space too small' warning to include the offending
>>>>>>>> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
>>>>>>>> re-initialisation of tracking state after a buffer wrap to keep the sanity
>>>>>>>> checks accurate.
>>>>>>>>
>>>>>>>> v3: Incremented the reserved size to accommodate Ironlake (after finally
>>>>>>>> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
>>>>>>>>
>>>>>>>> v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
>>>>>>>>
>>>>>>>> v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
>>>>>>>> Daniel Vetter).
>>>>>>> This didn't actually implement what I suggested (wrapping is the worst
>>>>>>> case, hence skipping the check for that is breaking the sanity check) and
>>>>>>> so changed the patch from "correct, but a bit fragile" to broken. I've
>>>>>>> merged the previous version instead.
>>>>>>> -Daniel
>>>>>> I'm confused. I thought your main issue was the early wrapping not the
>>>>>> sanity check. The check is to ensure that the reservation is large enough to
>>>>>> cover all the commands written during request submission. That should not be
>>>>>> affected by whether a wrap occurs or not. Wrapping does not magically add an
>>>>>> extra bunch of dwords to the emit_request() call. Whereas making the check
>>>>>> work with the wrap condition requires adding in extra tracking state of
>>>>>> exactly where the wrap occurred. That is extra code that only exists to
>>>>>> catch something in the very rare case which should already have been caught
>>>>>> in the very common case. I.e. if your reserved size is too small then you
>>>>>> will hit the warning on every batch buffer submission.
>>>>> The problem is that if you allow a wrap in the reserve size then the
>>>>> ringspace requirements are bigger than if you don't wrap. And since the
>>>>> add request is split up into many intel_ring_begin that's possible. Hence
>>>>> if you allow wrapping in the reserved space, then the most important case
>>>>> for the debug check is to make sure that it catches any kind of
>>>>> reservation overflow while wrapping. The not-wrapped case is probably the
>>>>> boring one.
>>>>>
>>>>> And indeed eventually we should overflow since according to your comment
>>>>> the worst case add request on ilk is 136 dwords. And the largest
>>>>> intel_ring_begin in there is 32 dwords, which means at most we'll throw
>>>>> away 31 dwords when wrapping. Which means the 160 dwords of reservation
>>>>> are not enough since we'd need 167 dwords of space for the worst case. But
>>>>> since the space_end debug check was a no-op for the wrapped case you won't
>>>>> catch this one.
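
The arithmetic in the paragraph above can be written out as a minimal sketch. The helper name is hypothetical and is not a driver function; it only assumes, as stated above, that a chunk of N dwords is forced to wrap when at most N-1 dwords remain before the end of the ring, and that those remaining dwords are thrown away.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of i915): worst-case ring usage, in dwords,
 * of an epilogue of `total` dwords whose largest single ring_begin chunk is
 * `largest_chunk` dwords, if that chunk has to wrap once.
 */
int worst_case_with_wrap(int total, int largest_chunk)
{
	assert(largest_chunk >= 1 && total >= largest_chunk);

	/* A chunk of N dwords wraps when at most N - 1 dwords are left. */
	return total + (largest_chunk - 1);
}
```

With the figures quoted above, `worst_case_with_wrap(136, 32)` gives 167, which is more than the 160 dwords reserved.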
>>>> The minimum reservation size in this case is still only 136. The prepare
>>>> code checks for the 32 words actually requested and wraps if necessary. It
>>>> then checks for 136+32 words of space. If that would cause a wrap it will
>>>> then add on the amount of space actually left in the ring and wait for that
>>>> bigger total. That guarantees that it has waited for the 136 at the start of
>>>> the ring. The caller is then free to fill in the 32 words and there is still
>>>> guaranteed to be a minimum of 136 words available (with or without wrapping)
>>>> before any further wait for space is necessary. Thus the add_request() code
>>>> is safe from fear of failure irrespective of where any wrap might occur.
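
One way to read the wait logic described above, as a rough sketch rather than the code from the patch. All names here are hypothetical, sizes are in dwords, and the model ignores I915_RING_FREE_SPACE and the actual waiting; it only computes how much free space would have to be waited for so that the reservation is available after the request has been written.

```c
#include <assert.h>

/* Hypothetical result type: whether to pad and wrap first, and how much
 * free space (in dwords) to wait for before emitting anything. */
struct wait_plan {
	int wrap_first;
	int wait_dwords;
};

struct wait_plan plan_ring_begin(int ring_size, int tail,
				 int request, int reserved)
{
	struct wait_plan plan = { 0, 0 };
	int remain = ring_size - tail;	/* dwords left before the ring end */
	int total = request + reserved;

	assert(tail >= 0 && tail < ring_size);
	assert(request > 0 && reserved >= 0);

	if (request > remain) {
		/* The request itself does not fit: throw away the remainder
		 * and wait for request + reservation at the ring start. */
		plan.wrap_first = 1;
		plan.wait_dwords = remain + total;
	} else if (total > remain) {
		/* The request fits but the reservation would fall off the
		 * end. The request uses part of `remain`, so the rest of
		 * `remain` plus the reservation must also be free. */
		plan.wait_dwords = remain + reserved;
	} else {
		/* No wrap in sight: just request + reservation. */
		plan.wait_dwords = total;
	}
	return plan;
}
```

For example, with a 1024-dword ring, a 32-dword request and a 136-dword reservation, a tail at 1000 yields a wrap and a wait for 192 dwords, while a tail at 100 yields a plain wait for 168.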
>>>>
>>>>
>>>>> Wrt keeping track of wrapping while the reservation is in use, the
>>>>> following should do that without any need of additional tracking:
>>>>>
>>>>>
>>>>> 	int used_size = ringbuf->tail - ringbuf->reserved_tail;
>>>>>
>>>>> 	if (used_size < 0)
>>>>> 		used_size += ringbuf->size;
>>>>>
>>>>> 	WARN(used_size > ringbuf->reserved_size,
>>>>> 	     "request reserved size too small: %d vs %d!\n",
>>>>> 	     used_size, ringbuf->reserved_size);
>>>>>
>>>>> I was mistaken that you can reuse __intel_ring_space (since that has
>>>>> slightly different requirements), but this gives you a nicely localized
>>>>> check for reservation overflow which works even when you wrap. Ofc it
>>>>> won't work if an add_request is bigger than the entire ring, but that's
>>>>> impossible anyway since we can at most reserve ringbuf->size -
>>>>> I915_RING_FREE_SPACE.
>>>> The problem with the above calculation is that it includes the wasted space
>>>> at the end of the ring. Thus it will complain the reserved size was too
>>>> small when in fact it was just fine.
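
A small model of this objection, as a sketch under stated assumptions: the ring is measured in dwords, a chunk that does not fit before the end of the ring is placed at offset 0, and the skipped dwords count as padding. The function names and the chunk sizes in the example are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: emit `n` chunks starting at `tail`. Returns the new
 * tail and stores the number of dwords wasted on wraps in `*padding`. */
int emit_chunks(int ring_size, int tail, const int *chunks, size_t n,
		int *padding)
{
	*padding = 0;
	for (size_t i = 0; i < n; i++) {
		int remain = ring_size - tail;

		assert(chunks[i] > 0 && chunks[i] <= ring_size);
		if (chunks[i] > remain) {
			/* Chunk does not fit: waste the remainder, wrap. */
			*padding += remain;
			tail = 0;
		}
		tail += chunks[i];
	}
	return tail;
}

/* Wrap-aware distance between the reservation start and the tail, in the
 * style of the quoted snippet. */
int used_size(int ring_size, int reserved_tail, int tail)
{
	int used = tail - reserved_tail;

	if (used < 0)
		used += ring_size;
	return used;
}
```

Starting at offset 985 in a 1024-dword ring and emitting chunks of 8, 32 and 96 dwords (136 in total), the 32-dword chunk wraps and wastes 31 dwords. The computed distance is then 167, which exceeds a 160-dword reservation even though only 136 dwords of commands were written.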
>>> Ok I again misunderstood your patch a bit since it didn't quite do what I
>>> expect, and I stand corrected that v5 works too. But I still seem to fail
>>> to get my main concern across. I'll see whether I can whip up a patch as a
>>> short demonstration, maybe that helps to unconfuse this discussion.
>>>
>>> For now I think we're covered with either v4 or v5 so sticking with either
>>> is ok with me.
>>> -Daniel
>> I think v5 is much better. It reduces the ring space wastage which I thought
>> was your main concern.
> Ok with me too - I simply didn't pick it up when merging yesterday because
> I couldn't immediately convince myself it's correct, but really wanted to
> pull in your series. Unfortunately it's now buried below piles of
> patches, so can you please do a delta patch?
Delta patch posted: '[PATCH] drm/i915: Reserve space improvements'.


>
>> The problem with a more simplistic approach that just doubles the minimum
>> reserve size to ensure that it will fit before or after a wrap is that you
>> are doubling the reserve size. That too is rather wasteful of ring space. It
>> also means that you only find out when the reserve size is too small when
>> you hit the maximum usage coincident with a worst case wrap point. Whereas
>> the v5 method means that you notice a too small reserve whether wrapping or
>> not.
> We don't need to double the reservation since the add_request tail is
> split up into many individual intel_ring_begin. And we'd only need to wrap
> for the largest of those, which is substantially less than the entire
> reservation. Furthermore with the reservation these commands can't ever
> fail, so for those we know are only used in the add_request tail we could
> go to a wrap-only intel_ring_begin which never waits and have one at a
> dword cmd boundary. That means we'd need to overestimate the needed
> ringbuffer space by just a few dwords (namely the size of the longest CS
> cmd we emit under reservation). Which is around 6 dwords or so iirc. And
> to avoid changing ilk we could just special case that in reserve_space().
>
> In practice I don't think there would be any difference with your v5 since
> especially with the scheduler we shouldn't ever overfill rings really. But
> the clear upside is that the reserve_space_end check would be independent
> of any implementation details of how reservation vs. wrapping is done
> exactly. And hence robust against any future fumbles in this area. Looking
> at our history of the relevant code we can expect a lot of those.
> -Daniel


