From: John Harrison <John.C.Harrison@Intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
Date: Wed, 24 Jun 2015 18:05:08 +0100 [thread overview]
Message-ID: <558AE344.90006@Intel.com> (raw)
In-Reply-To: <20150624124533.GG25769@phenom.ffwll.local>
On 24/06/2015 13:45, Daniel Vetter wrote:
> On Wed, Jun 24, 2015 at 01:18:48PM +0100, John Harrison wrote:
>> On 23/06/2015 21:00, Daniel Vetter wrote:
>>> On Tue, Jun 23, 2015 at 04:43:24PM +0100, John Harrison wrote:
>>>> On 23/06/2015 14:24, Daniel Vetter wrote:
>>>>> On Tue, Jun 23, 2015 at 12:38:01PM +0100, John Harrison wrote:
>>>>>> On 22/06/2015 21:12, Daniel Vetter wrote:
>>>>>>> On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@Intel.com wrote:
>>>>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>>>>
>>>>>>>> It is a bad idea for i915_add_request() to fail. The work will already have been
>>>>>>>> send to the ring and will be processed, but there will not be any tracking or
>>>>>>>> management of that work.
>>>>>>>>
>>>>>>>> The only way the add request call can fail is if it can't write its epilogue
>>>>>>>> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
>>>>>>>> reasons for that are mostly down to running out of ring buffer space and the
>>>>>>>> problems associated with trying to get some more. This patch prevents that
>>>>>>>> situation from happening in the first place.
>>>>>>>>
>>>>>>>> When a request is created, it marks sufficient space as reserved for the
>>>>>>>> epilogue commands. Thus guaranteeing that by the time the epilogue is written,
>>>>>>>> there will be plenty of space for it. Note that a ring_begin() call is required
>>>>>>>> to actually reserve the space (and do any potential waiting). However, that is
>>>>>>>> not currently done at request creation time. This is because the ring_begin()
>>>>>>>> code can allocate a request. Hence calling begin() from the request allocation
>>>>>>>> code would lead to infinite recursion! Later patches in this series remove the
>>>>>>>> need for begin() to do the allocate. At that point, it becomes safe for the
>>>>>>>> allocate to call begin() and really reserve the space.
>>>>>>>>
>>>>>>>> Until then, there is a potential for insufficient space to be available at the
>>>>>>>> point of calling i915_add_request(). However, that would only be in the case
>>>>>>>> where the request was created and immediately submitted without ever calling
>>>>>>>> ring_begin() and adding any work to that request. Which should never happen. And
>>>>>>>> even if it does, and if that request happens to fall down the tiny window of
>>>>>>>> opportunity for failing due to being out of ring space then does it really
>>>>>>>> matter because the request wasn't doing anything in the first place?
>>>>>>>>
>>>>>>>> v2: Updated the 'reserved space too small' warning to include the offending
>>>>>>>> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
>>>>>>>> re-initialisation of tracking state after a buffer wrap to keep the sanity
>>>>>>>> checks accurate.
>>>>>>>>
>>>>>>>> v3: Incremented the reserved size to accommodate Ironlake (after finally
>>>>>>>> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
>>>>>>>>
>>>>>>>> v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
>>>>>>>>
>>>>>>>> v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
>>>>>>>> Daniel Vetter).
>>>>>>> This didn't actually implement what I suggested (wrapping is the worst
>>>>>>> case, hence skipping the check for that is breaking the sanity check) and
>>>>>>> so changed the patch from "correct, but a bit fragile" to broken. I've
>>>>>>> merged the previous version instead.
>>>>>>> -Daniel
>>>>>> I'm confused. I thought your main issue was the early wrapping not the
>>>>>> sanity check. The check is to ensure that the reservation is large enough to
>>>>>> cover all the commands written during request submission. That should not be
>>>>>> affected by whether a wrap occurs or not. Wrapping does not magically add an
>>>>>> extra bunch of dwords to the emit_request() call. Whereas making the check
>>>>>> work with the wrap condition requires adding in extra tracking state of
>>>>>> exactly where the wrap occurred. That is extra code that only exists to
>>>>>> catch something in the very rare case which should already have been caught
>>>>>> in the very common case. I.e. if your reserved size is too small then you
>>>>>> will hit the warning on every batch buffer submission.
>>>>> The problem is that if you allow a wrap in the reserve size then the
>>>>> ringspace requirements are bigger than if you don't wrap. And since the
>>>>> add request is split up into many intel_ring_begin that's possible. Hence
>>>>> if you allow wrapping in the reserved space, then the most important case
>>>>> for the debug check is to make sure that it catches any kind of
>>>>> reservation overflow while wrapping. The not-wrapped case is probably the
>>>>> boring one.
>>>>>
>>>>> And indeed eventually we should overflow since according to your comment
>>>>> the worst case add request on ilk is 136 dwords. And the largest
>>>>> intel_ring_begin in there is 32 dwords, which means at most we'll throw
>>>>> away 31 dwords when wrapping. Which means the 160 dwords of reservation
>>>>> are not enough since we'd need 167 dwords of space for the worst case. But
>>>>> since the space_end debug check was a no-op for the wrapped case you won't
>>>>> catch this one.
>>>> The minimum reservation size in this case is still only 136. The prepare
>>>> code checks for the 32 words actually requested and wraps if necessary. It
>>>> then checks for 136+32 words of space. If that would cause a wrap it will
>>>> then add on the amount of space actually left in the ring and wait for that
>>>> bigger total. That guarantees that it has waited for the 136 at the start of
>>>> the ring. The caller is then free to fill in the 32 words and there is still
>>>> guaranteed to be a minimum of 136 words available (with or without wrapping)
>>>> before any further wait for space is necessary. Thus the add_request() code
>>>> is safe from fear of failure irrespective of where any wrap might occur.
>>>>
>>>>
>>>>> Wrt keeping track of wrapping while the reservation is in use, the
>>>>> following should do that without any need of additional tracking:
>>>>>
>>>>>
>>>>> int used_size = ringbuf->tail - ringbuf->reserved_tail;
>>>>>
>>>>> if (used_size < 0)
>>>>> used_size += ringbuf->size;
>>>>>
>>>>> WARN(used_size < ringbuf->reserved_size,
>>>>> "request reserved size too small: %d vs %d!\n",
>>>>> used_size, ringbuf->reserved_size);
>>>>>
>>>>> I was mistaken that you can reuse __intel_ring_space (since that has
>>>>> slightly different requirements), but this gives you a nicely localized
>>>>> check for reservation overflow which works even when you wrap. Ofc it
>>>>> won't work if an add_request is bigger than the entire ring, but that's
>>>>> impossible anyway since we can at most reserve ringbuf->size -
>>>>> I915_RING_FREE_SPACE.
>>>> The problem with the above calculation is that it includes the wasted space
>>>> at the end of the ring. Thus it will complain the reserved size was too
>>>> small when in fact it was just fine.
>>> Ok I again misunderstood your patch a bit since it didn't quite do what I
>>> expect, and I stand corrected that v5 works too. But I still seem to fail
>>> to get my main concern across. I'll see whether I can whip up a patch as a
>>> short demonstration, maybe that helps to unconfuse this dicussion.
>>>
>>> For now I think we're covered with either v4 or v5 so sticking with either
>>> is ok with me.
>>> -Daniel
>> I think v5 is much better. It reduces the ring space wastage which I thought
>> was your main concern.
> Ok with me too - I simply didn't pick it up when merging yesterday because
> I couldn't immediately convince myself it's correct, but really wanted to
> pull in your series. Unfortunately it's now burried below piles of
> patches, so can you please do a delta patch?
Delta patch posted: '[PATCH] drm/i915: Reserve space improvements'.
>
>> The problem with a more simplistic approach that just doubles the minimum
>> reserve size to ensure that it will fit before or after a wrap is that you
>> are doubling the reserve size. That too is rather wasteful of ring space. It
>> also means that you only find out when the reserve size is too small when
>> you hit the maximum usage coincident with a worst case wrap point. Whereas
>> the v5 method means that you notice a too small reserve whether wrapping or
>> not.
> We don't need to double the reservation since the add_request tail is
> split up into many individual intel_ring_begin. And we'd only need to wrap
> for the largest of those, which is substantially less than the entire
> reservation. Furthermore with the reservation these commands can't ever
> fail, so for those we know are only used in the add_request tail we could
> go to a wrap-only intel_ring_begin which never waits and have one at a
> dword cmd boundary. That means we'd need to overestimate the needed
> ringbuffer space by just a few dwords (namely the size of the longest CS
> cmd we emit under reservation). Which is around 6 dwords or so iirc. And
> to avoid changing ilk we could just special case that in reserve_space().
>
> In practice I don't think there would be any difference with your v5 since
> especially with the scheduler we shouldn't ever overfill rings really. But
> the clear upside is that the reserve_space_end check would be independent
> of any implementation details of how reservation vs. wrapping is done
> exactly. And hence robust against any future fumbles in this area. Looking
> at our history of the relevant code we can expect a lot of those.
> -Daniel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2015-06-24 17:05 UTC|newest]
Thread overview: 120+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
2015-05-29 16:43 ` [PATCH 01/55] drm/i915: Re-instate request->uniq becuase it is extremely useful John.C.Harrison
2015-06-03 11:14 ` Tomas Elf
2015-05-29 16:43 ` [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands John.C.Harrison
2015-06-02 18:14 ` Tomas Elf
2015-06-04 12:06 ` John.C.Harrison
2015-06-09 16:00 ` Tomas Elf
2015-06-18 12:10 ` John.C.Harrison
2015-06-17 14:04 ` Daniel Vetter
2015-06-18 10:43 ` John Harrison
2015-06-19 16:34 ` John.C.Harrison
2015-06-22 20:12 ` Daniel Vetter
2015-06-23 11:38 ` John Harrison
2015-06-23 13:24 ` Daniel Vetter
2015-06-23 15:43 ` John Harrison
2015-06-23 20:00 ` Daniel Vetter
2015-06-24 12:18 ` John Harrison
2015-06-24 12:45 ` Daniel Vetter
2015-06-24 17:05 ` John Harrison [this message]
2015-05-29 16:43 ` [PATCH 03/55] drm/i915: i915_add_request must not fail John.C.Harrison
2015-06-02 18:16 ` Tomas Elf
2015-06-04 14:07 ` John Harrison
2015-06-05 10:55 ` Tomas Elf
2015-06-23 10:16 ` Chris Wilson
2015-06-23 10:47 ` John Harrison
2015-05-29 16:43 ` [PATCH 04/55] drm/i915: Early alloc request in execbuff John.C.Harrison
2015-05-29 16:43 ` [PATCH 05/55] drm/i915: Set context in request from creation even in legacy mode John.C.Harrison
2015-05-29 16:43 ` [PATCH 06/55] drm/i915: Merged the many do_execbuf() parameters into a structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 07/55] drm/i915: Simplify i915_gem_execbuffer_retire_commands() parameters John.C.Harrison
2015-05-29 16:43 ` [PATCH 08/55] drm/i915: Update alloc_request to return the allocated request John.C.Harrison
2015-05-29 16:43 ` [PATCH 09/55] drm/i915: Add request to execbuf params and add explicit cleanup John.C.Harrison
2015-05-29 16:43 ` [PATCH 10/55] drm/i915: Update the dispatch tracepoint to use params->request John.C.Harrison
2015-05-29 16:43 ` [PATCH 11/55] drm/i915: Update move_to_gpu() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 12/55] drm/i915: Update execbuffer_move_to_active() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 13/55] drm/i915: Add flag to i915_add_request() to skip the cache flush John.C.Harrison
2015-06-02 18:19 ` Tomas Elf
2015-05-29 16:43 ` [PATCH 14/55] drm/i915: Update i915_gpu_idle() to manage its own request John.C.Harrison
2015-05-29 16:43 ` [PATCH 15/55] drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring John.C.Harrison
2015-06-18 12:11 ` John.C.Harrison
2015-05-29 16:43 ` [PATCH 16/55] drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable() John.C.Harrison
2015-05-29 16:43 ` [PATCH 17/55] drm/i915: Don't tag kernel batches as user batches John.C.Harrison
2015-05-29 16:43 ` [PATCH 18/55] drm/i915: Add explicit request management to i915_gem_init_hw() John.C.Harrison
2015-06-02 18:20 ` Tomas Elf
2015-05-29 16:43 ` [PATCH 19/55] drm/i915: Update ppgtt_init_ring() & context_enable() to take requests John.C.Harrison
2015-05-29 16:43 ` [PATCH 20/55] drm/i915: Update i915_switch_context() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 21/55] drm/i915: Update do_switch() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 22/55] drm/i915: Update deferred context creation to do explicit request management John.C.Harrison
2015-06-02 18:22 ` Tomas Elf
2015-05-29 16:43 ` [PATCH 23/55] drm/i915: Update init_context() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 24/55] drm/i915: Update render_state_init() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 25/55] drm/i915: Update i915_gem_object_sync() " John.C.Harrison
2015-06-02 18:26 ` Tomas Elf
2015-06-04 12:57 ` John Harrison
2015-06-18 12:14 ` John.C.Harrison
2015-06-18 12:21 ` Chris Wilson
2015-06-18 12:59 ` John Harrison
2015-06-18 14:24 ` Daniel Vetter
2015-06-18 15:39 ` Chris Wilson
2015-06-18 16:16 ` John Harrison
2015-06-22 20:03 ` Daniel Vetter
2015-06-22 20:14 ` Chris Wilson
2015-06-18 16:36 ` 3.16 backlight kernel options Stéphane ANCELOT
2015-05-29 16:43 ` [PATCH 26/55] drm/i915: Update overlay code to do explicit request management John.C.Harrison
2015-05-29 16:43 ` [PATCH 27/55] drm/i915: Update queue_flip() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 28/55] drm/i915: Update add_request() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 29/55] drm/i915: Update [vma|object]_move_to_active() to take request structures John.C.Harrison
2015-05-29 16:43 ` [PATCH 30/55] drm/i915: Update l3_remap to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 31/55] drm/i915: Update mi_set_context() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 32/55] drm/i915: Update a bunch of execbuffer helpers to take request structures John.C.Harrison
2015-05-29 16:43 ` [PATCH 33/55] drm/i915: Update workarounds_emit() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 34/55] drm/i915: Update flush_all_caches() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 35/55] drm/i915: Update switch_mm() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 36/55] drm/i915: Update ring->flush() to take a requests structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 37/55] drm/i915: Update some flush helpers to take request structures John.C.Harrison
2015-05-29 16:43 ` [PATCH 38/55] drm/i915: Update ring->emit_flush() to take a request structure John.C.Harrison
2015-05-29 16:44 ` [PATCH 39/55] drm/i915: Update ring->add_request() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 40/55] drm/i915: Update ring->emit_request() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 41/55] drm/i915: Update ring->dispatch_execbuffer() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 42/55] drm/i915: Update ring->emit_bb_start() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 43/55] drm/i915: Update ring->sync_to() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 44/55] drm/i915: Update ring->signal() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 45/55] drm/i915: Update cacheline_align() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 46/55] drm/i915: Update intel_ring_begin() " John.C.Harrison
2015-06-23 10:24 ` Chris Wilson
2015-06-23 10:37 ` John Harrison
2015-06-23 13:25 ` Daniel Vetter
2015-06-23 15:27 ` John Harrison
2015-06-23 15:34 ` Daniel Vetter
2015-05-29 16:44 ` [PATCH 47/55] drm/i915: Update intel_logical_ring_begin() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation John.C.Harrison
2015-06-17 13:31 ` Daniel Vetter
2015-06-17 14:27 ` Chris Wilson
2015-06-17 14:54 ` Daniel Vetter
2015-06-17 15:52 ` Chris Wilson
2015-06-18 11:21 ` John Harrison
2015-06-18 13:29 ` Daniel Vetter
2015-06-19 16:34 ` John Harrison
2015-05-29 16:44 ` [PATCH 49/55] drm/i915: Remove the now obsolete intel_ring_get_request() John.C.Harrison
2015-05-29 16:44 ` [PATCH 50/55] drm/i915: Remove the now obsolete 'outstanding_lazy_request' John.C.Harrison
2015-05-29 16:44 ` [PATCH 51/55] drm/i915: Move the request/file and request/pid association to creation time John.C.Harrison
2015-06-03 11:15 ` Tomas Elf
2015-05-29 16:44 ` [PATCH 52/55] drm/i915: Remove 'faked' request from LRC submission John.C.Harrison
2015-05-29 16:44 ` [PATCH 53/55] drm/i915: Update a bunch of LRC functions to take requests John.C.Harrison
2015-05-29 16:44 ` [PATCH 54/55] drm/i915: Remove the now obsolete 'i915_gem_check_olr()' John.C.Harrison
2015-06-02 18:27 ` Tomas Elf
2015-06-23 10:23 ` Chris Wilson
2015-06-23 10:39 ` John Harrison
2015-05-29 16:44 ` [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active() John.C.Harrison
2015-06-02 18:27 ` Tomas Elf
2015-06-17 14:06 ` Daniel Vetter
2015-06-17 14:21 ` Chris Wilson
2015-06-18 11:03 ` John Harrison
2015-06-18 11:10 ` Chris Wilson
2015-06-18 11:27 ` John Harrison
2015-06-18 10:57 ` John Harrison
2015-06-04 18:23 ` [PATCH 14/56] drm/i915: Make retire condition check for requests not objects John.C.Harrison
2015-06-04 18:24 ` John Harrison
2015-06-09 15:56 ` Tomas Elf
2015-06-17 15:01 ` Daniel Vetter
2015-06-22 21:04 ` [PATCH 00/55] Remove the outstanding_lazy_request Daniel Vetter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=558AE344.90006@Intel.com \
--to=john.c.harrison@intel.com \
--cc=Intel-GFX@Lists.FreeDesktop.Org \
--cc=daniel@ffwll.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox