From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 08/66] drm/i915: Make the stale cached active node available for any timeline
Date: Wed, 29 Jul 2020 15:22:26 +0100 [thread overview]
Message-ID: <0b558bfe-7baa-4373-abfe-e739971ee288@linux.intel.com> (raw)
In-Reply-To: <159603012686.8877.9862976259674771406@build.alporthouse.com>
On 29/07/2020 14:42, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-07-29 13:40:38)
>>
>> On 28/07/2020 15:28, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2020-07-17 14:04:58)
>>>>
>>>> On 15/07/2020 12:50, Chris Wilson wrote:
>>>>> Rather than require the next timeline after idling to match the MRU
>>>>> before idling, reset the index on the node and allow it to match the
>>>>> first request. However, this requires cmpxchg(u64) and so is not trivial
>>>>> on 32b, so for compatibility we just fallback to keeping the cached node
>>>>> pointing to the MRU timeline.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> ---
>>>>> drivers/gpu/drm/i915/i915_active.c | 21 +++++++++++++++++++--
>>>>> 1 file changed, 19 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
>>>>> index 0854b1552bc1..6737b5615c0c 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_active.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_active.c
>>>>> @@ -157,6 +157,10 @@ __active_retire(struct i915_active *ref)
>>>>> rb_link_node(&ref->cache->node, NULL, &ref->tree.rb_node);
>>>>> rb_insert_color(&ref->cache->node, &ref->tree);
>>>>> GEM_BUG_ON(ref->tree.rb_node != &ref->cache->node);
>>>>> +
>>>>> + /* Make the cached node available for reuse with any timeline */
>>>>> + if (IS_ENABLED(CONFIG_64BIT))
>>>>> + ref->cache->timeline = 0; /* needs cmpxchg(u64) */
>>>>
>>>> Or when fence context wraps shock horror.
>>>
>>> I more concerned about that we use timeline:0 as a special unordered
>>> timeline. It's reserved by use in the dma_fence_stub, and everything
>>> will start to break when the timelines wrap. The earliest causalities
>>> will be the kernel_context timelines which are also very special indices
>>> for the barriers.
>>>
>>>>
>>>>> }
>>>>>
>>>>> spin_unlock_irqrestore(&ref->tree_lock, flags);
>>>>> @@ -235,9 +239,22 @@ static struct active_node *__active_lookup(struct i915_active *ref, u64 idx)
>>>>> {
>>>>> struct active_node *it;
>>>>>
>>>>> + GEM_BUG_ON(idx == 0); /* 0 is the unordered timeline, rsvd for cache */
>>>>> +
>>>>> it = READ_ONCE(ref->cache);
>>>>> - if (it && it->timeline == idx)
>>>>> - return it;
>>>>> + if (it) {
>>>>> + u64 cached = READ_ONCE(it->timeline);
>>>>> +
>>>>> + if (cached == idx)
>>>>> + return it;
>>>>> +
>>>>> +#ifdef CONFIG_64BIT /* for cmpxchg(u64) */
>>>>> + if (!cached && !cmpxchg(&it->timeline, 0, idx)) {
>>>>> + GEM_BUG_ON(i915_active_fence_isset(&it->base));
>>>>> + return it;
>>>>
>>>> cpmxchg suggests this needs to be atomic, however above the check for
>>>> equality comes from a separate read.
>>>
>>> That's fine, and quite common to avoid cmpxchg if the current value
>>> already does not match the expected condition.
>>
>> How? What is another thread is about to install its idx into
>> it->timeline with cmpxchg and this thread does not see it because it
>> just returned on the "cached == idx" condition.
>
> Because it's nonzero.
>
> If the idx is already non-zero, it will always remain non-zero until
> everybody idles (and there are no more threads).
>
> If the idx is zero, it can only transition to non-zero once, atomically
> via cmpxchg. The first and only first cmpxchg will return that the
> previous value was 0, and so return with it->idx == idx.
I think this is worthy of a comment to avoid future reader having to
re-figure it all out.
>>>> Since there is a lookup code path under the spinlock, perhaps the
>>>> unlocked lookup could just fail, and then locked lookup could re-assign
>>>> the timeline without the need for cmpxchg?
>>>
>>> The unlocked/locked lookup are the same routine. You pointed that out
>>> :-p
>>
>> Like I remember from ten days ago.. Anyway, I am pointing out it still
>> doesn't smell right.
>>
>> __active_lookup(...) -> lockless
>> {
>> ...
>> it = fetch_node(ref->tree.rb_node);
>> while (it) {
>> if (it->timeline < idx) {
>> it = fetch_node(it->node.rb_right);
>> } else if (it->timeline > idx) {
>> it = fetch_node(it->node.rb_left);
>> } else {
>> WRITE_ONCE(ref->cache, it);
>> break;
>> }
>> }
>> ...
>> }
>>
>> Then in active_instance, locked:
>>
>> ...
>> parent = NULL;
>> p = &ref->tree.rb_node;
>> while (*p) {
>> parent = *p;
>>
>> node = rb_entry(parent, struct active_node, node);
>> if (node->timeline == idx) {
>> kmem_cache_free(global.slab_cache, prealloc);
>> goto out;
>> }
>>
>> if (node->timeline < idx)
>> p = &parent->rb_right;
>> else
>> p = &parent->rb_left;
>> WRITE_ONCE(ref->cache, it);
>> break;
>> }
>> }
>> ...
>>
>> Tree walk could be consolidated between the two.
>
> This tree walk is subtly different, as we aren't just interested in the
> node, but its parent. The exact repetitions have been consolidated into
> __active_lookup.
It returns the previous/parent node if idx is not found so yeah, common
helper would need to have two out parameters. One returns the match, or
NULL, another returns the previous/parent node. You think that is not
worth it?
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2020-07-29 14:22 UTC|newest]
Thread overview: 154+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-15 11:50 [Intel-gfx] [PATCH 01/66] drm/i915: Reduce i915_request.lock contention for i915_request_wait Chris Wilson
2020-07-15 11:50 ` [Intel-gfx] [PATCH 02/66] drm/i915: Remove i915_request.lock requirement for execution callbacks Chris Wilson
2020-07-15 11:50 ` [Intel-gfx] [PATCH 03/66] drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs Chris Wilson
2020-07-15 11:50 ` [Intel-gfx] [PATCH 04/66] drm/i915: Add a couple of missing i915_active_fini() Chris Wilson
2020-07-17 12:00 ` Tvrtko Ursulin
2020-07-21 12:23 ` Thomas Hellström (Intel)
2020-07-15 11:50 ` [Intel-gfx] [PATCH 05/66] drm/i915: Skip taking acquire mutex for no ref->active callback Chris Wilson
2020-07-17 12:04 ` Tvrtko Ursulin
2020-07-21 12:32 ` Thomas Hellström (Intel)
2020-07-15 11:50 ` [Intel-gfx] [PATCH 06/66] drm/i915: Export a preallocate variant of i915_active_acquire() Chris Wilson
2020-07-17 12:21 ` Tvrtko Ursulin
2020-07-17 12:45 ` Chris Wilson
2020-07-17 13:06 ` Tvrtko Ursulin
2020-07-21 15:33 ` Thomas Hellström (Intel)
2020-07-15 11:50 ` [Intel-gfx] [PATCH 07/66] drm/i915: Keep the most recently used active-fence upon discard Chris Wilson
2020-07-17 12:38 ` Tvrtko Ursulin
2020-07-28 14:22 ` Chris Wilson
2020-07-22 9:46 ` Thomas Hellström (Intel)
2020-07-15 11:50 ` [Intel-gfx] [PATCH 08/66] drm/i915: Make the stale cached active node available for any timeline Chris Wilson
2020-07-17 13:04 ` Tvrtko Ursulin
2020-07-28 14:28 ` Chris Wilson
2020-07-29 12:40 ` Tvrtko Ursulin
2020-07-29 13:42 ` Chris Wilson
2020-07-29 13:53 ` Chris Wilson
2020-07-29 14:22 ` Tvrtko Ursulin [this message]
2020-07-29 14:39 ` Chris Wilson
2020-07-29 14:52 ` Chris Wilson
2020-07-29 15:31 ` Tvrtko Ursulin
2020-07-22 11:19 ` Thomas Hellström (Intel)
2020-07-28 14:31 ` Chris Wilson
2020-07-15 11:50 ` [Intel-gfx] [PATCH 09/66] drm/i915: Provide a fastpath for waiting on vma bindings Chris Wilson
2020-07-17 13:23 ` Tvrtko Ursulin
2020-07-28 14:35 ` Chris Wilson
2020-07-29 12:43 ` Tvrtko Ursulin
2020-07-22 15:07 ` Thomas Hellström (Intel)
2020-07-15 11:50 ` [Intel-gfx] [PATCH 10/66] drm/i915: Soften the tasklet flush frequency before waits Chris Wilson
2020-07-16 14:23 ` Mika Kuoppala
2020-07-22 15:10 ` Thomas Hellström (Intel)
2020-07-15 11:50 ` [Intel-gfx] [PATCH 11/66] drm/i915: Preallocate stashes for vma page-directories Chris Wilson
2020-07-20 10:35 ` Matthew Auld
2020-07-23 14:33 ` Thomas Hellström (Intel)
2020-07-28 14:42 ` Chris Wilson
2020-07-31 7:43 ` Thomas Hellström (Intel)
2020-07-27 9:24 ` Thomas Hellström (Intel)
2020-07-28 14:50 ` Chris Wilson
2020-07-30 12:04 ` Thomas Hellström (Intel)
2020-07-30 12:28 ` Thomas Hellström (Intel)
2020-08-04 14:08 ` Chris Wilson
2020-08-04 16:14 ` Daniel Vetter
2020-07-15 11:50 ` [Intel-gfx] [PATCH 12/66] drm/i915: Switch to object allocations for page directories Chris Wilson
2020-07-20 10:34 ` Matthew Auld
2020-07-20 10:40 ` Chris Wilson
2020-07-15 11:50 ` [Intel-gfx] [PATCH 13/66] drm/i915/gem: Don't drop the timeline lock during execbuf Chris Wilson
2020-07-23 16:09 ` Thomas Hellström (Intel)
2020-07-28 14:46 ` Thomas Hellström (Intel)
2020-07-28 14:51 ` Chris Wilson
2020-07-31 8:09 ` Thomas Hellström (Intel)
2020-07-15 11:50 ` [Intel-gfx] [PATCH 14/66] drm/i915/gem: Rename execbuf.bind_link to unbound_link Chris Wilson
2020-07-31 8:11 ` Thomas Hellström (Intel)
2020-07-15 11:50 ` [Intel-gfx] [PATCH 15/66] drm/i915/gem: Break apart the early i915_vma_pin from execbuf object lookup Chris Wilson
2020-07-31 8:51 ` Thomas Hellström (Intel)
2020-07-15 11:50 ` [Intel-gfx] [PATCH 16/66] drm/i915/gem: Remove the call for no-evict i915_vma_pin Chris Wilson
2020-07-17 14:36 ` Tvrtko Ursulin
2020-07-28 15:04 ` Chris Wilson
2020-07-28 9:46 ` Thomas Hellström (Intel)
2020-07-28 15:05 ` Chris Wilson
2020-07-31 8:58 ` Thomas Hellström (Intel)
2020-07-15 11:50 ` [Intel-gfx] [PATCH 17/66] drm/i915: Add list_for_each_entry_safe_continue_reverse Chris Wilson
2020-07-31 8:59 ` Thomas Hellström (Intel)
2020-07-15 11:50 ` [Intel-gfx] [PATCH 18/66] drm/i915: Always defer fenced work to the worker Chris Wilson
2020-07-31 9:03 ` Thomas Hellström (Intel)
2020-07-31 13:28 ` Chris Wilson
2020-07-31 13:31 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 19/66] drm/i915/gem: Assign context id for async work Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 20/66] drm/i915/gem: Separate the ww_mutex walker into its own list Chris Wilson
2020-07-31 9:23 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 21/66] drm/i915/gem: Asynchronous GTT unbinding Chris Wilson
2020-07-31 13:09 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 22/66] drm/i915/gem: Bind the fence async for execbuf Chris Wilson
2020-07-27 18:19 ` Thomas Hellström (Intel)
2020-07-28 15:08 ` Chris Wilson
2020-07-31 13:12 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 23/66] drm/i915/gem: Include cmdparser in common execbuf pinning Chris Wilson
2020-07-31 9:43 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 24/66] drm/i915/gem: Include secure batch " Chris Wilson
2020-07-31 9:47 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 25/66] drm/i915/gem: Reintroduce multiple passes for reloc processing Chris Wilson
2020-07-31 10:05 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 26/66] drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2 Chris Wilson
2020-07-31 10:07 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 27/66] drm/i915/gem: Pull execbuf dma resv under a single critical section Chris Wilson
2020-07-27 18:08 ` Thomas Hellström (Intel)
2020-07-28 15:16 ` Chris Wilson
2020-07-30 12:57 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 28/66] drm/i915/gem: Replace i915_gem_object.mm.mutex with reservation_ww_class Chris Wilson
2020-07-15 15:43 ` Maarten Lankhorst
2020-07-16 15:53 ` Tvrtko Ursulin
2020-07-28 11:17 ` Thomas Hellström (Intel)
2020-07-29 7:56 ` Thomas Hellström (Intel)
2020-07-29 12:17 ` Tvrtko Ursulin
2020-07-29 13:44 ` Thomas Hellström (Intel)
2020-08-05 12:12 ` Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 29/66] drm/i915: Hold wakeref for the duration of the vma GGTT binding Chris Wilson
2020-07-31 10:09 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 30/66] drm/i915: Specialise " Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 31/66] drm/i915/gt: Acquire backing storage for the context Chris Wilson
2020-07-31 10:27 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 32/66] drm/i915/gt: Push the wait for the context to bound to the request Chris Wilson
2020-07-31 10:48 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 33/66] drm/i915: Remove unused i915_gem_evict_vm() Chris Wilson
2020-07-31 10:51 ` Thomas Hellström (Intel)
2020-07-15 11:51 ` [Intel-gfx] [PATCH 34/66] drm/i915/gt: Decouple completed requests on unwind Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 35/66] drm/i915/gt: Check for a completed last request once Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 36/66] drm/i915/gt: Replace direct submit with direct call to tasklet Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 37/66] drm/i915/gt: Free stale request on destroying the virtual engine Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 38/66] drm/i915/gt: Use virtual_engine during execlists_dequeue Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 39/66] drm/i915/gt: Decouple inflight virtual engines Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 40/66] drm/i915/gt: Defer schedule_out until after the next dequeue Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 41/66] drm/i915/gt: Resubmit the virtual engine on schedule-out Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 42/66] drm/i915/gt: Simplify virtual engine handling for execlists_hold() Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 43/66] drm/i915/gt: ce->inflight updates are now serialised Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 44/66] drm/i915/gt: Drop atomic for engine->fw_active tracking Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 45/66] drm/i915/gt: Extract busy-stats for ring-scheduler Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 46/66] drm/i915/gt: Convert stats.active to plain unsigned int Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 47/66] drm/i915: Lift waiter/signaler iterators Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 48/66] drm/i915: Strip out internal priorities Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 49/66] drm/i915: Remove I915_USER_PRIORITY_SHIFT Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 50/66] drm/i915: Replace engine->schedule() with a known request operation Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 51/66] drm/i915/gt: Do not suspend bonded requests if one hangs Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 52/66] drm/i915: Teach the i915_dependency to use a double-lock Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 53/66] drm/i915: Restructure priority inheritance Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 54/66] drm/i915/gt: Remove timeslice suppression Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 55/66] drm/i915: Fair low-latency scheduling Chris Wilson
2020-07-15 15:33 ` [Intel-gfx] [PATCH] " Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 56/66] drm/i915/gt: Specify a deadline for the heartbeat Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 57/66] drm/i915: Replace the priority boosting for the display with a deadline Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 58/66] drm/i915: Move saturated workload detection to the GT Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 59/66] Restore "drm/i915: drop engine_pin/unpin_breadcrumbs_irq" Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 60/66] drm/i915/gt: Couple tasklet scheduling for all CS interrupts Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 61/66] drm/i915/gt: Support creation of 'internal' rings Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 62/66] drm/i915/gt: Use client timeline address for seqno writes Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 63/66] drm/i915/gt: Infrastructure for ring scheduling Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 64/66] drm/i915/gt: Implement ring scheduler for gen6/7 Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 65/66] drm/i915/gt: Enable ring scheduling " Chris Wilson
2020-07-15 11:51 ` [Intel-gfx] [PATCH 66/66] drm/i915/gem: Remove timeline nesting from snb relocs Chris Wilson
2020-07-15 13:27 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/66] drm/i915: Reduce i915_request.lock contention for i915_request_wait Patchwork
2020-07-15 13:28 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-07-15 14:20 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2020-07-15 15:41 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/66] drm/i915: Reduce i915_request.lock contention for i915_request_wait (rev2) Patchwork
2020-07-15 15:42 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-07-15 16:03 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-07-15 19:55 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2020-07-23 20:32 ` [Intel-gfx] [PATCH 01/66] drm/i915: Reduce i915_request.lock contention for i915_request_wait Dave Airlie
2020-07-27 9:35 ` Tvrtko Ursulin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0b558bfe-7baa-4373-abfe-e739971ee288@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=chris@chris-wilson.co.uk \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox