From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 22/27] drm/i915: Eliminate lots of iterations over the execobjects array
Date: Thu, 20 Apr 2017 11:49:32 +0300
Message-ID: <1492678172.17161.1.camel@linux.intel.com>
In-Reply-To: <20170419094143.16922-23-chris@chris-wilson.co.uk>
On Wed, 2017-04-19 at 10:41 +0100, Chris Wilson wrote:
> The major scaling bottleneck in execbuffer is the processing of the
> execobjects. Creating an auxiliary list is inefficient when compared to
> using the execobject array we already have allocated.
>
> Reservation is then split into phases. As we look up the VMA, we
> try to bind it back into its active location. Only if that fails do we add
> it to the unbound list for phase 2. In phase 2, we try and add all those
> objects that could not fit into their previous location, with fallback
> to retrying all objects and evicting the VM in case of severe
> fragmentation. (This is the same as before, except that phase 1 is now
> done inline with looking up the VMA to avoid an iteration over the
> execobject array. In the ideal case, we eliminate the separate reservation
> phase). During the reservation phase, we only evict from the VM between
> passes (rather than currently as we try to fit every new VMA). In
> testing with Unreal Engine's Atlantis demo which stresses the eviction
> logic on gen7 class hardware, this sped up the framerate by a factor of
> 2.
>
> The second loop amalgamation is between move_to_gpu and move_to_active.
> As we always submit the request, even if incomplete, we can use the
> current request to track active VMA as we perform the flushes and
> synchronisation required.
>
> The next big advancement is to avoid copying back to the user any
> execobjects and relocations that are not changed.
>
> v2: Add a Theory of Operation spiel.
> v3: Fall back to slow relocations in preparation for flushing userptrs.
> v4: Document struct members, factor out eb_validate_vma(), add a few
> more comments to explain some magic and hide other magic behind macros.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
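For readers skimming the thread, the phased reservation described in the
quoted commit message can be pictured roughly as below. This is a toy,
self-contained sketch and not the i915 code: the slot-based address space
and every name in it (toy_vm, toy_object, toy_reserve, bind_at,
bind_anywhere) are made up for illustration. It only shows the ordering:
try the previous location while walking the objects, collect the failures
in an unbound list, place those in a second pass, and evict the whole VM
and retry everything only as a last resort.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define VM_SLOTS    4	/* toy address space: four equal-sized slots */
#define MAX_OBJECTS 8	/* upper bound on objects per toy_reserve() call */

/* Hypothetical stand-in for an execobject and its VMA. */
struct toy_object {
	int prev_slot;	/* slot used by the previous submission, or -1 */
	int slot;	/* slot assigned by this call, or -1 */
};

/* Hypothetical stand-in for the address space (VM). */
struct toy_vm {
	bool used[VM_SLOTS];
};

/* Bind obj at one specific slot if that slot is valid and free. */
static bool bind_at(struct toy_vm *vm, struct toy_object *obj, int slot)
{
	if (slot < 0 || slot >= VM_SLOTS || vm->used[slot])
		return false;

	vm->used[slot] = true;
	obj->slot = slot;
	return true;
}

/* Bind obj at the first free slot, if any. */
static bool bind_anywhere(struct toy_vm *vm, struct toy_object *obj)
{
	for (int i = 0; i < VM_SLOTS; i++)
		if (bind_at(vm, obj, i))
			return true;
	return false;
}

/* Returns 0 on success, -1 if the objects do not fit even in an empty VM. */
static int toy_reserve(struct toy_vm *vm, struct toy_object *objs, int count)
{
	int unbound[MAX_OBJECTS];
	int nunbound = 0;
	bool ok = true;

	assert(count >= 0 && count <= MAX_OBJECTS);

	/* Phase 1, done inline with the lookup: try the previous location. */
	for (int i = 0; i < count; i++) {
		objs[i].slot = -1;
		if (!bind_at(vm, &objs[i], objs[i].prev_slot))
			unbound[nunbound++] = i;
	}

	/* Phase 2: place only the objects that did not fit in phase 1. */
	for (int i = 0; i < nunbound; i++)
		if (!bind_anywhere(vm, &objs[unbound[i]]))
			ok = false;
	if (ok)
		return 0;

	/* Fallback: evict the whole VM once, then retry every object. */
	memset(vm->used, 0, sizeof(vm->used));
	for (int i = 0; i < count; i++) {
		objs[i].slot = -1;
		if (!bind_anywhere(vm, &objs[i]))
			return -1;
	}
	return 0;
}
```

In this toy model the common case, where every object still fits at its
previous slot, finishes in the first loop and never touches the unbound
list, which is the "ideal case" the commit message refers to.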
Changelog checks out. Assuming you peeked at the generated html docs:
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation