From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v2 1/3] drm/i915: Fix eviction when the GGTT is idle but full
Date: Wed, 11 Oct 2017 13:05:02 +0100
Message-ID: <8b146a9f-d5bb-d520-f079-24345854c43c@linux.intel.com>
In-Reply-To: <20171010213809.7752-1-chris@chris-wilson.co.uk>
On 10/10/2017 22:38, Chris Wilson wrote:
> In the full-ppgtt world, we can fill the GGTT full of context objects.
> These context objects are currently implicitly tracked by the requests
> that pin them i.e. they are only unpinned when the request is completed
> and retired, but we do not have the link from the vma to the request
> (anymore). In order to unpin those contexts, we have to issue another
> request and wait upon the switch to the kernel context.
>
> The bug during eviction was that we assumed that a full GGTT meant we
> would have requests on the GGTT timeline, and so we missed situations
> where those requests were merely in flight (and even when they have not
> yet been submitted to hw). The fix employed here is to change the
> already-is-idle test to not look at the execution timeline, but count the
> outstanding requests and then check that we have switched to the kernel
> context. Erring on the side of overkill here just means that we stall a
> little longer than may be strictly required, but we only expect to hit
> this path in extreme corner cases where returning an erroneous error is
> worse than the delay.
>
> v2: Logical inversion when swapping over branches.
>
> Fixes: 80b204bce8f2 ("drm/i915: Enable multiple timelines")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem_evict.c | 63 ++++++++++++++++++++++-------------
> 1 file changed, 39 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index a5a5b7e6daae..ee4811ffb7aa 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -33,21 +33,20 @@
> #include "intel_drv.h"
> #include "i915_trace.h"
>
> -static bool ggtt_is_idle(struct drm_i915_private *dev_priv)
> +static bool ggtt_is_idle(struct drm_i915_private *i915)
> {
> - struct i915_ggtt *ggtt = &dev_priv->ggtt;
> - struct intel_engine_cs *engine;
> - enum intel_engine_id id;
> + struct intel_engine_cs *engine;
> + enum intel_engine_id id;
>
> - for_each_engine(engine, dev_priv, id) {
> - struct intel_timeline *tl;
> + if (i915->gt.active_requests)
> + return false;
>
> - tl = &ggtt->base.timeline.engine[engine->id];
> - if (i915_gem_active_isset(&tl->last_request))
> - return false;
> - }
> + for_each_engine(engine, i915, id) {
> + if (engine->last_retired_context != i915->kernel_context)
> + return false;
> + }
>
> - return true;
> + return true;
> }
>
> static int ggtt_flush(struct drm_i915_private *i915)
> @@ -157,7 +156,8 @@ i915_gem_evict_something(struct i915_address_space *vm,
> min_size, alignment, cache_level,
> start, end, mode);
>
> - /* Retire before we search the active list. Although we have
> + /*
> + * Retire before we search the active list. Although we have
> * reasonable accuracy in our retirement lists, we may have
> * a stray pin (preventing eviction) that can only be resolved by
> * retiring.
> @@ -182,7 +182,8 @@ i915_gem_evict_something(struct i915_address_space *vm,
> BUG_ON(ret);
> }
>
> - /* Can we unpin some objects such as idle hw contents,
> + /*
> + * Can we unpin some objects such as idle hw contents,
> * or pending flips? But since only the GGTT has global entries
> * such as scanouts, rinbuffers and contexts, we can skip the
> * purge when inspecting per-process local address spaces.
> @@ -190,19 +191,33 @@ i915_gem_evict_something(struct i915_address_space *vm,
> if (!i915_is_ggtt(vm) || flags & PIN_NONBLOCK)
> return -ENOSPC;
>
> - if (ggtt_is_idle(dev_priv)) {
> - /* If we still have pending pageflip completions, drop
> - * back to userspace to give our workqueues time to
> - * acquire our locks and unpin the old scanouts.
> - */
> - return intel_has_pending_fb_unpin(dev_priv) ? -EAGAIN : -ENOSPC;
> - }
> + /*
> + * Not everything in the GGTT is tracked via VMA using
> + * i915_vma_move_to_active(), otherwise we could evict as required
> + * with minimal stalling. Instead we are forced to idle the GPU and
> + * explicitly retire outstanding requests which will then remove
> + * the pinning for active objects such as contexts and ring,
> + * enabling us to evict them on the next iteration.
> + *
> + * To ensure that all user contexts are evictable, we perform
> + * a switch to the perma-pinned kernel context. This all also gives
> + * us a termination condition, when the last retired context is
> + * the kernel's there is no more we can evict.
> + */
> + if (!ggtt_is_idle(dev_priv)) {
> + ret = ggtt_flush(dev_priv);
> + if (ret)
> + return ret;
>
> - ret = ggtt_flush(dev_priv);
> - if (ret)
> - return ret;
> + goto search_again;
> + }
>
> - goto search_again;
> + /*
> + * If we still have pending pageflip completions, drop
> + * back to userspace to give our workqueues time to
> + * acquire our locks and unpin the old scanouts.
> + */
> + return intel_has_pending_fb_unpin(dev_priv) ? -EAGAIN : -ENOSPC;
>
> found:
> /* drm_mm doesn't allow any other other operations while
>
Looks like it will fix the bug, and I can't spot any problem that it
introduces. Was there an igt test which was failing, or a bugzilla entry?
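
For anyone following along, the new idle test and fallback path can be
restated as a small standalone model. This is only a sketch under stated
assumptions, not the driver code: every model_* name, NUM_ENGINES and
the enum are hypothetical stand-ins for the i915 structures in the
patch, and the flush is reduced to its end state (no outstanding
requests, every engine last retired the kernel context).

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_ENGINES 3 /* hypothetical engine count for the model */

/* Hypothetical stand-in for the per-engine state the patch inspects. */
struct model_engine {
	const void *last_retired_context;
};

/* Hypothetical stand-in for the device-wide state the patch inspects. */
struct model_gpu {
	unsigned int active_requests;
	const void *kernel_context;
	struct model_engine engine[NUM_ENGINES];
};

/* Possible outcomes of the fallback path in the model. */
enum model_result { MODEL_SEARCH_AGAIN, MODEL_EAGAIN, MODEL_ENOSPC };

/*
 * Idle means no requests are outstanding anywhere and every engine
 * last retired the kernel context, so no user context is still pinned.
 */
static bool model_ggtt_is_idle(const struct model_gpu *gpu)
{
	size_t i;

	if (gpu->active_requests)
		return false;

	for (i = 0; i < NUM_ENGINES; i++) {
		if (gpu->engine[i].last_retired_context != gpu->kernel_context)
			return false;
	}

	return true;
}

/* Models only the end state of a flush: all retired, kernel context last. */
static void model_ggtt_flush(struct model_gpu *gpu)
{
	size_t i;

	gpu->active_requests = 0;
	for (i = 0; i < NUM_ENGINES; i++)
		gpu->engine[i].last_retired_context = gpu->kernel_context;
}

/*
 * Fallback once the eviction scan found nothing: flush and retry while
 * the GPU is not idle, otherwise report that nothing more can be evicted.
 */
static enum model_result model_evict_fallback(struct model_gpu *gpu,
					      bool pending_fb_unpin)
{
	if (!model_ggtt_is_idle(gpu)) {
		model_ggtt_flush(gpu);
		return MODEL_SEARCH_AGAIN;
	}

	return pending_fb_unpin ? MODEL_EAGAIN : MODEL_ENOSPC;
}
```

In this model a second call after MODEL_SEARCH_AGAIN always terminates,
which is the termination condition described in the new comment.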
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko