From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v2 1/3] drm/i915: Fix eviction when the GGTT is idle but full
Date: Wed, 11 Oct 2017 13:05:02 +0100
Message-ID: <8b146a9f-d5bb-d520-f079-24345854c43c@linux.intel.com>
In-Reply-To: <20171010213809.7752-1-chris@chris-wilson.co.uk>


On 10/10/2017 22:38, Chris Wilson wrote:
> In the full-ppgtt world, we can fill the GGTT full of context objects.
> These context objects are currently implicitly tracked by the requests
> that pin them i.e. they are only unpinned when the request is completed
> and retired, but we do not have the link from the vma to the request
> (anymore). In order to unpin those contexts, we have to issue another
> request and wait upon the switch to the kernel context.
> 
> The bug during eviction was that we assumed that a full GGTT meant we
> would have requests on the GGTT timeline, and so we missed situations
> where those requests were merely in flight (and even when they have not
> yet been submitted to hw). The fix employed here is to change the
> already-is-idle test to not look at the execution timeline, but to count the
> outstanding requests and then check that we have switched to the kernel
> context. Erring on the side of overkill here just means that we stall a
> little longer than may be strictly required, but we only expect to hit
> this path in extreme corner cases where returning an erroneous error is
> worse than the delay.
> 
> v2: Logical inversion when swapping over branches.
> 
> Fixes: 80b204bce8f2 ("drm/i915: Enable multiple timelines")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_evict.c | 63 ++++++++++++++++++++++-------------
>   1 file changed, 39 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index a5a5b7e6daae..ee4811ffb7aa 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -33,21 +33,20 @@
>   #include "intel_drv.h"
>   #include "i915_trace.h"
>   
> -static bool ggtt_is_idle(struct drm_i915_private *dev_priv)
> +static bool ggtt_is_idle(struct drm_i915_private *i915)
>   {
> -	struct i915_ggtt *ggtt = &dev_priv->ggtt;
> -	struct intel_engine_cs *engine;
> -	enum intel_engine_id id;
> +       struct intel_engine_cs *engine;
> +       enum intel_engine_id id;
>   
> -	for_each_engine(engine, dev_priv, id) {
> -		struct intel_timeline *tl;
> +       if (i915->gt.active_requests)
> +	       return false;
>   
> -		tl = &ggtt->base.timeline.engine[engine->id];
> -		if (i915_gem_active_isset(&tl->last_request))
> -			return false;
> -	}
> +       for_each_engine(engine, i915, id) {
> +	       if (engine->last_retired_context != i915->kernel_context)
> +		       return false;
> +       }
>   
> -	return true;
> +       return true;
>   }
>   
>   static int ggtt_flush(struct drm_i915_private *i915)
> @@ -157,7 +156,8 @@ i915_gem_evict_something(struct i915_address_space *vm,
>   				    min_size, alignment, cache_level,
>   				    start, end, mode);
>   
> -	/* Retire before we search the active list. Although we have
> +	/*
> +	 * Retire before we search the active list. Although we have
>   	 * reasonable accuracy in our retirement lists, we may have
>   	 * a stray pin (preventing eviction) that can only be resolved by
>   	 * retiring.
> @@ -182,7 +182,8 @@ i915_gem_evict_something(struct i915_address_space *vm,
>   		BUG_ON(ret);
>   	}
>   
> -	/* Can we unpin some objects such as idle hw contents,
> +	/*
> +	 * Can we unpin some objects such as idle hw contents,
>   	 * or pending flips? But since only the GGTT has global entries
>   	 * such as scanouts, rinbuffers and contexts, we can skip the
>   	 * purge when inspecting per-process local address spaces.
> @@ -190,19 +191,33 @@ i915_gem_evict_something(struct i915_address_space *vm,
>   	if (!i915_is_ggtt(vm) || flags & PIN_NONBLOCK)
>   		return -ENOSPC;
>   
> -	if (ggtt_is_idle(dev_priv)) {
> -		/* If we still have pending pageflip completions, drop
> -		 * back to userspace to give our workqueues time to
> -		 * acquire our locks and unpin the old scanouts.
> -		 */
> -		return intel_has_pending_fb_unpin(dev_priv) ? -EAGAIN : -ENOSPC;
> -	}
> +	/*
> +	 * Not everything in the GGTT is tracked via VMA using
> +	 * i915_vma_move_to_active(), otherwise we could evict as required
> +	 * with minimal stalling. Instead we are forced to idle the GPU and
> +	 * explicitly retire outstanding requests which will then remove
> +	 * the pinning for active objects such as contexts and ring,
> +	 * enabling us to evict them on the next iteration.
> +	 *
> +	 * To ensure that all user contexts are evictable, we perform
> +	 * a switch to the perma-pinned kernel context. This all also gives
> +	 * us a termination condition, when the last retired context is
> +	 * the kernel's there is no more we can evict.
> +	 */
> +	if (!ggtt_is_idle(dev_priv)) {
> +		ret = ggtt_flush(dev_priv);
> +		if (ret)
> +			return ret;
>   
> -	ret = ggtt_flush(dev_priv);
> -	if (ret)
> -		return ret;
> +		goto search_again;
> +	}
>   
> -	goto search_again;
> +	/*
> +	 * If we still have pending pageflip completions, drop
> +	 * back to userspace to give our workqueues time to
> +	 * acquire our locks and unpin the old scanouts.
> +	 */
> +	return intel_has_pending_fb_unpin(dev_priv) ? -EAGAIN : -ENOSPC;
>   
>   found:
>   	/* drm_mm doesn't allow any other other operations while
> 

This looks like it will fix the bug, and I can't spot any new problem that
it introduces. Was there a failing igt or a bugzilla entry for this?
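
For anyone following along, here is a minimal standalone sketch of the new termination condition as read from the diff above. Everything prefixed mock_ is a hypothetical stand-in and not the driver's code: the structures model only the fields the idle test reads, and mock_ggtt_flush() simply assumes that a flush retires all requests and leaves every engine on the kernel context, which the real ggtt_flush() has to achieve by issuing requests and waiting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (assumption). */
#define MOCK_NUM_ENGINES 2

struct mock_context {
	int id;
};

struct mock_engine {
	const struct mock_context *last_retired_context;
};

struct mock_i915 {
	unsigned int active_requests;
	struct mock_context kernel_context;
	struct mock_engine engine[MOCK_NUM_ENGINES];
};

/*
 * Mirrors the patched test: idle only when no requests are outstanding
 * and every engine last retired the kernel context.
 */
static bool mock_ggtt_is_idle(const struct mock_i915 *i915)
{
	size_t n;

	if (i915->active_requests)
		return false;

	for (n = 0; n < MOCK_NUM_ENGINES; n++) {
		if (i915->engine[n].last_retired_context !=
		    &i915->kernel_context)
			return false;
	}

	return true;
}

/*
 * Assumed effect of a successful flush: all requests retired and every
 * engine switched to the kernel context. Always succeeds in this model.
 */
static int mock_ggtt_flush(struct mock_i915 *i915)
{
	size_t n;

	i915->active_requests = 0;
	for (n = 0; n < MOCK_NUM_ENGINES; n++)
		i915->engine[n].last_retired_context = &i915->kernel_context;

	return 0;
}

/*
 * Models only the retry loop: flush until the idle test passes and
 * return how many flushes were needed, or -1 if a flush failed.
 * The real code re-runs the eviction search after each flush.
 */
static int mock_evict_passes(struct mock_i915 *i915)
{
	int flushes = 0;

	while (!mock_ggtt_is_idle(i915)) {
		if (mock_ggtt_flush(i915))
			return -1;
		flushes++;
	}

	return flushes;
}
```

In this model a GGTT with outstanding requests, or with any engine still on a user context, takes exactly one flush before the idle test passes. After that the loop exits, which corresponds to the point where the patched code falls through to the -EAGAIN/-ENOSPC return.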

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
