Re: [Intel-gfx] [PATCH 2/3] Revert "drm/i915: Hold reference to intel_context over life of i915_request"

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Matthew Brost <matthew.brost@intel.com>
To: Ramalingam C <ramalingam.c@intel.com>
Cc: intel-gfx <intel-gfx@lists.freedesktop.org>,
	Hellstrom Thomas <thomas.hellstrom@intel.com>,
	Matthew Auld <matthew.auld@intel.com>,
	dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [Intel-gfx] [PATCH 2/3] Revert "drm/i915: Hold reference to intel_context over life of i915_request"
Date: Mon, 27 Jun 2022 10:18:11 -0700	[thread overview]
Message-ID: <20220627171811.GA15034@jons-linux-dev-box> (raw)
In-Reply-To: <20220614184348.23746-3-ramalingam.c@intel.com>

On Wed, Jun 15, 2022 at 12:13:47AM +0530, Ramalingam C wrote:
> From: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> 
> This reverts commit 1e98d8c52ed5dfbaf273c4423c636525c2ce59e7.
> 
> The problem with this patch is that it makes i915_request to hold a
> reference to intel_context, which in turn holds a reference on the VM.
> This strong back referencing can lead to reference loops which leads
> to resource leak.
> 
> An example is the upcoming VM_BIND work which requires VM to hold
> a reference to some shared VM specific BO. But this BO's dma-resv
> fences holds reference to the i915_request thus leading to reference
> loop.
> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
> Suggested-by: Matthew Brost <matthew.brost@intel.com>

Talked with Ram, this patch needs to be squashed with the following
patch. The reasoning is with just this patch, the tree is broken -
parallel submission contexts will leak requests.

With the patches squashed:

Reviewed-by: Mathew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_request.c | 55 +++++++++++++++++------------
>  1 file changed, 32 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 7f6998bf390c..c71905d8e154 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -134,17 +134,39 @@ static void i915_fence_release(struct dma_fence *fence)
>  	i915_sw_fence_fini(&rq->semaphore);
>  
>  	/*
> -	 * Keep one request on each engine for reserved use under mempressure,
> -	 * do not use with virtual engines as this really is only needed for
> -	 * kernel contexts.
> +	 * Keep one request on each engine for reserved use under mempressure
> +	 *
> +	 * We do not hold a reference to the engine here and so have to be
> +	 * very careful in what rq->engine we poke. The virtual engine is
> +	 * referenced via the rq->context and we released that ref during
> +	 * i915_request_retire(), ergo we must not dereference a virtual
> +	 * engine here. Not that we would want to, as the only consumer of
> +	 * the reserved engine->request_pool is the power management parking,
> +	 * which must-not-fail, and that is only run on the physical engines.
> +	 *
> +	 * Since the request must have been executed to be have completed,
> +	 * we know that it will have been processed by the HW and will
> +	 * not be unsubmitted again, so rq->engine and rq->execution_mask
> +	 * at this point is stable. rq->execution_mask will be a single
> +	 * bit if the last and _only_ engine it could execution on was a
> +	 * physical engine, if it's multiple bits then it started on and
> +	 * could still be on a virtual engine. Thus if the mask is not a
> +	 * power-of-two we assume that rq->engine may still be a virtual
> +	 * engine and so a dangling invalid pointer that we cannot dereference
> +	 *
> +	 * For example, consider the flow of a bonded request through a virtual
> +	 * engine. The request is created with a wide engine mask (all engines
> +	 * that we might execute on). On processing the bond, the request mask
> +	 * is reduced to one or more engines. If the request is subsequently
> +	 * bound to a single engine, it will then be constrained to only
> +	 * execute on that engine and never returned to the virtual engine
> +	 * after timeslicing away, see __unwind_incomplete_requests(). Thus we
> +	 * know that if the rq->execution_mask is a single bit, rq->engine
> +	 * can be a physical engine with the exact corresponding mask.
>  	 */
> -	if (!intel_engine_is_virtual(rq->engine) &&
> -	    !cmpxchg(&rq->engine->request_pool, NULL, rq)) {
> -		intel_context_put(rq->context);
> +	if (is_power_of_2(rq->execution_mask) &&
> +	    !cmpxchg(&rq->engine->request_pool, NULL, rq))
>  		return;
> -	}
> -
> -	intel_context_put(rq->context);
>  
>  	kmem_cache_free(slab_requests, rq);
>  }
> @@ -921,19 +943,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>  		}
>  	}
>  
> -	/*
> -	 * Hold a reference to the intel_context over life of an i915_request.
> -	 * Without this an i915_request can exist after the context has been
> -	 * destroyed (e.g. request retired, context closed, but user space holds
> -	 * a reference to the request from an out fence). In the case of GuC
> -	 * submission + virtual engine, the engine that the request references
> -	 * is also destroyed which can trigger bad pointer dref in fence ops
> -	 * (e.g. i915_fence_get_driver_name). We could likely change these
> -	 * functions to avoid touching the engine but let's just be safe and
> -	 * hold the intel_context reference. In execlist mode the request always
> -	 * eventually points to a physical engine so this isn't an issue.
> -	 */
> -	rq->context = intel_context_get(ce);
> +	rq->context = ce;
>  	rq->engine = ce->engine;
>  	rq->ring = ce->ring;
>  	rq->execution_mask = ce->engine->mask;
> @@ -1009,7 +1019,6 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>  	GEM_BUG_ON(!list_empty(&rq->sched.waiters_list));
>  
>  err_free:
> -	intel_context_put(ce);
>  	kmem_cache_free(slab_requests, rq);
>  err_unreserve:
>  	intel_context_unpin(ce);
> -- 
> 2.20.1
>

WARNING: multiple messages have this Message-ID (diff)

From: Matthew Brost <matthew.brost@intel.com>
To: Ramalingam C <ramalingam.c@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>,
	intel-gfx <intel-gfx@lists.freedesktop.org>,
	Hellstrom Thomas <thomas.hellstrom@intel.com>,
	Matthew Auld <matthew.auld@intel.com>,
	dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [PATCH 2/3] Revert "drm/i915: Hold reference to intel_context over life of i915_request"
Date: Mon, 27 Jun 2022 10:18:11 -0700	[thread overview]
Message-ID: <20220627171811.GA15034@jons-linux-dev-box> (raw)
In-Reply-To: <20220614184348.23746-3-ramalingam.c@intel.com>

On Wed, Jun 15, 2022 at 12:13:47AM +0530, Ramalingam C wrote:
> From: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> 
> This reverts commit 1e98d8c52ed5dfbaf273c4423c636525c2ce59e7.
> 
> The problem with this patch is that it makes i915_request to hold a
> reference to intel_context, which in turn holds a reference on the VM.
> This strong back referencing can lead to reference loops which leads
> to resource leak.
> 
> An example is the upcoming VM_BIND work which requires VM to hold
> a reference to some shared VM specific BO. But this BO's dma-resv
> fences holds reference to the i915_request thus leading to reference
> loop.
> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
> Suggested-by: Matthew Brost <matthew.brost@intel.com>

Talked with Ram, this patch needs to be squashed with the following
patch. The reasoning is with just this patch, the tree is broken -
parallel submission contexts will leak requests.

With the patches squashed:

Reviewed-by: Mathew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_request.c | 55 +++++++++++++++++------------
>  1 file changed, 32 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 7f6998bf390c..c71905d8e154 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -134,17 +134,39 @@ static void i915_fence_release(struct dma_fence *fence)
>  	i915_sw_fence_fini(&rq->semaphore);
>  
>  	/*
> -	 * Keep one request on each engine for reserved use under mempressure,
> -	 * do not use with virtual engines as this really is only needed for
> -	 * kernel contexts.
> +	 * Keep one request on each engine for reserved use under mempressure
> +	 *
> +	 * We do not hold a reference to the engine here and so have to be
> +	 * very careful in what rq->engine we poke. The virtual engine is
> +	 * referenced via the rq->context and we released that ref during
> +	 * i915_request_retire(), ergo we must not dereference a virtual
> +	 * engine here. Not that we would want to, as the only consumer of
> +	 * the reserved engine->request_pool is the power management parking,
> +	 * which must-not-fail, and that is only run on the physical engines.
> +	 *
> +	 * Since the request must have been executed to be have completed,
> +	 * we know that it will have been processed by the HW and will
> +	 * not be unsubmitted again, so rq->engine and rq->execution_mask
> +	 * at this point is stable. rq->execution_mask will be a single
> +	 * bit if the last and _only_ engine it could execution on was a
> +	 * physical engine, if it's multiple bits then it started on and
> +	 * could still be on a virtual engine. Thus if the mask is not a
> +	 * power-of-two we assume that rq->engine may still be a virtual
> +	 * engine and so a dangling invalid pointer that we cannot dereference
> +	 *
> +	 * For example, consider the flow of a bonded request through a virtual
> +	 * engine. The request is created with a wide engine mask (all engines
> +	 * that we might execute on). On processing the bond, the request mask
> +	 * is reduced to one or more engines. If the request is subsequently
> +	 * bound to a single engine, it will then be constrained to only
> +	 * execute on that engine and never returned to the virtual engine
> +	 * after timeslicing away, see __unwind_incomplete_requests(). Thus we
> +	 * know that if the rq->execution_mask is a single bit, rq->engine
> +	 * can be a physical engine with the exact corresponding mask.
>  	 */
> -	if (!intel_engine_is_virtual(rq->engine) &&
> -	    !cmpxchg(&rq->engine->request_pool, NULL, rq)) {
> -		intel_context_put(rq->context);
> +	if (is_power_of_2(rq->execution_mask) &&
> +	    !cmpxchg(&rq->engine->request_pool, NULL, rq))
>  		return;
> -	}
> -
> -	intel_context_put(rq->context);
>  
>  	kmem_cache_free(slab_requests, rq);
>  }
> @@ -921,19 +943,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>  		}
>  	}
>  
> -	/*
> -	 * Hold a reference to the intel_context over life of an i915_request.
> -	 * Without this an i915_request can exist after the context has been
> -	 * destroyed (e.g. request retired, context closed, but user space holds
> -	 * a reference to the request from an out fence). In the case of GuC
> -	 * submission + virtual engine, the engine that the request references
> -	 * is also destroyed which can trigger bad pointer dref in fence ops
> -	 * (e.g. i915_fence_get_driver_name). We could likely change these
> -	 * functions to avoid touching the engine but let's just be safe and
> -	 * hold the intel_context reference. In execlist mode the request always
> -	 * eventually points to a physical engine so this isn't an issue.
> -	 */
> -	rq->context = intel_context_get(ce);
> +	rq->context = ce;
>  	rq->engine = ce->engine;
>  	rq->ring = ce->ring;
>  	rq->execution_mask = ce->engine->mask;
> @@ -1009,7 +1019,6 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>  	GEM_BUG_ON(!list_empty(&rq->sched.waiters_list));
>  
>  err_free:
> -	intel_context_put(ce);
>  	kmem_cache_free(slab_requests, rq);
>  err_unreserve:
>  	intel_context_unpin(ce);
> -- 
> 2.20.1
>

next prev parent reply	other threads:[~2022-06-27 17:24 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-14 18:43 [Intel-gfx] [PATCH 0/3] Break VM to rq reference loop Ramalingam C
2022-06-14 18:43 ` Ramalingam C
2022-06-14 18:43 ` [Intel-gfx] [PATCH 1/3] drm/i915: Do not access rq->engine without a reference Ramalingam C
2022-06-14 18:43   ` Ramalingam C
2022-06-27 17:09   ` [Intel-gfx] " Matthew Brost
2022-06-27 17:09     ` Matthew Brost
2022-06-14 18:43 ` [Intel-gfx] [PATCH 2/3] Revert "drm/i915: Hold reference to intel_context over life of i915_request" Ramalingam C
2022-06-14 18:43   ` Ramalingam C
2022-06-27 17:18   ` Matthew Brost [this message]
2022-06-27 17:18     ` Matthew Brost
2022-06-14 18:43 ` [Intel-gfx] [PATCH 3/3] drm/i915: Do not use reserved requests for virtual engines Ramalingam C
2022-06-14 18:43   ` Ramalingam C
2022-06-27 17:18   ` [Intel-gfx] " Matthew Brost
2022-06-27 17:18     ` Matthew Brost
2022-06-27 18:01     ` [Intel-gfx] " Ramalingam C
2022-06-27 18:01       ` Ramalingam C
2022-06-14 19:56 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Break VM to rq reference loop Patchwork
2022-06-14 20:27 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-06-15  6:41 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220627171811.GA15034@jons-linux-dev-box \
    --to=matthew.brost@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=matthew.auld@intel.com \
    --cc=ramalingam.c@intel.com \
    --cc=thomas.hellstrom@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.