All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: Dave Gordon <david.s.gordon@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/4] drm/i915: handle teardown of HWSP when context is freed
Date: Mon, 25 Jan 2016 19:08:08 +0100	[thread overview]
Message-ID: <20160125180808.GO11240@phenom.ffwll.local> (raw)
In-Reply-To: <56A24206.208@intel.com>

On Fri, Jan 22, 2016 at 02:51:50PM +0000, Dave Gordon wrote:
> On 22/01/16 10:28, Mika Kuoppala wrote:
> >Dave Gordon <david.s.gordon@intel.com> writes:
> >
> >>Existing code did a kunmap() on the wrong page, and didn't account for the
> >>fact that the HWSP and the default context are different offsets within the
> >>same GEM object.
> >>
> >
> >Just to note here that usually we try to split bug fixes early in the
> >series with minimal code changes. This helps cherrypickups
> >to fixes/stable. And also lessens the probability that we accidentally
> >revert bugfix when some other problem occurs with the patch.
> 
> OK, let's drop this one from the series (they're orthogonal code changes,
> only related by the fact they're all to do with init/fini), and I'll
> resubmit it separately.
> 
> Please continue this series from 2/4 ...
> 
> >>This patch improves symmetry by creating lrc_teardown_hardware_status_page()
> >>to complement lrc_setup_hardware_status_page(), making sure that they both
> >>take the same argument (pointer to engine). They encapsulate all the details
> >>of the relationship between the default context and the status page, so the
> >>rest of the LRC code doesn't have to know about it.
> >>
> >>Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> >>---
> >>  drivers/gpu/drm/i915/intel_lrc.c | 57 ++++++++++++++++++++++++++++------------
> >>  1 file changed, 40 insertions(+), 17 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>index 73d4347..3914120 100644
> >>--- a/drivers/gpu/drm/i915/intel_lrc.c
> >>+++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>@@ -226,9 +226,8 @@ enum {
> >>  #define CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT  0x17
> >>
> >>  static int intel_lr_context_pin(struct drm_i915_gem_request *rq);
> >>-static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring,
> >>-		struct drm_i915_gem_object *default_ctx_obj);
> >>-
> >>+static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring);
> >>+static void lrc_teardown_hardware_status_page(struct intel_engine_cs *ring);
> >>
> >>  /**
> >>   * intel_sanitize_enable_execlists() - sanitize i915.enable_execlists
> >>@@ -1536,8 +1535,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *ring)
> >>  	struct drm_i915_private *dev_priv = dev->dev_private;
> >>  	u8 next_context_status_buffer_hw;
> >>
> >>-	lrc_setup_hardware_status_page(ring,
> >>-				dev_priv->kernel_context->engine[ring->id].state);
> >>+	lrc_setup_hardware_status_page(ring);
> >>
> >>  	I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
> >>  	I915_WRITE(RING_HWSTAM(ring->mmio_base), 0xffffffff);
> >>@@ -1992,10 +1990,9 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *ring)
> >>  	i915_cmd_parser_fini_ring(ring);
> >>  	i915_gem_batch_pool_fini(&ring->batch_pool);
> >>
> >>-	if (ring->status_page.obj) {
> >>-		kunmap(sg_page(ring->status_page.obj->pages->sgl));
> >>-		ring->status_page.obj = NULL;
> >>-	}
> >>+	/* Status page should have gone already */
> >>+	WARN_ON(ring->status_page.page_addr);
> >>+	WARN_ON(ring->status_page.obj);
> >>
> >>  	ring->disable_lite_restore_wa = false;
> >>  	ring->ctx_desc_template = 0;
> >>@@ -2434,6 +2431,11 @@ void intel_lr_context_free(struct intel_context *ctx)
> >>  			continue;
> >>
> >>  		if (ctx == ctx->i915->kernel_context) {
> >>+			/*
> >>+			 * The HWSP is part of the default context
> >>+			 * object in LRC mode.
> >>+			 */
> >>+			lrc_teardown_hardware_status_page(ringbuf->ring);
> >>  			intel_unpin_ringbuffer_obj(ringbuf);
> >>  			i915_gem_object_ggtt_unpin(ctx_obj);
> >>  		}
> >>@@ -2482,24 +2484,45 @@ uint32_t intel_lr_context_size(struct intel_engine_cs *ring)
> >>  	return ret;
> >>  }
> >>
> >>-static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring,
> >>-		struct drm_i915_gem_object *default_ctx_obj)
> >>+static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring)
> >>  {
> >>-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >>+	struct drm_i915_private *dev_priv = to_i915(ring->dev);
> >>+	struct intel_context *dctx = dev_priv->kernel_context;
> >>+	struct drm_i915_gem_object *dctx_obj = dctx->engine[ring->id].state;
> >>+	u64 dctx_addr = i915_gem_obj_ggtt_offset(dctx_obj);
> >>  	struct page *page;
> >>
> >>-	/* The HWSP is part of the default context object in LRC mode. */
> >>-	ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(default_ctx_obj)
> >>-			+ LRC_PPHWSP_PN * PAGE_SIZE;
> >>-	page = i915_gem_object_get_page(default_ctx_obj, LRC_PPHWSP_PN);
> >>+	/*
> >>+	 * The HWSP is part of the default context object in LRC mode.
> >>+	 * Note that it doesn't contribute to the refcount!
> >>+	 */
> >>+	page = i915_gem_object_get_page(dctx_obj, LRC_PPHWSP_PN);
> >>  	ring->status_page.page_addr = kmap(page);
> >>-	ring->status_page.obj = default_ctx_obj;
> >>+	ring->status_page.gfx_addr = dctx_addr + LRC_PPHWSP_PN * PAGE_SIZE;
> >>+	ring->status_page.obj = dctx_obj;
> >>
> >>  	I915_WRITE(RING_HWS_PGA(ring->mmio_base),
> >>  			(u32)ring->status_page.gfx_addr);
> >>  	POSTING_READ(RING_HWS_PGA(ring->mmio_base));
> >>  }
> >>
> >>+/* This should be called before the default context is destroyed */
> >>+static void lrc_teardown_hardware_status_page(struct intel_engine_cs *ring)
> >>+{
> >>+	struct drm_i915_gem_object *dctx_obj = ring->status_page.obj;
> >>+	struct page *page;
> >>+
> >>+	WARN_ON(!dctx_obj);
> >>+
> >>+	if (ring->status_page.page_addr) {
> >>+		page = i915_gem_object_get_page(dctx_obj, LRC_PPHWSP_PN);
> >>+		kunmap(page);
> >>+		ring->status_page.page_addr = NULL;
> >>+	}
> >>+
> >
> >As the page is always kmapped along with setting of status_page.obj,
> >I fail to see why we need the check here for the page_addr.
> 
> The WARN_ON() tells the developer something unexpected happened.
> 
> The page_addr test avoids doing something that may cause an OOPS so that the
> driver can continue (to unload!).
> 
> IMHO, teardown code *in particular* should be written to anticipate being
> called in all sorts of inconsistent states, because a common error handling
> strategy is just to say,
> 
>     oops, something bad happened at some stage of setup!
>     let's tear down everything and pass the error back
> 
> which is certainly easier than tracking exactly where the error occurred and
> undoing only the steps known to have succeeded -- that leads to lots of
> "goto err_1", "goto err_2", etc :(

Russian dolls error code unwinding is pretty much standard kernel
practice. And since it's also mostly dead code (well in practice) it's not
too much of a validation horror show. This will likely change a lot if we
start using EPROBE_DEFER in earnest.

This comment is just for context, i.e. I think this is totally fine. It's
a bit fragile, but you'll be fighting against a _lot_ of inertia with this
;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2016-01-25 18:08 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-21 19:37 A collection of cleanups Dave Gordon
2016-01-21 19:37 ` [PATCH 1/4] drm/i915: handle teardown of HWSP when context is freed Dave Gordon
2016-01-22 10:28   ` Mika Kuoppala
2016-01-22 10:35     ` Chris Wilson
2016-01-22 14:51     ` Dave Gordon
2016-01-25 18:08       ` Daniel Vetter [this message]
2016-01-21 19:37 ` [PATCH 2/4] drm/i915: Fix context/engine cleanup order Dave Gordon
2016-01-25 18:09   ` Daniel Vetter
2016-01-26 10:45   ` Tvrtko Ursulin
2016-01-21 19:37 ` [PATCH 3/4] drm/i915: tidy up initialisation failure paths (legacy) Dave Gordon
2016-01-21 19:37 ` [PATCH 4/4] drm/i915: tidy up initialisation failure paths (GEM & LRC) Dave Gordon
2016-01-21 20:02 ` A collection of cleanups Dave Gordon
2016-01-22  7:42 ` ✗ Fi.CI.BAT: warning for series starting with [1/4] drm/i915: handle teardown of HWSP when context is freed Patchwork
2016-01-22 15:07   ` Dave Gordon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160125180808.GO11240@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=david.s.gordon@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.