[PATCH 1/2] drm/i915: Split late "for_each_ring" loop from i915_gem_init

public inbox for intel-gfx@lists.freedesktop.org

* [PATCH 1/2] drm/i915: Split late "for_each_ring" loop from i915_gem_init_hw()
@ 2015-06-30 15:01 Dave Gordon
  2015-06-30 15:01 ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open Dave Gordon
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Gordon @ 2015-06-30 15:01 UTC
  To: intel-gfx

This function has recently been updated by several patches, including:
    drm/i915: Add explicit request management to i915_gem_init_hw()
    drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable()

Now we need to move the entire loop into a separate function, replacing
the inline loop with a call. This will allow a future patch to add a
call from another location (for now, there are no other callers).

The split marks the distinction between early initialisation using
MMIO register access to set up non-context registers, and late
initialisation using batchbuffers containing LRI instructions to
set up context-specific registers.
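
As a rough userspace illustration of the split (not the driver code), the
sketch below models an early phase that holds a forcewake-style reference
and a late phase called from it. The struct, its fields and the split_*
function names are hypothetical stand-ins introduced only for this example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for driver state; not the real i915 structures. */
struct split_dev {
	int forcewake_refs;	/* models the forcewake get/put balance */
	int early_done;		/* early, MMIO-style setup completed */
	int late_done;		/* late, request-based setup completed */
	int fail_late;		/* test hook: force the late phase to fail */
};

/* Late phase: stands in for the for_each_ring loop moved out by the patch. */
static int split_init_hw_late(struct split_dev *dev)
{
	if (dev->fail_late)
		return -EIO;
	dev->late_done = 1;
	return 0;
}

/* Early phase, followed by the call that replaces the inline loop. */
static int split_init_hw(struct split_dev *dev)
{
	int ret;

	dev->forcewake_refs++;		/* models forcewake_get */
	dev->early_done = 1;		/* models the early register setup */

	ret = split_init_hw_late(dev);

	dev->forcewake_refs--;		/* models forcewake_put on every path */
	return ret;
}
```

The point of the sketch is only that the caller keeps ownership of the
reference it took, so the late function can be called from elsewhere
without needing to know about it.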

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |    1 +
 drivers/gpu/drm/i915/i915_gem.c |   15 ++++++++++++++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ea9caf2..bc7c510 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2882,6 +2882,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
 int __must_check i915_gem_init(struct drm_device *dev);
 int i915_gem_init_rings(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
+int i915_gem_init_hw_late(struct drm_device *dev);
 int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 52efe43..1887e60 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5073,6 +5073,20 @@ i915_gem_init_hw(struct drm_device *dev)
 			goto out;
 	}
 
+	ret = i915_gem_init_hw_late(dev);
+
+out:
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+	return ret;
+}
+
+int
+i915_gem_init_hw_late(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_engine_cs *ring;
+	int ret, i, j;
+
 	/* Now it is safe to go back round and do everything else: */
 	for_each_ring(ring, dev_priv, i) {
 		struct drm_i915_gem_request *req;
@@ -5110,7 +5124,6 @@ i915_gem_init_hw(struct drm_device *dev)
 	}
 
 out:
-	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 	return ret;
 }
 
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open
  2015-06-30 15:01 [PATCH 1/2] drm/i915: Split late "for_each_ring" loop from i915_gem_init_hw() Dave Gordon
@ 2015-06-30 15:01 ` Dave Gordon
  2015-06-30 15:08   ` Chris Wilson
  2015-07-02 12:20   ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open shuang.he
  0 siblings, 2 replies; 9+ messages in thread
From: Dave Gordon @ 2015-06-30 15:01 UTC
  To: intel-gfx

We can do less work during driver load by deferring some of it until
the first time the device is opened; in particular, the function
i915_gem_init_hw_late() introduced by the previous patch. This should
allow the system to get out of the early single-threaded phase of
system initialisation and into full multi-user mode somewhat quicker.

In addition, we expect that by the time of the first open, not only
the driver's software structures but also system-specific items such
as filesystem mounting have been fully initialised, meaning that the
late initialisation code can run in a much more complete environment
than the driver_load stage presents. This can be important for
embedded programmable devices that need firmware loaded from a file
before they can be used.
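
The first-open pattern described above can be sketched in plain userspace
C as follows. This is a minimal model under stated assumptions: lazy_dev,
lazy_late_init() and lazy_dev_open() are hypothetical names, and a pthread
mutex stands in for struct_mutex.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; the fields are illustrative only. */
struct lazy_dev {
	pthread_mutex_t lock;	/* stands in for dev->struct_mutex */
	bool contexts_ready;	/* set once late init has succeeded */
	int late_init_calls;	/* how often late init was attempted */
	int failures_left;	/* test hook: fail this many attempts */
};

static void lazy_dev_setup(struct lazy_dev *dev, int failures)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->contexts_ready = false;
	dev->late_init_calls = 0;
	dev->failures_left = failures;
}

/* Stand-in for the deferred late initialisation. */
static int lazy_late_init(struct lazy_dev *dev)
{
	dev->late_init_calls++;
	if (dev->failures_left > 0) {
		dev->failures_left--;
		return -EIO;
	}
	return 0;
}

/* Open path: run late init once, under the lock, before anything else. */
static int lazy_dev_open(struct lazy_dev *dev)
{
	int ret = 0;

	pthread_mutex_lock(&dev->lock);
	if (!dev->contexts_ready) {
		ret = lazy_late_init(dev);
		if (ret == 0)
			dev->contexts_ready = true;
	}
	pthread_mutex_unlock(&dev->lock);

	return ret;
}
```

In this model a failed attempt leaves the flag clear, so the next open
tries again, and a successful open makes every later open skip the late
initialisation.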

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |    1 +
 drivers/gpu/drm/i915/i915_gem.c         |    4 +++-
 drivers/gpu/drm/i915/i915_gem_context.c |   32 ++++++++++++++++++++++++++-----
 3 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bc7c510..ba63804 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1845,6 +1845,7 @@ struct drm_i915_private {
 	/* hda/i915 audio component */
 	bool audio_component_registered;
 
+	bool contexts_ready;
 	uint32_t hw_context_size;
 	struct list_head context_list;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1887e60..0cb962f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5073,7 +5073,9 @@ i915_gem_init_hw(struct drm_device *dev)
 			goto out;
 	}
 
-	ret = i915_gem_init_hw_late(dev);
+	/* Don't do late init on the first time through here */
+	if (dev_priv->contexts_ready)
+		ret = i915_gem_init_hw_late(dev);
 
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index a7e58a8..917c867 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -438,23 +438,45 @@ static int context_idr_cleanup(int id, void *p, void *data)
 	return 0;
 }
 
+/* Complete any late initialisation here */
+static int i915_gem_context_first_open(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret;
+
+	ret = i915_gem_init_hw_late(dev);
+	if (ret == 0)
+		dev_priv->contexts_ready = true;
+
+	return ret;
+}
+
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct intel_context *ctx;
+	int ret = 0;
 
 	idr_init(&file_priv->context_idr);
 
 	mutex_lock(&dev->struct_mutex);
-	ctx = i915_gem_create_context(dev, file_priv);
+
+	if (!dev_priv->contexts_ready)
+		ret = i915_gem_context_first_open(dev);
+
+	if (ret == 0) {
+		ctx = i915_gem_create_context(dev, file_priv);
+		if (IS_ERR(ctx))
+			ret = PTR_ERR(ctx);
+	}
+
 	mutex_unlock(&dev->struct_mutex);
 
-	if (IS_ERR(ctx)) {
+	if (ret)
 		idr_destroy(&file_priv->context_idr);
-		return PTR_ERR(ctx);
-	}
 
-	return 0;
+	return ret;
 }
 
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file)
-- 
1.7.9.5


* Re: [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open
  2015-06-30 15:01 ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open Dave Gordon
@ 2015-06-30 15:08   ` Chris Wilson
  2015-07-01  9:27     ` [PATCH] drm/i915: Asynchronously initialise the GPU state Chris Wilson
  2015-07-02 12:20   ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open shuang.he
  1 sibling, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2015-06-30 15:08 UTC
  To: Dave Gordon; +Cc: intel-gfx

On Tue, Jun 30, 2015 at 04:01:11PM +0100, Dave Gordon wrote:
> We can do less work during driver load by deferring some of it until
> the first time the device is opened; in particular, the function
> i915_gem_init_hw_late() introduced by the previous patch. This should
> allow the system to get out of the early single-threaded phase of
> system initialisation and into full multi-user mode somewhat quicker.
> 
> In addition, we expect that by the time of the first open, not only
> the driver's software structures but also system-specific items such
> as filesystem mounting have been fully initialised, meaning that the
> late initialisation code can run in a much more complete environment
> than the driver_load stage presents. This can be important for
> embedded programmable devices that need firmware loaded from a file
> before they can be used.

No. It delays establishing the power contexts, something that in the past
we have been told must be done asap. You can move it to an async worker;
that would achieve the goals stated here, and not be dependent on a
cooperating userspace.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* [PATCH] drm/i915: Asynchronously initialise the GPU state
  2015-06-30 15:08   ` Chris Wilson
@ 2015-07-01  9:27     ` Chris Wilson
  2015-07-01 13:07       ` Daniel Vetter
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2015-07-01  9:27 UTC
  To: intel-gfx

Dave Gordon made the good suggestion that once the ringbuffers were
set up, the actual queuing of commands to program the initial GPU state
could be deferred. Since that initial state contains instructions for
setting up the first power context, we want to execute that as early
as possible, preferably in the background to userspace. Then when
userspace does wake up, the first time it opens the device we just need
to flush the work to be sure that our commands are queued before any of
userspace's. (Hooking into the device open should mean we have to check
less often than, say, hooking into execbuffer.)
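
The ordering guarantee can be illustrated with a single-threaded model of
the queue-then-flush idea. This is a sketch only: async_dev and the
functions below are hypothetical, and where the kernel's flush_work()
waits for a concurrently running worker, the model simply runs the
pending work in place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ASYNC_LOG_MAX 8

/* Hypothetical device model recording the order commands were queued in. */
struct async_dev {
	bool work_pending;			/* models a queued work item */
	const char *log[ASYNC_LOG_MAX];		/* queued commands, in order */
	size_t nlog;
};

static void async_emit(struct async_dev *dev, const char *what)
{
	if (dev->nlog < ASYNC_LOG_MAX)
		dev->log[dev->nlog++] = what;
}

/* Worker body: queue the commands that program the initial GPU state. */
static void async_init_worker(struct async_dev *dev)
{
	async_emit(dev, "init-state");
	dev->work_pending = false;
}

/* Models queue_work(): only records that the work should run later. */
static void async_queue_init(struct async_dev *dev)
{
	dev->work_pending = true;
}

/* Models flush_work(): on return the work is guaranteed to have run. */
static void async_flush_init(struct async_dev *dev)
{
	if (dev->work_pending)
		async_init_worker(dev);
}

/* Open path: flush first, so init commands precede any user batch. */
static void async_open_and_submit(struct async_dev *dev, const char *batch)
{
	async_flush_init(dev);
	async_emit(dev, batch);
}
```

Whether the worker got to run before the first open or not, the flush in
the open path means the initial-state entry always lands in the log ahead
of the first user entry, and only once.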

Suggested-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |   2 +
 drivers/gpu/drm/i915/i915_gem.c | 113 +++++++++++++++++++++++++++-------------
 2 files changed, 79 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3ea1fe8db63e..d4003dea97eb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1938,6 +1938,8 @@ struct drm_i915_private {
 
 	bool edp_low_vswing;
 
+	struct work_struct init_hw_late;
+
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2f0fed1b9dd7..7efa71f8edd7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5140,12 +5140,76 @@ cleanup_render_ring:
 	return ret;
 }
 
+static int
+i915_gem_init_hw_late(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *ring;
+	int i, j;
+
+	for_each_ring(ring, dev_priv, i) {
+		struct drm_i915_gem_request *req;
+		int ret;
+
+		if (WARN_ON(!ring->default_context)) {
+			ret = -ENODEV;
+			goto err;
+		}
+
+		req = i915_gem_request_alloc(ring, ring->default_context);
+		if (IS_ERR(req)) {
+			ret = PTR_ERR(req);
+			goto err;
+		}
+
+		if (ring->id == RCS) {
+			for (j = 0; j < NUM_L3_SLICES(dev_priv); j++)
+				i915_gem_l3_remap(req, j);
+		}
+
+		ret = i915_ppgtt_init_ring(req);
+		if (ret) {
+			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
+			goto err_req;
+		}
+
+		ret = i915_gem_context_enable(req);
+		if (ret) {
+			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
+			goto err_req;
+		}
+
+		i915_add_request_no_flush(req);
+		continue;
+
+err_req:
+		i915_gem_request_cancel(req);
+err:
+		return ret;
+	}
+
+	return 0;
+}
+
+static void
+i915_gem_init_hw_worker(struct work_struct *work)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(work, typeof(*dev_priv), init_hw_late);
+	mutex_lock(&dev_priv->dev->struct_mutex);
+	if (i915_gem_init_hw_late(dev_priv)) {
+		DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
+		atomic_set_mask(I915_WEDGED,
+				&dev_priv->gpu_error.reset_counter);
+	}
+	mutex_unlock(&dev_priv->dev->struct_mutex);
+}
+
 int
 i915_gem_init_hw(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring;
-	int ret, i, j;
+	int ret, i;
 
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
@@ -5198,41 +5262,10 @@ i915_gem_init_hw(struct drm_device *dev)
 	}
 
 	/* Now it is safe to go back round and do everything else: */
-	for_each_ring(ring, dev_priv, i) {
-		struct drm_i915_gem_request *req;
-
-		WARN_ON(!ring->default_context);
-
-		req = i915_gem_request_alloc(ring, ring->default_context);
-		if (IS_ERR(req)) {
-			ret = PTR_ERR(req);
-			i915_gem_cleanup_ringbuffer(dev);
-			goto out;
-		}
-
-		if (ring->id == RCS) {
-			for (j = 0; j < NUM_L3_SLICES(dev); j++)
-				i915_gem_l3_remap(req, j);
-		}
-
-		ret = i915_ppgtt_init_ring(req);
-		if (ret && ret != -EIO) {
-			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
-			i915_gem_request_cancel(req);
-			i915_gem_cleanup_ringbuffer(dev);
-			goto out;
-		}
-
-		ret = i915_gem_context_enable(req);
-		if (ret && ret != -EIO) {
-			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
-			i915_gem_request_cancel(req);
-			i915_gem_cleanup_ringbuffer(dev);
-			goto out;
-		}
-
-		i915_add_request_no_flush(req);
-	}
+	if (dev->open_count == 0) /* uncontested with userspace, i.e. boot */
+		queue_work(dev_priv->wq, &dev_priv->init_hw_late);
+	else
+		ret = i915_gem_init_hw_late(dev_priv);
 
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
@@ -5379,6 +5412,7 @@ i915_gem_load(struct drm_device *dev)
 		init_ring_lists(&dev_priv->ring[i]);
 	for (i = 0; i < I915_MAX_NUM_FENCES; i++)
 		INIT_LIST_HEAD(&dev_priv->fence_regs[i].lru_list);
+	INIT_WORK(&dev_priv->init_hw_late, i915_gem_init_hw_worker);
 	INIT_DELAYED_WORK(&dev_priv->mm.retire_work,
 			  i915_gem_retire_work_handler);
 	INIT_DELAYED_WORK(&dev_priv->mm.idle_work,
@@ -5442,11 +5476,18 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 
 int i915_gem_open(struct drm_device *dev, struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_file_private *file_priv;
 	int ret;
 
 	DRM_DEBUG_DRIVER("\n");
 
+	/* Flush ring initialisation before userspace can submit its own
+	 * batches, so the hardware initialisation commands are queued
+	 * first.
+	 */
+	flush_work(&dev_priv->init_hw_late);
+
 	file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
 	if (!file_priv)
 		return -ENOMEM;
-- 
2.1.4


* Re: [PATCH] drm/i915: Asynchronously initialise the GPU state
  2015-07-01  9:27     ` [PATCH] drm/i915: Asynchronously initialise the GPU state Chris Wilson
@ 2015-07-01 13:07       ` Daniel Vetter
  2015-07-01 13:17         ` Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2015-07-01 13:07 UTC
  To: Chris Wilson; +Cc: intel-gfx

On Wed, Jul 01, 2015 at 10:27:21AM +0100, Chris Wilson wrote:
> Dave Gordon made the good suggestion that once the ringbuffers were
> set up, the actual queuing of commands to program the initial GPU state
> could be deferred. Since that initial state contains instructions for
> setting up the first power context, we want to execute that as early
> as possible, preferably in the background to userspace. Then when
> userspace does wake up, the first time it opens the device we just need
> to flush the work to be sure that our commands are queued before any of
> userspace's. (Hooking into the device open should mean we have to check
> less often than say hooking into execbuffer.)
> 
> Suggested-by: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dave Gordon <david.s.gordon@intel.com>

Just before this gets a bit out of hand with various patches floating
around ... I really meant it when I said that we should have a proper
design discussion about this in Jesse's meeting first.

Looking at all the ideas between you, Dave & me I count about 3-4
approaches to async gem init, and all have upsides and downsides.

Aside from that I concur that if we do async gem init then it better be a
worker and not relying on some arbitrary userspace ioctl/syscall. Of course
we'd still need to place proper synchronization points at a good place
(flush_work in gem_open for Dave's design), but that's really orthogonal
to running it in a worker imo.
-Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH] drm/i915: Asynchronously initialise the GPU state
  2015-07-01 13:07       ` Daniel Vetter
@ 2015-07-01 13:17         ` Chris Wilson
  2015-07-01 14:07           ` Daniel Vetter
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2015-07-01 13:17 UTC
  To: Daniel Vetter; +Cc: intel-gfx

On Wed, Jul 01, 2015 at 03:07:18PM +0200, Daniel Vetter wrote:
> On Wed, Jul 01, 2015 at 10:27:21AM +0100, Chris Wilson wrote:
> > Dave Gordon made the good suggestion that once the ringbuffers were
> > set up, the actual queuing of commands to program the initial GPU state
> > could be deferred. Since that initial state contains instructions for
> > setting up the first power context, we want to execute that as early
> > as possible, preferably in the background to userspace. Then when
> > userspace does wake up, the first time it opens the device we just need
> > to flush the work to be sure that our commands are queued before any of
> > userspace's. (Hooking into the device open should mean we have to check
> > less often than say hooking into execbuffer.)
> > 
> > Suggested-by: Dave Gordon <david.s.gordon@intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Dave Gordon <david.s.gordon@intel.com>
> 
> Just before this gets a bit out of hand with various patches floating
> around ... I really meant it when I said that we should have a proper
> design discussion about this in Jesse's meeting first.

What more is there to design? Asynchronously loading the submission port
is orthogonal to the task of queuing requests for it, and need not block
request construction (be it kernel or userspace). Dave just identified
some work that we didn't need to do during module load. I don't think he
would propose using it for loading guc firmware, that would just be
silly...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Asynchronously initialise the GPU state
  2015-07-01 13:17         ` Chris Wilson
@ 2015-07-01 14:07           ` Daniel Vetter
  2015-07-01 14:15             ` Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2015-07-01 14:07 UTC
  To: Chris Wilson, Daniel Vetter, intel-gfx

On Wed, Jul 01, 2015 at 02:17:28PM +0100, Chris Wilson wrote:
> On Wed, Jul 01, 2015 at 03:07:18PM +0200, Daniel Vetter wrote:
> > On Wed, Jul 01, 2015 at 10:27:21AM +0100, Chris Wilson wrote:
> > > Dave Gordon made the good suggestion that once the ringbuffers were
> > > set up, the actual queuing of commands to program the initial GPU state
> > > could be deferred. Since that initial state contains instructions for
> > > setting up the first power context, we want to execute that as early
> > > as possible, preferably in the background to userspace. Then when
> > > userspace does wake up, the first time it opens the device we just need
> > > to flush the work to be sure that our commands are queued before any of
> > > userspace's. (Hooking into the device open should mean we have to check
> > > less often than say hooking into execbuffer.)
> > > 
> > > Suggested-by: Dave Gordon <david.s.gordon@intel.com>
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Dave Gordon <david.s.gordon@intel.com>
> > 
> > Just before this gets a bit out of hand with various patches floating
> > around ... I really meant it when I said that we should have a proper
> > design discussion about this in Jesse's meeting first.
> 
> What more is there to design? Asynchronously loading the submission port
> is orthogonal to the task of queuing requests for it, and need not block
> request construction (be it kernel or userspace). Dave just identified
> some work that we didn't need to do during module load. I don't think he
> would propose using it for loading guc firmware, that would just be
> silly...

set_wedged in your patch doesn't have the wakeup to kick waiters. And
maybe we want to be somewhat more synchronous with init fail than gpu
hangs, for userspace to make better decisions. Also we still have that
issue that sometimes an -EIO escapes into modeset code. And yes this is
meant to provide the async init for the request firmware.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH] drm/i915: Asynchronously initialise the GPU state
  2015-07-01 14:07           ` Daniel Vetter
@ 2015-07-01 14:15             ` Chris Wilson
  0 siblings, 0 replies; 9+ messages in thread
From: Chris Wilson @ 2015-07-01 14:15 UTC
  To: Daniel Vetter; +Cc: intel-gfx

On Wed, Jul 01, 2015 at 04:07:08PM +0200, Daniel Vetter wrote:
> On Wed, Jul 01, 2015 at 02:17:28PM +0100, Chris Wilson wrote:
> > On Wed, Jul 01, 2015 at 03:07:18PM +0200, Daniel Vetter wrote:
> > > On Wed, Jul 01, 2015 at 10:27:21AM +0100, Chris Wilson wrote:
> > > > Dave Gordon made the good suggestion that once the ringbuffers were
> > > > set up, the actual queuing of commands to program the initial GPU state
> > > > could be deferred. Since that initial state contains instructions for
> > > > setting up the first power context, we want to execute that as early
> > > > as possible, preferably in the background to userspace. Then when
> > > > userspace does wake up, the first time it opens the device we just need
> > > > to flush the work to be sure that our commands are queued before any of
> > > > userspace's. (Hooking into the device open should mean we have to check
> > > > less often than say hooking into execbuffer.)
> > > > 
> > > > Suggested-by: Dave Gordon <david.s.gordon@intel.com>
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Cc: Dave Gordon <david.s.gordon@intel.com>
> > > 
> > > Just before this gets a bit out of hand with various patches floating
> > > around ... I really meant it when I said that we should have a proper
> > > design discussion about this in Jesse's meeting first.
> > 
> > What more is there to design? Asynchronously loading the submission port
> > is orthogonal to the task of queuing requests for it, and need not block
> > request construction (be it kernel or userspace). Dave just identified
> > some work that we didn't need to do during module load. I don't think he
> > would propose using it for loading guc firmware, that would just be
> > silly...
> 
> set_wedged in your patch doesn't have the wakeup to kick waiters.

True. But it has to be impossible for a waiter to exist at this point,
or else the entire async GPU init is broken. These commands have to be
the first requests we send to the GPU. Everything else must wait before
it is allowed to start queuing.

> And
> maybe we want to be somewhat more synchronous with init fail than gpu
> hangs, for userspace to make better decisions.

The init is still synchronous with userspace using the device, just (and
this is no change) the only communication with userspace that GEM
initialisation failed is the wedged GPU.

> Also we still have that
> issue that sometimes an -EIO escapes into modeset code.

But there are no new wait requests running concurrent with GEM init,
so this patch doesn't alter that.

> And yes this is
> meant to provide the async init for the request firmware.

It is an inappropriate juncture for async request firmware. I can keep
repeating that enabling the submission ports is orthogonal to setting up
the CS engines and allowing requests to be queued, because it is...

There is no need to modify the higher levels for async GuC
initialisation. The serialisation there is when to start feeding requests
into the submission port. At the moment we do that immediately when it
is idle - but the GuC is not idle until it is loaded; as soon as it is
loaded it can simply feed in the first set of requests and start on
its merry way. This also allows a GuC failure always to fall back
transparently to execlists.
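
To make that concrete, here is a small userspace model of the idea, not
of any existing driver code: requests are always accepted, but are only
fed to the hardware once the submission backend has settled. The sub_port
type, the backend enum and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical submission backends; LOADING means firmware not yet ready. */
enum sub_backend { SUB_LOADING, SUB_GUC, SUB_EXECLISTS };

/* Hypothetical submission port: counts requests on each side of the gate. */
struct sub_port {
	enum sub_backend backend;
	int queued;	/* accepted, not yet fed to the hardware */
	int submitted;	/* fed to the hardware */
};

/* Feed everything that is queued, unless the backend is still loading. */
static void sub_port_kick(struct sub_port *p)
{
	if (p->backend == SUB_LOADING)
		return;
	p->submitted += p->queued;
	p->queued = 0;
}

/* Request construction never blocks on the backend; it only queues. */
static void sub_queue_request(struct sub_port *p)
{
	p->queued++;
	sub_port_kick(p);
}

/* Firmware load finished: pick the backend and drain the backlog. */
static void sub_firmware_done(struct sub_port *p, bool ok)
{
	p->backend = ok ? SUB_GUC : SUB_EXECLISTS;
	sub_port_kick(p);
}
```

In this model the callers of sub_queue_request() never see whether the
firmware loaded; a failed load only changes which backend drains the
backlog.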
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open
  2015-06-30 15:01 ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open Dave Gordon
  2015-06-30 15:08   ` Chris Wilson
@ 2015-07-02 12:20   ` shuang.he
  1 sibling, 0 replies; 9+ messages in thread
From: shuang.he @ 2015-07-02 12:20 UTC
  To: shuang.he, lei.a.liu, intel-gfx, david.s.gordon

Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6687
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
ILK                                  302/302              302/302
SNB                                  312/316              312/316
IVB                                  343/343              343/343
BYT                 -1              287/287              286/287
HSW                                  380/380              380/380
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
*BYT  igt@gem_partial_pwrite_pread@reads-display      PASS(1)      FAIL(1)
Note: You need to pay more attention to lines starting with '*'

Thread overview: 9+ messages
2015-06-30 15:01 [PATCH 1/2] drm/i915: Split late "for_each_ring" loop from i915_gem_init_hw() Dave Gordon
2015-06-30 15:01 ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open Dave Gordon
2015-06-30 15:08   ` Chris Wilson
2015-07-01  9:27     ` [PATCH] drm/i915: Asynchronously initialise the GPU state Chris Wilson
2015-07-01 13:07       ` Daniel Vetter
2015-07-01 13:17         ` Chris Wilson
2015-07-01 14:07           ` Daniel Vetter
2015-07-01 14:15             ` Chris Wilson
2015-07-02 12:20   ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open shuang.he
