* [PATCH 1/2] drm/i915: Split late "for_each_ring" loop from i915_gem_init_hw()
@ 2015-06-30 15:01 Dave Gordon
2015-06-30 15:01 ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open Dave Gordon
0 siblings, 1 reply; 9+ messages in thread
From: Dave Gordon @ 2015-06-30 15:01 UTC (permalink / raw)
To: intel-gfx
This function has recently been updated by several patches, including:
drm/i915: Add explicit request management to i915_gem_init_hw()
drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable()
Now we need to move the entire loop into a separate function, replacing
the inline loop with a call. This will allow a future patch to add a
call from other locations (for now, there are no other calls).
The split marks the distinction between early initialisation using
MMIO register access to set up non-context registers, and late
initialisation using batchbuffers containing LRI instructions to
set up context-specific registers.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gem.c | 15 ++++++++++++++-
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ea9caf2..bc7c510 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2882,6 +2882,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
int __must_check i915_gem_init(struct drm_device *dev);
int i915_gem_init_rings(struct drm_device *dev);
int __must_check i915_gem_init_hw(struct drm_device *dev);
+int i915_gem_init_hw_late(struct drm_device *dev);
int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice);
void i915_gem_init_swizzling(struct drm_device *dev);
void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 52efe43..1887e60 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5073,6 +5073,20 @@ i915_gem_init_hw(struct drm_device *dev)
 			goto out;
 	}
 
+	ret = i915_gem_init_hw_late(dev);
+
+out:
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+	return ret;
+}
+
+int
+i915_gem_init_hw_late(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_engine_cs *ring;
+	int ret, i, j;
+
 	/* Now it is safe to go back round and do everything else: */
 	for_each_ring(ring, dev_priv, i) {
 		struct drm_i915_gem_request *req;
@@ -5110,7 +5124,6 @@ i915_gem_init_hw(struct drm_device *dev)
 	}
 
 out:
-	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 	return ret;
 }
 
--
1.7.9.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
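The split in PATCH 1/2 can be modelled outside the kernel. The following is a minimal userspace sketch, not driver code: struct split_dev, SPLIT_NUM_RINGS and both functions are hypothetical stand-ins, and the "forcewake" counter only imitates the get/put pairing that the patch keeps in i915_gem_init_hw(). It shows the late per-ring loop living in its own function while the caller keeps the single exit path that releases the reference.

```c
#include <assert.h>

#define SPLIT_NUM_RINGS 3

/* Hypothetical stand-in for the driver state; not an i915 structure. */
struct split_dev {
	int forcewake_refs;   /* models the forcewake get/put balance */
	int rings_enabled;    /* rings brought up by the early (MMIO) phase */
	int rings_programmed; /* rings that finished the late phase */
	int fail_late_on;     /* ring index whose late setup fails, or -1 */
};

/* Late phase: the per-ring loop, now callable on its own. */
int split_init_hw_late(struct split_dev *dev)
{
	int i;

	for (i = 0; i < SPLIT_NUM_RINGS; i++) {
		if (i == dev->fail_late_on)
			return -1;
		dev->rings_programmed++;
	}
	return 0;
}

/* Early phase, then the late phase, with a single exit path. */
int split_init_hw(struct split_dev *dev)
{
	int ret;

	dev->forcewake_refs++;                /* "forcewake_get" */
	dev->rings_enabled = SPLIT_NUM_RINGS; /* early, register-style setup */

	ret = split_init_hw_late(dev);

	dev->forcewake_refs--;                /* "forcewake_put" on every path */
	assert(dev->forcewake_refs >= 0);
	return ret;
}
```

Because the release stays in the caller, a failure inside the late loop cannot leak the reference, which is the property the moved "out:" label preserves in the patch.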
* [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open
2015-06-30 15:01 [PATCH 1/2] drm/i915: Split late "for_each_ring" loop from i915_gem_init_hw() Dave Gordon
@ 2015-06-30 15:01 ` Dave Gordon
2015-06-30 15:08 ` Chris Wilson
2015-07-02 12:20 ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open shuang.he
0 siblings, 2 replies; 9+ messages in thread
From: Dave Gordon @ 2015-06-30 15:01 UTC (permalink / raw)
To: intel-gfx

We can do less work during driver load by deferring some of it until
the first time the device is opened; in particular, the function
i915_gem_init_hw_late() introduced by the previous patch. This should
allow the system to get out of the early single-threaded phase of
system initialisation and into full multi-user mode somewhat quicker.

In addition, we expect that by the time of the first open, not only
the driver's software structures but also system-specific items such
as filesystem mounting have been fully initialised, meaning that the
late initialisation code can run in a much more complete environment
than the driver_load stage presents. This can be important for
embedded programmable devices that need firmware loaded from a file
before they can be used.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h         |  1 +
drivers/gpu/drm/i915/i915_gem.c         |  4 +++-
drivers/gpu/drm/i915/i915_gem_context.c | 32 ++++++++++++++++++++++++++-----
3 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bc7c510..ba63804 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1845,6 +1845,7 @@ struct drm_i915_private {
 	/* hda/i915 audio component */
 	bool audio_component_registered;
 
+	bool contexts_ready;
 	uint32_t hw_context_size;
 	struct list_head context_list;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1887e60..0cb962f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5073,7 +5073,9 @@ i915_gem_init_hw(struct drm_device *dev)
 			goto out;
 	}
 
-	ret = i915_gem_init_hw_late(dev);
+	/* Don't do late init on the first time through here */
+	if (dev_priv->contexts_ready)
+		ret = i915_gem_init_hw_late(dev);
 
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index a7e58a8..917c867 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -438,23 +438,45 @@ static int context_idr_cleanup(int id, void *p, void *data)
 	return 0;
 }
 
+/* Complete any late initialisation here */
+static int i915_gem_context_first_open(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret;
+
+	ret = i915_gem_init_hw_late(dev);
+	if (ret == 0)
+		dev_priv->contexts_ready = true;
+
+	return ret;
+}
+
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct intel_context *ctx;
+	int ret = 0;
 
 	idr_init(&file_priv->context_idr);
 
 	mutex_lock(&dev->struct_mutex);
-	ctx = i915_gem_create_context(dev, file_priv);
+
+	if (!dev_priv->contexts_ready)
+		ret = i915_gem_context_first_open(dev);
+
+	if (ret == 0) {
+		ctx = i915_gem_create_context(dev, file_priv);
+		if (IS_ERR(ctx))
+			ret = PTR_ERR(ctx);
+	}
+
 	mutex_unlock(&dev->struct_mutex);
 
-	if (IS_ERR(ctx)) {
+	if (ret)
 		idr_destroy(&file_priv->context_idr);
-		return PTR_ERR(ctx);
-	}
 
-	return 0;
+	return ret;
 }
 
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file)
-- 
1.7.9.5
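The first-open logic in PATCH 2/2 can be tried in isolation. This is a rough userspace sketch under stated assumptions: struct lazy_dev, lazy_init_hw_late() and lazy_context_open() are hypothetical names, a pthread mutex plays the part of struct_mutex, and the late init is reduced to returning a preset value. It shows the contexts_ready flag being tested and set under the lock, so that the late init runs until it first succeeds and is skipped afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not an i915 structure. */
struct lazy_dev {
	pthread_mutex_t lock;  /* plays the role of struct_mutex */
	bool contexts_ready;   /* set once the late init has succeeded */
	int late_init_calls;   /* how often the late init was attempted */
	int late_init_result;  /* value the fake late init returns */
	int contexts_created;  /* successful opens */
};

/* Fake late init: records the call and returns the preset result. */
static int lazy_init_hw_late(struct lazy_dev *dev)
{
	dev->late_init_calls++;
	return dev->late_init_result;
}

/* Open path: complete the late init first, then create the context. */
int lazy_context_open(struct lazy_dev *dev)
{
	int ret = 0;

	pthread_mutex_lock(&dev->lock);

	if (!dev->contexts_ready) {
		ret = lazy_init_hw_late(dev);
		if (ret == 0)
			dev->contexts_ready = true;
	}

	if (ret == 0)
		dev->contexts_created++;

	pthread_mutex_unlock(&dev->lock);

	return ret;
}
```

In this model a failed late init is retried on the next open, because the flag is only set on success. The sketch also makes the dependency visible: nothing runs the late init until some caller opens the device.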
* Re: [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open
2015-06-30 15:01 ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open Dave Gordon
@ 2015-06-30 15:08 ` Chris Wilson
2015-07-01 9:27 ` [PATCH] drm/i915: Asynchronously initialise the GPU state Chris Wilson
2015-07-02 12:20 ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open shuang.he
1 sibling, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2015-06-30 15:08 UTC (permalink / raw)
To: Dave Gordon; +Cc: intel-gfx

On Tue, Jun 30, 2015 at 04:01:11PM +0100, Dave Gordon wrote:
> We can do less work during driver load by deferring some of it until
> the first time the device is opened; in particular, the function
> i915_gem_init_hw_late() introduced by the previous patch. This should
> allow the system to get out of the early single-threaded phase of
> system initialisation and into full multi-user mode somewhat quicker.
>
> In addition, we expect that by the time of the first open, not only
> the driver's software structures but also system-specific items such
> as filesystem mounting have been fully initialised, meaning that the
> late initialisation code can run in a much more complete environment
> than the driver_load stage presents. This can be important for
> embedded programmable devices that need firmware loaded from a file
> before they can be used.

No. It delays establishing the power contexts. Something that in the
past we have been told must be done asap.

You can move it to an async worker, that would achieve the goals
stated here, and not be dependent on a cooperating userspace.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
* [PATCH] drm/i915: Asynchronously initialise the GPU state
2015-06-30 15:08 ` Chris Wilson
@ 2015-07-01 9:27 ` Chris Wilson
2015-07-01 13:07 ` Daniel Vetter
0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2015-07-01 9:27 UTC (permalink / raw)
To: intel-gfx

Dave Gordon made the good suggestion that once the ringbuffers were
set up, the actual queuing of commands to program the initial GPU state
could be deferred. Since that initial state contains instructions for
setting up the first power context, we want to execute that as early
as possible, preferably in the background to userspace. Then when
userspace does wake up, the first time it opens the device we just need
to flush the work to be sure that our commands are queued before any of
userspace's. (Hooking into the device open should mean we have to check
less often than say hooking into execbuffer.)

Suggested-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h |   2 +
drivers/gpu/drm/i915/i915_gem.c | 113 +++++++++++++++++++++++++++-------------
2 files changed, 79 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3ea1fe8db63e..d4003dea97eb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1938,6 +1938,8 @@ struct drm_i915_private {
 
 	bool edp_low_vswing;
 
+	struct work_struct init_hw_late;
+
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2f0fed1b9dd7..7efa71f8edd7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5140,12 +5140,76 @@ cleanup_render_ring:
 	return ret;
 }
 
+static int
+i915_gem_init_hw_late(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *ring;
+	int i, j;
+
+	for_each_ring(ring, dev_priv, i) {
+		struct drm_i915_gem_request *req;
+		int ret;
+
+		if (WARN_ON(!ring->default_context)) {
+			ret = -ENODEV;
+			goto err;
+		}
+
+		req = i915_gem_request_alloc(ring, ring->default_context);
+		if (IS_ERR(req)) {
+			ret = PTR_ERR(req);
+			goto err;
+		}
+
+		if (ring->id == RCS) {
+			for (j = 0; j < NUM_L3_SLICES(dev_priv); j++)
+				i915_gem_l3_remap(req, j);
+		}
+
+		ret = i915_ppgtt_init_ring(req);
+		if (ret) {
+			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
+			goto err_req;
+		}
+
+		ret = i915_gem_context_enable(req);
+		if (ret) {
+			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
+			goto err_req;
+		}
+
+		i915_add_request_no_flush(req);
+		continue;
+
+err_req:
+		i915_gem_request_cancel(req);
+err:
+		return ret;
+	}
+
+	return 0;
+}
+
+static void
+i915_gem_init_hw_worker(struct work_struct *work)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(work, typeof(*dev_priv), init_hw_late);
+	mutex_lock(&dev_priv->dev->struct_mutex);
+	if (i915_gem_init_hw_late(dev_priv)) {
+		DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
+		atomic_set_mask(I915_WEDGED,
+				&dev_priv->gpu_error.reset_counter);
+	}
+	mutex_unlock(&dev_priv->dev->struct_mutex);
+}
+
 int
 i915_gem_init_hw(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring;
-	int ret, i, j;
+	int ret, i;
 
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
@@ -5198,41 +5262,10 @@ i915_gem_init_hw(struct drm_device *dev)
 	}
 
 	/* Now it is safe to go back round and do everything else: */
-	for_each_ring(ring, dev_priv, i) {
-		struct drm_i915_gem_request *req;
-
-		WARN_ON(!ring->default_context);
-
-		req = i915_gem_request_alloc(ring, ring->default_context);
-		if (IS_ERR(req)) {
-			ret = PTR_ERR(req);
-			i915_gem_cleanup_ringbuffer(dev);
-			goto out;
-		}
-
-		if (ring->id == RCS) {
-			for (j = 0; j < NUM_L3_SLICES(dev); j++)
-				i915_gem_l3_remap(req, j);
-		}
-
-		ret = i915_ppgtt_init_ring(req);
-		if (ret && ret != -EIO) {
-			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
-			i915_gem_request_cancel(req);
-			i915_gem_cleanup_ringbuffer(dev);
-			goto out;
-		}
-
-		ret = i915_gem_context_enable(req);
-		if (ret && ret != -EIO) {
-			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
-			i915_gem_request_cancel(req);
-			i915_gem_cleanup_ringbuffer(dev);
-			goto out;
-		}
-
-		i915_add_request_no_flush(req);
-	}
+	if (dev->open_count == 0) /* uncontested with userspace, i.e. boot */
+		queue_work(dev_priv->wq, &dev_priv->init_hw_late);
+	else
+		ret = i915_gem_init_hw_late(dev_priv);
 
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
@@ -5379,6 +5412,7 @@ i915_gem_load(struct drm_device *dev)
 		init_ring_lists(&dev_priv->ring[i]);
 	for (i = 0; i < I915_MAX_NUM_FENCES; i++)
 		INIT_LIST_HEAD(&dev_priv->fence_regs[i].lru_list);
+	INIT_WORK(&dev_priv->init_hw_late, i915_gem_init_hw_worker);
 	INIT_DELAYED_WORK(&dev_priv->mm.retire_work,
 			  i915_gem_retire_work_handler);
 	INIT_DELAYED_WORK(&dev_priv->mm.idle_work,
@@ -5442,11 +5476,18 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 
 int i915_gem_open(struct drm_device *dev, struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_file_private *file_priv;
 	int ret;
 
 	DRM_DEBUG_DRIVER("\n");
 
+	/* Flush ring initialisation before userspace can submit its own
+	 * batches, so the hardware initialisation commands are queued
+	 * first.
+	 */
+	flush_work(&dev_priv->init_hw_late);
+
 	file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
 	if (!file_priv)
 		return -ENOMEM;
-- 
2.1.4
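The ordering guarantee this patch relies on (queue the init at load, flush it at open) can be illustrated with a deliberately simplified model. The sketch below is single-threaded userspace C and does not use the kernel workqueue API: struct async_work, async_queue_work() and async_flush_work() are hypothetical, and "flush" here simply runs the pending item in the caller, whereas the real flush_work() waits for a worker thread. What it does preserve is the caller-visible contract: once the flush returns, the deferred init has completed, so its commands precede anything userspace submits. The wedged flag and the -1 return are also assumptions of this model, not the driver's error path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded model of a work item; not the kernel API. */
struct async_work {
	void (*fn)(struct async_work *work);
	bool pending;
};

struct async_dev {
	struct async_work init_hw_late;
	bool wedged;      /* set when the deferred init fails */
	int init_result;  /* value the fake late init returns */
	char log[8];      /* submission order: 'K' kernel init, 'U' userspace */
	int nlog;
};

static void async_log(struct async_dev *dev, char who)
{
	if (dev->nlog < (int)sizeof(dev->log) - 1)
		dev->log[dev->nlog++] = who;
}

/* Queue: only mark the item pending; nothing runs yet. */
void async_queue_work(struct async_work *work)
{
	work->pending = true;
}

/* Flush: on return the item has finished running, if it was pending. */
void async_flush_work(struct async_work *work)
{
	if (work->pending) {
		work->pending = false;
		work->fn(work);
	}
}

static void async_init_hw_worker(struct async_work *work)
{
	/* Recover the containing device, as container_of() would. */
	struct async_dev *dev = (struct async_dev *)
		((char *)work - offsetof(struct async_dev, init_hw_late));

	if (dev->init_result != 0) {
		dev->wedged = true;
		return;
	}
	async_log(dev, 'K');
}

/* "Driver load": defer the late init instead of running it inline. */
void async_load(struct async_dev *dev)
{
	dev->init_hw_late.fn = async_init_hw_worker;
	async_queue_work(&dev->init_hw_late);
}

/* "Open and first submission": flush first so kernel setup is queued first. */
int async_open_and_submit(struct async_dev *dev)
{
	async_flush_work(&dev->init_hw_late);
	if (dev->wedged)
		return -1;
	async_log(dev, 'U');
	return 0;
}
```

Unlike the first-open approach, the queued item does not depend on userspace ever opening the device; the open path only acts as a barrier. A threaded version would need a real wait in the flush, which this sketch leaves out.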
* Re: [PATCH] drm/i915: Asynchronously initialise the GPU state
2015-07-01 9:27 ` [PATCH] drm/i915: Asynchronously initialise the GPU state Chris Wilson
@ 2015-07-01 13:07 ` Daniel Vetter
2015-07-01 13:17 ` Chris Wilson
0 siblings, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2015-07-01 13:07 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx

On Wed, Jul 01, 2015 at 10:27:21AM +0100, Chris Wilson wrote:
> Dave Gordon made the good suggestion that once the ringbuffers were
> set up, the actual queuing of commands to program the initial GPU state
> could be deferred. Since that initial state contains instructions for
> setting up the first power context, we want to execute that as early
> as possible, preferably in the background to userspace. Then when
> userspace does wake up, the first time it opens the device we just need
> to flush the work to be sure that our commands are queued before any of
> userspace's. (Hooking into the device open should mean we have to check
> less often than say hooking into execbuffer.)
>
> Suggested-by: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dave Gordon <david.s.gordon@intel.com>

Just before this gets a bit out of hand with various patches floating
around ... I really meant it when I said that we should have a proper
design discussion about this in Jesse's meeting first. Looking at all the
ideas between you, Dave & me I count about 3-4 approaches to async gem
init, and all have upsides and downsides.

Aside from that I concur that if we do async gem init then it better be a
worker and not relying on some arbitrary userspace ioctl/syscall. Of course
we'd still need to place proper synchronization points at a good place
(flush_work in gem_open for Dave's design), but that's really orthogonal
to running it in a worker imo.
-Daniel

> ---
> drivers/gpu/drm/i915/i915_drv.h |   2 +
> drivers/gpu/drm/i915/i915_gem.c | 113 +++++++++++++++++++++++++++-------------
> 2 files changed, 79 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 3ea1fe8db63e..d4003dea97eb 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1938,6 +1938,8 @@ struct drm_i915_private {
>
>  	bool edp_low_vswing;
>
> +	struct work_struct init_hw_late;
> +
>  	/*
>  	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
>  	 * will be rejected. Instead look for a better place.
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 2f0fed1b9dd7..7efa71f8edd7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5140,12 +5140,76 @@ cleanup_render_ring:
>  	return ret;
>  }
>
> +static int
> +i915_gem_init_hw_late(struct drm_i915_private *dev_priv)
> +{
> +	struct intel_engine_cs *ring;
> +	int i, j;
> +
> +	for_each_ring(ring, dev_priv, i) {
> +		struct drm_i915_gem_request *req;
> +		int ret;
> +
> +		if (WARN_ON(!ring->default_context)) {
> +			ret = -ENODEV;
> +			goto err;
> +		}
> +
> +		req = i915_gem_request_alloc(ring, ring->default_context);
> +		if (IS_ERR(req)) {
> +			ret = PTR_ERR(req);
> +			goto err;
> +		}
> +
> +		if (ring->id == RCS) {
> +			for (j = 0; j < NUM_L3_SLICES(dev_priv); j++)
> +				i915_gem_l3_remap(req, j);
> +		}
> +
> +		ret = i915_ppgtt_init_ring(req);
> +		if (ret) {
> +			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
> +			goto err_req;
> +		}
> +
> +		ret = i915_gem_context_enable(req);
> +		if (ret) {
> +			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
> +			goto err_req;
> +		}
> +
> +		i915_add_request_no_flush(req);
> +		continue;
> +
> +err_req:
> +		i915_gem_request_cancel(req);
> +err:
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static void
> +i915_gem_init_hw_worker(struct work_struct *work)
> +{
> +	struct drm_i915_private *dev_priv =
> +		container_of(work, typeof(*dev_priv), init_hw_late);
> +	mutex_lock(&dev_priv->dev->struct_mutex);
> +	if (i915_gem_init_hw_late(dev_priv)) {
> +		DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
> +		atomic_set_mask(I915_WEDGED,
> +				&dev_priv->gpu_error.reset_counter);
> +	}
> +	mutex_unlock(&dev_priv->dev->struct_mutex);
> +}
> +
>  int
>  i915_gem_init_hw(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_engine_cs *ring;
> -	int ret, i, j;
> +	int ret, i;
>
>  	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
>  		return -EIO;
> @@ -5198,41 +5262,10 @@ i915_gem_init_hw(struct drm_device *dev)
>  	}
>
>  	/* Now it is safe to go back round and do everything else: */
> -	for_each_ring(ring, dev_priv, i) {
> -		struct drm_i915_gem_request *req;
> -
> -		WARN_ON(!ring->default_context);
> -
> -		req = i915_gem_request_alloc(ring, ring->default_context);
> -		if (IS_ERR(req)) {
> -			ret = PTR_ERR(req);
> -			i915_gem_cleanup_ringbuffer(dev);
> -			goto out;
> -		}
> -
> -		if (ring->id == RCS) {
> -			for (j = 0; j < NUM_L3_SLICES(dev); j++)
> -				i915_gem_l3_remap(req, j);
> -		}
> -
> -		ret = i915_ppgtt_init_ring(req);
> -		if (ret && ret != -EIO) {
> -			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
> -			i915_gem_request_cancel(req);
> -			i915_gem_cleanup_ringbuffer(dev);
> -			goto out;
> -		}
> -
> -		ret = i915_gem_context_enable(req);
> -		if (ret && ret != -EIO) {
> -			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
> -			i915_gem_request_cancel(req);
> -			i915_gem_cleanup_ringbuffer(dev);
> -			goto out;
> -		}
> -
> -		i915_add_request_no_flush(req);
> -	}
> +	if (dev->open_count == 0) /* uncontested with userspace, i.e. boot */
> +		queue_work(dev_priv->wq, &dev_priv->init_hw_late);
> +	else
> +		ret = i915_gem_init_hw_late(dev_priv);
>
>  out:
>  	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> @@ -5379,6 +5412,7 @@ i915_gem_load(struct drm_device *dev)
>  		init_ring_lists(&dev_priv->ring[i]);
>  	for (i = 0; i < I915_MAX_NUM_FENCES; i++)
>  		INIT_LIST_HEAD(&dev_priv->fence_regs[i].lru_list);
> +	INIT_WORK(&dev_priv->init_hw_late, i915_gem_init_hw_worker);
>  	INIT_DELAYED_WORK(&dev_priv->mm.retire_work,
>  			  i915_gem_retire_work_handler);
>  	INIT_DELAYED_WORK(&dev_priv->mm.idle_work,
> @@ -5442,11 +5476,18 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
>
>  int i915_gem_open(struct drm_device *dev, struct drm_file *file)
>  {
> +	struct drm_i915_private *dev_priv = to_i915(dev);
>  	struct drm_i915_file_private *file_priv;
>  	int ret;
>
>  	DRM_DEBUG_DRIVER("\n");
>
> +	/* Flush ring initialisation before userspace can submit its own
> +	 * batches, so the hardware initialisation commands are queued
> +	 * first.
> +	 */
> +	flush_work(&dev_priv->init_hw_late);
> +
>  	file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
>  	if (!file_priv)
>  		return -ENOMEM;
> --
> 2.1.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* Re: [PATCH] drm/i915: Asynchronously initialise the GPU state
2015-07-01 13:07 ` Daniel Vetter
@ 2015-07-01 13:17 ` Chris Wilson
2015-07-01 14:07 ` Daniel Vetter
0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2015-07-01 13:17 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx

On Wed, Jul 01, 2015 at 03:07:18PM +0200, Daniel Vetter wrote:
> On Wed, Jul 01, 2015 at 10:27:21AM +0100, Chris Wilson wrote:
> > Dave Gordon made the good suggestion that once the ringbuffers were
> > set up, the actual queuing of commands to program the initial GPU state
> > could be deferred. Since that initial state contains instructions for
> > setting up the first power context, we want to execute that as early
> > as possible, preferably in the background to userspace. Then when
> > userspace does wake up, the first time it opens the device we just need
> > to flush the work to be sure that our commands are queued before any of
> > userspace's. (Hooking into the device open should mean we have to check
> > less often than say hooking into execbuffer.)
> >
> > Suggested-by: Dave Gordon <david.s.gordon@intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Dave Gordon <david.s.gordon@intel.com>
>
> Just before this gets a bit out of hand with various patches floating
> around ... I really meant it when I said that we should have a proper
> design discussion about this in Jesse's meeting first.

What more is there to design? Asynchronously loading the submission port
is orthogonal to the task of queuing requests for it, and need not block
request construction (be it kernel or userspace). Dave just identified
some work that we didn't need to do during module load. I don't think he
would propose using it for loading guc firmware, that would just be
silly...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH] drm/i915: Asynchronously initialise the GPU state
2015-07-01 13:17 ` Chris Wilson
@ 2015-07-01 14:07 ` Daniel Vetter
2015-07-01 14:15 ` Chris Wilson
0 siblings, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2015-07-01 14:07 UTC (permalink / raw)
To: Chris Wilson, Daniel Vetter, intel-gfx

On Wed, Jul 01, 2015 at 02:17:28PM +0100, Chris Wilson wrote:
> On Wed, Jul 01, 2015 at 03:07:18PM +0200, Daniel Vetter wrote:
> > On Wed, Jul 01, 2015 at 10:27:21AM +0100, Chris Wilson wrote:
> > > Dave Gordon made the good suggestion that once the ringbuffers were
> > > set up, the actual queuing of commands to program the initial GPU state
> > > could be deferred. Since that initial state contains instructions for
> > > setting up the first power context, we want to execute that as early
> > > as possible, preferably in the background to userspace. Then when
> > > userspace does wake up, the first time it opens the device we just need
> > > to flush the work to be sure that our commands are queued before any of
> > > userspace's. (Hooking into the device open should mean we have to check
> > > less often than say hooking into execbuffer.)
> > >
> > > Suggested-by: Dave Gordon <david.s.gordon@intel.com>
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Dave Gordon <david.s.gordon@intel.com>
> >
> > Just before this gets a bit out of hand with various patches floating
> > around ... I really meant it when I said that we should have a proper
> > design discussion about this in Jesse's meeting first.
>
> What more is there to design? Asynchronously loading the submission port
> is orthogonal to the task of queuing requests for it, and need not block
> request construction (be it kernel or userspace). Dave just identified
> some work that we didn't need to do during module load. I don't think he
> would propose using it for loading guc firmware, that would just be
> silly...

set_wedged in your patch doesn't have the wakeup to kick waiters. And
maybe we want to be somewhat more synchronous with init fail than gpu
hangs, for userspace to make better decisions. Also we still have that
issue that sometimes an -EIO escapes into modeset code. And yes this is
meant to provide the async init for the request firmware.
-Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* Re: [PATCH] drm/i915: Asynchronously initialise the GPU state
2015-07-01 14:07 ` Daniel Vetter
@ 2015-07-01 14:15 ` Chris Wilson
0 siblings, 0 replies; 9+ messages in thread
From: Chris Wilson @ 2015-07-01 14:15 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx

On Wed, Jul 01, 2015 at 04:07:08PM +0200, Daniel Vetter wrote:
> On Wed, Jul 01, 2015 at 02:17:28PM +0100, Chris Wilson wrote:
> > On Wed, Jul 01, 2015 at 03:07:18PM +0200, Daniel Vetter wrote:
> > > On Wed, Jul 01, 2015 at 10:27:21AM +0100, Chris Wilson wrote:
> > > > Dave Gordon made the good suggestion that once the ringbuffers were
> > > > set up, the actual queuing of commands to program the initial GPU state
> > > > could be deferred. Since that initial state contains instructions for
> > > > setting up the first power context, we want to execute that as early
> > > > as possible, preferably in the background to userspace. Then when
> > > > userspace does wake up, the first time it opens the device we just need
> > > > to flush the work to be sure that our commands are queued before any of
> > > > userspace's. (Hooking into the device open should mean we have to check
> > > > less often than say hooking into execbuffer.)
> > > >
> > > > Suggested-by: Dave Gordon <david.s.gordon@intel.com>
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Cc: Dave Gordon <david.s.gordon@intel.com>
> > >
> > > Just before this gets a bit out of hand with various patches floating
> > > around ... I really meant it when I said that we should have a proper
> > > design discussion about this in Jesse's meeting first.
> >
> > What more is there to design? Asynchronously loading the submission port
> > is orthogonal to the task of queuing requests for it, and need not block
> > request construction (be it kernel or userspace). Dave just identified
> > some work that we didn't need to do during module load. I don't think he
> > would propose using it for loading guc firmware, that would just be
> > silly...
>
> set_wedged in your patch doesn't have the wakeup to kick waiters.

True. But it has to be impossible for a waiter to exist at this point,
or else the entire async GPU init is broken. These commands have to be
the first requests we send to the GPU. Everything else must wait before
it is allowed to start queuing.

> And
> maybe we want to be somewhat more synchronous with init fail than gpu
> hangs, for userspace to make better decisions.

The init is still synchronous with userspace using the device, just (and
this is no change) the only communication with userspace that GEM
initialisation failed is the wedged GPU.

> Also we still have that
> issue that sometimes an -EIO escapes into modeset code.

But there are no new wait requests running concurrent with GEM init, so
this patch doesn't alter that.

> And yes this is
> meant to provide the async init for the request firmware.

It is an inappropriate juncture for async request firmware. I can keep
repeating that enabling the submission ports is orthogonal to setting up
the CS engines and allowing requests to be queued, because it is...

There is no need to modify the higher levels for async GuC
initialisation. The serialisation there is when to start feeding
requests into the submission port. At the moment we do that immediately
when it is idle - but the GuC is not idle until it is loaded; as soon as
it is loaded it can simply feed in the first set of requests and start
on its merry way. This also allows GuC failure to always transparently
fall back to execlists.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open
2015-06-30 15:01 ` [PATCH 2/2] drm/i915: Defer late hardware initialisation until first open Dave Gordon
2015-06-30 15:08 ` Chris Wilson
@ 2015-07-02 12:20 ` shuang.he
1 sibling, 0 replies; 9+ messages in thread
From: shuang.he @ 2015-07-02 12:20 UTC (permalink / raw)
To: shuang.he, lei.a.liu, intel-gfx, david.s.gordon

Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6687
-------------------------------------Summary-------------------------------------
Platform  Delta  drm-intel-nightly  Series Applied
ILK              302/302            302/302
SNB              312/316            312/316
IVB              343/343            343/343
BYT       -1     287/287            286/287
HSW              380/380            380/380
-------------------------------------Detailed-------------------------------------
Platform  Test                                        drm-intel-nightly  Series Applied
*BYT      igt@gem_partial_pwrite_pread@reads-display  PASS(1)            FAIL(1)
Note: You need to pay more attention to line start with '*'