* Premature unpinning, finally?
@ 2016-04-26 20:05 Chris Wilson
2016-04-26 20:05 ` [PATCH v6 01/25] drm/i915/fbdev: Call intel_unpin_fb_obj() on release Chris Wilson
From: Chris Wilson @ 2016-04-26 20:05 UTC (permalink / raw)
To: intel-gfx
Now that we have the context leak sorted, everything should be in place to
satisfy BAT, or so I hope. A few new patches need reviewing, and the rest
could do with eyeballing - though I have tried to keep them the same, the
code around them has changed slightly, and I hope I have not introduced
subtle bugs in the rebasing.
[PATCH v6 12/25] drm/i915: Unify intel_ring_begin()
[PATCH v6 13/25] drm/i915: Remove the identical implementations of
[PATCH v6 14/25] drm/i915: Manually unwind after a failed request
[PATCH v6 15/25] drm/i915: Preallocate enough space for the average
[PATCH v6 16/25] drm/i915: Update execlists context descriptor format
are new.
-Chris
* [PATCH v6 01/25] drm/i915/fbdev: Call intel_unpin_fb_obj() on release
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
@ 2016-04-26 20:05 ` Chris Wilson
2016-04-27 12:45 ` Tvrtko Ursulin
2016-04-26 20:05 ` [PATCH v6 02/25] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr Chris Wilson
From: Chris Wilson @ 2016-04-26 20:05 UTC (permalink / raw)
To: intel-gfx
When releasing the intel_fbdev, we should unpin the framebuffer that we
pinned during construction.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/intel_display.c | 2 +-
drivers/gpu/drm/i915/intel_drv.h | 1 +
drivers/gpu/drm/i915/intel_fbdev.c | 7 ++++++-
3 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index b7cb632a2a31..87ce7852482b 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2309,7 +2309,7 @@ err_pm:
return ret;
}
-static void intel_unpin_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
+void intel_unpin_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
{
struct drm_i915_gem_object *obj = intel_fb_obj(fb);
struct i915_ggtt_view view;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index cb89a35a6755..dcb19133f640 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1162,6 +1162,7 @@ void intel_release_load_detect_pipe(struct drm_connector *connector,
struct drm_modeset_acquire_ctx *ctx);
int intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
unsigned int rotation);
+void intel_unpin_fb_obj(struct drm_framebuffer *fb, unsigned int rotation);
struct drm_framebuffer *
__intel_framebuffer_create(struct drm_device *dev,
struct drm_mode_fb_cmd2 *mode_cmd,
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index af561542a5a1..1c3ad121f1b9 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -287,7 +287,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
out_destroy_fbi:
drm_fb_helper_release_fbi(helper);
out_unpin:
- i915_gem_object_ggtt_unpin(obj);
+ intel_unpin_fb_obj(&ifbdev->fb->base, BIT(DRM_ROTATE_0));
out_unlock:
mutex_unlock(&dev->struct_mutex);
return ret;
@@ -551,6 +551,11 @@ static void intel_fbdev_destroy(struct drm_device *dev,
if (ifbdev->fb) {
drm_framebuffer_unregister_private(&ifbdev->fb->base);
+
+ mutex_lock(&dev->struct_mutex);
+ intel_unpin_fb_obj(&ifbdev->fb->base, BIT(DRM_ROTATE_0));
+ mutex_unlock(&dev->struct_mutex);
+
drm_framebuffer_remove(&ifbdev->fb->base);
}
}
--
2.8.1
* [PATCH v6 02/25] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
2016-04-26 20:05 ` [PATCH v6 01/25] drm/i915/fbdev: Call intel_unpin_fb_obj() on release Chris Wilson
@ 2016-04-26 20:05 ` Chris Wilson
[not found] ` <1461701180-895-1-git-send-email-chris-Y6uKTt2uX1cEflXRtASbqLVCufUGDwFn@public.gmane.org>
From: Chris Wilson @ 2016-04-26 20:05 UTC (permalink / raw)
To: intel-gfx
When setting up the overlay page, we pin it into the GGTT (when using
virtual addresses) and store the offset as overlay->flip_addr. Rather
than looking up the GGTT address every time, we can use the known
address instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
drivers/gpu/drm/i915/intel_overlay.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 597fbcda8c5c..06bcd6aee64f 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -198,7 +198,7 @@ intel_overlay_map_regs(struct intel_overlay *overlay)
regs = (struct overlay_registers __iomem *)overlay->reg_bo->phys_handle->vaddr;
else
regs = io_mapping_map_wc(ggtt->mappable,
- i915_gem_obj_ggtt_offset(overlay->reg_bo));
+ overlay->flip_addr);
return regs;
}
@@ -1493,7 +1493,7 @@ intel_overlay_map_regs_atomic(struct intel_overlay *overlay)
overlay->reg_bo->phys_handle->vaddr;
else
regs = io_mapping_map_atomic_wc(ggtt->mappable,
- i915_gem_obj_ggtt_offset(overlay->reg_bo));
+ overlay->flip_addr);
return regs;
}
@@ -1523,10 +1523,7 @@ intel_overlay_capture_error_state(struct drm_device *dev)
error->dovsta = I915_READ(DOVSTA);
error->isr = I915_READ(ISR);
- if (OVERLAY_NEEDS_PHYSICAL(overlay->dev))
- error->base = (__force long)overlay->reg_bo->phys_handle->vaddr;
- else
- error->base = i915_gem_obj_ggtt_offset(overlay->reg_bo);
+ error->base = overlay->flip_addr;
regs = intel_overlay_map_regs_atomic(overlay);
if (!regs)
--
2.8.1
* [PATCH v6 03/25] io-mapping: Specify mapping size for io_mapping_map_wc()
[not found] ` <1461701180-895-1-git-send-email-chris-Y6uKTt2uX1cEflXRtASbqLVCufUGDwFn@public.gmane.org>
@ 2016-04-26 20:05 ` Chris Wilson
From: Chris Wilson @ 2016-04-26 20:05 UTC (permalink / raw)
To: intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Cc: Chris Wilson, Tvrtko Ursulin, Daniel Vetter, Jani Nikula,
David Airlie, Yishai Hadas, Dan Williams, Ingo Molnar,
Peter Zijlstra (Intel), David Hildenbrand, Luis R . Rodriguez,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
The ioremap() hidden behind the io_mapping_map_wc() convenience helper
is capable of remapping multiple pages, but the helper hardcodes a
length of PAGE_SIZE. Extend the helper with an explicit size parameter
so that future callers can use it for larger ranges.
Signed-off-by: Chris Wilson <chris-Y6uKTt2uX1cEflXRtASbqLVCufUGDwFn@public.gmane.org>
Cc: Tvrtko Ursulin <tvrtko.ursulin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Daniel Vetter <daniel.vetter-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Jani Nikula <jani.nikula-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Cc: David Airlie <airlied-cv59FeDIM0c@public.gmane.org>
Cc: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: "Peter Zijlstra (Intel)" <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: David Hildenbrand <dahi-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Cc: Luis R. Rodriguez <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Cc: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Reviewed-by: Luis R. Rodriguez <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
drivers/gpu/drm/i915/intel_overlay.c | 3 ++-
drivers/net/ethernet/mellanox/mlx4/pd.c | 4 +++-
include/linux/io-mapping.h | 10 +++++++---
3 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 06bcd6aee64f..8d5293644d05 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -198,7 +198,8 @@ intel_overlay_map_regs(struct intel_overlay *overlay)
regs = (struct overlay_registers __iomem *)overlay->reg_bo->phys_handle->vaddr;
else
regs = io_mapping_map_wc(ggtt->mappable,
- overlay->flip_addr);
+ overlay->flip_addr,
+ PAGE_SIZE);
return regs;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/pd.c b/drivers/net/ethernet/mellanox/mlx4/pd.c
index b3cc3ab63799..6fc156a3918d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/pd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/pd.c
@@ -205,7 +205,9 @@ int mlx4_bf_alloc(struct mlx4_dev *dev, struct mlx4_bf *bf, int node)
goto free_uar;
}
- uar->bf_map = io_mapping_map_wc(priv->bf_mapping, uar->index << PAGE_SHIFT);
+ uar->bf_map = io_mapping_map_wc(priv->bf_mapping,
+ uar->index << PAGE_SHIFT,
+ PAGE_SIZE);
if (!uar->bf_map) {
err = -ENOMEM;
goto unamp_uar;
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index e399029b68c5..645ad06b5d52 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -100,14 +100,16 @@ io_mapping_unmap_atomic(void __iomem *vaddr)
}
static inline void __iomem *
-io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
+io_mapping_map_wc(struct io_mapping *mapping,
+ unsigned long offset,
+ unsigned long size)
{
resource_size_t phys_addr;
BUG_ON(offset >= mapping->size);
phys_addr = mapping->base + offset;
- return ioremap_wc(phys_addr, PAGE_SIZE);
+ return ioremap_wc(phys_addr, size);
}
static inline void
@@ -155,7 +157,9 @@ io_mapping_unmap_atomic(void __iomem *vaddr)
/* Non-atomic map/unmap */
static inline void __iomem *
-io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
+io_mapping_map_wc(struct io_mapping *mapping,
+ unsigned long offset,
+ unsigned long size)
{
return ((char __force __iomem *) mapping) + offset;
}
--
2.8.1
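As a quick illustration of the extended API, here is a minimal,
hypothetical caller using the new size parameter (only
io_mapping_map_wc() and io_mapping_unmap() are from the patch; the
other names are illustrative):

	#include <linux/io-mapping.h>

	/* Map two pages at byte 'offset' within an existing io_mapping,
	 * rather than the single PAGE_SIZE the old helper hardcoded. */
	static void __iomem *
	map_two_pages(struct io_mapping *mapping, unsigned long offset)
	{
		return io_mapping_map_wc(mapping, offset, 2 * PAGE_SIZE);
	}

	/* Teardown is unchanged: */
	static void unmap_pages(void __iomem *vaddr)
	{
		io_mapping_unmap(vaddr);
	}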
* [PATCH v6 04/25] drm/i915: Introduce i915_vm_to_ggtt()
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
[not found] ` <1461701180-895-1-git-send-email-chris-Y6uKTt2uX1cEflXRtASbqLVCufUGDwFn@public.gmane.org>
@ 2016-04-26 20:05 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 05/25] drm/i915: Move ioremap_wc tracking onto VMA Chris Wilson
From: Chris Wilson @ 2016-04-26 20:05 UTC (permalink / raw)
To: intel-gfx
In a couple of places, we have an i915_address_space that we know is
really an i915_ggtt that we want to use. Create an inline helper to
convert from the i915_address_space base pointer into its containing
i915_ggtt.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0d666b3f7e9b..4ddf2ed8501f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -93,6 +93,13 @@
*
*/
+static inline struct i915_ggtt *
+i915_vm_to_ggtt(struct i915_address_space *vm)
+{
+ GEM_BUG_ON(!i915_is_ggtt(vm));
+ return container_of(vm, struct i915_ggtt, base);
+}
+
static int
i915_get_ggtt_vma_pages(struct i915_vma *vma);
@@ -2358,7 +2365,7 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
enum i915_cache_level level, u32 unused)
{
struct drm_i915_private *dev_priv = to_i915(vm->dev);
- struct i915_ggtt *ggtt = &dev_priv->ggtt;
+ struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
unsigned first_entry = start >> PAGE_SHIFT;
gen8_pte_t __iomem *gtt_entries =
(gen8_pte_t __iomem *)ggtt->gsm + first_entry;
@@ -2436,7 +2443,7 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
enum i915_cache_level level, u32 flags)
{
struct drm_i915_private *dev_priv = to_i915(vm->dev);
- struct i915_ggtt *ggtt = &dev_priv->ggtt;
+ struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
unsigned first_entry = start >> PAGE_SHIFT;
gen6_pte_t __iomem *gtt_entries =
(gen6_pte_t __iomem *)ggtt->gsm + first_entry;
@@ -2480,7 +2487,7 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
bool use_scratch)
{
struct drm_i915_private *dev_priv = to_i915(vm->dev);
- struct i915_ggtt *ggtt = &dev_priv->ggtt;
+ struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
unsigned first_entry = start >> PAGE_SHIFT;
unsigned num_entries = length >> PAGE_SHIFT;
gen8_pte_t scratch_pte, __iomem *gtt_base =
@@ -2512,7 +2519,7 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
bool use_scratch)
{
struct drm_i915_private *dev_priv = to_i915(vm->dev);
- struct i915_ggtt *ggtt = &dev_priv->ggtt;
+ struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
unsigned first_entry = start >> PAGE_SHIFT;
unsigned num_entries = length >> PAGE_SHIFT;
gen6_pte_t scratch_pte, __iomem *gtt_base =
--
2.8.1
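For reference, the helper is the standard container_of() downcast from
an embedded base struct to its container; the GEM_BUG_ON(!i915_is_ggtt(vm))
guards against applying it to an address space that is not actually
embedded in an i915_ggtt. A minimal sketch of the pattern, with
illustrative names:

	#include <linux/kernel.h>	/* container_of() */

	struct base { unsigned long start, size; };

	struct derived {
		struct base base;	/* embedded base, like i915_ggtt.base */
		void *extra;
	};

	/* Recover the container from a pointer to its embedded member,
	 * exactly as i915_vm_to_ggtt() does for i915_address_space. */
	static inline struct derived *to_derived(struct base *b)
	{
		return container_of(b, struct derived, base);
	}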
* [PATCH v6 05/25] drm/i915: Move ioremap_wc tracking onto VMA
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
2016-04-26 20:05 ` [PATCH v6 04/25] drm/i915: Introduce i915_vm_to_ggtt() Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 06/25] drm/i915: Use i915_vma_pin_iomap on the ringbuffer object Chris Wilson
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
By tracking the iomapping on the VMA itself, we can share that area
between multiple users. Also by only revoking the iomapping upon
unbinding from the mappable portion of the GGTT, we can keep that iomap
across multiple invocations (e.g. execlists context pinning).
Note that by moving the iounmap tracking to the VMA, we actually end up
fixing a leak of the iomapping in intel_fbdev.
v1.5: Rebase prompted by Tvrtko
v2: Drop dev_priv parameter, we can recover the i915_ggtt from the vma.
v3: Move handling of ioremap space exhaustion to vmap_purge and also
allow vmallocs to recover old iomaps. Add Tvrtko's kerneldoc.
v4: Fix a use-after-free in shrinker and rearrange i915_vma_iomap
v5: Back to i915_vm_to_ggtt
v6: Use i915_vma_pin_iomap and i915_vma_unpin_iomap to mark critical
sections and ensure the VMA cannot be reaped whilst mapped.
v7: Move i915_vma_iounmap so that consumers of the API are not tempted,
and add iomem annotations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 13 ++++++++++++
drivers/gpu/drm/i915/i915_gem_gtt.c | 26 ++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_gem_gtt.h | 35 ++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_gem_shrinker.c | 28 +++++++++++++++++++------
drivers/gpu/drm/i915/intel_fbdev.c | 22 +++++++++++---------
5 files changed, 108 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d493e79d0f6d..40f6504de7c1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3338,6 +3338,17 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
old_write_domain);
}
+static void __i915_vma_iounmap(struct i915_vma *vma)
+{
+ GEM_BUG_ON(vma->pin_count);
+
+ if (vma->iomap == NULL)
+ return;
+
+ io_mapping_unmap(vma->iomap);
+ vma->iomap = NULL;
+}
+
static int __i915_vma_unbind(struct i915_vma *vma, bool wait)
{
struct drm_i915_gem_object *obj = vma->obj;
@@ -3370,6 +3381,8 @@ static int __i915_vma_unbind(struct i915_vma *vma, bool wait)
ret = i915_gem_object_put_fence(obj);
if (ret)
return ret;
+
+ __i915_vma_iounmap(vma);
}
trace_i915_vma_unbind(vma);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4ddf2ed8501f..942b98f910f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3633,3 +3633,29 @@ i915_ggtt_view_size(struct drm_i915_gem_object *obj,
return obj->base.size;
}
}
+
+void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
+{
+ void __iomem *ptr;
+
+ lockdep_assert_held(&vma->vm->dev->struct_mutex);
+ if (WARN_ON(!vma->obj->map_and_fenceable))
+ return ERR_PTR(-ENODEV);
+
+ GEM_BUG_ON(!vma->is_ggtt);
+ GEM_BUG_ON((vma->bound & GLOBAL_BIND) == 0);
+
+ ptr = vma->iomap;
+ if (ptr == NULL) {
+ ptr = io_mapping_map_wc(i915_vm_to_ggtt(vma->vm)->mappable,
+ vma->node.start,
+ vma->node.size);
+ if (ptr == NULL)
+ return ERR_PTR(-ENOMEM);
+
+ vma->iomap = ptr;
+ }
+
+ vma->pin_count++;
+ return ptr;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d7dd3d8a8758..b0ae6632c01a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -34,6 +34,8 @@
#ifndef __I915_GEM_GTT_H__
#define __I915_GEM_GTT_H__
+#include <linux/io-mapping.h>
+
struct drm_i915_file_private;
typedef uint32_t gen6_pte_t;
@@ -175,6 +177,7 @@ struct i915_vma {
struct drm_mm_node node;
struct drm_i915_gem_object *obj;
struct i915_address_space *vm;
+ void __iomem *iomap;
/** Flags and address space this VMA is bound to */
#define GLOBAL_BIND (1<<0)
@@ -559,4 +562,36 @@ size_t
i915_ggtt_view_size(struct drm_i915_gem_object *obj,
const struct i915_ggtt_view *view);
+/**
+ * i915_vma_pin_iomap - calls ioremap_wc to map the GGTT VMA via the aperture
+ * @vma: VMA to iomap
+ *
+ * The passed in VMA has to be pinned in the global GTT mappable region.
+ * An extra pinning of the VMA is acquired for the return iomapping,
+ * the caller must call i915_vma_unpin_iomap to relinquish the pinning
+ * after the iomapping is no longer required.
+ *
+ * Callers must hold the struct_mutex.
+ *
+ * Returns a valid iomapped pointer or ERR_PTR.
+ */
+void __iomem *i915_vma_pin_iomap(struct i915_vma *vma);
+
+/**
+ * i915_vma_unpin_iomap - unpins the mapping returned from i915_vma_iomap
+ * @vma: VMA to unpin
+ *
+ * Unpins the previously iomapped VMA from i915_vma_pin_iomap().
+ *
+ * Callers must hold the struct_mutex. This function is only valid to be
+ * called on a VMA previously iomapped by the caller with i915_vma_pin_iomap().
+ */
+static inline void i915_vma_unpin_iomap(struct i915_vma *vma)
+{
+ lockdep_assert_held(&vma->vm->dev->struct_mutex);
+ GEM_BUG_ON(vma->pin_count == 0);
+ GEM_BUG_ON(vma->iomap == NULL);
+ vma->pin_count--;
+}
+
#endif
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 425e721aac58..2643d18e41f5 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -386,17 +386,33 @@ i915_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr
struct drm_i915_private *dev_priv =
container_of(nb, struct drm_i915_private, mm.vmap_notifier);
struct shrinker_lock_uninterruptible slu;
- unsigned long freed_pages;
+ struct i915_vma *vma, *next;
+ unsigned long freed_pages = 0;
+ int ret;
if (!i915_gem_shrinker_lock_uninterruptible(dev_priv, &slu, 5000))
return NOTIFY_DONE;
- freed_pages = i915_gem_shrink(dev_priv, -1UL,
- I915_SHRINK_BOUND |
- I915_SHRINK_UNBOUND |
- I915_SHRINK_ACTIVE |
- I915_SHRINK_VMAPS);
+ /* Force everything onto the inactive lists */
+ ret = i915_gpu_idle(dev_priv->dev);
+ if (ret)
+ goto out;
+
+ freed_pages += i915_gem_shrink(dev_priv, -1UL,
+ I915_SHRINK_BOUND |
+ I915_SHRINK_UNBOUND |
+ I915_SHRINK_ACTIVE |
+ I915_SHRINK_VMAPS);
+
+ /* We also want to clear any cached iomaps as they wrap vmap */
+ list_for_each_entry_safe(vma, next,
+ &dev_priv->ggtt.base.inactive_list, vm_link) {
+ unsigned long count = vma->node.size >> PAGE_SHIFT;
+ if (vma->iomap && i915_vma_unbind(vma) == 0)
+ freed_pages += count;
+ }
+out:
i915_gem_shrinker_unlock_uninterruptible(dev_priv, &slu);
*(unsigned long *)ptr += freed_pages;
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 1c3ad121f1b9..e0d4a387f9b7 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -186,9 +186,11 @@ static int intelfb_create(struct drm_fb_helper *helper,
struct i915_ggtt *ggtt = &dev_priv->ggtt;
struct fb_info *info;
struct drm_framebuffer *fb;
+ struct i915_vma *vma;
struct drm_i915_gem_object *obj;
- int size, ret;
bool prealloc = false;
+ void *vaddr;
+ int ret;
if (intel_fb &&
(sizes->fb_width > intel_fb->base.width ||
@@ -214,7 +216,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
}
obj = intel_fb->obj;
- size = obj->base.size;
mutex_lock(&dev->struct_mutex);
@@ -244,22 +245,23 @@ static int intelfb_create(struct drm_fb_helper *helper,
info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
info->fbops = &intelfb_ops;
+ vma = i915_gem_obj_to_ggtt(obj);
+
/* setup aperture base/size for vesafb takeover */
info->apertures->ranges[0].base = dev->mode_config.fb_base;
info->apertures->ranges[0].size = ggtt->mappable_end;
- info->fix.smem_start = dev->mode_config.fb_base + i915_gem_obj_ggtt_offset(obj);
- info->fix.smem_len = size;
+ info->fix.smem_start = dev->mode_config.fb_base + vma->node.start;
+ info->fix.smem_len = vma->node.size;
- info->screen_base =
- ioremap_wc(ggtt->mappable_base + i915_gem_obj_ggtt_offset(obj),
- size);
- if (!info->screen_base) {
+ vaddr = i915_vma_pin_iomap(vma);
+ if (IS_ERR(vaddr)) {
DRM_ERROR("Failed to remap framebuffer into virtual memory\n");
- ret = -ENOSPC;
+ ret = PTR_ERR(vaddr);
goto out_destroy_fbi;
}
- info->screen_size = size;
+ info->screen_base = vaddr;
+ info->screen_size = vma->node.size;
/* This driver doesn't need a VT switch to restore the mode on resume */
info->skip_vt_switch = true;
--
2.8.1
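Taken together, the new interface is used as a simple bracketed
section. A minimal sketch of the calling pattern (the wrapper function
is illustrative; error handling trimmed, struct_mutex assumed held, and
'vma' already bound into the mappable GGTT):

	static int show_iomap_usage(struct i915_vma *vma)
	{
		void __iomem *ptr;

		ptr = i915_vma_pin_iomap(vma);	/* creates or reuses vma->iomap, pins */
		if (IS_ERR(ptr))
			return PTR_ERR(ptr);

		writel(MI_NOOP, ptr);		/* access through the GTT aperture */

		i915_vma_unpin_iomap(vma);	/* drop the pin; the mapping survives
						 * until the VMA is unbound */
		return 0;
	}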
* [PATCH v6 06/25] drm/i915: Use i915_vma_pin_iomap on the ringbuffer object
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
2016-04-26 20:06 ` [PATCH v6 05/25] drm/i915: Move ioremap_wc tracking onto VMA Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 07/25] drm/i915: Mark the current context as lost on suspend Chris Wilson
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
Similarly to i915_gem_object_pin_map on LLC platforms, we can
use the new VMA-based io mapping on !LLC to amortize the cost
of ringbuffer pinning and unpinning.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 66f69cdd1d36..61c120aed11e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2088,20 +2088,23 @@ static int init_phys_status_page(struct intel_engine_cs *engine)
void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
{
+ GEM_BUG_ON(ringbuf->vma == NULL);
+ GEM_BUG_ON(ringbuf->virtual_start == NULL);
+
if (HAS_LLC(ringbuf->obj->base.dev) && !ringbuf->obj->stolen)
i915_gem_object_unpin_map(ringbuf->obj);
else
- iounmap(ringbuf->virtual_start);
+ i915_vma_unpin_iomap(ringbuf->vma);
ringbuf->virtual_start = NULL;
- ringbuf->vma = NULL;
+
i915_gem_object_ggtt_unpin(ringbuf->obj);
+ ringbuf->vma = NULL;
}
int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
struct intel_ringbuffer *ringbuf)
{
struct drm_i915_private *dev_priv = to_i915(dev);
- struct i915_ggtt *ggtt = &dev_priv->ggtt;
struct drm_i915_gem_object *obj = ringbuf->obj;
/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
unsigned flags = PIN_OFFSET_BIAS | 4096;
@@ -2135,10 +2138,9 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
/* Access through the GTT requires the device to be awake. */
assert_rpm_wakelock_held(dev_priv);
- addr = ioremap_wc(ggtt->mappable_base +
- i915_gem_obj_ggtt_offset(obj), ringbuf->size);
- if (addr == NULL) {
- ret = -ENOMEM;
+ addr = i915_vma_pin_iomap(i915_gem_obj_to_ggtt(obj));
+ if (IS_ERR(addr)) {
+ ret = PTR_ERR(addr);
goto err_unpin;
}
}
--
2.8.1
* [PATCH v6 07/25] drm/i915: Mark the current context as lost on suspend
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
2016-04-26 20:06 ` [PATCH v6 06/25] drm/i915: Use i915_vma_pin_iomap on the ringbuffer object Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 08/25] drm/i915: L3 cache remapping is part of context switching Chris Wilson
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
In order to force a reload of the context image upon resume, we first
need to mark its absence on suspend. Currently we are failing to restore
the golden context state and any context w/a to the default context
after resume.
One oversight corrected is that we had forgotten to reapply the L3
remapping when restoring the lost default context.
v2: Remove deprecated WARN.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gem.c | 1 +
drivers/gpu/drm/i915/i915_gem_context.c | 54 ++++++++++++++-------------------
3 files changed, 25 insertions(+), 31 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 32f05979ade7..c3fbc056d6aa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3293,6 +3293,7 @@ void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj);
/* i915_gem_context.c */
int __must_check i915_gem_context_init(struct drm_device *dev);
+void i915_gem_context_lost(struct drm_i915_private *dev_priv);
void i915_gem_context_fini(struct drm_device *dev);
void i915_gem_context_reset(struct drm_device *dev);
int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 40f6504de7c1..f9ef11273698 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4711,6 +4711,7 @@ i915_gem_suspend(struct drm_device *dev)
i915_gem_retire_requests(dev);
i915_gem_stop_engines(dev);
+ i915_gem_context_lost(dev_priv);
mutex_unlock(&dev->struct_mutex);
cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 4e12bae5c659..05752a2f1810 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -90,6 +90,8 @@
#include "i915_drv.h"
#include "i915_trace.h"
+#define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
+
/* This is a HW constraint. The value below is the largest known requirement
* I've seen in a spec to date, and that was a workaround for a non-shipping
* part. It should be safe to decrease this, but it's more future proof as is.
@@ -249,7 +251,7 @@ __create_hw_context(struct drm_device *dev,
/* NB: Mark all slices as needing a remap so that when the context first
* loads it will restore whatever remap state already exists. If there
* is no remap info, it will be a NOP. */
- ctx->remap_slice = (1 << NUM_L3_SLICES(dev)) - 1;
+ ctx->remap_slice = ALL_L3_SLICES(dev_priv);
ctx->hang_stats.ban_period_seconds = DRM_I915_CTX_BAN_PERIOD;
@@ -336,7 +338,6 @@ static void i915_gem_context_unpin(struct intel_context *ctx,
void i915_gem_context_reset(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
- int i;
if (i915.enable_execlists) {
struct intel_context *ctx;
@@ -345,17 +346,7 @@ void i915_gem_context_reset(struct drm_device *dev)
intel_lr_context_reset(dev_priv, ctx);
}
- for (i = 0; i < I915_NUM_ENGINES; i++) {
- struct intel_engine_cs *engine = &dev_priv->engine[i];
-
- if (engine->last_context) {
- i915_gem_context_unpin(engine->last_context, engine);
- engine->last_context = NULL;
- }
- }
-
- /* Force the GPU state to be reinitialised on enabling */
- dev_priv->kernel_context->legacy_hw_ctx.initialized = false;
+ i915_gem_context_lost(dev_priv);
}
int i915_gem_context_init(struct drm_device *dev)
@@ -403,11 +394,29 @@ int i915_gem_context_init(struct drm_device *dev)
return 0;
}
+void i915_gem_context_lost(struct drm_i915_private *dev_priv)
+{
+ struct intel_engine_cs *engine;
+
+ for_each_engine(engine, dev_priv) {
+ if (engine->last_context == NULL)
+ continue;
+
+ i915_gem_context_unpin(engine->last_context, engine);
+ engine->last_context = NULL;
+ }
+
+ /* Force the GPU state to be reinitialised on enabling */
+ dev_priv->kernel_context->legacy_hw_ctx.initialized = false;
+ dev_priv->kernel_context->remap_slice = ALL_L3_SLICES(dev_priv);
+}
+
void i915_gem_context_fini(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_context *dctx = dev_priv->kernel_context;
- int i;
+
+ i915_gem_context_lost(dev_priv);
if (dctx->legacy_hw_ctx.rcs_state) {
/* The only known way to stop the gpu from accessing the hw context is
@@ -415,26 +424,9 @@ void i915_gem_context_fini(struct drm_device *dev)
* other code, leading to spurious errors. */
intel_gpu_reset(dev, ALL_ENGINES);
- /* When default context is created and switched to, base object refcount
- * will be 2 (+1 from object creation and +1 from do_switch()).
- * i915_gem_context_fini() will be called after gpu_idle() has switched
- * to default context. So we need to unreference the base object once
- * to offset the do_switch part, so that i915_gem_context_unreference()
- * can then free the base object correctly. */
- WARN_ON(!dev_priv->engine[RCS].last_context);
-
i915_gem_object_ggtt_unpin(dctx->legacy_hw_ctx.rcs_state);
}
- for (i = I915_NUM_ENGINES; --i >= 0;) {
- struct intel_engine_cs *engine = &dev_priv->engine[i];
-
- if (engine->last_context) {
- i915_gem_context_unpin(engine->last_context, engine);
- engine->last_context = NULL;
- }
- }
-
i915_gem_context_unreference(dctx);
dev_priv->kernel_context = NULL;
}
--
2.8.1
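The ordering the patch establishes, sketched as a simplified call
sequence (drawn from the diff above):

	/*
	 * i915_gem_suspend()
	 *     i915_gem_stop_engines(dev);
	 *     i915_gem_context_lost(dev_priv);
	 *         - unpin and clear engine->last_context on every engine
	 *         - mark the kernel context as uninitialised
	 *         - set remap_slice = ALL_L3_SLICES so the L3 remap is redone
	 *
	 * ...suspend/resume...
	 *
	 * first context switch after resume:
	 *     engine->last_context == NULL forces a full reload of the
	 *     context image, the golden state and the L3 remapping.
	 */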
* [PATCH v6 08/25] drm/i915: L3 cache remapping is part of context switching
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
2016-04-26 20:06 ` [PATCH v6 07/25] drm/i915: Mark the current context as lost on suspend Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 09/25] drm/i915: Consolidate L3 remapping LRI Chris Wilson
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
Move the i915_gem_l3_remap function so that it sits next to the context
switching code, which is where we perform the L3 remap.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 31 -------------------------------
drivers/gpu/drm/i915/i915_gem_context.c | 31 +++++++++++++++++++++++++++++++
2 files changed, 31 insertions(+), 31 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f9ef11273698..f07cf81ef16b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4730,37 +4730,6 @@ err:
return ret;
}
-int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
-{
- struct intel_engine_cs *engine = req->engine;
- struct drm_device *dev = engine->dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
- u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
- int i, ret;
-
- if (!HAS_L3_DPF(dev) || !remap_info)
- return 0;
-
- ret = intel_ring_begin(req, GEN7_L3LOG_SIZE / 4 * 3);
- if (ret)
- return ret;
-
- /*
- * Note: We do not worry about the concurrent register cacheline hang
- * here because no other code should access these registers other than
- * at initialization time.
- */
- for (i = 0; i < GEN7_L3LOG_SIZE / 4; i++) {
- intel_ring_emit(engine, MI_LOAD_REGISTER_IMM(1));
- intel_ring_emit_reg(engine, GEN7_L3LOG(slice, i));
- intel_ring_emit(engine, remap_info[i]);
- }
-
- intel_ring_advance(engine);
-
- return ret;
-}
-
void i915_gem_init_swizzling(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 05752a2f1810..a429b4dcb4de 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -601,6 +601,37 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
return ret;
}
+int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
+{
+ struct intel_engine_cs *engine = req->engine;
+ struct drm_device *dev = engine->dev;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
+ int i, ret;
+
+ if (!HAS_L3_DPF(dev) || !remap_info)
+ return 0;
+
+ ret = intel_ring_begin(req, GEN7_L3LOG_SIZE / 4 * 3);
+ if (ret)
+ return ret;
+
+ /*
+ * Note: We do not worry about the concurrent register cacheline hang
+ * here because no other code should access these registers other than
+ * at initialization time.
+ */
+ for (i = 0; i < GEN7_L3LOG_SIZE / 4; i++) {
+ intel_ring_emit(engine, MI_LOAD_REGISTER_IMM(1));
+ intel_ring_emit_reg(engine, GEN7_L3LOG(slice, i));
+ intel_ring_emit(engine, remap_info[i]);
+ }
+
+ intel_ring_advance(engine);
+
+ return ret;
+}
+
static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
struct intel_context *to)
{
--
2.8.1
* [PATCH v6 09/25] drm/i915: Consolidate L3 remapping LRI
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
2016-04-26 20:06 ` [PATCH v6 08/25] drm/i915: L3 cache remapping is part of context switching Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 10/25] drm/i915: Remove early l3-remap Chris Wilson
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
We can use a single MI_LOAD_REGISTER_IMM command packet to write all the
L3 remapping registers, shrinking the number of bytes required to emit
the context switch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_gem_context.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index a429b4dcb4de..373bad8ee04c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -603,16 +603,14 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
{
+ u32 *remap_info = req->i915->l3_parity.remap_info[slice];
struct intel_engine_cs *engine = req->engine;
- struct drm_device *dev = engine->dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
- u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
int i, ret;
- if (!HAS_L3_DPF(dev) || !remap_info)
+ if (!remap_info)
return 0;
- ret = intel_ring_begin(req, GEN7_L3LOG_SIZE / 4 * 3);
+ ret = intel_ring_begin(req, GEN7_L3LOG_SIZE/4 * 2 + 2);
if (ret)
return ret;
@@ -621,15 +619,15 @@ int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
* here because no other code should access these registers other than
* at initialization time.
*/
- for (i = 0; i < GEN7_L3LOG_SIZE / 4; i++) {
- intel_ring_emit(engine, MI_LOAD_REGISTER_IMM(1));
+ intel_ring_emit(engine, MI_LOAD_REGISTER_IMM(GEN7_L3LOG_SIZE/4));
+ for (i = 0; i < GEN7_L3LOG_SIZE/4; i++) {
intel_ring_emit_reg(engine, GEN7_L3LOG(slice, i));
intel_ring_emit(engine, remap_info[i]);
}
-
+ intel_ring_emit(engine, MI_NOOP);
intel_ring_advance(engine);
- return ret;
+ return 0;
}
static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
--
2.8.1
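The dword accounting behind the new intel_ring_begin() size, assuming
GEN7_L3LOG_SIZE is 0x80 (i.e. 32 remap registers) as defined in
i915_reg.h:

	/*
	 * before: 32 x (MI_LOAD_REGISTER_IMM(1) + reg + value) = 96 dwords
	 * after:   1 x MI_LOAD_REGISTER_IMM(32) header         =  1
	 *        + 32 x (reg + value)                           = 64
	 *        +  1 x MI_NOOP to pad to an even length        =  1
	 *                                                 total = 66 dwords
	 * i.e. GEN7_L3LOG_SIZE/4 * 2 + 2, as requested from
	 * intel_ring_begin().
	 */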
* [PATCH v6 10/25] drm/i915: Remove early l3-remap
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
2016-04-26 20:06 ` [PATCH v6 09/25] drm/i915: Consolidate L3 remapping LRI Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 11/25] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use Chris Wilson
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
Since we do the l3-remap on context switch, we can remove the redundant
early call to set the mapping prior to performing the first context
switch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 1 -
drivers/gpu/drm/i915/i915_gem.c | 10 +---------
drivers/gpu/drm/i915/i915_gem_context.c | 4 ++--
3 files changed, 3 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c3fbc056d6aa..a4032393eddf 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3147,7 +3147,6 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
int __must_check i915_gem_init(struct drm_device *dev);
int i915_gem_init_engines(struct drm_device *dev);
int __must_check i915_gem_init_hw(struct drm_device *dev);
-int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice);
void i915_gem_init_swizzling(struct drm_device *dev);
void i915_gem_cleanup_engines(struct drm_device *dev);
int __must_check i915_gpu_idle(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f07cf81ef16b..c5dd4008e775 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4834,7 +4834,7 @@ i915_gem_init_hw(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_engine_cs *engine;
- int ret, j;
+ int ret;
if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
return -EIO;
@@ -4916,14 +4916,6 @@ i915_gem_init_hw(struct drm_device *dev)
break;
}
- if (engine->id == RCS) {
- for (j = 0; j < NUM_L3_SLICES(dev); j++) {
- ret = i915_gem_l3_remap(req, j);
- if (ret)
- goto err_request;
- }
- }
-
ret = i915_ppgtt_init_ring(req);
if (ret)
goto err_request;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 373bad8ee04c..7556cb7c329d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -601,7 +601,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
return ret;
}
-int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
+static int remap_l3(struct drm_i915_gem_request *req, int slice)
{
u32 *remap_info = req->i915->l3_parity.remap_info[slice];
struct intel_engine_cs *engine = req->engine;
@@ -799,7 +799,7 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
if (!(to->remap_slice & (1<<i)))
continue;
- ret = i915_gem_l3_remap(req, i);
+ ret = remap_l3(req, i);
if (ret)
return ret;
--
2.8.1
* [PATCH v6 11/25] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
2016-04-26 20:06 ` [PATCH v6 10/25] drm/i915: Remove early l3-remap Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 12/25] drm/i915: Unify intel_ring_begin() Chris Wilson
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx; +Cc: Mika Kuoppala
The call to switch_mm() is already handled by i915_switch_context(); the
only difference required to set up the aliasing ppgtt is that we need to
emit the switch_mm() on the first context, i.e. when transitioning from
engine->last_context == NULL. This allows us to defer the
initialisation of the GPU from early device initialisation to first use,
which should marginally speed up both. The caveat is that we then defer
the context initialisation until first use - i.e. we cannot assume that
the GPU engines are initialised. For example, this means that power
contexts for rc6 (Ironlake) need to be explicitly loaded, as they are.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 1 -
drivers/gpu/drm/i915/i915_gem.c | 30 -------------
drivers/gpu/drm/i915/i915_gem_context.c | 77 +++++++++++++--------------------
drivers/gpu/drm/i915/i915_gem_gtt.c | 14 ------
drivers/gpu/drm/i915/i915_gem_gtt.h | 1 -
5 files changed, 31 insertions(+), 92 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a4032393eddf..444a8ea0c5c4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3296,7 +3296,6 @@ void i915_gem_context_lost(struct drm_i915_private *dev_priv);
void i915_gem_context_fini(struct drm_device *dev);
void i915_gem_context_reset(struct drm_device *dev);
int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
-int i915_gem_context_enable(struct drm_i915_gem_request *req);
void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
int i915_switch_context(struct drm_i915_gem_request *req);
struct intel_context *
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c5dd4008e775..71135c3ce44e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4903,36 +4903,6 @@ i915_gem_init_hw(struct drm_device *dev)
* on re-initialisation
*/
ret = i915_gem_set_seqno(dev, dev_priv->next_seqno+0x100);
- if (ret)
- goto out;
-
- /* Now it is safe to go back round and do everything else: */
- for_each_engine(engine, dev_priv) {
- struct drm_i915_gem_request *req;
-
- req = i915_gem_request_alloc(engine, NULL);
- if (IS_ERR(req)) {
- ret = PTR_ERR(req);
- break;
- }
-
- ret = i915_ppgtt_init_ring(req);
- if (ret)
- goto err_request;
-
- ret = i915_gem_context_enable(req);
- if (ret)
- goto err_request;
-
-err_request:
- i915_add_request_no_flush(req);
- if (ret) {
- DRM_ERROR("Failed to enable %s, error=%d\n",
- engine->name, ret);
- i915_gem_cleanup_engines(dev);
- break;
- }
- }
out:
intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 7556cb7c329d..4071a0ec134a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -431,27 +431,6 @@ void i915_gem_context_fini(struct drm_device *dev)
dev_priv->kernel_context = NULL;
}
-int i915_gem_context_enable(struct drm_i915_gem_request *req)
-{
- struct intel_engine_cs *engine = req->engine;
- int ret;
-
- if (i915.enable_execlists) {
- if (engine->init_context == NULL)
- return 0;
-
- ret = engine->init_context(req);
- } else
- ret = i915_switch_context(req);
-
- if (ret) {
- DRM_ERROR("ring init context: %d\n", ret);
- return ret;
- }
-
- return 0;
-}
-
static int context_idr_cleanup(int id, void *p, void *data)
{
struct intel_context *ctx = p;
@@ -630,7 +609,8 @@ static int remap_l3(struct drm_i915_gem_request *req, int slice)
return 0;
}
-static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
+static inline bool skip_rcs_switch(struct i915_hw_ppgtt *ppgtt,
+ struct intel_engine_cs *engine,
struct intel_context *to)
{
if (to->remap_slice)
@@ -639,21 +619,27 @@ static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
if (!to->legacy_hw_ctx.initialized)
return false;
- if (to->ppgtt &&
- !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))
+ if (ppgtt && (intel_engine_flag(engine) & ppgtt->pd_dirty_rings))
return false;
return to == engine->last_context;
}
static bool
-needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
+needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt,
+ struct intel_engine_cs *engine,
+ struct intel_context *to)
{
- if (!to->ppgtt)
+ if (!ppgtt)
return false;
+ /* Always load the ppgtt on first use */
+ if (!engine->last_context)
+ return true;
+
+ /* Same context without new entries, skip */
if (engine->last_context == to &&
- !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))
+ !(intel_engine_flag(engine) & ppgtt->pd_dirty_rings))
return false;
if (engine->id != RCS)
@@ -666,9 +652,11 @@ needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
}
static bool
-needs_pd_load_post(struct intel_context *to, u32 hw_flags)
+needs_pd_load_post(struct i915_hw_ppgtt *ppgtt,
+ struct intel_context *to,
+ u32 hw_flags)
{
- if (!to->ppgtt)
+ if (!ppgtt)
return false;
if (!IS_GEN8(to->i915))
@@ -684,11 +672,12 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
{
struct intel_context *to = req->ctx;
struct intel_engine_cs *engine = req->engine;
+ struct i915_hw_ppgtt *ppgtt = to->ppgtt ?: req->i915->mm.aliasing_ppgtt;
struct intel_context *from;
u32 hw_flags;
int ret, i;
- if (skip_rcs_switch(engine, to))
+ if (skip_rcs_switch(ppgtt, engine, to))
return 0;
/* Trying to pin first makes error handling easier. */
@@ -719,13 +708,13 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
if (ret)
goto unpin_out;
- if (needs_pd_load_pre(engine, to)) {
+ if (needs_pd_load_pre(ppgtt, engine, to)) {
/* Older GENs and non render rings still want the load first,
* "PP_DCLV followed by PP_DIR_BASE register through Load
* Register Immediate commands in Ring Buffer before submitting
* a context."*/
trace_switch_mm(engine, to);
- ret = to->ppgtt->switch_mm(to->ppgtt, req);
+ ret = ppgtt->switch_mm(ppgtt, req);
if (ret)
goto unpin_out;
}
@@ -736,16 +725,11 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
* space. This means we must enforce that a page table load
* occur when this occurs. */
hw_flags = MI_RESTORE_INHIBIT;
- else if (to->ppgtt &&
- intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings)
+ else if (ppgtt && intel_engine_flag(engine) & ppgtt->pd_dirty_rings)
hw_flags = MI_FORCE_RESTORE;
else
hw_flags = 0;
- /* We should never emit switch_mm more than once */
- WARN_ON(needs_pd_load_pre(engine, to) &&
- needs_pd_load_post(to, hw_flags));
-
if (to != from || (hw_flags & MI_FORCE_RESTORE)) {
ret = mi_set_context(req, hw_flags);
if (ret)
@@ -780,9 +764,9 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
/* GEN8 does *not* require an explicit reload if the PDPs have been
* setup, and we do not wish to move them.
*/
- if (needs_pd_load_post(to, hw_flags)) {
+ if (needs_pd_load_post(ppgtt, to, hw_flags)) {
trace_switch_mm(engine, to);
- ret = to->ppgtt->switch_mm(to->ppgtt, req);
+ ret = ppgtt->switch_mm(ppgtt, req);
/* The hardware context switch is emitted, but we haven't
* actually changed the state - so it's probably safe to bail
* here. Still, let the user know something dangerous has
@@ -792,8 +776,8 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
return ret;
}
- if (to->ppgtt)
- to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
+ if (ppgtt)
+ ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
for (i = 0; i < MAX_L3_SLICES; i++) {
if (!(to->remap_slice & (1<<i)))
@@ -846,17 +830,18 @@ int i915_switch_context(struct drm_i915_gem_request *req)
if (engine->id != RCS ||
req->ctx->legacy_hw_ctx.rcs_state == NULL) {
struct intel_context *to = req->ctx;
+ struct i915_hw_ppgtt *ppgtt =
+ to->ppgtt ?: req->i915->mm.aliasing_ppgtt;
- if (needs_pd_load_pre(engine, to)) {
+ if (needs_pd_load_pre(ppgtt, engine, to)) {
int ret;
trace_switch_mm(engine, to);
- ret = to->ppgtt->switch_mm(to->ppgtt, req);
+ ret = ppgtt->switch_mm(ppgtt, req);
if (ret)
return ret;
- /* Doing a PD load always reloads the page dirs */
- to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
+ ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
}
if (to != engine->last_context) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 942b98f910f8..750569af4b0e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2186,20 +2186,6 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
return 0;
}
-int i915_ppgtt_init_ring(struct drm_i915_gem_request *req)
-{
- struct drm_i915_private *dev_priv = req->i915;
- struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
-
- if (i915.enable_execlists)
- return 0;
-
- if (!ppgtt)
- return 0;
-
- return ppgtt->switch_mm(ppgtt, req);
-}
-
struct i915_hw_ppgtt *
i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv)
{
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b0ae6632c01a..114850368bca 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -522,7 +522,6 @@ void i915_ggtt_cleanup_hw(struct drm_device *dev);
int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt);
int i915_ppgtt_init_hw(struct drm_device *dev);
-int i915_ppgtt_init_ring(struct drm_i915_gem_request *req);
void i915_ppgtt_release(struct kref *kref);
struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_device *dev,
struct drm_i915_file_private *fpriv);
--
2.8.1
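One idiom worth noting from the diff: the ppgtt is now selected with the
GNU C "elvis" operator, falling back to the aliasing ppgtt when the
context has no full ppgtt of its own:

	/* a ?: b evaluates to a when a is non-NULL, else b,
	 * evaluating a only once. */
	struct i915_hw_ppgtt *ppgtt = to->ppgtt ?: req->i915->mm.aliasing_ppgtt;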
* [PATCH v6 12/25] drm/i915: Unify intel_ring_begin()
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
2016-04-26 20:06 ` [PATCH v6 11/25] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-27 9:21 ` Joonas Lahtinen
2016-04-26 20:06 ` [PATCH v6 13/25] drm/i915: Remove the identical implementations of request space reservation Chris Wilson
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
Combine the near-identical implementations of intel_logical_ring_begin()
and intel_ring_begin() - the only difference is that the logical wait
has to check for a matching ring (which legacy submission can assume).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
drivers/gpu/drm/i915/intel_lrc.c | 146 ++-----------------------
drivers/gpu/drm/i915/intel_lrc.h | 1 -
drivers/gpu/drm/i915/intel_mocs.c | 12 +--
drivers/gpu/drm/i915/intel_ringbuffer.c | 182 ++++++++++++--------------------
drivers/gpu/drm/i915/intel_ringbuffer.h | 3 -
5 files changed, 81 insertions(+), 263 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 2b7e6bbc017b..ba87c94928e7 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -721,48 +721,6 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
return ret;
}
-static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
- int bytes)
-{
- struct intel_ringbuffer *ringbuf = req->ringbuf;
- struct intel_engine_cs *engine = req->engine;
- struct drm_i915_gem_request *target;
- unsigned space;
- int ret;
-
- if (intel_ring_space(ringbuf) >= bytes)
- return 0;
-
- /* The whole point of reserving space is to not wait! */
- WARN_ON(ringbuf->reserved_in_use);
-
- list_for_each_entry(target, &engine->request_list, list) {
- /*
- * The request queue is per-engine, so can contain requests
- * from multiple ringbuffers. Here, we must ignore any that
- * aren't from the ringbuffer we're considering.
- */
- if (target->ringbuf != ringbuf)
- continue;
-
- /* Would completion of this request free enough space? */
- space = __intel_ring_space(target->postfix, ringbuf->tail,
- ringbuf->size);
- if (space >= bytes)
- break;
- }
-
- if (WARN_ON(&target->list == &engine->request_list))
- return -ENOSPC;
-
- ret = i915_wait_request(target);
- if (ret)
- return ret;
-
- ringbuf->space = space;
- return 0;
-}
-
/*
* intel_logical_ring_advance_and_submit() - advance the tail and submit the workload
* @request: Request to advance the logical ringbuffer of.
@@ -814,92 +772,6 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
return 0;
}
-static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
-{
- uint32_t __iomem *virt;
- int rem = ringbuf->size - ringbuf->tail;
-
- virt = ringbuf->virtual_start + ringbuf->tail;
- rem /= 4;
- while (rem--)
- iowrite32(MI_NOOP, virt++);
-
- ringbuf->tail = 0;
- intel_ring_update_space(ringbuf);
-}
-
-static int logical_ring_prepare(struct drm_i915_gem_request *req, int bytes)
-{
- struct intel_ringbuffer *ringbuf = req->ringbuf;
- int remain_usable = ringbuf->effective_size - ringbuf->tail;
- int remain_actual = ringbuf->size - ringbuf->tail;
- int ret, total_bytes, wait_bytes = 0;
- bool need_wrap = false;
-
- if (ringbuf->reserved_in_use)
- total_bytes = bytes;
- else
- total_bytes = bytes + ringbuf->reserved_size;
-
- if (unlikely(bytes > remain_usable)) {
- /*
- * Not enough space for the basic request. So need to flush
- * out the remainder and then wait for base + reserved.
- */
- wait_bytes = remain_actual + total_bytes;
- need_wrap = true;
- } else {
- if (unlikely(total_bytes > remain_usable)) {
- /*
- * The base request will fit but the reserved space
- * falls off the end. So don't need an immediate wrap
- * and only need to effectively wait for the reserved
- * size space from the start of ringbuffer.
- */
- wait_bytes = remain_actual + ringbuf->reserved_size;
- } else if (total_bytes > ringbuf->space) {
- /* No wrapping required, just waiting. */
- wait_bytes = total_bytes;
- }
- }
-
- if (wait_bytes) {
- ret = logical_ring_wait_for_space(req, wait_bytes);
- if (unlikely(ret))
- return ret;
-
- if (need_wrap)
- __wrap_ring_buffer(ringbuf);
- }
-
- return 0;
-}
-
-/**
- * intel_logical_ring_begin() - prepare the logical ringbuffer to accept some commands
- *
- * @req: The request to start some new work for
- * @num_dwords: number of DWORDs that we plan to write to the ringbuffer.
- *
- * The ringbuffer might not be ready to accept the commands right away (maybe it needs to
- * be wrapped, or wait a bit for the tail to be updated). This function takes care of that
- * and also preallocates a request (every workload submission is still mediated through
- * requests, same as it did with legacy ringbuffer submission).
- *
- * Return: non-zero if the ringbuffer is not ready to be written to.
- */
-int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
-{
- int ret;
-
- ret = logical_ring_prepare(req, num_dwords * sizeof(uint32_t));
- if (ret)
- return ret;
-
- req->ringbuf->space -= num_dwords * sizeof(uint32_t);
- return 0;
-}
-
int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
{
/*
@@ -912,7 +784,7 @@ int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
*/
intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
- return intel_logical_ring_begin(request, 0);
+ return intel_ring_begin(request, 0);
}
/**
@@ -982,7 +854,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
if (engine == &dev_priv->engine[RCS] &&
instp_mode != dev_priv->relative_constants_mode) {
- ret = intel_logical_ring_begin(params->request, 4);
+ ret = intel_ring_begin(params->request, 4);
if (ret)
return ret;
@@ -1178,7 +1050,7 @@ static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
if (ret)
return ret;
- ret = intel_logical_ring_begin(req, w->count * 2 + 2);
+ ret = intel_ring_begin(req, w->count * 2 + 2);
if (ret)
return ret;
@@ -1669,7 +1541,7 @@ static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
const int num_lri_cmds = GEN8_LEGACY_PDPES * 2;
int i, ret;
- ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
+ ret = intel_ring_begin(req, num_lri_cmds * 2 + 2);
if (ret)
return ret;
@@ -1716,7 +1588,7 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
req->ctx->ppgtt->pd_dirty_rings &= ~intel_engine_flag(req->engine);
}
- ret = intel_logical_ring_begin(req, 4);
+ ret = intel_ring_begin(req, 4);
if (ret)
return ret;
@@ -1778,7 +1650,7 @@ static int gen8_emit_flush(struct drm_i915_gem_request *request,
uint32_t cmd;
int ret;
- ret = intel_logical_ring_begin(request, 4);
+ ret = intel_ring_begin(request, 4);
if (ret)
return ret;
@@ -1846,7 +1718,7 @@ static int gen8_emit_flush_render(struct drm_i915_gem_request *request,
vf_flush_wa = true;
}
- ret = intel_logical_ring_begin(request, vf_flush_wa ? 12 : 6);
+ ret = intel_ring_begin(request, vf_flush_wa ? 12 : 6);
if (ret)
return ret;
@@ -1920,7 +1792,7 @@ static int gen8_emit_request(struct drm_i915_gem_request *request)
struct intel_ringbuffer *ringbuf = request->ringbuf;
int ret;
- ret = intel_logical_ring_begin(request, 6 + WA_TAIL_DWORDS);
+ ret = intel_ring_begin(request, 6 + WA_TAIL_DWORDS);
if (ret)
return ret;
@@ -1944,7 +1816,7 @@ static int gen8_emit_request_render(struct drm_i915_gem_request *request)
struct intel_ringbuffer *ringbuf = request->ringbuf;
int ret;
- ret = intel_logical_ring_begin(request, 8 + WA_TAIL_DWORDS);
+ ret = intel_ring_begin(request, 8 + WA_TAIL_DWORDS);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 461f1ef9b5c1..60a7385bc531 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -63,7 +63,6 @@ int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request);
void intel_logical_ring_stop(struct intel_engine_cs *engine);
void intel_logical_ring_cleanup(struct intel_engine_cs *engine);
int intel_logical_rings_init(struct drm_device *dev);
-int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords);
int logical_ring_flush_all_caches(struct drm_i915_gem_request *req);
/**
diff --git a/drivers/gpu/drm/i915/intel_mocs.c b/drivers/gpu/drm/i915/intel_mocs.c
index 23b8545ad6b0..6ba4bf7f2a89 100644
--- a/drivers/gpu/drm/i915/intel_mocs.c
+++ b/drivers/gpu/drm/i915/intel_mocs.c
@@ -239,11 +239,9 @@ static int emit_mocs_control_table(struct drm_i915_gem_request *req,
if (WARN_ON(table->size > GEN9_NUM_MOCS_ENTRIES))
return -ENODEV;
- ret = intel_logical_ring_begin(req, 2 + 2 * GEN9_NUM_MOCS_ENTRIES);
- if (ret) {
- DRM_DEBUG("intel_logical_ring_begin failed %d\n", ret);
+ ret = intel_ring_begin(req, 2 + 2 * GEN9_NUM_MOCS_ENTRIES);
+ if (ret)
return ret;
- }
intel_logical_ring_emit(ringbuf,
MI_LOAD_REGISTER_IMM(GEN9_NUM_MOCS_ENTRIES));
@@ -305,11 +303,9 @@ static int emit_mocs_l3cc_table(struct drm_i915_gem_request *req,
if (WARN_ON(table->size > GEN9_NUM_MOCS_ENTRIES))
return -ENODEV;
- ret = intel_logical_ring_begin(req, 2 + GEN9_NUM_MOCS_ENTRIES);
- if (ret) {
- DRM_DEBUG("intel_logical_ring_begin failed %d\n", ret);
+ ret = intel_ring_begin(req, 2 + GEN9_NUM_MOCS_ENTRIES);
+ if (ret)
return ret;
- }
intel_logical_ring_emit(ringbuf,
MI_LOAD_REGISTER_IMM(GEN9_NUM_MOCS_ENTRIES / 2));
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 61c120aed11e..1d3d2ea3e9bc 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -53,12 +53,6 @@ void intel_ring_update_space(struct intel_ringbuffer *ringbuf)
ringbuf->tail, ringbuf->size);
}
-int intel_ring_space(struct intel_ringbuffer *ringbuf)
-{
- intel_ring_update_space(ringbuf);
- return ringbuf->space;
-}
-
bool intel_engine_stopped(struct intel_engine_cs *engine)
{
struct drm_i915_private *dev_priv = engine->dev->dev_private;
@@ -2318,51 +2312,6 @@ void intel_cleanup_engine(struct intel_engine_cs *engine)
engine->dev = NULL;
}
-static int ring_wait_for_space(struct intel_engine_cs *engine, int n)
-{
- struct intel_ringbuffer *ringbuf = engine->buffer;
- struct drm_i915_gem_request *request;
- unsigned space;
- int ret;
-
- if (intel_ring_space(ringbuf) >= n)
- return 0;
-
- /* The whole point of reserving space is to not wait! */
- WARN_ON(ringbuf->reserved_in_use);
-
- list_for_each_entry(request, &engine->request_list, list) {
- space = __intel_ring_space(request->postfix, ringbuf->tail,
- ringbuf->size);
- if (space >= n)
- break;
- }
-
- if (WARN_ON(&request->list == &engine->request_list))
- return -ENOSPC;
-
- ret = i915_wait_request(request);
- if (ret)
- return ret;
-
- ringbuf->space = space;
- return 0;
-}
-
-static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
-{
- uint32_t __iomem *virt;
- int rem = ringbuf->size - ringbuf->tail;
-
- virt = ringbuf->virtual_start + ringbuf->tail;
- rem /= 4;
- while (rem--)
- iowrite32(MI_NOOP, virt++);
-
- ringbuf->tail = 0;
- intel_ring_update_space(ringbuf);
-}
-
int intel_engine_idle(struct intel_engine_cs *engine)
{
struct drm_i915_gem_request *req;
@@ -2404,63 +2353,74 @@ int intel_ring_reserve_space(struct drm_i915_gem_request *request)
void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
{
- WARN_ON(ringbuf->reserved_size);
- WARN_ON(ringbuf->reserved_in_use);
-
+ GEM_BUG_ON(ringbuf->reserved_size);
ringbuf->reserved_size = size;
}
void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
{
- WARN_ON(ringbuf->reserved_in_use);
-
+ GEM_BUG_ON(!ringbuf->reserved_size);
ringbuf->reserved_size = 0;
- ringbuf->reserved_in_use = false;
}
void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
{
- WARN_ON(ringbuf->reserved_in_use);
-
- ringbuf->reserved_in_use = true;
- ringbuf->reserved_tail = ringbuf->tail;
+ GEM_BUG_ON(!ringbuf->reserved_size);
+ ringbuf->reserved_size = 0;
}
void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
{
- WARN_ON(!ringbuf->reserved_in_use);
- if (ringbuf->tail > ringbuf->reserved_tail) {
- WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
- "request reserved size too small: %d vs %d!\n",
- ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
- } else {
+ GEM_BUG_ON(ringbuf->reserved_size);
+}
+
+static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
+{
+ struct intel_ringbuffer *ringbuf = req->ringbuf;
+ struct intel_engine_cs *engine = req->engine;
+ struct drm_i915_gem_request *target;
+
+ intel_ring_update_space(ringbuf);
+ if (ringbuf->space >= bytes)
+ return 0;
+
+ /* The whole point of reserving space is to not wait! */
+ GEM_BUG_ON(!ringbuf->reserved_size);
+
+ list_for_each_entry(target, &engine->request_list, list) {
+ unsigned space;
+
/*
- * The ring was wrapped while the reserved space was in use.
- * That means that some unknown amount of the ring tail was
- * no-op filled and skipped. Thus simply adding the ring size
- * to the tail and doing the above space check will not work.
- * Rather than attempt to track how much tail was skipped,
- * it is much simpler to say that also skipping the sanity
- * check every once in a while is not a big issue.
+ * The request queue is per-engine, so can contain requests
+ * from multiple ringbuffers. Here, we must ignore any that
+ * aren't from the ringbuffer we're considering.
*/
+ if (target->ringbuf != ringbuf)
+ continue;
+
+ /* Would completion of this request free enough space? */
+ space = __intel_ring_space(target->postfix, ringbuf->tail,
+ ringbuf->size);
+ if (space >= bytes)
+ break;
}
- ringbuf->reserved_size = 0;
- ringbuf->reserved_in_use = false;
+ if (WARN_ON(&target->list == &engine->request_list))
+ return -ENOSPC;
+
+ return i915_wait_request(target);
}
-static int __intel_ring_prepare(struct intel_engine_cs *engine, int bytes)
+int intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
{
- struct intel_ringbuffer *ringbuf = engine->buffer;
- int remain_usable = ringbuf->effective_size - ringbuf->tail;
+ struct intel_ringbuffer *ringbuf = req->ringbuf;
int remain_actual = ringbuf->size - ringbuf->tail;
- int ret, total_bytes, wait_bytes = 0;
+ int remain_usable = ringbuf->effective_size - ringbuf->tail;
+ int bytes = num_dwords * sizeof(u32);
+ int total_bytes, wait_bytes;
bool need_wrap = false;
- if (ringbuf->reserved_in_use)
- total_bytes = bytes;
- else
- total_bytes = bytes + ringbuf->reserved_size;
+ total_bytes = bytes + ringbuf->reserved_size;
if (unlikely(bytes > remain_usable)) {
/*
@@ -2469,44 +2429,38 @@ static int __intel_ring_prepare(struct intel_engine_cs *engine, int bytes)
*/
wait_bytes = remain_actual + total_bytes;
need_wrap = true;
- } else {
- if (unlikely(total_bytes > remain_usable)) {
- /*
- * The base request will fit but the reserved space
- * falls off the end. So don't need an immediate wrap
- * and only need to effectively wait for the reserved
- * size space from the start of ringbuffer.
- */
- wait_bytes = remain_actual + ringbuf->reserved_size;
- } else if (total_bytes > ringbuf->space) {
- /* No wrapping required, just waiting. */
- wait_bytes = total_bytes;
- }
- }
+ } else if (unlikely(total_bytes > remain_usable)) {
+ /*
+ * The base request will fit but the reserved space
+ * falls off the end. So we don't need an immediate wrap
+ * and only need to effectively wait for the reserved
+ * size space from the start of ringbuffer.
+ */
+ wait_bytes = remain_actual + ringbuf->reserved_size;
+ } else
+ /* No wrapping required, just waiting. */
+ wait_bytes = total_bytes;
- if (wait_bytes) {
- ret = ring_wait_for_space(engine, wait_bytes);
+ if (wait_bytes > ringbuf->space) {
+ int ret = wait_for_space(req, wait_bytes);
if (unlikely(ret))
return ret;
- if (need_wrap)
- __wrap_ring_buffer(ringbuf);
+ intel_ring_update_space(ringbuf);
}
- return 0;
-}
+ if (unlikely(need_wrap)) {
+ GEM_BUG_ON(remain_actual > ringbuf->space);
+ GEM_BUG_ON(ringbuf->tail + remain_actual > ringbuf->size);
-int intel_ring_begin(struct drm_i915_gem_request *req,
- int num_dwords)
-{
- struct intel_engine_cs *engine = req->engine;
- int ret;
-
- ret = __intel_ring_prepare(engine, num_dwords * sizeof(uint32_t));
- if (ret)
- return ret;
+ memset(ringbuf->virtual_start + ringbuf->tail,
+ 0, remain_actual);
+ ringbuf->tail = 0;
+ ringbuf->space -= remain_actual;
+ }
- engine->buffer->space -= num_dwords * sizeof(uint32_t);
+ ringbuf->space -= bytes;
+ GEM_BUG_ON(ringbuf->space < 0);
return 0;
}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 2ade194bbea9..201e7752d765 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -108,8 +108,6 @@ struct intel_ringbuffer {
int size;
int effective_size;
int reserved_size;
- int reserved_tail;
- bool reserved_in_use;
/** We track the position of the requests in the ring buffer, and
* when each is retired we increment last_retired_head as the GPU
@@ -459,7 +457,6 @@ static inline void intel_ring_advance(struct intel_engine_cs *engine)
}
int __intel_ring_space(int head, int tail, int size);
void intel_ring_update_space(struct intel_ringbuffer *ringbuf);
-int intel_ring_space(struct intel_ringbuffer *ringbuf);
bool intel_engine_stopped(struct intel_engine_cs *engine);
int __must_check intel_engine_idle(struct intel_engine_cs *engine);
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 13/25] drm/i915: Remove the identical implementations of request space reservation
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (11 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 12/25] drm/i915: Unify intel_ring_begin() Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-27 10:27 ` Joonas Lahtinen
2016-04-26 20:06 ` [PATCH v6 14/25] drm/i915: Manually unwind after a failed request allocation Chris Wilson
` (12 subsequent siblings)
25 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
Now that we share intel_ring_begin(), reserving space for the tail of
the request is identical between legacy/execlists, and so the
duplication can be removed. In the process, we move the reserved space
tracking from the ringbuffer onto the request. This enables us to
reorder the reserved space allocation in the next patch.
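To make the invariant explicit: every begin() must leave the request's
reservation untouched, and only adding the request consumes it, so
emitting the request tail cannot fail for lack of space. A heavily
simplified userspace sketch of that rule (hypothetical types, not the
driver structures):

#include <stdio.h>

struct ring { int space; };
struct request {
        struct ring *ring;
        int reserved_space; /* kept aside for emitting the request tail */
};

static int ring_begin(struct request *rq, int bytes)
{
        /* Never dip into the reservation for ordinary emits. */
        if (bytes + rq->reserved_space > rq->ring->space)
                return -1; /* the driver would wait/wrap here */
        rq->ring->space -= bytes;
        return 0;
}

static int add_request(struct request *rq)
{
        int tail = rq->reserved_space;

        rq->reserved_space = 0; /* now the reservation may be used */
        return ring_begin(rq, tail);
}

int main(void)
{
        struct ring r = { .space = 256 };
        struct request rq = { .ring = &r, .reserved_space = 160 };

        printf("emit 64 bytes:  %d\n", ring_begin(&rq, 64)); /* 0: 64+160 fits */
        printf("emit 64 more:   %d\n", ring_begin(&rq, 64)); /* -1: reservation kept */
        printf("add_request:    %d\n", add_request(&rq));    /* 0: always fits */
        return 0;
}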
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 3 +++
drivers/gpu/drm/i915/i915_gem.c | 21 ++++++++++------
drivers/gpu/drm/i915/intel_lrc.c | 15 -----------
drivers/gpu/drm/i915/intel_ringbuffer.c | 44 +++------------------------------
drivers/gpu/drm/i915/intel_ringbuffer.h | 17 -------------
5 files changed, 19 insertions(+), 81 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 444a8ea0c5c4..831da9f43324 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2278,6 +2278,9 @@ struct drm_i915_gem_request {
/** Position in the ringbuffer of the end of the whole request */
u32 tail;
+ /** Preallocate space in the ringbuffer for emitting the request */
+ u32 reserved_space;
+
/**
* Context and ring buffer related to this request
* Contexts are refcounted, so when this request is associated with a
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 71135c3ce44e..0e27484bd28a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2573,6 +2573,7 @@ void __i915_add_request(struct drm_i915_gem_request *request,
struct drm_i915_private *dev_priv;
struct intel_ringbuffer *ringbuf;
u32 request_start;
+ u32 reserved_tail;
int ret;
if (WARN_ON(request == NULL))
@@ -2587,9 +2588,10 @@ void __i915_add_request(struct drm_i915_gem_request *request,
* should already have been reserved in the ring buffer. Let the ring
* know that it is time to use that space up.
*/
- intel_ring_reserved_space_use(ringbuf);
-
request_start = intel_ring_get_tail(ringbuf);
+ reserved_tail = request->reserved_space;
+ request->reserved_space = 0;
+
/*
* Emit any outstanding flushes - execbuf can fail to emit the flush
* after having emitted the batchbuffer command. Hence we need to fix
@@ -2653,7 +2655,13 @@ void __i915_add_request(struct drm_i915_gem_request *request,
intel_mark_busy(dev_priv->dev);
/* Sanity check that the reserved size was large enough. */
- intel_ring_reserved_space_end(ringbuf);
+ ret = intel_ring_get_tail(ringbuf) - request_start;
+ if (ret < 0)
+ ret += ringbuf->size;
+ WARN_ONCE(ret > reserved_tail,
+ "Not enough space reserved (%d bytes) "
+ "for adding the request (%d bytes)\n",
+ reserved_tail, ret);
}
static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
@@ -2774,17 +2782,14 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
* to be redone if the request is not actually submitted straight
* away, e.g. because a GPU scheduler has deferred it.
*/
- if (i915.enable_execlists)
- ret = intel_logical_ring_reserve_space(req);
- else
- ret = intel_ring_reserve_space(req);
+ req->reserved_space = MIN_SPACE_FOR_ADD_REQUEST;
+ ret = intel_ring_begin(req, 0);
if (ret) {
/*
* At this point, the request is fully allocated even if not
* fully prepared. Thus it can be cleaned up using the proper
* free code.
*/
- intel_ring_reserved_space_cancel(req->ringbuf);
i915_gem_request_unreference(req);
return ret;
}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ba87c94928e7..910044cf143e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -772,21 +772,6 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
return 0;
}
-int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
-{
- /*
- * The first call merely notes the reserve request and is common for
- * all back ends. The subsequent localised _begin() call actually
- * ensures that the reservation is available. Without the begin, if
- * the request creator immediately submitted the request without
- * adding any commands to it then there might not actually be
- * sufficient room for the submission commands.
- */
- intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
-
- return intel_ring_begin(request, 0);
-}
-
/**
* execlists_submission() - submit a batchbuffer for execution, Execlists style
* @dev: DRM device.
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 1d3d2ea3e9bc..ba5946b9fa06 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2336,44 +2336,6 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
return 0;
}
-int intel_ring_reserve_space(struct drm_i915_gem_request *request)
-{
- /*
- * The first call merely notes the reserve request and is common for
- * all back ends. The subsequent localised _begin() call actually
- * ensures that the reservation is available. Without the begin, if
- * the request creator immediately submitted the request without
- * adding any commands to it then there might not actually be
- * sufficient room for the submission commands.
- */
- intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
-
- return intel_ring_begin(request, 0);
-}
-
-void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
-{
- GEM_BUG_ON(ringbuf->reserved_size);
- ringbuf->reserved_size = size;
-}
-
-void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
-{
- GEM_BUG_ON(!ringbuf->reserved_size);
- ringbuf->reserved_size = 0;
-}
-
-void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
-{
- GEM_BUG_ON(!ringbuf->reserved_size);
- ringbuf->reserved_size = 0;
-}
-
-void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
-{
- GEM_BUG_ON(ringbuf->reserved_size);
-}
-
static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
{
struct intel_ringbuffer *ringbuf = req->ringbuf;
@@ -2385,7 +2347,7 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
return 0;
/* The whole point of reserving space is to not wait! */
- GEM_BUG_ON(!ringbuf->reserved_size);
+ GEM_BUG_ON(!req->reserved_space);
list_for_each_entry(target, &engine->request_list, list) {
unsigned space;
@@ -2420,7 +2382,7 @@ int intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
int total_bytes, wait_bytes;
bool need_wrap = false;
- total_bytes = bytes + ringbuf->reserved_size;
+ total_bytes = bytes + req->reserved_space;
if (unlikely(bytes > remain_usable)) {
/*
@@ -2436,7 +2398,7 @@ int intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
* and only need to effectively wait for the reserved
* size space from the start of ringbuffer.
*/
- wait_bytes = remain_actual + ringbuf->reserved_size;
+ wait_bytes = remain_actual + req->reserved_space;
} else
/* No wrapping required, just waiting. */
wait_bytes = total_bytes;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 201e7752d765..038914ccc6fd 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -107,7 +107,6 @@ struct intel_ringbuffer {
int space;
int size;
int effective_size;
- int reserved_size;
/** We track the position of the requests in the ring buffer, and
* when each is retired we increment last_retired_head as the GPU
@@ -491,20 +490,4 @@ static inline u32 intel_ring_get_tail(struct intel_ringbuffer *ringbuf)
*/
#define MIN_SPACE_FOR_ADD_REQUEST 160
-/*
- * Reserve space in the ring to guarantee that the i915_add_request() call
- * will always have sufficient room to do its stuff. The request creation
- * code calls this automatically.
- */
-void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
-/* Cancel the reservation, e.g. because the request is being discarded. */
-void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
-/* Use the reserved space - for use by i915_add_request() only. */
-void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
-/* Finish with the reserved space - for use by i915_add_request() only. */
-void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
-
-/* Legacy ringbuffer specific portion of reservation code: */
-int intel_ring_reserve_space(struct drm_i915_gem_request *request);
-
#endif /* _INTEL_RINGBUFFER_H_ */
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 14/25] drm/i915: Manually unwind after a failed request allocation
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (12 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 13/25] drm/i915: Remove the identical implementations of request space reservation Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-27 10:48 ` Joonas Lahtinen
2016-04-27 12:51 ` Tvrtko Ursulin
2016-04-26 20:06 ` [PATCH v6 15/25] drm/i915: Preallocate enough space for the average request Chris Wilson
` (11 subsequent siblings)
25 siblings, 2 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
In the next patches, we want to move the work out of freeing the request
and into its retirement (so that we can free the request without
requiring the struct_mutex). This means that we can no longer rely on
unreferencing the request to completely tear down the request, and so
we need to manually unwind the failed allocation. In doing so, we
reorder the allocation to make the unwind simple (and to ensure that we
don't try to unwind a partial request that may have modified global
state), and we end up pushing the initial preallocation down into the
engine request initialisation functions, where we have the requisite
control over the state of the request.
Moving the initial preallocation into the engine is less than ideal: it
moves logic for handling a specific request-construction problem out of
the common code. On the other hand, it does allow those backends
significantly more flexibility in performing their allocations.
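The resulting error handling is the usual onion unwind; a trimmed,
self-contained sketch of the shape (stubbed helpers, hypothetical
names, with ring_begin() forced to fail so the unwind runs):

#include <stdio.h>

struct request { int pinned; };

static int pin_context(struct request *rq)    { rq->pinned = 1; return 0; }
static void unpin_context(struct request *rq) { rq->pinned = 0; }
static int ring_begin(struct request *rq)     { (void)rq; return -1; }

static int request_alloc(struct request *rq)
{
        int ret;

        ret = pin_context(rq);
        if (ret)
                return ret;

        ret = ring_begin(rq);
        if (ret)
                goto err_unpin; /* undo exactly what we did, in reverse */

        return 0;

err_unpin:
        unpin_context(rq);
        return ret;
}

int main(void)
{
        struct request rq = { 0 };

        printf("alloc: %d, still pinned: %d\n", request_alloc(&rq), rq.pinned);
        return 0;
}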
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 28 +++++++++-------------------
drivers/gpu/drm/i915/intel_lrc.c | 16 ++++++++++++++--
drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
3 files changed, 24 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0e27484bd28a..d7ff5e79182f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2766,15 +2766,6 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
req->ctx = ctx;
i915_gem_context_reference(req->ctx);
- if (i915.enable_execlists)
- ret = intel_logical_ring_alloc_request_extras(req);
- else
- ret = intel_ring_alloc_request_extras(req);
- if (ret) {
- i915_gem_context_unreference(req->ctx);
- goto err;
- }
-
/*
* Reserve space in the ring buffer for all the commands required to
* eventually emit this request. This is to guarantee that the
@@ -2783,20 +2774,19 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
* away, e.g. because a GPU scheduler has deferred it.
*/
req->reserved_space = MIN_SPACE_FOR_ADD_REQUEST;
- ret = intel_ring_begin(req, 0);
- if (ret) {
- /*
- * At this point, the request is fully allocated even if not
- * fully prepared. Thus it can be cleaned up using the proper
- * free code.
- */
- i915_gem_request_unreference(req);
- return ret;
- }
+
+ if (i915.enable_execlists)
+ ret = intel_logical_ring_alloc_request_extras(req);
+ else
+ ret = intel_ring_alloc_request_extras(req);
+ if (ret)
+ goto err_ctx;
*req_out = req;
return 0;
+err_ctx:
+ i915_gem_context_unreference(ctx);
err:
kmem_cache_free(dev_priv->requests, req);
return ret;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 910044cf143e..01517dd7069b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -698,7 +698,7 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request)
{
- int ret = 0;
+ int ret;
request->ringbuf = request->ctx->engine[request->engine->id].ringbuf;
@@ -715,9 +715,21 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
return ret;
}
- if (request->ctx != request->i915->kernel_context)
+ if (request->ctx != request->i915->kernel_context) {
ret = intel_lr_context_pin(request->ctx, request->engine);
+ if (ret)
+ return ret;
+ }
+ ret = intel_ring_begin(request, 0);
+ if (ret)
+ goto err_unpin;
+
+ return 0;
+
+err_unpin:
+ if (request->ctx != request->i915->kernel_context)
+ intel_lr_context_unpin(request->ctx, request->engine);
return ret;
}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index ba5946b9fa06..1193372f74fd 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2333,7 +2333,7 @@ int intel_engine_idle(struct intel_engine_cs *engine)
int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
{
request->ringbuf = request->engine->buffer;
- return 0;
+ return intel_ring_begin(request, 0);
}
static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 15/25] drm/i915: Preallocate enough space for the average request
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (13 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 14/25] drm/i915: Manually unwind after a failed request allocation Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-27 13:15 ` Tvrtko Ursulin
2016-04-26 20:06 ` [PATCH v6 16/25] drm/i915: Update execlists context descriptor format commentary Chris Wilson
` (10 subsequent siblings)
25 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
Rather than be interrupted when we run out of space halfway through
the request and have to restart from the beginning (returning to
userspace), flush a little more free space when we prepare the request.
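The trick is simply to double up the reservation across the initial
begin(); a sketch under the same simplified model as in the previous
patch (hypothetical names):

#include <stdio.h>

#define MIN_SPACE_FOR_ADD_REQUEST 160

struct request { int reserved_space; int ring_space; };

static int ring_begin(struct request *rq, int bytes)
{
        return bytes + rq->reserved_space > rq->ring_space ? -1 : 0;
}

static int request_alloc(struct request *rq)
{
        int ret;

        /* Temporarily double the reservation so the begin() below
         * flushes an average request's worth of extra space. */
        rq->reserved_space += MIN_SPACE_FOR_ADD_REQUEST;
        ret = ring_begin(rq, 0); /* the driver waits here if needed */
        if (ret)
                return ret;
        rq->reserved_space -= MIN_SPACE_FOR_ADD_REQUEST;
        return 0;
}

int main(void)
{
        struct request roomy = { MIN_SPACE_FOR_ADD_REQUEST, 512 };
        struct request tight = { MIN_SPACE_FOR_ADD_REQUEST, 200 };

        printf("roomy ring: %d\n", request_alloc(&roomy)); /* 0 */
        printf("tight ring: %d\n", request_alloc(&tight)); /* -1: < 320 free */
        return 0;
}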
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
drivers/gpu/drm/i915/intel_lrc.c | 7 +++++++
drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +++++++++++++++-
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 01517dd7069b..b5c2c1931a5f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -700,6 +700,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
{
int ret;
+ /* Flush enough space to reduce the likelihood of waiting after
+ * we start building the request - in which case we will just
+ * have to repeat work.
+ */
+ request->reserved_space += MIN_SPACE_FOR_ADD_REQUEST;
+
request->ringbuf = request->ctx->engine[request->engine->id].ringbuf;
if (i915.enable_guc_submission) {
@@ -725,6 +731,7 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
if (ret)
goto err_unpin;
+ request->reserved_space -= MIN_SPACE_FOR_ADD_REQUEST;
return 0;
err_unpin:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 1193372f74fd..1285605f25c7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2332,8 +2332,22 @@ int intel_engine_idle(struct intel_engine_cs *engine)
int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
{
+ int ret;
+
+ /* Flush enough space to reduce the likelihood of waiting after
+ * we start building the request - in which case we will just
+ * have to repeat work.
+ */
+ request->reserved_space += MIN_SPACE_FOR_ADD_REQUEST;
+
request->ringbuf = request->engine->buffer;
- return intel_ring_begin(request, 0);
+
+ ret = intel_ring_begin(request, 0);
+ if (ret)
+ return ret;
+
+ request->reserved_space -= MIN_SPACE_FOR_ADD_REQUEST;
+ return 0;
}
static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 16/25] drm/i915: Update execlists context descriptor format commentary
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (14 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 15/25] drm/i915: Preallocate enough space for the average request Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-27 13:22 ` Tvrtko Ursulin
2016-04-26 20:06 ` [PATCH v6 17/25] drm/i915: Assign every HW context a unique ID Chris Wilson
` (9 subsequent siblings)
25 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
The comments describing the Context Descriptor Format are off by a bit
for the size of the context ID.
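For reference, the corrected layout packs into the descriptor like so
(a sketch with made-up field values; the helper is illustrative, not
driver code):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define GEN8_CTX_ID_SHIFT 32

static uint64_t make_descriptor(uint64_t flags, uint64_t lrca, uint64_t ctx_id)
{
        assert((flags & ~0xfffull) == 0);                      /* bits 0-11 */
        assert((lrca & 0xfffull) == 0 && lrca < (1ull << 32)); /* bits 12-31 */
        assert(ctx_id < (1ull << 21));                         /* bits 32-52 */

        return flags | lrca | ctx_id << GEN8_CTX_ID_SHIFT;
}

int main(void)
{
        uint64_t desc = make_descriptor(0x019, 0x12345ull << 12, 0x54321);

        printf("desc = %#018llx\n", (unsigned long long)desc);
        return 0;
}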
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/intel_lrc.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b5c2c1931a5f..5d8ee9059eee 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -305,10 +305,11 @@ logical_ring_init_platform_invariants(struct intel_engine_cs *engine)
* which remains valid until the context is unpinned.
*
* This is what a descriptor looks like, from LSB to MSB:
- * bits 0-11: flags, GEN8_CTX_* (cached in ctx_desc_template)
+ * bits 0-11: flags, GEN8_CTX_* (cached in ctx_desc_template)
* bits 12-31: LRCA, GTT address of (the HWSP of) this context
- * bits 32-51: ctx ID, a globally unique tag (the LRCA again!)
- * bits 52-63: reserved, may encode the engine ID (for GuC)
+ * bits 32-52: ctx ID, a globally unique tag (the LRCA again!)
+ * bits 53-54: mbz, reserved for use by hardware
+ * bits 55-63: group ID, currently unused and set to 0
*/
static void
intel_lr_context_descriptor_update(struct intel_context *ctx,
@@ -319,9 +320,9 @@ intel_lr_context_descriptor_update(struct intel_context *ctx,
lrca = ctx->engine[engine->id].lrc_vma->node.start +
LRC_PPHWSP_PN * PAGE_SIZE;
- desc = engine->ctx_desc_template; /* bits 0-11 */
+ desc = engine->ctx_desc_template; /* bits 0-11 */
desc |= lrca; /* bits 12-31 */
- desc |= (lrca >> PAGE_SHIFT) << GEN8_CTX_ID_SHIFT; /* bits 32-51 */
+ desc |= (lrca >> PAGE_SHIFT) << GEN8_CTX_ID_SHIFT; /* bits 32-52 */
ctx->engine[engine->id].lrc_desc = desc;
}
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 17/25] drm/i915: Assign every HW context a unique ID
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (15 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 16/25] drm/i915: Update execlists context descriptor format commentary Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 18/25] drm/i915: Replace the pinned context address with its " Chris Wilson
` (8 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
The hardware tracks contexts and expects all live contexts (those active
on the hardware) to have a unique identifier. This is used by the
hardware to assign pagefaults and the like to a particular context.
v2: Reorder to make sure ctx->link is not left dangling if the
assignment of a hw_id fails (Mika).
v3: We have 21 bits of context space, not 20.
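The allocation strategy is: try to grab an id, and if the space is
exhausted, flush pending retires (which release the ids of dead
contexts) and try once more. A toy model of that retry, with a 4-slot
bitmap standing in for the kernel's ida (illustrative only):

#include <stdbool.h>
#include <stdio.h>

#define MAX_ID 4 /* the driver allows 1 << 21 */

static bool in_use[MAX_ID];

static int id_get(void)
{
        for (int i = 0; i < MAX_ID; i++)
                if (!in_use[i]) {
                        in_use[i] = true;
                        return i;
                }
        return -1;
}

/* Stand-in for i915_gem_retire_requests(): retiring releases the
 * ids held by stale contexts. */
static void retire(void)
{
        in_use[2] = false;
}

static int assign_hw_id(unsigned *out)
{
        int ret = id_get();

        if (ret < 0) {
                retire();
                ret = id_get();
                if (ret < 0)
                        return ret;
        }
        *out = ret;
        return 0;
}

int main(void)
{
        unsigned id = 0;

        while (id_get() >= 0)
                ; /* exhaust the id space */
        printf("assign: %d (id %u)\n", assign_hw_id(&id), id);
        return 0;
}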
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v2
---
drivers/gpu/drm/i915/i915_debugfs.c | 2 +-
drivers/gpu/drm/i915/i915_drv.h | 10 +++++++++
drivers/gpu/drm/i915/i915_gem_context.c | 36 +++++++++++++++++++++++++++++++++
3 files changed, 47 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 4950d05d2948..3bd2f89933ff 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2001,7 +2001,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
ctx->legacy_hw_ctx.rcs_state == NULL)
continue;
- seq_puts(m, "HW context ");
+ seq_printf(m, "HW context %u ", ctx->hw_id);
describe_ctx(m, ctx);
if (ctx == dev_priv->kernel_context)
seq_printf(m, "(kernel context) ");
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 831da9f43324..0f7575252c6e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -851,6 +851,9 @@ struct intel_context {
struct i915_ctx_hang_stats hang_stats;
struct i915_hw_ppgtt *ppgtt;
+ /* Unique identifier for this context, used by the hw for tracking */
+ unsigned hw_id;
+
/* Legacy ring buffer submission */
struct {
struct drm_i915_gem_object *rcs_state;
@@ -1838,6 +1841,13 @@ struct drm_i915_private {
DECLARE_HASHTABLE(mm_structs, 7);
struct mutex mm_lock;
+ /* The hw wants to have a stable context identifier for the lifetime
+ * of the context (for OA, PASID, faults, etc). This is limited
+ * in execlists to 21 bits.
+ */
+ struct ida context_hw_ida;
+#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
+
/* Kernel Modesetting */
struct drm_crtc *plane_to_crtc_mapping[I915_MAX_PIPES];
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 4071a0ec134a..9477af0fcb33 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -171,6 +171,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
if (ctx->legacy_hw_ctx.rcs_state)
drm_gem_object_unreference(&ctx->legacy_hw_ctx.rcs_state->base);
list_del(&ctx->link);
+
+ ida_simple_remove(&ctx->i915->context_hw_ida, ctx->hw_id);
kfree(ctx);
}
@@ -211,6 +213,28 @@ i915_gem_alloc_context_obj(struct drm_device *dev, size_t size)
return obj;
}
+static int assign_hw_id(struct drm_i915_private *dev_priv, unsigned *out)
+{
+ int ret;
+
+ ret = ida_simple_get(&dev_priv->context_hw_ida,
+ 0, MAX_CONTEXT_HW_ID, GFP_KERNEL);
+ if (ret < 0) {
+ /* Contexts are only released when no longer active.
+ * Flush any pending retires to hopefully release some
+ * stale contexts and try again.
+ */
+ i915_gem_retire_requests(dev_priv->dev);
+ ret = ida_simple_get(&dev_priv->context_hw_ida,
+ 0, MAX_CONTEXT_HW_ID, GFP_KERNEL);
+ if (ret < 0)
+ return ret;
+ }
+
+ *out = ret;
+ return 0;
+}
+
static struct intel_context *
__create_hw_context(struct drm_device *dev,
struct drm_i915_file_private *file_priv)
@@ -223,6 +247,12 @@ __create_hw_context(struct drm_device *dev,
if (ctx == NULL)
return ERR_PTR(-ENOMEM);
+ ret = assign_hw_id(dev_priv, &ctx->hw_id);
+ if (ret) {
+ kfree(ctx);
+ return ERR_PTR(ret);
+ }
+
kref_init(&ctx->ref);
list_add_tail(&ctx->link, &dev_priv->context_list);
ctx->i915 = dev_priv;
@@ -366,6 +396,10 @@ int i915_gem_context_init(struct drm_device *dev)
}
}
+ /* Using the simple ida interface, the max is limited by sizeof(int) */
+ BUILD_BUG_ON(MAX_CONTEXT_HW_ID > INT_MAX);
+ ida_init(&dev_priv->context_hw_ida);
+
if (i915.enable_execlists) {
/* NB: intentionally left blank. We will allocate our own
* backing objects as we need them, thank you very much */
@@ -429,6 +463,8 @@ void i915_gem_context_fini(struct drm_device *dev)
i915_gem_context_unreference(dctx);
dev_priv->kernel_context = NULL;
+
+ ida_destroy(&dev_priv->context_hw_ida);
}
static int context_idr_cleanup(int id, void *p, void *data)
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 18/25] drm/i915: Replace the pinned context address with its unique ID
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (16 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 17/25] drm/i915: Assign every HW context a unique ID Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 19/25] drm/i915: Refactor execlists default context pinning Chris Wilson
` (7 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
Rather than reuse the current location of the context in the global GTT
for its hardware identifier, use the context's unique ID assigned to it
for its whole lifetime.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 12 +++++-------
drivers/gpu/drm/i915/intel_lrc.c | 39 +++++++++----------------------------
drivers/gpu/drm/i915/intel_lrc.h | 3 ---
3 files changed, 14 insertions(+), 40 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 3bd2f89933ff..be2a4a0fae13 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2043,15 +2043,13 @@ static void i915_dump_lrc_obj(struct seq_file *m,
struct drm_i915_gem_object *ctx_obj = ctx->engine[engine->id].state;
unsigned long ggtt_offset = 0;
+ seq_printf(m, "CONTEXT: %s %u\n", engine->name, ctx->hw_id);
+
if (ctx_obj == NULL) {
- seq_printf(m, "Context on %s with no gem object\n",
- engine->name);
+ seq_puts(m, "\tNot allocated\n");
return;
}
- seq_printf(m, "CONTEXT: %s %u\n", engine->name,
- intel_execlists_ctx_id(ctx, engine));
-
if (!i915_gem_obj_ggtt_bound(ctx_obj))
seq_puts(m, "\tNot bound in GGTT\n");
else
@@ -2170,8 +2168,8 @@ static int i915_execlists(struct seq_file *m, void *data)
seq_printf(m, "\t%d requests in queue\n", count);
if (head_req) {
- seq_printf(m, "\tHead request id: %u\n",
- intel_execlists_ctx_id(head_req->ctx, engine));
+ seq_printf(m, "\tHead request context: %u\n",
+ head_req->ctx->hw_id);
seq_printf(m, "\tHead request tail: %u\n",
head_req->tail);
}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5d8ee9059eee..ccabb688e049 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -224,6 +224,7 @@ enum {
FAULT_AND_CONTINUE /* Unsupported */
};
#define GEN8_CTX_ID_SHIFT 32
+#define GEN8_CTX_ID_WIDTH 21
#define GEN8_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x17
#define GEN9_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x26
@@ -307,7 +308,7 @@ logical_ring_init_platform_invariants(struct intel_engine_cs *engine)
* This is what a descriptor looks like, from LSB to MSB:
* bits 0-11: flags, GEN8_CTX_* (cached in ctx_desc_template)
* bits 12-31: LRCA, GTT address of (the HWSP of) this context
- * bits 32-52: ctx ID, a globally unique tag (the LRCA again!)
+ * bits 32-52: ctx ID, a globally unique tag
* bits 53-54: mbz, reserved for use by hardware
* bits 55-63: group ID, currently unused and set to 0
*/
@@ -315,14 +316,14 @@ static void
intel_lr_context_descriptor_update(struct intel_context *ctx,
struct intel_engine_cs *engine)
{
- uint64_t lrca, desc;
+ u64 desc;
- lrca = ctx->engine[engine->id].lrc_vma->node.start +
- LRC_PPHWSP_PN * PAGE_SIZE;
+ BUILD_BUG_ON(MAX_CONTEXT_HW_ID > (1<<GEN8_CTX_ID_WIDTH));
- desc = engine->ctx_desc_template; /* bits 0-11 */
- desc |= lrca; /* bits 12-31 */
- desc |= (lrca >> PAGE_SHIFT) << GEN8_CTX_ID_SHIFT; /* bits 32-52 */
+ desc = engine->ctx_desc_template; /* bits 0-11 */
+ desc |= ctx->engine[engine->id].lrc_vma->node.start + /* bits 12-31 */
+ LRC_PPHWSP_PN * PAGE_SIZE;
+ desc |= (u64)ctx->hw_id << GEN8_CTX_ID_SHIFT; /* bits 32-52 */
ctx->engine[engine->id].lrc_desc = desc;
}
@@ -333,28 +334,6 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
return ctx->engine[engine->id].lrc_desc;
}
-/**
- * intel_execlists_ctx_id() - get the Execlists Context ID
- * @ctx: Context to get the ID for
- * @ring: Engine to get the ID for
- *
- * Do not confuse with ctx->id! Unfortunately we have a name overload
- * here: the old context ID we pass to userspace as a handler so that
- * they can refer to a context, and the new context ID we pass to the
- * ELSP so that the GPU can inform us of the context status via
- * interrupts.
- *
- * The context ID is a portion of the context descriptor, so we can
- * just extract the required part from the cached descriptor.
- *
- * Return: 20-bits globally unique context ID.
- */
-u32 intel_execlists_ctx_id(struct intel_context *ctx,
- struct intel_engine_cs *engine)
-{
- return intel_lr_context_descriptor(ctx, engine) >> GEN8_CTX_ID_SHIFT;
-}
-
static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
struct drm_i915_gem_request *rq1)
{
@@ -500,7 +479,7 @@ execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
if (!head_req)
return 0;
- if (unlikely(intel_execlists_ctx_id(head_req->ctx, engine) != request_id))
+ if (unlikely(head_req->ctx->hw_id != request_id))
return 0;
WARN(head_req->elsp_submitted == 0, "Never submitted head request\n");
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 60a7385bc531..9510826f0d90 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -113,9 +113,6 @@ void intel_lr_context_reset(struct drm_i915_private *dev_priv,
uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
struct intel_engine_cs *engine);
-u32 intel_execlists_ctx_id(struct intel_context *ctx,
- struct intel_engine_cs *engine);
-
/* Execlists */
int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
struct i915_execbuffer_params;
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 19/25] drm/i915: Refactor execlists default context pinning
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (17 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 18/25] drm/i915: Replace the pinned context address with its " Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 20/25] drm/i915: Move the magical deferred context allocation into the request Chris Wilson
` (6 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
Refactor pinning and unpinning of contexts, such that the default
context for an engine is pinned during initialisation and unpinned
during teardown (pinning of the context handles the reference counting).
Thus we can eliminate the special-case handling of the default context
that was required to mask the fact that it was not being pinned normally.
v2: Rebalance context_queue after rebasing.
v3: Rebase to -nightly (not 40 patches in)
v4: Rebase onto request_alloc unwinding
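The pinning discipline is plain refcounting: make the context resident
on the first pin, release it on the last unpin. A stubbed sketch of
just that pattern (hypothetical helpers, not the driver functions):

#include <assert.h>
#include <stdio.h>

struct ctx { int pin_count; int resident; };

static int make_resident(struct ctx *c)   { c->resident = 1; return 0; }
static void make_evictable(struct ctx *c) { c->resident = 0; }

static int ctx_pin(struct ctx *c)
{
        if (c->pin_count++)
                return 0; /* already resident, just bump the count */

        if (make_resident(c)) {
                c->pin_count = 0;
                return -1;
        }
        return 0;
}

static void ctx_unpin(struct ctx *c)
{
        assert(c->pin_count > 0);
        if (--c->pin_count)
                return;

        make_evictable(c); /* last unpin releases everything */
}

int main(void)
{
        struct ctx c = { 0, 0 };

        ctx_pin(&c);   /* init: the default context stays pinned... */
        ctx_pin(&c);   /* ...while requests pin/unpin around it */
        ctx_unpin(&c);
        printf("resident after request:  %d\n", c.resident); /* 1 */
        ctx_unpin(&c); /* teardown */
        printf("resident after teardown: %d\n", c.resident); /* 0 */
        return 0;
}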
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 5 +-
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gem.c | 2 +-
drivers/gpu/drm/i915/intel_lrc.c | 150 +++++++++++++++---------------------
4 files changed, 65 insertions(+), 93 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index be2a4a0fae13..5bc9789e1f1e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2095,9 +2095,8 @@ static int i915_dump_lrc(struct seq_file *m, void *unused)
return ret;
list_for_each_entry(ctx, &dev_priv->context_list, link)
- if (ctx != dev_priv->kernel_context)
- for_each_engine(engine, dev_priv)
- i915_dump_lrc_obj(m, ctx, engine);
+ for_each_engine(engine, dev_priv)
+ i915_dump_lrc_obj(m, ctx, engine);
mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0f7575252c6e..ea7aea044603 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -868,6 +868,7 @@ struct intel_context {
struct i915_vma *lrc_vma;
u64 lrc_desc;
uint32_t *lrc_reg_state;
+ bool initialised;
} engine[I915_NUM_ENGINES];
struct list_head link;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d7ff5e79182f..df9d43397492 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2719,7 +2719,7 @@ void i915_gem_request_free(struct kref *req_ref)
i915_gem_request_remove_from_client(req);
if (ctx) {
- if (i915.enable_execlists && ctx != req->i915->kernel_context)
+ if (i915.enable_execlists)
intel_lr_context_unpin(ctx, req->engine);
i915_gem_context_unreference(ctx);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ccabb688e049..f74c77cf2b52 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -592,9 +592,7 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
struct drm_i915_gem_request *cursor;
int num_elements = 0;
- if (request->ctx != request->i915->kernel_context)
- intel_lr_context_pin(request->ctx, engine);
-
+ intel_lr_context_pin(request->ctx, request->engine);
i915_gem_request_reference(request);
spin_lock_bh(&engine->execlist_lock);
@@ -678,6 +676,7 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request)
{
+ struct intel_engine_cs *engine = request->engine;
int ret;
/* Flush enough space to reduce the likelihood of waiting after
@@ -686,7 +685,7 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
*/
request->reserved_space += MIN_SPACE_FOR_ADD_REQUEST;
- request->ringbuf = request->ctx->engine[request->engine->id].ringbuf;
+ request->ringbuf = request->ctx->engine[engine->id].ringbuf;
if (i915.enable_guc_submission) {
/*
@@ -701,22 +700,34 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
return ret;
}
- if (request->ctx != request->i915->kernel_context) {
- ret = intel_lr_context_pin(request->ctx, request->engine);
- if (ret)
- return ret;
- }
+ ret = intel_lr_context_pin(request->ctx, engine);
+ if (ret)
+ return ret;
ret = intel_ring_begin(request, 0);
if (ret)
goto err_unpin;
+ if (!request->ctx->engine[engine->id].initialised) {
+ ret = engine->init_context(request);
+ if (ret)
+ goto err_unpin;
+
+ request->ctx->engine[engine->id].initialised = true;
+ }
+
+ /* Note that after this point, we have committed to using
+ * this request as it is being used to both track the
+ * state of engine initialisation and liveness of the
+ * golden renderstate above. Think twice before you try
+ * to cancel/unwind this request now.
+ */
+
request->reserved_space -= MIN_SPACE_FOR_ADD_REQUEST;
return 0;
err_unpin:
- if (request->ctx != request->i915->kernel_context)
- intel_lr_context_unpin(request->ctx, request->engine);
+ intel_lr_context_unpin(request->ctx, engine);
return ret;
}
@@ -755,12 +766,8 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
if (engine->last_context != request->ctx) {
if (engine->last_context)
intel_lr_context_unpin(engine->last_context, engine);
- if (request->ctx != request->i915->kernel_context) {
- intel_lr_context_pin(request->ctx, engine);
- engine->last_context = request->ctx;
- } else {
- engine->last_context = NULL;
- }
+ intel_lr_context_pin(request->ctx, engine);
+ engine->last_context = request->ctx;
}
if (dev_priv->guc.execbuf_client)
@@ -880,12 +887,7 @@ void intel_execlists_retire_requests(struct intel_engine_cs *engine)
spin_unlock_bh(&engine->execlist_lock);
list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
- struct intel_context *ctx = req->ctx;
- struct drm_i915_gem_object *ctx_obj =
- ctx->engine[engine->id].state;
-
- if (ctx_obj && (ctx != req->i915->kernel_context))
- intel_lr_context_unpin(ctx, engine);
+ intel_lr_context_unpin(req->ctx, engine);
list_del(&req->execlist_link);
i915_gem_request_unreference(req);
@@ -930,23 +932,26 @@ int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
return 0;
}
-static int intel_lr_context_do_pin(struct intel_context *ctx,
- struct intel_engine_cs *engine)
+static int intel_lr_context_pin(struct intel_context *ctx,
+ struct intel_engine_cs *engine)
{
- struct drm_device *dev = engine->dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
- struct drm_i915_gem_object *ctx_obj = ctx->engine[engine->id].state;
- struct intel_ringbuffer *ringbuf = ctx->engine[engine->id].ringbuf;
+ struct drm_i915_private *dev_priv = ctx->i915;
+ struct drm_i915_gem_object *ctx_obj;
+ struct intel_ringbuffer *ringbuf;
void *vaddr;
u32 *lrc_reg_state;
int ret;
- WARN_ON(!mutex_is_locked(&engine->dev->struct_mutex));
+ lockdep_assert_held(&ctx->i915->dev->struct_mutex);
+ if (ctx->engine[engine->id].pin_count++)
+ return 0;
+
+ ctx_obj = ctx->engine[engine->id].state;
ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
PIN_OFFSET_BIAS | GUC_WOPCM_TOP);
if (ret)
- return ret;
+ goto err;
vaddr = i915_gem_object_pin_map(ctx_obj);
if (IS_ERR(vaddr)) {
@@ -956,10 +961,12 @@ static int intel_lr_context_do_pin(struct intel_context *ctx,
lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
+ ringbuf = ctx->engine[engine->id].ringbuf;
ret = intel_pin_and_map_ringbuffer_obj(engine->dev, ringbuf);
if (ret)
goto unpin_map;
+ i915_gem_context_reference(ctx);
ctx->engine[engine->id].lrc_vma = i915_gem_obj_to_ggtt(ctx_obj);
intel_lr_context_descriptor_update(ctx, engine);
lrc_reg_state[CTX_RING_BUFFER_START+1] = ringbuf->vma->node.start;
@@ -970,51 +977,39 @@ static int intel_lr_context_do_pin(struct intel_context *ctx,
if (i915.enable_guc_submission)
I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
- return ret;
+ return 0;
unpin_map:
i915_gem_object_unpin_map(ctx_obj);
unpin_ctx_obj:
i915_gem_object_ggtt_unpin(ctx_obj);
-
+err:
+ ctx->engine[engine->id].pin_count = 0;
return ret;
}
-static int intel_lr_context_pin(struct intel_context *ctx,
- struct intel_engine_cs *engine)
+void intel_lr_context_unpin(struct intel_context *ctx,
+ struct intel_engine_cs *engine)
{
- int ret = 0;
+ struct drm_i915_gem_object *ctx_obj;
- if (ctx->engine[engine->id].pin_count++ == 0) {
- ret = intel_lr_context_do_pin(ctx, engine);
- if (ret)
- goto reset_pin_count;
+ lockdep_assert_held(&ctx->i915->dev->struct_mutex);
+ GEM_BUG_ON(ctx->engine[engine->id].pin_count == 0);
- i915_gem_context_reference(ctx);
- }
- return ret;
+ if (--ctx->engine[engine->id].pin_count)
+ return;
-reset_pin_count:
- ctx->engine[engine->id].pin_count = 0;
- return ret;
-}
+ intel_unpin_ringbuffer_obj(ctx->engine[engine->id].ringbuf);
-void intel_lr_context_unpin(struct intel_context *ctx,
- struct intel_engine_cs *engine)
-{
- struct drm_i915_gem_object *ctx_obj = ctx->engine[engine->id].state;
+ ctx_obj = ctx->engine[engine->id].state;
+ i915_gem_object_unpin_map(ctx_obj);
+ i915_gem_object_ggtt_unpin(ctx_obj);
- WARN_ON(!mutex_is_locked(&ctx->i915->dev->struct_mutex));
- if (--ctx->engine[engine->id].pin_count == 0) {
- i915_gem_object_unpin_map(ctx_obj);
- intel_unpin_ringbuffer_obj(ctx->engine[engine->id].ringbuf);
- i915_gem_object_ggtt_unpin(ctx_obj);
- ctx->engine[engine->id].lrc_vma = NULL;
- ctx->engine[engine->id].lrc_desc = 0;
- ctx->engine[engine->id].lrc_reg_state = NULL;
+ ctx->engine[engine->id].lrc_vma = NULL;
+ ctx->engine[engine->id].lrc_desc = 0;
+ ctx->engine[engine->id].lrc_reg_state = NULL;
- i915_gem_context_unreference(ctx);
- }
+ i915_gem_context_unreference(ctx);
}
static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
@@ -1912,6 +1907,7 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
i915_gem_object_unpin_map(engine->status_page.obj);
engine->status_page.obj = NULL;
}
+ intel_lr_context_unpin(dev_priv->kernel_context, engine);
engine->idle_lite_restore_wa = 0;
engine->disable_lite_restore_wa = false;
@@ -2015,11 +2011,10 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine)
goto error;
/* As this is the default context, always pin it */
- ret = intel_lr_context_do_pin(dctx, engine);
+ ret = intel_lr_context_pin(dctx, engine);
if (ret) {
- DRM_ERROR(
- "Failed to pin and map ringbuffer %s: %d\n",
- engine->name, ret);
+ DRM_ERROR("Failed to pin context for %s: %d\n",
+ engine->name, ret);
goto error;
}
@@ -2440,12 +2435,6 @@ void intel_lr_context_free(struct intel_context *ctx)
if (!ctx_obj)
continue;
- if (ctx == ctx->i915->kernel_context) {
- intel_unpin_ringbuffer_obj(ringbuf);
- i915_gem_object_ggtt_unpin(ctx_obj);
- i915_gem_object_unpin_map(ctx_obj);
- }
-
WARN_ON(ctx->engine[i].pin_count);
intel_ringbuffer_free(ringbuf);
drm_gem_object_unreference(&ctx_obj->base);
@@ -2541,25 +2530,8 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
ctx->engine[engine->id].ringbuf = ringbuf;
ctx->engine[engine->id].state = ctx_obj;
+ ctx->engine[engine->id].initialised = engine->init_context == NULL;
- if (ctx != ctx->i915->kernel_context && engine->init_context) {
- struct drm_i915_gem_request *req;
-
- req = i915_gem_request_alloc(engine, ctx);
- if (IS_ERR(req)) {
- ret = PTR_ERR(req);
- DRM_ERROR("ring create req: %d\n", ret);
- goto error_ringbuf;
- }
-
- ret = engine->init_context(req);
- i915_add_request_no_flush(req);
- if (ret) {
- DRM_ERROR("ring init context: %d\n",
- ret);
- goto error_ringbuf;
- }
- }
return 0;
error_ringbuf:
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 20/25] drm/i915: Move the magical deferred context allocation into the request
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (18 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 19/25] drm/i915: Refactor execlists default context pinning Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 21/25] drm/i915: Move releasing of the GEM request from free to retire/cancel Chris Wilson
` (5 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
We can hide more details of execlists from higher level code by moving
the explicit call that creates an execlists context out of execbuffer and
into its first use by execlists.
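A condensed view of the resulting flow (a sketch only; the call chain is
simplified, with the function names taken from the diff below):

    /*
     * i915_gem_do_execbuffer()
     *   -> i915_gem_request_alloc()
     *        -> intel_logical_ring_alloc_request_extras()
     *             -> execlists_context_deferred_alloc()  <- first use only
     *
     * execbuffer no longer knows about LRC state; the backing objects
     * are created the first time a request targets this context on an
     * engine.
     */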
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 8 --------
drivers/gpu/drm/i915/intel_lrc.c | 19 +++++++++++++------
drivers/gpu/drm/i915/intel_lrc.h | 2 --
3 files changed, 13 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 6f4f2a6cdf93..e0ee5d1ac372 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1085,14 +1085,6 @@ i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
return ERR_PTR(-EIO);
}
- if (i915.enable_execlists && !ctx->engine[engine->id].state) {
- int ret = intel_lr_context_deferred_alloc(ctx, engine);
- if (ret) {
- DRM_DEBUG("Could not create LRC %u: %d\n", ctx_id, ret);
- return ERR_PTR(ret);
- }
- }
-
return ctx;
}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f74c77cf2b52..4be11f8d0e04 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -228,6 +228,8 @@ enum {
#define GEN8_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x17
#define GEN9_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x26
+static int execlists_context_deferred_alloc(struct intel_context *ctx,
+ struct intel_engine_cs *engine);
static int intel_lr_context_pin(struct intel_context *ctx,
struct intel_engine_cs *engine);
@@ -685,6 +687,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
*/
request->reserved_space += MIN_SPACE_FOR_ADD_REQUEST;
+ if (request->ctx->engine[engine->id].state == NULL) {
+ ret = execlists_context_deferred_alloc(request->ctx, engine);
+ if (ret)
+ return ret;
+ }
+
request->ringbuf = request->ctx->engine[engine->id].ringbuf;
if (i915.enable_guc_submission) {
@@ -2006,7 +2014,7 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine)
if (ret)
goto error;
- ret = intel_lr_context_deferred_alloc(dctx, engine);
+ ret = execlists_context_deferred_alloc(dctx, engine);
if (ret)
goto error;
@@ -2480,9 +2488,9 @@ uint32_t intel_lr_context_size(struct intel_engine_cs *engine)
}
/**
- * intel_lr_context_deferred_alloc() - create the LRC specific bits of a context
+ * execlists_context_deferred_alloc() - create the LRC specific bits of a context
* @ctx: LR context to create.
- * @ring: engine to be used with the context.
+ * @engine: engine to be used with the context.
*
* This function can be called more than once, with different engines, if we plan
* to use the context with them. The context backing objects and the ringbuffers
@@ -2492,9 +2500,8 @@ uint32_t intel_lr_context_size(struct intel_engine_cs *engine)
*
* Return: non-zero on error.
*/
-
-int intel_lr_context_deferred_alloc(struct intel_context *ctx,
- struct intel_engine_cs *engine)
+static int execlists_context_deferred_alloc(struct intel_context *ctx,
+ struct intel_engine_cs *engine)
{
struct drm_device *dev = engine->dev;
struct drm_i915_gem_object *ctx_obj;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 9510826f0d90..28ff324ba5dc 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -101,8 +101,6 @@ static inline void intel_logical_ring_emit_reg(struct intel_ringbuffer *ringbuf,
void intel_lr_context_free(struct intel_context *ctx);
uint32_t intel_lr_context_size(struct intel_engine_cs *engine);
-int intel_lr_context_deferred_alloc(struct intel_context *ctx,
- struct intel_engine_cs *engine);
void intel_lr_context_unpin(struct intel_context *ctx,
struct intel_engine_cs *engine);
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 21/25] drm/i915: Move releasing of the GEM request from free to retire/cancel
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (19 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 20/25] drm/i915: Move the magical deferred context allocation into the request Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 22/25] drm/i915: Track the previous pinned context inside the request Chris Wilson
` (4 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
If we move the release of the GEM request (i.e. decoupling it from the
various lists used for client and context tracking) after it is complete
(either by the GPU retiring the request, or by the caller cancelling the
request), we can remove the requirement that the final unreference of
the GEM request needs to be under the struct_mutex.
The careful reader may notice that one or two impossible NULL pointer
tests are dropped for readability. These pointers cannot be NULL since
they are assigned during request construction and never unset.
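The upshot, sketched with the signatures used in this series (an
illustration, not part of the patch): once retirement has decoupled the
request, any holder may drop its final reference without taking
struct_mutex.

    static void example_release(struct drm_i915_gem_request *req)
    {
    	/* mirrors the mmio_flip worker in the diff below */
    	WARN_ON(__i915_wait_request(req, true, NULL, NULL));
    	i915_gem_request_unreference(req); /* no struct_mutex required */
    }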
v2,v3: Rebalance execlists by moving the context unpinning.
v4: Rebase onto -nightly
v5: Avoid trying to rebalance execlist/GuC context pinning, leave that
to the next step
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 14 --------------
drivers/gpu/drm/i915/i915_gem.c | 23 +++++++++--------------
drivers/gpu/drm/i915/intel_display.c | 2 +-
drivers/gpu/drm/i915/intel_pm.c | 2 +-
4 files changed, 11 insertions(+), 30 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ea7aea044603..0bd9d17ff974 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2373,23 +2373,9 @@ i915_gem_request_reference(struct drm_i915_gem_request *req)
static inline void
i915_gem_request_unreference(struct drm_i915_gem_request *req)
{
- WARN_ON(!mutex_is_locked(&req->engine->dev->struct_mutex));
kref_put(&req->ref, i915_gem_request_free);
}
-static inline void
-i915_gem_request_unreference__unlocked(struct drm_i915_gem_request *req)
-{
- struct drm_device *dev;
-
- if (!req)
- return;
-
- dev = req->engine->dev;
- if (kref_put_mutex(&req->ref, i915_gem_request_free, &dev->struct_mutex))
- mutex_unlock(&dev->struct_mutex);
-}
-
static inline void i915_gem_request_assign(struct drm_i915_gem_request **pdst,
struct drm_i915_gem_request *src)
{
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index df9d43397492..e4c8a8b998d1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1413,6 +1413,13 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
list_del_init(&request->list);
i915_gem_request_remove_from_client(request);
+ if (request->ctx) {
+ if (i915.enable_execlists)
+ intel_lr_context_unpin(request->ctx, request->engine);
+
+ i915_gem_context_unreference(request->ctx);
+ }
+
i915_gem_request_unreference(request);
}
@@ -2713,18 +2720,6 @@ void i915_gem_request_free(struct kref *req_ref)
{
struct drm_i915_gem_request *req = container_of(req_ref,
typeof(*req), ref);
- struct intel_context *ctx = req->ctx;
-
- if (req->file_priv)
- i915_gem_request_remove_from_client(req);
-
- if (ctx) {
- if (i915.enable_execlists)
- intel_lr_context_unpin(ctx, req->engine);
-
- i915_gem_context_unreference(ctx);
- }
-
kmem_cache_free(req->i915->requests, req);
}
@@ -3173,7 +3168,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
ret = __i915_wait_request(req[i], true,
args->timeout_ns > 0 ? &args->timeout_ns : NULL,
to_rps_client(file));
- i915_gem_request_unreference__unlocked(req[i]);
+ i915_gem_request_unreference(req[i]);
}
return ret;
@@ -4199,7 +4194,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
if (ret == 0)
queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
- i915_gem_request_unreference__unlocked(target);
+ i915_gem_request_unreference(target);
return ret;
}
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 87ce7852482b..5692df5b71e5 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11399,7 +11399,7 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
WARN_ON(__i915_wait_request(mmio_flip->req,
false, NULL,
&mmio_flip->i915->rps.mmioflips));
- i915_gem_request_unreference__unlocked(mmio_flip->req);
+ i915_gem_request_unreference(mmio_flip->req);
}
/* For framebuffer backed by dmabuf, wait for fence */
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 695a464a5e64..2422ac38ce5d 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7385,7 +7385,7 @@ static void __intel_rps_boost_work(struct work_struct *work)
gen6_rps_boost(to_i915(req->engine->dev), NULL,
req->emitted_jiffies);
- i915_gem_request_unreference__unlocked(req);
+ i915_gem_request_unreference(req);
kfree(boost);
}
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 22/25] drm/i915: Track the previous pinned context inside the request
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (20 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 21/25] drm/i915: Move releasing of the GEM request from free to retire/cancel Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 23/25] drm/i915: Store LRC hardware id in " Chris Wilson
` (3 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
As the contexts are accessed by the hardware until the switch is completed
to a new context, the hardware may still be writing to the context object
after the breadcrumb is visible. We must not unpin/unbind/prune that
object whilst it is still active, and so we keep the previous context
pinned until the following request. We can generalise the tracking we
already do via engine->last_context and move it to the request so that it works
equally for execlists and GuC.
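A timeline sketch of the pin lifetime this establishes (illustrative
only; ctxA and ctxB are hypothetical contexts on a single engine):

    /*
     * submit(rq1, ctxA): rq1->previous_context = NULL; last_context = ctxA
     * submit(rq2, ctxB): rq2->previous_context = ctxA; last_context = ctxB
     * retire(rq1):       nothing to unpin
     * retire(rq2):       unpin(ctxA) -- by now the HW has switched to ctxB
     */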
v2: Drop the execlists double pin as that exposes a race inside the lrc
irq handler as it tries to access the context after it may be retired.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 11 +++++++++++
drivers/gpu/drm/i915/i915_gem.c | 8 ++++----
drivers/gpu/drm/i915/intel_lrc.c | 14 ++++++++------
3 files changed, 23 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0bd9d17ff974..eb4a95b853dd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2305,6 +2305,17 @@ struct drm_i915_gem_request {
struct intel_context *ctx;
struct intel_ringbuffer *ringbuf;
+ /**
+ * Context related to the previous request.
+ * As the contexts are accessed by the hardware until the switch is
+ * completed to a new context, the hardware may still be writing
+ * to the context object after the breadcrumb is visible. We must
+ * not unpin/unbind/prune that object whilst still active and so
+ * we keep the previous context pinned until the following (this)
+ * request is retired.
+ */
+ struct intel_context *previous_context;
+
/** Batch buffer related to this request if any (used for
error state dump only) */
struct drm_i915_gem_object *batch_obj;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e4c8a8b998d1..fc68af50ed1b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1413,13 +1413,13 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
list_del_init(&request->list);
i915_gem_request_remove_from_client(request);
- if (request->ctx) {
+ if (request->previous_context) {
if (i915.enable_execlists)
- intel_lr_context_unpin(request->ctx, request->engine);
-
- i915_gem_context_unreference(request->ctx);
+ intel_lr_context_unpin(request->previous_context,
+ request->engine);
}
+ i915_gem_context_unreference(request->ctx);
i915_gem_request_unreference(request);
}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4be11f8d0e04..0682a79be02c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -771,12 +771,14 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
if (intel_engine_stopped(engine))
return 0;
- if (engine->last_context != request->ctx) {
- if (engine->last_context)
- intel_lr_context_unpin(engine->last_context, engine);
- intel_lr_context_pin(request->ctx, engine);
- engine->last_context = request->ctx;
- }
+ /* We keep the previous context alive until we retire the following
+ * request. This ensures that the context object is still pinned
+ * for any residual writes the HW makes into it on the context switch
+ * into the next object following the breadcrumb. Otherwise, we may
+ * retire the context too early.
+ */
+ request->previous_context = engine->last_context;
+ engine->last_context = request->ctx;
if (dev_priv->guc.execbuf_client)
i915_guc_submit(dev_priv->guc.execbuf_client, request);
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 23/25] drm/i915: Store LRC hardware id in the request
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (21 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 22/25] drm/i915: Track the previous pinned context inside the request Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 24/25] drm/i915: Stop tracking execlists retired requests Chris Wilson
` (2 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
This way, in the following patch, we can disconnect requests
from contexts.
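The motivation, sketched as a before/after (my gloss rather than the
commit message): the execlists interrupt handler matches CSB completions
by context ID, and with the id snapshotted into the request it no longer
has to chase req->ctx from the interrupt path:

    /*
     * before: head_req->ctx->hw_id   (dereferences the context object
     *                                 in the interrupt handler)
     * after:  head_req->ctx_hw_id    (plain field, copied at queue time
     *                                 in execlists_context_queue())
     */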
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_drv.h | 2 ++
drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index eb4a95b853dd..e2abbcc27f2c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2352,6 +2352,8 @@ struct drm_i915_gem_request {
/** Execlists no. of times this request has been sent to the ELSP */
int elsp_submitted;
+ /** Execlists context hardware id. */
+ unsigned ctx_hw_id;
};
struct drm_i915_gem_request * __must_check
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0682a79be02c..1035ca0e1d8d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -481,7 +481,7 @@ execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
if (!head_req)
return 0;
- if (unlikely(head_req->ctx->hw_id != request_id))
+ if (unlikely(head_req->ctx_hw_id != request_id))
return 0;
WARN(head_req->elsp_submitted == 0, "Never submitted head request\n");
@@ -619,6 +619,7 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
}
list_add_tail(&request->execlist_link, &engine->execlist_queue);
+ request->ctx_hw_id = request->ctx->hw_id;
if (num_elements == 0)
execlists_context_unqueue(engine);
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 24/25] drm/i915: Stop tracking execlists retired requests
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (22 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 23/25] drm/i915: Store LRC hardware id in " Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 25/25] drm/i915: Unify GPU resets upon shutdown Chris Wilson
2016-04-27 7:24 ` ✗ Fi.CI.BAT: failure for series starting with [v6,01/25] drm/i915/fbdev: Call intel_unpin_fb_obj() on release Patchwork
25 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx; +Cc: Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko@ursulin.net>
With the previous patch having extended the pinned lifetime of
contexts by referencing the previous context from the current
request until the latter is retired (completed by the GPU),
we can now remove usage of the execlist retired queue entirely.
This is because the above now guarantees that all execlist
object access requirements are satisfied by this new tracking,
and we can stop taking additional references and stop keeping
requests on the execlists retired queue.
The latter was a source of significant scalability issues in
the driver, causing performance hits on some tests. The most
dramatic of these was igt/gem_close_race, whose run time of
tens of minutes is now reduced to tens of seconds.
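The reference ownership after this patch reduces to a single rule (a
sketch restating the diff's invariant, not new behaviour):

    /*
     * Exactly one reference is held while a request sits on the queue:
     *
     *   queue:   i915_gem_request_reference(request);
     *            list_add_tail(&request->execlist_link,
     *                          &engine->execlist_queue);
     *
     *   remove:  list_del(&req->execlist_link);  <- completion or cancel
     *            i915_gem_request_unreference(req);
     *
     * No separate retired list, and no context pin/unpin in the
     * submission path.
     */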
Signed-off-by: Tvrtko Ursulin <tvrtko@ursulin.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gem.c | 10 +--------
drivers/gpu/drm/i915/intel_lrc.c | 39 ++++++++++++---------------------
drivers/gpu/drm/i915/intel_lrc.h | 2 +-
drivers/gpu/drm/i915/intel_ringbuffer.h | 1 -
4 files changed, 16 insertions(+), 36 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fc68af50ed1b..1174966b3b38 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2871,13 +2871,7 @@ static void i915_gem_reset_engine_cleanup(struct drm_i915_private *dev_priv,
/* Ensure irq handler finishes or is cancelled. */
tasklet_kill(&engine->irq_tasklet);
- spin_lock_bh(&engine->execlist_lock);
- /* list_splice_tail_init checks for empty lists */
- list_splice_tail_init(&engine->execlist_queue,
- &engine->execlist_retired_req_list);
- spin_unlock_bh(&engine->execlist_lock);
-
- intel_execlists_retire_requests(engine);
+ intel_execlists_cancel_requests(engine);
}
/*
@@ -3001,8 +2995,6 @@ i915_gem_retire_requests(struct drm_device *dev)
spin_lock_bh(&engine->execlist_lock);
idle &= list_empty(&engine->execlist_queue);
spin_unlock_bh(&engine->execlist_lock);
-
- intel_execlists_retire_requests(engine);
}
}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 1035ca0e1d8d..bc3257675296 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -435,8 +435,8 @@ static void execlists_context_unqueue(struct intel_engine_cs *engine)
/* Same ctx: ignore first request, as second request
* will update tail past first request's workload */
cursor->elsp_submitted = req0->elsp_submitted;
- list_move_tail(&req0->execlist_link,
- &engine->execlist_retired_req_list);
+ list_del(&req0->execlist_link);
+ i915_gem_request_unreference(req0);
req0 = cursor;
} else {
req1 = cursor;
@@ -468,7 +468,7 @@ static void execlists_context_unqueue(struct intel_engine_cs *engine)
}
static unsigned int
-execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
+execlists_check_remove_request(struct intel_engine_cs *engine, u32 ctx_id)
{
struct drm_i915_gem_request *head_req;
@@ -478,19 +478,16 @@ execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
struct drm_i915_gem_request,
execlist_link);
- if (!head_req)
- return 0;
-
- if (unlikely(head_req->ctx_hw_id != request_id))
- return 0;
+ if (WARN_ON(!head_req || (head_req->ctx_hw_id != ctx_id)))
+ return 0;
WARN(head_req->elsp_submitted == 0, "Never submitted head request\n");
if (--head_req->elsp_submitted > 0)
return 0;
- list_move_tail(&head_req->execlist_link,
- &engine->execlist_retired_req_list);
+ list_del(&head_req->execlist_link);
+ i915_gem_request_unreference(head_req);
return 1;
}
@@ -594,9 +591,6 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
struct drm_i915_gem_request *cursor;
int num_elements = 0;
- intel_lr_context_pin(request->ctx, request->engine);
- i915_gem_request_reference(request);
-
spin_lock_bh(&engine->execlist_lock);
list_for_each_entry(cursor, &engine->execlist_queue, execlist_link)
@@ -613,11 +607,12 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
if (request->ctx == tail_req->ctx) {
WARN(tail_req->elsp_submitted != 0,
"More than 2 already-submitted reqs queued\n");
- list_move_tail(&tail_req->execlist_link,
- &engine->execlist_retired_req_list);
+ list_del(&tail_req->execlist_link);
+ i915_gem_request_unreference(tail_req);
}
}
+ i915_gem_request_reference(request);
list_add_tail(&request->execlist_link, &engine->execlist_queue);
request->ctx_hw_id = request->ctx->hw_id;
if (num_elements == 0)
@@ -883,23 +878,18 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
return 0;
}
-void intel_execlists_retire_requests(struct intel_engine_cs *engine)
+void intel_execlists_cancel_requests(struct intel_engine_cs *engine)
{
struct drm_i915_gem_request *req, *tmp;
- struct list_head retired_list;
+ LIST_HEAD(cancel_list);
WARN_ON(!mutex_is_locked(&engine->dev->struct_mutex));
- if (list_empty(&engine->execlist_retired_req_list))
- return;
- INIT_LIST_HEAD(&retired_list);
spin_lock_bh(&engine->execlist_lock);
- list_replace_init(&engine->execlist_retired_req_list, &retired_list);
+ list_replace_init(&engine->execlist_queue, &cancel_list);
spin_unlock_bh(&engine->execlist_lock);
- list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
- intel_lr_context_unpin(req->ctx, engine);
-
+ list_for_each_entry_safe(req, tmp, &cancel_list, execlist_link) {
list_del(&req->execlist_link);
i915_gem_request_unreference(req);
}
@@ -1991,7 +1981,6 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine)
INIT_LIST_HEAD(&engine->buffers);
INIT_LIST_HEAD(&engine->execlist_queue);
- INIT_LIST_HEAD(&engine->execlist_retired_req_list);
spin_lock_init(&engine->execlist_lock);
tasklet_init(&engine->irq_tasklet,
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 28ff324ba5dc..229b8a974262 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -118,6 +118,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
struct drm_i915_gem_execbuffer2 *args,
struct list_head *vmas);
-void intel_execlists_retire_requests(struct intel_engine_cs *engine);
+void intel_execlists_cancel_requests(struct intel_engine_cs *engine);
#endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 038914ccc6fd..7023e88531b5 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -266,7 +266,6 @@ struct intel_engine_cs {
struct tasklet_struct irq_tasklet;
spinlock_t execlist_lock; /* used inside tasklet, use spin_lock_bh */
struct list_head execlist_queue;
- struct list_head execlist_retired_req_list;
unsigned int fw_domains;
unsigned int next_context_status_buffer;
unsigned int idle_lite_restore_wa;
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v6 25/25] drm/i915: Unify GPU resets upon shutdown
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (23 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 24/25] drm/i915: Stop tracking execlists retired requests Chris Wilson
@ 2016-04-26 20:06 ` Chris Wilson
2016-04-27 7:24 ` ✗ Fi.CI.BAT: failure for series starting with [v6,01/25] drm/i915/fbdev: Call intel_unpin_fb_obj() on release Patchwork
25 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-26 20:06 UTC (permalink / raw)
To: intel-gfx; +Cc: Mika Kuoppala
Both execlists and legacy need to reset the context (and mode) of the
GPU before we lose control of the system. By resetting the GPU, we
revert to default settings. This simplifies the life of any
subsequent driver (in particular for virtualized setups) as it does not
then have to try and recover from an unknown condition. As both paths
need to reset for the same reason, move the reset to a common point.
This unifies the resets added in a647828afc (drm/i915: Also perform gpu
reset under execlist mode) and 8e96d9c4d9 (drm/i915: reset the GPU on
context fini).
v2: Restrict the reset to "modern" gen (where we enable HW contexts) to
try to avoid leaving the machine in an unusable state with a risky
reset on older GPUs. This should keep the status quo as to who performs
resets (i.e. currently only GPUs with HW contexts perform a reset on
shutdown).
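The ordering the common path enforces, condensed (a sketch of
i915_gem_fini() from the diff; the HAS_HW_CONTEXTS() guard implements
the v2 restriction):

    /*
     * if (HAS_HW_CONTEXTS(dev))           <- only gens with HW contexts
     *         intel_gpu_reset(dev, ALL_ENGINES);
     * i915_gem_reset(dev);                <- flush off the stale requests
     * i915_gem_cleanup_engines(dev);
     * i915_gem_context_fini(dev);         <- no longer resets by itself
     */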
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
CC: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: "Niu, Bing" <bing.niu@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
drivers/gpu/drm/i915/i915_dma.c | 45 +++++++++++++++++++++++++++------
drivers/gpu/drm/i915/i915_gem.c | 8 ------
drivers/gpu/drm/i915/i915_gem_context.c | 10 +-------
3 files changed, 38 insertions(+), 25 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 5c7615041b31..d70f028afadb 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -425,6 +425,41 @@ static const struct vga_switcheroo_client_ops i915_switcheroo_ops = {
.can_switch = i915_switcheroo_can_switch,
};
+static void i915_gem_fini(struct drm_device *dev)
+{
+ /*
+ * Neither the BIOS, ourselves or any other kernel
+ * expects the system to be in execlists mode on startup,
+ * so we need to reset the GPU back to legacy mode. And the only
+ * known way to disable logical contexts is through a GPU reset.
+ *
+ * So in order to leave the system in a known default configuration,
+ * always reset the GPU upon unload. Afterwards we then clean up the
+ * GEM state tracking, flushing off the requests and leaving the
+ * system in a known idle state.
+ *
+ * Note that it is of the utmost importance that the GPU is idle and
+ * all stray writes are flushed *before* we dismantle the backing
+ * storage for the pinned objects.
+ *
+ * However, since we are uncertain that resetting the GPU on older
+ * machines is a good idea, we don't - just in case it leaves the
+ * machine in an unusable condition.
+ */
+ if (HAS_HW_CONTEXTS(dev)) {
+ int reset = intel_gpu_reset(dev, ALL_ENGINES);
+ WARN_ON(reset && reset != -ENODEV);
+ }
+
+ mutex_lock(&dev->struct_mutex);
+ i915_gem_reset(dev);
+ i915_gem_cleanup_engines(dev);
+ i915_gem_context_fini(dev);
+ mutex_unlock(&dev->struct_mutex);
+
+ WARN_ON(!list_empty(&to_i915(dev)->context_list));
+}
+
static int i915_load_modeset_init(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
@@ -506,10 +541,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
return 0;
cleanup_gem:
- mutex_lock(&dev->struct_mutex);
- i915_gem_cleanup_engines(dev);
- i915_gem_context_fini(dev);
- mutex_unlock(&dev->struct_mutex);
+ i915_gem_fini(dev);
cleanup_irq:
intel_guc_ucode_fini(dev);
drm_irq_uninstall(dev);
@@ -1456,10 +1488,7 @@ int i915_driver_unload(struct drm_device *dev)
flush_workqueue(dev_priv->wq);
intel_guc_ucode_fini(dev);
- mutex_lock(&dev->struct_mutex);
- i915_gem_cleanup_engines(dev);
- i915_gem_context_fini(dev);
- mutex_unlock(&dev->struct_mutex);
+ i915_gem_fini(dev);
intel_fbc_cleanup_cfb(dev_priv);
intel_power_domains_fini(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1174966b3b38..fb5c46dd0219 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4961,14 +4961,6 @@ i915_gem_cleanup_engines(struct drm_device *dev)
for_each_engine(engine, dev_priv)
dev_priv->gt.cleanup_engine(engine);
-
- if (i915.enable_execlists)
- /*
- * Neither the BIOS, ourselves or any other kernel
- * expects the system to be in execlists mode on startup,
- * so we need to reset the GPU back to legacy mode.
- */
- intel_gpu_reset(dev, ALL_ENGINES);
}
static void
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 9477af0fcb33..cd39aba09137 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -450,16 +450,8 @@ void i915_gem_context_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_context *dctx = dev_priv->kernel_context;
- i915_gem_context_lost(dev_priv);
-
- if (dctx->legacy_hw_ctx.rcs_state) {
- /* The only known way to stop the gpu from accessing the hw context is
- * to reset it. Do this as the very last operation to avoid confusing
- * other code, leading to spurious errors. */
- intel_gpu_reset(dev, ALL_ENGINES);
-
+ if (dctx->legacy_hw_ctx.rcs_state)
i915_gem_object_ggtt_unpin(dctx->legacy_hw_ctx.rcs_state);
- }
i915_gem_context_unreference(dctx);
dev_priv->kernel_context = NULL;
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 40+ messages in thread
* ✗ Fi.CI.BAT: failure for series starting with [v6,01/25] drm/i915/fbdev: Call intel_unpin_fb_obj() on release
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
` (24 preceding siblings ...)
2016-04-26 20:06 ` [PATCH v6 25/25] drm/i915: Unify GPU resets upon shutdown Chris Wilson
@ 2016-04-27 7:24 ` Patchwork
25 siblings, 0 replies; 40+ messages in thread
From: Patchwork @ 2016-04-27 7:24 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [v6,01/25] drm/i915/fbdev: Call intel_unpin_fb_obj() on release
URL : https://patchwork.freedesktop.org/series/6363/
State : failure
== Summary ==
Series 6363v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/6363/revisions/1/mbox/
Test drv_hangman:
Subgroup error-state-basic:
pass -> INCOMPLETE (ivb-t430s)
Test gem_basic:
Subgroup create-close:
pass -> INCOMPLETE (ivb-t430s)
Test gem_busy:
Subgroup basic-bsd1:
skip -> INCOMPLETE (ivb-t430s)
Subgroup basic-bsd2:
skip -> INCOMPLETE (ivb-t430s)
Subgroup basic-parallel-vebox:
skip -> INCOMPLETE (ivb-t430s)
Subgroup basic-render:
pass -> INCOMPLETE (ivb-t430s)
Subgroup basic-vebox:
skip -> INCOMPLETE (ivb-t430s)
Test gem_exec_basic:
Subgroup basic-blt:
pass -> INCOMPLETE (ivb-t430s)
Subgroup basic-bsd:
pass -> INCOMPLETE (ivb-t430s)
Subgroup basic-render:
pass -> INCOMPLETE (ivb-t430s)
Subgroup gtt-vebox:
skip -> INCOMPLETE (ivb-t430s)
Test gem_exec_create:
Subgroup basic:
pass -> INCOMPLETE (ivb-t430s)
Test gem_exec_nop:
Subgroup basic:
pass -> INCOMPLETE (ivb-t430s)
Test gem_exec_store:
Subgroup basic-blt:
pass -> INCOMPLETE (ivb-t430s)
Test gem_exec_suspend:
Subgroup basic-s4:
skip -> INCOMPLETE (ivb-t430s)
Test gem_exec_whisper:
Subgroup basic:
pass -> INCOMPLETE (ivb-t430s)
Test gem_flink_basic:
Subgroup bad-open:
pass -> INCOMPLETE (ivb-t430s)
Subgroup double-flink:
pass -> INCOMPLETE (ivb-t430s)
Test gem_mmap:
Subgroup basic:
pass -> INCOMPLETE (ivb-t430s)
Test gem_mmap_gtt:
Subgroup basic-read:
pass -> INCOMPLETE (ivb-t430s)
Subgroup basic-short:
pass -> INCOMPLETE (ivb-t430s)
Subgroup basic-write:
pass -> INCOMPLETE (ivb-t430s)
Test gem_pwrite:
Subgroup basic:
pass -> INCOMPLETE (ivb-t430s)
Test gem_ringfill:
Subgroup basic-default-forked:
pass -> INCOMPLETE (ivb-t430s)
Test gem_storedw_loop:
Subgroup basic-blt:
pass -> INCOMPLETE (ivb-t430s)
Subgroup basic-render:
pass -> INCOMPLETE (ivb-t430s)
Test gem_sync:
Subgroup basic-bsd1:
skip -> INCOMPLETE (ivb-t430s)
Subgroup basic-bsd2:
skip -> INCOMPLETE (ivb-t430s)
Subgroup basic-each:
pass -> INCOMPLETE (ivb-t430s)
Subgroup basic-vebox:
skip -> INCOMPLETE (ivb-t430s)
Test gem_tiled_pread_basic:
pass -> INCOMPLETE (ivb-t430s)
Test kms_addfb_basic:
Subgroup addfb25-framebuffer-vs-set-tiling:
pass -> INCOMPLETE (ivb-t430s)
Subgroup addfb25-x-tiled:
pass -> INCOMPLETE (ivb-t430s)
Subgroup addfb25-y-tiled:
pass -> INCOMPLETE (ivb-t430s)
Subgroup addfb25-yf-tiled:
pass -> INCOMPLETE (ivb-t430s)
Subgroup basic:
pass -> INCOMPLETE (ivb-t430s)
Subgroup framebuffer-vs-set-tiling:
pass -> INCOMPLETE (ivb-t430s)
Subgroup no-handle:
pass -> INCOMPLETE (ivb-t430s)
Subgroup tile-pitch-mismatch:
pass -> INCOMPLETE (ivb-t430s)
Subgroup too-high:
pass -> INCOMPLETE (ivb-t430s)
Subgroup unused-handle:
pass -> INCOMPLETE (ivb-t430s)
Test kms_flip:
Subgroup basic-plain-flip:
pass -> INCOMPLETE (ivb-t430s)
Test kms_force_connector_basic:
Subgroup force-connector-state:
pass -> INCOMPLETE (ivb-t430s)
Test kms_pipe_crc_basic:
Subgroup bad-nb-words-3:
pass -> INCOMPLETE (ivb-t430s)
Subgroup bad-pipe:
pass -> INCOMPLETE (ivb-t430s)
Subgroup hang-read-crc-pipe-b:
incomplete -> PASS (snb-dellxps)
Subgroup nonblocking-crc-pipe-a:
pass -> INCOMPLETE (ivb-t430s)
Subgroup nonblocking-crc-pipe-c:
pass -> TIMEOUT (ivb-t430s)
Subgroup nonblocking-crc-pipe-c-frame-sequence:
pass -> INCOMPLETE (ivb-t430s)
Subgroup read-crc-pipe-b:
pass -> INCOMPLETE (ivb-t430s)
Subgroup read-crc-pipe-c-frame-sequence:
pass -> INCOMPLETE (ivb-t430s)
Test pm_rpm:
Subgroup basic-rte:
skip -> INCOMPLETE (ivb-t430s)
Test prime_self_import:
Subgroup basic-with_one_bo:
pass -> INCOMPLETE (ivb-t430s)
bdw-nuci7 total:200 pass:188 dwarn:0 dfail:0 fail:0 skip:12
bdw-ultra total:200 pass:175 dwarn:0 dfail:0 fail:0 skip:25
bsw-nuc-2 total:199 pass:158 dwarn:0 dfail:0 fail:0 skip:41
byt-nuc total:199 pass:158 dwarn:0 dfail:0 fail:0 skip:41
hsw-brixbox total:200 pass:174 dwarn:0 dfail:0 fail:0 skip:26
ivb-t430s total:200 pass:127 dwarn:0 dfail:0 fail:0 skip:21
skl-i7k-2 total:200 pass:173 dwarn:0 dfail:0 fail:0 skip:27
skl-nuci5 total:200 pass:189 dwarn:0 dfail:0 fail:0 skip:11
snb-dellxps total:200 pass:158 dwarn:0 dfail:0 fail:0 skip:42
snb-x220t total:200 pass:158 dwarn:0 dfail:0 fail:1 skip:41
Results at /archive/results/CI_IGT_test/Patchwork_2082/
e005db1cb2c60d18abe837ac683d8993ea77b239 drm-intel-nightly: 2016y-04m-26d-12h-51m-57s UTC integration manifest
ae0fd2e drm/i915: Unify GPU resets upon shutdown
78050a3 drm/i915: Stop tracking execlists retired requests
d95dca4 drm/i915: Store LRC hardware id in the request
2fac2e2 drm/i915: Track the previous pinned context inside the request
3182315 drm/i915: Move releasing of the GEM request from free to retire/cancel
5a010fd drm/i915: Move the magical deferred context allocation into the request
4e2052e drm/i915: Refactor execlists default context pinning
d4f0abc drm/i915: Replace the pinned context address with its unique ID
fcf34fe drm/i915: Assign every HW context a unique ID
a7abeb5 drm/i915: Update execlists context descriptor format commentary
00d5807 drm/i915: Preallocate enough space for the average request
dbc33f6 drm/i915: Manually unwind after a failed request allocation
f22565a drm/i915: Remove the identical implementations of request space reservation
e598aff drm/i915: Unify intel_ring_begin()
5ee18ae drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
8bcdb1d drm/i915: Remove early l3-remap
9b422d7 drm/i915: Consolidate L3 remapping LRI
97955a5 drm/i915: L3 cache remapping is part of context switching
b380a51 drm/i915: Mark the current context as lost on suspend
2bfedd9 drm/i915: Use i915_vma_pin_iomap on the ringbuffer object
6498bd5 drm/i915: Move ioremap_wc tracking onto VMA
32de60f drm/i915: Introduce i915_vm_to_ggtt()
8b2f9dc io-mapping: Specify mapping size for io_mapping_map_wc()
b8d40fd drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
d50d83e drm/i915/fbdev: Call intel_unpin_fb_obj() on release
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v6 12/25] drm/i915: Unify intel_ring_begin()
2016-04-26 20:06 ` [PATCH v6 12/25] drm/i915: Unify intel_ring_begin() Chris Wilson
@ 2016-04-27 9:21 ` Joonas Lahtinen
2016-04-27 9:37 ` Chris Wilson
0 siblings, 1 reply; 40+ messages in thread
From: Joonas Lahtinen @ 2016-04-27 9:21 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On ti, 2016-04-26 at 21:06 +0100, Chris Wilson wrote:
> Combine the near identical implementations of intel_logical_ring_begin()
> and intel_ring_begin() - the only difference is that the logical wait
> has to check for a matching ring (which is assumed by legacy).
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 146 ++-----------------------
> drivers/gpu/drm/i915/intel_lrc.h | 1 -
> drivers/gpu/drm/i915/intel_mocs.c | 12 +--
> drivers/gpu/drm/i915/intel_ringbuffer.c | 182 ++++++++++++--------------------
> drivers/gpu/drm/i915/intel_ringbuffer.h | 3 -
> 5 files changed, 81 insertions(+), 263 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 2b7e6bbc017b..ba87c94928e7 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -721,48 +721,6 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
> return ret;
> }
>
> -static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
> - int bytes)
> -{
> - struct intel_ringbuffer *ringbuf = req->ringbuf;
> - struct intel_engine_cs *engine = req->engine;
> - struct drm_i915_gem_request *target;
> - unsigned space;
> - int ret;
> -
> - if (intel_ring_space(ringbuf) >= bytes)
> - return 0;
> -
> - /* The whole point of reserving space is to not wait! */
> - WARN_ON(ringbuf->reserved_in_use);
> -
> - list_for_each_entry(target, &engine->request_list, list) {
> - /*
> - * The request queue is per-engine, so can contain requests
> - * from multiple ringbuffers. Here, we must ignore any that
> - * aren't from the ringbuffer we're considering.
> - */
> - if (target->ringbuf != ringbuf)
> - continue;
> -
> - /* Would completion of this request free enough space? */
> - space = __intel_ring_space(target->postfix, ringbuf->tail,
> - ringbuf->size);
> - if (space >= bytes)
> - break;
> - }
> -
> - if (WARN_ON(&target->list == &engine->request_list))
> - return -ENOSPC;
> -
> - ret = i915_wait_request(target);
> - if (ret)
> - return ret;
> -
> - ringbuf->space = space;
> - return 0;
> -}
> -
> /*
> * intel_logical_ring_advance_and_submit() - advance the tail and submit the workload
> * @request: Request to advance the logical ringbuffer of.
> @@ -814,92 +772,6 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
> return 0;
> }
>
> -static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
> -{
> - uint32_t __iomem *virt;
> - int rem = ringbuf->size - ringbuf->tail;
> -
> - virt = ringbuf->virtual_start + ringbuf->tail;
> - rem /= 4;
> - while (rem--)
> - iowrite32(MI_NOOP, virt++);
> -
> - ringbuf->tail = 0;
> - intel_ring_update_space(ringbuf);
> -}
> -
> -static int logical_ring_prepare(struct drm_i915_gem_request *req, int bytes)
> -{
> - struct intel_ringbuffer *ringbuf = req->ringbuf;
> - int remain_usable = ringbuf->effective_size - ringbuf->tail;
> - int remain_actual = ringbuf->size - ringbuf->tail;
> - int ret, total_bytes, wait_bytes = 0;
> - bool need_wrap = false;
> -
> - if (ringbuf->reserved_in_use)
> - total_bytes = bytes;
> - else
> - total_bytes = bytes + ringbuf->reserved_size;
> -
> - if (unlikely(bytes > remain_usable)) {
> - /*
> - * Not enough space for the basic request. So need to flush
> - * out the remainder and then wait for base + reserved.
> - */
> - wait_bytes = remain_actual + total_bytes;
> - need_wrap = true;
> - } else {
> - if (unlikely(total_bytes > remain_usable)) {
> - /*
> - * The base request will fit but the reserved space
> - * falls off the end. So don't need an immediate wrap
> - * and only need to effectively wait for the reserved
> - * size space from the start of ringbuffer.
> - */
> - wait_bytes = remain_actual + ringbuf->reserved_size;
> - } else if (total_bytes > ringbuf->space) {
> - /* No wrapping required, just waiting. */
> - wait_bytes = total_bytes;
> - }
> - }
> -
> - if (wait_bytes) {
> - ret = logical_ring_wait_for_space(req, wait_bytes);
> - if (unlikely(ret))
> - return ret;
> -
> - if (need_wrap)
> - __wrap_ring_buffer(ringbuf);
> - }
> -
> - return 0;
> -}
> -
> -/**
> - * intel_logical_ring_begin() - prepare the logical ringbuffer to accept some commands
> - *
> - * @req: The request to start some new work for
> - * @num_dwords: number of DWORDs that we plan to write to the ringbuffer.
> - *
> - * The ringbuffer might not be ready to accept the commands right away (maybe it needs to
> - * be wrapped, or wait a bit for the tail to be updated). This function takes care of that
> - * and also preallocates a request (every workload submission is still mediated through
> - * requests, same as it did with legacy ringbuffer submission).
> - *
> - * Return: non-zero if the ringbuffer is not ready to be written to.
> - */
> -int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
> -{
> - int ret;
> -
> - ret = logical_ring_prepare(req, num_dwords * sizeof(uint32_t));
> - if (ret)
> - return ret;
> -
> - req->ringbuf->space -= num_dwords * sizeof(uint32_t);
> - return 0;
> -}
> -
> int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
> {
> /*
> @@ -912,7 +784,7 @@ int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
> */
> intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
>
> - return intel_logical_ring_begin(request, 0);
> + return intel_ring_begin(request, 0);
> }
>
> /**
> @@ -982,7 +854,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>
> if (engine == &dev_priv->engine[RCS] &&
> instp_mode != dev_priv->relative_constants_mode) {
> - ret = intel_logical_ring_begin(params->request, 4);
> + ret = intel_ring_begin(params->request, 4);
> if (ret)
> return ret;
>
> @@ -1178,7 +1050,7 @@ static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
> if (ret)
> return ret;
>
> - ret = intel_logical_ring_begin(req, w->count * 2 + 2);
> + ret = intel_ring_begin(req, w->count * 2 + 2);
> if (ret)
> return ret;
>
> @@ -1669,7 +1541,7 @@ static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
> const int num_lri_cmds = GEN8_LEGACY_PDPES * 2;
> int i, ret;
>
> - ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
> + ret = intel_ring_begin(req, num_lri_cmds * 2 + 2);
> if (ret)
> return ret;
>
> @@ -1716,7 +1588,7 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
> req->ctx->ppgtt->pd_dirty_rings &= ~intel_engine_flag(req->engine);
> }
>
> - ret = intel_logical_ring_begin(req, 4);
> + ret = intel_ring_begin(req, 4);
> if (ret)
> return ret;
>
> @@ -1778,7 +1650,7 @@ static int gen8_emit_flush(struct drm_i915_gem_request *request,
> uint32_t cmd;
> int ret;
>
> - ret = intel_logical_ring_begin(request, 4);
> + ret = intel_ring_begin(request, 4);
> if (ret)
> return ret;
>
> @@ -1846,7 +1718,7 @@ static int gen8_emit_flush_render(struct drm_i915_gem_request *request,
> vf_flush_wa = true;
> }
>
> - ret = intel_logical_ring_begin(request, vf_flush_wa ? 12 : 6);
> + ret = intel_ring_begin(request, vf_flush_wa ? 12 : 6);
> if (ret)
> return ret;
>
> @@ -1920,7 +1792,7 @@ static int gen8_emit_request(struct drm_i915_gem_request *request)
> struct intel_ringbuffer *ringbuf = request->ringbuf;
> int ret;
>
> - ret = intel_logical_ring_begin(request, 6 + WA_TAIL_DWORDS);
> + ret = intel_ring_begin(request, 6 + WA_TAIL_DWORDS);
> if (ret)
> return ret;
>
> @@ -1944,7 +1816,7 @@ static int gen8_emit_request_render(struct drm_i915_gem_request *request)
> struct intel_ringbuffer *ringbuf = request->ringbuf;
> int ret;
>
> - ret = intel_logical_ring_begin(request, 8 + WA_TAIL_DWORDS);
> + ret = intel_ring_begin(request, 8 + WA_TAIL_DWORDS);
> if (ret)
> return ret;
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index 461f1ef9b5c1..60a7385bc531 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -63,7 +63,6 @@ int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request);
> void intel_logical_ring_stop(struct intel_engine_cs *engine);
> void intel_logical_ring_cleanup(struct intel_engine_cs *engine);
> int intel_logical_rings_init(struct drm_device *dev);
> -int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords);
>
> int logical_ring_flush_all_caches(struct drm_i915_gem_request *req);
> /**
> diff --git a/drivers/gpu/drm/i915/intel_mocs.c b/drivers/gpu/drm/i915/intel_mocs.c
> index 23b8545ad6b0..6ba4bf7f2a89 100644
> --- a/drivers/gpu/drm/i915/intel_mocs.c
> +++ b/drivers/gpu/drm/i915/intel_mocs.c
> @@ -239,11 +239,9 @@ static int emit_mocs_control_table(struct drm_i915_gem_request *req,
> if (WARN_ON(table->size > GEN9_NUM_MOCS_ENTRIES))
> return -ENODEV;
>
> - ret = intel_logical_ring_begin(req, 2 + 2 * GEN9_NUM_MOCS_ENTRIES);
> - if (ret) {
> - DRM_DEBUG("intel_logical_ring_begin failed %d\n", ret);
These debugs disappear in silence. Could be worth a note in the commit
message.
> + ret = intel_ring_begin(req, 2 + 2 * GEN9_NUM_MOCS_ENTRIES);
> + if (ret)
> return ret;
> - }
>
> intel_logical_ring_emit(ringbuf,
> MI_LOAD_REGISTER_IMM(GEN9_NUM_MOCS_ENTRIES));
> @@ -305,11 +303,9 @@ static int emit_mocs_l3cc_table(struct drm_i915_gem_request *req,
> if (WARN_ON(table->size > GEN9_NUM_MOCS_ENTRIES))
> return -ENODEV;
>
> - ret = intel_logical_ring_begin(req, 2 + GEN9_NUM_MOCS_ENTRIES);
> - if (ret) {
> - DRM_DEBUG("intel_logical_ring_begin failed %d\n", ret);
This too.
> + ret = intel_ring_begin(req, 2 + GEN9_NUM_MOCS_ENTRIES);
> + if (ret)
> return ret;
> - }
>
> intel_logical_ring_emit(ringbuf,
> MI_LOAD_REGISTER_IMM(GEN9_NUM_MOCS_ENTRIES / 2));
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 61c120aed11e..1d3d2ea3e9bc 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -53,12 +53,6 @@ void intel_ring_update_space(struct intel_ringbuffer *ringbuf)
> ringbuf->tail, ringbuf->size);
> }
>
> -int intel_ring_space(struct intel_ringbuffer *ringbuf)
> -{
> - intel_ring_update_space(ringbuf);
> - return ringbuf->space;
> -}
> -
> bool intel_engine_stopped(struct intel_engine_cs *engine)
> {
> struct drm_i915_private *dev_priv = engine->dev->dev_private;
> @@ -2318,51 +2312,6 @@ void intel_cleanup_engine(struct intel_engine_cs *engine)
> engine->dev = NULL;
> }
>
> -static int ring_wait_for_space(struct intel_engine_cs *engine, int n)
> -{
> - struct intel_ringbuffer *ringbuf = engine->buffer;
> - struct drm_i915_gem_request *request;
> - unsigned space;
> - int ret;
> -
> - if (intel_ring_space(ringbuf) >= n)
> - return 0;
> -
> - /* The whole point of reserving space is to not wait! */
> - WARN_ON(ringbuf->reserved_in_use);
> -
> - list_for_each_entry(request, &engine->request_list, list) {
> - space = __intel_ring_space(request->postfix, ringbuf->tail,
> - ringbuf->size);
> - if (space >= n)
> - break;
> - }
> -
> - if (WARN_ON(&request->list == &engine->request_list))
> - return -ENOSPC;
> -
> - ret = i915_wait_request(request);
> - if (ret)
> - return ret;
> -
> - ringbuf->space = space;
> - return 0;
> -}
> -
> -static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
> -{
> - uint32_t __iomem *virt;
> - int rem = ringbuf->size - ringbuf->tail;
> -
> - virt = ringbuf->virtual_start + ringbuf->tail;
> - rem /= 4;
> - while (rem--)
> - iowrite32(MI_NOOP, virt++);
> -
> - ringbuf->tail = 0;
> - intel_ring_update_space(ringbuf);
> -}
> -
> int intel_engine_idle(struct intel_engine_cs *engine)
> {
> struct drm_i915_gem_request *req;
> @@ -2404,63 +2353,74 @@ int intel_ring_reserve_space(struct drm_i915_gem_request *request)
>
> void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
> {
> - WARN_ON(ringbuf->reserved_size);
> - WARN_ON(ringbuf->reserved_in_use);
> -
> + GEM_BUG_ON(ringbuf->reserved_size);
> ringbuf->reserved_size = size;
> }
>
> void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
> {
> - WARN_ON(ringbuf->reserved_in_use);
> -
> + GEM_BUG_ON(!ringbuf->reserved_size);
> ringbuf->reserved_size = 0;
> - ringbuf->reserved_in_use = false;
> }
>
> void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
> {
> - WARN_ON(ringbuf->reserved_in_use);
> -
> - ringbuf->reserved_in_use = true;
> - ringbuf->reserved_tail = ringbuf->tail;
> + GEM_BUG_ON(!ringbuf->reserved_size);
> + ringbuf->reserved_size = 0;
> }
>
Above two functions are now the same, I'd remove the _cancel variant.
> void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
> {
> - WARN_ON(!ringbuf->reserved_in_use);
> - if (ringbuf->tail > ringbuf->reserved_tail) {
> - WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
> - "request reserved size too small: %d vs %d!\n",
> - ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
> - } else {
> + GEM_BUG_ON(ringbuf->reserved_size);
> +}
> +
> +static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> +{
> + struct intel_ringbuffer *ringbuf = req->ringbuf;
> + struct intel_engine_cs *engine = req->engine;
> + struct drm_i915_gem_request *target;
> +
> + intel_ring_update_space(ringbuf);
> + if (ringbuf->space >= bytes)
> + return 0;
> +
> + /* The whole point of reserving space is to not wait! */
> + GEM_BUG_ON(!ringbuf->reserved_size);
reserved_size = 0 would indicate we're at __i915_add_request, but the
comment and test are not the clearest. reserved_size being zero does not
directly indicate "aha, reserved bytes are being used"; it could very
well be that no reserved_size was requested. But right.
> +
> + list_for_each_entry(target, &engine->request_list, list) {
> + unsigned space;
> +
> /*
> - * The ring was wrapped while the reserved space was in use.
> - * That means that some unknown amount of the ring tail was
> - * no-op filled and skipped. Thus simply adding the ring size
> - * to the tail and doing the above space check will not work.
> - * Rather than attempt to track how much tail was skipped,
> - * it is much simpler to say that also skipping the sanity
> - * check every once in a while is not a big issue.
> + * The request queue is per-engine, so can contain requests
> + * from multiple ringbuffers. Here, we must ignore any that
> + * aren't from the ringbuffer we're considering.
> */
> + if (target->ringbuf != ringbuf)
> + continue;
> +
> + /* Would completion of this request free enough space? */
> + space = __intel_ring_space(target->postfix, ringbuf->tail,
> + ringbuf->size);
> + if (space >= bytes)
> + break;
> }
>
> - ringbuf->reserved_size = 0;
> - ringbuf->reserved_in_use = false;
> + if (WARN_ON(&target->list == &engine->request_list))
> + return -ENOSPC;
> +
> + return i915_wait_request(target);
> }
>
> -static int __intel_ring_prepare(struct intel_engine_cs *engine, int bytes)
> +int intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
> {
> - struct intel_ringbuffer *ringbuf = engine->buffer;
> - int remain_usable = ringbuf->effective_size - ringbuf->tail;
> + struct intel_ringbuffer *ringbuf = req->ringbuf;
> int remain_actual = ringbuf->size - ringbuf->tail;
> - int ret, total_bytes, wait_bytes = 0;
> + int remain_usable = ringbuf->effective_size - ringbuf->tail;
> + int bytes = num_dwords * sizeof(u32);
> + int total_bytes, wait_bytes;
> bool need_wrap = false;
>
> - if (ringbuf->reserved_in_use)
> - total_bytes = bytes;
> - else
> - total_bytes = bytes + ringbuf->reserved_size;
> + total_bytes = bytes + ringbuf->reserved_size;
>
> if (unlikely(bytes > remain_usable)) {
> /*
> @@ -2469,44 +2429,38 @@ static int __intel_ring_prepare(struct intel_engine_cs *engine, int bytes)
> */
> wait_bytes = remain_actual + total_bytes;
> need_wrap = true;
> - } else {
> - if (unlikely(total_bytes > remain_usable)) {
> - /*
> - * The base request will fit but the reserved space
> - * falls off the end. So don't need an immediate wrap
> - * and only need to effectively wait for the reserved
> - * size space from the start of ringbuffer.
> - */
> - wait_bytes = remain_actual + ringbuf->reserved_size;
> - } else if (total_bytes > ringbuf->space) {
> - /* No wrapping required, just waiting. */
> - wait_bytes = total_bytes;
> - }
> - }
> + } else if (unlikely(total_bytes > remain_usable)) {
> + /*
> + * The base request will fit but the reserved space
> + * falls off the end. So we don't need an immediate wrap
> + * and only need to effectively wait for the reserved
> + * size space from the start of ringbuffer.
> + */
> + wait_bytes = remain_actual + ringbuf->reserved_size;
> + } else
Should have braces here in the final else.
> + /* No wrapping required, just waiting. */
> + wait_bytes = total_bytes;
>
> - if (wait_bytes) {
> - ret = ring_wait_for_space(engine, wait_bytes);
> + if (wait_bytes > ringbuf->space) {
> + int ret = wait_for_space(req, wait_bytes);
> if (unlikely(ret))
> return ret;
>
> - if (need_wrap)
> - __wrap_ring_buffer(ringbuf);
> + intel_ring_update_space(ringbuf);
> }
>
> - return 0;
> -}
> + if (unlikely(need_wrap)) {
> + GEM_BUG_ON(remain_actual > ringbuf->space);
> + GEM_BUG_ON(ringbuf->tail + remain_actual > ringbuf->size);
>
> -int intel_ring_begin(struct drm_i915_gem_request *req,
> - int num_dwords)
> -{
> - struct intel_engine_cs *engine = req->engine;
> - int ret;
> -
> - ret = __intel_ring_prepare(engine, num_dwords * sizeof(uint32_t));
> - if (ret)
> - return ret;
> + memset(ringbuf->virtual_start + ringbuf->tail,
> + 0, remain_actual);
I'd like to see MI_NOOP or at least a comment that this is not a casual
clear to 0.
With the above addressed,
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> + ringbuf->tail = 0;
> + ringbuf->space -= remain_actual;
> + }
>
> - engine->buffer->space -= num_dwords * sizeof(uint32_t);
> + ringbuf->space -= bytes;
> + GEM_BUG_ON(ringbuf->space < 0);
> return 0;
> }
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 2ade194bbea9..201e7752d765 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -108,8 +108,6 @@ struct intel_ringbuffer {
> int size;
> int effective_size;
> int reserved_size;
> - int reserved_tail;
> - bool reserved_in_use;
>
> /** We track the position of the requests in the ring buffer, and
> * when each is retired we increment last_retired_head as the GPU
> @@ -459,7 +457,6 @@ static inline void intel_ring_advance(struct intel_engine_cs *engine)
> }
> int __intel_ring_space(int head, int tail, int size);
> void intel_ring_update_space(struct intel_ringbuffer *ringbuf);
> -int intel_ring_space(struct intel_ringbuffer *ringbuf);
> bool intel_engine_stopped(struct intel_engine_cs *engine);
>
> int __must_check intel_engine_idle(struct intel_engine_cs *engine);
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
* Re: [PATCH v6 12/25] drm/i915: Unify intel_ring_begin()
2016-04-27 9:21 ` Joonas Lahtinen
@ 2016-04-27 9:37 ` Chris Wilson
2016-04-27 10:54 ` Joonas Lahtinen
0 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2016-04-27 9:37 UTC (permalink / raw)
To: Joonas Lahtinen; +Cc: intel-gfx
On Wed, Apr 27, 2016 at 12:21:52PM +0300, Joonas Lahtinen wrote:
> On ti, 2016-04-26 at 21:06 +0100, Chris Wilson wrote:
> > - ret = intel_logical_ring_begin(req, 2 + 2 * GEN9_NUM_MOCS_ENTRIES);
> > - if (ret) {
> > - DRM_DEBUG("intel_logical_ring_begin failed %d\n", ret);
>
> These debugs disappear in silence. Could be worthy a note in the commit
> message.
"Remove useless debugging following WARN in case of real error." ?
> > void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
> > {
> > - WARN_ON(ringbuf->reserved_in_use);
> > -
> > + GEM_BUG_ON(!ringbuf->reserved_size);
> > ringbuf->reserved_size = 0;
> > - ringbuf->reserved_in_use = false;
> > }
> >
> > void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
> > {
> > - WARN_ON(ringbuf->reserved_in_use);
> > -
> > - ringbuf->reserved_in_use = true;
> > - ringbuf->reserved_tail = ringbuf->tail;
> > + GEM_BUG_ON(!ringbuf->reserved_size);
> > + ringbuf->reserved_size = 0;
> > }
> >
>
> The above two functions are now the same; I'd remove the _cancel variant.
I thought semantically they were different until the next patch. Moving
reserved onto the request makes it moot, as the cancellation is just
deleting the incomplete request.
> > void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
> > {
> > - WARN_ON(!ringbuf->reserved_in_use);
> > - if (ringbuf->tail > ringbuf->reserved_tail) {
> > - WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
> > - "request reserved size too small: %d vs %d!\n",
> > - ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
> > - } else {
> > + GEM_BUG_ON(ringbuf->reserved_size);
> > +}
> > +
> > +static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> > +{
> > + struct intel_ringbuffer *ringbuf = req->ringbuf;
> > + struct intel_engine_cs *engine = req->engine;
> > + struct drm_i915_gem_request *target;
> > +
> > + intel_ring_update_space(ringbuf);
> > + if (ringbuf->space >= bytes)
> > + return 0;
> > +
> > + /* The whole point of reserving space is to not wait! */
> > + GEM_BUG_ON(!ringbuf->reserved_size);
>
> reserved_size = 0 would indicate we're at __i915_add_request, but the
> comment and test are not the clearest. reserved_size being zero does not
> directly indicate "aha, reserved bytes are being used", it could very
> well be no reserved_size was requested. But right.
/* The whole point of reserving space before the request was not to wait! */
doesn't fit :|
/* Space is reserved in the ringbuffer for finalising the request,
* as that cannot be allowed to fail. During request finalisation,
* reserved_space is set to 0 to stop the overallocation and the
* assumption is that then we never need to wait (and so risk
* EINTR).
*/
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH v6 13/25] drm/i915: Remove the identical implementations of request space reservation
2016-04-26 20:06 ` [PATCH v6 13/25] drm/i915: Remove the identical implementations of request space reservation Chris Wilson
@ 2016-04-27 10:27 ` Joonas Lahtinen
0 siblings, 0 replies; 40+ messages in thread
From: Joonas Lahtinen @ 2016-04-27 10:27 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On ti, 2016-04-26 at 21:06 +0100, Chris Wilson wrote:
> Now that we share intel_ring_begin(), reserving space for the tail of
> the request is identical between legacy/execlists and so the tautology
> can be removed. In the process, we move the reserved space tracking
> from the ringbuffer on to the request. This is to enable us to reorder
> the reserved space allocation in the next patch.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Some comments below.
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 3 +++
> drivers/gpu/drm/i915/i915_gem.c | 21 ++++++++++------
> drivers/gpu/drm/i915/intel_lrc.c | 15 -----------
> drivers/gpu/drm/i915/intel_ringbuffer.c | 44 +++------------------------------
> drivers/gpu/drm/i915/intel_ringbuffer.h | 17 -------------
> 5 files changed, 19 insertions(+), 81 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 444a8ea0c5c4..831da9f43324 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2278,6 +2278,9 @@ struct drm_i915_gem_request {
> /** Position in the ringbuffer of the end of the whole request */
> u32 tail;
>
> + /** Preallocated space in the ringbuffer for emitting the request */
> + u32 reserved_space;
> +
> /**
> * Context and ring buffer related to this request
> * Contexts are refcounted, so when this request is associated with a
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 71135c3ce44e..0e27484bd28a 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2573,6 +2573,7 @@ void __i915_add_request(struct drm_i915_gem_request *request,
> struct drm_i915_private *dev_priv;
> struct intel_ringbuffer *ringbuf;
> u32 request_start;
> + u32 reserved_tail;
> int ret;
>
> if (WARN_ON(request == NULL))
> @@ -2587,9 +2588,10 @@ void __i915_add_request(struct drm_i915_gem_request *request,
> * should already have been reserved in the ring buffer. Let the ring
> * know that it is time to use that space up.
> */
> - intel_ring_reserved_space_use(ringbuf);
> -
> request_start = intel_ring_get_tail(ringbuf);
> + reserved_tail = request->reserved_space;
> + request->reserved_space = 0;
> +
> /*
> * Emit any outstanding flushes - execbuf can fail to emit the flush
> * after having emitted the batchbuffer command. Hence we need to fix
> @@ -2653,7 +2655,13 @@ void __i915_add_request(struct drm_i915_gem_request *request,
> intel_mark_busy(dev_priv->dev);
>
> /* Sanity check that the reserved size was large enough. */
> - intel_ring_reserved_space_end(ringbuf);
> + ret = intel_ring_get_tail(ringbuf) - request_start;
> + if (ret < 0)
> + ret += ringbuf->size;
> + WARN_ONCE(ret > reserved_tail,
> + "Not enough space reserved (%d bytes) "
> + "for adding the request (%d bytes)\n",
> + reserved_tail, ret);
I'm not sure; I'd rather prefer this to stay in an enclosed helper and
not spill the ringbuf details into this function.
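Something along these lines would keep it contained (hypothetical
helper, sketch only):

	static void check_reserved_space(struct intel_ringbuffer *ringbuf,
					 u32 request_start, u32 reserved)
	{
		int used = intel_ring_get_tail(ringbuf) - request_start;

		if (used < 0)
			used += ringbuf->size;

		WARN_ONCE(used > reserved,
			  "Not enough space reserved (%d bytes) "
			  "for adding the request (%d bytes)\n",
			  reserved, used);
	}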
> }
>
> static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
> @@ -2774,17 +2782,14 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
> * to be redone if the request is not actually submitted straight
> * away, e.g. because a GPU scheduler has deferred it.
> */
> - if (i915.enable_execlists)
> - ret = intel_logical_ring_reserve_space(req);
> - else
> - ret = intel_ring_reserve_space(req);
> + req->reserved_space = MIN_SPACE_FOR_ADD_REQUEST;
> + ret = intel_ring_begin(req, 0);
> if (ret) {
> /*
> * At this point, the request is fully allocated even if not
> * fully prepared. Thus it can be cleaned up using the proper
> * free code.
> */
> - intel_ring_reserved_space_cancel(req->ringbuf);
You could append to the above comment that the reserved space will be
released along with the request.
> i915_gem_request_unreference(req);
> return ret;
> }
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index ba87c94928e7..910044cf143e 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -772,21 +772,6 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
> return 0;
> }
>
> -int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
> -{
> - /*
> - * The first call merely notes the reserve request and is common for
> - * all back ends. The subsequent localised _begin() call actually
> - * ensures that the reservation is available. Without the begin, if
> - * the request creator immediately submitted the request without
> - * adding any commands to it then there might not actually be
> - * sufficient room for the submission commands.
> - */
> - intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> -
> - return intel_ring_begin(request, 0);
> -}
> -
> /**
> * execlists_submission() - submit a batchbuffer for execution, Execlists style
> * @dev: DRM device.
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 1d3d2ea3e9bc..ba5946b9fa06 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2336,44 +2336,6 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
> return 0;
> }
>
> -int intel_ring_reserve_space(struct drm_i915_gem_request *request)
> -{
> - /*
> - * The first call merely notes the reserve request and is common for
> - * all back ends. The subsequent localised _begin() call actually
> - * ensures that the reservation is available. Without the begin, if
> - * the request creator immediately submitted the request without
> - * adding any commands to it then there might not actually be
> - * sufficient room for the submission commands.
> - */
> - intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> -
> - return intel_ring_begin(request, 0);
> -}
> -
> -void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
> -{
> - GEM_BUG_ON(ringbuf->reserved_size);
> - ringbuf->reserved_size = size;
> -}
> -
> -void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
> -{
> - GEM_BUG_ON(!ringbuf->reserved_size);
> - ringbuf->reserved_size = 0;
> -}
> -
> -void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
> -{
> - GEM_BUG_ON(!ringbuf->reserved_size);
> - ringbuf->reserved_size = 0;
> -}
> -
> -void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
> -{
> - GEM_BUG_ON(ringbuf->reserved_size);
> -}
> -
> static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> {
> struct intel_ringbuffer *ringbuf = req->ringbuf;
> @@ -2385,7 +2347,7 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> return 0;
>
> /* The whole point of reserving space is to not wait! */
> - GEM_BUG_ON(!ringbuf->reserved_size);
> + GEM_BUG_ON(!req->reserved_space);
>
> list_for_each_entry(target, &engine->request_list, list) {
> unsigned space;
> @@ -2420,7 +2382,7 @@ int intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
> int total_bytes, wait_bytes;
> bool need_wrap = false;
>
> - total_bytes = bytes + ringbuf->reserved_size;
> + total_bytes = bytes + req->reserved_space;
>
> if (unlikely(bytes > remain_usable)) {
> /*
> @@ -2436,7 +2398,7 @@ int intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
> * and only need to effectively wait for the reserved
> * size space from the start of ringbuffer.
> */
> - wait_bytes = remain_actual + ringbuf->reserved_size;
> + wait_bytes = remain_actual + req->reserved_space;
> } else
> /* No wrapping required, just waiting. */
> wait_bytes = total_bytes;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 201e7752d765..038914ccc6fd 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -107,7 +107,6 @@ struct intel_ringbuffer {
> int space;
> int size;
> int effective_size;
> - int reserved_size;
>
> /** We track the position of the requests in the ring buffer, and
> * when each is retired we increment last_retired_head as the GPU
> @@ -491,20 +490,4 @@ static inline u32 intel_ring_get_tail(struct intel_ringbuffer *ringbuf)
> */
> #define MIN_SPACE_FOR_ADD_REQUEST 160
>
> -/*
> - * Reserve space in the ring to guarantee that the i915_add_request() call
> - * will always have sufficient room to do its stuff. The request creation
> - * code calls this automatically.
> - */
> -void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
> -/* Cancel the reservation, e.g. because the request is being discarded. */
> -void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
> -/* Use the reserved space - for use by i915_add_request() only. */
> -void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
> -/* Finish with the reserved space - for use by i915_add_request() only. */
> -void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
> -
> -/* Legacy ringbuffer specific portion of reservation code: */
> -int intel_ring_reserve_space(struct drm_i915_gem_request *request);
> -
> #endif /* _INTEL_RINGBUFFER_H_ */
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
* Re: [PATCH v6 14/25] drm/i915: Manually unwind after a failed request allocation
2016-04-26 20:06 ` [PATCH v6 14/25] drm/i915: Manually unwind after a failed request allocation Chris Wilson
@ 2016-04-27 10:48 ` Joonas Lahtinen
2016-04-27 10:59 ` Chris Wilson
2016-04-27 12:51 ` Tvrtko Ursulin
1 sibling, 1 reply; 40+ messages in thread
From: Joonas Lahtinen @ 2016-04-27 10:48 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On ti, 2016-04-26 at 21:06 +0100, Chris Wilson wrote:
> In the next patches, we want to move the work out of freeing the request
> and into its retirement (so that we can free the request without
> requiring the struct_mutex). This means that we cannot rely on
> unreferencing the request to completely teardown the request any more
> and so we need to manually unwind the failed allocation. In doing so, we
> reorder the allocation in order to make the unwind simple (and ensure
> that we don't try to unwind a partial request that may have modified
> global state) and so we end up pushing the initial preallocation down
> into the engine request initialisation functions where we have the
> requisite control over the state of the request.
>
> Moving the initial preallocation into the engine is less than ideal: it
> moves logic to handle a specific problem with request handling out of
> the common code. On the other hand, it does allow those backends
> significantly more flexibility in performing their allocations.
>
Adding John as CC,
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 28 +++++++++-------------------
> drivers/gpu/drm/i915/intel_lrc.c | 16 ++++++++++++++--
> drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
> 3 files changed, 24 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0e27484bd28a..d7ff5e79182f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2766,15 +2766,6 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
> req->ctx = ctx;
> i915_gem_context_reference(req->ctx);
>
> - if (i915.enable_execlists)
> - ret = intel_logical_ring_alloc_request_extras(req);
> - else
> - ret = intel_ring_alloc_request_extras(req);
> - if (ret) {
> - i915_gem_context_unreference(req->ctx);
> - goto err;
> - }
> -
> /*
> * Reserve space in the ring buffer for all the commands required to
> * eventually emit this request. This is to guarantee that the
> @@ -2783,20 +2774,19 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
> * away, e.g. because a GPU scheduler has deferred it.
> */
> req->reserved_space = MIN_SPACE_FOR_ADD_REQUEST;
> - ret = intel_ring_begin(req, 0);
> - if (ret) {
> - /*
> - * At this point, the request is fully allocated even if not
> - * fully prepared. Thus it can be cleaned up using the proper
> - * free code.
> - */
> - i915_gem_request_unreference(req);
> - return ret;
> - }
> +
> + if (i915.enable_execlists)
> + ret = intel_logical_ring_alloc_request_extras(req);
> + else
> + ret = intel_ring_alloc_request_extras(req);
> + if (ret)
> + goto err_ctx;
>
> *req_out = req;
> return 0;
>
> +err_ctx:
> + i915_gem_context_unreference(ctx);
> err:
> kmem_cache_free(dev_priv->requests, req);
> return ret;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 910044cf143e..01517dd7069b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -698,7 +698,7 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
>
> int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request)
> {
> - int ret = 0;
> + int ret;
>
> request->ringbuf = request->ctx->engine[request->engine->id].ringbuf;
>
> @@ -715,9 +715,21 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
> return ret;
> }
>
> - if (request->ctx != request->i915->kernel_context)
> + if (request->ctx != request->i915->kernel_context) {
> ret = intel_lr_context_pin(request->ctx, request->engine);
> + if (ret)
> + return ret;
> + }
>
> + ret = intel_ring_begin(request, 0);
> + if (ret)
> + goto err_unpin;
> +
> + return 0;
> +
> +err_unpin:
> + if (request->ctx != request->i915->kernel_context)
> + intel_lr_context_unpin(request->ctx, request->engine);
> return ret;
> }
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index ba5946b9fa06..1193372f74fd 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2333,7 +2333,7 @@ int intel_engine_idle(struct intel_engine_cs *engine)
> int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
> {
> request->ringbuf = request->engine->buffer;
> - return 0;
> + return intel_ring_begin(request, 0);
> }
The names of these two _extras functions with the added functionality
do not make sense.
Regards, Joonas
>
> static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
* Re: [PATCH v6 12/25] drm/i915: Unify intel_ring_begin()
2016-04-27 9:37 ` Chris Wilson
@ 2016-04-27 10:54 ` Joonas Lahtinen
0 siblings, 0 replies; 40+ messages in thread
From: Joonas Lahtinen @ 2016-04-27 10:54 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On ke, 2016-04-27 at 10:37 +0100, Chris Wilson wrote:
> On Wed, Apr 27, 2016 at 12:21:52PM +0300, Joonas Lahtinen wrote:
> >
> > On ti, 2016-04-26 at 21:06 +0100, Chris Wilson wrote:
> > >
> > > - ret = intel_logical_ring_begin(req, 2 + 2 * GEN9_NUM_MOCS_ENTRIES);
> > > - if (ret) {
> > > - DRM_DEBUG("intel_logical_ring_begin failed %d\n", ret);
> > These debugs disappear in silence. Could be worthy a note in the commit
> > message.
> "Remove useless debugging following WARN in case of real error." ?
Sounds ok.
>
> >
> > >
> > > void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
> > > {
> > > - WARN_ON(ringbuf->reserved_in_use);
> > > -
> > > + GEM_BUG_ON(!ringbuf->reserved_size);
> > > ringbuf->reserved_size = 0;
> > > - ringbuf->reserved_in_use = false;
> > > }
> > >
> > > void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
> > > {
> > > - WARN_ON(ringbuf->reserved_in_use);
> > > -
> > > - ringbuf->reserved_in_use = true;
> > > - ringbuf->reserved_tail = ringbuf->tail;
> > > + GEM_BUG_ON(!ringbuf->reserved_size);
> > > + ringbuf->reserved_size = 0;
> > > }
> > >
> > The above two functions are now the same; I'd remove the _cancel variant.
> I thought semantically they were different until the next patch. Moving
> reserved onto the request makes it moot, as the cancellation is just
> deleting the incomplete request.
>
> >
> > >
> > > void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
> > > {
> > > - WARN_ON(!ringbuf->reserved_in_use);
> > > - if (ringbuf->tail > ringbuf->reserved_tail) {
> > > - WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
> > > - "request reserved size too small: %d vs %d!\n",
> > > - ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
> > > - } else {
> > > + GEM_BUG_ON(ringbuf->reserved_size);
> > > +}
> > > +
> > > +static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> > > +{
> > > + struct intel_ringbuffer *ringbuf = req->ringbuf;
> > > + struct intel_engine_cs *engine = req->engine;
> > > + struct drm_i915_gem_request *target;
> > > +
> > > + intel_ring_update_space(ringbuf);
> > > + if (ringbuf->space >= bytes)
> > > + return 0;
> > > +
> > > + /* The whole point of reserving space is to not wait! */
> > > + GEM_BUG_ON(!ringbuf->reserved_size);
> > reserved_size = 0 would indicate we're at __i915_add_request, but the
> > comment and test are not the clearest. reserved_size being zero does not
> > directly indicate "aha, reserved bytes are being used", it could very
> > well be no reserved_size was requested. But right.
> /* The whole point of reserving space before the request was not to wait! */
> doesn't fit :|
>
> /* Space is reserved in the ringbuffer for finalising the request,
> * as that cannot be allowed to fail. During request finalisation,
> * reserved_space is set to 0 to stop the overallocation and the
> * assumption is that then we never need to wait (and so risk
> * EINTR).
> */
Looks good too.
Regards, Joonas
> -Chris
>
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
* Re: [PATCH v6 14/25] drm/i915: Manually unwind after a failed request allocation
2016-04-27 10:48 ` Joonas Lahtinen
@ 2016-04-27 10:59 ` Chris Wilson
0 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-27 10:59 UTC (permalink / raw)
To: Joonas Lahtinen; +Cc: intel-gfx
On Wed, Apr 27, 2016 at 01:48:01PM +0300, Joonas Lahtinen wrote:
> The names of these two _extras functions with the added functionality
> do not make sense.
They never did. The original intent here was to call engine->pin_request
that would acquire the context and prepare the ringbuf for writing into it.
Eventually they will do so again.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH v6 01/25] drm/i915/fbdev: Call intel_unpin_fb_obj() on release
2016-04-26 20:05 ` [PATCH v6 01/25] drm/i915/fbdev: Call intel_unpin_fb_obj() on release Chris Wilson
@ 2016-04-27 12:45 ` Tvrtko Ursulin
0 siblings, 0 replies; 40+ messages in thread
From: Tvrtko Ursulin @ 2016-04-27 12:45 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 26/04/16 21:05, Chris Wilson wrote:
> When releasing the intel_fbdev, we should unpin the framebuffer that we
> pinned during construction.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/intel_display.c | 2 +-
> drivers/gpu/drm/i915/intel_drv.h | 1 +
> drivers/gpu/drm/i915/intel_fbdev.c | 7 ++++++-
> 3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index b7cb632a2a31..87ce7852482b 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2309,7 +2309,7 @@ err_pm:
> return ret;
> }
>
> -static void intel_unpin_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
> +void intel_unpin_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
> {
> struct drm_i915_gem_object *obj = intel_fb_obj(fb);
> struct i915_ggtt_view view;
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index cb89a35a6755..dcb19133f640 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1162,6 +1162,7 @@ void intel_release_load_detect_pipe(struct drm_connector *connector,
> struct drm_modeset_acquire_ctx *ctx);
> int intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
> unsigned int rotation);
> +void intel_unpin_fb_obj(struct drm_framebuffer *fb, unsigned int rotation);
> struct drm_framebuffer *
> __intel_framebuffer_create(struct drm_device *dev,
> struct drm_mode_fb_cmd2 *mode_cmd,
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> index af561542a5a1..1c3ad121f1b9 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -287,7 +287,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
> out_destroy_fbi:
> drm_fb_helper_release_fbi(helper);
> out_unpin:
> - i915_gem_object_ggtt_unpin(obj);
> + intel_unpin_fb_obj(&ifbdev->fb->base, BIT(DRM_ROTATE_0));
> out_unlock:
> mutex_unlock(&dev->struct_mutex);
> return ret;
> @@ -551,6 +551,11 @@ static void intel_fbdev_destroy(struct drm_device *dev,
>
> if (ifbdev->fb) {
> drm_framebuffer_unregister_private(&ifbdev->fb->base);
> +
> + mutex_lock(&dev->struct_mutex);
> + intel_unpin_fb_obj(&ifbdev->fb->base, BIT(DRM_ROTATE_0));
> + mutex_unlock(&dev->struct_mutex);
> +
> drm_framebuffer_remove(&ifbdev->fb->base);
> }
> }
>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
* Re: [PATCH v6 14/25] drm/i915: Manually unwind after a failed request allocation
2016-04-26 20:06 ` [PATCH v6 14/25] drm/i915: Manually unwind after a failed request allocation Chris Wilson
2016-04-27 10:48 ` Joonas Lahtinen
@ 2016-04-27 12:51 ` Tvrtko Ursulin
2016-04-27 12:58 ` Chris Wilson
1 sibling, 1 reply; 40+ messages in thread
From: Tvrtko Ursulin @ 2016-04-27 12:51 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 26/04/16 21:06, Chris Wilson wrote:
> In the next patches, we want to move the work out of freeing the request
> and into its retirement (so that we can free the request without
> requiring the struct_mutex). This means that we cannot rely on
> unreferencing the request to completely teardown the request any more
> and so we need to manually unwind the failed allocation. In doing so, we
> reorder the allocation in order to make the unwind simple (and ensure
> that we don't try to unwind a partial request that may have modified
> global state) and so we end up pushing the initial preallocation down
> into the engine request initialisation functions where we have the
> requisite control over the state of the request.
>
> Moving the initial preallocation into the engine is less than ideal: it
> moves logic to handle a specific problem with request handling out of
> the common code. On the other hand, it does allow those backends
> significantly more flexibility in performing their allocations.
Could we add a _free_request_extras which would only be allowed to be
called from _alloc_request? That would avoid polluting the engines with
common code, I think.
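Roughly (names invented here, just to illustrate the idea):

	/* Only called on the __i915_gem_request_alloc() error path,
	 * undoing whatever *_alloc_request_extras() pinned.
	 */
	static void i915_gem_request_free_extras(struct drm_i915_gem_request *req)
	{
		if (i915.enable_execlists &&
		    req->ctx != req->i915->kernel_context)
			intel_lr_context_unpin(req->ctx, req->engine);
	}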
Regards,
Tvrtko
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 28 +++++++++-------------------
> drivers/gpu/drm/i915/intel_lrc.c | 16 ++++++++++++++--
> drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
> 3 files changed, 24 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0e27484bd28a..d7ff5e79182f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2766,15 +2766,6 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
> req->ctx = ctx;
> i915_gem_context_reference(req->ctx);
>
> - if (i915.enable_execlists)
> - ret = intel_logical_ring_alloc_request_extras(req);
> - else
> - ret = intel_ring_alloc_request_extras(req);
> - if (ret) {
> - i915_gem_context_unreference(req->ctx);
> - goto err;
> - }
> -
> /*
> * Reserve space in the ring buffer for all the commands required to
> * eventually emit this request. This is to guarantee that the
> @@ -2783,20 +2774,19 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
> * away, e.g. because a GPU scheduler has deferred it.
> */
> req->reserved_space = MIN_SPACE_FOR_ADD_REQUEST;
> - ret = intel_ring_begin(req, 0);
> - if (ret) {
> - /*
> - * At this point, the request is fully allocated even if not
> - * fully prepared. Thus it can be cleaned up using the proper
> - * free code.
> - */
> - i915_gem_request_unreference(req);
> - return ret;
> - }
> +
> + if (i915.enable_execlists)
> + ret = intel_logical_ring_alloc_request_extras(req);
> + else
> + ret = intel_ring_alloc_request_extras(req);
> + if (ret)
> + goto err_ctx;
>
> *req_out = req;
> return 0;
>
> +err_ctx:
> + i915_gem_context_unreference(ctx);
> err:
> kmem_cache_free(dev_priv->requests, req);
> return ret;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 910044cf143e..01517dd7069b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -698,7 +698,7 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
>
> int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request)
> {
> - int ret = 0;
> + int ret;
>
> request->ringbuf = request->ctx->engine[request->engine->id].ringbuf;
>
> @@ -715,9 +715,21 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
> return ret;
> }
>
> - if (request->ctx != request->i915->kernel_context)
> + if (request->ctx != request->i915->kernel_context) {
> ret = intel_lr_context_pin(request->ctx, request->engine);
> + if (ret)
> + return ret;
> + }
>
> + ret = intel_ring_begin(request, 0);
> + if (ret)
> + goto err_unpin;
> +
> + return 0;
> +
> +err_unpin:
> + if (request->ctx != request->i915->kernel_context)
> + intel_lr_context_unpin(request->ctx, request->engine);
> return ret;
> }
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index ba5946b9fa06..1193372f74fd 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2333,7 +2333,7 @@ int intel_engine_idle(struct intel_engine_cs *engine)
> int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
> {
> request->ringbuf = request->engine->buffer;
> - return 0;
> + return intel_ring_begin(request, 0);
> }
>
> static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
>
* Re: [PATCH v6 14/25] drm/i915: Manually unwind after a failed request allocation
2016-04-27 12:51 ` Tvrtko Ursulin
@ 2016-04-27 12:58 ` Chris Wilson
2016-04-27 13:03 ` Tvrtko Ursulin
0 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2016-04-27 12:58 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
On Wed, Apr 27, 2016 at 01:51:38PM +0100, Tvrtko Ursulin wrote:
>
> On 26/04/16 21:06, Chris Wilson wrote:
> >In the next patches, we want to move the work out of freeing the request
> >and into its retirement (so that we can free the request without
> >requiring the struct_mutex). This means that we cannot rely on
> >unreferencing the request to completely teardown the request any more
> >and so we need to manually unwind the failed allocation. In doing so, we
> >reorder the allocation in order to make the unwind simple (and ensure
> >that we don't try to unwind a partial request that may have modified
> >global state) and so we end up pushing the initial preallocation down
> >into the engine request initialisation functions where we have the
> >requisite control over the state of the request.
> >
> >Moving the initial preallocation into the engine is less than ideal: it
> >moves logic to handle a specific problem with request handling out of
> >the common code. On the other hand, it does allow those backends
> >significantly more flexibility in performing their allocations.
>
> Could we add a _free_request_extras which would only be allowed to be
> called from _alloc_request? That would avoid polluting the engines
> with common code, I think.
If you look at where I think it should be placed inside lrc, then we
need multiple phases. Not that it is much of a big deal:
request_alloc:
engine->pin_request()
/* prep */
engine->init_context()
are more or less what we need, it will take a bit of organisation to
align legacy / execlists. But it can be done.
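Spelled out a little more (the pin_request/unpin_request vfuncs are
hypothetical here, just a sketch of the direction):

	static int i915_gem_request_alloc_extras(struct drm_i915_gem_request *req)
	{
		int ret;

		/* acquire the context and its ringbuf */
		ret = req->engine->pin_request(req);
		if (ret)
			return ret;

		/* flush + reserve ring space */
		ret = intel_ring_begin(req, 0);
		if (ret)
			goto err_unpin;

		/* any first-use initialisation of the context */
		ret = req->engine->init_context(req);
		if (ret)
			goto err_unpin;

		return 0;

	err_unpin:
		req->engine->unpin_request(req);
		return ret;
	}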
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH v6 14/25] drm/i915: Manually unwind after a failed request allocation
2016-04-27 12:58 ` Chris Wilson
@ 2016-04-27 13:03 ` Tvrtko Ursulin
0 siblings, 0 replies; 40+ messages in thread
From: Tvrtko Ursulin @ 2016-04-27 13:03 UTC (permalink / raw)
To: Chris Wilson, intel-gfx, Mika Kuoppala, Joonas Lahtinen
On 27/04/16 13:58, Chris Wilson wrote:
> On Wed, Apr 27, 2016 at 01:51:38PM +0100, Tvrtko Ursulin wrote:
>>
>> On 26/04/16 21:06, Chris Wilson wrote:
>>> In the next patches, we want to move the work out of freeing the request
>>> and into its retirement (so that we can free the request without
>>> requiring the struct_mutex). This means that we cannot rely on
>>> unreferencing the request to completely teardown the request any more
>>> and so we need to manually unwind the failed allocation. In doing so, we
>>> reorder the allocation in order to make the unwind simple (and ensure
>>> that we don't try to unwind a partial request that may have modified
>>> global state) and so we end up pushing the initial preallocation down
>>> into the engine request initialisation functions where we have the
>>> requisite control over the state of the request.
>>>
>>> Moving the initial preallocation into the engine is less than ideal: it
>>> moves logic to handle a specific problem with request handling out of
>>> the common code. On the other hand, it does allow those backends
>>> significantly more flexibility in performing its allocations.
>>
>> Could add _free_request_extras which would only be allowed to be
>> called from _alloc_request? That would enable not-polluting the
>> engine with common code I think.
>
> If you look at where I think it should be placed inside lrc, then we
> need multiple phases. Not that isn't much of a big deal:
>
> request_alloc:
>
> engine->pin_request()
>
> /* prep */
>
> engine->init_context()
>
> are more or less what we need, it will take a bit of organisation to
> align legacy / execlists. But it can be done.
Forgot to say, the patch looks correct to me as it is, so r-b from that
point of view anyway. In my mind, solving the big performance
sloppiness this series fixes is much more important than one (more)
slight temporary design inelegance.
Regards,
Tvrtko
* Re: [PATCH v6 15/25] drm/i915: Preallocate enough space for the average request
2016-04-26 20:06 ` [PATCH v6 15/25] drm/i915: Preallocate enough space for the average request Chris Wilson
@ 2016-04-27 13:15 ` Tvrtko Ursulin
2016-04-27 13:26 ` Chris Wilson
0 siblings, 1 reply; 40+ messages in thread
From: Tvrtko Ursulin @ 2016-04-27 13:15 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 26/04/16 21:06, Chris Wilson wrote:
> Rather than being interrupted when we run out of space halfway through
> the request, and having to restart from the beginning (and returning to
> userspace), flush a little more free space when we prepare the request.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 7 +++++++
> drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +++++++++++++++-
> 2 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 01517dd7069b..b5c2c1931a5f 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -700,6 +700,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
> {
> int ret;
>
> + /* Flush enough space to reduce the likelihood of waiting after
> + * we start building the request - in which case we will just
> + * have to repeat work.
> + */
> + request->reserved_space += MIN_SPACE_FOR_ADD_REQUEST;
> +
> request->ringbuf = request->ctx->engine[request->engine->id].ringbuf;
>
> if (i915.enable_guc_submission) {
> @@ -725,6 +731,7 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
> if (ret)
> goto err_unpin;
>
> + request->reserved_space -= MIN_SPACE_FOR_ADD_REQUEST;
Without any previous experience with ringbuf space reservation and
related, this does make sense to me. :)
So why add and subtract and not just increase MIN_SPACE_FOR_ADD_REQUEST?
Or if MIN_SPACE_FOR_ADD_REQUEST is completely unrelated to the
subsequent stuff to go in, and perhaps only represents the typical driver
prologue & epilogue, why increase the reserved size temporarily by that
amount and not something else?
Regards,
Tvrtko
> return 0;
>
> err_unpin:
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 1193372f74fd..1285605f25c7 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2332,8 +2332,22 @@ int intel_engine_idle(struct intel_engine_cs *engine)
>
> int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
> {
> + int ret;
> +
> + /* Flush enough space to reduce the likelihood of waiting after
> + * we start building the request - in which case we will just
> + * have to repeat work.
> + */
> + request->reserved_space += MIN_SPACE_FOR_ADD_REQUEST;
> +
> request->ringbuf = request->engine->buffer;
> - return intel_ring_begin(request, 0);
> +
> + ret = intel_ring_begin(request, 0);
> + if (ret)
> + return ret;
> +
> + request->reserved_space -= MIN_SPACE_FOR_ADD_REQUEST;
> + return 0;
> }
>
> static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
>
* Re: [PATCH v6 16/25] drm/i915: Update execlists context descriptor format commentary
2016-04-26 20:06 ` [PATCH v6 16/25] drm/i915: Update execlists context descriptor format commentary Chris Wilson
@ 2016-04-27 13:22 ` Tvrtko Ursulin
0 siblings, 0 replies; 40+ messages in thread
From: Tvrtko Ursulin @ 2016-04-27 13:22 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 26/04/16 21:06, Chris Wilson wrote:
> The comments describing the Context Descriptor Format are off by a bit
> for the size of the context ID.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dave Gordon <david.s.gordon@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index b5c2c1931a5f..5d8ee9059eee 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -305,10 +305,11 @@ logical_ring_init_platform_invariants(struct intel_engine_cs *engine)
> * which remains valid until the context is unpinned.
> *
> * This is what a descriptor looks like, from LSB to MSB:
> - * bits 0-11: flags, GEN8_CTX_* (cached in ctx_desc_template)
> + * bits 0-11: flags, GEN8_CTX_* (cached in ctx_desc_template)
> * bits 12-31: LRCA, GTT address of (the HWSP of) this context
> - * bits 32-51: ctx ID, a globally unique tag (the LRCA again!)
> - * bits 52-63: reserved, may encode the engine ID (for GuC)
> + * bits 32-52: ctx ID, a globally unique tag (the LRCA again!)
> + * bits 53-54: mbz, reserved for use by hardware
> + * bits 55-63: group ID, currently unused and set to 0
> */
> static void
> intel_lr_context_descriptor_update(struct intel_context *ctx,
> @@ -319,9 +320,9 @@ intel_lr_context_descriptor_update(struct intel_context *ctx,
> lrca = ctx->engine[engine->id].lrc_vma->node.start +
> LRC_PPHWSP_PN * PAGE_SIZE;
>
> - desc = engine->ctx_desc_template; /* bits 0-11 */
> + desc = engine->ctx_desc_template; /* bits 0-11 */
> desc |= lrca; /* bits 12-31 */
> - desc |= (lrca >> PAGE_SHIFT) << GEN8_CTX_ID_SHIFT; /* bits 32-51 */
> + desc |= (lrca >> PAGE_SHIFT) << GEN8_CTX_ID_SHIFT; /* bits 32-52 */
>
> ctx->engine[engine->id].lrc_desc = desc;
> }
>
Verified against the docs and it is correct.
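For anyone else cross-checking the packing, a worked example with a
made-up page-aligned LRCA:

	lrca   = 0x0034b000                    /* GGTT address, bits 12-31 */
	ctx ID = lrca >> 12 = 0x34b            /* placed at bits 32-52 */
	desc   = template | lrca | ((u64)0x34b << 32)
	       = template | 0x34b0034b000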
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
* Re: [PATCH v6 15/25] drm/i915: Preallocate enough space for the average request
2016-04-27 13:15 ` Tvrtko Ursulin
@ 2016-04-27 13:26 ` Chris Wilson
0 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2016-04-27 13:26 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
On Wed, Apr 27, 2016 at 02:15:29PM +0100, Tvrtko Ursulin wrote:
>
> On 26/04/16 21:06, Chris Wilson wrote:
> >Rather than being interrupted when we run out of space halfway through
> >the request, and having to restart from the beginning (and returning to
> >userspace), flush a little more free space when we prepare the request.
> >
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> >---
> > drivers/gpu/drm/i915/intel_lrc.c | 7 +++++++
> > drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +++++++++++++++-
> > 2 files changed, 22 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >index 01517dd7069b..b5c2c1931a5f 100644
> >--- a/drivers/gpu/drm/i915/intel_lrc.c
> >+++ b/drivers/gpu/drm/i915/intel_lrc.c
> >@@ -700,6 +700,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
> > {
> > int ret;
> >
> >+ /* Flush enough space to reduce the likelihood of waiting after
> >+ * we start building the request - in which case we will just
> >+ * have to repeat work.
> >+ */
> >+ request->reserved_space += MIN_SPACE_FOR_ADD_REQUEST;
> >+
> > request->ringbuf = request->ctx->engine[request->engine->id].ringbuf;
> >
> > if (i915.enable_guc_submission) {
> >@@ -725,6 +731,7 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
> > if (ret)
> > goto err_unpin;
> >
> >+ request->reserved_space -= MIN_SPACE_FOR_ADD_REQUEST;
>
> Without any previous experience with ringbuf space reservation and
> related, this does make sense to me. :)
>
> So why add and subtract and not just increase
> MIN_SPACE_FOR_ADD_REQUEST? Or if MIN_SPACE_FOR_ADD_REQUEST is
> completely unrelated to the subsequent stuff to go in, and perhaps
> > only represents the typical driver prologue & epilogue, why increase
> the reserved size temporarily by that amount and not something else?
The game being played here is to first free up enough space inside the
ringbuffer that we shouldn't have to pause again, and then ensure that
we always have enough space to submit the request. So step 1 is flush,
then step 2 is reserve, step 3 is to use the reserve.
MIN_SPACE_FOR_ADD_REQUEST is being used as a conservative estimate. What
we actually use is approx. 1 flush and 1 bb for lrc; for the legacy
ringbuffer, 1 flush, several LRI, 1 context switch, more LRI, another
flush and then 1 bb. The amount of space we flush and reserve could be
fine-tuned, and will likely need to be adjusted in future for various
features. But the flush is just an optimisation; the reserve is
mandatory.
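In terms of the code in this series, the three steps look like this
(illustrative summary, not a new patch):

	/* 1. flush: temporarily double the reservation so that the
	 * following intel_ring_begin(req, 0) waits until
	 * 2 * MIN_SPACE_FOR_ADD_REQUEST bytes are free.
	 */
	req->reserved_space += MIN_SPACE_FOR_ADD_REQUEST;
	ret = intel_ring_begin(req, 0);

	/* 2. reserve: drop back to the mandatory reservation; the
	 * extra space just flushed is what the request body is
	 * expected to consume without waiting.
	 */
	req->reserved_space -= MIN_SPACE_FOR_ADD_REQUEST;

	/* 3. use: __i915_add_request() sets req->reserved_space = 0
	 * and emits the breadcrumb into the bytes kept in reserve.
	 */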
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
Thread overview: 40+ messages
2016-04-26 20:05 Premature unpinning, finally? Chris Wilson
2016-04-26 20:05 ` [PATCH v6 01/25] drm/i915/fbdev: Call intel_unpin_fb_obj() on release Chris Wilson
2016-04-27 12:45 ` Tvrtko Ursulin
2016-04-26 20:05 ` [PATCH v6 02/25] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr Chris Wilson
[not found] ` <1461701180-895-1-git-send-email-chris-Y6uKTt2uX1cEflXRtASbqLVCufUGDwFn@public.gmane.org>
2016-04-26 20:05 ` [PATCH v6 03/25] io-mapping: Specify mapping size for io_mapping_map_wc() Chris Wilson
2016-04-26 20:05 ` [PATCH v6 04/25] drm/i915: Introduce i915_vm_to_ggtt() Chris Wilson
2016-04-26 20:06 ` [PATCH v6 05/25] drm/i915: Move ioremap_wc tracking onto VMA Chris Wilson
2016-04-26 20:06 ` [PATCH v6 06/25] drm/i915: Use i915_vma_pin_iomap on the ringbuffer object Chris Wilson
2016-04-26 20:06 ` [PATCH v6 07/25] drm/i915: Mark the current context as lost on suspend Chris Wilson
2016-04-26 20:06 ` [PATCH v6 08/25] drm/i915: L3 cache remapping is part of context switching Chris Wilson
2016-04-26 20:06 ` [PATCH v6 09/25] drm/i915: Consolidate L3 remapping LRI Chris Wilson
2016-04-26 20:06 ` [PATCH v6 10/25] drm/i915: Remove early l3-remap Chris Wilson
2016-04-26 20:06 ` [PATCH v6 11/25] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use Chris Wilson
2016-04-26 20:06 ` [PATCH v6 12/25] drm/i915: Unify intel_ring_begin() Chris Wilson
2016-04-27 9:21 ` Joonas Lahtinen
2016-04-27 9:37 ` Chris Wilson
2016-04-27 10:54 ` Joonas Lahtinen
2016-04-26 20:06 ` [PATCH v6 13/25] drm/i915: Remove the identical implementations of request space reservation Chris Wilson
2016-04-27 10:27 ` Joonas Lahtinen
2016-04-26 20:06 ` [PATCH v6 14/25] drm/i915: Manually unwind after a failed request allocation Chris Wilson
2016-04-27 10:48 ` Joonas Lahtinen
2016-04-27 10:59 ` Chris Wilson
2016-04-27 12:51 ` Tvrtko Ursulin
2016-04-27 12:58 ` Chris Wilson
2016-04-27 13:03 ` Tvrtko Ursulin
2016-04-26 20:06 ` [PATCH v6 15/25] drm/i915: Preallocate enough space for the average request Chris Wilson
2016-04-27 13:15 ` Tvrtko Ursulin
2016-04-27 13:26 ` Chris Wilson
2016-04-26 20:06 ` [PATCH v6 16/25] drm/i915: Update execlists context descriptor format commentary Chris Wilson
2016-04-27 13:22 ` Tvrtko Ursulin
2016-04-26 20:06 ` [PATCH v6 17/25] drm/i915: Assign every HW context a unique ID Chris Wilson
2016-04-26 20:06 ` [PATCH v6 18/25] drm/i915: Replace the pinned context address with its " Chris Wilson
2016-04-26 20:06 ` [PATCH v6 19/25] drm/i915: Refactor execlists default context pinning Chris Wilson
2016-04-26 20:06 ` [PATCH v6 20/25] drm/i915: Move the magical deferred context allocation into the request Chris Wilson
2016-04-26 20:06 ` [PATCH v6 21/25] drm/i915: Move releasing of the GEM request from free to retire/cancel Chris Wilson
2016-04-26 20:06 ` [PATCH v6 22/25] drm/i915: Track the previous pinned context inside the request Chris Wilson
2016-04-26 20:06 ` [PATCH v6 23/25] drm/i915: Store LRC hardware id in " Chris Wilson
2016-04-26 20:06 ` [PATCH v6 24/25] drm/i915: Stop tracking execlists retired requests Chris Wilson
2016-04-26 20:06 ` [PATCH v6 25/25] drm/i915: Unify GPU resets upon shutdown Chris Wilson
2016-04-27 7:24 ` ✗ Fi.CI.BAT: failure for series starting with [v6,01/25] drm/i915/fbdev: Call intel_unpin_fb_obj() on release Patchwork