* [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
@ 2015-11-04 19:25 Imre Deak
2015-11-04 19:47 ` Paulo Zanoni
2015-11-04 20:57 ` Chris Wilson
0 siblings, 2 replies; 18+ messages in thread
From: Imre Deak @ 2015-11-04 19:25 UTC (permalink / raw)
To: intel-gfx
After Damien's D3 fix I started to get runtime suspend residency for the
first time and that revealed a breakage on the set_caching IOCTL path
that accesses the HW but doesn't take an RPM ref. Fix this up.
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f1e3fde..56cd501 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3829,6 +3829,7 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
struct drm_file *file)
{
+ struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_i915_gem_caching *args = data;
struct drm_i915_gem_object *obj;
enum i915_cache_level level;
@@ -3857,9 +3858,11 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
return -EINVAL;
}
+ intel_runtime_pm_get(dev_priv);
+
ret = i915_mutex_lock_interruptible(dev);
if (ret)
- return ret;
+ goto rpm_put;
obj = to_intel_bo(drm_gem_object_lookup(dev, file, args->handle));
if (&obj->base == NULL) {
@@ -3872,6 +3875,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
drm_gem_object_unreference(&obj->base);
unlock:
mutex_unlock(&dev->struct_mutex);
+rpm_put:
+ intel_runtime_pm_put(dev_priv);
+
return ret;
}
--
2.1.4
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-04 19:25 [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL Imre Deak
@ 2015-11-04 19:47 ` Paulo Zanoni
2015-11-16 13:33 ` Jani Nikula
2015-11-17 19:00 ` Daniel Vetter
2015-11-04 20:57 ` Chris Wilson
1 sibling, 2 replies; 18+ messages in thread
From: Paulo Zanoni @ 2015-11-04 19:47 UTC (permalink / raw)
To: Imre Deak; +Cc: Intel Graphics Development
2015-11-04 17:25 GMT-02:00 Imre Deak <imre.deak@intel.com>:
> After Damien's D3 fix I started to get runtime suspend residency for the
> first time and that revealed a breakage on the set_caching IOCTL path
> that accesses the HW but doesn't take an RPM ref. Fix this up.
Oh, well, that's the the RPM problem that prevents me from sleeping at
night: how can we be 100% sure we wrapped every single possible driver
entry point? At least we shout errors and warns when we detect the
problem.
By the way, we already have some tests for specific IOCTLs on
igt/pm_rpm, so adding a new one for this IOCTL wouldn't hurt :)
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f1e3fde..56cd501 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3829,6 +3829,7 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
> int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file)
> {
> + struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_i915_gem_caching *args = data;
> struct drm_i915_gem_object *obj;
> enum i915_cache_level level;
> @@ -3857,9 +3858,11 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> return -EINVAL;
> }
>
> + intel_runtime_pm_get(dev_priv);
> +
> ret = i915_mutex_lock_interruptible(dev);
> if (ret)
> - return ret;
> + goto rpm_put;
>
> obj = to_intel_bo(drm_gem_object_lookup(dev, file, args->handle));
> if (&obj->base == NULL) {
> @@ -3872,6 +3875,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> drm_gem_object_unreference(&obj->base);
> unlock:
> mutex_unlock(&dev->struct_mutex);
> +rpm_put:
> + intel_runtime_pm_put(dev_priv);
> +
> return ret;
> }
>
> --
> 2.1.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Paulo Zanoni
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-04 19:25 [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL Imre Deak
2015-11-04 19:47 ` Paulo Zanoni
@ 2015-11-04 20:57 ` Chris Wilson
2015-11-05 11:28 ` Imre Deak
1 sibling, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2015-11-04 20:57 UTC (permalink / raw)
To: Imre Deak; +Cc: intel-gfx
On Wed, Nov 04, 2015 at 09:25:32PM +0200, Imre Deak wrote:
> After Damien's D3 fix I started to get runtime suspend residency for the
> first time and that revealed a breakage on the set_caching IOCTL path
> that accesses the HW but doesn't take an RPM ref. Fix this up.
Why here (and in so many random places) and not around the PTE write
itself?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-04 20:57 ` Chris Wilson
@ 2015-11-05 11:28 ` Imre Deak
2015-11-05 11:56 ` Chris Wilson
0 siblings, 1 reply; 18+ messages in thread
From: Imre Deak @ 2015-11-05 11:28 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On ke, 2015-11-04 at 20:57 +0000, Chris Wilson wrote:
> On Wed, Nov 04, 2015 at 09:25:32PM +0200, Imre Deak wrote:
> > After Damien's D3 fix I started to get runtime suspend residency for the
> > first time and that revealed a breakage on the set_caching IOCTL path
> > that accesses the HW but doesn't take an RPM ref. Fix this up.
>
> Why here (and in so many random places) and not around the PTE write
> itself?
Imo we should take the RPM ref outside of any of our locks. Otherwise we
need hacks like we have currently in the runtime suspend handler to work
around lock inversions. It works now, but we couldn't do the same trick
if we needed to take struct_mutex for example in the resume handler too
for some reason.
> -Chris
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-05 11:28 ` Imre Deak
@ 2015-11-05 11:56 ` Chris Wilson
2015-11-05 22:57 ` Imre Deak
2015-11-17 22:16 ` Daniel Vetter
0 siblings, 2 replies; 18+ messages in thread
From: Chris Wilson @ 2015-11-05 11:56 UTC (permalink / raw)
To: Imre Deak; +Cc: intel-gfx
On Thu, Nov 05, 2015 at 01:28:32PM +0200, Imre Deak wrote:
> On ke, 2015-11-04 at 20:57 +0000, Chris Wilson wrote:
> > On Wed, Nov 04, 2015 at 09:25:32PM +0200, Imre Deak wrote:
> > > After Damien's D3 fix I started to get runtime suspend residency for the
> > > first time and that revealed a breakage on the set_caching IOCTL path
> > > that accesses the HW but doesn't take an RPM ref. Fix this up.
> >
> > Why here (and in so many random places) and not around the PTE write
> > itself?
>
> Imo we should take the RPM ref outside of any of our locks. Otherwise we
> need hacks like we have currently in the runtime suspend handler to work
> around lock inversions. It works now, but we couldn't do the same trick
> if we needed to take struct_mutex for example in the resume handler too
> for some reason.
On the other hand, it leads to hard to track down bugs and improper
documentation. Neither solution is perfect.
Note since intel_runtime_suspend has ample barriers of its own, you can
avoid the struct_mutex if you have a dedicated dev_priv->mm.fault_list.
Something along the lines of:
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 86734be..fe91ce5 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -180,11 +180,11 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
}
if (obj->stolen)
seq_printf(m, " (stolen: %08llx)", obj->stolen->start);
- if (obj->pin_display || obj->fault_mappable) {
+ if (obj->pin_display || !list_empty(&obj->fault_link)) {
char s[3], *t = s;
if (obj->pin_display)
*t++ = 'p';
- if (obj->fault_mappable)
+ if (!list_empty(&obj->fault_link))
*t++ = 'f';
*t = '\0';
seq_printf(m, " (%s mappable)", s);
@@ -474,7 +474,7 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
size = count = mappable_size = mappable_count = 0;
list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
- if (obj->fault_mappable) {
+ if (!list_empty(&obj->fault_link)) {
size += i915_gem_obj_ggtt_size(obj);
++count;
}
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1d88745..179594e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1447,28 +1447,10 @@ static int intel_runtime_suspend(struct device *device)
DRM_DEBUG_KMS("Suspending device\n");
/*
- * We could deadlock here in case another thread holding struct_mutex
- * calls RPM suspend concurrently, since the RPM suspend will wait
- * first for this RPM suspend to finish. In this case the concurrent
- * RPM resume will be followed by its RPM suspend counterpart. Still
- * for consistency return -EAGAIN, which will reschedule this suspend.
- */
- if (!mutex_trylock(&dev->struct_mutex)) {
- DRM_DEBUG_KMS("device lock contention, deffering suspend\n");
- /*
- * Bump the expiration timestamp, otherwise the suspend won't
- * be rescheduled.
- */
- pm_runtime_mark_last_busy(device);
-
- return -EAGAIN;
- }
- /*
* We are safe here against re-faults, since the fault handler takes
* an RPM reference.
*/
i915_gem_release_all_mmaps(dev_priv);
- mutex_unlock(&dev->struct_mutex);
intel_suspend_gt_powersave(dev);
intel_runtime_pm_disable_interrupts(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 55611d8..5e4904a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1268,6 +1268,8 @@ struct i915_gem_mm {
*/
struct list_head unbound_list;
+ struct list_head fault_list;
+
/** Usable portion of the GTT for GEM */
unsigned long stolen_base; /* limited to low memory (32-bit) */
@@ -2025,6 +2027,8 @@ struct drm_i915_gem_object {
struct list_head batch_pool_link;
+ struct list_head fault_link;
+
/**
* This is set if the object is on the active lists (has pending
* rendering and so a non-zero seqno), and is not set if it i s on
@@ -2069,13 +2073,6 @@ struct drm_i915_gem_object {
*/
unsigned int map_and_fenceable:1;
- /**
- * Whether the current gtt mapping needs to be mappable (and isn't just
- * mappable by accident). Track pin and fault separate for a more
- * accurate mappable working set.
- */
- unsigned int fault_mappable:1;
-
/*
* Is the object to be mapped as read-only to the GPU
* Only honoured if hardware has relevant pte bit
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 407b6b3..a90d1d8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1814,9 +1814,10 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
break;
}
- obj->fault_mappable = true;
+ if (list_empty(&obj->fault_link))
+ list_add(&obj->fault_link, &dev_priv->mm.fault_list);
} else {
- if (!obj->fault_mappable) {
+ if (list_empty(&obj->fault_link)) {
unsigned long size = min_t(unsigned long,
vma->vm_end - vma->vm_start,
obj->base.size);
@@ -1830,7 +1831,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
break;
}
- obj->fault_mappable = true;
+ list_add(&obj->fault_link, &dev_priv->mm.fault_list);
} else
ret = vm_insert_pfn(vma,
(unsigned long)vmf->virtual_address,
@@ -1903,20 +1904,20 @@ out:
void
i915_gem_release_mmap(struct drm_i915_gem_object *obj)
{
- if (!obj->fault_mappable)
+ if (list_empty(&obj->fault_link))
return;
drm_vma_node_unmap(&obj->base.vma_node,
obj->base.dev->anon_inode->i_mapping);
- obj->fault_mappable = false;
+ list_del_init(&obj->fault_link);
}
void
i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv)
{
- struct drm_i915_gem_object *obj;
+ struct drm_i915_gem_object *obj, *n;
- list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
+ list_for_each_entry_safe(obj, n, &dev_priv->mm.fault_list, fault_link)
i915_gem_release_mmap(obj);
}
@@ -4224,6 +4229,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
INIT_LIST_HEAD(&obj->obj_exec_link);
INIT_LIST_HEAD(&obj->vma_list);
INIT_LIST_HEAD(&obj->batch_pool_link);
+ INIT_LIST_HEAD(&obj->fault_link);
obj->ops = ops;
@@ -4858,6 +4864,7 @@ i915_gem_load(struct drm_device *dev)
INIT_LIST_HEAD(&dev_priv->context_list);
INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
INIT_LIST_HEAD(&dev_priv->mm.bound_list);
+ INIT_LIST_HEAD(&dev_priv->mm.fault_list);
INIT_LIST_HEAD(&dev_priv->mm.fence_list);
for (i = 0; i < I915_NUM_RINGS; i++)
init_ring_lists(&dev_priv->ring[i]);
The tricky part is reviewing the i915_gem_release_mmap() callers to
ensure that they have the right barrier. If you make
i915_gem_release_mmap() assert it holds an rpm ref, and then make the
i915_gem_release_all_mmaps() unlink the node directly that should do the
trick.
-Chris
--
Chris Wilson, InteleOpen Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-05 11:56 ` Chris Wilson
@ 2015-11-05 22:57 ` Imre Deak
2015-11-05 23:24 ` Imre Deak
2015-11-06 8:54 ` Chris Wilson
2015-11-17 22:16 ` Daniel Vetter
1 sibling, 2 replies; 18+ messages in thread
From: Imre Deak @ 2015-11-05 22:57 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Thu, 2015-11-05 at 11:56 +0000, Chris Wilson wrote:
> On Thu, Nov 05, 2015 at 01:28:32PM +0200, Imre Deak wrote:
> > On ke, 2015-11-04 at 20:57 +0000, Chris Wilson wrote:
> > > On Wed, Nov 04, 2015 at 09:25:32PM +0200, Imre Deak wrote:
> > > > After Damien's D3 fix I started to get runtime suspend residency for the
> > > > first time and that revealed a breakage on the set_caching IOCTL path
> > > > that accesses the HW but doesn't take an RPM ref. Fix this up.
> > >
> > > Why here (and in so many random places) and not around the PTE write
> > > itself?
> >
> > Imo we should take the RPM ref outside of any of our locks. Otherwise we
> > need hacks like we have currently in the runtime suspend handler to work
> > around lock inversions. It works now, but we couldn't do the same trick
> > if we needed to take struct_mutex for example in the resume handler too
> > for some reason.
>
> On the other hand, it leads to hard to track down bugs and improper
> documentation. Neither solution is perfect.
I think I understand the documentation part, not sure how that could be
solved. Perhaps RPM-held asserts right before the point where the HW
needs to be on?
What do you mean by hard to track down bugs? Couldn't that also be
tackled by asserts?
> Note since intel_runtime_suspend has ample barriers of its own, you can
> avoid the struct_mutex if you have a dedicated dev_priv->mm.fault_list.
>
> Something along the lines of:
Ok, looks interesting. But as you said we would have to then make
callers of i915_gem_release_mmap() wake up the device, which isn't
strictly necessary. (and imo as such goes somewhat against the
documentation argument)
--Imre
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 86734be..fe91ce5 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -180,11 +180,11 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> }
> if (obj->stolen)
> seq_printf(m, " (stolen: %08llx)", obj->stolen->start);
> - if (obj->pin_display || obj->fault_mappable) {
> + if (obj->pin_display || !list_empty(&obj->fault_link)) {
> char s[3], *t = s;
> if (obj->pin_display)
> *t++ = 'p';
> - if (obj->fault_mappable)
> + if (!list_empty(&obj->fault_link))
> *t++ = 'f';
> *t = '\0';
> seq_printf(m, " (%s mappable)", s);
> @@ -474,7 +474,7 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
>
> size = count = mappable_size = mappable_count = 0;
> list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> - if (obj->fault_mappable) {
> + if (!list_empty(&obj->fault_link)) {
> size += i915_gem_obj_ggtt_size(obj);
> ++count;
> }
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 1d88745..179594e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1447,28 +1447,10 @@ static int intel_runtime_suspend(struct device *device)
> DRM_DEBUG_KMS("Suspending device\n");
>
> /*
> - * We could deadlock here in case another thread holding struct_mutex
> - * calls RPM suspend concurrently, since the RPM suspend will wait
> - * first for this RPM suspend to finish. In this case the concurrent
> - * RPM resume will be followed by its RPM suspend counterpart. Still
> - * for consistency return -EAGAIN, which will reschedule this suspend.
> - */
> - if (!mutex_trylock(&dev->struct_mutex)) {
> - DRM_DEBUG_KMS("device lock contention, deffering suspend\n");
> - /*
> - * Bump the expiration timestamp, otherwise the suspend won't
> - * be rescheduled.
> - */
> - pm_runtime_mark_last_busy(device);
> -
> - return -EAGAIN;
> - }
> - /*
> * We are safe here against re-faults, since the fault handler takes
> * an RPM reference.
> */
> i915_gem_release_all_mmaps(dev_priv);
> - mutex_unlock(&dev->struct_mutex);
>
> intel_suspend_gt_powersave(dev);
> intel_runtime_pm_disable_interrupts(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 55611d8..5e4904a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1268,6 +1268,8 @@ struct i915_gem_mm {
> */
> struct list_head unbound_list;
>
> + struct list_head fault_list;
> +
> /** Usable portion of the GTT for GEM */
> unsigned long stolen_base; /* limited to low memory (32-bit) */
>
> @@ -2025,6 +2027,8 @@ struct drm_i915_gem_object {
>
> struct list_head batch_pool_link;
>
> + struct list_head fault_link;
> +
> /**
> * This is set if the object is on the active lists (has pending
> * rendering and so a non-zero seqno), and is not set if it i s on
> @@ -2069,13 +2073,6 @@ struct drm_i915_gem_object {
> */
> unsigned int map_and_fenceable:1;
>
> - /**
> - * Whether the current gtt mapping needs to be mappable (and isn't just
> - * mappable by accident). Track pin and fault separate for a more
> - * accurate mappable working set.
> - */
> - unsigned int fault_mappable:1;
> -
> /*
> * Is the object to be mapped as read-only to the GPU
> * Only honoured if hardware has relevant pte bit
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 407b6b3..a90d1d8 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1814,9 +1814,10 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> break;
> }
>
> - obj->fault_mappable = true;
> + if (list_empty(&obj->fault_link))
> + list_add(&obj->fault_link, &dev_priv->mm.fault_list);
> } else {
> - if (!obj->fault_mappable) {
> + if (list_empty(&obj->fault_link)) {
> unsigned long size = min_t(unsigned long,
> vma->vm_end - vma->vm_start,
> obj->base.size);
> @@ -1830,7 +1831,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> break;
> }
>
> - obj->fault_mappable = true;
> + list_add(&obj->fault_link, &dev_priv->mm.fault_list);
> } else
> ret = vm_insert_pfn(vma,
> (unsigned long)vmf->virtual_address,
> @@ -1903,20 +1904,20 @@ out:
> void
> i915_gem_release_mmap(struct drm_i915_gem_object *obj)
> {
> - if (!obj->fault_mappable)
> + if (list_empty(&obj->fault_link))
> return;
>
> drm_vma_node_unmap(&obj->base.vma_node,
> obj->base.dev->anon_inode->i_mapping);
> - obj->fault_mappable = false;
> + list_del_init(&obj->fault_link);
> }
>
> void
> i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv)
> {
> - struct drm_i915_gem_object *obj;
> + struct drm_i915_gem_object *obj, *n;
>
> - list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> + list_for_each_entry_safe(obj, n, &dev_priv->mm.fault_list, fault_link)
> i915_gem_release_mmap(obj);
> }
>
> @@ -4224,6 +4229,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
> INIT_LIST_HEAD(&obj->obj_exec_link);
> INIT_LIST_HEAD(&obj->vma_list);
> INIT_LIST_HEAD(&obj->batch_pool_link);
> + INIT_LIST_HEAD(&obj->fault_link);
>
> obj->ops = ops;
>
> @@ -4858,6 +4864,7 @@ i915_gem_load(struct drm_device *dev)
> INIT_LIST_HEAD(&dev_priv->context_list);
> INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
> INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> + INIT_LIST_HEAD(&dev_priv->mm.fault_list);
> INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> for (i = 0; i < I915_NUM_RINGS; i++)
> init_ring_lists(&dev_priv->ring[i]);
>
>
> The tricky part is reviewing the i915_gem_release_mmap() callers to
> ensure that they have the right barrier. If you make
> i915_gem_release_mmap() assert it holds an rpm ref, and then make the
> i915_gem_release_all_mmaps() unlink the node directly that should do the
> trick.
> -Chris
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-05 22:57 ` Imre Deak
@ 2015-11-05 23:24 ` Imre Deak
2015-11-06 8:54 ` Chris Wilson
1 sibling, 0 replies; 18+ messages in thread
From: Imre Deak @ 2015-11-05 23:24 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Fri, 2015-11-06 at 00:57 +0200, Imre Deak wrote:
> On Thu, 2015-11-05 at 11:56 +0000, Chris Wilson wrote:
> > On Thu, Nov 05, 2015 at 01:28:32PM +0200, Imre Deak wrote:
> > > On ke, 2015-11-04 at 20:57 +0000, Chris Wilson wrote:
> > > > On Wed, Nov 04, 2015 at 09:25:32PM +0200, Imre Deak wrote:
> > > > > After Damien's D3 fix I started to get runtime suspend residency for the
> > > > > first time and that revealed a breakage on the set_caching IOCTL path
> > > > > that accesses the HW but doesn't take an RPM ref. Fix this up.
> > > >
> > > > Why here (and in so many random places) and not around the PTE write
> > > > itself?
> > >
> > > Imo we should take the RPM ref outside of any of our locks. Otherwise we
> > > need hacks like we have currently in the runtime suspend handler to work
> > > around lock inversions. It works now, but we couldn't do the same trick
> > > if we needed to take struct_mutex for example in the resume handler too
> > > for some reason.
> >
> > On the other hand, it leads to hard to track down bugs and improper
> > documentation. Neither solution is perfect.
>
> I think I understand the documentation part, not sure how that could be
> solved. Perhaps RPM-held asserts right before the point where the HW
> needs to be on?
>
> What do you mean by hard to track down bugs? Couldn't that also be
> tackled by asserts?
>
> > Note since intel_runtime_suspend has ample barriers of its own, you can
> > avoid the struct_mutex if you have a dedicated dev_priv->mm.fault_list.
> >
> > Something along the lines of:
>
> Ok, looks interesting. But as you said we would have to then make
> callers of i915_gem_release_mmap() wake up the device, which isn't
> strictly necessary. (and imo as such goes somewhat against the
> documentation argument)
Actually, we could also use pm_runtime_get_noresume(). But I find this
would just complicate things without a real benefit.
> --Imre
>
>
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index 86734be..fe91ce5 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -180,11 +180,11 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> > }
> > if (obj->stolen)
> > seq_printf(m, " (stolen: %08llx)", obj->stolen->start);
> > - if (obj->pin_display || obj->fault_mappable) {
> > + if (obj->pin_display || !list_empty(&obj->fault_link)) {
> > char s[3], *t = s;
> > if (obj->pin_display)
> > *t++ = 'p';
> > - if (obj->fault_mappable)
> > + if (!list_empty(&obj->fault_link))
> > *t++ = 'f';
> > *t = '\0';
> > seq_printf(m, " (%s mappable)", s);
> > @@ -474,7 +474,7 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
> >
> > size = count = mappable_size = mappable_count = 0;
> > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > - if (obj->fault_mappable) {
> > + if (!list_empty(&obj->fault_link)) {
> > size += i915_gem_obj_ggtt_size(obj);
> > ++count;
> > }
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 1d88745..179594e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1447,28 +1447,10 @@ static int intel_runtime_suspend(struct device *device)
> > DRM_DEBUG_KMS("Suspending device\n");
> >
> > /*
> > - * We could deadlock here in case another thread holding struct_mutex
> > - * calls RPM suspend concurrently, since the RPM suspend will wait
> > - * first for this RPM suspend to finish. In this case the concurrent
> > - * RPM resume will be followed by its RPM suspend counterpart. Still
> > - * for consistency return -EAGAIN, which will reschedule this suspend.
> > - */
> > - if (!mutex_trylock(&dev->struct_mutex)) {
> > - DRM_DEBUG_KMS("device lock contention, deffering suspend\n");
> > - /*
> > - * Bump the expiration timestamp, otherwise the suspend won't
> > - * be rescheduled.
> > - */
> > - pm_runtime_mark_last_busy(device);
> > -
> > - return -EAGAIN;
> > - }
> > - /*
> > * We are safe here against re-faults, since the fault handler takes
> > * an RPM reference.
> > */
> > i915_gem_release_all_mmaps(dev_priv);
> > - mutex_unlock(&dev->struct_mutex);
> >
> > intel_suspend_gt_powersave(dev);
> > intel_runtime_pm_disable_interrupts(dev_priv);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 55611d8..5e4904a 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1268,6 +1268,8 @@ struct i915_gem_mm {
> > */
> > struct list_head unbound_list;
> >
> > + struct list_head fault_list;
> > +
> > /** Usable portion of the GTT for GEM */
> > unsigned long stolen_base; /* limited to low memory (32-bit) */
> >
> > @@ -2025,6 +2027,8 @@ struct drm_i915_gem_object {
> >
> > struct list_head batch_pool_link;
> >
> > + struct list_head fault_link;
> > +
> > /**
> > * This is set if the object is on the active lists (has pending
> > * rendering and so a non-zero seqno), and is not set if it i s on
> > @@ -2069,13 +2073,6 @@ struct drm_i915_gem_object {
> > */
> > unsigned int map_and_fenceable:1;
> >
> > - /**
> > - * Whether the current gtt mapping needs to be mappable (and isn't just
> > - * mappable by accident). Track pin and fault separate for a more
> > - * accurate mappable working set.
> > - */
> > - unsigned int fault_mappable:1;
> > -
> > /*
> > * Is the object to be mapped as read-only to the GPU
> > * Only honoured if hardware has relevant pte bit
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 407b6b3..a90d1d8 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1814,9 +1814,10 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> > break;
> > }
> >
> > - obj->fault_mappable = true;
> > + if (list_empty(&obj->fault_link))
> > + list_add(&obj->fault_link, &dev_priv->mm.fault_list);
> > } else {
> > - if (!obj->fault_mappable) {
> > + if (list_empty(&obj->fault_link)) {
> > unsigned long size = min_t(unsigned long,
> > vma->vm_end - vma->vm_start,
> > obj->base.size);
> > @@ -1830,7 +1831,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> > break;
> > }
> >
> > - obj->fault_mappable = true;
> > + list_add(&obj->fault_link, &dev_priv->mm.fault_list);
> > } else
> > ret = vm_insert_pfn(vma,
> > (unsigned long)vmf->virtual_address,
> > @@ -1903,20 +1904,20 @@ out:
> > void
> > i915_gem_release_mmap(struct drm_i915_gem_object *obj)
> > {
> > - if (!obj->fault_mappable)
> > + if (list_empty(&obj->fault_link))
> > return;
> >
> > drm_vma_node_unmap(&obj->base.vma_node,
> > obj->base.dev->anon_inode->i_mapping);
> > - obj->fault_mappable = false;
> > + list_del_init(&obj->fault_link);
> > }
> >
> > void
> > i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv)
> > {
> > - struct drm_i915_gem_object *obj;
> > + struct drm_i915_gem_object *obj, *n;
> >
> > - list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> > + list_for_each_entry_safe(obj, n, &dev_priv->mm.fault_list, fault_link)
> > i915_gem_release_mmap(obj);
> > }
> >
> > @@ -4224,6 +4229,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
> > INIT_LIST_HEAD(&obj->obj_exec_link);
> > INIT_LIST_HEAD(&obj->vma_list);
> > INIT_LIST_HEAD(&obj->batch_pool_link);
> > + INIT_LIST_HEAD(&obj->fault_link);
> >
> > obj->ops = ops;
> >
> > @@ -4858,6 +4864,7 @@ i915_gem_load(struct drm_device *dev)
> > INIT_LIST_HEAD(&dev_priv->context_list);
> > INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
> > INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> > + INIT_LIST_HEAD(&dev_priv->mm.fault_list);
> > INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> > for (i = 0; i < I915_NUM_RINGS; i++)
> > init_ring_lists(&dev_priv->ring[i]);
> >
> >
> > The tricky part is reviewing the i915_gem_release_mmap() callers to
> > ensure that they have the right barrier. If you make
> > i915_gem_release_mmap() assert it holds an rpm ref, and then make the
> > i915_gem_release_all_mmaps() unlink the node directly that should do the
> > trick.
> > -Chris
> >
>
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-05 22:57 ` Imre Deak
2015-11-05 23:24 ` Imre Deak
@ 2015-11-06 8:54 ` Chris Wilson
2015-11-09 13:09 ` Imre Deak
1 sibling, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2015-11-06 8:54 UTC (permalink / raw)
To: Imre Deak; +Cc: intel-gfx
On Fri, Nov 06, 2015 at 12:57:35AM +0200, Imre Deak wrote:
> On Thu, 2015-11-05 at 11:56 +0000, Chris Wilson wrote:
> > On Thu, Nov 05, 2015 at 01:28:32PM +0200, Imre Deak wrote:
> > > On ke, 2015-11-04 at 20:57 +0000, Chris Wilson wrote:
> > > > On Wed, Nov 04, 2015 at 09:25:32PM +0200, Imre Deak wrote:
> > > > > After Damien's D3 fix I started to get runtime suspend residency for the
> > > > > first time and that revealed a breakage on the set_caching IOCTL path
> > > > > that accesses the HW but doesn't take an RPM ref. Fix this up.
> > > >
> > > > Why here (and in so many random places) and not around the PTE write
> > > > itself?
> > >
> > > Imo we should take the RPM ref outside of any of our locks. Otherwise we
> > > need hacks like we have currently in the runtime suspend handler to work
> > > around lock inversions. It works now, but we couldn't do the same trick
> > > if we needed to take struct_mutex for example in the resume handler too
> > > for some reason.
> >
> > On the other hand, it leads to hard to track down bugs and improper
> > documentation. Neither solution is perfect.
>
> I think I understand the documentation part, not sure how that could be
> solved. Perhaps RPM-held asserts right before the point where the HW
> needs to be on?
>
> What do you mean by hard to track down bugs? Couldn't that also be
> tackled by asserts?
>
> > Note since intel_runtime_suspend has ample barriers of its own, you can
> > avoid the struct_mutex if you have a dedicated dev_priv->mm.fault_list.
> >
> > Something along the lines of:
>
> Ok, looks interesting. But as you said we would have to then make
> callers of i915_gem_release_mmap() wake up the device, which isn't
> strictly necessary. (and imo as such goes somewhat against the
> documentation argument)
Outside of suspend, we only revoke the CPU PTE mapping when we change
the hardware (as doing so is very expensive). So all callers should
already have a reason (and be holding) the rpm wakelock, the only
complication from my point of view is enforcing that and reviewing that
what I said is true.
From the rpm point of view, this should improve the success of runtime
suspend, and reduce wakelocks.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-06 8:54 ` Chris Wilson
@ 2015-11-09 13:09 ` Imre Deak
2015-11-09 13:25 ` Chris Wilson
0 siblings, 1 reply; 18+ messages in thread
From: Imre Deak @ 2015-11-09 13:09 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On pe, 2015-11-06 at 08:54 +0000, Chris Wilson wrote:
> On Fri, Nov 06, 2015 at 12:57:35AM +0200, Imre Deak wrote:
> > On Thu, 2015-11-05 at 11:56 +0000, Chris Wilson wrote:
> > > On Thu, Nov 05, 2015 at 01:28:32PM +0200, Imre Deak wrote:
> > > > On ke, 2015-11-04 at 20:57 +0000, Chris Wilson wrote:
> > > > > On Wed, Nov 04, 2015 at 09:25:32PM +0200, Imre Deak wrote:
> > > > > > After Damien's D3 fix I started to get runtime suspend
> > > > > > residency for the
> > > > > > first time and that revealed a breakage on the set_caching
> > > > > > IOCTL path
> > > > > > that accesses the HW but doesn't take an RPM ref. Fix this
> > > > > > up.
> > > > >
> > > > > Why here (and in so many random places) and not around the
> > > > > PTE write
> > > > > itself?
> > > >
> > > > Imo we should take the RPM ref outside of any of our locks.
> > > > Otherwise we
> > > > need hacks like we have currently in the runtime suspend
> > > > handler to work
> > > > around lock inversions. It works now, but we couldn't do the
> > > > same trick
> > > > if we needed to take struct_mutex for example in the resume
> > > > handler too
> > > > for some reason.
> > >
> > > On the other hand, it leads to hard to track down bugs and
> > > improper
> > > documentation. Neither solution is perfect.
> >
> > I think I understand the documentation part, not sure how that
> > could be
> > solved. Perhaps RPM-held asserts right before the point where the
> > HW
> > needs to be on?
> >
> > What do you mean by hard to track down bugs? Couldn't that also be
> > tackled by asserts?
> >
> > > Note since intel_runtime_suspend has ample barriers of its own,
> > > you can
> > > avoid the struct_mutex if you have a dedicated dev_priv
> > > ->mm.fault_list.
> > >
> > > Something along the lines of:
> >
> > Ok, looks interesting. But as you said we would have to then make
> > callers of i915_gem_release_mmap() wake up the device, which isn't
> > strictly necessary. (and imo as such goes somewhat against the
> > documentation argument)
>
> Outside of suspend, we only revoke the CPU PTE mapping when we change
> the hardware (as doing so is very expensive). So all callers should
> already have a reason (and be holding) the rpm wakelock, the only
> complication from my point of view is enforcing that and reviewing
> that
> what I said is true.
Looked through it, it seems only i915_gem_set_tiling() could release
the PTE's without waking up the hardware (if no need to unbind the
object). Otherwise it's true that all callers hold (or should hold)
already an RPM ref. To fix the set tiling case to work after your
optimization we could wake up the HW unconditionally there, use a
no_resume RPM ref+and RPM barrier or a separate new lock for the fault
list.
> From the rpm point of view, this should improve the success of
> runtime suspend, and reduce wakelocks.
Yes, seems like a worthy optimization, since I assume struct_mutex can
be held for a long time without the need to wake up the hardware.
Are you ok to first have the fix I posted and a similar one for
i915_gem_set_tiling()? And then to follow-up with your plan.
--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-09 13:09 ` Imre Deak
@ 2015-11-09 13:25 ` Chris Wilson
2015-11-09 13:36 ` Imre Deak
0 siblings, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2015-11-09 13:25 UTC (permalink / raw)
To: Imre Deak; +Cc: intel-gfx
On Mon, Nov 09, 2015 at 03:09:18PM +0200, Imre Deak wrote:
> Looked through it, it seems only i915_gem_set_tiling() could release
> the PTE's without waking up the hardware (if no need to unbind the
> object). Otherwise it's true that all callers hold (or should hold)
> already an RPM ref. To fix the set tiling case to work after your
> optimization we could wake up the HW unconditionally there, use a
> no_resume RPM ref+and RPM barrier or a separate new lock for the fault
> list.
I was suggesting we move to the model where writes through gsm took the
rpm reference itself.
> > From the rpm point of view, this should improve the success of
> > runtime suspend, and reduce wakelocks.
>
> Yes, seems like a worthy optimization, since I assume struct_mutex can
> be held for a long time without the need to wake up the hardware.
Admittedly most of the time we hold struct_mutex, the hw will be awake
for other reasons. But there are many times where we do take the
struct_mutex for 10s (if not 100s!) of milliseconds where the hw is
completely idle, and so every chance to reduce usage/contention on
struct_mutex is a little victory.
> Are you ok to first have the fix I posted and a similar one for
> i915_gem_set_tiling()? And then to follow-up with your plan.
Yes, adding the extra reference to that ioctl, juggling with the
struct_mutex and then moving the rpm reference to where it is required
lgtm.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-09 13:25 ` Chris Wilson
@ 2015-11-09 13:36 ` Imre Deak
2015-11-09 13:43 ` Chris Wilson
0 siblings, 1 reply; 18+ messages in thread
From: Imre Deak @ 2015-11-09 13:36 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On ma, 2015-11-09 at 13:25 +0000, Chris Wilson wrote:
> On Mon, Nov 09, 2015 at 03:09:18PM +0200, Imre Deak wrote:
> > Looked through it, it seems only i915_gem_set_tiling() could
> > release
> > the PTE's without waking up the hardware (if no need to unbind the
> > object). Otherwise it's true that all callers hold (or should hold)
> > already an RPM ref. To fix the set tiling case to work after your
> > optimization we could wake up the HW unconditionally there, use a
> > no_resume RPM ref+and RPM barrier or a separate new lock for the
> > fault
> > list.
>
> I was suggesting we move to the model where writes through gsm took
> the
> rpm reference itself.
Yes, but even then you want to have a lock around updating the new
fault list, no? So if we go with your way and push down the RPM ref
where GSM is written, we wouldn't have a lock around the fault_list
update in i915_gem_set_tiling() (via i915_gem_release_mmap()). That's
where I meant we need an extra ref/lock above.
> > > From the rpm point of view, this should improve the success of
> > > runtime suspend, and reduce wakelocks.
> >
> > Yes, seems like a worthy optimization, since I assume struct_mutex
> > can
> > be held for a long time without the need to wake up the hardware.
>
> Admittedly most of the time we hold struct_mutex, the hw will be
> awake
> for other reasons. But there are many times where we do take the
> struct_mutex for 10s (if not 100s!) of milliseconds where the hw is
> completely idle, and so every chance to reduce usage/contention on
> struct_mutex is a little victory.
>
> > Are you ok to first have the fix I posted and a similar one for
> > i915_gem_set_tiling()? And then to follow-up with your plan.
>
> Yes, adding the extra reference to that ioctl, juggling with the
> struct_mutex and then moving the rpm reference to where it is
> required
> lgtm.
Ok, will post a new one for set_tiling.
--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-09 13:36 ` Imre Deak
@ 2015-11-09 13:43 ` Chris Wilson
0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2015-11-09 13:43 UTC (permalink / raw)
To: Imre Deak; +Cc: intel-gfx
On Mon, Nov 09, 2015 at 03:36:10PM +0200, Imre Deak wrote:
> On ma, 2015-11-09 at 13:25 +0000, Chris Wilson wrote:
> > On Mon, Nov 09, 2015 at 03:09:18PM +0200, Imre Deak wrote:
> > > Looked through it, it seems only i915_gem_set_tiling() could
> > > release
> > > the PTE's without waking up the hardware (if no need to unbind the
> > > object). Otherwise it's true that all callers hold (or should hold)
> > > already an RPM ref. To fix the set tiling case to work after your
> > > optimization we could wake up the HW unconditionally there, use a
> > > no_resume RPM ref+and RPM barrier or a separate new lock for the
> > > fault
> > > list.
> >
> > I was suggesting we move to the model where writes through gsm took
> > the
> > rpm reference itself.
>
> Yes, but even then you want to have a lock around updating the new
> fault list, no? So if we go with your way and push down the RPM ref
> where GSM is written, we wouldn't have a lock around the fault_list
> update in i915_gem_set_tiling() (via i915_gem_release_mmap()). That's
> where I meant we need an extra ref/lock above.
Ok, I remember some of the specifics I had in mind about having to do it
inside the if (vma->map_and_fencable) branch of i915_vma_unbind() as
opposed to be able to push it into ggtt_unbind_vma... Hmm, pushing that
itself down to ggtt_unbind_vma isn't too silly. Equally moving the
vma->ggtt_view.pages = NULL would also help with symmetry.
Plenty of opportunity here for cleanup that should pay off nicely wrt to
rpm handling :)
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-04 19:47 ` Paulo Zanoni
@ 2015-11-16 13:33 ` Jani Nikula
2015-11-17 19:00 ` Daniel Vetter
1 sibling, 0 replies; 18+ messages in thread
From: Jani Nikula @ 2015-11-16 13:33 UTC (permalink / raw)
To: Paulo Zanoni, Imre Deak; +Cc: Intel Graphics Development
On Wed, 04 Nov 2015, Paulo Zanoni <przanoni@gmail.com> wrote:
> 2015-11-04 17:25 GMT-02:00 Imre Deak <imre.deak@intel.com>:
>> After Damien's D3 fix I started to get runtime suspend residency for the
>> first time and that revealed a breakage on the set_caching IOCTL path
>> that accesses the HW but doesn't take an RPM ref. Fix this up.
>
> Oh, well, that's the the RPM problem that prevents me from sleeping at
> night: how can we be 100% sure we wrapped every single possible driver
> entry point? At least we shout errors and warns when we detect the
> problem.
>
> By the way, we already have some tests for specific IOCTLs on
> igt/pm_rpm, so adding a new one for this IOCTL wouldn't hurt :)
>
> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Pushed to drm-intel-fixes, thanks for the patch and review.
BR,
Jani.
>
>>
>> Signed-off-by: Imre Deak <imre.deak@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_gem.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index f1e3fde..56cd501 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3829,6 +3829,7 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
>> int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>> struct drm_file *file)
>> {
>> + struct drm_i915_private *dev_priv = dev->dev_private;
>> struct drm_i915_gem_caching *args = data;
>> struct drm_i915_gem_object *obj;
>> enum i915_cache_level level;
>> @@ -3857,9 +3858,11 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>> return -EINVAL;
>> }
>>
>> + intel_runtime_pm_get(dev_priv);
>> +
>> ret = i915_mutex_lock_interruptible(dev);
>> if (ret)
>> - return ret;
>> + goto rpm_put;
>>
>> obj = to_intel_bo(drm_gem_object_lookup(dev, file, args->handle));
>> if (&obj->base == NULL) {
>> @@ -3872,6 +3875,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>> drm_gem_object_unreference(&obj->base);
>> unlock:
>> mutex_unlock(&dev->struct_mutex);
>> +rpm_put:
>> + intel_runtime_pm_put(dev_priv);
>> +
>> return ret;
>> }
>>
>> --
>> 2.1.4
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Jani Nikula, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-04 19:47 ` Paulo Zanoni
2015-11-16 13:33 ` Jani Nikula
@ 2015-11-17 19:00 ` Daniel Vetter
2015-11-17 19:22 ` Imre Deak
1 sibling, 1 reply; 18+ messages in thread
From: Daniel Vetter @ 2015-11-17 19:00 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: Intel Graphics Development
On Wed, Nov 04, 2015 at 05:47:12PM -0200, Paulo Zanoni wrote:
> 2015-11-04 17:25 GMT-02:00 Imre Deak <imre.deak@intel.com>:
> > After Damien's D3 fix I started to get runtime suspend residency for the
> > first time and that revealed a breakage on the set_caching IOCTL path
> > that accesses the HW but doesn't take an RPM ref. Fix this up.
>
> Oh, well, that's the the RPM problem that prevents me from sleeping at
> night: how can we be 100% sure we wrapped every single possible driver
> entry point? At least we shout errors and warns when we detect the
> problem.
>
> By the way, we already have some tests for specific IOCTLs on
> igt/pm_rpm, so adding a new one for this IOCTL wouldn't hurt :)
Yeah, rpm subtests for set_caching and set_tiling are indeed missing.
Imre, can you please add them?
Thanks, Daniel
>
> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> >
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_gem.c | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index f1e3fde..56cd501 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3829,6 +3829,7 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
> > int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *file)
> > {
> > + struct drm_i915_private *dev_priv = dev->dev_private;
> > struct drm_i915_gem_caching *args = data;
> > struct drm_i915_gem_object *obj;
> > enum i915_cache_level level;
> > @@ -3857,9 +3858,11 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > return -EINVAL;
> > }
> >
> > + intel_runtime_pm_get(dev_priv);
> > +
> > ret = i915_mutex_lock_interruptible(dev);
> > if (ret)
> > - return ret;
> > + goto rpm_put;
> >
> > obj = to_intel_bo(drm_gem_object_lookup(dev, file, args->handle));
> > if (&obj->base == NULL) {
> > @@ -3872,6 +3875,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > drm_gem_object_unreference(&obj->base);
> > unlock:
> > mutex_unlock(&dev->struct_mutex);
> > +rpm_put:
> > + intel_runtime_pm_put(dev_priv);
> > +
> > return ret;
> > }
> >
> > --
> > 2.1.4
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
>
>
> --
> Paulo Zanoni
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-17 19:00 ` Daniel Vetter
@ 2015-11-17 19:22 ` Imre Deak
0 siblings, 0 replies; 18+ messages in thread
From: Imre Deak @ 2015-11-17 19:22 UTC (permalink / raw)
To: Daniel Vetter, Paulo Zanoni; +Cc: Intel Graphics Development
On ti, 2015-11-17 at 20:00 +0100, Daniel Vetter wrote:
> On Wed, Nov 04, 2015 at 05:47:12PM -0200, Paulo Zanoni wrote:
> > 2015-11-04 17:25 GMT-02:00 Imre Deak <imre.deak@intel.com>:
> > > After Damien's D3 fix I started to get runtime suspend residency
> > > for the
> > > first time and that revealed a breakage on the set_caching IOCTL
> > > path
> > > that accesses the HW but doesn't take an RPM ref. Fix this up.
> >
> > Oh, well, that's the the RPM problem that prevents me from sleeping
> > at
> > night: how can we be 100% sure we wrapped every single possible
> > driver
> > entry point? At least we shout errors and warns when we detect the
> > problem.
> >
> > By the way, we already have some tests for specific IOCTLs on
> > igt/pm_rpm, so adding a new one for this IOCTL wouldn't hurt :)
>
> Yeah, rpm subtests for set_caching and set_tiling are indeed missing.
> Imre, can you please add them?
Ok, will add them.
--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-05 11:56 ` Chris Wilson
2015-11-05 22:57 ` Imre Deak
@ 2015-11-17 22:16 ` Daniel Vetter
2015-11-17 22:38 ` Chris Wilson
1 sibling, 1 reply; 18+ messages in thread
From: Daniel Vetter @ 2015-11-17 22:16 UTC (permalink / raw)
To: Chris Wilson, Imre Deak, intel-gfx
On Thu, Nov 5, 2015 at 12:56 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> The tricky part is reviewing the i915_gem_release_mmap() callers to
> ensure that they have the right barrier. If you make
> i915_gem_release_mmap() assert it holds an rpm ref, and then make the
> i915_gem_release_all_mmaps() unlink the node directly that should do the
> trick.
I think the easier solution would be to add a mutex for the
fault_list. We call release_mmap from a lot of places, and for many it
doesn't make sense or would be nontrivial to guarantee that we hold an
rpm reference. The lock won't be a problem as long as it's always
nested within the rpm_get/put and never the other way round (which was
all the trouble with dev->struct_mutex).
Since this is creating noise in the CI system with
pci_pm_runtime_suspend(): intel_runtime_suspend+0x0/0x240 [i915] returns -11
can you please bake this into a proper patch?
Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-17 22:16 ` Daniel Vetter
@ 2015-11-17 22:38 ` Chris Wilson
2015-11-17 22:49 ` Imre Deak
0 siblings, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2015-11-17 22:38 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
On Tue, Nov 17, 2015 at 11:16:09PM +0100, Daniel Vetter wrote:
> On Thu, Nov 5, 2015 at 12:56 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > The tricky part is reviewing the i915_gem_release_mmap() callers to
> > ensure that they have the right barrier. If you make
> > i915_gem_release_mmap() assert it holds an rpm ref, and then make the
> > i915_gem_release_all_mmaps() unlink the node directly that should do the
> > trick.
>
> I think the easier solution would be to add a mutex for the
> fault_list. We call release_mmap from a lot of places,
We don't though. The only times we do are when we touching hw registers
(or gsm):
set_tiling_ioctl - which also may unbind and so needs rpm
fence_write - which needs rpm to write the reigsters
vma_unbind - which needs rpm to write through the gsm
set_caching - which needs rpm to write through the gsm
and currently by rpm itself.
I think it is certainly reasonable to use the rpm barriers for the
faulted list. The only one where we have to actually ensure we hold rpm
is the set_tiling_ioctl.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
2015-11-17 22:38 ` Chris Wilson
@ 2015-11-17 22:49 ` Imre Deak
0 siblings, 0 replies; 18+ messages in thread
From: Imre Deak @ 2015-11-17 22:49 UTC (permalink / raw)
To: Chris Wilson, Daniel Vetter; +Cc: intel-gfx
On Tue, 2015-11-17 at 22:38 +0000, Chris Wilson wrote:
> On Tue, Nov 17, 2015 at 11:16:09PM +0100, Daniel Vetter wrote:
> > On Thu, Nov 5, 2015 at 12:56 PM, Chris Wilson <chris@chris-wilson.c
> > o.uk> wrote:
> > > The tricky part is reviewing the i915_gem_release_mmap() callers
> > > to
> > > ensure that they have the right barrier. If you make
> > > i915_gem_release_mmap() assert it holds an rpm ref, and then make
> > > the
> > > i915_gem_release_all_mmaps() unlink the node directly that should
> > > do the
> > > trick.
> >
> > I think the easier solution would be to add a mutex for the
> > fault_list. We call release_mmap from a lot of places,
>
> We don't though. The only times we do are when we touching hw
> registers
> (or gsm):
>
> set_tiling_ioctl - which also may unbind and so needs rpm
> fence_write - which needs rpm to write the reigsters
> vma_unbind - which needs rpm to write through the gsm
> set_caching - which needs rpm to write through the gsm
>
> and currently by rpm itself.
>
> I think it is certainly reasonable to use the rpm barriers for the
> faulted list. The only one where we have to actually ensure we hold
> rpm
> is the set_tiling_ioctl.
Btw, since this would be a bigger change shouldn't we first add the new
RPM wakelock asserts? That's mostly reviewed already anyway, I still
have to check the RPS work which would need the same handling as the
hang check work (Chris' idea to use rpm_try_get there).
So how about the following until that, which is the correct way in any
case imo:
void __suspend_report_result(const char *function, void *fn, int ret)
{
- if (ret)
+ if (ret == -EBUSY || ret == -EAGAIN)
+ printk(KERN_DEBUG "%s(): %pF returns %d\n", function, fn, ret);
+ else if (ret)
printk(KERN_ERR "%s(): %pF returns %d\n", function, fn, ret);
}
--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2015-11-17 22:49 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-04 19:25 [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL Imre Deak
2015-11-04 19:47 ` Paulo Zanoni
2015-11-16 13:33 ` Jani Nikula
2015-11-17 19:00 ` Daniel Vetter
2015-11-17 19:22 ` Imre Deak
2015-11-04 20:57 ` Chris Wilson
2015-11-05 11:28 ` Imre Deak
2015-11-05 11:56 ` Chris Wilson
2015-11-05 22:57 ` Imre Deak
2015-11-05 23:24 ` Imre Deak
2015-11-06 8:54 ` Chris Wilson
2015-11-09 13:09 ` Imre Deak
2015-11-09 13:25 ` Chris Wilson
2015-11-09 13:36 ` Imre Deak
2015-11-09 13:43 ` Chris Wilson
2015-11-17 22:16 ` Daniel Vetter
2015-11-17 22:38 ` Chris Wilson
2015-11-17 22:49 ` Imre Deak
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox