* [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
@ 2013-08-25 19:45 Chris Wilson
2013-08-26 9:59 ` Ville Syrjälä
0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2013-08-25 19:45 UTC (permalink / raw)
To: intel-gfx
Obtaining the forcwake requires expensive and time consuming
serialisation. And we often try to obtain the forcewake multiple times
in very quick succession. We can reduce the overhead of these sequences
by delaying the forcewake release, and so not hammer the hw quite so
hard.
I was hoping this would help with the spurious
[drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear.
found on Haswell. Alas not.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_dma.c | 2 ++
drivers/gpu/drm/i915/i915_drv.h | 3 +++
drivers/gpu/drm/i915/intel_uncore.c | 33 +++++++++++++++++++++++++++++++--
3 files changed, 36 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 883990f..c0fb23e 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1800,6 +1800,8 @@ int i915_driver_unload(struct drm_device *dev)
dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
+ intel_uncore_fini(dev);
+
if (dev_priv->slab)
kmem_cache_destroy(dev_priv->slab);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a6354c3..8c93d93 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -406,6 +406,8 @@ struct intel_uncore {
unsigned fifo_count;
unsigned forcewake_count;
+
+ struct delayed_work force_wake_work;
};
#define DEV_INFO_FOR_EACH_FLAG(func, sep) \
@@ -1792,6 +1794,7 @@ extern void intel_uncore_early_sanitize(struct drm_device *dev);
extern void intel_uncore_init(struct drm_device *dev);
extern void intel_uncore_clear_errors(struct drm_device *dev);
extern void intel_uncore_check_errors(struct drm_device *dev);
+extern void intel_uncore_fini(struct drm_device *dev);
void
i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask);
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 8f5bc86..50fdad7 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -204,6 +204,18 @@ static void vlv_force_wake_put(struct drm_i915_private *dev_priv)
gen6_gt_check_fifodbg(dev_priv);
}
+static void gen6_force_wake_work(struct work_struct *work)
+{
+ struct drm_i915_private *dev_priv =
+ container_of(work, typeof(*dev_priv), uncore.force_wake_work.work);
+ unsigned long irqflags;
+
+ spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+ if (--dev_priv->uncore.forcewake_count == 0)
+ dev_priv->uncore.funcs.force_wake_put(dev_priv);
+ spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+}
+
void intel_uncore_early_sanitize(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
@@ -216,6 +228,9 @@ void intel_uncore_init(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
+ INIT_DELAYED_WORK(&dev_priv->uncore.force_wake_work,
+ gen6_force_wake_work);
+
if (IS_VALLEYVIEW(dev)) {
dev_priv->uncore.funcs.force_wake_get = vlv_force_wake_get;
dev_priv->uncore.funcs.force_wake_put = vlv_force_wake_put;
@@ -261,6 +276,13 @@ void intel_uncore_init(struct drm_device *dev)
}
}
+void intel_uncore_fini(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+
+ cancel_delayed_work_sync(&dev_priv->uncore.force_wake_work);
+}
+
void intel_uncore_sanitize(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
@@ -301,8 +323,15 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
unsigned long irqflags;
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
- if (--dev_priv->uncore.forcewake_count == 0)
- dev_priv->uncore.funcs.force_wake_put(dev_priv);
+ if (--dev_priv->uncore.forcewake_count == 0) {
+ if (dev_priv->uncore.force_wake_work.work.func) {
+ dev_priv->uncore.forcewake_count++;
+ mod_delayed_work(dev_priv->wq,
+ &dev_priv->uncore.force_wake_work,
+ 1);
+ } else
+ dev_priv->uncore.funcs.force_wake_put(dev_priv);
+ }
spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
}
--
1.8.4.rc3
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
2013-08-25 19:45 [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie Chris Wilson
@ 2013-08-26 9:59 ` Ville Syrjälä
0 siblings, 0 replies; 9+ messages in thread
From: Ville Syrjälä @ 2013-08-26 9:59 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Sun, Aug 25, 2013 at 08:45:34PM +0100, Chris Wilson wrote:
> Obtaining the forcwake requires expensive and time consuming
> serialisation. And we often try to obtain the forcewake multiple times
> in very quick succession. We can reduce the overhead of these sequences
> by delaying the forcewake release, and so not hammer the hw quite so
> hard.
>
> I was hoping this would help with the spurious
> [drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear.
> found on Haswell. Alas not.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_dma.c | 2 ++
> drivers/gpu/drm/i915/i915_drv.h | 3 +++
> drivers/gpu/drm/i915/intel_uncore.c | 33 +++++++++++++++++++++++++++++++--
> 3 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 883990f..c0fb23e 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1800,6 +1800,8 @@ int i915_driver_unload(struct drm_device *dev)
>
> dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
>
> + intel_uncore_fini(dev);
> +
We've already unmapped the registers.
> if (dev_priv->slab)
> kmem_cache_destroy(dev_priv->slab);
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a6354c3..8c93d93 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -406,6 +406,8 @@ struct intel_uncore {
>
> unsigned fifo_count;
> unsigned forcewake_count;
> +
> + struct delayed_work force_wake_work;
> };
>
> #define DEV_INFO_FOR_EACH_FLAG(func, sep) \
> @@ -1792,6 +1794,7 @@ extern void intel_uncore_early_sanitize(struct drm_device *dev);
> extern void intel_uncore_init(struct drm_device *dev);
> extern void intel_uncore_clear_errors(struct drm_device *dev);
> extern void intel_uncore_check_errors(struct drm_device *dev);
> +extern void intel_uncore_fini(struct drm_device *dev);
>
> void
> i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask);
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 8f5bc86..50fdad7 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -204,6 +204,18 @@ static void vlv_force_wake_put(struct drm_i915_private *dev_priv)
> gen6_gt_check_fifodbg(dev_priv);
> }
>
> +static void gen6_force_wake_work(struct work_struct *work)
> +{
> + struct drm_i915_private *dev_priv =
> + container_of(work, typeof(*dev_priv), uncore.force_wake_work.work);
> + unsigned long irqflags;
> +
> + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> + if (--dev_priv->uncore.forcewake_count == 0)
> + dev_priv->uncore.funcs.force_wake_put(dev_priv);
> + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
> +}
> +
> void intel_uncore_early_sanitize(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -216,6 +228,9 @@ void intel_uncore_init(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
>
> + INIT_DELAYED_WORK(&dev_priv->uncore.force_wake_work,
> + gen6_force_wake_work);
> +
> if (IS_VALLEYVIEW(dev)) {
> dev_priv->uncore.funcs.force_wake_get = vlv_force_wake_get;
> dev_priv->uncore.funcs.force_wake_put = vlv_force_wake_put;
> @@ -261,6 +276,13 @@ void intel_uncore_init(struct drm_device *dev)
> }
> }
>
> +void intel_uncore_fini(struct drm_device *dev)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> +
> + cancel_delayed_work_sync(&dev_priv->uncore.force_wake_work);
Aren't we potentially leaking a forcewake here? Or do we clear it
forcefully somewhere else?
> +}
> +
> void intel_uncore_sanitize(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -301,8 +323,15 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
> unsigned long irqflags;
>
> spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> - if (--dev_priv->uncore.forcewake_count == 0)
> - dev_priv->uncore.funcs.force_wake_put(dev_priv);
> + if (--dev_priv->uncore.forcewake_count == 0) {
> + if (dev_priv->uncore.force_wake_work.work.func) {
> + dev_priv->uncore.forcewake_count++;
> + mod_delayed_work(dev_priv->wq,
> + &dev_priv->uncore.force_wake_work,
> + 1);
> + } else
> + dev_priv->uncore.funcs.force_wake_put(dev_priv);
> + }
> spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
> }
>
> --
> 1.8.4.rc3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
@ 2013-08-26 11:06 Chris Wilson
2013-08-26 11:49 ` Ville Syrjälä
2014-02-23 20:12 ` Ben Widawsky
0 siblings, 2 replies; 9+ messages in thread
From: Chris Wilson @ 2013-08-26 11:06 UTC (permalink / raw)
To: intel-gfx
Obtaining the forcwake requires expensive and time consuming
serialisation. And we often try to obtain the forcewake multiple times
in very quick succession. We can reduce the overhead of these sequences
by delaying the forcewake release, and so not hammer the hw quite so
hard.
I was hoping this would help with the spurious
[drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear.
found on Haswell. Alas not.
v2: Fix teardown ordering - unmap the regs after turning off forcewake,
and make sure we do turn off forcewake - both found by Ville.
Note: I have no claims for improved performance, stablity or power
comsumption for this patch. We should not be hitting the registers often
enough for this to improve benchmarks, but given the nature of our hw it
is likely to improve long term stability.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_dma.c | 6 ++++--
drivers/gpu/drm/i915/i915_drv.h | 3 +++
drivers/gpu/drm/i915/intel_uncore.c | 30 ++++++++++++++++++++++++++++--
3 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 883990f..97a6e22 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1787,8 +1787,6 @@ int i915_driver_unload(struct drm_device *dev)
list_del(&dev_priv->gtt.base.global_link);
WARN_ON(!list_empty(&dev_priv->vm_list));
drm_mm_takedown(&dev_priv->gtt.base.mm);
- if (dev_priv->regs != NULL)
- pci_iounmap(dev->pdev, dev_priv->regs);
drm_vblank_cleanup(dev);
@@ -1800,6 +1798,10 @@ int i915_driver_unload(struct drm_device *dev)
dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
+ intel_uncore_fini(dev);
+ if (dev_priv->regs != NULL)
+ pci_iounmap(dev->pdev, dev_priv->regs);
+
if (dev_priv->slab)
kmem_cache_destroy(dev_priv->slab);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a6354c3..8c93d93 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -406,6 +406,8 @@ struct intel_uncore {
unsigned fifo_count;
unsigned forcewake_count;
+
+ struct delayed_work force_wake_work;
};
#define DEV_INFO_FOR_EACH_FLAG(func, sep) \
@@ -1792,6 +1794,7 @@ extern void intel_uncore_early_sanitize(struct drm_device *dev);
extern void intel_uncore_init(struct drm_device *dev);
extern void intel_uncore_clear_errors(struct drm_device *dev);
extern void intel_uncore_check_errors(struct drm_device *dev);
+extern void intel_uncore_fini(struct drm_device *dev);
void
i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask);
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 8649f1c..462cc7f 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -204,6 +204,18 @@ static void vlv_force_wake_put(struct drm_i915_private *dev_priv)
gen6_gt_check_fifodbg(dev_priv);
}
+static void gen6_force_wake_work(struct work_struct *work)
+{
+ struct drm_i915_private *dev_priv =
+ container_of(work, typeof(*dev_priv), uncore.force_wake_work.work);
+ unsigned long irqflags;
+
+ spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+ if (--dev_priv->uncore.forcewake_count == 0)
+ dev_priv->uncore.funcs.force_wake_put(dev_priv);
+ spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+}
+
void intel_uncore_early_sanitize(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
@@ -216,6 +228,9 @@ void intel_uncore_init(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
+ INIT_DELAYED_WORK(&dev_priv->uncore.force_wake_work,
+ gen6_force_wake_work);
+
if (IS_VALLEYVIEW(dev)) {
dev_priv->uncore.funcs.force_wake_get = vlv_force_wake_get;
dev_priv->uncore.funcs.force_wake_put = vlv_force_wake_put;
@@ -261,6 +276,13 @@ void intel_uncore_init(struct drm_device *dev)
}
}
+void intel_uncore_fini(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+
+ flush_delayed_work(&dev_priv->uncore.force_wake_work);
+}
+
static void intel_uncore_forcewake_reset(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
@@ -306,8 +328,12 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
unsigned long irqflags;
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
- if (--dev_priv->uncore.forcewake_count == 0)
- dev_priv->uncore.funcs.force_wake_put(dev_priv);
+ if (--dev_priv->uncore.forcewake_count == 0) {
+ dev_priv->uncore.forcewake_count++;
+ mod_delayed_work(dev_priv->wq,
+ &dev_priv->uncore.force_wake_work,
+ 1);
+ }
spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
}
--
1.8.4.rc3
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
2013-08-26 11:06 Chris Wilson
@ 2013-08-26 11:49 ` Ville Syrjälä
2014-02-23 20:12 ` Ben Widawsky
1 sibling, 0 replies; 9+ messages in thread
From: Ville Syrjälä @ 2013-08-26 11:49 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Mon, Aug 26, 2013 at 12:06:43PM +0100, Chris Wilson wrote:
> Obtaining the forcwake requires expensive and time consuming
> serialisation. And we often try to obtain the forcewake multiple times
> in very quick succession. We can reduce the overhead of these sequences
> by delaying the forcewake release, and so not hammer the hw quite so
> hard.
>
> I was hoping this would help with the spurious
> [drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear.
> found on Haswell. Alas not.
>
> v2: Fix teardown ordering - unmap the regs after turning off forcewake,
> and make sure we do turn off forcewake - both found by Ville.
>
> Note: I have no claims for improved performance, stablity or power
> comsumption for this patch. We should not be hitting the registers often
> enough for this to improve benchmarks, but given the nature of our hw it
> is likely to improve long term stability.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_dma.c | 6 ++++--
> drivers/gpu/drm/i915/i915_drv.h | 3 +++
> drivers/gpu/drm/i915/intel_uncore.c | 30 ++++++++++++++++++++++++++++--
> 3 files changed, 35 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 883990f..97a6e22 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1787,8 +1787,6 @@ int i915_driver_unload(struct drm_device *dev)
> list_del(&dev_priv->gtt.base.global_link);
> WARN_ON(!list_empty(&dev_priv->vm_list));
> drm_mm_takedown(&dev_priv->gtt.base.mm);
> - if (dev_priv->regs != NULL)
> - pci_iounmap(dev->pdev, dev_priv->regs);
>
> drm_vblank_cleanup(dev);
>
> @@ -1800,6 +1798,10 @@ int i915_driver_unload(struct drm_device *dev)
>
> dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
>
> + intel_uncore_fini(dev);
> + if (dev_priv->regs != NULL)
> + pci_iounmap(dev->pdev, dev_priv->regs);
> +
> if (dev_priv->slab)
> kmem_cache_destroy(dev_priv->slab);
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a6354c3..8c93d93 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -406,6 +406,8 @@ struct intel_uncore {
>
> unsigned fifo_count;
> unsigned forcewake_count;
> +
> + struct delayed_work force_wake_work;
> };
>
> #define DEV_INFO_FOR_EACH_FLAG(func, sep) \
> @@ -1792,6 +1794,7 @@ extern void intel_uncore_early_sanitize(struct drm_device *dev);
> extern void intel_uncore_init(struct drm_device *dev);
> extern void intel_uncore_clear_errors(struct drm_device *dev);
> extern void intel_uncore_check_errors(struct drm_device *dev);
> +extern void intel_uncore_fini(struct drm_device *dev);
>
> void
> i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask);
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 8649f1c..462cc7f 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -204,6 +204,18 @@ static void vlv_force_wake_put(struct drm_i915_private *dev_priv)
> gen6_gt_check_fifodbg(dev_priv);
> }
>
> +static void gen6_force_wake_work(struct work_struct *work)
> +{
> + struct drm_i915_private *dev_priv =
> + container_of(work, typeof(*dev_priv), uncore.force_wake_work.work);
> + unsigned long irqflags;
> +
> + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> + if (--dev_priv->uncore.forcewake_count == 0)
> + dev_priv->uncore.funcs.force_wake_put(dev_priv);
> + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
> +}
> +
> void intel_uncore_early_sanitize(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -216,6 +228,9 @@ void intel_uncore_init(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
>
> + INIT_DELAYED_WORK(&dev_priv->uncore.force_wake_work,
> + gen6_force_wake_work);
> +
> if (IS_VALLEYVIEW(dev)) {
> dev_priv->uncore.funcs.force_wake_get = vlv_force_wake_get;
> dev_priv->uncore.funcs.force_wake_put = vlv_force_wake_put;
> @@ -261,6 +276,13 @@ void intel_uncore_init(struct drm_device *dev)
> }
> }
>
> +void intel_uncore_fini(struct drm_device *dev)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> +
> + flush_delayed_work(&dev_priv->uncore.force_wake_work);
A bit orthogonal, but maybe we should also add
'WARN_ON(dev_priv->uncore.forcewake_count)' just to make sure we
don't leave forcewake enabled by accident? Or if we're really
paranoid, maybe even 'if (WARN_ON(...)) force_wake_reset()'.
But anyways, the patch looks like it should do what it claims, and
I can't see other bugs, so:
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> +}
> +
> static void intel_uncore_forcewake_reset(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -306,8 +328,12 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
> unsigned long irqflags;
>
> spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> - if (--dev_priv->uncore.forcewake_count == 0)
> - dev_priv->uncore.funcs.force_wake_put(dev_priv);
> + if (--dev_priv->uncore.forcewake_count == 0) {
> + dev_priv->uncore.forcewake_count++;
> + mod_delayed_work(dev_priv->wq,
> + &dev_priv->uncore.force_wake_work,
> + 1);
> + }
> spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
> }
>
> --
> 1.8.4.rc3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
2013-08-26 11:06 Chris Wilson
2013-08-26 11:49 ` Ville Syrjälä
@ 2014-02-23 20:12 ` Ben Widawsky
2014-02-24 8:30 ` Chris Wilson
1 sibling, 1 reply; 9+ messages in thread
From: Ben Widawsky @ 2014-02-23 20:12 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Mon, Aug 26, 2013 at 12:06:43PM +0100, Chris Wilson wrote:
> Obtaining the forcwake requires expensive and time consuming
> serialisation. And we often try to obtain the forcewake multiple times
> in very quick succession. We can reduce the overhead of these sequences
> by delaying the forcewake release, and so not hammer the hw quite so
> hard.
>
> I was hoping this would help with the spurious
> [drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear.
> found on Haswell. Alas not.
>
> v2: Fix teardown ordering - unmap the regs after turning off forcewake,
> and make sure we do turn off forcewake - both found by Ville.
>
> Note: I have no claims for improved performance, stablity or power
> comsumption for this patch. We should not be hitting the registers often
> enough for this to improve benchmarks, but given the nature of our hw it
> is likely to improve long term stability.
I don't understand how or why but from casual powertop observation, this
workqueue uses between 4x and 50x or the nearest other i915 workqueue
(i915_gem_retire_work_handler). On my x240...
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_dma.c | 6 ++++--
> drivers/gpu/drm/i915/i915_drv.h | 3 +++
> drivers/gpu/drm/i915/intel_uncore.c | 30 ++++++++++++++++++++++++++++--
> 3 files changed, 35 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 883990f..97a6e22 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1787,8 +1787,6 @@ int i915_driver_unload(struct drm_device *dev)
> list_del(&dev_priv->gtt.base.global_link);
> WARN_ON(!list_empty(&dev_priv->vm_list));
> drm_mm_takedown(&dev_priv->gtt.base.mm);
> - if (dev_priv->regs != NULL)
> - pci_iounmap(dev->pdev, dev_priv->regs);
>
> drm_vblank_cleanup(dev);
>
> @@ -1800,6 +1798,10 @@ int i915_driver_unload(struct drm_device *dev)
>
> dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
>
> + intel_uncore_fini(dev);
> + if (dev_priv->regs != NULL)
> + pci_iounmap(dev->pdev, dev_priv->regs);
> +
> if (dev_priv->slab)
> kmem_cache_destroy(dev_priv->slab);
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a6354c3..8c93d93 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -406,6 +406,8 @@ struct intel_uncore {
>
> unsigned fifo_count;
> unsigned forcewake_count;
> +
> + struct delayed_work force_wake_work;
> };
>
> #define DEV_INFO_FOR_EACH_FLAG(func, sep) \
> @@ -1792,6 +1794,7 @@ extern void intel_uncore_early_sanitize(struct drm_device *dev);
> extern void intel_uncore_init(struct drm_device *dev);
> extern void intel_uncore_clear_errors(struct drm_device *dev);
> extern void intel_uncore_check_errors(struct drm_device *dev);
> +extern void intel_uncore_fini(struct drm_device *dev);
>
> void
> i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask);
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 8649f1c..462cc7f 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -204,6 +204,18 @@ static void vlv_force_wake_put(struct drm_i915_private *dev_priv)
> gen6_gt_check_fifodbg(dev_priv);
> }
>
> +static void gen6_force_wake_work(struct work_struct *work)
> +{
> + struct drm_i915_private *dev_priv =
> + container_of(work, typeof(*dev_priv), uncore.force_wake_work.work);
> + unsigned long irqflags;
> +
> + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> + if (--dev_priv->uncore.forcewake_count == 0)
> + dev_priv->uncore.funcs.force_wake_put(dev_priv);
> + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
> +}
> +
> void intel_uncore_early_sanitize(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -216,6 +228,9 @@ void intel_uncore_init(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
>
> + INIT_DELAYED_WORK(&dev_priv->uncore.force_wake_work,
> + gen6_force_wake_work);
> +
> if (IS_VALLEYVIEW(dev)) {
> dev_priv->uncore.funcs.force_wake_get = vlv_force_wake_get;
> dev_priv->uncore.funcs.force_wake_put = vlv_force_wake_put;
> @@ -261,6 +276,13 @@ void intel_uncore_init(struct drm_device *dev)
> }
> }
>
> +void intel_uncore_fini(struct drm_device *dev)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> +
> + flush_delayed_work(&dev_priv->uncore.force_wake_work);
> +}
> +
> static void intel_uncore_forcewake_reset(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -306,8 +328,12 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
> unsigned long irqflags;
>
> spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> - if (--dev_priv->uncore.forcewake_count == 0)
> - dev_priv->uncore.funcs.force_wake_put(dev_priv);
> + if (--dev_priv->uncore.forcewake_count == 0) {
> + dev_priv->uncore.forcewake_count++;
> + mod_delayed_work(dev_priv->wq,
> + &dev_priv->uncore.force_wake_work,
> + 1);
> + }
> spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
> }
>
> --
> 1.8.4.rc3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
2014-02-23 20:12 ` Ben Widawsky
@ 2014-02-24 8:30 ` Chris Wilson
2014-02-24 20:31 ` Ben Widawsky
0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2014-02-24 8:30 UTC (permalink / raw)
To: Ben Widawsky; +Cc: intel-gfx
On Sun, Feb 23, 2014 at 12:12:25PM -0800, Ben Widawsky wrote:
> On Mon, Aug 26, 2013 at 12:06:43PM +0100, Chris Wilson wrote:
> > Obtaining the forcwake requires expensive and time consuming
> > serialisation. And we often try to obtain the forcewake multiple times
> > in very quick succession. We can reduce the overhead of these sequences
> > by delaying the forcewake release, and so not hammer the hw quite so
> > hard.
> >
> > I was hoping this would help with the spurious
> > [drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear.
> > found on Haswell. Alas not.
> >
> > v2: Fix teardown ordering - unmap the regs after turning off forcewake,
> > and make sure we do turn off forcewake - both found by Ville.
> >
> > Note: I have no claims for improved performance, stablity or power
> > comsumption for this patch. We should not be hitting the registers often
> > enough for this to improve benchmarks, but given the nature of our hw it
> > is likely to improve long term stability.
>
> I don't understand how or why but from casual powertop observation, this
> workqueue uses between 4x and 50x or the nearest other i915 workqueue
> (i915_gem_retire_work_handler). On my x240...
What does that mean? We expect this to be frequently hit since we use it
so often, but retire_work_handler is only run every couple of seconds to
trim our lists (in case of userspace failure).
The idea is to have the forcewake reset run during the next scheduler event,
so using a workqueue was probably the wrong choice and perhaps we should have
used a timer instead?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
2014-02-24 8:30 ` Chris Wilson
@ 2014-02-24 20:31 ` Ben Widawsky
0 siblings, 0 replies; 9+ messages in thread
From: Ben Widawsky @ 2014-02-24 20:31 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On Mon, Feb 24, 2014 at 08:30:22AM +0000, Chris Wilson wrote:
> On Sun, Feb 23, 2014 at 12:12:25PM -0800, Ben Widawsky wrote:
> > On Mon, Aug 26, 2013 at 12:06:43PM +0100, Chris Wilson wrote:
> > > Obtaining the forcwake requires expensive and time consuming
> > > serialisation. And we often try to obtain the forcewake multiple times
> > > in very quick succession. We can reduce the overhead of these sequences
> > > by delaying the forcewake release, and so not hammer the hw quite so
> > > hard.
> > >
> > > I was hoping this would help with the spurious
> > > [drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear.
> > > found on Haswell. Alas not.
> > >
> > > v2: Fix teardown ordering - unmap the regs after turning off forcewake,
> > > and make sure we do turn off forcewake - both found by Ville.
> > >
> > > Note: I have no claims for improved performance, stablity or power
> > > comsumption for this patch. We should not be hitting the registers often
> > > enough for this to improve benchmarks, but given the nature of our hw it
> > > is likely to improve long term stability.
> >
> > I don't understand how or why but from casual powertop observation, this
> > workqueue uses between 4x and 50x or the nearest other i915 workqueue
> > (i915_gem_retire_work_handler). On my x240...
>
> What does that mean? We expect this to be frequently hit since we use it
> so often, but retire_work_handler is only run every couple of seconds to
> trim our lists (in case of userspace failure).
>
> The idea is to have the forcewake reset run during the next scheduler event,
> so using a workqueue was probably the wrong choice and perhaps we should have
> used a timer instead?
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
I just posted it for the sake of potentially starting a conversation. I
assume we always expected power consumption to go up since [it would
seem at least] we're usually just deferring the release of the last
put(). But then we have the race to idle argument in there as well.
In any event, I wasn't asking for any change, was just surprised by the
data.
--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
@ 2013-08-26 12:46 Chris Wilson
2013-09-24 10:37 ` Daniel Vetter
0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2013-08-26 12:46 UTC (permalink / raw)
To: intel-gfx
Obtaining the forcwake requires expensive and time consuming
serialisation. And we often try to obtain the forcewake multiple times
in very quick succession. We can reduce the overhead of these sequences
by delaying the forcewake release, and so not hammer the hw quite so
hard.
I was hoping this would help with the spurious
[drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear.
found on Haswell. Alas not.
v2: Fix teardown ordering - unmap the regs after turning off forcewake,
and make sure we do turn off forcewake - both found by Ville.
v3: As we introduce intel_uncore_fini(), use it to make sure everything
is disabled before we hand back to the BIOS.
Note: I have no claims for improved performance, stablity or power
comsumption for this patch. We should not be hitting the registers often
enough for this to improve benchmarks, but given the nature of our hw it
is likely to improve long term stability.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_dma.c | 6 ++++--
drivers/gpu/drm/i915/i915_drv.h | 3 +++
drivers/gpu/drm/i915/intel_uncore.c | 33 +++++++++++++++++++++++++++++++--
3 files changed, 38 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 883990f..97a6e22 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1787,8 +1787,6 @@ int i915_driver_unload(struct drm_device *dev)
list_del(&dev_priv->gtt.base.global_link);
WARN_ON(!list_empty(&dev_priv->vm_list));
drm_mm_takedown(&dev_priv->gtt.base.mm);
- if (dev_priv->regs != NULL)
- pci_iounmap(dev->pdev, dev_priv->regs);
drm_vblank_cleanup(dev);
@@ -1800,6 +1798,10 @@ int i915_driver_unload(struct drm_device *dev)
dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
+ intel_uncore_fini(dev);
+ if (dev_priv->regs != NULL)
+ pci_iounmap(dev->pdev, dev_priv->regs);
+
if (dev_priv->slab)
kmem_cache_destroy(dev_priv->slab);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a6354c3..8c93d93 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -406,6 +406,8 @@ struct intel_uncore {
unsigned fifo_count;
unsigned forcewake_count;
+
+ struct delayed_work force_wake_work;
};
#define DEV_INFO_FOR_EACH_FLAG(func, sep) \
@@ -1792,6 +1794,7 @@ extern void intel_uncore_early_sanitize(struct drm_device *dev);
extern void intel_uncore_init(struct drm_device *dev);
extern void intel_uncore_clear_errors(struct drm_device *dev);
extern void intel_uncore_check_errors(struct drm_device *dev);
+extern void intel_uncore_fini(struct drm_device *dev);
void
i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask);
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 8649f1c..f2753d9 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -204,6 +204,18 @@ static void vlv_force_wake_put(struct drm_i915_private *dev_priv)
gen6_gt_check_fifodbg(dev_priv);
}
+static void gen6_force_wake_work(struct work_struct *work)
+{
+ struct drm_i915_private *dev_priv =
+ container_of(work, typeof(*dev_priv), uncore.force_wake_work.work);
+ unsigned long irqflags;
+
+ spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+ if (--dev_priv->uncore.forcewake_count == 0)
+ dev_priv->uncore.funcs.force_wake_put(dev_priv);
+ spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+}
+
void intel_uncore_early_sanitize(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
@@ -216,6 +228,9 @@ void intel_uncore_init(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
+ INIT_DELAYED_WORK(&dev_priv->uncore.force_wake_work,
+ gen6_force_wake_work);
+
if (IS_VALLEYVIEW(dev)) {
dev_priv->uncore.funcs.force_wake_get = vlv_force_wake_get;
dev_priv->uncore.funcs.force_wake_put = vlv_force_wake_put;
@@ -261,6 +276,16 @@ void intel_uncore_init(struct drm_device *dev)
}
}
+void intel_uncore_fini(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+
+ flush_delayed_work(&dev_priv->uncore.force_wake_work);
+
+ /* Paranoia: make sure we have disabled everything before we exit. */
+ intel_uncore_sanitize(dev);
+}
+
static void intel_uncore_forcewake_reset(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
@@ -306,8 +331,12 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
unsigned long irqflags;
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
- if (--dev_priv->uncore.forcewake_count == 0)
- dev_priv->uncore.funcs.force_wake_put(dev_priv);
+ if (--dev_priv->uncore.forcewake_count == 0) {
+ dev_priv->uncore.forcewake_count++;
+ mod_delayed_work(dev_priv->wq,
+ &dev_priv->uncore.force_wake_work,
+ 1);
+ }
spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
}
--
1.8.4.rc3
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
2013-08-26 12:46 Chris Wilson
@ 2013-09-24 10:37 ` Daniel Vetter
0 siblings, 0 replies; 9+ messages in thread
From: Daniel Vetter @ 2013-09-24 10:37 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Mon, Aug 26, 2013 at 01:46:09PM +0100, Chris Wilson wrote:
> Obtaining the forcwake requires expensive and time consuming
> serialisation. And we often try to obtain the forcewake multiple times
> in very quick succession. We can reduce the overhead of these sequences
> by delaying the forcewake release, and so not hammer the hw quite so
> hard.
>
> I was hoping this would help with the spurious
> [drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear.
> found on Haswell. Alas not.
>
> v2: Fix teardown ordering - unmap the regs after turning off forcewake,
> and make sure we do turn off forcewake - both found by Ville.
>
> v3: As we introduce intel_uncore_fini(), use it to make sure everything
> is disabled before we hand back to the BIOS.
>
> Note: I have no claims for improved performance, stablity or power
> comsumption for this patch. We should not be hitting the registers often
> enough for this to improve benchmarks, but given the nature of our hw it
> is likely to improve long term stability.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Queued for -next, thanks for the patch.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-02-24 20:31 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-08-25 19:45 [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie Chris Wilson
2013-08-26 9:59 ` Ville Syrjälä
-- strict thread matches above, loose matches on Subject: below --
2013-08-26 11:06 Chris Wilson
2013-08-26 11:49 ` Ville Syrjälä
2014-02-23 20:12 ` Ben Widawsky
2014-02-24 8:30 ` Chris Wilson
2014-02-24 20:31 ` Ben Widawsky
2013-08-26 12:46 Chris Wilson
2013-09-24 10:37 ` Daniel Vetter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox