public inbox for intel-gfx@lists.freedesktop.org
* [RFC] Reduce idle vblank wakeups
@ 2011-11-16 14:20 Matthew Garrett
  2011-11-16 14:20 ` [RFC 1/3] drm: Make drm_vblank_offdelay per-device Matthew Garrett
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Matthew Garrett @ 2011-11-16 14:20 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

The drm core currently waits 5 seconds from userspace dropping a request
for vblanks to vblanks actually being disabled. This appears to be a
workaround for broken hardware, but results in a mostly idle desktop
generating a huge number of wakeups that are entirely unnecessary but which
consume measurable amounts of power. This patchset makes the vblank timeout
per-device rather than global, making it easy for drivers to override the
behaviour without compromising other graphics hardware in the system. It
then removes the delay on Intel hardware. I've tested this successfully on
Sandybridge without any evidence of spurious or missing irqs, but I don't
know how well it works on older hardware. Feedback not only welcome, but
positively desired.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC 1/3] drm: Make drm_vblank_offdelay per-device
  2011-11-16 14:20 [RFC] Reduce idle vblank wakeups Matthew Garrett
@ 2011-11-16 14:20 ` Matthew Garrett
  2011-11-16 14:20 ` [RFC 2/3] drm: Handle the vblank_offdelay=0 case properly Matthew Garrett
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Matthew Garrett @ 2011-11-16 14:20 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Matthew Garrett

drm_vblank_offdelay is currently a system global, despite the optimal
value being hardware-specific. Move it to the drm_device structure.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
---
 drivers/gpu/drm/drm_irq.c  |    6 +++---
 drivers/gpu/drm/drm_stub.c |    8 +++++---
 include/drm/drmP.h         |    2 +-
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index cb3794a..8bcb6a4 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -928,7 +928,7 @@ EXPORT_SYMBOL(drm_vblank_get);
  * @crtc: which counter to give up
  *
  * Release ownership of a given vblank counter, turning off interrupts
- * if possible. Disable interrupts after drm_vblank_offdelay milliseconds.
+ * if possible. Disable interrupts after vblank_offdelay milliseconds.
  */
 void drm_vblank_put(struct drm_device *dev, int crtc)
 {
@@ -936,9 +936,9 @@ void drm_vblank_put(struct drm_device *dev, int crtc)
 
 	/* Last user schedules interrupt disable */
 	if (atomic_dec_and_test(&dev->vblank_refcount[crtc]) &&
-	    (drm_vblank_offdelay > 0))
+	    (dev->vblank_offdelay > 0))
 		mod_timer(&dev->vblank_disable_timer,
-			  jiffies + ((drm_vblank_offdelay * DRM_HZ)/1000));
+			  jiffies + ((dev->vblank_offdelay * DRM_HZ)/1000));
 }
 EXPORT_SYMBOL(drm_vblank_put);
 
diff --git a/drivers/gpu/drm/drm_stub.c b/drivers/gpu/drm/drm_stub.c
index 6d7b083..189a077 100644
--- a/drivers/gpu/drm/drm_stub.c
+++ b/drivers/gpu/drm/drm_stub.c
@@ -40,8 +40,8 @@
 unsigned int drm_debug = 0;	/* 1 to enable debug output */
 EXPORT_SYMBOL(drm_debug);
 
-unsigned int drm_vblank_offdelay = 5000;    /* Default to 5000 msecs. */
-EXPORT_SYMBOL(drm_vblank_offdelay);
+unsigned int drm_default_vblank_offdelay = 5000;   /* Default to 5000 msecs. */
+EXPORT_SYMBOL(drm_default_vblank_offdelay);
 
 unsigned int drm_timestamp_precision = 20;  /* Default to 20 usecs. */
 EXPORT_SYMBOL(drm_timestamp_precision);
@@ -54,7 +54,7 @@ MODULE_PARM_DESC(vblankoffdelay, "Delay until vblank irq auto-disable [msecs]");
 MODULE_PARM_DESC(timestamp_precision_usec, "Max. error on timestamps [usecs]");
 
 module_param_named(debug, drm_debug, int, 0600);
-module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600);
+module_param_named(vblankoffdelay, drm_default_vblank_offdelay, int, 0600);
 module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600);
 
 struct idr drm_minors_idr;
@@ -290,6 +290,8 @@ int drm_fill_in_dev(struct drm_device *dev,
 
 	dev->driver = driver;
 
+	dev->vblank_offdelay = drm_default_vblank_offdelay;
+
 	if (dev->driver->bus->agp_init) {
 		retcode = dev->driver->bus->agp_init(dev);
 		if (retcode)
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index cf39949..81e6bbb 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -1168,6 +1168,7 @@ struct drm_device {
 	struct idr object_name_idr;
 	/*@} */
 	int switch_power_state;
+	int vblank_offdelay;
 };
 
 #define DRM_SWITCH_POWER_ON 0
@@ -1463,7 +1464,6 @@ extern void drm_put_dev(struct drm_device *dev);
 extern int drm_put_minor(struct drm_minor **minor);
 extern unsigned int drm_debug;
 
-extern unsigned int drm_vblank_offdelay;
 extern unsigned int drm_timestamp_precision;
 
 extern struct class *drm_class;
-- 
1.7.7.1
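As a side note on the patch above: the expiry passed to mod_timer() converts the per-device millisecond delay into jiffies with plain integer arithmetic. A minimal user-space sketch of that conversion (DRM_HZ is assumed to be 1000 here, i.e. one jiffy per millisecond; the real value follows the kernel's HZ configuration):

```c
#include <assert.h>

/* Sketch of the expiry computation in drm_vblank_put():
 *   jiffies + (vblank_offdelay * DRM_HZ) / 1000
 * DRM_HZ is assumed to be 1000 here (one jiffy per millisecond); on a
 * real kernel it follows the HZ configuration. */
#define DRM_HZ 1000UL

static unsigned long vblank_disable_expiry(unsigned long jiffies_now,
                                           unsigned int offdelay_msecs)
{
        return jiffies_now + (offdelay_msecs * DRM_HZ) / 1000;
}
```

With patch 2 applied, an offdelay of 0 simply arms the timer for the current jiffy, so the disable fires at the next timer tick instead of being skipped.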


* [RFC 2/3] drm: Handle the vblank_offdelay=0 case properly
  2011-11-16 14:20 [RFC] Reduce idle vblank wakeups Matthew Garrett
  2011-11-16 14:20 ` [RFC 1/3] drm: Make drm_vblank_offdelay per-device Matthew Garrett
@ 2011-11-16 14:20 ` Matthew Garrett
  2011-11-16 14:20 ` [RFC 3/3] i915: Drop vblank_offdelay Matthew Garrett
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Matthew Garrett @ 2011-11-16 14:20 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Matthew Garrett

Right now if vblank_offdelay is 0, vblanks won't be disabled after the
last user. Fix that case up.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
---
 drivers/gpu/drm/drm_irq.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 8bcb6a4..94f9579 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -935,8 +935,7 @@ void drm_vblank_put(struct drm_device *dev, int crtc)
 	BUG_ON(atomic_read(&dev->vblank_refcount[crtc]) == 0);
 
 	/* Last user schedules interrupt disable */
-	if (atomic_dec_and_test(&dev->vblank_refcount[crtc]) &&
-	    (dev->vblank_offdelay > 0))
+	if (atomic_dec_and_test(&dev->vblank_refcount[crtc]))
 		mod_timer(&dev->vblank_disable_timer,
 			  jiffies + ((dev->vblank_offdelay * DRM_HZ)/1000));
 }
-- 
1.7.7.1


* [RFC 3/3] i915: Drop vblank_offdelay
  2011-11-16 14:20 [RFC] Reduce idle vblank wakeups Matthew Garrett
  2011-11-16 14:20 ` [RFC 1/3] drm: Make drm_vblank_offdelay per-device Matthew Garrett
  2011-11-16 14:20 ` [RFC 2/3] drm: Handle the vblank_offdelay=0 case properly Matthew Garrett
@ 2011-11-16 14:20 ` Matthew Garrett
  2011-11-16 14:32 ` [RFC] Reduce idle vblank wakeups Adam Jackson
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Matthew Garrett @ 2011-11-16 14:20 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Matthew Garrett

Sandybridge, at least, seems to manage without any vblank offdelay.
Dropping this reduces the number of wakeups on an otherwise idle system
dramatically.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
---
 drivers/gpu/drm/i915/i915_dma.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a9533c5..46e7172 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1917,6 +1917,9 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 		goto free_priv;
 	}
 
+	/* vblank is reliable */
+	dev->vblank_offdelay = 0;
+
 	/* overlay on gen2 is broken and can't address above 1G */
 	if (IS_GEN2(dev))
 		dma_set_coherent_mask(&dev->pdev->dev, DMA_BIT_MASK(30));
-- 
1.7.7.1


* Re: [RFC] Reduce idle vblank wakeups
  2011-11-16 14:20 [RFC] Reduce idle vblank wakeups Matthew Garrett
                   ` (2 preceding siblings ...)
  2011-11-16 14:20 ` [RFC 3/3] i915: Drop vblank_offdelay Matthew Garrett
@ 2011-11-16 14:32 ` Adam Jackson
  2011-11-16 17:03 ` Michel Dänzer
  2012-05-04 18:31 ` [Intel-gfx] [RFC] Reduce idle vblank wakeups Matthew Garrett
  5 siblings, 0 replies; 15+ messages in thread
From: Adam Jackson @ 2011-11-16 14:32 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: intel-gfx, dri-devel



On Wed, 2011-11-16 at 09:20 -0500, Matthew Garrett wrote:
> The drm core currently waits 5 seconds from userspace dropping a request
> for vblanks to vblanks actually being disabled. This appears to be a
> workaround for broken hardware, but results in a mostly idle desktop
> generating a huge number of wakeups that are entirely unnecessary but which
> consume measurable amounts of power. This patchset makes the vblank timeout
> per-device rather than global, making it easy for drivers to override the
> behaviour without compromising other graphics hardware in the system. It
> then removes the delay on Intel hardware. I've tested this successfully on
> Sandybridge without any evidence of spurious or missing irqs, but I don't
> know how well it works on older hardware. Feedback not only welcome, but
> positively desired.

Looks good to me.  I'll test this on some other intel kit I've got
handy.  For the series:

Reviewed-by: Adam Jackson <ajax@redhat.com>

- ajax


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [RFC] Reduce idle vblank wakeups
  2011-11-16 14:20 [RFC] Reduce idle vblank wakeups Matthew Garrett
                   ` (3 preceding siblings ...)
  2011-11-16 14:32 ` [RFC] Reduce idle vblank wakeups Adam Jackson
@ 2011-11-16 17:03 ` Michel Dänzer
  2011-11-16 17:11   ` Matthew Garrett
  2012-05-04 18:31 ` [Intel-gfx] [RFC] Reduce idle vblank wakeups Matthew Garrett
  5 siblings, 1 reply; 15+ messages in thread
From: Michel Dänzer @ 2011-11-16 17:03 UTC (permalink / raw)
  To: Matthew Garrett, Mario Kleiner; +Cc: intel-gfx, dri-devel

On Wed, 2011-11-16 at 09:20 -0500, Matthew Garrett wrote:
> The drm core currently waits 5 seconds from userspace dropping a request
> for vblanks to vblanks actually being disabled. This appears to be a
> workaround for broken hardware, but results in a mostly idle desktop
> generating a huge number of wakeups that are entirely unnecessary but which
> consume measurable amounts of power. This patchset makes the vblank timeout
> per-device rather than global, making it easy for drivers to override the
> behaviour without compromising other graphics hardware in the system. It
> then removes the delay on Intel hardware. I've tested this successfully on
> Sandybridge without any evidence of spurious or missing irqs, but I don't
> know how well it works on older hardware. Feedback not only welcome, but
> positively desired.

Have you discussed this with Mario Kleiner (CC'd)?

I thought the main reason for the delay wasn't broken hardware but to
avoid constantly ping-ponging the vblank IRQ between on and off with
> apps which regularly need the vblank counter value, as that could make
the counter unreliable. Maybe I'm misremembering though.

Even if I'm not, lowering the delay shouldn't be a problem, so long as
it's long enough that at least apps which need the vblank counter every
or every other frame don't cause the IRQ to ping-pong. But that
shouldn't depend on the hardware.


-- 
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Debian, X and DRI developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [RFC] Reduce idle vblank wakeups
  2011-11-16 17:03 ` Michel Dänzer
@ 2011-11-16 17:11   ` Matthew Garrett
  2011-11-16 17:53     ` [Intel-gfx] " Andrew Lutomirski
  2011-11-16 18:27     ` Mario Kleiner
  0 siblings, 2 replies; 15+ messages in thread
From: Matthew Garrett @ 2011-11-16 17:11 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: intel-gfx, dri-devel, Mario Kleiner

On Wed, Nov 16, 2011 at 06:03:15PM +0100, Michel Dänzer wrote:

> I thought the main reason for the delay wasn't broken hardware but to
> avoid constantly ping-ponging the vblank IRQ between on and off with
> apps which regularly need the vblank counter value, as that could make
> the counter unreliable. Maybe I'm misremembering though.

If turning it on and off results in the counter value being wrong then 
surely that's a hardware problem? I've tested that turning it on and off 
between every IRQ still gives valid counter values on sandybridge.

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [Intel-gfx] [RFC] Reduce idle vblank wakeups
  2011-11-16 17:11   ` Matthew Garrett
@ 2011-11-16 17:53     ` Andrew Lutomirski
  2011-11-16 18:27     ` Mario Kleiner
  1 sibling, 0 replies; 15+ messages in thread
From: Andrew Lutomirski @ 2011-11-16 17:53 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Michel Dänzer, Mario Kleiner, intel-gfx, dri-devel

2011/11/16 Matthew Garrett <mjg59@srcf.ucam.org>:
> On Wed, Nov 16, 2011 at 06:03:15PM +0100, Michel Dänzer wrote:
>
>> I thought the main reason for the delay wasn't broken hardware but to
>> avoid constantly ping-ponging the vblank IRQ between on and off with
>> apps which regularly need the vblank counter value, as that could make
>> the counter unreliable. Maybe I'm misremembering though.
>
> If turning it on and off results in the counter value being wrong then
> surely that's a hardware problem? I've tested that turning it on and off
> between every IRQ still gives valid counter values on sandybridge.

This, and the thread that contains it, might be interesting:

http://permalink.gmane.org/gmane.comp.video.dri.devel/53201

In summary: there was some resistance to doing this in January, but I
couldn't find any legitimate reason.

--Andy


* Re: [RFC] Reduce idle vblank wakeups
  2011-11-16 17:11   ` Matthew Garrett
  2011-11-16 17:53     ` [Intel-gfx] " Andrew Lutomirski
@ 2011-11-16 18:27     ` Mario Kleiner
  2011-11-16 18:48       ` Matthew Garrett
  1 sibling, 1 reply; 15+ messages in thread
From: Mario Kleiner @ 2011-11-16 18:27 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Michel Dänzer, Mario Kleiner, intel-gfx, dri-devel


On Nov 16, 2011, at 6:11 PM, Matthew Garrett wrote:

> On Wed, Nov 16, 2011 at 06:03:15PM +0100, Michel Dänzer wrote:
>
>> I thought the main reason for the delay wasn't broken hardware but to
>> avoid constantly ping-ponging the vblank IRQ between on and off with
>> apps which regularly need the vblank counter value, as that could make
>> the counter unreliable. Maybe I'm misremembering though.
>
> If turning it on and off results in the counter value being wrong then
> surely that's a hardware problem? I've tested that turning it on and off
> between every IRQ still gives valid counter values on sandybridge.

On Wed, Nov 16, 2011 at 06:03:15PM +0100, Michel Dänzer also wrote:

> Even if I'm not, lowering the delay shouldn't be a problem, so long as
> it's long enough that at least apps which need the vblank counter every
> or every other frame don't cause the IRQ to ping-pong. But that
> shouldn't depend on the hardware.

Hi, and thanks Michel for cc'ing me,

It's not broken hardware, but fast ping-ponging it on and off can make
the vblank counter and vblank timestamps unreliable for apps that need
high timing precision, especially for the ones that use the
OML_sync_control extensions for precise swap scheduling. My target
application is vision science / neuroscience, where (sub-)milliseconds
often matter for visual stimulation.

I think making the vblank off delay driver-specific via these patches
is a good idea. Lowering the timeout to something like a few refresh
cycles, maybe somewhere between 50 msecs and 100 msecs, would also be
fine by me. I'd still like to keep some drm config option to let users
disable or override the vblank off delay.

The intel and radeon kms drivers implement everything that's needed to
make it mostly work, except for a small race between the cpu and gpu
in the vblank_disable_and_save() function
<http://lxr.free-electrons.com/source/drivers/gpu/drm/drm_irq.c#L101>
and drm_update_vblank_count(). It can cause an off-by-one error when
reinitializing the drm vblank counter from the gpu's hardware counter
if the enable/disable function is called at the wrong moment, while
the gpu's scanout is inside the vblank interval; see the comments in
the code. I have some sketchy idea for a patch that could detect when
the race happens and retry the hw counter query to fix it. Without
that patch, there's somewhere between a 0% and 4% chance of being
off by one.

On current nouveau kms, disabling vblank irqs guarantees you wrong
vblank counts and wrong vblank timestamps after turning them on again,
because the kms driver doesn't implement the hook for hardware vblank
counter queries correctly: the drm vblank counter misses all counts
during the vblank irq off period. Other timing-related hooks are
missing as well. I have a couple of patches queued up, and some more
to come, for the ddx and kms driver to mostly fix this. NVidia gpus
only have hardware vblank counters for NV-50 and later; fixing this
for earlier gpus would require some emulation of a hw vblank counter
inside the kms driver.

Apps that rely on the vblank counter being totally reliable over long
periods of time would currently be in a bad situation with a lowered
vblank off delay, but that's probably not a good assumption to make in
general. Toolkits like mine, which are more paranoid, currently work
fine as long as the off delay is at least a few video refresh cycles.
I do the following to schedule a reliably timed swap:

1. Query the current vblank counter current_msc and vblank timestamp
   current_ust.
2. Calculate a target vblank count target_msc, based on current_msc,
   current_ust and some target time from usercode.
3. Schedule the bufferswap for target_msc.

As long as the vblank off delay is long enough that vblanks don't get
turned off between steps 1 and 3, everything is fine; otherwise bad
things will happen. Keeping a way to override the default off delay
would be good, to allow poor scientists to work around potentially
broken drivers or gpus in the future. @Matthew: I'm appealing here to
your ex-Drosophila biologist heritage ;-)
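The step-2 arithmetic above can be sketched as follows. The helper name is hypothetical, UST times are in microseconds as in OML_sync_control, and rounding up keeps the swap from landing before the requested time:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for step 2: given the msc/ust pair from step 1
 * (UST in microseconds, as in OML_sync_control) and a desired
 * presentation time, compute the target vblank count for step 3.
 * Rounds up so the swap never completes before target_ust. */
static int64_t target_msc_for_time(int64_t current_msc, int64_t current_ust,
                                   int64_t target_ust, int64_t refresh_ust)
{
        if (target_ust <= current_ust)
                return current_msc + 1;  /* earliest schedulable vblank */
        return current_msc +
               (target_ust - current_ust + refresh_ust - 1) / refresh_ust;
}
```

If vblanks get disabled and the counter comes back off by one between steps 1 and 3, the computed target_msc refers to the wrong frame, which is exactly the failure mode being discussed.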

thanks,
-mario


*********************************************************************
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.kleiner@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:    +49 (0)7071/601-616
www:    http://www.kyb.tuebingen.mpg.de/~kleinerm
*********************************************************************
"For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled."
(Richard Feynman)


* Re: [RFC] Reduce idle vblank wakeups
  2011-11-16 18:27     ` Mario Kleiner
@ 2011-11-16 18:48       ` Matthew Garrett
  2011-11-17  0:26         ` Mario Kleiner
  0 siblings, 1 reply; 15+ messages in thread
From: Matthew Garrett @ 2011-11-16 18:48 UTC (permalink / raw)
  To: Mario Kleiner; +Cc: Michel Dänzer, intel-gfx, dri-devel

On Wed, Nov 16, 2011 at 07:27:51PM +0100, Mario Kleiner wrote:

> It's not broken hardware, but fast ping-ponging it on and off can
> make the vblank counter and vblank timestamps unreliable for apps
> that need high timing precision, especially for the ones that use
> the OML_sync_control extensions for precise swap scheduling. My
> target application is vision science / neuroscience, where
> (sub-)milliseconds often matter for visual stimulation.

I'll admit that I'm struggling to understand the issue here. If the 
vblank counter is incremented at the time of vblank (which isn't the 
case for radeon, it seems, but as far as I can tell is the case for 
Intel) then how does ping-ponging the IRQ matter? 
vblank_disable_and_save() appears to handle this case.

> I think making the vblank off delay driver specific via these
> patches is a good idea. Lowering the timeout to something like a few
> refresh cycles, maybe somewhere between 50 msecs and 100 msecs would
> be also fine by me. I still would like to keep some drm config
> option to disable or override the vblank off delay by users.

Does the timeout serve any purpose other than letting software 
effectively prevent vblanks from being disabled?

> The intel and radeon kms drivers implement everything that's needed
> to make it mostly work. Except for a small race between the cpu and
> gpu in the vblank_disable_and_save() function
> <http://lxr.free-electrons.com/source/drivers/gpu/drm/drm_irq.c#L101>
> and drm_update_vblank_count(). It can cause an off-by-one error when
> reinitializing the drm vblank counter from the gpu's hardware
> counter if the enable/disable function is called at the wrong moment
> while the gpu's scanout is inside the vblank interval, see comments
> in the code. I have some sketchy idea for a patch that could detect
> when the race happens and retry hw counter queries to fix this.
> Without that patch, there's some chance between 0% and 4% of being
> off-by-one.

For Radeon, I'd have thought you could handle this by scheduling an irq 
for the beginning of scanout (avivo has a register for that) and 
delaying the vblank disable until you hit it?

> On current nouveau kms, disabling vblank irqs guarantees you wrong
> vblank counts and wrong vblank timestamps after turning them on
> again, because the kms driver doesn't implement the hook for
> hardware vblank counter query correctly. The drm vblank counter
> misses all counts during the vblank irq off period. Other timing
> related hooks are missing as well. I have a couple of patches queued
> up and some more to come for the ddx and kms driver to mostly fix
> this. NVidia gpu's only have hardware vblank counters for NV-50 and
> later, fixing this for earlier gpu's would require some emulation of
> a hw vblank counter inside the kms driver.

I've no problem with all of this work being case by case.

> Apps that rely on the vblank counter being totally reliable over
> long periods of time currently would be in a bad situation with a
> lowered vblank off delay, but that's probably generally not a good
> assumption. Toolkits like mine, which are more paranoid, currently
> can work fine as long as the off delay is at least a few video
> refresh cycles. I do the following for scheduling a reliably timed
> swap:
> 
> 1. Query current vblank counter current_msc and vblank timestamp
> current_ust.
> 2. Calculate a target vblank count target_msc, based on current_msc,
> current_ust and some target time from usercode.
> 3. Schedule bufferswap for target_msc.
> 
> As long as the vblank off delay is long enough so that vblanks don't
> get turned off between 1. and 3, everything is fine, otherwise bad
> things will happen.
> Keeping a way to override the default off delay would be good to
> allow poor scientists to work around potentially broken drivers or
> gpu's in the future. @Matthew: I'm appealing here to your ex-
> Drosophila biologist heritage ;-)

If vblanks are disabled and then re-enabled between 1 and 3, what's the 
negative outcome?

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [RFC] Reduce idle vblank wakeups
  2011-11-16 18:48       ` Matthew Garrett
@ 2011-11-17  0:26         ` Mario Kleiner
  2011-11-17  2:19           ` Matthew Garrett
  0 siblings, 1 reply; 15+ messages in thread
From: Mario Kleiner @ 2011-11-17  0:26 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Michel Dänzer, Mario Kleiner, intel-gfx, dri-devel

On Nov 16, 2011, at 7:48 PM, Matthew Garrett wrote:

> On Wed, Nov 16, 2011 at 07:27:51PM +0100, Mario Kleiner wrote:
>
>> It's not broken hardware, but fast ping-ponging it on and off can
>> make the vblank counter and vblank timestamps unreliable for apps
>> that need high timing precision, especially for the ones that use
>> the OML_sync_control extensions for precise swap scheduling. My
>> target application is vision science / neuroscience, where
>> (sub-)milliseconds often matter for visual stimulation.
>
> I'll admit that I'm struggling to understand the issue here. If the
> vblank counter is incremented at the time of vblank (which isn't the
> case for radeon, it seems, but as far as I can tell is the case for
> Intel) then how does ping-ponging the IRQ matter?
> vblank_disable_and_save() appears to handle this case.
>

The drm software vblank counter which is used for scheduling swaps
etc. gets incremented in the gpu's vblank irq handler. The gpu's
hardware counter gets incremented somewhere inside the vblank
interval. The increments don't happen at the same point in time. The
race is between the vblank on/off code, which gets scheduled either by
the timeout timer for the "off" case or by usercode for the "on" case,
and the gpu's hardware vblank counter.

The existing code uses a lock (vblank_time_lock) to prevent some races
between itself and the vblank irq handler, but it can't "lock" the
gpu, so if the enable/disable code executes while the gpu is scanning
out the vblank interval, it is undefined whether the final vblank
count will be correct or off by one. Vblank duration is up to about
4.5% of the refresh interval, so there's your up to 4% chance of
getting it wrong.

If one could reliably avoid enabling/disabling in that problematic
time window, the problem would go away.
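The race can be made concrete with a toy model (not kernel code; the scanline numbers are made up): the irq bumps the software count at the start of the vblank interval, the hardware counter bumps some scanlines later, and a disable that samples the hardware counter in between saves a stale count:

```c
#include <assert.h>

/* Toy model of the off-by-one race: assume the vblank irq fires at
 * scanline 0 of the vblank interval and bumps the drm software count
 * immediately, while the gpu's hardware counter increments a few lines
 * later. HW_INC_LINE is a hypothetical value. */
#define HW_INC_LINE 10

/* Hardware counter as seen at 'scanline' of the current vblank, given
 * its value before this vblank started. */
static int hw_count_at(int count_before_vblank, int scanline)
{
        return scanline >= HW_INC_LINE ? count_before_vblank + 1
                                       : count_before_vblank;
}

/* The irq handler already advanced the software count for this vblank;
 * return the error of a count saved by a disable at 'scanline'. */
static int disable_error(int count_before_vblank, int scanline)
{
        int sw_count = count_before_vblank + 1;  /* bumped by the irq */
        return sw_count - hw_count_at(count_before_vblank, scanline);
}
```

A disable landing before the hardware increment point reads one frame behind the software count; after it, the two agree.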

>> I think making the vblank off delay driver specific via these
>> patches is a good idea. Lowering the timeout to something like a few
>> refresh cycles, maybe somewhere between 50 msecs and 100 msecs would
>> be also fine by me. I still would like to keep some drm config
>> option to disable or override the vblank off delay by users.
>
> Does the timeout serve any purpose other than letting software
> effectively prevent vblanks from being disabled?

With perfect drivers and gpus in a perfect world, no. In reality
there's the race I described above, plus nouveau and all the other
drivers except intel and radeon. The vblank irq also drives
timestamping of vblanks, with one update per vblank; the timestamps
are cached if a client needs them in between updates. Turning off the
vblank irq invalidates the timestamps. radeon and intel can recreate
the timestamp anytime as needed, but nouveau lacks this atm, so
timestamps remain invalid for a whole video refresh cycle after the
vblank irq comes back on. We have patches for nouveau kms almost
ready, so only the race mentioned above would remain.

>> The intel and radeon kms drivers implement everything that's needed
>> to make it mostly work. Except for a small race between the cpu and
>> gpu in the vblank_disable_and_save() function
>> <http://lxr.free-electrons.com/source/drivers/gpu/drm/drm_irq.c#L101>
>> and drm_update_vblank_count(). It can cause an off-by-one error when
>> reinitializing the drm vblank counter from the gpu's hardware
>> counter if the enable/disable function is called at the wrong moment
>> while the gpu's scanout is inside the vblank interval, see comments
>> in the code. I have some sketchy idea for a patch that could detect
>> when the race happens and retry hw counter queries to fix this.
>> Without that patch, there's some chance between 0% and 4% of being
>> off-by-one.
>
> For Radeon, I'd have thought you could handle this by scheduling an
> irq for the beginning of scanout (avivo has a register for that) and
> delaying the vblank disable until you hit it?

For Radeon there is such an irq, but iirc we had some discussions on
this a while ago, also with Alex Deucher, and some irqs weren't
considered very reliable, or were already used for other stuff. The
idea I had goes like this:

Use the crtc scanout position queries together with the vblank counter
queries inside some calibration loop, maybe executed after each
modeset, to find out the scanline range in which the hardware vblank
counter increments -- basically a forbidden range of scanline
positions where the race would happen. Then at each vblank off/on,
query the scanout position before and after the hw vblank counter
query. If, according to the scanout positions, the vblank counter
query happened within the forbidden window, retry the query. With a
well-working calibration this should add no delay in most cases, and a
worst-case delay of a few dozen microseconds (the duration of a few
scanlines) to the on/off code.

With that and the pending nouveau patches in place, I think we
wouldn't need the vblank off delay anymore on such drivers.
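A minimal sketch of the acceptance test such a retry loop could use. The forbidden window [lo, hi] is whatever the hypothetical calibration loop measured, and scanline wrap-around at the end of the frame is ignored for simplicity:

```c
#include <assert.h>

/* A hw counter sample is trusted only if the scanout positions read
 * before and after it both fall outside the calibrated "forbidden"
 * scanline window [lo, hi] in which the hardware counter increments.
 * Window bounds come from a hypothetical calibration loop; scanline
 * wrap-around at end-of-frame is ignored here. */
static int counter_sample_ok(int line_before, int line_after,
                             int lo, int hi)
{
        int before_outside = line_before < lo || line_before > hi;
        int after_outside  = line_after  < lo || line_after  > hi;
        return before_outside && after_outside;
}
```

The enable/disable path would then loop: query the scanout position, read the hw counter, query the position again, and retry the read while counter_sample_ok() returns 0.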

>> On current nouveau kms, disabling vblank irqs guarantees you wrong
>> vblank counts and wrong vblank timestamps after turning them on
>> again, because the kms driver doesn't implement the hook for
>> hardware vblank counter query correctly. The drm vblank counter
>> misses all counts during the vblank irq off period. Other timing
>> related hooks are missing as well. I have a couple of patches queued
>> up and some more to come for the ddx and kms driver to mostly fix
>> this. NVidia gpu's only have hardware vblank counters for NV-50 and
>> later, fixing this for earlier gpu's would require some emulation of
>> a hw vblank counter inside the kms driver.
>
> I've no problem with all of this work being case by case.

Me neither. I just say that if you went to the extreme and disabled
vblank irqs immediately, with zero delay, without reliably fixing the
problem I mentioned, you'd get those off-by-one counts, which would be
very bad for apps that rely on precise timing. Doing so would
basically make the whole oml_sync_control implementation mostly
useless.

>
>> Apps that rely on the vblank counter being totally reliable over
>> long periods of time currently would be in a bad situation with a
>> lowered vblank off delay, but that's probably generally not a good
>> assumption. Toolkits like mine, which are more paranoid, currently
>> can work fine as long as the off delay is at least a few video
>> refresh cycles. I do the following for scheduling a reliably timed
>> swap:
>>
>> 1. Query current vblank counter current_msc and vblank timestamp
>> current_ust.
>> 2. Calculate a target vblank count target_msc, based on current_msc,
>> current_ust and some target time from usercode.
>> 3. Schedule bufferswap for target_msc.
>>
>> As long as the vblank off delay is long enough so that vblanks don't
>> get turned off between 1. and 3, everything is fine, otherwise bad
>> things will happen.
>> Keeping a way to override the default off delay would be good to
>> allow poor scientists to work around potentially broken drivers or
>> gpu's in the future. @Matthew: I'm appealing here to your ex-
>> Drosophila biologist heritage ;-)
>
> If vblanks are disabled and then re-enabled between 1 and 3, what's
> the negative outcome?

1. glXGetSyncValuesOML() gives you the current vblank count and a  
timestamp of when exactly scanout of the corresponding video frame  
started. These values are the baseline for any vblank based swap  
scheduling or any other vblank synchronized actions, e.g., via  
glXWaitForMscOML().

2. App calculates stuff based on 1. but gets preempted or experiences  
scheduling delay on return from 1. - or the x-server gets preempted  
or scheduled away. Whatever, time passes...

3. glXSwapBuffersMscOML() or glXWaitForMscOML() is used to schedule  
something to occur at a target vblank count, based on the results of 1.

As long as vblanks don't get disabled between 1 and 3, everything is  
consistent. Scheduling delays/preemption of more than a few (dozen)  
milliseconds worst case are very rare, so a lowered vblank off delay  
of even one or two refresh cycles would solve the problem. I assume  
that most toolkits with needs for timing precision would follow a  
three step strategy like this one.
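In code, that three-step pattern looks roughly like this (a sketch only; compute_target_msc is a made-up helper, and it assumes UST timestamps in microseconds plus a known nominal refresh duration, as GLX_OML_sync_control implementations on Linux provide):

```c
#include <assert.h>
#include <stdint.h>

/* Step 2 of the scheme above: turn a desired onset time into a target
 * vblank count, given the baseline from glXGetSyncValuesOML() (step 1).
 * current_ust/target_ust are in microseconds; refresh_ust is the nominal
 * duration of one refresh cycle in microseconds. */
static int64_t compute_target_msc(int64_t current_msc, int64_t current_ust,
				  int64_t target_ust, int64_t refresh_ust)
{
	int64_t delay_ust = target_ust - current_ust;

	if (delay_ust <= 0)
		return current_msc + 1;	/* already late: next possible vblank */

	/* round up: the swap can only happen at a vblank at or after target */
	return current_msc + (delay_ust + refresh_ust - 1) / refresh_ust;
}
```

Step 3 then hands target_msc to glXSwapBuffersMscOML() or glXWaitForMscOML(); the result is only meaningful if the vblank counter isn't reinitialized off by one between steps 1 and 3.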

If vblank irqs get disabled immediately after step 1. and reenabled  
at step 3., and this triggers an off-by-one error due to a race, then  
the calculated target vblank count from steps 1/2 is invalid and  
useless for step 3, and many vision scientists who just recently made  
friends with Linux will make very sad faces. Due to the nature of  
many of the stimulation paradigms used, there is also a high chance  
that many of the enables and disables would happen inside or very  
close to the problematic vblank interval, where an off-by-one error  
is most likely.

I agree with your patches, I just don't want vblank irqs to be  
disabled immediately with a zero timeout before the remaining races  
are fixed; that would surely break things, at least for scientific  
applications. And I'd like to keep some optional parameter that  
allows desperate users to disable the vblank off mechanism in case  
they encounter a bug.

-mario


*********************************************************************
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.kleiner@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:    +49 (0)7071/601-616
www:    http://www.kyb.tuebingen.mpg.de/~kleinerm
*********************************************************************
"For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled."
(Richard Feynman)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] Reduce idle vblank wakeups
  2011-11-17  0:26         ` Mario Kleiner
@ 2011-11-17  2:19           ` Matthew Garrett
  2011-11-17  5:36             ` [Intel-gfx] " Andrew Lutomirski
  0 siblings, 1 reply; 15+ messages in thread
From: Matthew Garrett @ 2011-11-17  2:19 UTC (permalink / raw)
  To: Mario Kleiner; +Cc: Michel Dänzer, intel-gfx, dri-devel

On Thu, Nov 17, 2011 at 01:26:37AM +0100, Mario Kleiner wrote:
> On Nov 16, 2011, at 7:48 PM, Matthew Garrett wrote:
> >I'll admit that I'm struggling to understand the issue here. If the
> >vblank counter is incremented at the time of vblank (which isn't the
> >case for radeon, it seems, but as far as I can tell is the case for
> >Intel) then how does ping-ponging the IRQ matter?
> >vblank_disable_and_save() appears to handle this case.
> >
> 
> The drm software vblank counter, which is used for scheduling swaps
> etc., gets incremented in the gpu's vblank irq handler. The gpu's
> hardware counter gets incremented somewhere inside the vblank
> interval. The increments don't happen at the same point in time. The
> race is between the vblank on/off code, which gets scheduled either
> by the timeout timer for the "off" case or by usercode for the "on"
> case, and the gpu's hardware vblank counter.

Yes, that makes sense given the current design.
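To make the race concrete, here's a minimal simulation (illustrative only, not driver code; all names are made up) of the two counters: the hardware counter ticks somewhere inside the vblank interval, while the software counter only advances when the irq handler runs, so a disable that samples the hardware counter in between sees a value one ahead of the software count:

```c
#include <assert.h>

struct sim {
	unsigned int hw;	/* hardware vblank counter */
	unsigned int drm;	/* software counter, bumped by the irq handler */
};

/* The hardware increments its counter somewhere inside the vblank
 * interval... */
static void hw_counter_ticks(struct sim *s)
{
	s->hw++;
}

/* ...and the irq handler advances the software counter some time later. */
static void vblank_irq(struct sim *s)
{
	s->drm++;
}

/* The disable path samples the hw counter so the next enable can re-sync.
 * If it runs between the two events above, the saved value is one ahead
 * of the software count: the off-by-one. */
static unsigned int disable_and_save(const struct sim *s)
{
	return s->hw;
}
```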

> The existing code uses a lock (vblank_time_lock) to prevent some
> races between itself and the vblank irq handler, but it can't "lock"
> the gpu, so if the enable/disable code executes while the gpu is
> scanning out the vblank interval, it is undefined whether the final
> vblank count will be correct or off by one. Vblank duration is
> somewhere up to 4.5% of the refresh interval duration, so there's
> your up to 4.5% chance of getting it wrong.

Well presumably by "undefined" we really mean "hardware-specific" - for 
any given well-functioning hardware I'd expect the answer to be 
well-defined. Like I said, with Radeon we have the option of triggering 
an interrupt at the point where the hardware counter is actually 
incremented. If that then shuts down the vblank interrupt then we should 
have a well-defined answer.

> >Does the timeout serve any purpose other than letting software
> >effectively prevent vblanks from being disabled?
> 
> With perfect drivers and gpu's in a perfect world, no. In reality
> there's the race I described above, and there's nouveau and all the
> other drivers besides intel and radeon. The vblank irq also drives
> timestamping of vblanks, one update per vblank. The timestamps are
> cached if a client needs them in between updates. Turning off the
> vblank irq invalidates the timestamps. radeon and intel can recreate
> the timestamp anytime as needed, but nouveau lacks this at the
> moment, so timestamps remain invalid for a whole video refresh cycle
> after the vblank irq is reenabled. We have patches for nouveau kms
> almost ready, so only the race mentioned above would remain.

Sure. We'd certainly need to improve things for other GPUs before 
enabling the same functionality there.

> >For Radeon, I'd have thought you could handle this by scheduling
> >an irq
> >for the beginning of scanout (avivo has a register for that) and
> >delaying the vblank disable until you hit it?
> 
> For Radeon there is such an irq, but iirc we had some discussions on
> this, also with Alex Deucher, a while ago and some irq's weren't
> considered very reliable, or already used for other stuff. The idea
> i had goes like this:
> 
> Use the crtc scanout position queries together with the vblank
> counter queries inside some calibration loop, maybe executed after
> each modeset, to find out the scanline range in which the hardware
> vblank counter increments -- basically a forbidden range of scanline
> positions where the race would happen. Then at each vblank off/on,
> query scanout position before and after the hw vblank counter query.
> If according to the scanout positions the vblank counter query
> happened within the forbidden time window, retry the query. With a
> well working calibration that should add no delay in most cases and
> a delay to the on/off code of a few dozen microseconds (=duration of
> a few scanlines) worst case.

Assuming we're sleeping rather than busy-looping, that's certainly ok. 
My previous experiments with radeon indicated that the scanout irq was 
not entirely reliable - on the other hand, I was trying to use it for 
completing memory reclocking within the vblank interval. It was 
typically still within a few scanlines, so a sanity check there wouldn't 
pose too much of a problem.
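Mario's bracketing idea can be sketched as a small predicate (made-up names, not radeon code; it assumes the calibrated forbidden range lies inside the vblank interval): read the scanout position, read the hardware counter, read the position again, and retry the counter query whenever the bracket could have overlapped the range:

```c
#include <assert.h>

/* Calibrated range of scanline positions in which the hw counter ticks. */
struct forbidden_range {
	int lo, hi;
};

static int in_range(const struct forbidden_range *r, int line)
{
	return line >= r->lo && line <= r->hi;
}

/* A counter read bracketed by scanout positions 'before' and 'after' is
 * trustworthy only if neither position, nor the span between them,
 * touches the forbidden range.  after < before means scanout wrapped
 * past the end of the frame, which crosses the range (it lies in the
 * vblank interval), so that case always retries. */
static int read_is_trustworthy(const struct forbidden_range *r,
			       int before, int after)
{
	if (after < before)
		return 0;
	if (in_range(r, before) || in_range(r, after))
		return 0;
	return !(before < r->lo && after > r->hi);
}
```

The on/off path would loop until read_is_trustworthy() returns nonzero; with good calibration that adds no delay in the common case and a few scanlines' worth at worst, matching Mario's estimate.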

> >I've no problem with all of this work being case by case.
> 
> Me neither. I'm just saying that if you went to the extreme and
> disabled vblank irqs immediately, with zero delay, without reliably
> fixing the problem I mentioned, you'd get those off-by-one counts,
> which would be very bad for apps that rely on precise timing. Doing
> so would basically make the whole OML_sync_control implementation
> mostly useless.

Right. My testing of sandybridge suggests that there wasn't a problem 
here - even with the ping-ponging I was reliably getting 60 interrupts 
in 60 seconds, with the counter incrementing by 1 each time. I certainly 
wouldn't push to enable it elsewhere without making sure that the 
results are reliable.

> >If vblanks are disabled and then re-enabled between 1 and 3,
> >what's the
> >negative outcome?
> 
> 1. glXGetSyncValuesOML() gives you the current vblank count and a
> timestamp of when exactly scanout of the corresponding video frame
> started. These values are the baseline for any vblank based swap
> scheduling or any other vblank synchronized actions, e.g., via
> glXWaitForMscOML().
> 
> 2. App calculates stuff based on 1. but gets preempted or
> experiences scheduling delay on return from 1. - or the x-server
> gets preempted or scheduled away. Whatever, time passes...
> 
> 3. glXSwapBuffersMscOML() or glXWaitForMscOML() is used to schedule
> something to occur at a target vblank count, based on the results of
> 1.
> 
> As long as vblanks don't get disabled between 1 and 3, everything is
> consistent. Scheduling delays/preemption of more than a few (dozen)
> milliseconds worst case are very rare, so a lowered vblank off delay
> of even one or two refresh cycles would solve the problem. I assume
> that most toolkits with needs for timing precision would follow a
> three step strategy like this one.
> 
> If vblank irqs get disabled immediately after step 1. and reenabled
> at step 3., and this triggers an off-by-one error due to a race,
> then the calculated target vblank count from steps 1/2 is invalid
> and useless for step 3, and many vision scientists who just recently
> made friends with Linux will make very sad faces. Due to the nature
> of many of the stimulation paradigms used, there is also a high
> chance that many of the enables and disables would happen inside or
> very close to the problematic vblank interval, where an off-by-one
> error is most likely.

Ok, so as long as we believe that we're reliably reading the hardware 
counter and not getting off by one, we should be ok? I just want to be 
clear that this is dependent on hardware behaviour rather than being 
absolutely inherent :)

> I agree with your patches, I just don't want vblank irqs to be
> disabled immediately with a zero timeout before the remaining races
> are fixed; that would surely break things, at least for scientific
> applications. And I'd like to keep some optional parameter that
> allows desperate users to disable the vblank off mechanism in case
> they encounter a bug.

Yeah, no problem. I understand that completely.

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [Intel-gfx] [RFC] Reduce idle vblank wakeups
  2011-11-17  2:19           ` Matthew Garrett
@ 2011-11-17  5:36             ` Andrew Lutomirski
  2011-11-17  5:39               ` [RFC PATCH] drm: Fix off-by-one races on vblank disable Andy Lutomirski
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Lutomirski @ 2011-11-17  5:36 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: dri-devel, Michel Dänzer, intel-gfx, Mario Kleiner

On Wed, Nov 16, 2011 at 6:19 PM, Matthew Garrett <mjg59@srcf.ucam.org> wrote:
> On Thu, Nov 17, 2011 at 01:26:37AM +0100, Mario Kleiner wrote:
>> On Nov 16, 2011, at 7:48 PM, Matthew Garrett wrote:
>> >For Radeon, I'd have thought you could handle this by scheduling
>> >an irq
>> >for the beginning of scanout (avivo has a register for that) and
>> >delaying the vblank disable until you hit it?
>>
>> For Radeon there is such an irq, but iirc we had some discussions on
>> this, also with Alex Deucher, a while ago and some irq's weren't
>> considered very reliable, or already used for other stuff. The idea
>> i had goes like this:
>>
>> Use the crtc scanout position queries together with the vblank
>> counter queries inside some calibration loop, maybe executed after
>> each modeset, to find out the scanline range in which the hardware
>> vblank counter increments -- basically a forbidden range of scanline
>> positions where the race would happen. Then at each vblank off/on,
>> query scanout position before and after the hw vblank counter query.
>> If according to the scanout positions the vblank counter query
>> happened within the forbidden time window, retry the query. With a
>> well working calibration that should add no delay in most cases and
>> a delay to the on/off code of a few dozen microseconds (=duration of
>> a few scanlines) worst case.
>
> Assuming we're sleeping rather than busy-looping, that's certainly ok.
> My previous experiments with radeon indicated that the scanout irq was
> certainly not entirely reliable - on the other hand, I was trying to use
> it for completing memory reclocking within the vblank interval. It was
> typically still within a few scanlines, so a sanity check there wouldn't
> pose too much of a problem.

I think there's a simpler fix: just keep the hardware and software
counts in sync -- if everything is working correctly (even with all
these crazy races), the difference should be constant.  Patch coming
momentarily.
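The constant-difference idea reduces to modular arithmetic on u32 counters; a standalone sketch of the correction (hwcount_correction is a made-up name, mirroring the logic in the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Given how far each counter has advanced since the last save, return
 * the correction to apply to a freshly read hardware count: +1 if the
 * irq ran but the hw counter hadn't ticked yet, -1 if the hw counter
 * ticked but the irq hadn't run, 0 if they agree (or are hopelessly
 * out of sync, in which case guessing could make things worse). */
static int hwcount_correction(uint32_t drm_counts_seen,
			      uint32_t hw_counts_seen,
			      uint32_t max_vblank_count)
{
	uint32_t offset = (drm_counts_seen - hw_counts_seen) % max_vblank_count;

	if (offset == 1)
		return 1;
	if (offset == max_vblank_count - 1)
		return -1;
	return 0;
}
```

Unsigned wraparound does the right thing here: if the hw count advanced one more than the drm count, the u32 subtraction yields max_vblank_count - 1 after the modulo, flagging the -1 case.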

--Andy


* [RFC PATCH] drm: Fix off-by-one races on vblank disable
  2011-11-17  5:36             ` [Intel-gfx] " Andrew Lutomirski
@ 2011-11-17  5:39               ` Andy Lutomirski
  0 siblings, 0 replies; 15+ messages in thread
From: Andy Lutomirski @ 2011-11-17  5:39 UTC (permalink / raw)
  To: Matthew Garrett, Mario Kleiner
  Cc: Andy Lutomirski, Michel Dänzer, intel-gfx, dri-devel

There are two possible races when disabling vblanks.  If the IRQ
fired but the hardware didn't update its counter yet, then we store
too low a hardware counter.  (Sensible hardware never does this.
Apparently not all hardware is sensible.)  If, on the other hand,
the counter updated but the IRQ didn't fire yet, we store too high a
counter.

We handled the former case with a heuristic based on timestamps and
we did not handle the latter case.  By saving a little more state,
we can handle both cases exactly: all we need to do is watch for
changes in the difference between the hardware and software vblank
counts.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
---

Rather than tweaking more things to reduce the chance of hitting a race
while keeping the vblank disable timeout as low as possible, why not
just fix the race?

This compiles but is not very well tested, because I don't know what
tests to run.  I haven't been able to provoke either race on my SNB
laptop.

 drivers/gpu/drm/drm_irq.c |   92 ++++++++++++++++++++++++++++----------------
 include/drm/drmP.h        |    2 +-
 2 files changed, 59 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 3830e9e..1674a33 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -56,6 +56,12 @@
  */
 #define DRM_REDUNDANT_VBLIRQ_THRESH_NS 1000000
 
+/* Saved vblank count data, used only in this file. */
+struct drm_vbl_counts {
+	u32 hwcount;	/* hw count at last state save or load */
+	u32 drmcount;	/* drm count at last state save or load */
+};
+
 /**
  * Get interrupt from bus id.
  *
@@ -101,7 +107,8 @@ static void clear_vblank_timestamps(struct drm_device *dev, int crtc)
 static void vblank_disable_and_save(struct drm_device *dev, int crtc)
 {
 	unsigned long irqflags;
-	u32 vblcount;
+	u32 drmcount, hwcount;
+	u32 drm_counts_seen, hw_counts_seen, offset;
 	s64 diff_ns;
 	int vblrc;
 	struct timeval tvblank;
@@ -121,44 +128,53 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc)
 	/* No further vblank irq's will be processed after
 	 * this point. Get current hardware vblank count and
 	 * vblank timestamp, repeat until they are consistent.
-	 *
-	 * FIXME: There is still a race condition here and in
-	 * drm_update_vblank_count() which can cause off-by-one
-	 * reinitialization of software vblank counter. If gpu
-	 * vblank counter doesn't increment exactly at the leading
-	 * edge of a vblank interval, then we can lose 1 count if
-	 * we happen to execute between start of vblank and the
-	 * delayed gpu counter increment.
 	 */
 	do {
-		dev->last_vblank[crtc] = dev->driver->get_vblank_counter(dev, crtc);
+		hwcount = dev->driver->get_vblank_counter(dev, crtc);
 		vblrc = drm_get_last_vbltimestamp(dev, crtc, &tvblank, 0);
-	} while (dev->last_vblank[crtc] != dev->driver->get_vblank_counter(dev, crtc));
+	} while (hwcount != dev->driver->get_vblank_counter(dev, crtc));
 
 	/* Compute time difference to stored timestamp of last vblank
 	 * as updated by last invocation of drm_handle_vblank() in vblank irq.
 	 */
-	vblcount = atomic_read(&dev->_vblank_count[crtc]);
+	drmcount = atomic_read(&dev->_vblank_count[crtc]);
 	diff_ns = timeval_to_ns(&tvblank) -
-		  timeval_to_ns(&vblanktimestamp(dev, crtc, vblcount));
+		  timeval_to_ns(&vblanktimestamp(dev, crtc, drmcount));
 
-	/* If there is at least 1 msec difference between the last stored
-	 * timestamp and tvblank, then we are currently executing our
-	 * disable inside a new vblank interval, the tvblank timestamp
-	 * corresponds to this new vblank interval and the irq handler
-	 * for this vblank didn't run yet and won't run due to our disable.
-	 * Therefore we need to do the job of drm_handle_vblank() and
-	 * increment the vblank counter by one to account for this vblank.
+	/* We could be off by one in either direction.  If a vblank just
+	 * happened but the IRQ hasn't been handled yet, then drmcount is
+	 * too low by one.  On the other hand, if the GPU fires its vblank
+	 * interrupts *before* updating its counter, then hwcount could
+	 * be too low by one.  (If both happen, they cancel out.)
 	 *
-	 * Skip this step if there isn't any high precision timestamp
-	 * available. In that case we can't account for this and just
-	 * hope for the best.
+	 * Fortunately, we have enough information to figure out what
+	 * happened.  Assuming the hardware counter works right, the
+	 * difference between drmcount and hwcount should be a constant
+	 * (modulo max_vblank_count).  We have both saved values from last
+	 * time we turned the interrupt on.
 	 */
-	if ((vblrc > 0) && (abs64(diff_ns) > 1000000)) {
-		atomic_inc(&dev->_vblank_count[crtc]);
-		smp_mb__after_atomic_inc();
+	drm_counts_seen = drmcount - dev->last_vblank[crtc].drmcount;
+	hw_counts_seen = hwcount - dev->last_vblank[crtc].hwcount;
+
+	offset = (drm_counts_seen - hw_counts_seen) % dev->max_vblank_count;
+	if (offset == 1) {
+		hwcount++;
+		DRM_DEBUG("last_vblank[%d]: hwcount++\n", crtc);
+	} else if (offset == dev->max_vblank_count - 1) {
+		hwcount--;
+		DRM_DEBUG("last_vblank[%d]: hwcount--\n", crtc);
+	} else if (offset != 0) {
+		DRM_DEBUG("last_vblank[%d]: drm_counts_seen = %u, hw_counts_seen = %u, max = %u is unexpected\n",
+			  crtc, (unsigned)drm_counts_seen,
+			  (unsigned)hw_counts_seen,
+			  (unsigned)dev->max_vblank_count);
+
+		/* Trying to fix it might just make it worse. */
 	}
 
+	dev->last_vblank[crtc].hwcount = hwcount;
+	dev->last_vblank[crtc].drmcount = drmcount;  /* not really necessary */
+
 	/* Invalidate all timestamps while vblank irq's are off. */
 	clear_vblank_timestamps(dev, crtc);
 
@@ -238,7 +254,8 @@ int drm_vblank_init(struct drm_device *dev, int num_crtcs)
 	if (!dev->vblank_enabled)
 		goto err;
 
-	dev->last_vblank = kcalloc(num_crtcs, sizeof(u32), GFP_KERNEL);
+	dev->last_vblank = kcalloc(num_crtcs, sizeof(struct drm_vbl_counts),
+				   GFP_KERNEL);
 	if (!dev->last_vblank)
 		goto err;
 
@@ -268,6 +285,9 @@ int drm_vblank_init(struct drm_device *dev, int num_crtcs)
 		init_waitqueue_head(&dev->vbl_queue[i]);
 		atomic_set(&dev->_vblank_count[i], 0);
 		atomic_set(&dev->vblank_refcount[i], 0);
+		dev->last_vblank[i].drmcount = 0;
+		dev->last_vblank[i].hwcount =
+			dev->driver->get_vblank_counter(dev, i);
 	}
 
 	dev->vblank_disable_allowed = 0;
@@ -410,7 +430,9 @@ int drm_irq_uninstall(struct drm_device *dev)
 	for (i = 0; i < dev->num_crtcs; i++) {
 		DRM_WAKEUP(&dev->vbl_queue[i]);
 		dev->vblank_enabled[i] = 0;
-		dev->last_vblank[i] = dev->driver->get_vblank_counter(dev, i);
+		dev->last_vblank[i].hwcount =
+			dev->driver->get_vblank_counter(dev, i);
+		dev->last_vblank[i].drmcount = drm_vblank_count(dev, i);
 	}
 	spin_unlock_irqrestore(&dev->vbl_lock, irqflags);
 
@@ -841,12 +863,12 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc)
 	} while (cur_vblank != dev->driver->get_vblank_counter(dev, crtc));
 
 	/* Deal with counter wrap */
-	diff = cur_vblank - dev->last_vblank[crtc];
-	if (cur_vblank < dev->last_vblank[crtc]) {
+	diff = cur_vblank - dev->last_vblank[crtc].hwcount;
+	if (cur_vblank < dev->last_vblank[crtc].hwcount) {
 		diff += dev->max_vblank_count;
 
-		DRM_DEBUG("last_vblank[%d]=0x%x, cur_vblank=0x%x => diff=0x%x\n",
-			  crtc, dev->last_vblank[crtc], cur_vblank, diff);
+		DRM_DEBUG("last_vblank[%d].hwcount=0x%x, cur_vblank=0x%x => diff=0x%x\n",
+			  crtc, dev->last_vblank[crtc].hwcount, cur_vblank, diff);
 	}
 
 	DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n",
@@ -861,8 +883,10 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc)
 		vblanktimestamp(dev, crtc, tslot) = t_vblank;
 	}
 
+	dev->last_vblank[crtc].hwcount = cur_vblank;
 	smp_mb__before_atomic_inc();
-	atomic_add(diff, &dev->_vblank_count[crtc]);
+	dev->last_vblank[crtc].drmcount =
+		atomic_add_return(diff, &dev->_vblank_count[crtc]);
 	smp_mb__after_atomic_inc();
 }
 
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 9b7c2bb..4950c0f 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -1108,7 +1108,7 @@ struct drm_device {
 	spinlock_t vblank_time_lock;    /**< Protects vblank count and time updates during vblank enable/disable */
 	spinlock_t vbl_lock;
 	atomic_t *vblank_refcount;      /* number of users of vblank interruptsper crtc */
-	u32 *last_vblank;               /* protected by dev->vbl_lock, used */
+	struct drm_vbl_counts *last_vblank; /* protected by dev->vbl_lock, used */
 					/* for wraparound handling */
 	int *vblank_enabled;            /* so we don't call enable more than
 					   once per disable */
-- 
1.7.7.1


* Re: [Intel-gfx] [RFC] Reduce idle vblank wakeups
  2011-11-16 14:20 [RFC] Reduce idle vblank wakeups Matthew Garrett
                   ` (4 preceding siblings ...)
  2011-11-16 17:03 ` Michel Dänzer
@ 2012-05-04 18:31 ` Matthew Garrett
  5 siblings, 0 replies; 15+ messages in thread
From: Matthew Garrett @ 2012-05-04 18:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Anyone have any further thoughts on this?

-- 
Matthew Garrett | mjg59@srcf.ucam.org


end of thread, other threads:[~2012-05-04 18:31 UTC | newest]

Thread overview: 15+ messages
2011-11-16 14:20 [RFC] Reduce idle vblank wakeups Matthew Garrett
2011-11-16 14:20 ` [RFC 1/3] drm: Make drm_vblank_offdelay per-device Matthew Garrett
2011-11-16 14:20 ` [RFC 2/3] drm: Handle the vblank_offdelay=0 case properly Matthew Garrett
2011-11-16 14:20 ` [RFC 3/3] i915: Drop vblank_offdelay Matthew Garrett
2011-11-16 14:32 ` [RFC] Reduce idle vblank wakeups Adam Jackson
2011-11-16 17:03 ` Michel Dänzer
2011-11-16 17:11   ` Matthew Garrett
2011-11-16 17:53     ` [Intel-gfx] " Andrew Lutomirski
2011-11-16 18:27     ` Mario Kleiner
2011-11-16 18:48       ` Matthew Garrett
2011-11-17  0:26         ` Mario Kleiner
2011-11-17  2:19           ` Matthew Garrett
2011-11-17  5:36             ` [Intel-gfx] " Andrew Lutomirski
2011-11-17  5:39               ` [RFC PATCH] drm: Fix off-by-one races on vblank disable Andy Lutomirski
2012-05-04 18:31 ` [Intel-gfx] [RFC] Reduce idle vblank wakeups Matthew Garrett
