linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* BUG: sleeping function called from invalid context on 3.10.10-rt7
@ 2013-09-11 10:28 Luis Claudio R. Goncalves
  2013-09-11 13:26 ` Steven Rostedt
  2013-10-11 14:19 ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 32+ messages in thread
From: Luis Claudio R. Goncalves @ 2013-09-11 10:28 UTC (permalink / raw)
  To: linux-rt-users
  Cc: Steven Rostedt, Thomas Gleixner, Sebastian Andrzej Siewior,
	Clark Williams

Hello,

I saw two different occurrences of "BUG: sleeping function called from
invalid context" happening on 3.10.10-rt7. The first one happened on
drm_vblank_get() -> i915_get_vblank_timestamp() and was flooding dmesg
after Xorg started. I silenced that with a hackish patch, just for fun, and
found a second problem, this time on tty_ldisc_reinit().

The logs:

[   23.973991] BUG: sleeping function called from invalid context at /home/lclaudio/SANDBOX/mrg-rt-v2-devel/kernel/rtmutex.c:659
[   23.973992] in_atomic(): 1, irqs_disabled(): 0, pid: 517, name: Xorg
[   23.973993] 2 locks held by Xorg/517:
[   23.973993]  #0: 
[   23.973994]  (
[   23.973994] &dev->vbl_lock
[   23.973995] ){......}
[   23.973995] , at: 
[   23.974007] [<ffffffffa0024c60>] drm_vblank_get+0x30/0x2b0 [drm]
[   23.974008]  #1: 
[   23.974008]  (
[   23.974008] &dev->vblank_time_lock
[   23.974008] ){......}
[   23.974009] , at: 
[   23.974013] [<ffffffffa0024ce1>] drm_vblank_get+0xb1/0x2b0 [drm]
[   23.974013] Preemption disabled at:
[   23.974021] [<ffffffffa008bc95>] i915_get_vblank_timestamp+0x45/0xa0 [i915]
[   23.974023] CPU: 3 PID: 517 Comm: Xorg Not tainted 3.10.10-rt7+ #5
[   23.974024] Hardware name: Hewlett-Packard p7-1512/2ADA, BIOS 8.15 02/05/2013
[   23.974024]  0000000000070008
[   23.974025]  ffff88017a08bae0
[   23.974025]  ffffffff8164b790
[   23.974025]  ffff88017a08baf8
[   23.974026]  ffffffff8107e62f
[   23.974026]  ffff880179840040
[   23.974026]  ffff88017a08bb18
[   23.974027]  ffffffff81651ac4
[   23.974027]  0000000000000002
[   23.974027]  ffff880179840000
[   23.974028]  ffff88017a08bb48
[   23.974028]  ffffffffa0084e67
[   23.974028] Call Trace:
[   23.974033]  [<ffffffff8164b790>] dump_stack+0x19/0x1b
[   23.974035]  [<ffffffff8107e62f>] __might_sleep+0xff/0x170
[   23.974037]  [<ffffffff81651ac4>] rt_spin_lock+0x24/0x60
[   23.974043]  [<ffffffffa0084e67>] i915_read32+0x27/0x170 [i915]
[   23.974048]  [<ffffffffa008a591>] i915_pipe_enabled+0x31/0x40 [i915]
[   23.974054]  [<ffffffffa008a6be>] i915_get_crtc_scanoutpos+0x3e/0x1b0 [i915]
[   23.974056]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
[   23.974062]  [<ffffffffa00245d4>] drm_calc_vbltimestamp_from_scanoutpos+0xf4/0x430 [drm]
[   23.974068]  [<ffffffffa008bc95>] i915_get_vblank_timestamp+0x45/0xa0 [i915]
[   23.974073]  [<ffffffffa0024998>] drm_get_last_vbltimestamp+0x48/0x70 [drm]
[   23.974078]  [<ffffffffa0024db5>] drm_vblank_get+0x185/0x2b0 [drm]
[   23.974079]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
[   23.974085]  [<ffffffffa0025d03>] drm_wait_vblank+0x83/0x5d0 [drm]
[   23.974086]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
[   23.974088]  [<ffffffff810af143>] ? __lock_acquire.isra.31+0x273/0xa70
[   23.974093]  [<ffffffffa00212a2>] drm_ioctl+0x552/0x6a0 [drm]
[   23.974096]  [<ffffffff8128d886>] ? avc_has_perm_flags+0x126/0x1c0
[   23.974098]  [<ffffffff8128d788>] ? avc_has_perm_flags+0x28/0x1c0
[   23.974099]  [<ffffffff810ad49e>] ? put_lock_stats.isra.23+0xe/0x40
[   23.974101]  [<ffffffff811a0095>] do_vfs_ioctl+0x325/0x5b0
[   23.974103]  [<ffffffff8128fc3e>] ? file_has_perm+0x8e/0xa0
[   23.974105]  [<ffffffff811a03a1>] SyS_ioctl+0x81/0xa0
[   23.974107]  [<ffffffff8165a342>] tracesys+0xdd/0xe2


and


[   25.423675] BUG: sleeping function called from invalid context at /home/lclaudio/SANDBOX/mrg-rt-v2-devel/kernel/rtmutex.c:659
[   25.423676] in_atomic(): 1, irqs_disabled(): 1, pid: 188, name: plymouthd
[   25.423676] 3 locks held by plymouthd/188:
[   25.423682]  #0:  (&tty->legacy_mutex){......}, at: [<ffffffff816528d0>] tty_lock_nested+0x40/0x90
[   25.423686]  #1:  (&tty->ldisc_mutex){......}, at: [<ffffffff8139b482>] tty_ldisc_hangup+0x152/0x300
[   25.423688]  #2:  (tty_ldisc_lock){......}, at: [<ffffffff8139a9c2>] tty_ldisc_reinit+0x72/0x130
[   25.423689] Preemption disabled at:[<ffffffff8139a9c2>] tty_ldisc_reinit+0x72/0x130
[   25.423691] CPU: 2 PID: 188 Comm: plymouthd Not tainted 3.10.10-rt7+ #6
[   25.423692] Hardware name: Hewlett-Packard p7-1512/2ADA, BIOS 8.15 02/05/2013
[   25.423694]  005ff00000000000 ffff8801788ada68 ffffffff8164b790 ffff8801788ada80
[   25.423695]  ffffffff8107e62f ffff8801991ce1a0 ffff8801788adaa0 ffffffff81651ac4
[   25.423696]  0000000000000000 ffffea0005e26680 ffff8801788adaf8 ffffffff81130984
[   25.423696] Call Trace:
[   25.423700]  [<ffffffff8164b790>] dump_stack+0x19/0x1b
[   25.423702]  [<ffffffff8107e62f>] __might_sleep+0xff/0x170
[   25.423704]  [<ffffffff81651ac4>] rt_spin_lock+0x24/0x60
[   25.423707]  [<ffffffff81130984>] free_hot_cold_page+0xb4/0x3c0
[   25.423710]  [<ffffffff81178209>] ?unfreeze_partials.isra.42+0x229/0x2b0
[   25.423711]  [<ffffffff81130dc7>] __free_pages+0x47/0x70
[   25.423713]  [<ffffffff81130fb2>] __free_memcg_kmem_pages+0x22/0x50
[   25.423714]  [<ffffffff81177528>] __free_slab+0xe8/0x1e0
[   25.423716]  [<ffffffff81177654>] free_delayed+0x34/0x50
[   25.423717]  [<ffffffff81649633>] __slab_free+0x273/0x36b
[   25.423719]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
[   25.423721]  [<ffffffff810ad49e>] ? put_lock_stats.isra.23+0xe/0x40
[   25.423722]  [<ffffffff81178794>] kfree+0x1c4/0x210
[   25.423724]  [<ffffffff8139a9f5>] ? tty_ldisc_reinit+0xa5/0x130
[   25.423725]  [<ffffffff8139a9f5>] tty_ldisc_reinit+0xa5/0x130
[   25.423726]  [<ffffffff8139b49f>] tty_ldisc_hangup+0x16f/0x300
[   25.423728]  [<ffffffff81082e1d>] ? get_parent_ip+0xd/0x50
[   25.423731]  [<ffffffff8104d736>] ? unpin_current_cpu+0x16/0x70
[   25.423732]  [<ffffffff81392136>] __tty_hangup+0x346/0x460
[   25.423733]  [<ffffffff81392260>] tty_vhangup+0x10/0x20
[   25.423735]  [<ffffffff8139d6e1>] pty_close+0x131/0x180
[   25.423736]  [<ffffffff813936ad>] tty_release+0x11d/0x5f0
[   25.423737]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
[   25.423747]  [<ffffffff81009d84>] ? native_sched_clock+0x24/0x80
[   25.423749]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
[   25.423750]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
[   25.423752]  [<ffffffff8118eb45>] __fput+0xf5/0x250
[   25.423762]  [<ffffffff8118ed6e>] ____fput+0xe/0x10
[   25.423764]  [<ffffffff8106f1cc>] task_work_run+0xac/0xe0
[   25.423765]  [<ffffffff81002a11>] do_notify_resume+0x91/0xf0
[   25.423767]  [<ffffffff8165a41c>] int_signal+0x12/0x17


About the hackish patch I used to silence the constant i915 complaints,
even though the comments on drm_calc_vbltimestamp_from_scanoutpos() state
the preempt_disable is there for a reason, I removed it for the RT case. I
really wanted to confirm if there was anything else after that complaint.

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index f92da0a..bd2e8e2 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -631,7 +631,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
 		/* Disable preemption to make it very likely to
 		 * succeed in the first iteration even on PREEMPT_RT kernel.
 		 */
-		preempt_disable();
+		preempt_disable_nort();
 
 		/* Get system timestamp before query. */
 		stime = ktime_get();
@@ -644,7 +644,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
 		if (!drm_timestamp_monotonic)
 			mono_time_offset = ktime_get_monotonic_offset();
 
-		preempt_enable();
+		preempt_enable_nort();
 
 		/* Return as no-op if scanout query unsupported or failed. */
 		if (!(vbl_status & DRM_SCANOUTPOS_VALID)) {

-- 
[ Luis Claudio R. Goncalves                    Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8 ]


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-11 10:28 BUG: sleeping function called from invalid context on 3.10.10-rt7 Luis Claudio R. Goncalves
@ 2013-09-11 13:26 ` Steven Rostedt
  2013-09-11 15:16   ` Peter Hurley
  2013-10-11 14:19 ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2013-09-11 13:26 UTC (permalink / raw)
  To: Luis Claudio R. Goncalves
  Cc: linux-rt-users, Thomas Gleixner, Sebastian Andrzej Siewior,
	Clark Williams, Mario Kleiner, Dave Airlie, LKML

On Wed, 11 Sep 2013 07:28:09 -0300
"Luis Claudio R. Goncalves" <lclaudio@uudg.org> wrote:

> Hello,
> 
> I saw two different occurrences of "BUG: sleeping function called from
> invalid context" happening on 3.10.10-rt7. The first one happened on
> drm_vblank_get() -> i915_get_vblank_timestamp() and was flooding dmesg
> after Xorg started. I silenced that with a hackish patch, just for fun, and
> found a second problem, this time on tty_ldisc_reinit().
> 
> The logs:
> 
> [   23.973991] BUG: sleeping function called from invalid context at /home/lclaudio/SANDBOX/mrg-rt-v2-devel/kernel/rtmutex.c:659
> [   23.973992] in_atomic(): 1, irqs_disabled(): 0, pid: 517, name: Xorg
> [   23.973993] 2 locks held by Xorg/517:
> [   23.973993]  #0: 
> [   23.973994]  (
> [   23.973994] &dev->vbl_lock
> [   23.973995] ){......}
> [   23.973995] , at: 
> [   23.974007] [<ffffffffa0024c60>] drm_vblank_get+0x30/0x2b0 [drm]
> [   23.974008]  #1: 
> [   23.974008]  (
> [   23.974008] &dev->vblank_time_lock
> [   23.974008] ){......}
> [   23.974009] , at: 
> [   23.974013] [<ffffffffa0024ce1>] drm_vblank_get+0xb1/0x2b0 [drm]
> [   23.974013] Preemption disabled at:
> [   23.974021] [<ffffffffa008bc95>] i915_get_vblank_timestamp+0x45/0xa0 [i915]
> [   23.974023] CPU: 3 PID: 517 Comm: Xorg Not tainted 3.10.10-rt7+ #5
> [   23.974024] Hardware name: Hewlett-Packard p7-1512/2ADA, BIOS 8.15 02/05/2013
> [   23.974024]  0000000000070008
> [   23.974025]  ffff88017a08bae0
> [   23.974025]  ffffffff8164b790
> [   23.974025]  ffff88017a08baf8
> [   23.974026]  ffffffff8107e62f
> [   23.974026]  ffff880179840040
> [   23.974026]  ffff88017a08bb18
> [   23.974027]  ffffffff81651ac4
> [   23.974027]  0000000000000002
> [   23.974027]  ffff880179840000
> [   23.974028]  ffff88017a08bb48
> [   23.974028]  ffffffffa0084e67
> [   23.974028] Call Trace:
> [   23.974033]  [<ffffffff8164b790>] dump_stack+0x19/0x1b
> [   23.974035]  [<ffffffff8107e62f>] __might_sleep+0xff/0x170
> [   23.974037]  [<ffffffff81651ac4>] rt_spin_lock+0x24/0x60
> [   23.974043]  [<ffffffffa0084e67>] i915_read32+0x27/0x170 [i915]
> [   23.974048]  [<ffffffffa008a591>] i915_pipe_enabled+0x31/0x40 [i915]
> [   23.974054]  [<ffffffffa008a6be>] i915_get_crtc_scanoutpos+0x3e/0x1b0 [i915]
> [   23.974056]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
> [   23.974062]  [<ffffffffa00245d4>] drm_calc_vbltimestamp_from_scanoutpos+0xf4/0x430 [drm]
> [   23.974068]  [<ffffffffa008bc95>] i915_get_vblank_timestamp+0x45/0xa0 [i915]
> [   23.974073]  [<ffffffffa0024998>] drm_get_last_vbltimestamp+0x48/0x70 [drm]
> [   23.974078]  [<ffffffffa0024db5>] drm_vblank_get+0x185/0x2b0 [drm]
> [   23.974079]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
> [   23.974085]  [<ffffffffa0025d03>] drm_wait_vblank+0x83/0x5d0 [drm]
> [   23.974086]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
> [   23.974088]  [<ffffffff810af143>] ? __lock_acquire.isra.31+0x273/0xa70
> [   23.974093]  [<ffffffffa00212a2>] drm_ioctl+0x552/0x6a0 [drm]
> [   23.974096]  [<ffffffff8128d886>] ? avc_has_perm_flags+0x126/0x1c0
> [   23.974098]  [<ffffffff8128d788>] ? avc_has_perm_flags+0x28/0x1c0
> [   23.974099]  [<ffffffff810ad49e>] ? put_lock_stats.isra.23+0xe/0x40
> [   23.974101]  [<ffffffff811a0095>] do_vfs_ioctl+0x325/0x5b0
> [   23.974103]  [<ffffffff8128fc3e>] ? file_has_perm+0x8e/0xa0
> [   23.974105]  [<ffffffff811a03a1>] SyS_ioctl+0x81/0xa0
> [   23.974107]  [<ffffffff8165a342>] tracesys+0xdd/0xe2
> 
> 
> and
> 
> 
> [   25.423675] BUG: sleeping function called from invalid context at /home/lclaudio/SANDBOX/mrg-rt-v2-devel/kernel/rtmutex.c:659
> [   25.423676] in_atomic(): 1, irqs_disabled(): 1, pid: 188, name: plymouthd
> [   25.423676] 3 locks held by plymouthd/188:
> [   25.423682]  #0:  (&tty->legacy_mutex){......}, at: [<ffffffff816528d0>] tty_lock_nested+0x40/0x90
> [   25.423686]  #1:  (&tty->ldisc_mutex){......}, at: [<ffffffff8139b482>] tty_ldisc_hangup+0x152/0x300
> [   25.423688]  #2:  (tty_ldisc_lock){......}, at: [<ffffffff8139a9c2>] tty_ldisc_reinit+0x72/0x130
> [   25.423689] Preemption disabled at:[<ffffffff8139a9c2>] tty_ldisc_reinit+0x72/0x130
> [   25.423691] CPU: 2 PID: 188 Comm: plymouthd Not tainted 3.10.10-rt7+ #6
> [   25.423692] Hardware name: Hewlett-Packard p7-1512/2ADA, BIOS 8.15 02/05/2013
> [   25.423694]  005ff00000000000 ffff8801788ada68 ffffffff8164b790 ffff8801788ada80
> [   25.423695]  ffffffff8107e62f ffff8801991ce1a0 ffff8801788adaa0 ffffffff81651ac4
> [   25.423696]  0000000000000000 ffffea0005e26680 ffff8801788adaf8 ffffffff81130984
> [   25.423696] Call Trace:
> [   25.423700]  [<ffffffff8164b790>] dump_stack+0x19/0x1b
> [   25.423702]  [<ffffffff8107e62f>] __might_sleep+0xff/0x170
> [   25.423704]  [<ffffffff81651ac4>] rt_spin_lock+0x24/0x60
> [   25.423707]  [<ffffffff81130984>] free_hot_cold_page+0xb4/0x3c0
> [   25.423710]  [<ffffffff81178209>] ?unfreeze_partials.isra.42+0x229/0x2b0
> [   25.423711]  [<ffffffff81130dc7>] __free_pages+0x47/0x70
> [   25.423713]  [<ffffffff81130fb2>] __free_memcg_kmem_pages+0x22/0x50
> [   25.423714]  [<ffffffff81177528>] __free_slab+0xe8/0x1e0
> [   25.423716]  [<ffffffff81177654>] free_delayed+0x34/0x50
> [   25.423717]  [<ffffffff81649633>] __slab_free+0x273/0x36b
> [   25.423719]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
> [   25.423721]  [<ffffffff810ad49e>] ? put_lock_stats.isra.23+0xe/0x40
> [   25.423722]  [<ffffffff81178794>] kfree+0x1c4/0x210
> [   25.423724]  [<ffffffff8139a9f5>] ? tty_ldisc_reinit+0xa5/0x130
> [   25.423725]  [<ffffffff8139a9f5>] tty_ldisc_reinit+0xa5/0x130
> [   25.423726]  [<ffffffff8139b49f>] tty_ldisc_hangup+0x16f/0x300
> [   25.423728]  [<ffffffff81082e1d>] ? get_parent_ip+0xd/0x50
> [   25.423731]  [<ffffffff8104d736>] ? unpin_current_cpu+0x16/0x70
> [   25.423732]  [<ffffffff81392136>] __tty_hangup+0x346/0x460
> [   25.423733]  [<ffffffff81392260>] tty_vhangup+0x10/0x20
> [   25.423735]  [<ffffffff8139d6e1>] pty_close+0x131/0x180
> [   25.423736]  [<ffffffff813936ad>] tty_release+0x11d/0x5f0
> [   25.423737]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
> [   25.423747]  [<ffffffff81009d84>] ? native_sched_clock+0x24/0x80
> [   25.423749]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
> [   25.423750]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
> [   25.423752]  [<ffffffff8118eb45>] __fput+0xf5/0x250
> [   25.423762]  [<ffffffff8118ed6e>] ____fput+0xe/0x10
> [   25.423764]  [<ffffffff8106f1cc>] task_work_run+0xac/0xe0
> [   25.423765]  [<ffffffff81002a11>] do_notify_resume+0x91/0xf0
> [   25.423767]  [<ffffffff8165a41c>] int_signal+0x12/0x17
> 
> 
> About the hackish patch I used to silence the constant i915 complaints,
> even though the comments on drm_calc_vbltimestamp_from_scanoutpos() state
> the preempt_disable is there for a reason, I removed it for the RT case. I
> really wanted to confirm if there was anything else after that complaint.

The funny part is, there's a comment there that shows that this was
done even for "PREEMPT_RT". Unfortunately, the call to
"get_scanout_position()" can call functions that use the rt-mutex
"sleeping spin locks" and it breaks there.

I guess we need to ask the authors of the mainline patch exactly why
that preempt_disable() is needed?

-- Steve

> 
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index f92da0a..bd2e8e2 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -631,7 +631,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
>  		/* Disable preemption to make it very likely to
>  		 * succeed in the first iteration even on PREEMPT_RT kernel.
>  		 */
> -		preempt_disable();
> +		preempt_disable_nort();
>  
>  		/* Get system timestamp before query. */
>  		stime = ktime_get();
> @@ -644,7 +644,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
>  		if (!drm_timestamp_monotonic)
>  			mono_time_offset = ktime_get_monotonic_offset();
>  
> -		preempt_enable();
> +		preempt_enable_nort();
>  
>  		/* Return as no-op if scanout query unsupported or failed. */
>  		if (!(vbl_status & DRM_SCANOUTPOS_VALID)) {
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-11 13:26 ` Steven Rostedt
@ 2013-09-11 15:16   ` Peter Hurley
  2013-09-11 15:38     ` Steven Rostedt
  2013-09-11 18:29     ` Mario Kleiner
  0 siblings, 2 replies; 32+ messages in thread
From: Peter Hurley @ 2013-09-11 15:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Luis Claudio R. Goncalves, linux-rt-users, Thomas Gleixner,
	Sebastian Andrzej Siewior, Clark Williams, Mario Kleiner,
	Dave Airlie, LKML

On 09/11/2013 09:26 AM, Steven Rostedt wrote:
> On Wed, 11 Sep 2013 07:28:09 -0300
> "Luis Claudio R. Goncalves" <lclaudio@uudg.org> wrote:
>
>> Hello,
>>
>> I saw two different occurrences of "BUG: sleeping function called from
>> invalid context" happening on 3.10.10-rt7. The first one happened on
>> drm_vblank_get() -> i915_get_vblank_timestamp() and was flooding dmesg
>> after Xorg started. I silenced that with a hackish patch, just for fun, and
>> found a second problem, this time on tty_ldisc_reinit().
>>
>> The logs:
>>
>> [   23.973991] BUG: sleeping function called from invalid context at /home/lclaudio/SANDBOX/mrg-rt-v2-devel/kernel/rtmutex.c:659
>> [   23.973992] in_atomic(): 1, irqs_disabled(): 0, pid: 517, name: Xorg
>> [   23.973993] 2 locks held by Xorg/517:
>> [   23.973993]  #0:
>> [   23.973994]  (
>> [   23.973994] &dev->vbl_lock
>> [   23.973995] ){......}
>> [   23.973995] , at:
>> [   23.974007] [<ffffffffa0024c60>] drm_vblank_get+0x30/0x2b0 [drm]
>> [   23.974008]  #1:
>> [   23.974008]  (
>> [   23.974008] &dev->vblank_time_lock
>> [   23.974008] ){......}
>> [   23.974009] , at:
>> [   23.974013] [<ffffffffa0024ce1>] drm_vblank_get+0xb1/0x2b0 [drm]
>> [   23.974013] Preemption disabled at:
>> [   23.974021] [<ffffffffa008bc95>] i915_get_vblank_timestamp+0x45/0xa0 [i915]
>> [   23.974023] CPU: 3 PID: 517 Comm: Xorg Not tainted 3.10.10-rt7+ #5
>> [   23.974024] Hardware name: Hewlett-Packard p7-1512/2ADA, BIOS 8.15 02/05/2013
>> [   23.974024]  0000000000070008
>> [   23.974025]  ffff88017a08bae0
>> [   23.974025]  ffffffff8164b790
>> [   23.974025]  ffff88017a08baf8
>> [   23.974026]  ffffffff8107e62f
>> [   23.974026]  ffff880179840040
>> [   23.974026]  ffff88017a08bb18
>> [   23.974027]  ffffffff81651ac4
>> [   23.974027]  0000000000000002
>> [   23.974027]  ffff880179840000
>> [   23.974028]  ffff88017a08bb48
>> [   23.974028]  ffffffffa0084e67
>> [   23.974028] Call Trace:
>> [   23.974033]  [<ffffffff8164b790>] dump_stack+0x19/0x1b
>> [   23.974035]  [<ffffffff8107e62f>] __might_sleep+0xff/0x170
>> [   23.974037]  [<ffffffff81651ac4>] rt_spin_lock+0x24/0x60
>> [   23.974043]  [<ffffffffa0084e67>] i915_read32+0x27/0x170 [i915]
>> [   23.974048]  [<ffffffffa008a591>] i915_pipe_enabled+0x31/0x40 [i915]
>> [   23.974054]  [<ffffffffa008a6be>] i915_get_crtc_scanoutpos+0x3e/0x1b0 [i915]
>> [   23.974056]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
>> [   23.974062]  [<ffffffffa00245d4>] drm_calc_vbltimestamp_from_scanoutpos+0xf4/0x430 [drm]
>> [   23.974068]  [<ffffffffa008bc95>] i915_get_vblank_timestamp+0x45/0xa0 [i915]
>> [   23.974073]  [<ffffffffa0024998>] drm_get_last_vbltimestamp+0x48/0x70 [drm]
>> [   23.974078]  [<ffffffffa0024db5>] drm_vblank_get+0x185/0x2b0 [drm]
>> [   23.974079]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
>> [   23.974085]  [<ffffffffa0025d03>] drm_wait_vblank+0x83/0x5d0 [drm]
>> [   23.974086]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
>> [   23.974088]  [<ffffffff810af143>] ? __lock_acquire.isra.31+0x273/0xa70
>> [   23.974093]  [<ffffffffa00212a2>] drm_ioctl+0x552/0x6a0 [drm]
>> [   23.974096]  [<ffffffff8128d886>] ? avc_has_perm_flags+0x126/0x1c0
>> [   23.974098]  [<ffffffff8128d788>] ? avc_has_perm_flags+0x28/0x1c0
>> [   23.974099]  [<ffffffff810ad49e>] ? put_lock_stats.isra.23+0xe/0x40
>> [   23.974101]  [<ffffffff811a0095>] do_vfs_ioctl+0x325/0x5b0
>> [   23.974103]  [<ffffffff8128fc3e>] ? file_has_perm+0x8e/0xa0
>> [   23.974105]  [<ffffffff811a03a1>] SyS_ioctl+0x81/0xa0
>> [   23.974107]  [<ffffffff8165a342>] tracesys+0xdd/0xe2
>>
>>
>> and
>>
>>
>> [   25.423675] BUG: sleeping function called from invalid context at /home/lclaudio/SANDBOX/mrg-rt-v2-devel/kernel/rtmutex.c:659
>> [   25.423676] in_atomic(): 1, irqs_disabled(): 1, pid: 188, name: plymouthd
>> [   25.423676] 3 locks held by plymouthd/188:
>> [   25.423682]  #0:  (&tty->legacy_mutex){......}, at: [<ffffffff816528d0>] tty_lock_nested+0x40/0x90
>> [   25.423686]  #1:  (&tty->ldisc_mutex){......}, at: [<ffffffff8139b482>] tty_ldisc_hangup+0x152/0x300
>> [   25.423688]  #2:  (tty_ldisc_lock){......}, at: [<ffffffff8139a9c2>] tty_ldisc_reinit+0x72/0x130
>> [   25.423689] Preemption disabled at:[<ffffffff8139a9c2>] tty_ldisc_reinit+0x72/0x130
>> [   25.423691] CPU: 2 PID: 188 Comm: plymouthd Not tainted 3.10.10-rt7+ #6
>> [   25.423692] Hardware name: Hewlett-Packard p7-1512/2ADA, BIOS 8.15 02/05/2013
>> [   25.423694]  005ff00000000000 ffff8801788ada68 ffffffff8164b790 ffff8801788ada80
>> [   25.423695]  ffffffff8107e62f ffff8801991ce1a0 ffff8801788adaa0 ffffffff81651ac4
>> [   25.423696]  0000000000000000 ffffea0005e26680 ffff8801788adaf8 ffffffff81130984
>> [   25.423696] Call Trace:
>> [   25.423700]  [<ffffffff8164b790>] dump_stack+0x19/0x1b
>> [   25.423702]  [<ffffffff8107e62f>] __might_sleep+0xff/0x170
>> [   25.423704]  [<ffffffff81651ac4>] rt_spin_lock+0x24/0x60
>> [   25.423707]  [<ffffffff81130984>] free_hot_cold_page+0xb4/0x3c0
>> [   25.423710]  [<ffffffff81178209>] ?unfreeze_partials.isra.42+0x229/0x2b0
>> [   25.423711]  [<ffffffff81130dc7>] __free_pages+0x47/0x70
>> [   25.423713]  [<ffffffff81130fb2>] __free_memcg_kmem_pages+0x22/0x50
>> [   25.423714]  [<ffffffff81177528>] __free_slab+0xe8/0x1e0
>> [   25.423716]  [<ffffffff81177654>] free_delayed+0x34/0x50
>> [   25.423717]  [<ffffffff81649633>] __slab_free+0x273/0x36b
>> [   25.423719]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
>> [   25.423721]  [<ffffffff810ad49e>] ? put_lock_stats.isra.23+0xe/0x40
>> [   25.423722]  [<ffffffff81178794>] kfree+0x1c4/0x210
>> [   25.423724]  [<ffffffff8139a9f5>] ? tty_ldisc_reinit+0xa5/0x130
>> [   25.423725]  [<ffffffff8139a9f5>] tty_ldisc_reinit+0xa5/0x130
>> [   25.423726]  [<ffffffff8139b49f>] tty_ldisc_hangup+0x16f/0x300
>> [   25.423728]  [<ffffffff81082e1d>] ? get_parent_ip+0xd/0x50
>> [   25.423731]  [<ffffffff8104d736>] ? unpin_current_cpu+0x16/0x70
>> [   25.423732]  [<ffffffff81392136>] __tty_hangup+0x346/0x460
>> [   25.423733]  [<ffffffff81392260>] tty_vhangup+0x10/0x20
>> [   25.423735]  [<ffffffff8139d6e1>] pty_close+0x131/0x180
>> [   25.423736]  [<ffffffff813936ad>] tty_release+0x11d/0x5f0
>> [   25.423737]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
>> [   25.423747]  [<ffffffff81009d84>] ? native_sched_clock+0x24/0x80
>> [   25.423749]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
>> [   25.423750]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
>> [   25.423752]  [<ffffffff8118eb45>] __fput+0xf5/0x250
>> [   25.423762]  [<ffffffff8118ed6e>] ____fput+0xe/0x10
>> [   25.423764]  [<ffffffff8106f1cc>] task_work_run+0xac/0xe0
>> [   25.423765]  [<ffffffff81002a11>] do_notify_resume+0x91/0xf0
>> [   25.423767]  [<ffffffff8165a41c>] int_signal+0x12/0x17
>>
>>
>> About the hackish patch I used to silence the constant i915 complaints,
>> even though the comments on drm_calc_vbltimestamp_from_scanoutpos() state
>> the preempt_disable is there for a reason, I removed it for the RT case. I
>> really wanted to confirm if there was anything else after that complaint.
>
> The funny part is, there's a comment there that shows that this was
> done even for "PREEMPT_RT". Unfortunately, the call to
> "get_scanout_position()" can call functions that use the rt-mutex
> "sleeping spin locks" and it breaks there.
>
> I guess we need to ask the authors of the mainline patch exactly why
> that preempt_disable() is needed?

The drm core associates a timestamp with each vertical blank frame #.
Drm drivers can optionally support a 'high resolution' hw timestamp.
The vblank frame #/timestamp tuple is user-space visible.

The i915 drm driver supports a hw timestamp via this drm helper function
which computes the timestamp from the crtc scan position (based on the
pixel clock).

For mainline, the preempt_disable/_enable() isn't actually necessary
because every call tree that leads here already has preemption disabled.

For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?

Regards,
Peter Hurley


>>
>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
>> index f92da0a..bd2e8e2 100644
>> --- a/drivers/gpu/drm/drm_irq.c
>> +++ b/drivers/gpu/drm/drm_irq.c
>> @@ -631,7 +631,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
>>   		/* Disable preemption to make it very likely to
>>   		 * succeed in the first iteration even on PREEMPT_RT kernel.
>>   		 */
>> -		preempt_disable();
>> +		preempt_disable_nort();
>>
>>   		/* Get system timestamp before query. */
>>   		stime = ktime_get();
>> @@ -644,7 +644,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
>>   		if (!drm_timestamp_monotonic)
>>   			mono_time_offset = ktime_get_monotonic_offset();
>>
>> -		preempt_enable();
>> +		preempt_enable_nort();
>>
>>   		/* Return as no-op if scanout query unsupported or failed. */
>>   		if (!(vbl_status & DRM_SCANOUTPOS_VALID)) {
>>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-11 15:16   ` Peter Hurley
@ 2013-09-11 15:38     ` Steven Rostedt
  2013-09-11 19:31       ` Peter Hurley
  2013-09-11 18:29     ` Mario Kleiner
  1 sibling, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2013-09-11 15:38 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Luis Claudio R. Goncalves, linux-rt-users, Thomas Gleixner,
	Sebastian Andrzej Siewior, Clark Williams, Mario Kleiner,
	Dave Airlie, LKML

On Wed, 11 Sep 2013 11:16:43 -0400
Peter Hurley <peter@hurleysoftware.com> wrote:

> > The funny part is, there's a comment there that shows that this was
> > done even for "PREEMPT_RT". Unfortunately, the call to
> > "get_scanout_position()" can call functions that use the rt-mutex
> > "sleeping spin locks" and it breaks there.
> >
> > I guess we need to ask the authors of the mainline patch exactly why
> > that preempt_disable() is needed?
> 
> The drm core associates a timestamp with each vertical blank frame #.
> Drm drivers can optionally support a 'high resolution' hw timestamp.
> The vblank frame #/timestamp tuple is user-space visible.
> 
> The i915 drm driver supports a hw timestamp via this drm helper function
> which computes the timestamp from the crtc scan position (based on the
> pixel clock).
> 
> For mainline, the preempt_disable/_enable() isn't actually necessary
> because every call tree that leads here already has preemption disabled.
> 
> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
> 

No, it should not. Note, any other lock that can be held when it is
held would also need to be raw. And by taking a quick audit of the
code, I see this:

	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);

	/* Reset the chip */

	/* GEN6_GDRST is not in the gt power well, no need to check
	 * for fifo space for the write or forcewake the chip for
	 * the read
	 */
	__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);

	/* Spin waiting for the device to ack the reset request */
	ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & GEN6_GRDOM_FULL) == 0, 500);

That spin is unacceptable in RT with preemption and interrupts disabled.


What's the real issue here? Does it have something to do with this
dump?

[    3.932060] ------------[ cut here ]------------
[    3.936809] WARNING: at /home/rostedt/work/git/linux-rt.git/drivers/gpu/drm/i915/i915_drv.c:322 gen6_gt_force_wake_get+0x4d/0x50 [i915]()
[    3.949229] Hardware name: HP Compaq Pro 6300 SFF
[    3.953988] Modules linked in: i915(+) video i2c_algo_bit drm_kms_helper drm i2c_core
[    3.961943] Pid: 220, comm: udevd Not tainted 3.0.89-test-rt117 #117
[    3.968343] Call Trace:
[    3.970851]  [<ffffffff810671bf>] warn_slowpath_common+0x7f/0xc0
[    3.970852]  [<ffffffff8106721a>] warn_slowpath_null+0x1a/0x20
[    3.970857]  [<ffffffffa006569d>] gen6_gt_force_wake_get+0x4d/0x50 [i915]
[    3.970865]  [<ffffffffa00962ab>] ivybridge_init_clock_gating+0xcb/0x2f0 [i915]
[    3.970870]  [<ffffffffa0090209>] ? intel_crtc_disable+0x29/0x60 [i915]
[    3.970877]  [<ffffffffa009f741>] intel_modeset_init+0x751/0x10e0 [i915]
[    3.970882]  [<ffffffffa006a158>] i915_driver_load+0xfc8/0x17f0 [i915]
[    3.970885]  [<ffffffff8137909e>] ? device_register+0x1e/0x30
[    3.970892]  [<ffffffffa001f596>] ? drm_sysfs_device_add+0x86/0xb0 [drm]
[    3.970896]  [<ffffffffa001b8b3>] ? drm_get_minor+0x233/0x300 [drm]
[    3.970900]  [<ffffffffa001dd49>] drm_get_pci_dev+0x189/0x2a0 [drm]
[    3.970902]  [<ffffffff8105e31b>] ? migrate_enable+0x8b/0x1c0
[    3.970910]  [<ffffffffa00c5f27>] i915_pci_probe+0x1b/0x1d [i915]
[    3.970913]  [<ffffffff812c5b6c>] local_pci_probe+0x5c/0xd0
[    3.970915]  [<ffffffff812c73f9>] pci_device_probe+0x109/0x130
[    3.970917]  [<ffffffff8137bc6c>] driver_probe_device+0x9c/0x2b0
[    3.970918]  [<ffffffff8137bf2b>] __driver_attach+0xab/0xb0
[    3.970919]  [<ffffffff8137be80>] ? driver_probe_device+0x2b0/0x2b0
[    3.970920]  [<ffffffff8137be80>] ? driver_probe_device+0x2b0/0x2b0
[    3.970921]  [<ffffffff8137aa34>] bus_for_each_dev+0x64/0xa0
[    3.970923]  [<ffffffff8137b87e>] driver_attach+0x1e/0x20
[    3.970924]  [<ffffffff8137b490>] bus_add_driver+0x1b0/0x290
[    3.970925]  [<ffffffff8137c4a6>] driver_register+0x76/0x140
[    3.970927]  [<ffffffff812c70a2>] __pci_register_driver+0x82/0x100
[    3.970930]  [<ffffffff815b1f6d>] ? notifier_call_chain+0x4d/0x70
[    3.970934]  [<ffffffffa001df7a>] drm_pci_init+0x11a/0x130 [drm]
[    3.970935]  [<ffffffffa00f4000>] ? 0xffffffffa00f3fff
[    3.970940]  [<ffffffffa00f408b>] i915_init+0x8b/0x8d [i915]
[    3.970943]  [<ffffffff81002040>] do_one_initcall+0x40/0x180
[    3.970946]  [<ffffffff810aa4ae>] sys_init_module+0xbe/0x230
[    3.970948]  [<ffffffff815b5a82>] system_call_fastpath+0x16/0x1b
[    3.970949] ---[ end trace 0000000000000002 ]---
[    4.164458] ------------[ cut here ]------------

-- Steve

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-11 15:16   ` Peter Hurley
  2013-09-11 15:38     ` Steven Rostedt
@ 2013-09-11 18:29     ` Mario Kleiner
  2013-09-11 18:35       ` Steven Rostedt
  1 sibling, 1 reply; 32+ messages in thread
From: Mario Kleiner @ 2013-09-11 18:29 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Steven Rostedt, Luis Claudio R. Goncalves, linux-rt-users,
	Thomas Gleixner, Sebastian Andrzej Siewior, Clark Williams,
	Dave Airlie, LKML, Mario Kleiner

On 11.09.13 17:16, Peter Hurley wrote:
> On 09/11/2013 09:26 AM, Steven Rostedt wrote:
>> On Wed, 11 Sep 2013 07:28:09 -0300
>> "Luis Claudio R. Goncalves" <lclaudio@uudg.org> wrote:
>>
>>> Hello,
>>>
>>> I saw two different occurrences of "BUG: sleeping function called from
>>> invalid context" happening on 3.10.10-rt7. The first one happened on
>>> drm_vblank_get() -> i915_get_vblank_timestamp() and was flooding dmesg
>>> after Xorg started. I silenced that with a hackish patch, just for
>>> fun, and
>>> found a second problem, this time on tty_ldisc_reinit().
>>>
>>> The logs:
>>>
>>> [   23.973991] BUG: sleeping function called from invalid context at
>>> /home/lclaudio/SANDBOX/mrg-rt-v2-devel/kernel/rtmutex.c:659
>>> [   23.973992] in_atomic(): 1, irqs_disabled(): 0, pid: 517, name: Xorg
>>> [   23.973993] 2 locks held by Xorg/517:
>>> [   23.973993]  #0:
>>> [   23.973994]  (
>>> [   23.973994] &dev->vbl_lock
>>> [   23.973995] ){......}
>>> [   23.973995] , at:
>>> [   23.974007] [<ffffffffa0024c60>] drm_vblank_get+0x30/0x2b0 [drm]
>>> [   23.974008]  #1:
>>> [   23.974008]  (
>>> [   23.974008] &dev->vblank_time_lock
>>> [   23.974008] ){......}
>>> [   23.974009] , at:
>>> [   23.974013] [<ffffffffa0024ce1>] drm_vblank_get+0xb1/0x2b0 [drm]
>>> [   23.974013] Preemption disabled at:
>>> [   23.974021] [<ffffffffa008bc95>]
>>> i915_get_vblank_timestamp+0x45/0xa0 [i915]
>>> [   23.974023] CPU: 3 PID: 517 Comm: Xorg Not tainted 3.10.10-rt7+ #5
>>> [   23.974024] Hardware name: Hewlett-Packard p7-1512/2ADA, BIOS 8.15
>>> 02/05/2013
>>> [   23.974024]  0000000000070008
>>> [   23.974025]  ffff88017a08bae0
>>> [   23.974025]  ffffffff8164b790
>>> [   23.974025]  ffff88017a08baf8
>>> [   23.974026]  ffffffff8107e62f
>>> [   23.974026]  ffff880179840040
>>> [   23.974026]  ffff88017a08bb18
>>> [   23.974027]  ffffffff81651ac4
>>> [   23.974027]  0000000000000002
>>> [   23.974027]  ffff880179840000
>>> [   23.974028]  ffff88017a08bb48
>>> [   23.974028]  ffffffffa0084e67
>>> [   23.974028] Call Trace:
>>> [   23.974033]  [<ffffffff8164b790>] dump_stack+0x19/0x1b
>>> [   23.974035]  [<ffffffff8107e62f>] __might_sleep+0xff/0x170
>>> [   23.974037]  [<ffffffff81651ac4>] rt_spin_lock+0x24/0x60
>>> [   23.974043]  [<ffffffffa0084e67>] i915_read32+0x27/0x170 [i915]
>>> [   23.974048]  [<ffffffffa008a591>] i915_pipe_enabled+0x31/0x40 [i915]
>>> [   23.974054]  [<ffffffffa008a6be>]
>>> i915_get_crtc_scanoutpos+0x3e/0x1b0 [i915]
>>> [   23.974056]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
>>> [   23.974062]  [<ffffffffa00245d4>]
>>> drm_calc_vbltimestamp_from_scanoutpos+0xf4/0x430 [drm]
>>> [   23.974068]  [<ffffffffa008bc95>]
>>> i915_get_vblank_timestamp+0x45/0xa0 [i915]
>>> [   23.974073]  [<ffffffffa0024998>]
>>> drm_get_last_vbltimestamp+0x48/0x70 [drm]
>>> [   23.974078]  [<ffffffffa0024db5>] drm_vblank_get+0x185/0x2b0 [drm]
>>> [   23.974079]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
>>> [   23.974085]  [<ffffffffa0025d03>] drm_wait_vblank+0x83/0x5d0 [drm]
>>> [   23.974086]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
>>> [   23.974088]  [<ffffffff810af143>] ?
>>> __lock_acquire.isra.31+0x273/0xa70
>>> [   23.974093]  [<ffffffffa00212a2>] drm_ioctl+0x552/0x6a0 [drm]
>>> [   23.974096]  [<ffffffff8128d886>] ? avc_has_perm_flags+0x126/0x1c0
>>> [   23.974098]  [<ffffffff8128d788>] ? avc_has_perm_flags+0x28/0x1c0
>>> [   23.974099]  [<ffffffff810ad49e>] ? put_lock_stats.isra.23+0xe/0x40
>>> [   23.974101]  [<ffffffff811a0095>] do_vfs_ioctl+0x325/0x5b0
>>> [   23.974103]  [<ffffffff8128fc3e>] ? file_has_perm+0x8e/0xa0
>>> [   23.974105]  [<ffffffff811a03a1>] SyS_ioctl+0x81/0xa0
>>> [   23.974107]  [<ffffffff8165a342>] tracesys+0xdd/0xe2
>>>
>>>
>>> and
>>>
>>>
>>> [   25.423675] BUG: sleeping function called from invalid context at
>>> /home/lclaudio/SANDBOX/mrg-rt-v2-devel/kernel/rtmutex.c:659
>>> [   25.423676] in_atomic(): 1, irqs_disabled(): 1, pid: 188, name:
>>> plymouthd
>>> [   25.423676] 3 locks held by plymouthd/188:
>>> [   25.423682]  #0:  (&tty->legacy_mutex){......}, at:
>>> [<ffffffff816528d0>] tty_lock_nested+0x40/0x90
>>> [   25.423686]  #1:  (&tty->ldisc_mutex){......}, at:
>>> [<ffffffff8139b482>] tty_ldisc_hangup+0x152/0x300
>>> [   25.423688]  #2:  (tty_ldisc_lock){......}, at:
>>> [<ffffffff8139a9c2>] tty_ldisc_reinit+0x72/0x130
>>> [   25.423689] Preemption disabled at:[<ffffffff8139a9c2>]
>>> tty_ldisc_reinit+0x72/0x130
>>> [   25.423691] CPU: 2 PID: 188 Comm: plymouthd Not tainted
>>> 3.10.10-rt7+ #6
>>> [   25.423692] Hardware name: Hewlett-Packard p7-1512/2ADA, BIOS 8.15
>>> 02/05/2013
>>> [   25.423694]  005ff00000000000 ffff8801788ada68 ffffffff8164b790
>>> ffff8801788ada80
>>> [   25.423695]  ffffffff8107e62f ffff8801991ce1a0 ffff8801788adaa0
>>> ffffffff81651ac4
>>> [   25.423696]  0000000000000000 ffffea0005e26680 ffff8801788adaf8
>>> ffffffff81130984
>>> [   25.423696] Call Trace:
>>> [   25.423700]  [<ffffffff8164b790>] dump_stack+0x19/0x1b
>>> [   25.423702]  [<ffffffff8107e62f>] __might_sleep+0xff/0x170
>>> [   25.423704]  [<ffffffff81651ac4>] rt_spin_lock+0x24/0x60
>>> [   25.423707]  [<ffffffff81130984>] free_hot_cold_page+0xb4/0x3c0
>>> [   25.423710]  [<ffffffff81178209>]
>>> ?unfreeze_partials.isra.42+0x229/0x2b0
>>> [   25.423711]  [<ffffffff81130dc7>] __free_pages+0x47/0x70
>>> [   25.423713]  [<ffffffff81130fb2>] __free_memcg_kmem_pages+0x22/0x50
>>> [   25.423714]  [<ffffffff81177528>] __free_slab+0xe8/0x1e0
>>> [   25.423716]  [<ffffffff81177654>] free_delayed+0x34/0x50
>>> [   25.423717]  [<ffffffff81649633>] __slab_free+0x273/0x36b
>>> [   25.423719]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
>>> [   25.423721]  [<ffffffff810ad49e>] ? put_lock_stats.isra.23+0xe/0x40
>>> [   25.423722]  [<ffffffff81178794>] kfree+0x1c4/0x210
>>> [   25.423724]  [<ffffffff8139a9f5>] ? tty_ldisc_reinit+0xa5/0x130
>>> [   25.423725]  [<ffffffff8139a9f5>] tty_ldisc_reinit+0xa5/0x130
>>> [   25.423726]  [<ffffffff8139b49f>] tty_ldisc_hangup+0x16f/0x300
>>> [   25.423728]  [<ffffffff81082e1d>] ? get_parent_ip+0xd/0x50
>>> [   25.423731]  [<ffffffff8104d736>] ? unpin_current_cpu+0x16/0x70
>>> [   25.423732]  [<ffffffff81392136>] __tty_hangup+0x346/0x460
>>> [   25.423733]  [<ffffffff81392260>] tty_vhangup+0x10/0x20
>>> [   25.423735]  [<ffffffff8139d6e1>] pty_close+0x131/0x180
>>> [   25.423736]  [<ffffffff813936ad>] tty_release+0x11d/0x5f0
>>> [   25.423737]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
>>> [   25.423747]  [<ffffffff81009d84>] ? native_sched_clock+0x24/0x80
>>> [   25.423749]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
>>> [   25.423750]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
>>> [   25.423752]  [<ffffffff8118eb45>] __fput+0xf5/0x250
>>> [   25.423762]  [<ffffffff8118ed6e>] ____fput+0xe/0x10
>>> [   25.423764]  [<ffffffff8106f1cc>] task_work_run+0xac/0xe0
>>> [   25.423765]  [<ffffffff81002a11>] do_notify_resume+0x91/0xf0
>>> [   25.423767]  [<ffffffff8165a41c>] int_signal+0x12/0x17
>>>
>>>
>>> About the hackish patch I used to silence the constant i915 complaints,
>>> even though the comments on drm_calc_vbltimestamp_from_scanoutpos()
>>> state
>>> the preempt_disable is there for a reason, I removed it for the RT
>>> case. I
>>> really wanted to confirm if there was anything else after that
>>> complaint.
>>
>> The funny part is, there's a comment there that shows that this was
>> done even for "PREEMPT_RT". Unfortunately, the call to
>> "get_scanout_position()" can call functions that use the rt-mutex
>> "sleeping spin locks" and it breaks there.
>>
>> I guess we need to ask the authors of the mainline patch exactly why
>> that preempt_disable() is needed?
>
> The drm core associates a timestamp with each vertical blank frame #.
> Drm drivers can optionally support a 'high resolution' hw timestamp.
> The vblank frame #/timestamp tuple is user-space visible.
>
> The i915 drm driver supports a hw timestamp via this drm helper function
> which computes the timestamp from the crtc scan position (based on the
> pixel clock).
>
> For mainline, the preempt_disable/_enable() isn't actually necessary
> because every call tree that leads here already has preemption disabled.
>
> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
>

Hi

Just to give some context why i wrote that routine the way it is written:

The preempt_disable() / _enable() there is intended to try to make sure 
that the etime = ktime_get() query and get_crtc_scanoutpos() query 
happen as closely together in time as possible, because time delays 
between those queries directly translate into noise/uncertainty in the 
calculated vblank timestamps.

Those timestamps are now used by userspace apps like desktop compositors 
or media players for smooth animation and audio-video sync, but the main 
reason i implemented those patches was to support scientific 
neuro-science research applications which need very precise ( << 100 
usecs, ideally < 20 usecs ) vblank timestamps (see 
<https://github.com/Psychtoolbox-3/Psychtoolbox-3/blob/master/Psychtoolbox/PsychDocumentation/ECVP2010Poster_VisualTimingPrecision.pdf?raw=true> 
for more background). The code checks how big the uncertainty is and 
retries the query up to three times if the delay is greater than 20 
microseconds by default.

The retry is meant as a last resort measure for isolated unavoidable 
outliers, e.g., interruption by NMI irq's or maybe system management 
interrupts. We can only retry a few times, because the code is also 
called from the vblank irq handler every video refresh cycle 
(drm_handle_vblank()), and we can't report failure to userspace due to a 
client api (OML_sync_control and INTEL_swap_events GLX extension specs 
at www.opengl.org) which didn't anticipate error cases -- essentially 
reporting failure was not an option.

As far as the typical timing sensitive scientific applications go, those 
users who need this vblank timestamping to be the most precise and 
robust are also the same users who will likely need to install a 
lowlatency or even realtime kernel, so for this to be useful, that 
routine should be really well protected against preemption on a rt kernel.

That said, maybe preempt_disable is no longer the optimal choice there 
and there's some better way to achieve good protection against 
interruptions of that bit of code? My knowledge here is a bit rusty, and 
the intel kms drivers and rt stuff has changed quite a bit.

thanks,
-mario


> Regards,
> Peter Hurley
>
>
>>>
>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
>>> index f92da0a..bd2e8e2 100644
>>> --- a/drivers/gpu/drm/drm_irq.c
>>> +++ b/drivers/gpu/drm/drm_irq.c
>>> @@ -631,7 +631,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct
>>> drm_device *dev, int crtc,
>>>           /* Disable preemption to make it very likely to
>>>            * succeed in the first iteration even on PREEMPT_RT kernel.
>>>            */
>>> -        preempt_disable();
>>> +        preempt_disable_nort();
>>>
>>>           /* Get system timestamp before query. */
>>>           stime = ktime_get();
>>> @@ -644,7 +644,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct
>>> drm_device *dev, int crtc,
>>>           if (!drm_timestamp_monotonic)
>>>               mono_time_offset = ktime_get_monotonic_offset();
>>>
>>> -        preempt_enable();
>>> +        preempt_enable_nort();
>>>
>>>           /* Return as no-op if scanout query unsupported or failed. */
>>>           if (!(vbl_status & DRM_SCANOUTPOS_VALID)) {
>>>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-11 18:29     ` Mario Kleiner
@ 2013-09-11 18:35       ` Steven Rostedt
  2013-09-11 19:07         ` Mario Kleiner
  0 siblings, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2013-09-11 18:35 UTC (permalink / raw)
  To: Mario Kleiner
  Cc: Peter Hurley, Luis Claudio R. Goncalves, linux-rt-users,
	Thomas Gleixner, Sebastian Andrzej Siewior, Clark Williams,
	Dave Airlie, LKML

On Wed, 11 Sep 2013 20:29:07 +0200
Mario Kleiner <mario.kleiner@tuebingen.mpg.de> wrote:

> That said, maybe preempt_disable is no longer the optimal choice there 
> and there's some better way to achieve good protection against 
> interruptions of that bit of code? My knowledge here is a bit rusty, and 
> the intel kms drivers and rt stuff has changed quite a bit.

If you set your code to a higher priority than other tasks (and
interrupts) than it wont be preempted there. Unless of course it blocks
on a lock, but even then, priority inheritance will take place and it
still should be rather quick. (unless the holder of the lock is doing
that strange polling).

-- Steve

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-11 18:35       ` Steven Rostedt
@ 2013-09-11 19:07         ` Mario Kleiner
  2013-09-11 19:19           ` Steven Rostedt
  0 siblings, 1 reply; 32+ messages in thread
From: Mario Kleiner @ 2013-09-11 19:07 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Hurley, Luis Claudio R. Goncalves, linux-rt-users,
	Thomas Gleixner, Sebastian Andrzej Siewior, Clark Williams,
	Dave Airlie, LKML, Mario Kleiner



On 11.09.13 20:35, Steven Rostedt wrote:
> On Wed, 11 Sep 2013 20:29:07 +0200
> Mario Kleiner <mario.kleiner@tuebingen.mpg.de> wrote:
>
>> That said, maybe preempt_disable is no longer the optimal choice there
>> and there's some better way to achieve good protection against
>> interruptions of that bit of code? My knowledge here is a bit rusty, and
>> the intel kms drivers and rt stuff has changed quite a bit.
>
> If you set your code to a higher priority than other tasks (and
> interrupts) than it wont be preempted there. Unless of course it blocks
> on a lock, but even then, priority inheritance will take place and it
> still should be rather quick. (unless the holder of the lock is doing
> that strange polling).
>
> -- Steve
>

Right, on a rt kernel. But that creates the problem of not very computer 
savvy users (psychologists and biologists mostly) somehow having to 
choose proper priorities for gpu interrupt threads and for the 
x-server/wayland/..., and not much protection on a non-rt kernel?

preempt_disable() a few years ago looked like a good "plug and play" 
default solution, because the ->get_crtc_scanoutpos() function was 
supposed to have a very low and bounded execution time. At the time we 
wrote the patches for intel/radeon/nouveau, that was the case. Typical 
execution time (= preempt off time) was like 1-4 usecs, even on very low 
end hardware.

Seems that at least intel's kms driver does a lot of things now, which 
can sleep and spin inside that section? I tried to follow the posted 
stack trace, but got lost somewhere around the i915_read32 code and 
power management stuff...

-mario

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-11 19:07         ` Mario Kleiner
@ 2013-09-11 19:19           ` Steven Rostedt
  2013-09-11 20:23             ` Mario Kleiner
  0 siblings, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2013-09-11 19:19 UTC (permalink / raw)
  To: Mario Kleiner
  Cc: Peter Hurley, Luis Claudio R. Goncalves, linux-rt-users,
	Thomas Gleixner, Sebastian Andrzej Siewior, Clark Williams,
	Dave Airlie, LKML

On Wed, 11 Sep 2013 21:07:10 +0200
Mario Kleiner <mario.kleiner@tuebingen.mpg.de> wrote:

> 
> 
> On 11.09.13 20:35, Steven Rostedt wrote:
> > On Wed, 11 Sep 2013 20:29:07 +0200
> > Mario Kleiner <mario.kleiner@tuebingen.mpg.de> wrote:
> >
> >> That said, maybe preempt_disable is no longer the optimal choice there
> >> and there's some better way to achieve good protection against
> >> interruptions of that bit of code? My knowledge here is a bit rusty, and
> >> the intel kms drivers and rt stuff has changed quite a bit.
> >
> > If you set your code to a higher priority than other tasks (and
> > interrupts) than it wont be preempted there. Unless of course it blocks
> > on a lock, but even then, priority inheritance will take place and it
> > still should be rather quick. (unless the holder of the lock is doing
> > that strange polling).
> >
> > -- Steve
> >
> 
> Right, on a rt kernel. But that creates the problem of not very computer 
> savvy users (psychologists and biologists mostly) somehow having to 
> choose proper priorities for gpu interrupt threads and for the 
> x-server/wayland/..., and not much protection on a non-rt kernel?

IIUC, the preempt_disable() is only for -rt, the non-rt case already
disables preemption with the spin_locks called before it.

> 
> preempt_disable() a few years ago looked like a good "plug and play" 
> default solution, because the ->get_crtc_scanoutpos() function was 
> supposed to have a very low and bounded execution time. At the time we 
> wrote the patches for intel/radeon/nouveau, that was the case. Typical 
> execution time (= preempt off time) was like 1-4 usecs, even on very low 
> end hardware.
> 
> Seems that at least intel's kms driver does a lot of things now, which 
> can sleep and spin inside that section? I tried to follow the posted 
> stack trace, but got lost somewhere around the i915_read32 code and 
> power management stuff...

Note, the sleeps only happen on -rt, and not in mainline.

If one is going to use -rt for real-time work, it requires a bit more
knowledge of the system. The problem with RT in general, is that it's
hard, and anyone telling you they have a generic RT system that
requires no computer savvyness can also be selling you a bridge over
the east river.

-- Steve

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-11 15:38     ` Steven Rostedt
@ 2013-09-11 19:31       ` Peter Hurley
  2013-09-17 19:50         ` Peter Hurley
  0 siblings, 1 reply; 32+ messages in thread
From: Peter Hurley @ 2013-09-11 19:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Luis Claudio R. Goncalves, linux-rt-users, Thomas Gleixner,
	Sebastian Andrzej Siewior, Clark Williams, Mario Kleiner,
	Dave Airlie, LKML, dri-devel@lists.freedesktop.org

[+cc dri-devel]

On 09/11/2013 11:38 AM, Steven Rostedt wrote:
> On Wed, 11 Sep 2013 11:16:43 -0400
> Peter Hurley <peter@hurleysoftware.com> wrote:
>
>>> The funny part is, there's a comment there that shows that this was
>>> done even for "PREEMPT_RT". Unfortunately, the call to
>>> "get_scanout_position()" can call functions that use the rt-mutex
>>> "sleeping spin locks" and it breaks there.
>>>
>>> I guess we need to ask the authors of the mainline patch exactly why
>>> that preempt_disable() is needed?
>>
>> The drm core associates a timestamp with each vertical blank frame #.
>> Drm drivers can optionally support a 'high resolution' hw timestamp.
>> The vblank frame #/timestamp tuple is user-space visible.
>>
>> The i915 drm driver supports a hw timestamp via this drm helper function
>> which computes the timestamp from the crtc scan position (based on the
>> pixel clock).
>>
>> For mainline, the preempt_disable/_enable() isn't actually necessary
>> because every call tree that leads here already has preemption disabled.
>>
>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
>>
>
> No, it should not. Note, any other lock that can be held when it is
> held would also need to be raw.

By that, you mean "any other lock" that might be claimed "would also need
to be raw"?  Hopefully not "any other lock" already held?

> And by taking a quick audit of the code, I see this:
>
> 	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
>
> 	/* Reset the chip */
>
> 	/* GEN6_GDRST is not in the gt power well, no need to check
> 	 * for fifo space for the write or forcewake the chip for
> 	 * the read
> 	 */
> 	__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
>
> 	/* Spin waiting for the device to ack the reset request */
> 	ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & GEN6_GRDOM_FULL) == 0, 500);
>
> That spin is unacceptable in RT with preemption and interrupts disabled.

Yep. That would be bad.

AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
in the force-wake set, so raw reads of the registers would
probably be acceptable (thus obviating the need for claiming the uncore.lock).

Except that _ALL_ register access is disabled with the uncore.lock
during a gpu reset. Not sure if that's meant to include crtc registers
or not, or what other synchronization/serialization issues are being
handled/hidden by forcing all register accesses to wait during a gpu reset.

Hopefully an i915 expert can weigh in here?

> What's the real issue here?

That the vblank timestamp needs to be an accurate measurement of a
realtime event. Sleeping/servicing interrupts while reading
the registers necessary to compute the timestamp would be bad too.

(edit: which hopefully Mario Kleiner clarified in his reply)

My point earlier was three-fold:
1. Don't need the preempt_disable() for mainline: all callers are already
    holding interrupt-disabling spinlocks.
2. -RT still needs to prevent scheduling there.
3. the problem is i915-specific.

[update: the radeon driver should also BUG like the i915 driver but probably
should have mmio_idx_lock spinlock as raw]

> Does it have something to do with this dump?

Not sure [but I didn't realize the query was regarding 3.0].

> [    3.932060] ------------[ cut here ]------------
> [    3.936809] WARNING: at /home/rostedt/work/git/linux-rt.git/drivers/gpu/drm/i915/i915_drv.c:322 gen6_gt_force_wake_get+0x4d/0x50 [i915]()
> [    3.949229] Hardware name: HP Compaq Pro 6300 SFF
> [    3.953988] Modules linked in: i915(+) video i2c_algo_bit drm_kms_helper drm i2c_core
> [    3.961943] Pid: 220, comm: udevd Not tainted 3.0.89-test-rt117 #117
> [    3.968343] Call Trace:
> [    3.970851]  [<ffffffff810671bf>] warn_slowpath_common+0x7f/0xc0
> [    3.970852]  [<ffffffff8106721a>] warn_slowpath_null+0x1a/0x20
> [    3.970857]  [<ffffffffa006569d>] gen6_gt_force_wake_get+0x4d/0x50 [i915]
> [    3.970865]  [<ffffffffa00962ab>] ivybridge_init_clock_gating+0xcb/0x2f0 [i915]
> [    3.970870]  [<ffffffffa0090209>] ? intel_crtc_disable+0x29/0x60 [i915]
> [    3.970877]  [<ffffffffa009f741>] intel_modeset_init+0x751/0x10e0 [i915]
> [    3.970882]  [<ffffffffa006a158>] i915_driver_load+0xfc8/0x17f0 [i915]
> [    3.970885]  [<ffffffff8137909e>] ? device_register+0x1e/0x30
> [    3.970892]  [<ffffffffa001f596>] ? drm_sysfs_device_add+0x86/0xb0 [drm]
> [    3.970896]  [<ffffffffa001b8b3>] ? drm_get_minor+0x233/0x300 [drm]
> [    3.970900]  [<ffffffffa001dd49>] drm_get_pci_dev+0x189/0x2a0 [drm]
> [    3.970902]  [<ffffffff8105e31b>] ? migrate_enable+0x8b/0x1c0
> [    3.970910]  [<ffffffffa00c5f27>] i915_pci_probe+0x1b/0x1d [i915]
> [    3.970913]  [<ffffffff812c5b6c>] local_pci_probe+0x5c/0xd0
> [    3.970915]  [<ffffffff812c73f9>] pci_device_probe+0x109/0x130
> [    3.970917]  [<ffffffff8137bc6c>] driver_probe_device+0x9c/0x2b0
> [    3.970918]  [<ffffffff8137bf2b>] __driver_attach+0xab/0xb0
> [    3.970919]  [<ffffffff8137be80>] ? driver_probe_device+0x2b0/0x2b0
> [    3.970920]  [<ffffffff8137be80>] ? driver_probe_device+0x2b0/0x2b0
> [    3.970921]  [<ffffffff8137aa34>] bus_for_each_dev+0x64/0xa0
> [    3.970923]  [<ffffffff8137b87e>] driver_attach+0x1e/0x20
> [    3.970924]  [<ffffffff8137b490>] bus_add_driver+0x1b0/0x290
> [    3.970925]  [<ffffffff8137c4a6>] driver_register+0x76/0x140
> [    3.970927]  [<ffffffff812c70a2>] __pci_register_driver+0x82/0x100
> [    3.970930]  [<ffffffff815b1f6d>] ? notifier_call_chain+0x4d/0x70
> [    3.970934]  [<ffffffffa001df7a>] drm_pci_init+0x11a/0x130 [drm]
> [    3.970935]  [<ffffffffa00f4000>] ? 0xffffffffa00f3fff
> [    3.970940]  [<ffffffffa00f408b>] i915_init+0x8b/0x8d [i915]
> [    3.970943]  [<ffffffff81002040>] do_one_initcall+0x40/0x180
> [    3.970946]  [<ffffffff810aa4ae>] sys_init_module+0xbe/0x230
> [    3.970948]  [<ffffffff815b5a82>] system_call_fastpath+0x16/0x1b
> [    3.970949] ---[ end trace 0000000000000002 ]---
> [    4.164458] ------------[ cut here ]------------
>
> -- Steve
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-11 19:19           ` Steven Rostedt
@ 2013-09-11 20:23             ` Mario Kleiner
  0 siblings, 0 replies; 32+ messages in thread
From: Mario Kleiner @ 2013-09-11 20:23 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Hurley, Luis Claudio R. Goncalves, linux-rt-users,
	Thomas Gleixner, Sebastian Andrzej Siewior, Clark Williams,
	Dave Airlie, LKML, Mario Kleiner



On 11.09.13 21:19, Steven Rostedt wrote:
> On Wed, 11 Sep 2013 21:07:10 +0200
> Mario Kleiner <mario.kleiner@tuebingen.mpg.de> wrote:
>
>>
>>
>> On 11.09.13 20:35, Steven Rostedt wrote:
>>> On Wed, 11 Sep 2013 20:29:07 +0200
>>> Mario Kleiner <mario.kleiner@tuebingen.mpg.de> wrote:
>>>
>>>> That said, maybe preempt_disable is no longer the optimal choice there
>>>> and there's some better way to achieve good protection against
>>>> interruptions of that bit of code? My knowledge here is a bit rusty, and
>>>> the intel kms drivers and rt stuff has changed quite a bit.
>>>
>>> If you set your code to a higher priority than other tasks (and
>>> interrupts) than it wont be preempted there. Unless of course it blocks
>>> on a lock, but even then, priority inheritance will take place and it
>>> still should be rather quick. (unless the holder of the lock is doing
>>> that strange polling).
>>>
>>> -- Steve
>>>
>>
>> Right, on a rt kernel. But that creates the problem of not very computer
>> savvy users (psychologists and biologists mostly) somehow having to
>> choose proper priorities for gpu interrupt threads and for the
>> x-server/wayland/..., and not much protection on a non-rt kernel?
>
> IIUC, the preempt_disable() is only for -rt, the non-rt case already
> disables preemption with the spin_locks called before it.
>

Oh, right! should have thought about that. I'm quite sleepy, so my brain 
is not working very well atm.

>>
>> preempt_disable() a few years ago looked like a good "plug and play"
>> default solution, because the ->get_crtc_scanoutpos() function was
>> supposed to have a very low and bounded execution time. At the time we
>> wrote the patches for intel/radeon/nouveau, that was the case. Typical
>> execution time (= preempt off time) was like 1-4 usecs, even on very low
>> end hardware.
>>
>> Seems that at least intel's kms driver does a lot of things now, which
>> can sleep and spin inside that section? I tried to follow the posted
>> stack trace, but got lost somewhere around the i915_read32 code and
>> power management stuff...
>
> Note, the sleeps only happen on -rt, and not in mainline.
>
> If one is going to use -rt for real-time work, it requires a bit more
> knowledge of the system. The problem with RT in general, is that it's
> hard, and anyone telling you they have a generic RT system that
> requires no computer savvyness can also be selling you a bridge over
> the east river.
>
> -- Steve

;) - I know the problem, i spend a lot of time telling that to users of 
my software, although they then generally want some sort of bridges 
anyway. I'm maintaining one of the most popular open-source toolkits for 
neuro-science, and in my experience at least the field of neuro-science 
research has the problem that a lot of people there need good real-time 
behaviour and a lot of flexibility in their hardware and software 
setups, but very few have the necessary technical background. Given the 
limited money they can spend, there's also not much commercial interest 
or probably viability in providing good technical consulting. The few 
proprietary hardware solutions i know of are either unaffordable by the 
majority, or are bridges over the east river, or quite often both. My 
main motivation for luring my users to Linux and contributing some 
little bits sometimes is the hope that some problems can be solved in a 
better way at the system level than piling software workarounds on top 
of hardware workarounds on top of expensive equipment.

But back to the topic, I think a better argument for the 
preempt_disable() there instead of changing code execution priority is 
that i wouldn't know how to set a static priority properly either. The 
timestamping code is also called from drm code (drmWaitVblank ioctl()) 
and it isn't called from the actual experiment software, where i would 
at least roughly know what i'm doing, and could adjust priorities 
dynamically, but from the X-Server, or maybe in the future Wayland, on 
behalf of the OpenGL client app. For the timestamping to work properly, 
one only would need a raised priority (higher than most interrupt kernel 
threads, except the one of the kms driver) for those few lines of 
timestamping code. I don't think it would be good to run xorg or wayland 
permanently at a higher priority than most irq threads, given that the 
display server does not only serve rt apps and is not designed as a 
realtime application. One only wants a short protection from preemption 
during timestamping.

Sorry, i think i'm rambling here quite a bit and i didn't want to 
sidetrack the thread, just give some explanation why i think the 
preempt_disable() is (/was?) justified.

-mario

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-11 19:31       ` Peter Hurley
@ 2013-09-17 19:50         ` Peter Hurley
  2013-09-17 20:55           ` Daniel Vetter
  0 siblings, 1 reply; 32+ messages in thread
From: Peter Hurley @ 2013-09-17 19:50 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Steven Rostedt, Luis Claudio R. Goncalves, linux-rt-users,
	Thomas Gleixner, Sebastian Andrzej Siewior, Clark Williams,
	Mario Kleiner, Dave Airlie, LKML, dri-devel@lists.freedesktop.org,
	intel-gfx

On 09/11/2013 03:31 PM, Peter Hurley wrote:
> [+cc dri-devel]
>
> On 09/11/2013 11:38 AM, Steven Rostedt wrote:
>> On Wed, 11 Sep 2013 11:16:43 -0400
>> Peter Hurley <peter@hurleysoftware.com> wrote:
>>
>>>> The funny part is, there's a comment there that shows that this was
>>>> done even for "PREEMPT_RT". Unfortunately, the call to
>>>> "get_scanout_position()" can call functions that use the rt-mutex
>>>> "sleeping spin locks" and it breaks there.
>>>>
>>>> I guess we need to ask the authors of the mainline patch exactly why
>>>> that preempt_disable() is needed?
>>>
>>> The drm core associates a timestamp with each vertical blank frame #.
>>> Drm drivers can optionally support a 'high resolution' hw timestamp.
>>> The vblank frame #/timestamp tuple is user-space visible.
>>>
>>> The i915 drm driver supports a hw timestamp via this drm helper function
>>> which computes the timestamp from the crtc scan position (based on the
>>> pixel clock).
>>>
>>> For mainline, the preempt_disable/_enable() isn't actually necessary
>>> because every call tree that leads here already has preemption disabled.
>>>
>>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
>>>
>>
>> No, it should not. Note, any other lock that can be held when it is
>> held would also need to be raw.
>
> By that, you mean "any other lock" that might be claimed "would also need
> to be raw"?  Hopefully not "any other lock" already held?
>
>> And by taking a quick audit of the code, I see this:
>>
>>     spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
>>
>>     /* Reset the chip */
>>
>>     /* GEN6_GDRST is not in the gt power well, no need to check
>>      * for fifo space for the write or forcewake the chip for
>>      * the read
>>      */
>>     __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
>>
>>     /* Spin waiting for the device to ack the reset request */
>>     ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & GEN6_GRDOM_FULL) == 0, 500);
>>
>> That spin is unacceptable in RT with preemption and interrupts disabled.
>
> Yep. That would be bad.
>
> AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
> in the force-wake set, so raw reads of the registers would
> probably be acceptable (thus obviating the need for claiming the uncore.lock).
>
> Except that _ALL_ register access is disabled with the uncore.lock
> during a gpu reset. Not sure if that's meant to include crtc registers
> or not, or what other synchronization/serialization issues are being
> handled/hidden by forcing all register accesses to wait during a gpu reset.
>
> Hopefully an i915 expert can weigh in here?


Daniel,

Can you shed some light on whether the i915+ crtc registers (specifically
those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter())
read as part of the vblank counter/timestamp handling need to
be prevented during gpu reset?

The implied wait with preemption and interrupts disabled is causing grief
in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.

Regards,
Peter Hurley


>> What's the real issue here?
>
> That the vblank timestamp needs to be an accurate measurement of a
> realtime event. Sleeping/servicing interrupts while reading
> the registers necessary to compute the timestamp would be bad too.
>
> (edit: which hopefully Mario Kleiner clarified in his reply)
>
> My point earlier was three-fold:
> 1. Don't need the preempt_disable() for mainline: all callers are already
>     holding interrupt-disabling spinlocks.
> 2. -RT still needs to prevent scheduling there.
> 3. the problem is i915-specific.
>
> [update: the radeon driver should also BUG like the i915 driver but probably
> should have mmio_idx_lock spinlock as raw]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-17 19:50         ` Peter Hurley
@ 2013-09-17 20:55           ` Daniel Vetter
  2013-09-18 16:52             ` Peter Hurley
  2013-09-20 22:07             ` Mario Kleiner
  0 siblings, 2 replies; 32+ messages in thread
From: Daniel Vetter @ 2013-09-17 20:55 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Steven Rostedt, Luis Claudio R. Goncalves, linux-rt-users,
	Thomas Gleixner, Sebastian Andrzej Siewior, Clark Williams,
	Mario Kleiner, Dave Airlie, LKML, dri-devel@lists.freedesktop.org,
	intel-gfx

On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> On 09/11/2013 03:31 PM, Peter Hurley wrote:
>>
>> [+cc dri-devel]
>>
>> On 09/11/2013 11:38 AM, Steven Rostedt wrote:
>>>
>>> On Wed, 11 Sep 2013 11:16:43 -0400
>>> Peter Hurley <peter@hurleysoftware.com> wrote:
>>>
>>>>> The funny part is, there's a comment there that shows that this was
>>>>> done even for "PREEMPT_RT". Unfortunately, the call to
>>>>> "get_scanout_position()" can call functions that use the rt-mutex
>>>>> "sleeping spin locks" and it breaks there.
>>>>>
>>>>> I guess we need to ask the authors of the mainline patch exactly why
>>>>> that preempt_disable() is needed?
>>>>
>>>>
>>>> The drm core associates a timestamp with each vertical blank frame #.
>>>> Drm drivers can optionally support a 'high resolution' hw timestamp.
>>>> The vblank frame #/timestamp tuple is user-space visible.
>>>>
>>>> The i915 drm driver supports a hw timestamp via this drm helper function
>>>> which computes the timestamp from the crtc scan position (based on the
>>>> pixel clock).
>>>>
>>>> For mainline, the preempt_disable/_enable() isn't actually necessary
>>>> because every call tree that leads here already has preemption disabled.
>>>>
>>>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
>>>>
>>>
>>> No, it should not. Note, any other lock that can be held when it is
>>> held would also need to be raw.
>>
>>
>> By that, you mean "any other lock" that might be claimed "would also need
>> to be raw"?  Hopefully not "any other lock" already held?
>>
>>> And by taking a quick audit of the code, I see this:
>>>
>>>     spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
>>>
>>>     /* Reset the chip */
>>>
>>>     /* GEN6_GDRST is not in the gt power well, no need to check
>>>      * for fifo space for the write or forcewake the chip for
>>>      * the read
>>>      */
>>>     __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
>>>
>>>     /* Spin waiting for the device to ack the reset request */
>>>     ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
>>> GEN6_GRDOM_FULL) == 0, 500);
>>>
>>> That spin is unacceptable in RT with preemption and interrupts disabled.
>>
>>
>> Yep. That would be bad.
>>
>> AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
>> in the force-wake set, so raw reads of the registers would
>> probably be acceptable (thus obviating the need for claiming the
>> uncore.lock).
>>
>> Except that _ALL_ register access is disabled with the uncore.lock
>> during a gpu reset. Not sure if that's meant to include crtc registers
>> or not, or what other synchronization/serialization issues are being
>> handled/hidden by forcing all register accesses to wait during a gpu
>> reset.
>>
>> Hopefully an i915 expert can weigh in here?
>
>
>
> Daniel,
>
> Can you shed some light on whether the i915+ crtc registers (specifically
> those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter())
> read as part of the vblank counter/timestamp handling need to
> be prevented during gpu reset?

The depency here in the locking is a recent addition:

commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jul 19 20:36:51 2013 +0100

    drm/i915: Serialize almost all register access

It's a (slightly) oversized hammer to work around a hardware issue -
we could break it down to register blocks, which can be accessed
concurrently, but that tends to be more fragile. But the chip really
dies if you access (even just reads) the same block concurrently :(

We could try break the spinlock protected section a bit in the reset
handler - register access on a hung gpu tends to be ill-defined
anyway.

> The implied wait with preemption and interrupts disabled is causing grief
> in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.

Oops, the magic code in wait_for which is just there to make the imo
totally misguided kgdb support work papered over the aweful long wait
in atomic context ever since we've added this in

commit b6e45f866465f42b53d803b0c574da0fc508a0e9
Author: Keith Packard <keithp@keithp.com>
Date:   Fri Jan 6 11:34:04 2012 -0800

    drm/i915: Move reset forcewake processing to gen6_do_reset

Reverting this change should be enough (code moved obviously a bit).

Cheers, Daniel

>
> Regards,
> Peter Hurley
>
>
>
>>> What's the real issue here?
>>
>>
>> That the vblank timestamp needs to be an accurate measurement of a
>> realtime event. Sleeping/servicing interrupts while reading
>> the registers necessary to compute the timestamp would be bad too.
>>
>> (edit: which hopefully Mario Kleiner clarified in his reply)
>>
>> My point earlier was three-fold:
>> 1. Don't need the preempt_disable() for mainline: all callers are already
>>     holding interrupt-disabling spinlocks.
>> 2. -RT still needs to prevent scheduling there.
>> 3. the problem is i915-specific.
>>
>> [update: the radeon driver should also BUG like the i915 driver but
>> probably
>> should have mmio_idx_lock spinlock as raw]
>
>
>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-17 20:55           ` Daniel Vetter
@ 2013-09-18 16:52             ` Peter Hurley
  2013-09-18 17:03               ` Daniel Vetter
  2013-09-18 17:03               ` Ville Syrjälä
  2013-09-20 22:07             ` Mario Kleiner
  1 sibling, 2 replies; 32+ messages in thread
From: Peter Hurley @ 2013-09-18 16:52 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: linux-rt-users, Clark Williams, Sebastian Andrzej Siewior, LKML,
	Steven Rostedt, dri-devel@lists.freedesktop.org, Mario Kleiner,
	Dave Airlie, Thomas Gleixner, intel-gfx,
	Luis Claudio R. Goncalves

On 09/17/2013 04:55 PM, Daniel Vetter wrote:
> On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>> On 09/11/2013 03:31 PM, Peter Hurley wrote:
>>>
>>> [+cc dri-devel]
>>>
>>> On 09/11/2013 11:38 AM, Steven Rostedt wrote:
>>>>
>>>> On Wed, 11 Sep 2013 11:16:43 -0400
>>>> Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>
>>>>>> The funny part is, there's a comment there that shows that this was
>>>>>> done even for "PREEMPT_RT". Unfortunately, the call to
>>>>>> "get_scanout_position()" can call functions that use the rt-mutex
>>>>>> "sleeping spin locks" and it breaks there.
>>>>>>
>>>>>> I guess we need to ask the authors of the mainline patch exactly why
>>>>>> that preempt_disable() is needed?
>>>>>
>>>>>
>>>>> The drm core associates a timestamp with each vertical blank frame #.
>>>>> Drm drivers can optionally support a 'high resolution' hw timestamp.
>>>>> The vblank frame #/timestamp tuple is user-space visible.
>>>>>
>>>>> The i915 drm driver supports a hw timestamp via this drm helper function
>>>>> which computes the timestamp from the crtc scan position (based on the
>>>>> pixel clock).
>>>>>
>>>>> For mainline, the preempt_disable/_enable() isn't actually necessary
>>>>> because every call tree that leads here already has preemption disabled.
>>>>>
>>>>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
>>>>>
>>>>
>>>> No, it should not. Note, any other lock that can be held when it is
>>>> held would also need to be raw.
>>>
>>>
>>> By that, you mean "any other lock" that might be claimed "would also need
>>> to be raw"?  Hopefully not "any other lock" already held?
>>>
>>>> And by taking a quick audit of the code, I see this:
>>>>
>>>>      spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
>>>>
>>>>      /* Reset the chip */
>>>>
>>>>      /* GEN6_GDRST is not in the gt power well, no need to check
>>>>       * for fifo space for the write or forcewake the chip for
>>>>       * the read
>>>>       */
>>>>      __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
>>>>
>>>>      /* Spin waiting for the device to ack the reset request */
>>>>      ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
>>>> GEN6_GRDOM_FULL) == 0, 500);
>>>>
>>>> That spin is unacceptable in RT with preemption and interrupts disabled.
>>>
>>>
>>> Yep. That would be bad.
>>>
>>> AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
>>> in the force-wake set, so raw reads of the registers would
>>> probably be acceptable (thus obviating the need for claiming the
>>> uncore.lock).
>>>
>>> Except that _ALL_ register access is disabled with the uncore.lock
>>> during a gpu reset. Not sure if that's meant to include crtc registers
>>> or not, or what other synchronization/serialization issues are being
>>> handled/hidden by forcing all register accesses to wait during a gpu
>>> reset.
>>>
>>> Hopefully an i915 expert can weigh in here?
>>
>>
>>
>> Daniel,
>>
>> Can you shed some light on whether the i915+ crtc registers (specifically
>> those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter())
>> read as part of the vblank counter/timestamp handling need to
>> be prevented during gpu reset?
>
> The depency here in the locking is a recent addition:
>
> commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jul 19 20:36:51 2013 +0100
>
>      drm/i915: Serialize almost all register access
>
> It's a (slightly) oversized hammer to work around a hardware issue -
> we could break it down to register blocks, which can be accessed
> concurrently, but that tends to be more fragile. But the chip really
> dies if you access (even just reads) the same block concurrently :(

Ouch. But thanks for clarifying that.

Ok, so register access needs to be serialized. And a separate but
related concern is that gen6+ resets also need to hold-off register
access where forcewake is required.


While I was reviewing the registers that require forcewake handling,
I saw this:

from i915_reg.h:
#define _DPLL_A	(dev_priv->info->display_mmio_offset + 0x6014)
#define _DPLL_B	(dev_priv->info->display_mmio_offset + 0x6018)

from i915_drv.c:
static const struct intel_device_info intel_valleyview_m_info = {
	GEN7_FEATURES,
	.is_mobile = 1,
	.num_pipes = 2,
	.is_valleyview = 1,
	.display_mmio_offset = VLV_DISPLAY_BASE,     <<<-------
	.has_llc = 0, /* legal, last one wins */
};

from intel_uncore.c:
#define NEEDS_FORCE_WAKE(dev_priv, reg) \
	((HAS_FORCE_WAKE((dev_priv)->dev)) && \
	 ((reg) < 0x40000) &&            \
	 ((reg) != FORCEWAKE))

Is this is a mistake or do the valleyview PLLs not require the
same forcewake handling as the other intel gpus?

Regards,
Peter Hurley


> We could try break the spinlock protected section a bit in the reset
> handler - register access on a hung gpu tends to be ill-defined
> anyway.
>
>> The implied wait with preemption and interrupts disabled is causing grief
>> in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
>
> Oops, the magic code in wait_for which is just there to make the imo
> totally misguided kgdb support work papered over the aweful long wait
> in atomic context ever since we've added this in
>
> commit b6e45f866465f42b53d803b0c574da0fc508a0e9
> Author: Keith Packard <keithp@keithp.com>
> Date:   Fri Jan 6 11:34:04 2012 -0800
>
>      drm/i915: Move reset forcewake processing to gen6_do_reset
>
> Reverting this change should be enough (code moved obviously a bit).
>
> Cheers, Daniel
>
>>
>> Regards,
>> Peter Hurley
>>
>>
>>
>>>> What's the real issue here?
>>>
>>>
>>> That the vblank timestamp needs to be an accurate measurement of a
>>> realtime event. Sleeping/servicing interrupts while reading
>>> the registers necessary to compute the timestamp would be bad too.
>>>
>>> (edit: which hopefully Mario Kleiner clarified in his reply)
>>>
>>> My point earlier was three-fold:
>>> 1. Don't need the preempt_disable() for mainline: all callers are already
>>>      holding interrupt-disabling spinlocks.
>>> 2. -RT still needs to prevent scheduling there.
>>> 3. the problem is i915-specific.
>>>
>>> [update: the radeon driver should also BUG like the i915 driver but
>>> probably
>>> should have mmio_idx_lock spinlock as raw]
>>
>>
>>
>
>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-18 16:52             ` Peter Hurley
@ 2013-09-18 17:03               ` Daniel Vetter
  2013-09-18 17:03               ` Ville Syrjälä
  1 sibling, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2013-09-18 17:03 UTC (permalink / raw)
  To: Peter Hurley
  Cc: linux-rt-users, Clark Williams, Sebastian Andrzej Siewior, LKML,
	Steven Rostedt, dri-devel@lists.freedesktop.org, Mario Kleiner,
	Dave Airlie, Thomas Gleixner, intel-gfx,
	Luis Claudio R. Goncalves

On Wed, Sep 18, 2013 at 6:52 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> Ouch. But thanks for clarifying that.
>
> Ok, so register access needs to be serialized. And a separate but
> related concern is that gen6+ resets also need to hold-off register
> access where forcewake is required.
>
>
> While I was reviewing the registers that require forcewake handling,
> I saw this:
>
> from i915_reg.h:
> #define _DPLL_A (dev_priv->info->display_mmio_offset + 0x6014)
> #define _DPLL_B (dev_priv->info->display_mmio_offset + 0x6018)
>
> from i915_drv.c:
> static const struct intel_device_info intel_valleyview_m_info = {
>         GEN7_FEATURES,
>         .is_mobile = 1,
>         .num_pipes = 2,
>         .is_valleyview = 1,
>         .display_mmio_offset = VLV_DISPLAY_BASE,     <<<-------
>         .has_llc = 0, /* legal, last one wins */
> };
>
> from intel_uncore.c:
> #define NEEDS_FORCE_WAKE(dev_priv, reg) \
>         ((HAS_FORCE_WAKE((dev_priv)->dev)) && \
>          ((reg) < 0x40000) &&            \
>          ((reg) != FORCEWAKE))
>
> Is this is a mistake or do the valleyview PLLs not require the
> same forcewake handling as the other intel gpus?

The DPLL offset from the macro at 0x6000 is from older platforms which
lacked forcewake and where the display block started already on
0x6000.

On recent big core platforms we have the north display block at
0x40000 (i.e. forcewake is only used for the rendering side of
things). For those platforms the DPLL macro is called PCH_DPLL (and
it's in the south display range starting at 0xc0000. VLV itself
inherited the old display register blocks (mostly) but moved them all
by the vlv display base offset.

We have quite a bit of fun with hw engineers moving display blocks around ;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-18 16:52             ` Peter Hurley
  2013-09-18 17:03               ` Daniel Vetter
@ 2013-09-18 17:03               ` Ville Syrjälä
  1 sibling, 0 replies; 32+ messages in thread
From: Ville Syrjälä @ 2013-09-18 17:03 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Daniel Vetter, linux-rt-users, Clark Williams,
	Sebastian Andrzej Siewior, LKML, dri-devel@lists.freedesktop.org,
	Steven Rostedt, Mario Kleiner, Dave Airlie, Thomas Gleixner,
	intel-gfx, Luis Claudio R. Goncalves

On Wed, Sep 18, 2013 at 12:52:07PM -0400, Peter Hurley wrote:
> On 09/17/2013 04:55 PM, Daniel Vetter wrote:
> > On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> >> On 09/11/2013 03:31 PM, Peter Hurley wrote:
> >>>
> >>> [+cc dri-devel]
> >>>
> >>> On 09/11/2013 11:38 AM, Steven Rostedt wrote:
> >>>>
> >>>> On Wed, 11 Sep 2013 11:16:43 -0400
> >>>> Peter Hurley <peter@hurleysoftware.com> wrote:
> >>>>
> >>>>>> The funny part is, there's a comment there that shows that this was
> >>>>>> done even for "PREEMPT_RT". Unfortunately, the call to
> >>>>>> "get_scanout_position()" can call functions that use the rt-mutex
> >>>>>> "sleeping spin locks" and it breaks there.
> >>>>>>
> >>>>>> I guess we need to ask the authors of the mainline patch exactly why
> >>>>>> that preempt_disable() is needed?
> >>>>>
> >>>>>
> >>>>> The drm core associates a timestamp with each vertical blank frame #.
> >>>>> Drm drivers can optionally support a 'high resolution' hw timestamp.
> >>>>> The vblank frame #/timestamp tuple is user-space visible.
> >>>>>
> >>>>> The i915 drm driver supports a hw timestamp via this drm helper function
> >>>>> which computes the timestamp from the crtc scan position (based on the
> >>>>> pixel clock).
> >>>>>
> >>>>> For mainline, the preempt_disable/_enable() isn't actually necessary
> >>>>> because every call tree that leads here already has preemption disabled.
> >>>>>
> >>>>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
> >>>>>
> >>>>
> >>>> No, it should not. Note, any other lock that can be held when it is
> >>>> held would also need to be raw.
> >>>
> >>>
> >>> By that, you mean "any other lock" that might be claimed "would also need
> >>> to be raw"?  Hopefully not "any other lock" already held?
> >>>
> >>>> And by taking a quick audit of the code, I see this:
> >>>>
> >>>>      spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> >>>>
> >>>>      /* Reset the chip */
> >>>>
> >>>>      /* GEN6_GDRST is not in the gt power well, no need to check
> >>>>       * for fifo space for the write or forcewake the chip for
> >>>>       * the read
> >>>>       */
> >>>>      __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
> >>>>
> >>>>      /* Spin waiting for the device to ack the reset request */
> >>>>      ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
> >>>> GEN6_GRDOM_FULL) == 0, 500);
> >>>>
> >>>> That spin is unacceptable in RT with preemption and interrupts disabled.
> >>>
> >>>
> >>> Yep. That would be bad.
> >>>
> >>> AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
> >>> in the force-wake set, so raw reads of the registers would
> >>> probably be acceptable (thus obviating the need for claiming the
> >>> uncore.lock).
> >>>
> >>> Except that _ALL_ register access is disabled with the uncore.lock
> >>> during a gpu reset. Not sure if that's meant to include crtc registers
> >>> or not, or what other synchronization/serialization issues are being
> >>> handled/hidden by forcing all register accesses to wait during a gpu
> >>> reset.
> >>>
> >>> Hopefully an i915 expert can weigh in here?
> >>
> >>
> >>
> >> Daniel,
> >>
> >> Can you shed some light on whether the i915+ crtc registers (specifically
> >> those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter())
> >> read as part of the vblank counter/timestamp handling need to
> >> be prevented during gpu reset?
> >
> > The depency here in the locking is a recent addition:
> >
> > commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date:   Fri Jul 19 20:36:51 2013 +0100
> >
> >      drm/i915: Serialize almost all register access
> >
> > It's a (slightly) oversized hammer to work around a hardware issue -
> > we could break it down to register blocks, which can be accessed
> > concurrently, but that tends to be more fragile. But the chip really
> > dies if you access (even just reads) the same block concurrently :(
> 
> Ouch. But thanks for clarifying that.
> 
> Ok, so register access needs to be serialized. And a separate but
> related concern is that gen6+ resets also need to hold-off register
> access where forcewake is required.
> 
> 
> While I was reviewing the registers that require forcewake handling,
> I saw this:
> 
> from i915_reg.h:
> #define _DPLL_A	(dev_priv->info->display_mmio_offset + 0x6014)
> #define _DPLL_B	(dev_priv->info->display_mmio_offset + 0x6018)
> 
> from i915_drv.c:
> static const struct intel_device_info intel_valleyview_m_info = {
> 	GEN7_FEATURES,
> 	.is_mobile = 1,
> 	.num_pipes = 2,
> 	.is_valleyview = 1,
> 	.display_mmio_offset = VLV_DISPLAY_BASE,     <<<-------
> 	.has_llc = 0, /* legal, last one wins */
> };
> 
> from intel_uncore.c:
> #define NEEDS_FORCE_WAKE(dev_priv, reg) \
> 	((HAS_FORCE_WAKE((dev_priv)->dev)) && \
> 	 ((reg) < 0x40000) &&            \
> 	 ((reg) != FORCEWAKE))
> 
> Is this is a mistake or do the valleyview PLLs not require the
> same forcewake handling as the other intel gpus?

Display registers shouldn't need forcewake on any platform. I guess our
NEEDS_FORCE_WAKE() check is a bit too coarse and includes a bunch of
stuff doesn't need to be there. So sort of by accident we do the right
thing on VLV, and the "wrong" thing on other platforms.

-- 
Ville Syrjälä
Intel OTC
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-17 20:55           ` Daniel Vetter
  2013-09-18 16:52             ` Peter Hurley
@ 2013-09-20 22:07             ` Mario Kleiner
  2013-09-23  8:38               ` [Intel-gfx] " Ville Syrjälä
  2013-09-25 14:07               ` Steven Rostedt
  1 sibling, 2 replies; 32+ messages in thread
From: Mario Kleiner @ 2013-09-20 22:07 UTC (permalink / raw)
  To: Daniel Vetter, Peter Hurley
  Cc: Steven Rostedt, Luis Claudio R. Goncalves, linux-rt-users,
	Thomas Gleixner, Sebastian Andrzej Siewior, Clark Williams,
	Dave Airlie, LKML, dri-devel@lists.freedesktop.org, intel-gfx,
	Mario Kleiner

On 09/17/2013 10:55 PM, Daniel Vetter wrote:
> On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>> On 09/11/2013 03:31 PM, Peter Hurley wrote:
>>>
>>> [+cc dri-devel]
>>>
>>> On 09/11/2013 11:38 AM, Steven Rostedt wrote:
>>>>
>>>> On Wed, 11 Sep 2013 11:16:43 -0400
>>>> Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>
>>>>>> The funny part is, there's a comment there that shows that this was
>>>>>> done even for "PREEMPT_RT". Unfortunately, the call to
>>>>>> "get_scanout_position()" can call functions that use the rt-mutex
>>>>>> "sleeping spin locks" and it breaks there.
>>>>>>
>>>>>> I guess we need to ask the authors of the mainline patch exactly why
>>>>>> that preempt_disable() is needed?
>>>>>
>>>>>
>>>>> The drm core associates a timestamp with each vertical blank frame #.
>>>>> Drm drivers can optionally support a 'high resolution' hw timestamp.
>>>>> The vblank frame #/timestamp tuple is user-space visible.
>>>>>
>>>>> The i915 drm driver supports a hw timestamp via this drm helper function
>>>>> which computes the timestamp from the crtc scan position (based on the
>>>>> pixel clock).
>>>>>
>>>>> For mainline, the preempt_disable/_enable() isn't actually necessary
>>>>> because every call tree that leads here already has preemption disabled.
>>>>>
>>>>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
>>>>>
>>>>
>>>> No, it should not. Note, any other lock that can be held when it is
>>>> held would also need to be raw.
>>>
>>>
>>> By that, you mean "any other lock" that might be claimed "would also need
>>> to be raw"?  Hopefully not "any other lock" already held?
>>>
>>>> And by taking a quick audit of the code, I see this:
>>>>
>>>>      spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
>>>>
>>>>      /* Reset the chip */
>>>>
>>>>      /* GEN6_GDRST is not in the gt power well, no need to check
>>>>       * for fifo space for the write or forcewake the chip for
>>>>       * the read
>>>>       */
>>>>      __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
>>>>
>>>>      /* Spin waiting for the device to ack the reset request */
>>>>      ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
>>>> GEN6_GRDOM_FULL) == 0, 500);
>>>>
>>>> That spin is unacceptable in RT with preemption and interrupts disabled.
>>>
>>>
>>> Yep. That would be bad.
>>>
>>> AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
>>> in the force-wake set, so raw reads of the registers would
>>> probably be acceptable (thus obviating the need for claiming the
>>> uncore.lock).
>>>
>>> Except that _ALL_ register access is disabled with the uncore.lock
>>> during a gpu reset. Not sure if that's meant to include crtc registers
>>> or not, or what other synchronization/serialization issues are being
>>> handled/hidden by forcing all register accesses to wait during a gpu
>>> reset.
>>>
>>> Hopefully an i915 expert can weigh in here?
>>
>>
>>
>> Daniel,
>>
>> Can you shed some light on whether the i915+ crtc registers (specifically
>> those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter())
>> read as part of the vblank counter/timestamp handling need to
>> be prevented during gpu reset?
>
> The depency here in the locking is a recent addition:
>
> commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jul 19 20:36:51 2013 +0100
>
>      drm/i915: Serialize almost all register access
>
> It's a (slightly) oversized hammer to work around a hardware issue -
> we could break it down to register blocks, which can be accessed
> concurrently, but that tends to be more fragile. But the chip really
> dies if you access (even just reads) the same block concurrently :(
>
> We could try break the spinlock protected section a bit in the reset
> handler - register access on a hung gpu tends to be ill-defined
> anyway.
>
>> The implied wait with preemption and interrupts disabled is causing grief
>> in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
>
> Oops, the magic code in wait_for which is just there to make the imo
> totally misguided kgdb support work papered over the aweful long wait
> in atomic context ever since we've added this in
>
> commit b6e45f866465f42b53d803b0c574da0fc508a0e9
> Author: Keith Packard <keithp@keithp.com>
> Date:   Fri Jan 6 11:34:04 2012 -0800
>
>      drm/i915: Move reset forcewake processing to gen6_do_reset
>
> Reverting this change should be enough (code moved obviously a bit).
>
> Cheers, Daniel
>
>>
>> Regards,
>> Peter Hurley
>>
>>
>>
>>>> What's the real issue here?
>>>
>>>
>>> That the vblank timestamp needs to be an accurate measurement of a
>>> realtime event. Sleeping/servicing interrupts while reading
>>> the registers necessary to compute the timestamp would be bad too.
>>>
>>> (edit: which hopefully Mario Kleiner clarified in his reply)
>>>
>>> My point earlier was three-fold:
>>> 1. Don't need the preempt_disable() for mainline: all callers are already
>>>      holding interrupt-disabling spinlocks.
>>> 2. -RT still needs to prevent scheduling there.
>>> 3. the problem is i915-specific.
>>>
>>> [update: the radeon driver should also BUG like the i915 driver but
>>> probably
>>> should have mmio_idx_lock spinlock as raw]
>>

Ok, so register access must be serialized, at least within a register 
block, no way around it. Thanks for explaining this. That makes me a bit 
depressed. Is this also true for older hw generations than gen6?

Daniel, could we move access to the display register block to a separate 
lock, so the I915_READ to PIPEDSL in  i915_get_crtc_scanoutpos() can 
avoid the uncore.lock? If the display registers don't need forcewake 
then i assume we could have much shorter lock hold times by avoiding the 
large up to 4 msecs waits in forcewake handling. Probably also much less 
contention in normal operation when no modesetting happens? I think i 
can get rid of the other register reads in that function. Those settings 
are already cached and accessible from the intel_crtc_config->adjusted_mode.

Steven, would it then be acceptable to convert that "faster" lock into a 
raw_spinlock_t or is this unacceptable? If so, the preempt_disable() 
could stay, right?

I really would like to keep those preempt_disable()/enable() calls where 
they are if somehow possible - it makes usage of the timestamping under 
any kernel nicely plug & play.

If that's not possible i think i could work around it for at least my 
application under rt kernel, but it would be somewhat ugly and high 
maintenance, and wouldn't help other apps with similar needs, which 
don't know what knobs to tune - or that any tuning is suddenly needed on 
the latest kernels.

In my case i could set the drm.vblankoffdelay module parameter to zero 
to keep vblank irq's running all the time. That would leave the irq 
kernel thread for the gpu interrupt as the only caller of the scanout 
timestamping. Then one could set the priority of that irq thread to 
something very high and hope for the best. How to locate the proper irq 
thread(s) for any given gpu(s), what priority(s) to choose etc. and how 
to do this portably across various versions of various distros, i don't 
know. Doesn't look very user friendly, compared to the current solution, 
but not impossible.

In any case, maybe the simple retry 3x loop in the DRM for measuring the 
timestamp is not good enough anymore to keep this reliable and accurate. 
Maybe we would need some loop that retries until an accurate measurement 
or a user configurable timeout is reached. Then users like  mine could 
set a very high timeout which rather totally degrades the system and 
causes severe stuttering of graphics updates than silently failing with 
inaccurate timestamps. The default could be something safe for normal 
desktop use.

Can i ask somebody with the hardware to give me an idea how big the 
error there is? At a high drm.debug setting, you should see this logged 
at each vblank interrupt, as seen in this DRM_DEBUG statement:

http://lxr.free-electrons.com/source/drivers/gpu/drm/drm_irq.c#L717

e.g., ....[e 4 us, 1 rep] ... for 1 repetitions with the lowest error 
being 4 usecs.

Especially while rendering a lot, so the uncore.lock gets something to do.

Thanks,
-mario

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-20 22:07             ` Mario Kleiner
@ 2013-09-23  8:38               ` Ville Syrjälä
  2013-09-25  4:32                 ` Mario Kleiner
  2013-09-25 14:07               ` Steven Rostedt
  1 sibling, 1 reply; 32+ messages in thread
From: Ville Syrjälä @ 2013-09-23  8:38 UTC (permalink / raw)
  To: Mario Kleiner
  Cc: Daniel Vetter, Peter Hurley, linux-rt-users, Clark Williams,
	Sebastian Andrzej Siewior, LKML, Steven Rostedt,
	dri-devel@lists.freedesktop.org, Dave Airlie, Thomas Gleixner,
	intel-gfx, Luis Claudio R. Goncalves

On Sat, Sep 21, 2013 at 12:07:36AM +0200, Mario Kleiner wrote:
> On 09/17/2013 10:55 PM, Daniel Vetter wrote:
> > On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> >> On 09/11/2013 03:31 PM, Peter Hurley wrote:
> >>>
> >>> [+cc dri-devel]
> >>>
> >>> On 09/11/2013 11:38 AM, Steven Rostedt wrote:
> >>>>
> >>>> On Wed, 11 Sep 2013 11:16:43 -0400
> >>>> Peter Hurley <peter@hurleysoftware.com> wrote:
> >>>>
> >>>>>> The funny part is, there's a comment there that shows that this was
> >>>>>> done even for "PREEMPT_RT". Unfortunately, the call to
> >>>>>> "get_scanout_position()" can call functions that use the rt-mutex
> >>>>>> "sleeping spin locks" and it breaks there.
> >>>>>>
> >>>>>> I guess we need to ask the authors of the mainline patch exactly why
> >>>>>> that preempt_disable() is needed?
> >>>>>
> >>>>>
> >>>>> The drm core associates a timestamp with each vertical blank frame #.
> >>>>> Drm drivers can optionally support a 'high resolution' hw timestamp.
> >>>>> The vblank frame #/timestamp tuple is user-space visible.
> >>>>>
> >>>>> The i915 drm driver supports a hw timestamp via this drm helper function
> >>>>> which computes the timestamp from the crtc scan position (based on the
> >>>>> pixel clock).
> >>>>>
> >>>>> For mainline, the preempt_disable/_enable() isn't actually necessary
> >>>>> because every call tree that leads here already has preemption disabled.
> >>>>>
> >>>>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
> >>>>>
> >>>>
> >>>> No, it should not. Note, any other lock that can be held when it is
> >>>> held would also need to be raw.
> >>>
> >>>
> >>> By that, you mean "any other lock" that might be claimed "would also need
> >>> to be raw"?  Hopefully not "any other lock" already held?
> >>>
> >>>> And by taking a quick audit of the code, I see this:
> >>>>
> >>>>      spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> >>>>
> >>>>      /* Reset the chip */
> >>>>
> >>>>      /* GEN6_GDRST is not in the gt power well, no need to check
> >>>>       * for fifo space for the write or forcewake the chip for
> >>>>       * the read
> >>>>       */
> >>>>      __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
> >>>>
> >>>>      /* Spin waiting for the device to ack the reset request */
> >>>>      ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
> >>>> GEN6_GRDOM_FULL) == 0, 500);
> >>>>
> >>>> That spin is unacceptable in RT with preemption and interrupts disabled.
> >>>
> >>>
> >>> Yep. That would be bad.
> >>>
> >>> AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
> >>> in the force-wake set, so raw reads of the registers would
> >>> probably be acceptable (thus obviating the need for claiming the
> >>> uncore.lock).
> >>>
> >>> Except that _ALL_ register access is disabled with the uncore.lock
> >>> during a gpu reset. Not sure if that's meant to include crtc registers
> >>> or not, or what other synchronization/serialization issues are being
> >>> handled/hidden by forcing all register accesses to wait during a gpu
> >>> reset.
> >>>
> >>> Hopefully an i915 expert can weigh in here?
> >>
> >>
> >>
> >> Daniel,
> >>
> >> Can you shed some light on whether the i915+ crtc registers (specifically
> >> those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter())
> >> read as part of the vblank counter/timestamp handling need to
> >> be prevented during gpu reset?
> >
> > The depency here in the locking is a recent addition:
> >
> > commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date:   Fri Jul 19 20:36:51 2013 +0100
> >
> >      drm/i915: Serialize almost all register access
> >
> > It's a (slightly) oversized hammer to work around a hardware issue -
> > we could break it down to register blocks, which can be accessed
> > concurrently, but that tends to be more fragile. But the chip really
> > dies if you access (even just reads) the same block concurrently :(
> >
> > We could try break the spinlock protected section a bit in the reset
> > handler - register access on a hung gpu tends to be ill-defined
> > anyway.
> >
> >> The implied wait with preemption and interrupts disabled is causing grief
> >> in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
> >
> > Oops, the magic code in wait_for which is just there to make the imo
> > totally misguided kgdb support work papered over the aweful long wait
> > in atomic context ever since we've added this in
> >
> > commit b6e45f866465f42b53d803b0c574da0fc508a0e9
> > Author: Keith Packard <keithp@keithp.com>
> > Date:   Fri Jan 6 11:34:04 2012 -0800
> >
> >      drm/i915: Move reset forcewake processing to gen6_do_reset
> >
> > Reverting this change should be enough (code moved obviously a bit).
> >
> > Cheers, Daniel
> >
> >>
> >> Regards,
> >> Peter Hurley
> >>
> >>
> >>
> >>>> What's the real issue here?
> >>>
> >>>
> >>> That the vblank timestamp needs to be an accurate measurement of a
> >>> realtime event. Sleeping/servicing interrupts while reading
> >>> the registers necessary to compute the timestamp would be bad too.
> >>>
> >>> (edit: which hopefully Mario Kleiner clarified in his reply)
> >>>
> >>> My point earlier was three-fold:
> >>> 1. Don't need the preempt_disable() for mainline: all callers are already
> >>>      holding interrupt-disabling spinlocks.
> >>> 2. -RT still needs to prevent scheduling there.
> >>> 3. the problem is i915-specific.
> >>>
> >>> [update: the radeon driver should also BUG like the i915 driver but
> >>> probably
> >>> should have mmio_idx_lock spinlock as raw]
> >>
> 
> Ok, so register access must be serialized, at least within a register 
> block, no way around it. Thanks for explaining this. That makes me a bit 
> depressed. Is this also true for older hw generations than gen6?
> 
> Daniel, could we move access to the display register block to a separate 
> lock, so the I915_READ to PIPEDSL in  i915_get_crtc_scanoutpos() can 
> avoid the uncore.lock? If the display registers don't need forcewake 
> then i assume we could have much shorter lock hold times by avoiding the 
> large up to 4 msecs waits in forcewake handling. Probably also much less 
> contention in normal operation when no modesetting happens? I think i 
> can get rid of the other register reads in that function. Those settings 
> are already cached and accessible from the intel_crtc_config->adjusted_mode.

I actually have a patch to kill most of the registers reads in
get_scanout_position on i915 somewhere. Let me dig that out and send it
to the list...

<snip>
> In any case, maybe the simple retry 3x loop in the DRM for measuring the 
> timestamp is not good enough anymore to keep this reliable and accurate. 
> Maybe we would need some loop that retries until an accurate measurement 
> or a user configurable timeout is reached. Then users like  mine could 
> set a very high timeout which rather totally degrades the system and 
> causes severe stuttering of graphics updates than silently failing with 
> inaccurate timestamps. The default could be something safe for normal 
> desktop use.

I never really understood the need for the preempt disabled retry loop in
drm core. What I would do is just something like this:

get_scanoutpos_and_timestamp()
{
 local_irq_disable();
 get_scanoutpos();
 get_timestamp();
 local_irq_enable();
}

If that has a lot of error, then it seems to me that the system is just
crap and we shouldn't care. Or if you're really worried about SMIs and
such then you could still do a retry loop.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-23  8:38               ` [Intel-gfx] " Ville Syrjälä
@ 2013-09-25  4:32                 ` Mario Kleiner
  2013-09-25  7:49                   ` [Intel-gfx] " Ville Syrjälä
                                     ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Mario Kleiner @ 2013-09-25  4:32 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Peter Hurley, Clark Williams, Daniel Vetter,
	Sebastian Andrzej Siewior, LKML, Steven Rostedt,
	dri-devel@lists.freedesktop.org, Mario Kleiner, Dave Airlie,
	Thomas Gleixner, Luis Claudio R. Goncalves, intel-gfx,
	linux-rt-users

On 23.09.13 10:38, Ville Syrjälä wrote:
> On Sat, Sep 21, 2013 at 12:07:36AM +0200, Mario Kleiner wrote:
>> On 09/17/2013 10:55 PM, Daniel Vetter wrote:
>>> On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>> On 09/11/2013 03:31 PM, Peter Hurley wrote:
>>>>>
>>>>> [+cc dri-devel]
>>>>>
>>>>> On 09/11/2013 11:38 AM, Steven Rostedt wrote:
>>>>>>
>>>>>> On Wed, 11 Sep 2013 11:16:43 -0400
>>>>>> Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>>>
>>>>>>>> The funny part is, there's a comment there that shows that this was
>>>>>>>> done even for "PREEMPT_RT". Unfortunately, the call to
>>>>>>>> "get_scanout_position()" can call functions that use the rt-mutex
>>>>>>>> "sleeping spin locks" and it breaks there.
>>>>>>>>
>>>>>>>> I guess we need to ask the authors of the mainline patch exactly why
>>>>>>>> that preempt_disable() is needed?
>>>>>>>
>>>>>>>
>>>>>>> The drm core associates a timestamp with each vertical blank frame #.
>>>>>>> Drm drivers can optionally support a 'high resolution' hw timestamp.
>>>>>>> The vblank frame #/timestamp tuple is user-space visible.
>>>>>>>
>>>>>>> The i915 drm driver supports a hw timestamp via this drm helper function
>>>>>>> which computes the timestamp from the crtc scan position (based on the
>>>>>>> pixel clock).
>>>>>>>
>>>>>>> For mainline, the preempt_disable/_enable() isn't actually necessary
>>>>>>> because every call tree that leads here already has preemption disabled.
>>>>>>>
>>>>>>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
>>>>>>>
>>>>>>
>>>>>> No, it should not. Note, any other lock that can be held when it is
>>>>>> held would also need to be raw.
>>>>>
>>>>>
>>>>> By that, you mean "any other lock" that might be claimed "would also need
>>>>> to be raw"?  Hopefully not "any other lock" already held?
>>>>>
>>>>>> And by taking a quick audit of the code, I see this:
>>>>>>
>>>>>>       spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
>>>>>>
>>>>>>       /* Reset the chip */
>>>>>>
>>>>>>       /* GEN6_GDRST is not in the gt power well, no need to check
>>>>>>        * for fifo space for the write or forcewake the chip for
>>>>>>        * the read
>>>>>>        */
>>>>>>       __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
>>>>>>
>>>>>>       /* Spin waiting for the device to ack the reset request */
>>>>>>       ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
>>>>>> GEN6_GRDOM_FULL) == 0, 500);
>>>>>>
>>>>>> That spin is unacceptable in RT with preemption and interrupts disabled.
>>>>>
>>>>>
>>>>> Yep. That would be bad.
>>>>>
>>>>> AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
>>>>> in the force-wake set, so raw reads of the registers would
>>>>> probably be acceptable (thus obviating the need for claiming the
>>>>> uncore.lock).
>>>>>
>>>>> Except that _ALL_ register access is disabled with the uncore.lock
>>>>> during a gpu reset. Not sure if that's meant to include crtc registers
>>>>> or not, or what other synchronization/serialization issues are being
>>>>> handled/hidden by forcing all register accesses to wait during a gpu
>>>>> reset.
>>>>>
>>>>> Hopefully an i915 expert can weigh in here?
>>>>
>>>>
>>>>
>>>> Daniel,
>>>>
>>>> Can you shed some light on whether the i915+ crtc registers (specifically
>>>> those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter())
>>>> read as part of the vblank counter/timestamp handling need to
>>>> be prevented during gpu reset?
>>>
>>> The depency here in the locking is a recent addition:
>>>
>>> commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a
>>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>>> Date:   Fri Jul 19 20:36:51 2013 +0100
>>>
>>>       drm/i915: Serialize almost all register access
>>>
>>> It's a (slightly) oversized hammer to work around a hardware issue -
>>> we could break it down to register blocks, which can be accessed
>>> concurrently, but that tends to be more fragile. But the chip really
>>> dies if you access (even just reads) the same block concurrently :(
>>>
>>> We could try break the spinlock protected section a bit in the reset
>>> handler - register access on a hung gpu tends to be ill-defined
>>> anyway.
>>>
>>>> The implied wait with preemption and interrupts disabled is causing grief
>>>> in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
>>>
>>> Oops, the magic code in wait_for which is just there to make the imo
>>> totally misguided kgdb support work papered over the aweful long wait
>>> in atomic context ever since we've added this in
>>>
>>> commit b6e45f866465f42b53d803b0c574da0fc508a0e9
>>> Author: Keith Packard <keithp@keithp.com>
>>> Date:   Fri Jan 6 11:34:04 2012 -0800
>>>
>>>       drm/i915: Move reset forcewake processing to gen6_do_reset
>>>
>>> Reverting this change should be enough (code moved obviously a bit).
>>>
>>> Cheers, Daniel
>>>
>>>>
>>>> Regards,
>>>> Peter Hurley
>>>>
>>>>
>>>>
>>>>>> What's the real issue here?
>>>>>
>>>>>
>>>>> That the vblank timestamp needs to be an accurate measurement of a
>>>>> realtime event. Sleeping/servicing interrupts while reading
>>>>> the registers necessary to compute the timestamp would be bad too.
>>>>>
>>>>> (edit: which hopefully Mario Kleiner clarified in his reply)
>>>>>
>>>>> My point earlier was three-fold:
>>>>> 1. Don't need the preempt_disable() for mainline: all callers are already
>>>>>       holding interrupt-disabling spinlocks.
>>>>> 2. -RT still needs to prevent scheduling there.
>>>>> 3. the problem is i915-specific.
>>>>>
>>>>> [update: the radeon driver should also BUG like the i915 driver but
>>>>> probably
>>>>> should have mmio_idx_lock spinlock as raw]
>>>>
>>
>> Ok, so register access must be serialized, at least within a register
>> block, no way around it. Thanks for explaining this. That makes me a bit
>> depressed. Is this also true for older hw generations than gen6?
>>
>> Daniel, could we move access to the display register block to a separate
>> lock, so the I915_READ to PIPEDSL in  i915_get_crtc_scanoutpos() can
>> avoid the uncore.lock? If the display registers don't need forcewake
>> then i assume we could have much shorter lock hold times by avoiding the
>> large up to 4 msecs waits in forcewake handling. Probably also much less
>> contention in normal operation when no modesetting happens? I think i
>> can get rid of the other register reads in that function. Those settings
>> are already cached and accessible from the intel_crtc_config->adjusted_mode.
>
> I actually have a patch to kill most of the registers reads in
> get_scanout_position on i915 somewhere. Let me dig that out and send it
> to the list...
>
> <snip>
>> In any case, maybe the simple retry 3x loop in the DRM for measuring the
>> timestamp is not good enough anymore to keep this reliable and accurate.
>> Maybe we would need some loop that retries until an accurate measurement
>> or a user configurable timeout is reached. Then users like  mine could
>> set a very high timeout which rather totally degrades the system and
>> causes severe stuttering of graphics updates than silently failing with
>> inaccurate timestamps. The default could be something safe for normal
>> desktop use.
>
> I never really understood the need for the preempt disabled retry loop in
> drm core. What I would do is just something like this:
>
> get_scanoutpos_and_timestamp()
> {
>   local_irq_disable();
>   get_scanoutpos();
>   get_timestamp();
>   local_irq_enable();
> }

The preempt_disable serves the same purpose for PREEMPT_RT kernels as 
your local_irq_disable in your example - get rid of any preventable 
interruption, so a retry is unlikely to be ever needed. At least that 
was the idea.

I assume if a spin_lock_irqsave doesn't really disable interrupts on a 
RT kernel with normal spinlocks then local_irq_disable won't really 
disable interrupts either?

>
> If that has a lot of error, then it seems to me that the system is just
> crap and we shouldn't care. Or if you're really worried about SMIs and
> such then you could still do a retry loop.
>

I didn't expect a lot of error with preemption and interrupts disabled, 
essentially only such infrequent things as smi's, that's why the retry 
loop only tries 3x max, once for real, once in case the 1st one got 
spoiled by a smi or such, and once because three times a charm ;-) - In 
practice i didn't ever observe more than 3-4 usecs of delay, well below 
the 20 usecs retry threshold. But i didn't expect 
i915_crtc_get_scanoutpos() to ever take any locks or other stuff that 
could induce long waits.

But given the new situation, your proposal is great! If we push the 
clock readouts into the get_scanoutpos routine, we can make this robust 
without causing grief for the rt people and without the need for a new 
separate lock for display regs in intel-kms.

E.g., for intel-kms:

i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime)
{
   ...
   spin_lock_irqsave(...uncore.lock);
   preempt_disable();
   *stime = ktime_get();
   position = __raw_i915_read32(dev_priv, PIPEDSL(pipe));
   *etime = ktime_get();
   preempt_enable();
   spin_unlock_irqrestore(...uncore.lock)
   ...
}

With your patchset to reduce the amount of register reads needed in that 
function, and given that forcewake handling isn't needed for these 
registers, this should make it robust again and wouldn't need new locks.

Unless ktime_get is also a bad thing to do in a preempt disabled section?

-mario

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-25  4:32                 ` Mario Kleiner
@ 2013-09-25  7:49                   ` Ville Syrjälä
  2013-09-25 14:18                     ` Steven Rostedt
  2013-09-26 16:43                     ` Mario Kleiner
  2013-09-25 13:52                   ` Steven Rostedt
  2013-09-25 14:13                   ` Steven Rostedt
  2 siblings, 2 replies; 32+ messages in thread
From: Ville Syrjälä @ 2013-09-25  7:49 UTC (permalink / raw)
  To: Mario Kleiner
  Cc: Daniel Vetter, Peter Hurley, linux-rt-users, Clark Williams,
	Sebastian Andrzej Siewior, LKML, Steven Rostedt,
	dri-devel@lists.freedesktop.org, Dave Airlie, Thomas Gleixner,
	intel-gfx, Luis Claudio R. Goncalves

On Wed, Sep 25, 2013 at 06:32:10AM +0200, Mario Kleiner wrote:
> On 23.09.13 10:38, Ville Syrjälä wrote:
> > On Sat, Sep 21, 2013 at 12:07:36AM +0200, Mario Kleiner wrote:
> >> On 09/17/2013 10:55 PM, Daniel Vetter wrote:
> >>> On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> >>>> On 09/11/2013 03:31 PM, Peter Hurley wrote:
> >>>>>
> >>>>> [+cc dri-devel]
> >>>>>
> >>>>> On 09/11/2013 11:38 AM, Steven Rostedt wrote:
> >>>>>>
> >>>>>> On Wed, 11 Sep 2013 11:16:43 -0400
> >>>>>> Peter Hurley <peter@hurleysoftware.com> wrote:
> >>>>>>
> >>>>>>>> The funny part is, there's a comment there that shows that this was
> >>>>>>>> done even for "PREEMPT_RT". Unfortunately, the call to
> >>>>>>>> "get_scanout_position()" can call functions that use the rt-mutex
> >>>>>>>> "sleeping spin locks" and it breaks there.
> >>>>>>>>
> >>>>>>>> I guess we need to ask the authors of the mainline patch exactly why
> >>>>>>>> that preempt_disable() is needed?
> >>>>>>>
> >>>>>>>
> >>>>>>> The drm core associates a timestamp with each vertical blank frame #.
> >>>>>>> Drm drivers can optionally support a 'high resolution' hw timestamp.
> >>>>>>> The vblank frame #/timestamp tuple is user-space visible.
> >>>>>>>
> >>>>>>> The i915 drm driver supports a hw timestamp via this drm helper function
> >>>>>>> which computes the timestamp from the crtc scan position (based on the
> >>>>>>> pixel clock).
> >>>>>>>
> >>>>>>> For mainline, the preempt_disable/_enable() isn't actually necessary
> >>>>>>> because every call tree that leads here already has preemption disabled.
> >>>>>>>
> >>>>>>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
> >>>>>>>
> >>>>>>
> >>>>>> No, it should not. Note, any other lock that can be held when it is
> >>>>>> held would also need to be raw.
> >>>>>
> >>>>>
> >>>>> By that, you mean "any other lock" that might be claimed "would also need
> >>>>> to be raw"?  Hopefully not "any other lock" already held?
> >>>>>
> >>>>>> And by taking a quick audit of the code, I see this:
> >>>>>>
> >>>>>>       spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> >>>>>>
> >>>>>>       /* Reset the chip */
> >>>>>>
> >>>>>>       /* GEN6_GDRST is not in the gt power well, no need to check
> >>>>>>        * for fifo space for the write or forcewake the chip for
> >>>>>>        * the read
> >>>>>>        */
> >>>>>>       __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
> >>>>>>
> >>>>>>       /* Spin waiting for the device to ack the reset request */
> >>>>>>       ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
> >>>>>> GEN6_GRDOM_FULL) == 0, 500);
> >>>>>>
> >>>>>> That spin is unacceptable in RT with preemption and interrupts disabled.
> >>>>>
> >>>>>
> >>>>> Yep. That would be bad.
> >>>>>
> >>>>> AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
> >>>>> in the force-wake set, so raw reads of the registers would
> >>>>> probably be acceptable (thus obviating the need for claiming the
> >>>>> uncore.lock).
> >>>>>
> >>>>> Except that _ALL_ register access is disabled with the uncore.lock
> >>>>> during a gpu reset. Not sure if that's meant to include crtc registers
> >>>>> or not, or what other synchronization/serialization issues are being
> >>>>> handled/hidden by forcing all register accesses to wait during a gpu
> >>>>> reset.
> >>>>>
> >>>>> Hopefully an i915 expert can weigh in here?
> >>>>
> >>>>
> >>>>
> >>>> Daniel,
> >>>>
> >>>> Can you shed some light on whether the i915+ crtc registers (specifically
> >>>> those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter())
> >>>> read as part of the vblank counter/timestamp handling need to
> >>>> be prevented during gpu reset?
> >>>
> >>> The depency here in the locking is a recent addition:
> >>>
> >>> commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a
> >>> Author: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Date:   Fri Jul 19 20:36:51 2013 +0100
> >>>
> >>>       drm/i915: Serialize almost all register access
> >>>
> >>> It's a (slightly) oversized hammer to work around a hardware issue -
> >>> we could break it down to register blocks, which can be accessed
> >>> concurrently, but that tends to be more fragile. But the chip really
> >>> dies if you access (even just reads) the same block concurrently :(
> >>>
> >>> We could try break the spinlock protected section a bit in the reset
> >>> handler - register access on a hung gpu tends to be ill-defined
> >>> anyway.
> >>>
> >>>> The implied wait with preemption and interrupts disabled is causing grief
> >>>> in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
> >>>
> >>> Oops, the magic code in wait_for which is just there to make the imo
> >>> totally misguided kgdb support work papered over the aweful long wait
> >>> in atomic context ever since we've added this in
> >>>
> >>> commit b6e45f866465f42b53d803b0c574da0fc508a0e9
> >>> Author: Keith Packard <keithp@keithp.com>
> >>> Date:   Fri Jan 6 11:34:04 2012 -0800
> >>>
> >>>       drm/i915: Move reset forcewake processing to gen6_do_reset
> >>>
> >>> Reverting this change should be enough (code moved obviously a bit).
> >>>
> >>> Cheers, Daniel
> >>>
> >>>>
> >>>> Regards,
> >>>> Peter Hurley
> >>>>
> >>>>
> >>>>
> >>>>>> What's the real issue here?
> >>>>>
> >>>>>
> >>>>> That the vblank timestamp needs to be an accurate measurement of a
> >>>>> realtime event. Sleeping/servicing interrupts while reading
> >>>>> the registers necessary to compute the timestamp would be bad too.
> >>>>>
> >>>>> (edit: which hopefully Mario Kleiner clarified in his reply)
> >>>>>
> >>>>> My point earlier was three-fold:
> >>>>> 1. Don't need the preempt_disable() for mainline: all callers are already
> >>>>>       holding interrupt-disabling spinlocks.
> >>>>> 2. -RT still needs to prevent scheduling there.
> >>>>> 3. the problem is i915-specific.
> >>>>>
> >>>>> [update: the radeon driver should also BUG like the i915 driver but
> >>>>> probably
> >>>>> should have mmio_idx_lock spinlock as raw]
> >>>>
> >>
> >> Ok, so register access must be serialized, at least within a register
> >> block, no way around it. Thanks for explaining this. That makes me a bit
> >> depressed. Is this also true for older hw generations than gen6?
> >>
> >> Daniel, could we move access to the display register block to a separate
> >> lock, so the I915_READ to PIPEDSL in  i915_get_crtc_scanoutpos() can
> >> avoid the uncore.lock? If the display registers don't need forcewake
> >> then i assume we could have much shorter lock hold times by avoiding the
> >> large up to 4 msecs waits in forcewake handling. Probably also much less
> >> contention in normal operation when no modesetting happens? I think i
> >> can get rid of the other register reads in that function. Those settings
> >> are already cached and accessible from the intel_crtc_config->adjusted_mode.
> >
> > I actually have a patch to kill most of the registers reads in
> > get_scanout_position on i915 somewhere. Let me dig that out and send it
> > to the list...
> >
> > <snip>
> >> In any case, maybe the simple retry 3x loop in the DRM for measuring the
> >> timestamp is not good enough anymore to keep this reliable and accurate.
> >> Maybe we would need some loop that retries until an accurate measurement
> >> or a user configurable timeout is reached. Then users like  mine could
> >> set a very high timeout which rather totally degrades the system and
> >> causes severe stuttering of graphics updates than silently failing with
> >> inaccurate timestamps. The default could be something safe for normal
> >> desktop use.
> >
> > I never really understood the need for the preempt disabled retry loop in
> > drm core. What I would do is just something like this:
> >
> > get_scanoutpos_and_timestamp()
> > {
> >   local_irq_disable();
> >   get_scanoutpos();
> >   get_timestamp();
> >   local_irq_enable();
> > }
> 
> The preempt_disable serves the same purpose for PREEMPT_RT kernels as 
> your local_irq_disable in your example - get rid of any preventable 
> interruption, so a retry is unlikely to be ever needed. At least that 
> was the idea.
> 
> I assume if a spin_lock_irqsave doesn't really disable interrupts on a 
> RT kernel with normal spinlocks then local_irq_disable won't really 
> disable interrupts either?
> 
> >
> > If that has a lot of error, then it seems to me that the system is just
> > crap and we shouldn't care. Or if you're really worried about SMIs and
> > such then you could still do a retry loop.
> >
> 
> I didn't expect a lot of error with preemption and interrupts disabled, 
> essentially only such infrequent things as smi's, that's why the retry 
> loop only tries 3x max, once for real, once in case the 1st one got 
> spoiled by a smi or such, and once because three times a charm ;-) - In 
> practice i didn't ever observe more than 3-4 usecs of delay, well below 
> the 20 usecs retry threshold. But i didn't expect 
> i915_crtc_get_scanoutpos() to ever take any locks or other stuff that 
> could induce long waits.
> 
> But given the new situation, your proposal is great! If we push the 
> clock readouts into the get_scanoutpos routine, we can make this robust 
> without causing grief for the rt people and without the need for a new 
> separate lock for display regs in intel-kms.
> 
> E.g., for intel-kms:
> 
> i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime)
> {
>    ...
>    spin_lock_irqsave(...uncore.lock);
>    preempt_disable();
>    *stime = ktime_get();
>    position = __raw_i915_read32(dev_priv, PIPEDSL(pipe));
>    *etime = ktime_get();
>    preempt_enable();
>    spin_unlock_irqrestore(...uncore.lock)
>    ...
> }
> 
> With your patchset to reduce the amount of register reads needed in that 
> function, and given that forcewake handling isn't needed for these 
> registers, this should make it robust again and wouldn't need new locks.
> 
> Unless ktime_get is also a bad thing to do in a preempt disabled section?

The preempt_disable/enable is not needed. The spinlock serves the same
purpose already.

As far as ktime_get(), I've used it from spinlocked/irq disabled sections
and so far haven't seen it do bad things. But would be nice to get some
official statement to that effect.

To minimize the chance of breakage, I guess we could add a new func
->get_scanout_pos_and_time or something like that, fill it by default
with the code from the current drm_calc_vbltimestamp_from_scanoutpos().

Or we could just shovel the delta_ns handling from
drm_calc_vbltimestamp_from_scanoutpos() into some small helper function
that we could call from i915_get_vblank_timestamp(), and then we can
call i915_get_crtc_scanoutpos() directly from there as well.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-25  4:32                 ` Mario Kleiner
  2013-09-25  7:49                   ` [Intel-gfx] " Ville Syrjälä
@ 2013-09-25 13:52                   ` Steven Rostedt
  2013-09-25 14:13                   ` Steven Rostedt
  2 siblings, 0 replies; 32+ messages in thread
From: Steven Rostedt @ 2013-09-25 13:52 UTC (permalink / raw)
  To: Mario Kleiner
  Cc: Ville Syrjälä, Daniel Vetter, Peter Hurley,
	linux-rt-users, Clark Williams, Sebastian Andrzej Siewior, LKML,
	dri-devel@lists.freedesktop.org, Dave Airlie, Thomas Gleixner,
	intel-gfx, Luis Claudio R. Goncalves

On Wed, 25 Sep 2013 06:32:10 +0200
Mario Kleiner <mario.kleiner@tuebingen.mpg.de> wrote:


> I assume if a spin_lock_irqsave doesn't really disable interrupts on a 
> RT kernel with normal spinlocks then local_irq_disable won't really 
> disable interrupts either?
> 

That is incorrect. On PREEMPT_RT, you are right about
spin_lock_irqsave() not disabling interrupts, but local_irq_disable()
does indeed disable interrupts.

Open coded local_irq_disable() usually ends up being a bug as it does
nothing to synchronize interrupts from other CPUs. But most of those
bugs have been removed, and there are some very legit reasons for using
local_irq_disable(). PREEMPT_RT honors those.

The reason PREEMPT_RT does not disable interrupts for
spin_lock_irqsave(), is because that's usually used for when a
interrupt uses the same spinlock. You need the irqsave() part in order
to prevent a deadlock, if the code that has the spinlock gets preempted
by the interrupt and that interrupt tries to grab the same lock.

Because PREEMPT_RT runs interrupts as threads, we don't have that issue,
because if the interrupt preempts the holder of the lock and it tries
to grab the same lock, it will just block like any other thread trying
to grab that lock. That is, spinlocks turn into mutexes on PREEMPT_RT.

-- Steve

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-20 22:07             ` Mario Kleiner
  2013-09-23  8:38               ` [Intel-gfx] " Ville Syrjälä
@ 2013-09-25 14:07               ` Steven Rostedt
  1 sibling, 0 replies; 32+ messages in thread
From: Steven Rostedt @ 2013-09-25 14:07 UTC (permalink / raw)
  To: Mario Kleiner
  Cc: Daniel Vetter, Peter Hurley, Luis Claudio R. Goncalves,
	linux-rt-users, Thomas Gleixner, Sebastian Andrzej Siewior,
	Clark Williams, Dave Airlie, LKML,
	dri-devel@lists.freedesktop.org, intel-gfx


Sorry for the late reply, I was at Linux Plumbers, and had a bunch of
stuff to catch up on when I returned.

On Sat, 21 Sep 2013 00:07:36 +0200
Mario Kleiner <mario.kleiner@tuebingen.mpg.de> wrote:

 
> Steven, would it then be acceptable to convert that "faster" lock into a 
> raw_spinlock_t or is this unacceptable? If so, the preempt_disable() 
> could stay, right?

If a spinlock is tight (not held for more than 2us on todays
processors), and has little contention, than I would be fine with
converting it to raw. And if that's the only lock held you could do the
preempt_disable() call.

In fact, if you want, you can leave the preempt_disable() out of
mainline, and send a patch to us that uses "preempt_disable_rt()" and
add a comment to it. In the -rt patch, preempt_disable_rt() is a nop
when PREEMPT_RT is not set, and is preempt_disable() when it is.


-- Steve

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-25  4:32                 ` Mario Kleiner
  2013-09-25  7:49                   ` [Intel-gfx] " Ville Syrjälä
  2013-09-25 13:52                   ` Steven Rostedt
@ 2013-09-25 14:13                   ` Steven Rostedt
  2013-09-26 16:16                     ` Mario Kleiner
  2 siblings, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2013-09-25 14:13 UTC (permalink / raw)
  To: Mario Kleiner
  Cc: Ville Syrjälä, Daniel Vetter, Peter Hurley,
	linux-rt-users, Clark Williams, Sebastian Andrzej Siewior, LKML,
	dri-devel@lists.freedesktop.org, Dave Airlie, Thomas Gleixner,
	intel-gfx, Luis Claudio R. Goncalves

On Wed, 25 Sep 2013 06:32:10 +0200
Mario Kleiner <mario.kleiner@tuebingen.mpg.de> wrote:

 
> But given the new situation, your proposal is great! If we push the 
> clock readouts into the get_scanoutpos routine, we can make this robust 
> without causing grief for the rt people and without the need for a new 
> separate lock for display regs in intel-kms.
> 
> E.g., for intel-kms:
> 
> i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime)
> {
>    ...
>    spin_lock_irqsave(...uncore.lock);
>    preempt_disable();
>    *stime = ktime_get();
>    position = __raw_i915_read32(dev_priv, PIPEDSL(pipe));
>    *etime = ktime_get();
>    preempt_enable();
>    spin_unlock_irqrestore(...uncore.lock)
>    ...
> }
> 
> With your patchset to reduce the amount of register reads needed in that 
> function, and given that forcewake handling isn't needed for these 
> registers, this should make it robust again and wouldn't need new locks.
> 
> Unless ktime_get is also a bad thing to do in a preempt disabled section?

ktime_get() works fine in preempt_disable sections, although it may add
some latencies, but you shouldn't need to worry about it.

I like this solution the best too, but if it does go in, I would ask to
send us the patch for adding the preempt_disable() and we can add the
preempt_disable_rt() to it. Why make mainline have a little more
overhead?

-- Steve

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-25  7:49                   ` [Intel-gfx] " Ville Syrjälä
@ 2013-09-25 14:18                     ` Steven Rostedt
  2013-09-26 16:43                     ` Mario Kleiner
  1 sibling, 0 replies; 32+ messages in thread
From: Steven Rostedt @ 2013-09-25 14:18 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Mario Kleiner, Daniel Vetter, Peter Hurley, linux-rt-users,
	Clark Williams, Sebastian Andrzej Siewior, LKML,
	dri-devel@lists.freedesktop.org, Dave Airlie, Thomas Gleixner,
	intel-gfx, Luis Claudio R. Goncalves

On Wed, 25 Sep 2013 10:49:36 +0300
Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:

 
> The preempt_disable/enable is not needed. The spinlock serves the same
> purpose already.

As stated, that was only for the -rt patch, as spin_lock_irqsave does
not disable preemption nor does it even disable interrupts.

But I agree, the added preempt_disable() should be sent to us to keep
in the -rt patch itself. We really appreciate that you are thinking
about us :-)  But something like this will just confuse the mainline
folks. Having a "preempt_disable_rt()" would make a lot more sense
(which exists in the -rt patch).

> 
> As far as ktime_get(), I've used it from spinlocked/irq disabled sections
> and so far haven't seen it do bad things. But would be nice to get some
> official statement to that effect.

It's just a read seqlock. It may do a few loops to get the correct
time, but it's fine to have in a preempt/irq disabled section.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-25 14:13                   ` Steven Rostedt
@ 2013-09-26 16:16                     ` Mario Kleiner
  2013-10-11 10:18                       ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 32+ messages in thread
From: Mario Kleiner @ 2013-09-26 16:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mario Kleiner, Peter Hurley, Clark Williams, Daniel Vetter,
	Sebastian Andrzej Siewior, LKML, dri-devel@lists.freedesktop.org,
	Dave Airlie, Thomas Gleixner, Luis Claudio R. Goncalves,
	intel-gfx, linux-rt-users, Ville Syrjälä

On 25.09.13 16:13, Steven Rostedt wrote:
> On Wed, 25 Sep 2013 06:32:10 +0200
> Mario Kleiner <mario.kleiner@tuebingen.mpg.de> wrote:
>
>
>> But given the new situation, your proposal is great! If we push the
>> clock readouts into the get_scanoutpos routine, we can make this robust
>> without causing grief for the rt people and without the need for a new
>> separate lock for display regs in intel-kms.
>>
>> E.g., for intel-kms:
>>
>> i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime)
>> {
>>     ...
>>     spin_lock_irqsave(...uncore.lock);
>>     preempt_disable();
>>     *stime = ktime_get();
>>     position = __raw_i915_read32(dev_priv, PIPEDSL(pipe));
>>     *etime = ktime_get();
>>     preempt_enable();
>>     spin_unlock_irqrestore(...uncore.lock)
>>     ...
>> }
>>
>> With your patchset to reduce the amount of register reads needed in that
>> function, and given that forcewake handling isn't needed for these
>> registers, this should make it robust again and wouldn't need new locks.
>>
>> Unless ktime_get is also a bad thing to do in a preempt disabled section?
>
> ktime_get() works fine in preempt_disable sections, although it may add
> some latencies, but you shouldn't need to worry about it.
>
> I like this solution the best too, but if it does go in, I would ask to
> send us the patch for adding the preempt_disable() and we can add the
> preempt_disable_rt() to it. Why make mainline have a little more
> overhead?
>
> -- Steve

Good! I will do that. Thanks for clarifying the irq and constraints on 
raw locks in the other thread.

-mario

> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-25  7:49                   ` [Intel-gfx] " Ville Syrjälä
  2013-09-25 14:18                     ` Steven Rostedt
@ 2013-09-26 16:43                     ` Mario Kleiner
  1 sibling, 0 replies; 32+ messages in thread
From: Mario Kleiner @ 2013-09-26 16:43 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Mario Kleiner, Peter Hurley, Clark Williams, Daniel Vetter,
	Sebastian Andrzej Siewior, LKML, Steven Rostedt,
	dri-devel@lists.freedesktop.org, Dave Airlie, Thomas Gleixner,
	Luis Claudio R. Goncalves, intel-gfx, linux-rt-users

On 25.09.13 09:49, Ville Syrjälä wrote:
> On Wed, Sep 25, 2013 at 06:32:10AM +0200, Mario Kleiner wrote:
>> On 23.09.13 10:38, Ville Syrjälä wrote:
>>> On Sat, Sep 21, 2013 at 12:07:36AM +0200, Mario Kleiner wrote:
>>>> On 09/17/2013 10:55 PM, Daniel Vetter wrote:
>>>>> On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>>> On 09/11/2013 03:31 PM, Peter Hurley wrote:
>>>>>>>
>>>>>>> [+cc dri-devel]
>>>>>>>
>>>>>>> On 09/11/2013 11:38 AM, Steven Rostedt wrote:
>>>>>>>>
>>>>>>>> On Wed, 11 Sep 2013 11:16:43 -0400
>>>>>>>> Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>>>>>
>>>>>>>>>> The funny part is, there's a comment there that shows that this was
>>>>>>>>>> done even for "PREEMPT_RT". Unfortunately, the call to
>>>>>>>>>> "get_scanout_position()" can call functions that use the rt-mutex
>>>>>>>>>> "sleeping spin locks" and it breaks there.
>>>>>>>>>>
>>>>>>>>>> I guess we need to ask the authors of the mainline patch exactly why
>>>>>>>>>> that preempt_disable() is needed?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The drm core associates a timestamp with each vertical blank frame #.
>>>>>>>>> Drm drivers can optionally support a 'high resolution' hw timestamp.
>>>>>>>>> The vblank frame #/timestamp tuple is user-space visible.
>>>>>>>>>
>>>>>>>>> The i915 drm driver supports a hw timestamp via this drm helper function
>>>>>>>>> which computes the timestamp from the crtc scan position (based on the
>>>>>>>>> pixel clock).
>>>>>>>>>
>>>>>>>>> For mainline, the preempt_disable/_enable() isn't actually necessary
>>>>>>>>> because every call tree that leads here already has preemption disabled.
>>>>>>>>>
>>>>>>>>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
>>>>>>>>>
>>>>>>>>
>>>>>>>> No, it should not. Note, any other lock that can be held when it is
>>>>>>>> held would also need to be raw.
>>>>>>>
>>>>>>>
>>>>>>> By that, you mean "any other lock" that might be claimed "would also need
>>>>>>> to be raw"?  Hopefully not "any other lock" already held?
>>>>>>>
>>>>>>>> And by taking a quick audit of the code, I see this:
>>>>>>>>
>>>>>>>>        spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
>>>>>>>>
>>>>>>>>        /* Reset the chip */
>>>>>>>>
>>>>>>>>        /* GEN6_GDRST is not in the gt power well, no need to check
>>>>>>>>         * for fifo space for the write or forcewake the chip for
>>>>>>>>         * the read
>>>>>>>>         */
>>>>>>>>        __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
>>>>>>>>
>>>>>>>>        /* Spin waiting for the device to ack the reset request */
>>>>>>>>        ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
>>>>>>>> GEN6_GRDOM_FULL) == 0, 500);
>>>>>>>>
>>>>>>>> That spin is unacceptable in RT with preemption and interrupts disabled.
>>>>>>>
>>>>>>>
>>>>>>> Yep. That would be bad.
>>>>>>>
>>>>>>> AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
>>>>>>> in the force-wake set, so raw reads of the registers would
>>>>>>> probably be acceptable (thus obviating the need for claiming the
>>>>>>> uncore.lock).
>>>>>>>
>>>>>>> Except that _ALL_ register access is disabled with the uncore.lock
>>>>>>> during a gpu reset. Not sure if that's meant to include crtc registers
>>>>>>> or not, or what other synchronization/serialization issues are being
>>>>>>> handled/hidden by forcing all register accesses to wait during a gpu
>>>>>>> reset.
>>>>>>>
>>>>>>> Hopefully an i915 expert can weigh in here?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Daniel,
>>>>>>
>>>>>> Can you shed some light on whether the i915+ crtc registers (specifically
>>>>>> those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter())
>>>>>> read as part of the vblank counter/timestamp handling need to
>>>>>> be prevented during gpu reset?
>>>>>
>>>>> The depency here in the locking is a recent addition:
>>>>>
>>>>> commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a
>>>>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Date:   Fri Jul 19 20:36:51 2013 +0100
>>>>>
>>>>>        drm/i915: Serialize almost all register access
>>>>>
>>>>> It's a (slightly) oversized hammer to work around a hardware issue -
>>>>> we could break it down to register blocks, which can be accessed
>>>>> concurrently, but that tends to be more fragile. But the chip really
>>>>> dies if you access (even just reads) the same block concurrently :(
>>>>>
>>>>> We could try break the spinlock protected section a bit in the reset
>>>>> handler - register access on a hung gpu tends to be ill-defined
>>>>> anyway.
>>>>>
>>>>>> The implied wait with preemption and interrupts disabled is causing grief
>>>>>> in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
>>>>>
>>>>> Oops, the magic code in wait_for which is just there to make the imo
>>>>> totally misguided kgdb support work papered over the aweful long wait
>>>>> in atomic context ever since we've added this in
>>>>>
>>>>> commit b6e45f866465f42b53d803b0c574da0fc508a0e9
>>>>> Author: Keith Packard <keithp@keithp.com>
>>>>> Date:   Fri Jan 6 11:34:04 2012 -0800
>>>>>
>>>>>        drm/i915: Move reset forcewake processing to gen6_do_reset
>>>>>
>>>>> Reverting this change should be enough (code moved obviously a bit).
>>>>>
>>>>> Cheers, Daniel
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Peter Hurley
>>>>>>
>>>>>>
>>>>>>
>>>>>>>> What's the real issue here?
>>>>>>>
>>>>>>>
>>>>>>> That the vblank timestamp needs to be an accurate measurement of a
>>>>>>> realtime event. Sleeping/servicing interrupts while reading
>>>>>>> the registers necessary to compute the timestamp would be bad too.
>>>>>>>
>>>>>>> (edit: which hopefully Mario Kleiner clarified in his reply)
>>>>>>>
>>>>>>> My point earlier was three-fold:
>>>>>>> 1. Don't need the preempt_disable() for mainline: all callers are already
>>>>>>>        holding interrupt-disabling spinlocks.
>>>>>>> 2. -RT still needs to prevent scheduling there.
>>>>>>> 3. the problem is i915-specific.
>>>>>>>
>>>>>>> [update: the radeon driver should also BUG like the i915 driver but
>>>>>>> probably
>>>>>>> should have mmio_idx_lock spinlock as raw]
>>>>>>
>>>>
>>>> Ok, so register access must be serialized, at least within a register
>>>> block, no way around it. Thanks for explaining this. That makes me a bit
>>>> depressed. Is this also true for older hw generations than gen6?
>>>>
>>>> Daniel, could we move access to the display register block to a separate
>>>> lock, so the I915_READ to PIPEDSL in  i915_get_crtc_scanoutpos() can
>>>> avoid the uncore.lock? If the display registers don't need forcewake
>>>> then i assume we could have much shorter lock hold times by avoiding the
>>>> large up to 4 msecs waits in forcewake handling. Probably also much less
>>>> contention in normal operation when no modesetting happens? I think i
>>>> can get rid of the other register reads in that function. Those settings
>>>> are already cached and accessible from the intel_crtc_config->adjusted_mode.
>>>
>>> I actually have a patch to kill most of the registers reads in
>>> get_scanout_position on i915 somewhere. Let me dig that out and send it
>>> to the list...
>>>
>>> <snip>
>>>> In any case, maybe the simple retry 3x loop in the DRM for measuring the
>>>> timestamp is not good enough anymore to keep this reliable and accurate.
>>>> Maybe we would need some loop that retries until an accurate measurement
>>>> or a user configurable timeout is reached. Then users like  mine could
>>>> set a very high timeout which rather totally degrades the system and
>>>> causes severe stuttering of graphics updates than silently failing with
>>>> inaccurate timestamps. The default could be something safe for normal
>>>> desktop use.
>>>
>>> I never really understood the need for the preempt disabled retry loop in
>>> drm core. What I would do is just something like this:
>>>
>>> get_scanoutpos_and_timestamp()
>>> {
>>>    local_irq_disable();
>>>    get_scanoutpos();
>>>    get_timestamp();
>>>    local_irq_enable();
>>> }
>>
>> The preempt_disable serves the same purpose for PREEMPT_RT kernels as
>> your local_irq_disable in your example - get rid of any preventable
>> interruption, so a retry is unlikely to be ever needed. At least that
>> was the idea.
>>
>> I assume if a spin_lock_irqsave doesn't really disable interrupts on a
>> RT kernel with normal spinlocks then local_irq_disable won't really
>> disable interrupts either?
>>
>>>
>>> If that has a lot of error, then it seems to me that the system is just
>>> crap and we shouldn't care. Or if you're really worried about SMIs and
>>> such then you could still do a retry loop.
>>>
>>
>> I didn't expect a lot of error with preemption and interrupts disabled,
>> essentially only such infrequent things as smi's, that's why the retry
>> loop only tries 3x max, once for real, once in case the 1st one got
>> spoiled by a smi or such, and once because three times a charm ;-) - In
>> practice i didn't ever observe more than 3-4 usecs of delay, well below
>> the 20 usecs retry threshold. But i didn't expect
>> i915_crtc_get_scanoutpos() to ever take any locks or other stuff that
>> could induce long waits.
>>
>> But given the new situation, your proposal is great! If we push the
>> clock readouts into the get_scanoutpos routine, we can make this robust
>> without causing grief for the rt people and without the need for a new
>> separate lock for display regs in intel-kms.
>>
>> E.g., for intel-kms:
>>
>> i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime)
>> {
>>     ...
>>     spin_lock_irqsave(...uncore.lock);
>>     preempt_disable();
>>     *stime = ktime_get();
>>     position = __raw_i915_read32(dev_priv, PIPEDSL(pipe));
>>     *etime = ktime_get();
>>     preempt_enable();
>>     spin_unlock_irqrestore(...uncore.lock)
>>     ...
>> }
>>
>> With your patchset to reduce the amount of register reads needed in that
>> function, and given that forcewake handling isn't needed for these
>> registers, this should make it robust again and wouldn't need new locks.
>>
>> Unless ktime_get is also a bad thing to do in a preempt disabled section?
>
> The preempt_disable/enable is not needed. The spinlock serves the same
> purpose already.
>
> As far as ktime_get(), I've used it from spinlocked/irq disabled sections
> and so far haven't seen it do bad things. But would be nice to get some
> official statement to that effect.
>
> To minimize the chance of breakage, I guess we could add a new func
> ->get_scanout_pos_and_time or something like that, fill it by default
> with the code from the current drm_calc_vbltimestamp_from_scanoutpos().
>
> Or we could just shovel the delta_ns handling from
> drm_calc_vbltimestamp_from_scanoutpos() into some small helper function
> that we could call from i915_get_vblank_timestamp(), and then we can
> call i915_get_crtc_scanoutpos() directly from there as well.
>

I would like to keep most of the logic for scanoutpos timestamping in 
the the current drm_calc_vbltimestamp_from_scanoutpos() and only have 
the inner part of the retry loop in i915_get_crtc_scanoutpos(), so we 
have as much shared code as possible between drivers there. Makes it 
easier to keep track of what changes when, fix stuff, and to apply the 
module parameters related to timestamping.

We could add a new get_scanout_pos_with_time(), but afaik only i915 and 
radeon-kms currently use/implement that function, so maybe we can just 
convert those bits in one go, as this is only kernel internal api?

-mario

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-26 16:16                     ` Mario Kleiner
@ 2013-10-11 10:18                       ` Sebastian Andrzej Siewior
  2013-10-11 12:37                         ` Steven Rostedt
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-10-11 10:18 UTC (permalink / raw)
  To: Mario Kleiner
  Cc: Steven Rostedt, Mario Kleiner, Peter Hurley, Clark Williams,
	Daniel Vetter, LKML, dri-devel@lists.freedesktop.org, Dave Airlie,
	Thomas Gleixner, Luis Claudio R. Goncalves, intel-gfx,
	linux-rt-users, Ville Syrjälä

* Mario Kleiner | 2013-09-26 18:16:47 [+0200]:

>Good! I will do that. Thanks for clarifying the irq and constraints
>on raw locks in the other thread.

Are there any suggestions for "now"?  preempt_disable_nort() like Luis
suggesed?

>-mario

Sebastian

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-10-11 10:18                       ` Sebastian Andrzej Siewior
@ 2013-10-11 12:37                         ` Steven Rostedt
  2013-10-11 13:30                           ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2013-10-11 12:37 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Mario Kleiner, Mario Kleiner, Peter Hurley, Clark Williams,
	Daniel Vetter, LKML, dri-devel@lists.freedesktop.org, Dave Airlie,
	Thomas Gleixner, Luis Claudio R. Goncalves, intel-gfx,
	linux-rt-users, Ville Syrjälä

On Fri, 11 Oct 2013 12:18:00 +0200
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> * Mario Kleiner | 2013-09-26 18:16:47 [+0200]:
> 
> >Good! I will do that. Thanks for clarifying the irq and constraints
> >on raw locks in the other thread.
> 
> Are there any suggestions for "now"?  preempt_disable_nort() like Luis
> suggesed?
> 

The preempt_disable_nort() is rather pointless, because the
preempt_disable() was added specifically for -rt. When PREEMPT_RT is
not enabled, preemption is disabled there already by the previous calls
to spin_lock().

-- Steve

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-10-11 12:37                         ` Steven Rostedt
@ 2013-10-11 13:30                           ` Sebastian Andrzej Siewior
  2013-10-11 13:49                             ` Mario Kleiner
  2013-10-11 14:38                             ` Steven Rostedt
  0 siblings, 2 replies; 32+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-10-11 13:30 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mario Kleiner, Mario Kleiner, Peter Hurley, Clark Williams,
	Daniel Vetter, LKML, dri-devel@lists.freedesktop.org, Dave Airlie,
	Thomas Gleixner, Luis Claudio R. Goncalves, intel-gfx,
	linux-rt-users, Ville Syrjälä

On 10/11/2013 02:37 PM, Steven Rostedt wrote:
> On Fri, 11 Oct 2013 12:18:00 +0200
> Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> 
>> * Mario Kleiner | 2013-09-26 18:16:47 [+0200]:
>>
>>> Good! I will do that. Thanks for clarifying the irq and constraints
>>> on raw locks in the other thread.
>>
>> Are there any suggestions for "now"?  preempt_disable_nort() like Luis
>> suggesed?
>>
> 
> The preempt_disable_nort() is rather pointless, because the
> preempt_disable() was added specifically for -rt. When PREEMPT_RT is
> not enabled, preemption is disabled there already by the previous calls
> to spin_lock().

Either way. Then I remove the preempt_enable/disable call. Any
objections?

> -- Steve
> 
Sebastian

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-10-11 13:30                           ` Sebastian Andrzej Siewior
@ 2013-10-11 13:49                             ` Mario Kleiner
  2013-10-11 14:38                             ` Steven Rostedt
  1 sibling, 0 replies; 32+ messages in thread
From: Mario Kleiner @ 2013-10-11 13:49 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Steven Rostedt
  Cc: Mario Kleiner, Peter Hurley, Clark Williams, Daniel Vetter, LKML,
	dri-devel@lists.freedesktop.org, Dave Airlie, Thomas Gleixner,
	Luis Claudio R. Goncalves, intel-gfx, linux-rt-users,
	Ville Syrjälä

On 10/11/2013 03:30 PM, Sebastian Andrzej Siewior wrote:
> On 10/11/2013 02:37 PM, Steven Rostedt wrote:
>> On Fri, 11 Oct 2013 12:18:00 +0200
>> Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
>>
>>> * Mario Kleiner | 2013-09-26 18:16:47 [+0200]:
>>>
>>>> Good! I will do that. Thanks for clarifying the irq and constraints
>>>> on raw locks in the other thread.
>>>
>>> Are there any suggestions for "now"?  preempt_disable_nort() like Luis
>>> suggesed?
>>>
>>
>> The preempt_disable_nort() is rather pointless, because the
>> preempt_disable() was added specifically for -rt. When PREEMPT_RT is
>> not enabled, preemption is disabled there already by the previous calls
>> to spin_lock().
>
> Either way. Then I remove the preempt_enable/disable call. Any
> objections?
>

Good with me. I'm currently working on a replacement.
-mario

>> -- Steve
>>
> Sebastian
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-09-11 10:28 BUG: sleeping function called from invalid context on 3.10.10-rt7 Luis Claudio R. Goncalves
  2013-09-11 13:26 ` Steven Rostedt
@ 2013-10-11 14:19 ` Sebastian Andrzej Siewior
  2013-10-11 14:45   ` Luis Claudio R. Goncalves
  1 sibling, 1 reply; 32+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-10-11 14:19 UTC (permalink / raw)
  To: Luis Claudio R. Goncalves
  Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner, Clark Williams

* Luis Claudio R. Goncalves | 2013-09-11 07:28:09 [-0300]:

>Hello,
Hi Luis,

>after Xorg started. I silenced that with a hackish patch, just for fun, and
>found a second problem, this time on tty_ldisc_reinit().
>
>[   25.423675] BUG: sleeping function called from invalid context at /home/lclaudio/SANDBOX/mrg-rt-v2-devel/kernel/rtmutex.c:659
>[   25.423676] in_atomic(): 1, irqs_disabled(): 1, pid: 188, name: plymouthd
>[   25.423676] 3 locks held by plymouthd/188:
>[   25.423682]  #0:  (&tty->legacy_mutex){......}, at: [<ffffffff816528d0>] tty_lock_nested+0x40/0x90
>[   25.423686]  #1:  (&tty->ldisc_mutex){......}, at: [<ffffffff8139b482>] tty_ldisc_hangup+0x152/0x300
>[   25.423688]  #2:  (tty_ldisc_lock){......}, at: [<ffffffff8139a9c2>] tty_ldisc_reinit+0x72/0x130
>[   25.423689] Preemption disabled at:[<ffffffff8139a9c2>] tty_ldisc_reinit+0x72/0x130
>[   25.423691] CPU: 2 PID: 188 Comm: plymouthd Not tainted 3.10.10-rt7+ #6
>[   25.423692] Hardware name: Hewlett-Packard p7-1512/2ADA, BIOS 8.15 02/05/2013
>[   25.423694]  005ff00000000000 ffff8801788ada68 ffffffff8164b790 ffff8801788ada80
>[   25.423695]  ffffffff8107e62f ffff8801991ce1a0 ffff8801788adaa0 ffffffff81651ac4
>[   25.423696]  0000000000000000 ffffea0005e26680 ffff8801788adaf8 ffffffff81130984
>[   25.423696] Call Trace:
>[   25.423700]  [<ffffffff8164b790>] dump_stack+0x19/0x1b
>[   25.423702]  [<ffffffff8107e62f>] __might_sleep+0xff/0x170
>[   25.423704]  [<ffffffff81651ac4>] rt_spin_lock+0x24/0x60
>[   25.423707]  [<ffffffff81130984>] free_hot_cold_page+0xb4/0x3c0
>[   25.423710]  [<ffffffff81178209>] ?unfreeze_partials.isra.42+0x229/0x2b0
>[   25.423711]  [<ffffffff81130dc7>] __free_pages+0x47/0x70
>[   25.423713]  [<ffffffff81130fb2>] __free_memcg_kmem_pages+0x22/0x50
>[   25.423714]  [<ffffffff81177528>] __free_slab+0xe8/0x1e0
>[   25.423716]  [<ffffffff81177654>] free_delayed+0x34/0x50
>[   25.423717]  [<ffffffff81649633>] __slab_free+0x273/0x36b
>[   25.423719]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
>[   25.423721]  [<ffffffff810ad49e>] ? put_lock_stats.isra.23+0xe/0x40
>[   25.423722]  [<ffffffff81178794>] kfree+0x1c4/0x210
>[   25.423724]  [<ffffffff8139a9f5>] ? tty_ldisc_reinit+0xa5/0x130
>[   25.423725]  [<ffffffff8139a9f5>] tty_ldisc_reinit+0xa5/0x130
>[   25.423726]  [<ffffffff8139b49f>] tty_ldisc_hangup+0x16f/0x300
>[   25.423728]  [<ffffffff81082e1d>] ? get_parent_ip+0xd/0x50
>[   25.423731]  [<ffffffff8104d736>] ? unpin_current_cpu+0x16/0x70
>[   25.423732]  [<ffffffff81392136>] __tty_hangup+0x346/0x460
>[   25.423733]  [<ffffffff81392260>] tty_vhangup+0x10/0x20
>[   25.423735]  [<ffffffff8139d6e1>] pty_close+0x131/0x180
>[   25.423736]  [<ffffffff813936ad>] tty_release+0x11d/0x5f0
>[   25.423737]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
>[   25.423747]  [<ffffffff81009d84>] ? native_sched_clock+0x24/0x80
>[   25.423749]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
>[   25.423750]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
>[   25.423752]  [<ffffffff8118eb45>] __fput+0xf5/0x250
>[   25.423762]  [<ffffffff8118ed6e>] ____fput+0xe/0x10
>[   25.423764]  [<ffffffff8106f1cc>] task_work_run+0xac/0xe0
>[   25.423765]  [<ffffffff81002a11>] do_notify_resume+0x91/0xf0
>[   25.423767]  [<ffffffff8165a41c>] int_signal+0x12/0x17

So the thread continues as a i915 bug thread. Now that I looked at it
again I noticed the tty problem which we probably have since v3.8. The
following patch should fix it.

diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
index 1afe192..bc4ffe4 100644
--- a/drivers/tty/tty_ldisc.c
+++ b/drivers/tty/tty_ldisc.c
@@ -197,9 +197,10 @@ static inline void tty_ldisc_put(struct tty_ldisc *ld)
 	WARN_ON(!atomic_dec_and_test(&ld->users));
 
 	ld->ops->refcount--;
+	raw_spin_unlock_irqrestore(&tty_ldisc_lock, flags);
+
 	module_put(ld->ops->owner);
 	kfree(ld);
-	raw_spin_unlock_irqrestore(&tty_ldisc_lock, flags);
 }
 
 static void *tty_ldiscs_seq_start(struct seq_file *m, loff_t *pos)
-- 
1.8.4.rc3

Sebastian

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-10-11 13:30                           ` Sebastian Andrzej Siewior
  2013-10-11 13:49                             ` Mario Kleiner
@ 2013-10-11 14:38                             ` Steven Rostedt
  1 sibling, 0 replies; 32+ messages in thread
From: Steven Rostedt @ 2013-10-11 14:38 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Mario Kleiner, Mario Kleiner, Peter Hurley, Clark Williams,
	Daniel Vetter, LKML, dri-devel@lists.freedesktop.org, Dave Airlie,
	Thomas Gleixner, Luis Claudio R. Goncalves, intel-gfx,
	linux-rt-users, Ville Syrjälä

On Fri, 11 Oct 2013 15:30:22 +0200
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> On 10/11/2013 02:37 PM, Steven Rostedt wrote:
> > On Fri, 11 Oct 2013 12:18:00 +0200
> > Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> > 
> >> * Mario Kleiner | 2013-09-26 18:16:47 [+0200]:
> >>
> >>> Good! I will do that. Thanks for clarifying the irq and constraints
> >>> on raw locks in the other thread.
> >>
> >> Are there any suggestions for "now"?  preempt_disable_nort() like Luis
> >> suggesed?
> >>
> > 
> > The preempt_disable_nort() is rather pointless, because the
> > preempt_disable() was added specifically for -rt. When PREEMPT_RT is
> > not enabled, preemption is disabled there already by the previous calls
> > to spin_lock().
> 
> Either way. Then I remove the preempt_enable/disable call. Any
> objections?
> 

I have no issues with it, but it may cause issues with timings for the
device. But I see Mario is looking into that :-)

-- Steve

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
  2013-10-11 14:19 ` Sebastian Andrzej Siewior
@ 2013-10-11 14:45   ` Luis Claudio R. Goncalves
  0 siblings, 0 replies; 32+ messages in thread
From: Luis Claudio R. Goncalves @ 2013-10-11 14:45 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner, Clark Williams

On Fri, Oct 11, 2013 at 04:19:31PM +0200, Sebastian Andrzej Siewior wrote:
| * Luis Claudio R. Goncalves | 2013-09-11 07:28:09 [-0300]:
| 
| >Hello,
| Hi Luis,
| 
| >after Xorg started. I silenced that with a hackish patch, just for fun, and
| >found a second problem, this time on tty_ldisc_reinit().
| >
| >[   25.423675] BUG: sleeping function called from invalid context at /home/lclaudio/SANDBOX/mrg-rt-v2-devel/kernel/rtmutex.c:659
| >[   25.423676] in_atomic(): 1, irqs_disabled(): 1, pid: 188, name: plymouthd
| >[   25.423676] 3 locks held by plymouthd/188:
| >[   25.423682]  #0:  (&tty->legacy_mutex){......}, at: [<ffffffff816528d0>] tty_lock_nested+0x40/0x90
| >[   25.423686]  #1:  (&tty->ldisc_mutex){......}, at: [<ffffffff8139b482>] tty_ldisc_hangup+0x152/0x300
| >[   25.423688]  #2:  (tty_ldisc_lock){......}, at: [<ffffffff8139a9c2>] tty_ldisc_reinit+0x72/0x130
| >[   25.423689] Preemption disabled at:[<ffffffff8139a9c2>] tty_ldisc_reinit+0x72/0x130
| >[   25.423691] CPU: 2 PID: 188 Comm: plymouthd Not tainted 3.10.10-rt7+ #6
| >[   25.423692] Hardware name: Hewlett-Packard p7-1512/2ADA, BIOS 8.15 02/05/2013
| >[   25.423694]  005ff00000000000 ffff8801788ada68 ffffffff8164b790 ffff8801788ada80
| >[   25.423695]  ffffffff8107e62f ffff8801991ce1a0 ffff8801788adaa0 ffffffff81651ac4
| >[   25.423696]  0000000000000000 ffffea0005e26680 ffff8801788adaf8 ffffffff81130984
| >[   25.423696] Call Trace:
| >[   25.423700]  [<ffffffff8164b790>] dump_stack+0x19/0x1b
| >[   25.423702]  [<ffffffff8107e62f>] __might_sleep+0xff/0x170
| >[   25.423704]  [<ffffffff81651ac4>] rt_spin_lock+0x24/0x60
| >[   25.423707]  [<ffffffff81130984>] free_hot_cold_page+0xb4/0x3c0
| >[   25.423710]  [<ffffffff81178209>] ?unfreeze_partials.isra.42+0x229/0x2b0
| >[   25.423711]  [<ffffffff81130dc7>] __free_pages+0x47/0x70
| >[   25.423713]  [<ffffffff81130fb2>] __free_memcg_kmem_pages+0x22/0x50
| >[   25.423714]  [<ffffffff81177528>] __free_slab+0xe8/0x1e0
| >[   25.423716]  [<ffffffff81177654>] free_delayed+0x34/0x50
| >[   25.423717]  [<ffffffff81649633>] __slab_free+0x273/0x36b
| >[   25.423719]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
| >[   25.423721]  [<ffffffff810ad49e>] ? put_lock_stats.isra.23+0xe/0x40
| >[   25.423722]  [<ffffffff81178794>] kfree+0x1c4/0x210
| >[   25.423724]  [<ffffffff8139a9f5>] ? tty_ldisc_reinit+0xa5/0x130
| >[   25.423725]  [<ffffffff8139a9f5>] tty_ldisc_reinit+0xa5/0x130
| >[   25.423726]  [<ffffffff8139b49f>] tty_ldisc_hangup+0x16f/0x300
| >[   25.423728]  [<ffffffff81082e1d>] ? get_parent_ip+0xd/0x50
| >[   25.423731]  [<ffffffff8104d736>] ? unpin_current_cpu+0x16/0x70
| >[   25.423732]  [<ffffffff81392136>] __tty_hangup+0x346/0x460
| >[   25.423733]  [<ffffffff81392260>] tty_vhangup+0x10/0x20
| >[   25.423735]  [<ffffffff8139d6e1>] pty_close+0x131/0x180
| >[   25.423736]  [<ffffffff813936ad>] tty_release+0x11d/0x5f0
| >[   25.423737]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
| >[   25.423747]  [<ffffffff81009d84>] ? native_sched_clock+0x24/0x80
| >[   25.423749]  [<ffffffff81087025>] ? sched_clock_cpu+0xb5/0x100
| >[   25.423750]  [<ffffffff810acde9>] ? get_lock_stats+0x19/0x60
| >[   25.423752]  [<ffffffff8118eb45>] __fput+0xf5/0x250
| >[   25.423762]  [<ffffffff8118ed6e>] ____fput+0xe/0x10
| >[   25.423764]  [<ffffffff8106f1cc>] task_work_run+0xac/0xe0
| >[   25.423765]  [<ffffffff81002a11>] do_notify_resume+0x91/0xf0
| >[   25.423767]  [<ffffffff8165a41c>] int_signal+0x12/0x17
| 
| So the thread continues as a i915 bug thread. Now that I looked at it
| again I noticed the tty problem which we probably have since v3.8. The
| following patch should fix it.

Thanks, Sebastian! Looking at yout patch it sounds pretty straightforward,
indeed.

I have to confess that I have never seen that backtrace again, after I
reported it. I will add your patch to my kernel and give it a try :)

Thanks,
Luis
 
| diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
| index 1afe192..bc4ffe4 100644
| --- a/drivers/tty/tty_ldisc.c
| +++ b/drivers/tty/tty_ldisc.c
| @@ -197,9 +197,10 @@ static inline void tty_ldisc_put(struct tty_ldisc *ld)
|  	WARN_ON(!atomic_dec_and_test(&ld->users));
|  
|  	ld->ops->refcount--;
| +	raw_spin_unlock_irqrestore(&tty_ldisc_lock, flags);
| +
|  	module_put(ld->ops->owner);
|  	kfree(ld);
| -	raw_spin_unlock_irqrestore(&tty_ldisc_lock, flags);
|  }
|  
|  static void *tty_ldiscs_seq_start(struct seq_file *m, loff_t *pos)
| -- 
| 1.8.4.rc3
| 
| Sebastian
---end quoted text---

-- 
[ Luis Claudio R. Goncalves             Red Hat  -  Realtime Team ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8 ]


^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2013-10-11 14:45 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-11 10:28 BUG: sleeping function called from invalid context on 3.10.10-rt7 Luis Claudio R. Goncalves
2013-09-11 13:26 ` Steven Rostedt
2013-09-11 15:16   ` Peter Hurley
2013-09-11 15:38     ` Steven Rostedt
2013-09-11 19:31       ` Peter Hurley
2013-09-17 19:50         ` Peter Hurley
2013-09-17 20:55           ` Daniel Vetter
2013-09-18 16:52             ` Peter Hurley
2013-09-18 17:03               ` Daniel Vetter
2013-09-18 17:03               ` Ville Syrjälä
2013-09-20 22:07             ` Mario Kleiner
2013-09-23  8:38               ` [Intel-gfx] " Ville Syrjälä
2013-09-25  4:32                 ` Mario Kleiner
2013-09-25  7:49                   ` [Intel-gfx] " Ville Syrjälä
2013-09-25 14:18                     ` Steven Rostedt
2013-09-26 16:43                     ` Mario Kleiner
2013-09-25 13:52                   ` Steven Rostedt
2013-09-25 14:13                   ` Steven Rostedt
2013-09-26 16:16                     ` Mario Kleiner
2013-10-11 10:18                       ` Sebastian Andrzej Siewior
2013-10-11 12:37                         ` Steven Rostedt
2013-10-11 13:30                           ` Sebastian Andrzej Siewior
2013-10-11 13:49                             ` Mario Kleiner
2013-10-11 14:38                             ` Steven Rostedt
2013-09-25 14:07               ` Steven Rostedt
2013-09-11 18:29     ` Mario Kleiner
2013-09-11 18:35       ` Steven Rostedt
2013-09-11 19:07         ` Mario Kleiner
2013-09-11 19:19           ` Steven Rostedt
2013-09-11 20:23             ` Mario Kleiner
2013-10-11 14:19 ` Sebastian Andrzej Siewior
2013-10-11 14:45   ` Luis Claudio R. Goncalves

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).