* i915 X lockup
@ 2009-02-27 9:28 Jiri Slaby
From: Jiri Slaby @ 2009-02-27 9:28 UTC
To: airlied; +Cc: eric, keithp, dri-devel, Andrew Morton, Linux kernel mailing list
Hi,

Every time I run X, it gets stuck. I am currently running mmotm
2009-02-26-16-58, but I think this is a wider problem. I had i915 disabled
for a long time (until I noticed today).
SysRq : Show Locks Held
Showing all locks held in the system:
3 locks held by events/0/10:
 #0: (events){+.+.+.}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
 #1: (&(&dev_priv->mm.retire_work)->work){+.+...}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
 #2: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff804057ba>] i915_gem_retire_work_handler+0x3a/0x90
1 lock held by mingetty/3899:
 #0: (&tty->atomic_read_lock){+.+.+.}, at: [<ffffffff803cb5de>] n_tty_read+0x48e/0x8e0
1 lock held by mingetty/3900:
 #0: (&tty->atomic_read_lock){+.+.+.}, at: [<ffffffff803cb5de>] n_tty_read+0x48e/0x8e0
1 lock held by mingetty/3901:
 #0: (&tty->atomic_read_lock){+.+.+.}, at: [<ffffffff803cb5de>] n_tty_read+0x48e/0x8e0
1 lock held by X/4007:
 #0: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff8040563c>] i915_gem_throttle_ioctl+0x2c/0x60
2 locks held by bash/4105:
 #0: (sysrq_key_table_lock){......}, at: [<ffffffff803de366>] __handle_sysrq+0x26/0x190
 #1: (tasklist_lock){.+.+..}, at: [<ffffffff80266c1f>] debug_show_all_locks+0x3f/0x1c0
=============================================
INFO: task events/0:10 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
events/0 D 0000000000000000 0 10 2
ffff8801cb22fd60 0000000000000046 ffff8801cb22fcc0 ffffffff809d5cb0
0000000000010400 ffffffff804057ba ffff8801cb20a6d0 ffff8801cb20a080
ffff8801cb20a328 00000000802690a3 00000000ffff0ea1 0000000000000002
Call Trace:
[<ffffffff804057ba>] ? i915_gem_retire_work_handler+0x3a/0x90
[<ffffffff8026804d>] ? mark_held_locks+0x6d/0x90
[<ffffffff80612fb5>] ? mutex_lock_nested+0x185/0x310
[<ffffffff80612f46>] mutex_lock_nested+0x116/0x310
[<ffffffff804057ba>] ? i915_gem_retire_work_handler+0x3a/0x90
[<ffffffff802690a3>] ? __lock_acquire+0xab3/0x12c0
[<ffffffff80405780>] ? i915_gem_retire_work_handler+0x0/0x90
[<ffffffff804057ba>] i915_gem_retire_work_handler+0x3a/0x90
[<ffffffff80252290>] worker_thread+0x1f0/0x340
[<ffffffff8025223d>] ? worker_thread+0x19d/0x340
[<ffffffff80614aff>] ? _spin_unlock_irqrestore+0x3f/0x60
[<ffffffff80256de0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8026838d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff802520a0>] ? worker_thread+0x0/0x340
[<ffffffff80256a2e>] kthread+0x9e/0xb0
[<ffffffff8020d51a>] child_rip+0xa/0x20
[<ffffffff8020cf3c>] ? restore_args+0x0/0x30
[<ffffffff80256990>] ? kthread+0x0/0xb0
[<ffffffff8020d510>] ? child_rip+0x0/0x20
3 locks held by events/0/10:
 #0: (events){+.+.+.}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
 #1: (&(&dev_priv->mm.retire_work)->work){+.+...}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
 #2: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff804057ba>] i915_gem_retire_work_handler+0x3a/0x90
Adapter is:
00:02.0 VGA compatible controller [0300]: Intel Corporation 82G33/G31 Express Integrated Graphics Controller [8086:29c2] (rev 02) (prog-if 00 [VGA controller])
        Subsystem: Intel Corporation 82G33/G31 Express Integrated Graphics Controller [8086:29c2]
        Flags: bus master, fast devsel, latency 0, IRQ 26
        Memory at ffa80000 (32-bit, non-prefetchable) [size=512K]
        I/O ports at ec00 [size=8]
        Memory at d0000000 (32-bit, prefetchable) [size=256M]
        Memory at ff900000 (32-bit, non-prefetchable) [size=1M]
        Capabilities: [90] Message Signalled Interrupts: Mask- 64bit- Count=1/1 Enable+
        Capabilities: [d0] Power Management version 2
        Kernel driver in use: i915
X server complains:
[mi] EQ overflowing. The server is probably stuck in an infinite loop.
[mi] mieqEnequeue: out-of-order valuator event; dropping.
* Re: i915 X lockup
From: Peter Zijlstra @ 2009-02-27 10:01 UTC
To: Jiri Slaby; +Cc: airlied, eric, keithp, dri-devel, Andrew Morton, Linux kernel mailing list

On Fri, 2009-02-27 at 10:28 +0100, Jiri Slaby wrote:
> SysRq : Show Locks Held
>
> Showing all locks held in the system:
> 3 locks held by events/0/10:
> #0: (events){+.+.+.}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
> #1: (&(&dev_priv->mm.retire_work)->work){+.+...}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
> #2: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff804057ba>] i915_gem_retire_work_handler+0x3a/0x90
> 1 lock held by X/4007:
> #0: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff8040563c>] i915_gem_throttle_ioctl+0x2c/0x60
> =============================================
>
> INFO: task events/0:10 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> events/0 D 0000000000000000 0 10 2
> ffff8801cb22fd60 0000000000000046 ffff8801cb22fcc0 ffffffff809d5cb0
> 0000000000010400 ffffffff804057ba ffff8801cb20a6d0 ffff8801cb20a080
> ffff8801cb20a328 00000000802690a3 00000000ffff0ea1 0000000000000002
> Call Trace:
> [<ffffffff804057ba>] ? i915_gem_retire_work_handler+0x3a/0x90
> [<ffffffff8026804d>] ? mark_held_locks+0x6d/0x90
> [<ffffffff80612fb5>] ? mutex_lock_nested+0x185/0x310
> [<ffffffff80612f46>] mutex_lock_nested+0x116/0x310
> [<ffffffff804057ba>] ? i915_gem_retire_work_handler+0x3a/0x90
> [<ffffffff802690a3>] ? __lock_acquire+0xab3/0x12c0
> [<ffffffff80405780>] ? i915_gem_retire_work_handler+0x0/0x90
> [<ffffffff804057ba>] i915_gem_retire_work_handler+0x3a/0x90
> [<ffffffff80252290>] worker_thread+0x1f0/0x340
> [<ffffffff8025223d>] ? worker_thread+0x19d/0x340
> [<ffffffff80614aff>] ? _spin_unlock_irqrestore+0x3f/0x60
> [<ffffffff80256de0>] ? autoremove_wake_function+0x0/0x40
> [<ffffffff8026838d>] ? trace_hardirqs_on+0xd/0x10
> [<ffffffff802520a0>] ? worker_thread+0x0/0x340
> [<ffffffff80256a2e>] kthread+0x9e/0xb0
> [<ffffffff8020d51a>] child_rip+0xa/0x20
> [<ffffffff8020cf3c>] ? restore_args+0x0/0x30
> [<ffffffff80256990>] ? kthread+0x0/0xb0
> [<ffffffff8020d510>] ? child_rip+0x0/0x20
> 3 locks held by events/0/10:
> #0: (events){+.+.+.}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
> #1: (&(&dev_priv->mm.retire_work)->work){+.+...}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
> #2: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff804057ba>] i915_gem_retire_work_handler+0x3a/0x90

Looks like eventd blocking on X, would be good to have sysrq-w output
too, to see what X is up to (assuming it is blocked, and not spinning
like mad with a lock held).
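For anyone wanting to collect the requested dump without a console keyboard, here is a minimal sketch, assuming a kernel with magic SysRq enabled and a root shell. The helper name `trigger_sysrq` and the `SYSRQ_KEYS` table are made up for illustration; the only kernel interface used is the `/proc/sysrq-trigger` file, and the banner strings are the ones that appear in the dumps in this thread.

```python
from pathlib import Path

# SysRq command keys behind the dumps quoted in this thread,
# mapped to the banner the kernel prints for each.
SYSRQ_KEYS = {
    "d": "Show Locks Held",     # needs lock debugging (lockdep) in the kernel
    "t": "Show State",          # every task
    "w": "Show Blocked State",  # only tasks in uninterruptible (D) sleep
}


def trigger_sysrq(key, trigger=Path("/proc/sysrq-trigger")):
    """Write one SysRq command key to the trigger file; return its banner text."""
    if key not in SYSRQ_KEYS:
        raise ValueError(f"unsupported SysRq key: {key!r}")
    # Same effect as `echo w > /proc/sysrq-trigger`; needs root on a real system.
    Path(trigger).write_text(key)
    return SYSRQ_KEYS[key]
```

Calling `trigger_sysrq("w")` as root should make the kernel log the blocked tasks; the output lands in the kernel log (for example via `dmesg`), not on stdout.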
* Re: i915 X lockup
From: Jiri Slaby @ 2009-02-27 10:12 UTC
To: Peter Zijlstra; +Cc: Jiri Slaby, airlied, eric, keithp, dri-devel, Andrew Morton, Linux kernel mailing list

On 27.2.2009 11:01, Peter Zijlstra wrote:
> would be good to have sysrq-w output

There was nothing but events. So this is rather an intel driver
userspace bug?

SysRq : Show Blocked State
task PC stack pid father
events/1 D 0000000000000000 0 11 2
ffff8801cb231da0 0000000000000046 ffff8801cb231d00 ffffffff80231ef8
0000003800000010 0000000000010180 ffff880028053840 ffff880028057180
ffff8801cb20c790 ffff8801cb20ca38 00000001ca3f90e8 00000000ffff7a0e
Call Trace:
[<ffffffff80231ef8>] ? dequeue_entity+0x18/0x1a0
[<ffffffff80230b50>] ? dequeue_task+0xb0/0xf0
[<ffffffff805fca2a>] __mutex_lock_slowpath+0xea/0x170
[<ffffffff803f60a0>] ? i915_gem_retire_work_handler+0x0/0x90
[<ffffffff805fc6f6>] mutex_lock+0x26/0x50
[<ffffffff803f60d8>] i915_gem_retire_work_handler+0x38/0x90
[<ffffffff80250792>] worker_thread+0x172/0x250
[<ffffffff80254da0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff80250620>] ? worker_thread+0x0/0x250
[<ffffffff802549be>] kthread+0x9e/0xb0
[<ffffffff8020d3da>] child_rip+0xa/0x20
[<ffffffff80254920>] ? kthread+0x0/0xb0
[<ffffffff8020d3d0>] ? child_rip+0x0/0x20
* Re: i915 X lockup
From: Jiri Slaby @ 2009-02-27 10:14 UTC
To: Peter Zijlstra; +Cc: airlied, eric, keithp, dri-devel, Andrew Morton, Linux kernel mailing list

On 27.2.2009 11:12, Jiri Slaby wrote:
> So this is rather an intel driver userspace bug?

Bullshit, ignore me :).
* Re: i915 X lockup
From: Andrew Morton @ 2009-02-27 10:32 UTC
To: Jiri Slaby; +Cc: airlied, eric, keithp, dri-devel, Linux kernel mailing list

On Fri, 27 Feb 2009 10:28:51 +0100 Jiri Slaby <jirislaby@gmail.com> wrote:
> everytime I run X, it gets stuck. Currently running on mmotm
> 2009-02-26-16-58, but I think this is wider problem. I had i915 disabled
> for a long time (until I noticed today).
>
> SysRq : Show Locks Held
>
> Showing all locks held in the system:
> 3 locks held by events/0/10:
> #0: (events){+.+.+.}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
> #1: (&(&dev_priv->mm.retire_work)->work){+.+...}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
> #2: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff804057ba>] i915_gem_retire_work_handler+0x3a/0x90
> 1 lock held by mingetty/3899:
> #0: (&tty->atomic_read_lock){+.+.+.}, at: [<ffffffff803cb5de>] n_tty_read+0x48e/0x8e0
> 1 lock held by mingetty/3900:
> #0: (&tty->atomic_read_lock){+.+.+.}, at: [<ffffffff803cb5de>] n_tty_read+0x48e/0x8e0
> 1 lock held by mingetty/3901:
> #0: (&tty->atomic_read_lock){+.+.+.}, at: [<ffffffff803cb5de>] n_tty_read+0x48e/0x8e0
> 1 lock held by X/4007:
> #0: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff8040563c>] i915_gem_throttle_ioctl+0x2c/0x60
> 2 locks held by bash/4105:
> #0: (sysrq_key_table_lock){......}, at: [<ffffffff803de366>] __handle_sysrq+0x26/0x190
> #1: (tasklist_lock){.+.+..}, at: [<ffffffff80266c1f>] debug_show_all_locks+0x3f/0x1c0

I assume that i915_gem_throttle_ioctl->i915_gem_ring_throttle is stuck
in i915_wait_request(), holding struct_mutex. That of course makes
keventd block.

Perhaps i915_wait_request() is waiting for keventd to do something,
which is the deadlock. That "something" could be to simply finish its
current call to i915_gem_retire_work_handler(). But worse, it could be
some completely other keventd handler which isn't getting run, because
that keventd instance is stuck over in i915_gem_retire_work_handler().

IOW, the usual keventd problem.
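The reading of the lock dump above can be reproduced mechanically. The following is a rough sketch, not an existing tool: the function names `tasks_per_lock` and `shared_locks` are invented here, and `SAMPLE` is a shortened copy of the "Show Locks Held" output from the original report. Two caveats apply. The dump names lock classes rather than lock instances, so a shared name is only a hint (the three mingetty entries in the full dump share a class but sit on different ttys). And lockdep lists a mutex for a task as soon as the acquisition starts, so a task blocked in mutex_lock shows up next to the real owner.

```python
import re
from collections import defaultdict

# "3 locks held by events/0/10:" -> task name "events/0", pid "10"
HEADER = re.compile(r"^\d+ locks? held by (.+)/(\d+):$")
# "#2: (&dev->struct_mutex){+.+.+.}, at: ..." -> lock class "&dev->struct_mutex"
LOCK = re.compile(r"^#\d+:\s+\((.+)\)\{")

SAMPLE = """\
3 locks held by events/0/10:
 #0: (events){+.+.+.}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
 #1: (&(&dev_priv->mm.retire_work)->work){+.+...}, at: [<ffffffff8025223d>] worker_thread+0x19d/0x340
 #2: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff804057ba>] i915_gem_retire_work_handler+0x3a/0x90
1 lock held by mingetty/3899:
 #0: (&tty->atomic_read_lock){+.+.+.}, at: [<ffffffff803cb5de>] n_tty_read+0x48e/0x8e0
1 lock held by X/4007:
 #0: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff8040563c>] i915_gem_throttle_ioctl+0x2c/0x60
"""


def tasks_per_lock(dump):
    """Map each lock class name to the tasks ("comm/pid") that list it."""
    tasks = defaultdict(list)
    current = None
    for raw in dump.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            # Start of a new per-task section.
            current = f"{header.group(1)}/{header.group(2)}"
            continue
        lock = LOCK.match(line)
        if lock and current is not None:
            tasks[lock.group(1)].append(current)
    return dict(tasks)


def shared_locks(dump):
    """Keep only the lock classes that more than one task lists."""
    return {name: who for name, who in tasks_per_lock(dump).items() if len(who) > 1}


if __name__ == "__main__":
    for name, who in shared_locks(SAMPLE).items():
        print(f"{name}: {', '.join(who)}")
```

On the sample this reports `&dev->struct_mutex` for both events/0 and X, which matches the analysis above: X took the mutex in i915_gem_throttle_ioctl and the workqueue thread is queued behind it. The parser expects one lock per line, so mail-wrapped dumps need their continuation lines joined first.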
* Re: i915 X lockup
From: Sitsofe Wheeler @ 2009-02-27 13:04 UTC
To: Andrew Morton; +Cc: Jiri Slaby, airlied, eric, keithp, dri-devel, Linux kernel mailing list

On Fri, Feb 27, 2009 at 02:32:31AM -0800, Andrew Morton wrote:
> On Fri, 27 Feb 2009 10:28:51 +0100 Jiri Slaby <jirislaby@gmail.com> wrote:
>
> > everytime I run X, it gets stuck. Currently running on mmotm
> > 2009-02-26-16-58, but I think this is wider problem. I had i915 disabled
> > for a long time (until I noticed today).

Which version of X are you using? Does it support kernel modesetting? If
not, did you disable kernel modesetting in the KConfig file for i915?

--
Sitsofe | http://sucs.org/~sits/
* Re: i915 X lockup
From: Jiri Slaby @ 2009-02-27 13:49 UTC
To: Sitsofe Wheeler; +Cc: Andrew Morton, airlied, eric, keithp, dri-devel, Linux kernel mailing list

On 27.2.2009 14:04, Sitsofe Wheeler wrote:
> On Fri, Feb 27, 2009 at 02:32:31AM -0800, Andrew Morton wrote:
>> On Fri, 27 Feb 2009 10:28:51 +0100 Jiri Slaby<jirislaby@gmail.com> wrote:
>>
>>> everytime I run X, it gets stuck. Currently running on mmotm
>>> 2009-02-26-16-58, but I think this is wider problem. I had i915 disabled
>>> for a long time (until I noticed today).
>
> Which version of X are you using? Does it support kernel modesetting? If
> not, did you disable kernel modesetting in the KConfig file for i915?

xorg-x11-server-7.4-17.3
which is
X.Org X Server 1.5.2

modesetting enabled:
CONFIG_DRM_I915_KMS=y

Which X version is needed for that?
* Re: i915 X lockup
From: Sitsofe Wheeler @ 2009-02-27 23:12 UTC
To: Jiri Slaby; +Cc: Andrew Morton, airlied, eric, keithp, dri-devel, Linux kernel mailing list

On Fri, Feb 27, 2009 at 02:49:06PM +0100, Jiri Slaby wrote:
> On 27.2.2009 14:04, Sitsofe Wheeler wrote:
> >On Fri, Feb 27, 2009 at 02:32:31AM -0800, Andrew Morton wrote:
> >>On Fri, 27 Feb 2009 10:28:51 +0100 Jiri Slaby<jirislaby@gmail.com> wrote:
> >>
> >>>everytime I run X, it gets stuck. Currently running on mmotm
> >>>2009-02-26-16-58, but I think this is wider problem. I had i915 disabled
> >>>for a long time (until I noticed today).
> >
> >Which version of X are you using? Does it support kernel modesetting? If
> >not, did you disable kernel modesetting in the KConfig file for i915?
>
> xorg-x11-server-7.4-17.3
> which is
> X.Org X Server 1.5.2
>
> modesetting enabled:
> CONFIG_DRM_I915_KMS=y
>
> Which X version is needed for that?

Good question. I can see that 7.4 supports GEM but I see nothing about
kernel modesetting (
http://www.phoronix.com/scan.php?page=article&item=xorg_74_final&num=1).
I know it's enabled in the Fedora (since Fedora 9) xorgs but I have no
idea about openSUSE (which I believe is what you are using based on
package numbers). Apparently kernel modesetting can be turned off on the
kernel command line by using nomodesetting so that might be a quick
thing to try...

--
Sitsofe | http://sucs.org/~sits/
* Re: i915 X lockup
From: Eric Anholt @ 2009-02-28 0:20 UTC
To: Sitsofe Wheeler; +Cc: Jiri Slaby, Andrew Morton, airlied, keithp, dri-devel, Linux kernel mailing list

On Fri, 2009-02-27 at 23:12 +0000, Sitsofe Wheeler wrote:
> On Fri, Feb 27, 2009 at 02:49:06PM +0100, Jiri Slaby wrote:
> > On 27.2.2009 14:04, Sitsofe Wheeler wrote:
> > >On Fri, Feb 27, 2009 at 02:32:31AM -0800, Andrew Morton wrote:
> > >>On Fri, 27 Feb 2009 10:28:51 +0100 Jiri Slaby<jirislaby@gmail.com> wrote:
> > >>
> > >>>everytime I run X, it gets stuck. Currently running on mmotm
> > >>>2009-02-26-16-58, but I think this is wider problem. I had i915 disabled
> > >>>for a long time (until I noticed today).
> > >
> > >Which version of X are you using? Does it support kernel modesetting? If
> > >not, did you disable kernel modesetting in the KConfig file for i915?
> >
> > xorg-x11-server-7.4-17.3
> > which is
> > X.Org X Server 1.5.2
> >
> > modesetting enabled:
> > CONFIG_DRM_I915_KMS=y
> >
> > Which X version is needed for that?
>
> Good question. I can see that 7.4 supports GEM but I see nothing about
> kernel modesetting (
> http://www.phoronix.com/scan.php?page=article&item=xorg_74_final&num=1).
> I know it's enabled in the Fedora (since Fedora 9) xorgs but I have no
> idea about openSUSE (which I believe is what you are using based on
> package numbers). Apparently kernel modesetting can be turned off on the
> kernel command line by using nomodesetting so that might be a quick
> thing to try...

KMS support is not a feature of the server but of your 2D driver. You
want 2.6.2, or things will be bad.

--
Eric Anholt
eric@anholt.net eric.anholt@intel.com
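The rule stated above (KMS enabled in the kernel needs a 2D driver of at least 2.6.2) can be written down as a small check. This is only a sketch under assumptions: the helper names are hypothetical, the version string has to be supplied by the caller (nothing here queries a package manager), and it reads "2.6.2" as the release number of the Intel 2D driver, which is how the rest of the thread uses it.

```python
# Minimum 2D driver release quoted in the message above.
MIN_KMS_DRIVER = (2, 6, 2)


def parse_version(text):
    """Turn a dotted version string such as "2.5.0" into (2, 5, 0)."""
    return tuple(int(part) for part in text.split("."))


def kms_enabled(kconfig_text):
    """True if the given kernel .config text sets CONFIG_DRM_I915_KMS=y."""
    return any(line.strip() == "CONFIG_DRM_I915_KMS=y"
               for line in kconfig_text.splitlines())


def kms_userspace_ok(kconfig_text, driver_version):
    """False only when KMS is enabled and the 2D driver is older than required."""
    if not kms_enabled(kconfig_text):
        return True
    return parse_version(driver_version) >= MIN_KMS_DRIVER
```

For example, `kms_userspace_ok("CONFIG_DRM_I915_KMS=y\n", "2.5.0")` is False, while the same driver with the option unset passes. The check says nothing about whether a passing combination is free of other bugs.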
* Re: i915 X lockup
From: Jiri Slaby @ 2009-02-28 8:31 UTC
To: Eric Anholt; +Cc: Sitsofe Wheeler, Andrew Morton, airlied, keithp, dri-devel, Linux kernel mailing list

On 28.2.2009 01:20, Eric Anholt wrote:
> KMS support is not a feature of the server but of your 2D driver. You
> want 2.6.2, or things will be bad.

I have 2.5.0. After turning KMS off, problem seems to be solved.

Anyway, I would appreciate a version of the intel driver being in the
Kconfig text, otherwise it looks like: don't use this on machines with
installation from stone age. If one has latest stable release of a
distro, he doesn't even think he doesn't have "new enough userspace".

For reference, the text is:
Choose this option if you want kernel modesetting enabled by default,
and you have a new enough userspace to support this. Running old
userspaces with this enabled will cause pain.
* Re: i915 X lockup
From: Andrew Morton @ 2009-02-28 8:47 UTC
To: Jiri Slaby; +Cc: Eric Anholt, Sitsofe Wheeler, airlied, keithp, dri-devel, Linux kernel mailing list

On Sat, 28 Feb 2009 09:31:28 +0100 Jiri Slaby <jirislaby@gmail.com> wrote:
> On 28.2.2009 01:20, Eric Anholt wrote:
> > KMS support is not a feature of the server but of your 2D driver. You
> > want 2.6.2, or things will be bad.
>
> I have 2.5.0. After turning KMS off, problem seems to be solved.
>
> Anyway, I would appreciate a version of the intel driver being in the
> Kconfig text, otherwise it looks like: don't use this on machines with
> installation from stone age. If one has latest stable release of a
> distro, he doesn't even think he doesn't have "new enough userspace".
>
> For reference, the text is:
> Choose this option if you want kernel modesetting enabled by default,
> and you have a new enough userspace to support this. Running old
> userspaces with this enabled will cause pain.

Hang on.

The kernel deadlocked on struct_mutex, did it not? That's a kernel bug
regardless of what userspace you're running.

Do we know why this happened?
* Re: i915 X lockup
From: Eric Anholt @ 2009-02-28 9:00 UTC
To: Andrew Morton; +Cc: Jiri Slaby, Sitsofe Wheeler, airlied, keithp, dri-devel, Linux kernel mailing list

On Sat, 2009-02-28 at 00:47 -0800, Andrew Morton wrote:
> On Sat, 28 Feb 2009 09:31:28 +0100 Jiri Slaby <jirislaby@gmail.com> wrote:
>
> > On 28.2.2009 01:20, Eric Anholt wrote:
> > > KMS support is not a feature of the server but of your 2D driver. You
> > > want 2.6.2, or things will be bad.
> >
> > I have 2.5.0. After turning KMS off, problem seems to be solved.
> >
> > Anyway, I would appreciate a version of the intel driver being in the
> > Kconfig text, otherwise it looks like: don't use this on machines with
> > installation from stone age. If one has latest stable release of a
> > distro, he doesn't even think he doesn't have "new enough userspace".
> >
> > For reference, the text is:
> > Choose this option if you want kernel modesetting enabled by default,
> > and you have a new enough userspace to support this. Running old
> > userspaces with this enabled will cause pain.
>
> Hang on.
>
> The kernel deadlocked on struct_mutex, did it not? That's a kernel bug
> regardless of what userspace you're running.
>
> Do we know why this happened?

Userland went stomping all over the device state that the kernel thinks
it controls since you went and turned on the KMS option asserting "I'm
not going to run old userland", so the GPU got hung, and further
software using the GPU hung, and then somebody waiting for someone else
finishing using the GPU (struct_mutex) got hung.

The only proposal to prevent it is to use the "don't let userland map my
PCI device any more" support we now have available to us, which would
make X fail early on. The unfortunate side-effect of that is that we
lose the ability to run incredibly useful userland debug tools that do
read-only access to registers. We're moving bits of those into debugfs
for 2.6.30, but it's work that's not done even for the tools we have
today.

--
Eric Anholt
eric@anholt.net eric.anholt@intel.com
* Re: i915 X lockup
From: Bruno Prémont @ 2009-02-28 18:24 UTC
To: Eric Anholt; +Cc: Andrew Morton, Jiri Slaby, Sitsofe Wheeler, airlied, keithp, dri-devel, Linux kernel mailing list

On Sat, 28 February 2009 Eric Anholt <eric@anholt.net> wrote:
> On Sat, 2009-02-28 at 00:47 -0800, Andrew Morton wrote:
> > The kernel deadlocked on struct_mutex, did it not? That's a kernel
> > bug regardless of what userspace you're running.
> >
> > Do we know why this happened?
>
> Userland went stomping all over the device state that the kernel
> thinks it controls since you went and turned on the KMS option
> asserting "I'm not going to run old userland", so the GPU got hung,
> and further software using the GPU hung, and then somebody waiting
> for someone else finishing using the GPU (struct_mutex) got hung.
>
> The only proposal to prevent it is to use the "don't let userland map
> my PCI device any more" support we now have available to us, which
> would make X fail early on. The unfortunate side-effect of that is
> that we lose the ability to run incredibly useful userland debug
> tools that do read-only access to registers. We're moving bits of
> those into debugfs for 2.6.30, but it's work that's not done even for
> the tools we have today.

I also see Xorg lock up. I'm running xf86-video-intel-2.6.1 (2.6.2 was
released only very recently) and a kernel with KMS disabled (KMS does
not seem capable of configuring the framebuffer properly; at least, the
display remains black).

For me it's a deadlock between the intel driver dispatching GEM requests
and events (see log below). Each time it happened I was interacting with
a web browser, though it does not happen very often.

Connecting to the notebook via ssh, I can kill Xorg with kill -KILL
(kill -TERM does not work). Doing this leaves the GPU at least confused,
though to get vesafb working again it is sufficient to start Xorg with
the vesa driver (running with the intel driver again before rebooting
leads to a locked X and the need to retry the killing).

Bruno

Graphics card:
00:02.0 VGA compatible controller: Intel Corporation 82852/855GM Integrated Graphics Device (rev 02) (prog-if 00 [VGA controller])
        Subsystem: Acer Incorporated [ALI] Device 0035
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx+
        Latency: 0
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at e8000000 (32-bit, prefetchable) [size=128M]
        Region 1: Memory at e0000000 (32-bit, non-prefetchable) [size=512K]
        Region 2: I/O ports at 1800 [size=8]
        Capabilities: [d0] Power Management version 1
                Flags: PMEClk- DSI+ D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00:02.1 Display controller: Intel Corporation 82852/855GM Integrated Graphics Device (rev 02)
        Subsystem: Acer Incorporated [ALI] Device 0035
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Region 0: Memory at f0000000 (32-bit, prefetchable) [size=128M]
        Region 1: Memory at e0080000 (32-bit, non-prefetchable) [size=512K]
        Capabilities: [d0] Power Management version 1
                Flags: PMEClk- DSI+ D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

drm debug output around lock time + SysRq-T task traces:
Feb 28 18:12:16 [kernel] [27945.376131] [drm:drm_agp_bind_pages]
Feb 28 18:12:16 [kernel] [27945.376162] [drm:i915_add_request] 3688298
Feb 28 18:12:16 [kernel] [27945.376169] [drm:i915_add_request] 3688299
Feb 28 18:12:16 [kernel] [27945.376181] [drm:drm_ioctl] pid=6673,
cmd=0xc0086457, nr=0x57, dev 0xe200, auth=1 Feb 28 18:12:16 [kernel] [27945.376192] [drm:drm_ioctl] pid=6673, cmd=0x400c645f, nr=0x5f, dev 0xe200, auth=1 Feb 28 18:12:16 [kernel] [27945.376204] [drm:drm_ioctl] pid=6673, cmd=0x6458, nr=0x58, dev 0xe200, auth=1 Feb 28 18:12:16 [kernel] [27945.390764] [drm:drm_ioctl] ret = fffffe00 Feb 28 18:12:16 [kernel] [27945.390831] [drm:drm_ioctl] pid=6673, cmd=0x6458, nr=0x58, dev 0xe200, auth=1 Feb 28 18:12:16 [kernel] [27945.403653] [drm:drm_ioctl] ret = fffffe00 Feb 28 18:12:16 [kernel] [27945.403697] [drm:drm_ioctl] pid=6673, cmd=0x6458, nr=0x58, dev 0xe200, auth=1 Feb 28 18:12:16 [kernel] [27945.416451] [drm:drm_ioctl] ret = fffffe00 Feb 28 18:12:16 [kernel] [27945.416500] [drm:drm_ioctl] pid=6673, cmd=0x6458, nr=0x58, dev 0xe200, auth=1 Feb 28 18:12:16 [kernel] [27945.430630] [drm:drm_ioctl] ret = fffffe00 Feb 28 18:12:16 [kernel] [27945.430677] [drm:drm_ioctl] pid=6673, cmd=0x6458, nr=0x58, dev 0xe200, auth=1 ... (same message repeating lots of time ...) Feb 28 18:12:56 [kernel] [27986.140232] [drm:drm_ioctl] ret = fffffe00 Feb 28 18:12:56 [kernel] [27986.140274] [drm:drm_ioctl] pid=6673, cmd=0x6458, nr=0x58, dev 0xe200, auth=1 Feb 28 18:16:00 [kernel] [28169.605072] SysRq : Show State Feb 28 18:16:00 [kernel] [28169.605085] task PC stack pid father Feb 28 18:16:00 [kernel] [28169.605092] init S 00000000 1188 1 0 Feb 28 18:16:00 [kernel] [28169.605104] dd814ae4 00000086 004c4b3e 00000000 00000296 dd82a000 00000286 00000000 Feb 28 18:16:00 [kernel] [28169.605120] 004c4b3e 004c4b3e 00000000 dd814b4c c036b4b7 004c4b3e 00000000 da6d3dfc Feb 28 18:16:00 [kernel] [28169.605134] dd2d1b94 da785b94 908c73bf 0000199f 90402881 0000199f c0133610 c046e72c Feb 28 18:16:00 [kernel] [28169.605149] Call Trace: Feb 28 18:16:00 [kernel] [28169.605170] [<c036b4b7>] schedule_hrtimeout_range+0xb7/0x100 Feb 28 18:16:00 [kernel] [28169.605185] [<c0133610>] ? hrtimer_wakeup+0x0/0x20 Feb 28 18:16:00 [kernel] [28169.605195] [<c036b4a0>] ? 
schedule_hrtimeout_range+0xa0/0x100 Feb 28 18:16:00 [kernel] [28169.605207] [<c0175b80>] poll_schedule_timeout+0x30/0x60 Feb 28 18:16:00 [kernel] [28169.605216] [<c01762b8>] do_select+0x448/0x5b0 Feb 28 18:16:00 [kernel] [28169.605226] [<c0175c50>] ? __pollwait+0x0/0xd0 Feb 28 18:16:00 [kernel] [28169.605234] [<c0175d20>] ? pollwake+0x0/0x60 Feb 28 18:16:00 [kernel] [28169.605245] [<c01e30e1>] ? cfq_insert_request+0x31/0x390 Feb 28 18:16:00 [kernel] [28169.605256] [<c01d8f76>] ? elv_insert+0x116/0x180 Feb 28 18:16:00 [kernel] [28169.605266] [<c01d9b4f>] ? part_round_stats+0x3f/0x50 Feb 28 18:16:00 [kernel] [28169.605274] [<c01d9049>] ? __elv_add_request+0x69/0xb0 Feb 28 18:16:00 [kernel] [28169.605284] [<c01db466>] ? __make_request+0xa6/0x300 Feb 28 18:16:00 [kernel] [28169.605294] [<c01489fe>] ? mempool_alloc_slab+0xe/0x10 Feb 28 18:16:00 [kernel] [28169.605303] [<c01489fe>] ? mempool_alloc_slab+0xe/0x10 Feb 28 18:16:00 [kernel] [28169.605311] [<c0148adc>] ? mempool_alloc+0x2c/0xc0 Feb 28 18:16:00 [kernel] [28169.605323] [<c028be95>] ? scsi_pool_alloc_command+0x45/0x70 Feb 28 18:16:00 [kernel] [28169.605334] [<c0290467>] ? scsi_init_sgtable+0x47/0xa0 Feb 28 18:16:00 [kernel] [28169.605343] [<c0291fc0>] ? scsi_sg_alloc+0x0/0x50 Feb 28 18:16:00 [kernel] [28169.605351] [<c0290674>] ? scsi_init_io+0x14/0xa0 Feb 28 18:16:00 [kernel] [28169.605361] [<c014a6ca>] ? __rmqueue+0x9a/0x1a0 Feb 28 18:16:00 [kernel] [28169.605371] [<c0178b39>] ? __d_lookup+0xa9/0xe0 Feb 28 18:16:00 [kernel] [28169.605383] [<c0171f40>] ? do_lookup+0x60/0x190 Feb 28 18:16:00 [kernel] [28169.605392] [<c0163c8d>] ? shmem_permission+0xd/0x10 Feb 28 18:16:00 [kernel] [28169.605402] [<c0176bca>] core_sys_select+0x1ba/0x2e0 Feb 28 18:16:00 [kernel] [28169.605412] [<c0170cf0>] ? path_put+0x20/0x30 Feb 28 18:16:00 [kernel] [28169.605421] [<c0172b4b>] ? path_walk+0x4b/0x90 Feb 28 18:16:00 [kernel] [28169.605431] [<c01ea1e7>] ? 
__copy_to_user_ll+0x57/0x60
Feb 28 18:16:00 [kernel] [28169.605440] [<c01ea61e>] ? copy_to_user+0x3e/0x60
Feb 28 18:16:00 [kernel] [28169.605449] [<c016cc8b>] ? cp_new_stat64+0xeb/0x100
Feb 28 18:16:00 [kernel] [28169.605460] [<c0115fed>] ? read_hpet+0xd/0x20
Feb 28 18:16:00 [kernel] [28169.605470] [<c0136f9b>] ? getnstimeofday+0x4b/0x110
Feb 28 18:16:00 [kernel] [28169.605482] [<c0123997>] ? timespec_add_safe+0x27/0x50
Feb 28 18:16:00 [kernel] [28169.605491] [<c0176e9c>] sys_select+0x2c/0xb0
Feb 28 18:16:00 [kernel] [28169.605501] [<c01030c5>] sysenter_do_call+0x12/0x25
Feb 28 18:16:00 [kernel] [28169.605508] kthreadd S 00000286 3384 2 0
Feb 28 18:16:00 [kernel] [28169.605522] dd81dfc0 00000046 da996bd8 00000286 000008b1 dd82a330 dd81dfc0 00000286
Feb 28 18:16:00 [kernel] [28169.605536] c046e6d0 000008b1 da996bbc dd81dfe0 c0130a25 00000000 00000001 00000001
Feb 28 18:16:00 [kernel] [28169.605550] c01309a0 00000000 00000000 00000000 c01037df 00000000 00000000 00000000
Feb 28 18:16:00 [kernel] [28169.605563] Call Trace:
Feb 28 18:16:00 [kernel] [28169.605573] [<c0130a25>] kthreadd+0x85/0x120
Feb 28 18:16:00 [kernel] [28169.605582] [<c01309a0>] ? kthreadd+0x0/0x120
Feb 28 18:16:00 [kernel] [28169.605590] [<c01037df>] kernel_thread_helper+0x7/0x18
Feb 28 18:16:00 [kernel] [28169.605596] ksoftirqd/0 S 00000246 3764 3 2
Feb 28 18:16:00 [kernel] [28169.605611] dd81ffc0 00000046 00000000 00000246 c0124040 dd82a660 00000246 dd81ffb8
Feb 28 18:16:00 [kernel] [28169.605625] 00000000 00000000 c0124040 dd81ffcc c01240bc fffffffc dd81ffe0 c0130972
Feb 28 18:16:00 [kernel] [28169.605639] c0130930 00000000 00000000 00000000 c01037df dd814f00 00000000 00000000
Feb 28 18:16:00 [kernel] [28169.605652] Call Trace:
Feb 28 18:16:00 [kernel] [28169.605661] [<c0124040>] ? ksoftirqd+0x0/0xb0
Feb 28 18:16:00 [kernel] [28169.605671] [<c0124040>] ? ksoftirqd+0x0/0xb0
Feb 28 18:16:00 [kernel] [28169.605680] [<c01240bc>] ksoftirqd+0x7c/0xb0
Feb 28 18:16:00 [kernel] [28169.605688] [<c0130972>] kthread+0x42/0x70
Feb 28 18:16:00 [kernel] [28169.605697] [<c0130930>] ? kthread+0x0/0x70
Feb 28 18:16:00 [kernel] [28169.605705] [<c01037df>] kernel_thread_helper+0x7/0x18
Feb 28 18:16:00 [kernel] [28169.605711] events/0 D c011b912 2496 4 2
Feb 28 18:16:00 [kernel] [28169.605724] dd830f34 00000046 dd830f34 c011b912 dd8c0cc0 dd82a990 2ad00b71 000007af
Feb 28 18:16:00 [kernel] [28169.605738] dd914810 ffffffff dd914814 dd830f58 c036b22e dd82a990 dd914814 dd914814
Feb 28 18:16:00 [kernel] [28169.605752] dd82a990 dd914810 dd822000 dd822ea8 dd830f68 c036b199 dd8e9828 dd914800
Feb 28 18:16:00 [kernel] [28169.605766] Call Trace:
Feb 28 18:16:00 [kernel] [28169.605797] [<c036b199>] mutex_lock+0x19/0x20
Feb 28 18:16:00 [kernel] [28169.605808] [<c0271998>] i915_gem_retire_work_handler+0x28/0x70
Feb 28 18:16:00 [kernel] [28169.605818] [<c0271970>] ? i915_gem_retire_work_handler+0x0/0x70
Feb 28 18:16:00 [kernel] [28169.605827] [<c012d937>] run_workqueue+0x67/0xe0
Feb 28 18:16:00 [kernel] [28169.605835] [<c012dbf7>] worker_thread+0x97/0xf0
Feb 28 18:16:00 [kernel] [28169.605845] [<c0130cf0>] ? autoremove_wake_function+0x0/0x50
Feb 28 18:16:00 [kernel] [28169.605854] [<c012db60>] ? worker_thread+0x0/0xf0
Feb 28 18:16:00 [kernel] [28169.605862] [<c0130972>] kthread+0x42/0x70
Feb 28 18:16:00 [kernel] [28169.605871] [<c0130930>] ? kthread+0x0/0x70
Feb 28 18:16:00 [kernel] [28169.605879] [<c01037df>] kernel_thread_helper+0x7/0x18
Feb 28 18:16:00 [kernel] [28169.605885] khelper S c012d937 3300 5 2
Feb 28 18:16:00 [kernel] [28169.605899] dd831fa4 00000046 dd831fa4 c012d937 dd845980 dd82acc0 c0464180 00000246
Feb 28 18:16:00 [kernel] [28169.605913] dd808208 dd808200 dd831fac dd831fcc c012dc27 00000000 dd82acc0 c0130cf0
Feb 28 18:16:00 [kernel] [28169.605927] dd808208 dd808208 fffffffc dd808200 c012db60 dd831fe0 c0130972 c0130930
Feb 28 18:16:00 [kernel] [28169.605941] Call Trace:
Feb 28 18:16:00 [kernel] [28169.605948] [<c012d937>] ? run_workqueue+0x67/0xe0
Feb 28 18:16:00 [kernel] [28169.605957] [<c012dc27>] worker_thread+0xc7/0xf0
Feb 28 18:16:00 [kernel] [28169.605966] [<c0130cf0>] ? autoremove_wake_function+0x0/0x50
Feb 28 18:16:00 [kernel] [28169.605975] [<c012db60>] ? worker_thread+0x0/0xf0
Feb 28 18:16:00 [kernel] [28169.605983] [<c0130972>] kthread+0x42/0x70
Feb 28 18:16:00 [kernel] [28169.605992] [<c0130930>] ? kthread+0x0/0x70
Feb 28 18:16:00 [kernel] [28169.606000] [<c01037df>] kernel_thread_helper+0x7/0x18
(... skipping over non-related processes ...)
Feb 28 18:16:00 [kernel] [28169.614590] agetty S c0489794 1552 6664 1
Feb 28 18:16:00 [kernel] [28169.614590] da94aea4 00000082 c1029ba0 c0489794 00000292 dd0ddcb0 00000292 00000001
Feb 28 18:16:00 [kernel] [28169.614590] 7fffffff 00000000 da8a5800 da94aeec c036adb5 00000202 da94aeb8 00000046
Feb 28 18:16:00 [kernel] [28169.614590] da8a5800 da94aedc 00000246 00000246 000091c3 00000000 da8a5800 da8a5800
Feb 28 18:16:00 [kernel] [28169.614590] Call Trace:
Feb 28 18:16:00 [kernel] [28169.614590] [<c036adb5>] schedule_timeout+0x75/0xc0
Feb 28 18:16:00 [kernel] [28169.614590] [<c024264b>] n_tty_read+0x1ab/0x5a0
Feb 28 18:16:00 [kernel] [28169.614590] [<c011d490>] ? default_wake_function+0x0/0x10
Feb 28 18:16:00 [kernel] [28169.614590] [<c0240156>] tty_read+0x76/0xb0
Feb 28 18:16:00 [kernel] [28169.614590] [<c02424a0>] ? n_tty_read+0x0/0x5a0
Feb 28 18:16:00 [kernel] [28169.614590] [<c016a000>] vfs_read+0x90/0x110
Feb 28 18:16:00 [kernel] [28169.614590] [<c02400e0>] ? tty_read+0x0/0xb0
Feb 28 18:16:00 [kernel] [28169.614590] [<c016a4fd>] sys_read+0x3d/0x70
Feb 28 18:16:00 [kernel] [28169.614590] [<c01030c5>] sysenter_do_call+0x12/0x25
Feb 28 18:16:00 [kernel] [28169.614590] [<c0360000>] ? __inet6_lookup_established+0x2e0/0x4d0
Feb 28 18:16:00 [kernel] [28169.614590] Xorg S 000223f2 1588 6673 1
Feb 28 18:16:00 [kernel] [28169.614590] dd082e74 00003082 00006d52 000223f2 00000000 dd8c0cc0 00000007 00003246
Feb 28 18:16:00 [kernel] [28169.614590] 00000000 dd914800 0038476b dd082ebc c0271891 5d343732 ffff0020 00000000
Feb 28 18:16:00 [kernel] [28169.614590] 80000000 0000bffc 00000000 dd822084 dd822000 00000000 dd8c0cc0 c0130cf0
Feb 28 18:16:00 [kernel] [28169.614590] Call Trace:
Feb 28 18:16:00 [kernel] [28169.614590] [<c0271891>] i915_wait_request+0x141/0x1a0
Feb 28 18:16:00 [kernel] [28169.614590] [<c0130cf0>] ? autoremove_wake_function+0x0/0x50
Feb 28 18:16:00 [kernel] [28169.614590] [<c0271925>] i915_gem_throttle_ioctl+0x35/0x50
Feb 28 18:16:00 [kernel] [28169.614590] [<c025e0f0>] drm_ioctl+0xe0/0x2b0
Feb 28 18:16:00 [kernel] [28169.614590] [<c02718f0>] ? i915_gem_throttle_ioctl+0x0/0x50
Feb 28 18:16:00 [kernel] [28169.614590] [<c025e010>] ? drm_ioctl+0x0/0x2b0
Feb 28 18:16:00 [kernel] [28169.614590] [<c0174b27>] vfs_ioctl+0x67/0x70
Feb 28 18:16:00 [kernel] [28169.614590] [<c01750fa>] do_vfs_ioctl+0x1fa/0x520
Feb 28 18:16:00 [kernel] [28169.614590] [<c0109751>] ? restore_i387_xstate+0xd1/0x1d0
Feb 28 18:16:00 [kernel] [28169.614590] [<c0115fed>] ? read_hpet+0xd/0x20
Feb 28 18:16:00 [kernel] [28169.614590] [<c0136f9b>] ? getnstimeofday+0x4b/0x110
Feb 28 18:16:00 [kernel] [28169.614590] [<c0102518>] ? restore_sigcontext+0xf8/0x120
Feb 28 18:16:00 [kernel] [28169.614590] [<c0175459>] sys_ioctl+0x39/0x60
Feb 28 18:16:00 [kernel] [28169.614590] [<c01030c5>] sysenter_do_call+0x12/0x25
Feb 28 18:16:00 [kernel] [28169.614590] enlightenment S 0001579d 1544 6681 1
Feb 28 18:16:00 [kernel] [28169.614590] dd23eb84 00000082 c01763e8 0001579d dd23ee68 da5e9cb0 c4e2db00 00000000
Feb 28 18:16:00 [kernel] [28169.614590] 00000000 00000000 00000000 dd23ebec c036b4df 00000000 00000001 00000000
Feb 28 18:16:00 [kernel] [28169.614590] dd23ee70 dd23ee74 dd23ee78 dd23ee68 dd23ee6c dd23ee70 50008ab0 00000246
Feb 28 18:16:00 [kernel] [28169.614590] Call Trace:
Feb 28 18:16:00 [kernel] [28169.614590] [<c01763e8>] ? do_select+0x578/0x5b0
Feb 28 18:16:00 [kernel] [28169.614590] [<c036b4df>] schedule_hrtimeout_range+0xdf/0x100
Feb 28 18:16:00 [kernel] [28169.614590] [<c0175cbf>] ? __pollwait+0x6f/0xd0
Feb 28 18:16:00 [kernel] [28169.614590] [<c0175b80>] poll_schedule_timeout+0x30/0x60
Feb 28 18:16:00 [kernel] [28169.614590] [<c0176662>] do_sys_poll+0x242/0x3e0
Feb 28 18:16:00 [kernel] [28169.614590] [<c0175c50>] ? __pollwait+0x0/0xd0
Feb 28 18:16:00 [kernel] [28169.614590] [<c0175d20>] ? pollwake+0x0/0x60
- Last output repeated 2 times -
Feb 28 18:16:00 [kernel] [28169.614590] [<c014a6ca>] ? __rmqueue+0x9a/0x1a0
Feb 28 18:16:00 [kernel] [28169.614590] [<c014ad2b>] ? get_page_from_freelist+0x3db/0x450
Feb 28 18:16:00 [kernel] [28169.614590] [<c01489fe>] ? mempool_alloc_slab+0xe/0x10
Feb 28 18:16:00 [kernel] [28169.614590] [<c014b194>] ? __alloc_pages_internal+0x94/0x400
Feb 28 18:16:00 [kernel] [28169.614590] [<c014b51c>] ? __get_free_pages+0x1c/0x40
Feb 28 18:16:00 [kernel] [28169.614590] [<c016741b>] ? __kmalloc_track_caller+0xbb/0xe0
Feb 28 18:16:00 [kernel] [28169.614590] [<c02daf8b>] ? __alloc_skb+0x4b/0x100
Feb 28 18:16:00 [kernel] [28169.614590] [<c036b18e>] ? mutex_lock+0xe/0x20
Feb 28 18:16:00 [kernel] [28169.614590] [<c0334cea>] ? unix_stream_recvmsg+0x1ea/0x450
Feb 28 18:16:00 [kernel] [28169.614590] [<c02d6b36>] ? sock_aio_read+0xe6/0x110
Feb 28 18:16:00 [kernel] [28169.614590] [<c0169e7c>] ? do_sync_read+0xcc/0x110
Feb 28 18:16:00 [kernel] [28169.614590] [<c033464e>] ? unix_ioctl+0x9e/0xd0
Feb 28 18:16:00 [kernel] [28169.614590] [<c0130cf0>] ? autoremove_wake_function+0x0/0x50
Feb 28 18:16:00 [kernel] [28169.614590] [<c016a07b>] ? vfs_read+0x10b/0x110
Feb 28 18:16:00 [kernel] [28169.614590] [<c0176964>] sys_poll+0x54/0xb0
Feb 28 18:16:00 [kernel] [28169.614590] [<c01030c5>] sysenter_do_call+0x12/0x25
Feb 28 18:16:00 [kernel] [28169.614590] [<c0360000>] ? __inet6_lookup_established+0x2e0/0x4d0
(... skipping over unrelated processes ...)

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: i915 X lockup
2009-02-28 18:24 ` Bruno Prémont
@ 2009-02-28 19:57 ` Eric Anholt
0 siblings, 0 replies; 15+ messages in thread
From: Eric Anholt @ 2009-02-28 19:57 UTC (permalink / raw)
To: Bruno Prémont
Cc: Andrew Morton, Jiri Slaby, Sitsofe Wheeler, airlied, keithp, dri-devel, Linux kernel mailing list

[-- Attachment #1: Type: text/plain, Size: 2415 bytes --]

On Sat, 2009-02-28 at 19:24 +0100, Bruno Prémont wrote:
> On Sat, 28 February 2009 Eric Anholt <eric@anholt.net> wrote:
> > On Sat, 2009-02-28 at 00:47 -0800, Andrew Morton wrote:
> > > The kernel deadlocked on struct_mutex, did it not? That's a kernel
> > > bug regardless of what userspace you're running.
> > >
> > > Do we know why this happened?
> >
> > Userland went stomping all over the device state that the kernel
> > thinks it controls since you went and turned on the KMS option
> > asserting "I'm not going to run old userland", so the GPU got hung,
> > and further software using the GPU hung, and then somebody waiting
> > for someone else finishing using the GPU (struct_mutex) got hung.
> >
> > The only proposal to prevent it is to use the "don't let userland map
> > my PCI device any more" support we now have available to us, which
> > would make X fail early on. The unfortunate side-effect of that is
> > that we lose the ability to run incredibly useful userland debug
> > tools that do read-only access to registers. We're moving bits of
> > those into debugfs for 2.6.30, but it's work that's not done even for
> > the tools we have today.
>
> I also saw/see Xorg lockup...
>
> I'm running xf86-video-intel-2.6.1 (2.6.2 released only very recently)
> and kernel with KMS disabled (KMS not capable of getting framebuffer
> properly configured it seems, at least display remains black)
>
> For me it's a deadlock between intel driver dispatching GEM requests
> and events (see log below).
> For me each time it happened was while interacting with a web browser,
> though it does not happen very often.
>
> Connecting to the notebook via ssh I can kill -KILL Xorg (kill -TERM
> does not work).
> When doing this GPU gets at least confused though in order to get
> vesafb back working it's sufficient to start Xorg with vesa driver
> (running with intel driver before rebooting leads to locked X and
> need to retry the killing)

You have a completely different problem. Your GPU is hung in doing
something that should work. 8xx support is notoriously bad right now,
and we haven't managed to get the developer time on fixing it that we
need, though I keep pushing for it. My only 855 has a dead disk, or I
probably would have poked at it by now.

--
Eric Anholt
eric@anholt.net eric.anholt@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: i915 X lockup
2009-02-28 8:47 ` Andrew Morton
2009-02-28 9:00 ` Eric Anholt
@ 2009-02-28 17:11 ` Keith Packard
1 sibling, 0 replies; 15+ messages in thread
From: Keith Packard @ 2009-02-28 17:11 UTC (permalink / raw)
To: Andrew Morton
Cc: Keith Packard, Jiri Slaby, Eric Anholt, Sitsofe Wheeler, airlied, dri-devel, Linux kernel mailing list

[-- Attachment #1: Type: text/plain, Size: 929 bytes --]

On Sat, 2009-02-28 at 00:47 -0800, Andrew Morton wrote:
> The kernel deadlocked on struct_mutex, did it not? That's a kernel bug
> regardless of what userspace you're running.

No, it didn't deadlock on struct_mutex, it deadlocked because the
hardware got wedged, and we still don't know how to unwedge the hardware
and get it working again other than turning it off and back on again.

> Do we know why this happened?

Yes, the hardware will happily lock up when user space maps the PCI BAR
covering the device registers and the application pokes various internal
device registers directly. That's the fundamental contract KMS requires
-- if the kernel is going to manage the device, then user space isn't
supposed to manipulate it directly anymore.

I suspect most any other device in the machine could be made to do 'bad
things' if userspace went and poked it directly.

--
keith.packard@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 15+ messages in thread
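
[Editorial sketch, not part of the original thread.] The blocking pattern discussed in this thread can be imitated in user space. The traces show Xorg sleeping in i915_wait_request under i915_gem_throttle_ioctl while events/0 sits in mutex_lock under i915_gem_retire_work_handler. The Python below is only an illustration under stated assumptions: it assumes, as the lock dump at the top of the thread suggests, that the ioctl path holds struct_mutex while it waits for the GPU. The names simulate, throttle_ioctl, gpu_done and waiter_ready are hypothetical, and the waits are bounded by timeouts so the demo terminates, which the real hang did not.

```python
import threading


def simulate(gpu_responds, wait_timeout=0.5, lock_timeout=0.1):
    """Return True if the 'retire' side could take the mutex while the
    'throttle' side was still inside its critical section."""
    struct_mutex = threading.Lock()    # stands in for dev->struct_mutex
    gpu_done = threading.Event()       # set when the simulated GPU completes
    waiter_ready = threading.Event()   # set once the waiter holds the mutex
    if gpu_responds:
        gpu_done.set()

    def throttle_ioctl():
        # Models Xorg: take the mutex, then wait for an outstanding request.
        with struct_mutex:
            waiter_ready.set()
            # A wedged GPU never signals; the timeout only ends the demo.
            gpu_done.wait(timeout=wait_timeout)

    waiter = threading.Thread(target=throttle_ioctl)
    waiter.start()
    waiter_ready.wait()

    # Models events/0: the retire work handler needs the same mutex.
    got_lock = struct_mutex.acquire(timeout=lock_timeout)
    if got_lock:
        struct_mutex.release()
    waiter.join()
    return got_lock


if __name__ == "__main__":
    print("GPU healthy, retire handler got mutex:", simulate(True))
    print("GPU wedged,  retire handler got mutex:", simulate(False))
```

With a responsive "GPU" the second thread gets the mutex almost immediately; with a wedged one it times out. This matches the point made above: the mutex is only where the waiters pile up, and the missing completion from the hardware is the actual cause.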
end of thread, other threads:[~2009-02-28 19:57 UTC | newest]

Thread overview: 15+ messages
2009-02-27 9:28 i915 X lockup Jiri Slaby
2009-02-27 10:01 ` Peter Zijlstra
2009-02-27 10:12 ` Jiri Slaby
2009-02-27 10:14 ` Jiri Slaby
2009-02-27 10:32 ` Andrew Morton
2009-02-27 13:04 ` Sitsofe Wheeler
2009-02-27 13:49 ` Jiri Slaby
2009-02-27 23:12 ` Sitsofe Wheeler
2009-02-28 0:20 ` Eric Anholt
2009-02-28 8:31 ` Jiri Slaby
2009-02-28 8:47 ` Andrew Morton
2009-02-28 9:00 ` Eric Anholt
2009-02-28 18:24 ` Bruno Prémont
2009-02-28 19:57 ` Eric Anholt
2009-02-28 17:11 ` Keith Packard