* i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Greg KH @ 2013-01-08 22:36 UTC
To: Chris Wilson, daniel.vetter; +Cc: intel-gfx, linux-kernel, Jesse Barnes

Hi all,

I've hit this 3 times today on Linus's latest 3.8-rc2+ tree:

[11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[11868.414655] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
[11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
[11883.083225] gnome-shell[19396]: segfault at 218 ip 00007feef5f32333 sp 00007ffffc1dc930 error 4 in i965_dri.so[7feef5ecb000+d0000]

When it happens, gnome-shell dies a horrible death and it requires a
reboot in order to get xorg working properly again (probably because
gnome-shell is hosed.)  The machine does still work to do other things
from a text console (I'm writing this on the machine after the last
time this happened.)

It seems to happen when doing a "stressful" thing on the machine (i.e.
multiple kernel builds at the same time).

I also seem to be able to hit this on 3.7.1, but not as regularly, and
not at all on 3.6.y.

Any hints or ideas of what to try out?

thanks,

greg k-h

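Editor's sketch (not part of the original message): the log excerpt above has a recognizable shape, repeated "Hangcheck timer elapsed" lines followed by "declaring wedged". Assuming kernel log text in the "[timestamp] message" form shown, a minimal Python scan for that signature could look like the following. The function name `summarize_hangs` and the returned dictionary layout are hypothetical, chosen only for illustration.

```python
import re

# Kernel log lines look like "[11868.414648] message"; capture timestamp and text.
LOG_LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def summarize_hangs(log_text):
    """Collect hangcheck timestamps and note whether the GPU was declared wedged."""
    hang_times = []
    wedged = False
    for line in log_text.splitlines():
        match = LOG_LINE_RE.match(line.strip())
        if not match:
            continue  # not a timestamped kernel log line
        timestamp, message = float(match.group(1)), match.group(2)
        if "Hangcheck timer elapsed" in message:
            hang_times.append(timestamp)
        elif "declaring wedged" in message:
            wedged = True
    return {"hang_times": hang_times, "wedged": wedged}


# Sample input copied from the report above.
SAMPLE_LOG = """\
[11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[11868.414655] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
[11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
"""

if __name__ == "__main__":
    print(summarize_hangs(SAMPLE_LOG))
```

Feeding it the output of `dmesg` instead of the sample string is the obvious use; the two substrings it matches are taken verbatim from the log above and may differ in other kernel versions.
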
* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Greg KH @ 2013-01-09 0:38 UTC
To: Chris Wilson, daniel.vetter; +Cc: intel-gfx, linux-kernel, Jesse Barnes

[-- Attachment #1: Type: text/plain, Size: 852 bytes --]

On Tue, Jan 08, 2013 at 02:36:11PM -0800, Greg KH wrote:
> Hi all,
>
> I've hit this 3 times today on Linus's latest 3.8-rc2+ tree:
>
> [11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
> [11868.414655] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
> [11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
> [11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
> [11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
> [11883.083225] gnome-shell[19396]: segfault at 218 ip 00007feef5f32333 sp 00007ffffc1dc930 error 4 in i965_dri.so[7feef5ecb000+d0000]

I just hit this again.  And, as the kernel was asking for it, attached
is the i915_error_state file, compressed due to the size of it.

thanks,

greg k-h

[-- Attachment #2: i915_error_state.gz --]
[-- Type: application/x-gzip, Size: 200230 bytes --]

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Dave Airlie @ 2013-01-09 3:42 UTC
To: Greg KH; +Cc: Chris Wilson, daniel.vetter, intel-gfx, linux-kernel, Jesse Barnes

>> Hi all,
>>
>> I've hit this 3 times today on Linus's latest 3.8-rc2+ tree:
>>
>> [11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
>> [11868.414655] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
>> [11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
>> [11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
>> [11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
>> [11883.083225] gnome-shell[19396]: segfault at 218 ip 00007feef5f32333 sp 00007ffffc1dc930 error 4 in i965_dri.so[7feef5ecb000+d0000]
>
> I just hit this again. And, as the kernel was asking for it, attached
> is the i915_error_state file, compressed due to the size of it.
>

Welcome to the sink hole that is
https://bugs.freedesktop.org/show_bug.cgi?id=55984

3 months and ticking, Intel guys are all running away from it saying
they can't reproduce, everyone else on the planet seems to reproduce quite
easily.

It's generally considered a bug in the relocation/shrinker/no idea category,

Assuming you have an Ironlake machine which I'm going to guess you do.

Dave.

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Greg KH @ 2013-01-09 4:25 UTC
To: Dave Airlie
Cc: Chris Wilson, daniel.vetter, intel-gfx, linux-kernel, Jesse Barnes

On Wed, Jan 09, 2013 at 01:42:39PM +1000, Dave Airlie wrote:
> >> Hi all,
> >>
> >> I've hit this 3 times today on Linus's latest 3.8-rc2+ tree:
> >>
> >> [11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
> >> [11868.414655] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
> >> [11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
> >> [11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
> >> [11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
> >> [11883.083225] gnome-shell[19396]: segfault at 218 ip 00007feef5f32333 sp 00007ffffc1dc930 error 4 in i965_dri.so[7feef5ecb000+d0000]
> >
> > I just hit this again. And, as the kernel was asking for it, attached
> > is the i915_error_state file, compressed due to the size of it.
> >
> Welcome to the sink hole that is
> https://bugs.freedesktop.org/show_bug.cgi?id=55984
>
> 3 months and ticking, Intel guys are all running away from it saying
> they can't reproduce, everyone else on the planet seems to reproduce quite
> easily.
>
> It's generally considered a bug in the relocation/shrinker/no idea category,

Ugh, what a mess.

> Assuming you have an Ironlake machine which I'm going to guess you do.

I don't know, it's an old i5 machine that has never had any video
problems for many years now.  How do I tell?

thanks,

greg k-h

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Dave Airlie @ 2013-01-09 5:31 UTC
To: Greg KH; +Cc: Chris Wilson, daniel.vetter, intel-gfx, linux-kernel, Jesse Barnes

On Wed, Jan 9, 2013 at 2:25 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Wed, Jan 09, 2013 at 01:42:39PM +1000, Dave Airlie wrote:
>> >> Hi all,
>> >>
>> >> I've hit this 3 times today on Linus's latest 3.8-rc2+ tree:
>> >>
>> >> [11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
>> >> [11868.414655] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
>> >> [11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
>> >> [11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
>> >> [11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
>> >> [11883.083225] gnome-shell[19396]: segfault at 218 ip 00007feef5f32333 sp 00007ffffc1dc930 error 4 in i965_dri.so[7feef5ecb000+d0000]
>> >
>> > I just hit this again. And, as the kernel was asking for it, attached
>> > is the i915_error_state file, compressed due to the size of it.
>> >
>> Welcome to the sink hole that is
>> https://bugs.freedesktop.org/show_bug.cgi?id=55984
>>
>> 3 months and ticking, Intel guys are all running away from it saying
>> they can't reproduce, everyone else on the planet seems to reproduce quite
>> easily.
>>
>> It's generally considered a bug in the relocation/shrinker/no idea category,
>
> Ugh, what a mess.
>
>> Assuming you have an Ironlake machine which I'm going to guess you do.
>
> I don't know, it's an old i5 machine that has never had any video
> problems for many years now. How do I tell?

lspci -nn probably an 8086:0046 device.

Old i5 probably means original i5 which means ironlake.

Dave.

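Editor's sketch (not part of the original message): the check suggested above amounts to looking for the PCI ID 8086:0046 in `lspci -nn` output. A minimal Python version, assuming output is already captured as text, is shown below. The function name `find_ironlake_devices` is hypothetical, the ID set contains only the single ID mentioned in this thread (other Ironlake IDs are not covered), and the second sample line is made up for illustration.

```python
import re

# The only vendor:device pair named in this thread as "probably" Ironlake.
IRONLAKE_IDS = {("8086", "0046")}

# Matches the "[vendor:device]" field that `lspci -nn` appends to each line.
# The class code field, e.g. "[0300]", has no colon and is not matched.
PCI_ID_RE = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def find_ironlake_devices(lspci_output):
    """Return the `lspci -nn` lines whose PCI ID is listed in IRONLAKE_IDS."""
    hits = []
    for line in lspci_output.splitlines():
        for vendor, device in PCI_ID_RE.findall(line):
            if (vendor.lower(), device.lower()) in IRONLAKE_IDS:
                hits.append(line.strip())
                break  # one hit per line is enough
    return hits


# First line: the lspci -nn format for the device in question.
# Second line: an invented device, included only as a non-matching example.
SAMPLE_LSPCI = """\
00:02.0 VGA compatible controller [0300]: Intel Corporation Core Processor Integrated Graphics Controller [8086:0046] (rev 02)
00:1f.2 SATA controller [0106]: Example Vendor Example Device [1234:5678]
"""

if __name__ == "__main__":
    for hit in find_ironlake_devices(SAMPLE_LSPCI):
        print(hit)
```
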
* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Lijo Antony @ 2013-01-09 7:28 UTC
To: Dave Airlie
Cc: Greg KH, Chris Wilson, daniel.vetter, intel-gfx, linux-kernel, Jesse Barnes

On 01/09/2013 09:31 AM, Dave Airlie wrote:
> On Wed, Jan 9, 2013 at 2:25 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
>> On Wed, Jan 09, 2013 at 01:42:39PM +1000, Dave Airlie wrote:
>>>>> Hi all,
>>>>>
>>>>> I've hit this 3 times today on Linus's latest 3.8-rc2+ tree:
>>>>>
>>>>> [11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
>>>>> [11868.414655] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
>>>>> [11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
>>>>> [11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
>>>>> [11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
>>>>> [11883.083225] gnome-shell[19396]: segfault at 218 ip 00007feef5f32333 sp 00007ffffc1dc930 error 4 in i965_dri.so[7feef5ecb000+d0000]
>>>>
>>>> I just hit this again. And, as the kernel was asking for it, attached
>>>> is the i915_error_state file, compressed due to the size of it.
>>>>
>>> Welcome to the sink hole that is
>>> https://bugs.freedesktop.org/show_bug.cgi?id=55984
>>>
>>> 3 months and ticking, Intel guys are all running away from it saying
>>> they can't reproduce, everyone else on the planet seems to reproduce quite
>>> easily.
>>>
>>> It's generally considered a bug in the relocation/shrinker/no idea category,
>>
>> Ugh, what a mess.
>>
>>> Assuming you have an Ironlake machine which I'm going to guess you do.
>>
>> I don't know, it's an old i5 machine that has never had any video
>> problems for many years now. How do I tell?
>
> lspci -nn probably an 8086:0046 device.
>
> Old i5 probably means original i5 which means ironlake.
>

I have also seen this a couple of times on 3.7 and 3.8-rc1.
Most of the time I was watching youtube video in chrome. Nothing
crashed though (I am not running gnome shell). System recovered after a few
seconds.

I didn't see this on 3.8-rc2 yet, probably because I haven't watched any
video.

-lijo

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Dave Kleikamp @ 2013-01-09 19:44 UTC
To: Lijo Antony
Cc: Dave Airlie, Greg KH, Chris Wilson, daniel.vetter, intel-gfx, linux-kernel, Jesse Barnes

On 01/09/2013 01:28 AM, Lijo Antony wrote:
> On 01/09/2013 09:31 AM, Dave Airlie wrote:
>> On Wed, Jan 9, 2013 at 2:25 PM, Greg KH <gregkh@linuxfoundation.org>
>> wrote:
>>> On Wed, Jan 09, 2013 at 01:42:39PM +1000, Dave Airlie wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I've hit this 3 times today on Linus's latest 3.8-rc2+ tree:
>>>>>>
>>>>>> [11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
>>>>>> elapsed... GPU hung
>>>>>> [11868.414655] [drm] capturing error event; look for more
>>>>>> information in /debug/dri/0/i915_error_state
>>>>>> [11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
>>>>>> elapsed... GPU hung
>>>>>> [11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast,
>>>>>> declaring wedged!
>>>>>> [11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
>>>>>> [11883.083225] gnome-shell[19396]: segfault at 218 ip
>>>>>> 00007feef5f32333 sp 00007ffffc1dc930 error 4 in
>>>>>> i965_dri.so[7feef5ecb000+d0000]
>>>>>
>>>>> I just hit this again. And, as the kernel was asking for it, attached
>>>>> is the i915_error_state file, compressed due to the size of it.
>>>>>
>>>> Welcome to the sink hole that is
>>>> https://bugs.freedesktop.org/show_bug.cgi?id=55984
>>>>
>>>> 3 months and ticking, Intel guys are all running away from it saying
>>>> they can't reproduce, everyone else on the planet seems to reproduce quite
>>>> easily.
>>>>
>>>> It's generally considered a bug in the relocation/shrinker/no idea
>>>> category,
>>>
>>> Ugh, what a mess.
>>>
>>>> Assuming you have an Ironlake machine which I'm going to guess you do.
>>>
>>> I don't know, it's an old i5 machine that has never had any video
>>> problems for many years now. How do I tell?
>>
>> lspci -nn probably an 8086:0046 device.
>>
>> Old i5 probably means original i5 which means ironlake.
>>
>
> I have also seen this a couple of times on 3.7 and 3.8-rc1.
> Most of the time I was watching youtube video in chrome. Nothing
> crashed though (I am not running gnome shell). System recovered after a few
> seconds.
>
> I didn't see this on 3.8-rc2 yet, probably because I haven't watched any
> video.

I can easily reproduce it running glxgears on 3.8-rc1 or 3.8-rc2.

00:02.0 VGA compatible controller [0300]: Intel Corporation Core
Processor Integrated Graphics Controller [8086:0046] (rev 02)

Thinkpad T410

Shaggy

>
> -lijo
>
>

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Dave Kleikamp @ 2013-01-09 20:12 UTC
Cc: Lijo Antony, Dave Airlie, Greg KH, Chris Wilson, daniel.vetter, intel-gfx, linux-kernel, Jesse Barnes

On 01/09/2013 01:44 PM, Dave Kleikamp wrote:
>
> I can easily reproduce it running glxgears on 3.8-rc1 or 3.8-rc2.
>
> 00:02.0 VGA compatible controller [0300]: Intel Corporation Core
> Processor Integrated Graphics Controller [8086:0046] (rev 02)
>
> Thinkpad T410
>
> Shaggy

Daniel's patch:

drm/i915: Revert shrinker changes from "Track unbound pages"

fixes the problem for me.

Thanks,
Shaggy

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Greg KH @ 2013-01-09 21:08 UTC
To: Dave Kleikamp
Cc: Lijo Antony, Dave Airlie, Chris Wilson, daniel.vetter, intel-gfx, linux-kernel, Jesse Barnes

On Wed, Jan 09, 2013 at 02:12:04PM -0600, Dave Kleikamp wrote:
> On 01/09/2013 01:44 PM, Dave Kleikamp wrote:
> >
> > I can easily reproduce it running glxgears on 3.8-rc1 or 3.8-rc2.
> >
> > 00:02.0 VGA compatible controller [0300]: Intel Corporation Core
> > Processor Integrated Graphics Controller [8086:0046] (rev 02)
> >
> > Thinkpad T410
> >
> > Shaggy
>
> Daniel's patch:
>
> drm/i915: Revert shrinker changes from "Track unbound pages"
>
> fixes the problem for me.

Thanks for the hint, I'll go try that right now...

greg k-h

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Greg KH @ 2013-01-10 0:40 UTC
To: Dave Kleikamp
Cc: Lijo Antony, Dave Airlie, Chris Wilson, daniel.vetter, intel-gfx, linux-kernel, Jesse Barnes

On Wed, Jan 09, 2013 at 02:12:04PM -0600, Dave Kleikamp wrote:
> On 01/09/2013 01:44 PM, Dave Kleikamp wrote:
> >
> > I can easily reproduce it running glxgears on 3.8-rc1 or 3.8-rc2.
> >
> > 00:02.0 VGA compatible controller [0300]: Intel Corporation Core
> > Processor Integrated Graphics Controller [8086:0046] (rev 02)
> >
> > Thinkpad T410
> >
> > Shaggy
>
> Daniel's patch:
>
> drm/i915: Revert shrinker changes from "Track unbound pages"
>
> fixes the problem for me.

After an afternoon of multiple kernel builds and other stressful things,
it looks like it fixes it for me as well.  Chris, this will be going to
Linus soon, right?

thanks,

greg k-h

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Chris Wilson @ 2013-01-10 1:07 UTC
To: Greg KH, Dave Kleikamp
Cc: Lijo Antony, Dave Airlie, daniel.vetter, intel-gfx, linux-kernel, Jesse Barnes

On Wed, 9 Jan 2013 16:40:25 -0800, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Wed, Jan 09, 2013 at 02:12:04PM -0600, Dave Kleikamp wrote:
> > On 01/09/2013 01:44 PM, Dave Kleikamp wrote:
> > >
> > > I can easily reproduce it running glxgears on 3.8-rc1 or 3.8-rc2.
> > >
> > > 00:02.0 VGA compatible controller [0300]: Intel Corporation Core
> > > Processor Integrated Graphics Controller [8086:0046] (rev 02)
> > >
> > > Thinkpad T410
> > >
> > > Shaggy
> >
> > Daniel's patch:
> >
> > drm/i915: Revert shrinker changes from "Track unbound pages"
> >
> > fixes the problem for me.
>
> After an afternoon of multiple kernel builds and other stressful things,
> it looks like it fixes it for me as well. Chris, this will be going to
> Linus soon, right?

Daniel will send it on. I hope before he does so, he will clarify the
changelog to note that it is just papering over the issue. If the
conjecture is right, it will not prevent that path from triggering the
hang, nor does it prevent other eviction paths from potentially causing
the same issue.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Dave Airlie @ 2013-01-10 1:19 UTC
To: Chris Wilson
Cc: Greg KH, Dave Kleikamp, Lijo Antony, daniel.vetter, intel-gfx, linux-kernel, Jesse Barnes

On Thu, Jan 10, 2013 at 11:07 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Wed, 9 Jan 2013 16:40:25 -0800, Greg KH <gregkh@linuxfoundation.org> wrote:
>> On Wed, Jan 09, 2013 at 02:12:04PM -0600, Dave Kleikamp wrote:
>> > On 01/09/2013 01:44 PM, Dave Kleikamp wrote:
>> > >
>> > > I can easily reproduce it running glxgears on 3.8-rc1 or 3.8-rc2.
>> > >
>> > > 00:02.0 VGA compatible controller [0300]: Intel Corporation Core
>> > > Processor Integrated Graphics Controller [8086:0046] (rev 02)
>> > >
>> > > Thinkpad T410
>> > >
>> > > Shaggy
>> >
>> > Daniel's patch:
>> >
>> > drm/i915: Revert shrinker changes from "Track unbound pages"
>> >
>> > fixes the problem for me.
>>
>> After an afternoon of multiple kernel builds and other stressful things,
>> it looks like it fixes it for me as well. Chris, this will be going to
>> Linus soon, right?
>
> Daniel will send it on. I hope before he does so, he will clarify the
> changelog to note that it is just papering over the issue. If the
> conjecture is right, it will not prevent that path from triggering the
> hang, nor does it prevent other eviction paths from potentially causing
> the same issue.

In this case, since the issue was papered over in all the kernels up
until 3.7, I think repapering is the answer for now.

I have a novel idea: maybe someone could spend some time working out
what is broken in private on a test box instead of making everyone who
runs 3.7 and 3.8 on ILK deal with it.

I of course know this won't happen and I'll be reverting patches from
you guys that cause Ironlake flakiness forever.

Dave.

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Nikola Pajkovsky @ 2013-01-11 17:26 UTC
To: Dave Kleikamp
Cc: Lijo Antony, Dave Airlie, Greg KH, Chris Wilson, daniel.vetter, intel-gfx, linux-kernel, Jesse Barnes

Dave Kleikamp <dave.kleikamp@oracle.com> writes:

> On 01/09/2013 01:44 PM, Dave Kleikamp wrote:
>>
>> I can easily reproduce it running glxgears on 3.8-rc1 or 3.8-rc2.
>>
>> 00:02.0 VGA compatible controller [0300]: Intel Corporation Core
>> Processor Integrated Graphics Controller [8086:0046] (rev 02)
>>
>> Thinkpad T410
>>
>> Shaggy
>
> Daniel's patch:
>
> drm/i915: Revert shrinker changes from "Track unbound pages"
>
> fixes the problem for me.

bug still kicking even w/ (drm/i915: Revert shrinker changes from "Track
unbound pages")

$ glxgears
[ 429.656459] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 429.656463] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[ 429.665762] [drm:kick_ring] *ERROR* Kicking stuck wait on render ring

--
Nikola

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Daniel Vetter @ 2013-01-11 18:42 UTC
To: Nikola Pajkovsky
Cc: Dave Kleikamp, Lijo Antony, Dave Airlie, Greg KH, Chris Wilson, intel-gfx, linux-kernel, Jesse Barnes

On Fri, Jan 11, 2013 at 6:26 PM, Nikola Pajkovsky <npajkovs@redhat.com> wrote:
> bug still kicking even w/ (drm/i915: Revert shrinker changes from "Track
> unbound pages")

Could be a different bug, can you please attach the error_state somewhere?
-Daniel

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Nikola Pajkovsky @ 2013-01-14 6:58 UTC
To: Daniel Vetter
Cc: Dave Kleikamp, Lijo Antony, Dave Airlie, Greg KH, Chris Wilson, intel-gfx, linux-kernel, Jesse Barnes

[-- Attachment #1: Type: text/plain, Size: 412 bytes --]

Daniel Vetter <daniel.vetter@ffwll.ch> writes:

> On Fri, Jan 11, 2013 at 6:26 PM, Nikola Pajkovsky <npajkovs@redhat.com> wrote:
>> bug still kicking even w/ (drm/i915: Revert shrinker changes from "Track
>> unbound pages")
>
> Could be a different bug, can you please attach the error_state somewhere?

yep, i915_error_state is attached. btw, I'm going to bisect kernel, so
hopefully I will bring some commit.

[-- Attachment #2: i915_error_state --]
[-- Type: application/octet-stream, Size: 278740 bytes --]

[-- Attachment #3: Type: text/plain, Size: 12 bytes --]

--
Nikola

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Daniel Vetter @ 2013-01-14 9:06 UTC
To: Nikola Pajkovsky
Cc: Dave Kleikamp, Lijo Antony, Dave Airlie, Greg KH, Chris Wilson, intel-gfx, linux-kernel, Jesse Barnes

On Mon, Jan 14, 2013 at 7:58 AM, Nikola Pajkovsky <npajkovs@redhat.com> wrote:
> Daniel Vetter <daniel.vetter@ffwll.ch> writes:
>
>> On Fri, Jan 11, 2013 at 6:26 PM, Nikola Pajkovsky <npajkovs@redhat.com> wrote:
>>> bug still kicking even w/ (drm/i915: Revert shrinker changes from "Track
>>> unbound pages")
>>
>> Could be a different bug, can you please attach the error_state somewhere?
>
> yep, i915_error_state is attached. btw, I'm going to bisect kernel, so
> hopefully I will bring some commit.

Different bug, on a quick look this could be a dupe of
https://bugzilla.kernel.org/show_bug.cgi?id=52311

Chris should know the details.
-Daniel

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Nikola Pajkovsky @ 2013-01-14 9:49 UTC
To: Daniel Vetter
Cc: Dave Kleikamp, Lijo Antony, Dave Airlie, Greg KH, Chris Wilson, intel-gfx, linux-kernel, Jesse Barnes

Daniel Vetter <daniel.vetter@ffwll.ch> writes:

> On Mon, Jan 14, 2013 at 7:58 AM, Nikola Pajkovsky <npajkovs@redhat.com> wrote:
>> Daniel Vetter <daniel.vetter@ffwll.ch> writes:
>>
>>> On Fri, Jan 11, 2013 at 6:26 PM, Nikola Pajkovsky <npajkovs@redhat.com> wrote:
>>>> bug still kicking even w/ (drm/i915: Revert shrinker changes from "Track
>>>> unbound pages")
>>>
>>> Could be a different bug, can you please attach the error_state somewhere?
>>
>> yep, i915_error_state is attached. btw, I'm going to bisect kernel, so
>> hopefully I will bring some commit.
>
> Different bug, on a quick look this could be a dupe of
> https://bugzilla.kernel.org/show_bug.cgi?id=52311

ok

> Chris should know the details.

thanks, bisection leads me to commit d7d4eed ("drm/i915: Allow
DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers"). It's not
possible to simply revert/test the commit and I have no idea how i915 works.

Chris, any ideas?

--
Nikola

* Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)
From: Chris Wilson @ 2013-01-14 12:47 UTC
To: Nikola Pajkovsky, Daniel Vetter
Cc: Dave Kleikamp, Lijo Antony, Dave Airlie, Greg KH, intel-gfx, linux-kernel, Jesse Barnes

On Mon, 14 Jan 2013 10:49:08 +0100, Nikola Pajkovsky <npajkovs@redhat.com> wrote:
> Daniel Vetter <daniel.vetter@ffwll.ch> writes:
>
> > On Mon, Jan 14, 2013 at 7:58 AM, Nikola Pajkovsky <npajkovs@redhat.com> wrote:
> >> Daniel Vetter <daniel.vetter@ffwll.ch> writes:
> >>
> >>> On Fri, Jan 11, 2013 at 6:26 PM, Nikola Pajkovsky <npajkovs@redhat.com> wrote:
> >>>> bug still kicking even w/ (drm/i915: Revert shrinker changes from "Track
> >>>> unbound pages")
> >>>
> >>> Could be a different bug, can you please attach the error_state somewhere?
> >>
> >> yep, i915_error_state is attached. btw, I'm going to bisect kernel, so
> >> hopefully I will bring some commit.
> >
> > Different bug, on a quick look this could be a dupe of
> > https://bugzilla.kernel.org/show_bug.cgi?id=52311
>
> ok
>
> > Chris should know the details.
>
> thanks, bisection leads me to commit d7d4eed ("drm/i915: Allow
> DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers"). It's not
> possible to simply revert/test the commit and I have no idea how i915 works.
>
> Chris, any ideas?

Userspace is failing to prepare the GPU to execute a WAIT_FOR_EVENT
command, which it can only try if the kernel allows execution of
privileged batch buffers.

Option "SwapbuffersWait" "false" in xorg.conf will prevent the ddx
from issuing the hanging command sequence. It is not clear yet what the
missing ingredient is; I suspect the ddx needs to be more careful about
not setting conditions that can never be met.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

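Editor's sketch (not part of the original message): the workaround named above is a single option in the Device section of xorg.conf. A minimal fragment might look like the following. Only the Option line comes from the message; the Identifier string is a placeholder, and Driver "intel" is an assumption that the ddx in question is the xf86-video-intel driver.

```
Section "Device"
    # Placeholder name; any identifier already used in your config works.
    Identifier "Intel Graphics"
    # Assumed driver name for the Intel ddx.
    Driver     "intel"
    # Workaround from the message above: do not wait on swapbuffers.
    Option     "SwapbuffersWait" "false"
EndSection
```

As the message notes, this avoids the hanging command sequence rather than fixing the underlying cause, so it is best treated as a temporary setting.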