public inbox for linux-kernel@vger.kernel.org
* radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec
From: Borislav Petkov @ 2012-12-22 20:35 UTC
  To: Alex Deucher; +Cc: dri-devel, lkml

Hi Alex,

got the sickest bug on 3.8-rc1, see below. The GPU locks up somewhere
down radeon_fence_wait_seq, judging by the error messages.

And this doesn't happen with 3.7, of course.

Let me know if you need any more info, thanks.

[16273.668350] radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec
[16273.668361] radeon 0000:02:00.0: GPU lockup (waiting for 0x000000000000002b last fence id 0x000000000000002a)
[16273.882550] plugin-containe[11435]: segfault at 7f1f0a66cc08 ip 00007f1f13289bdb sp 00007f1f0a2fe9e0 error 4 in libflashplayer.so[7f1f130c5000+117b000]
[16274.502807] ------------[ cut here ]------------
[16274.502845] WARNING: at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
[16274.502880] Hardware name:  
[16274.502897] list_del corruption, ffff8802216a3f10->next is LIST_POISON1 (dead000000100100)
[16274.502939] Modules linked in: nls_iso8859_15 nls_cp437 acpi_cpufreq mperf cpufreq_powersave cpufreq_userspace cpufreq_conservative cpufreq_stats binfmt_misc dm_crypt dm_mod ipv6 vfat fat fuse kvm_amd kvm radeon drm_kms_helper ttm edac_core microcode cfbfillrect cfbimgblt cfbcopyarea k10temp
[16274.503141] Pid: 17386, comm: Xorg Not tainted 3.8.0-rc1 #13
[16274.503172] Call Trace:
[16274.503190]  [<ffffffff8124bd00>] ? __list_del_entry+0x60/0xd0
[16274.503224]  [<ffffffff8103b2cf>] warn_slowpath_common+0x7f/0xc0
[16274.503257]  [<ffffffff8103b3c6>] warn_slowpath_fmt+0x46/0x50
[16274.503289]  [<ffffffff8124bd03>] __list_del_entry+0x63/0xd0
[16274.503320]  [<ffffffff8124bd81>] list_del+0x11/0x40
[16274.503348]  [<ffffffff812fa00e>] drm_mm_remove_node+0x9e/0xd0
[16274.503383]  [<ffffffff812fa065>] drm_mm_put_block+0x25/0x70
[16274.503422]  [<ffffffffa003dd71>] ? ttm_bo_man_put_node+0x31/0x60 [ttm]
[16274.503464]  [<ffffffffa003dd79>] ttm_bo_man_put_node+0x39/0x60 [ttm]
[16274.503503]  [<ffffffffa0036790>] ttm_bo_cleanup_memtype_use+0x80/0xb0 [ttm]
[16274.503545]  [<ffffffffa0037a3b>] ttm_bo_release+0x1fb/0x270 [ttm]
[16274.503585]  [<ffffffffa0037ae1>] ttm_bo_unref+0x31/0x40 [ttm]
[16274.503656]  [<ffffffffa00a04c7>] radeon_bo_unref+0x47/0x80 [radeon]
[16274.503707]  [<ffffffffa00b2929>] radeon_gem_object_free+0x39/0x40 [radeon]
[16274.503748]  [<ffffffff812f0e59>] drm_gem_object_free+0x29/0x30
[16274.503781]  [<ffffffff812f1248>] drm_gem_object_release_handle+0xb8/0xd0
[16274.503819]  [<ffffffff8122f97d>] idr_for_each+0xdd/0x180
[16274.503850]  [<ffffffff812f1190>] ? drm_gem_handle_create+0x100/0x100
[16274.503887]  [<ffffffff81099e0d>] ? trace_hardirqs_on+0xd/0x10
[16274.503920]  [<ffffffff812f17c4>] drm_gem_release+0x24/0x40
[16274.503952]  [<ffffffff812efffa>] drm_release+0x54a/0x5e0
[16274.503984]  [<ffffffff8106eed2>] ? lg_local_unlock+0x42/0x70
[16274.504016]  [<ffffffff8113afe2>] __fput+0xb2/0x240
[16274.504044]  [<ffffffff8113b22e>] ____fput+0xe/0x10
[16274.504073]  [<ffffffff81061765>] task_work_run+0xb5/0xd0
[16274.504105]  [<ffffffff8104144a>] do_exit+0x23a/0xac0
[16274.504135]  [<ffffffff81041e8c>] do_group_exit+0x4c/0xc0
[16274.504167]  [<ffffffff8105453d>] get_signal_to_deliver+0x22d/0x960
[16274.504202]  [<ffffffff81001a9f>] do_signal+0x3f/0x5a0
[16274.504233]  [<ffffffff8123a36e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[16274.504269]  [<ffffffff8100205d>] do_notify_resume+0x5d/0x90
[16274.504300]  [<ffffffff8123a36e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[16274.504337]  [<ffffffff8151d6c8>] int_signal+0x12/0x17
[16274.504366] ---[ end trace 4aad5b52e5533e3e ]---
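The "->next is LIST_POISON1" warning above is what the list debugging code prints when list_del() runs on an entry that has already been unlinked, because list_del() overwrites the entry's pointers with poison values. The following is only a simplified Python model of that check, not the kernel code: the `Node` class and the returned message string are illustrative assumptions, and the poison constants are the x86_64 values as I understand them for kernels of this era (they match the dead000000100100 printed in the log).

```python
# Poison values; assumed x86_64 definitions for kernels around 3.8.
POISON_POINTER_DELTA = 0xdead000000000000
LIST_POISON1 = 0x00100100 + POISON_POINTER_DELTA
LIST_POISON2 = 0x00200200 + POISON_POINTER_DELTA


class Node:
    """Hypothetical stand-in for struct list_head (circular, doubly linked)."""

    def __init__(self, name):
        self.name = name
        self.next = self
        self.prev = self


def list_add(new, head):
    """Insert `new` right after `head`."""
    nxt = head.next
    nxt.prev = new
    new.next = nxt
    new.prev = head
    head.next = new


def list_del(entry):
    """Unlink `entry` and poison its pointers.

    Returns a warning string instead of unlinking when the entry already
    carries poison, loosely mirroring the debug check; returns None on
    a clean delete.
    """
    if entry.next == LIST_POISON1:
        return (f"list_del corruption, {entry.name}->next is "
                f"LIST_POISON1 ({LIST_POISON1:x})")
    if entry.prev == LIST_POISON2:
        return (f"list_del corruption, {entry.name}->prev is "
                f"LIST_POISON2 ({LIST_POISON2:x})")
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    # Poison the pointers so any later use of this entry is detectable.
    entry.next = LIST_POISON1
    entry.prev = LIST_POISON2
    return None


if __name__ == "__main__":
    head, node = Node("head"), Node("node")
    list_add(node, head)
    print(list_del(node))  # first delete is clean
    print(list_del(node))  # second delete trips the poison check
```

In this model the first delete succeeds and the second one reports the poison value, which is the same pattern as the warning in the trace: the drm_mm node being removed appears to have been taken off its list once before.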


-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

* Re: radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec
From: Borislav Petkov @ 2013-01-10  9:38 UTC
  To: Alex Deucher; +Cc: dri-devel, lkml

[ Deliberately breaking the thread because it got too long. ]

On Sat, Dec 22, 2012 at 09:35:47PM +0100, Borislav Petkov wrote:
> Hi Alex,
> 
> got the sickest bug on 3.8-rc1, see below. The GPU locks up somewhere
> down radeon_fence_wait_seq, judging by the error messages.
> 
> And this doesn't happen with 3.7, of course.
> 
> Let me know if you need any more info, thanks.
> 
> [16273.668350] radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec
> [16273.668361] radeon 0000:02:00.0: GPU lockup (waiting for 0x000000000000002b last fence id 0x000000000000002a)
> [16273.882550] plugin-containe[11435]: segfault at 7f1f0a66cc08 ip 00007f1f13289bdb sp 00007f1f0a2fe9e0 error 4 in libflashplayer.so[7f1f130c5000+117b000]
> [16274.502807] ------------[ cut here ]------------
> [16274.502845] WARNING: at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()

Ok, this got fixed by 909d9eb67f1e4e39f2ea88e96bde03d560cde3eb, which is
upstream now. I'm testing -rc2+, which already contains this patch,
plus tip/master and another fix from Alan which reworks fb console
locking (should be unrelated). With that, the machine gets unresponsive
for a couple of seconds and then it is fine again.

See dmesg below: the GPU gets the same lockup CP stall, but without the
list corruption, so it recovers fine. I didn't have those stalls before,
though, so it has to be something which came in with the 3.8 merge window.

[44730.749380] radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec
[44730.749391] radeon 0000:02:00.0: GPU lockup (waiting for 0x0000000000305211 last fence id 0x0000000000305210)
[44730.750596] radeon 0000:02:00.0: Saved 25 dwords of commands on ring 0.
[44730.750612] radeon 0000:02:00.0: GPU softreset: 0x00000007
[44730.768865] radeon 0000:02:00.0:   R_008010_GRBM_STATUS      = 0xA0003030
[44730.768874] radeon 0000:02:00.0:   R_008014_GRBM_STATUS2     = 0x00000003
[44730.768880] radeon 0000:02:00.0:   R_000E50_SRBM_STATUS      = 0x200000C0
[44730.768885] radeon 0000:02:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[44730.768889] radeon 0000:02:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[44730.768894] radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00020184
[44730.768898] radeon 0000:02:00.0:   R_008680_CP_STAT          = 0x80028645
[44730.768903] radeon 0000:02:00.0:   R_008020_GRBM_SOFT_RESET=0x00007FEE
[44730.783898] radeon 0000:02:00.0: R_008020_GRBM_SOFT_RESET=0x00000001
[44730.798893] radeon 0000:02:00.0:   R_008010_GRBM_STATUS      = 0xA0003030
[44730.798896] radeon 0000:02:00.0:   R_008014_GRBM_STATUS2     = 0x00000003
[44730.798899] radeon 0000:02:00.0:   R_000E50_SRBM_STATUS      = 0x200080C0
[44730.798901] radeon 0000:02:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[44730.798904] radeon 0000:02:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[44730.798907] radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[44730.798909] radeon 0000:02:00.0:   R_008680_CP_STAT          = 0x80100000
[44730.819926] radeon 0000:02:00.0: GPU reset succeeded, trying to resume
[44730.836763] [drm] probing gen 2 caps for device 10de:377 = 1/0
[44730.839732] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
[44730.839826] radeon 0000:02:00.0: WB enabled
[44730.839831] radeon 0000:02:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff880220223c00
[44730.839834] radeon 0000:02:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff880220223c0c
[44730.871080] [drm] ring test on 0 succeeded in 0 usecs
[44730.871140] [drm] ring test on 3 succeeded in 1 usecs
[44730.871187] [drm] ib test on ring 0 succeeded in 0 usecs
[44730.871206] [drm] ib test on ring 3 succeeded in 1 usecs
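Both reports in this thread contain a "GPU lockup (waiting for ... last fence id ...)" line, and in both the awaited fence is exactly one past the last signaled one. A minimal sketch for pulling those lines out of a saved dmesg and comparing the two sequence numbers might look like the following. The function name, the dictionary keys, and the regular expression are my own assumptions based only on the message format shown in these logs; other kernel versions may word the message differently.

```python
import re

# Matches the radeon fence-wait message as it appears in the logs in this
# thread; the exact wording is an assumption taken from those lines.
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+radeon\s+(?P<dev>\S+):\s+GPU lockup "
    r"\(waiting for 0x(?P<want>[0-9a-fA-F]+) "
    r"last fence id 0x(?P<last>[0-9a-fA-F]+)\)"
)


def find_lockups(dmesg_text):
    """Return one dict per 'GPU lockup (waiting for ...)' line found."""
    events = []
    for line in dmesg_text.splitlines():
        m = LOCKUP_RE.search(line)
        if m is None:
            continue
        want = int(m.group("want"), 16)
        last = int(m.group("last"), 16)
        events.append({
            "timestamp": float(m.group("ts")),
            "device": m.group("dev"),
            "waiting_for": want,
            "last_signaled": last,
            # How many fences the ring is behind the one being waited on.
            "fences_behind": want - last,
        })
    return events


# The two fence-wait lines quoted in this thread.
SAMPLE = """\
[16273.668361] radeon 0000:02:00.0: GPU lockup (waiting for 0x000000000000002b last fence id 0x000000000000002a)
[44730.749391] radeon 0000:02:00.0: GPU lockup (waiting for 0x0000000000305211 last fence id 0x0000000000305210)
"""

if __name__ == "__main__":
    for event in find_lockups(SAMPLE):
        print(event)
```

Because the pattern uses `search` rather than `match`, it also picks the message out of quoted lines that start with "> ".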

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Thread overview: 40+ messages
2012-12-22 20:35 radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec Borislav Petkov
2012-12-23  0:01 ` Alex Deucher
2012-12-23  0:25   ` Borislav Petkov
2012-12-23  0:42     ` Alex Deucher
2012-12-23 10:55       ` Borislav Petkov
2012-12-23 11:01         ` Andy Furniss
2012-12-23 11:07           ` Borislav Petkov
2012-12-23 11:19             ` Andy Furniss
2012-12-23 11:31               ` Borislav Petkov
2012-12-23 11:51                 ` Markus Trippelsdorf
2012-12-23 12:22                   ` Borislav Petkov
2012-12-23 13:31                     ` Borislav Petkov
2012-12-25  4:50                       ` Shuah Khan
2012-12-25 10:54                         ` Borislav Petkov
2013-01-02  1:42                         ` Antti Palosaari
2013-01-02 12:02                           ` Borislav Petkov
2013-01-02 17:19                             ` Jerome Glisse
2013-01-02 17:58                               ` Antti Palosaari
2013-01-02 22:31                                 ` Jerome Glisse
2013-01-02 22:38                                   ` Markus Trippelsdorf
2013-01-02 23:37                                     ` Alex Deucher
2013-01-02 23:58                                       ` Shuah Khan
2013-01-02 23:59                                         ` Alex Deucher
2013-01-03  1:03                                           ` Antti Palosaari
2013-01-03  1:05                                           ` Shuah Khan
2013-01-03  8:33                                       ` Markus Trippelsdorf
2013-01-03 11:37                                       ` Boszormenyi Zoltan
2013-01-03 14:12                                         ` Deucher, Alexander
2013-01-03 15:30                                           ` Shuah Khan
2013-01-04  7:40                                       ` Borislav Petkov
2013-01-04 11:16                                         ` Boszormenyi Zoltan
2013-01-04 14:06                                           ` Alex Deucher
2012-12-23 11:52           ` Joe Perches
  -- strict thread matches above, loose matches on Subject: below --
2013-01-10  9:38 Borislav Petkov
2013-01-10 16:21 ` Alex Deucher
2013-01-10 20:32   ` Borislav Petkov
2013-01-10 20:47     ` Alex Deucher
2013-01-11 11:43       ` Borislav Petkov
2013-01-15 12:19         ` Borislav Petkov
2013-01-15 14:04           ` Alex Deucher
