* GPU lockup CP stall for more than 10000msec on latest vanilla git
@ 2012-12-17 18:27 Markus Trippelsdorf
  2012-12-17 21:32 ` Alex Deucher
  0 siblings, 1 reply; 20+ messages in thread
From: Markus Trippelsdorf @ 2012-12-17 18:27 UTC (permalink / raw)
  To: dri-devel

As soon as I open the following website:
http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html

my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:


Dec 17 17:41:39 x4 kernel: [drm] Initialized drm 1.1.0 20060810
Dec 17 17:41:39 x4 kernel: [drm] radeon defaulting to kernel modesetting.
Dec 17 17:41:39 x4 kernel: [drm] radeon kernel modesetting enabled.
Dec 17 17:41:39 x4 kernel: [drm] initializing kernel modesetting (RS780 0x1002:0x9614 0x1043:0x834D).
Dec 17 17:41:39 x4 kernel: [drm] register mmio base: 0xFBEE0000
Dec 17 17:41:39 x4 kernel: [drm] register mmio size: 65536
Dec 17 17:41:39 x4 kernel: ATOM BIOS: 113
Dec 17 17:41:39 x4 kernel: radeon 0000:01:05.0: VRAM: 128M 0x00000000C0000000 - 0x00000000C7FFFFFF (128M used)
Dec 17 17:41:39 x4 kernel: radeon 0000:01:05.0: GTT: 512M 0x00000000A0000000 - 0x00000000BFFFFFFF
Dec 17 17:41:39 x4 kernel: [drm] Detected VRAM RAM=128M, BAR=128M
Dec 17 17:41:39 x4 kernel: [drm] RAM width 32bits DDR
Dec 17 17:41:39 x4 kernel: [TTM] Zone  kernel: Available graphics memory: 4083532 kiB
Dec 17 17:41:39 x4 kernel: [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
Dec 17 17:41:39 x4 kernel: [TTM] Initializing pool allocator
Dec 17 17:41:39 x4 kernel: [TTM] Initializing DMA pool allocator
Dec 17 17:41:39 x4 kernel: [drm] radeon: 128M of VRAM memory ready
Dec 17 17:41:39 x4 kernel: [drm] radeon: 512M of GTT memory ready.
Dec 17 17:41:39 x4 kernel: [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
Dec 17 17:41:39 x4 kernel: [drm] Driver supports precise vblank timestamp query.
Dec 17 17:41:39 x4 kernel: [drm] radeon: irq initialized.
Dec 17 17:41:39 x4 kernel: [drm] GART: num cpu pages 131072, num gpu pages 131072
Dec 17 17:41:39 x4 kernel: [drm] Loading RS780 Microcode
Dec 17 17:41:39 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0040000).
Dec 17 17:41:39 x4 kernel: radeon 0000:01:05.0: WB enabled
Dec 17 17:41:39 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8802163acc00
Dec 17 17:41:39 x4 kernel: radeon 0000:01:05.0: fence driver on ring 3 use gpu addr 0x00000000a0000c0c and cpu addr 0xffff8802163acc0c
Dec 17 17:41:39 x4 kernel: radeon 0000:01:05.0: setting latency timer to 64
Dec 17 17:41:39 x4 kernel: [drm] ring test on 0 succeeded in 0 usecs
Dec 17 17:41:39 x4 kernel: [drm] ring test on 3 succeeded in 1 usecs
Dec 17 17:41:39 x4 kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Dec 17 17:41:39 x4 kernel: [drm] ib test on ring 3 succeeded in 0 usecs
Dec 17 17:41:39 x4 kernel: [drm] Radeon Display Connectors
Dec 17 17:41:39 x4 kernel: [drm] Connector 0:
Dec 17 17:41:39 x4 kernel: [drm]   VGA-1
Dec 17 17:41:39 x4 kernel: [drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
Dec 17 17:41:39 x4 kernel: [drm]   Encoders:
Dec 17 17:41:39 x4 kernel: [drm]     CRT1: INTERNAL_KLDSCP_DAC1
Dec 17 17:41:39 x4 kernel: [drm] Connector 1:
Dec 17 17:41:39 x4 kernel: [drm]   DVI-D-1
Dec 17 17:41:39 x4 kernel: [drm]   HPD3
Dec 17 17:41:39 x4 kernel: [drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
Dec 17 17:41:39 x4 kernel: [drm]   Encoders:
Dec 17 17:41:39 x4 kernel: [drm]     DFP3: INTERNAL_KLDSCP_LVTMA
Dec 17 17:41:39 x4 kernel: [drm] radeon: power management initialized
Dec 17 17:41:39 x4 kernel: [drm] fb mappable at 0xF0142000
Dec 17 17:41:39 x4 kernel: [drm] vram apper at 0xF0000000
Dec 17 17:41:39 x4 kernel: [drm] size 7299072
Dec 17 17:41:39 x4 kernel: [drm] fb depth is 24
Dec 17 17:41:39 x4 kernel: [drm]    pitch is 6912
Dec 17 17:41:39 x4 kernel: fbcon: radeondrmfb (fb0) is primary device
Dec 17 17:41:39 x4 kernel: Console: switching to colour frame buffer device 131x105
Dec 17 17:41:39 x4 kernel: radeon 0000:01:05.0: fb0: radeondrmfb frame buffer device
Dec 17 17:41:39 x4 kernel: radeon 0000:01:05.0: registered panic notifier
Dec 17 17:41:39 x4 kernel: [drm] Initialized radeon 2.27.0 20080528 for 0000:01:05.0 on minor 0
...
Dec 17 19:12:33 x4 kernel: radeon 0000:01:05.0: GPU lockup CP stall for more than 10000msec
Dec 17 19:12:33 x4 kernel: radeon 0000:01:05.0: GPU lockup (waiting for 0x0000000000022777 last fence id 0x0000000000022774)

after reboot:

Dec 17 19:14:32 x4 kernel: Adding 4194300k swap on /var/cache/swapfile.img.  Priority:-1 extents:9 across:629080060k 
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0: GPU lockup CP stall for more than 10000msec
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0: GPU lockup (waiting for 0x0000000000000954 last fence id 0x0000000000000952)
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0: Saved 89 dwords of commands on ring 0.
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0: GPU softreset 
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS=0xA000B030
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2=0x00000003
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS=0x20005040
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000002
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x0000D084
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80098645
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_008020_GRBM_SOFT_RESET=0x00007FEE
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00000001
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS=0xA000B030
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2=0x00000003
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS=0x2000C040
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80100000
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
Dec 17 19:16:44 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0040000).
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0: WB enabled
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8802163acc00
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0: fence driver on ring 3 use gpu addr 0x00000000a0000c0c and cpu addr 0xffff8802163acc0c
Dec 17 19:16:44 x4 kernel: radeon 0000:01:05.0: setting latency timer to 64
Dec 17 19:16:44 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
Dec 17 19:16:44 x4 kernel: [drm:r600_dma_ring_test] *ERROR* radeon: ring 3 test failed (0xCAFEDEAD)
Dec 17 19:16:44 x4 kernel: [drm:r600_resume] *ERROR* r600 startup failed on resume
Dec 17 19:17:03 x4 kernel: SysRq : Emergency Sync
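
To see what the soft reset changed, the status registers dumped before and after it can be diffed. A minimal sketch, assuming the `R_xxxxxx_NAME=0x...` layout from the log above; `parse_dump` and `changed_registers` are hypothetical helper names, and the `GRBM_SOFT_RESET` lines are left out because they appear between the two dumps rather than in them:

```python
import re

# Dump lines look like "R_008678_CP_STALLED_STAT2 = 0x00000002",
# with or without spaces around "=".
REG_RE = re.compile(r"(R_[0-9A-F]{6}_\w+)\s*=\s*(0x[0-9A-Fa-f]+)")

def parse_dump(lines):
    """Map register name to integer value for every dump line found."""
    regs = {}
    for line in lines:
        m = REG_RE.search(line)
        if m:
            regs[m.group(1)] = int(m.group(2), 16)
    return regs

def changed_registers(before_lines, after_lines):
    """Return {name: (before, after)} for registers whose value differs."""
    before, after = parse_dump(before_lines), parse_dump(after_lines)
    return {name: (before[name], after[name])
            for name in before
            if name in after and before[name] != after[name]}

# Values copied from the two dumps in the log above.
before = [
    "R_008010_GRBM_STATUS=0xA000B030",
    "R_008014_GRBM_STATUS2=0x00000003",
    "R_000E50_SRBM_STATUS=0x20005040",
    "R_008674_CP_STALLED_STAT1 = 0x00000000",
    "R_008678_CP_STALLED_STAT2 = 0x00000002",
    "R_00867C_CP_BUSY_STAT     = 0x0000D084",
    "R_008680_CP_STAT          = 0x80098645",
]
after = [
    "R_008010_GRBM_STATUS=0xA000B030",
    "R_008014_GRBM_STATUS2=0x00000003",
    "R_000E50_SRBM_STATUS=0x2000C040",
    "R_008674_CP_STALLED_STAT1 = 0x00000000",
    "R_008678_CP_STALLED_STAT2 = 0x00000000",
    "R_00867C_CP_BUSY_STAT     = 0x00000000",
    "R_008680_CP_STAT          = 0x80100000",
]

diff = changed_registers(before, after)
for name, (old, new) in diff.items():
    print(f"{name}: {old:#010x} -> {new:#010x}")
```

With these values the CP and SRBM status registers differ across the reset, while both GRBM status registers read the same before and after.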

-- 
Markus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-17 18:27 GPU lockup CP stall for more than 10000msec on latest vanilla git Markus Trippelsdorf
@ 2012-12-17 21:32 ` Alex Deucher
  2012-12-17 21:48   ` Markus Trippelsdorf
  0 siblings, 1 reply; 20+ messages in thread
From: Alex Deucher @ 2012-12-17 21:32 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: dri-devel

On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
<markus@trippelsdorf.de> wrote:
> As soon as I open the following website:
> http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
>
> my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:

Is this a regression?  Most likely a 3D driver bug unless you are only
seeing it with specific kernels.  What browser are you using and do
you have hw accelerated webgl, etc. enabled?  If so, what version of
mesa are you using?

Alex


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-17 21:32 ` Alex Deucher
@ 2012-12-17 21:48   ` Markus Trippelsdorf
  2012-12-17 21:58     ` Markus Trippelsdorf
  2012-12-17 22:00     ` Alex Deucher
  0 siblings, 2 replies; 20+ messages in thread
From: Markus Trippelsdorf @ 2012-12-17 21:48 UTC (permalink / raw)
  To: Alex Deucher; +Cc: dri-devel

On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
> <markus@trippelsdorf.de> wrote:
> > As soon as I open the following website:
> > http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
> >
> > my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
> 
> Is this a regression?  Most likely a 3D driver bug unless you are only
> seeing it with specific kernels.  What browser are you using and do
> you have hw accelerated webgl, etc. enabled?  If so, what version of
> mesa are you using?

This is a regression, because it is caused by yesterday's merge of
drm-next by Linus. IOW I only see this bug when running a
v3.7-9432-g9360b53 kernel.

-- 
Markus


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-17 21:48   ` Markus Trippelsdorf
@ 2012-12-17 21:58     ` Markus Trippelsdorf
  2012-12-17 22:00     ` Alex Deucher
  1 sibling, 0 replies; 20+ messages in thread
From: Markus Trippelsdorf @ 2012-12-17 21:58 UTC (permalink / raw)
  To: Alex Deucher; +Cc: dri-devel

On 2012.12.17 at 22:48 +0100, Markus Trippelsdorf wrote:
> On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
> > On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
> > <markus@trippelsdorf.de> wrote:
> > > As soon as I open the following website:
> > > http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
> > >
> > > my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
> > 
> > Is this a regression?  Most likely a 3D driver bug unless you are only
> > seeing it with specific kernels.  What browser are you using and do
> > you have hw accelerated webgl, etc. enabled?  If so, what version of
> > mesa are you using?
> 
> This is a regression, because it is caused by yesterdays merge of
> drm-next by Linus. IOW I only see this bug when running a
> v3.7-9432-g9360b53 kernel. 

Forgot to mention that I don't use webgl. Browser is Firefox. And I use
my screen in portrait mode:

 DVI-0 connected 1050x1680+0+0 left (normal left inverted right x axis y axis) 434mm x 270mm

-- 
Markus


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-17 21:48   ` Markus Trippelsdorf
  2012-12-17 21:58     ` Markus Trippelsdorf
@ 2012-12-17 22:00     ` Alex Deucher
  2012-12-17 22:25       ` Markus Trippelsdorf
  1 sibling, 1 reply; 20+ messages in thread
From: Alex Deucher @ 2012-12-17 22:00 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: dri-devel

On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
<markus@trippelsdorf.de> wrote:
> On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
>> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
>> <markus@trippelsdorf.de> wrote:
>> > As soon as I open the following website:
>> > http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
>> >
>> > my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
>>
>> Is this a regression?  Most likely a 3D driver bug unless you are only
>> seeing it with specific kernels.  What browser are you using and do
>> you have hw accelerated webgl, etc. enabled?  If so, what version of
>> mesa are you using?
>
> This is a regression, because it is caused by yesterdays merge of
> drm-next by Linus. IOW I only see this bug when running a
> v3.7-9432-g9360b53 kernel.

Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
or
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=4d75658bffea78f0c6f82fd46df1ec983ccacdf0

Alex


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-17 22:00     ` Alex Deucher
@ 2012-12-17 22:25       ` Markus Trippelsdorf
  2012-12-17 22:55         ` Markus Trippelsdorf
  2012-12-23  1:46         ` Alex Deucher
  0 siblings, 2 replies; 20+ messages in thread
From: Markus Trippelsdorf @ 2012-12-17 22:25 UTC (permalink / raw)
  To: Alex Deucher; +Cc: dri-devel

On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
> On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
> <markus@trippelsdorf.de> wrote:
> > On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
> >> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
> >> <markus@trippelsdorf.de> wrote:
> >> > As soon as I open the following website:
> >> > http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
> >> >
> >> > my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
> >>
> >> Is this a regression?  Most likely a 3D driver bug unless you are only
> >> seeing it with specific kernels.  What browser are you using and do
> >> you have hw accelerated webgl, etc. enabled?  If so, what version of
> >> mesa are you using?
> >
> > This is a regression, because it is caused by yesterdays merge of
> > drm-next by Linus. IOW I only see this bug when running a
> > v3.7-9432-g9360b53 kernel.
> 
> Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2

Yes, the commit above causes the issue. 

 2d6cc72  GPU lockups
 009ee7a  runs fine

-- 
Markus


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-17 22:25       ` Markus Trippelsdorf
@ 2012-12-17 22:55         ` Markus Trippelsdorf
  2012-12-18 11:20           ` Michel Dänzer
  2012-12-23  1:46         ` Alex Deucher
  1 sibling, 1 reply; 20+ messages in thread
From: Markus Trippelsdorf @ 2012-12-17 22:55 UTC (permalink / raw)
  To: Alex Deucher; +Cc: dri-devel

On 2012.12.17 at 23:25 +0100, Markus Trippelsdorf wrote:
> On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
> > On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
> > <markus@trippelsdorf.de> wrote:
> > > On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
> > >> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
> > >> <markus@trippelsdorf.de> wrote:
> > >> > As soon as I open the following website:
> > >> > http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
> > >> >
> > >> > my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
> > >>
> > >> Is this a regression?  Most likely a 3D driver bug unless you are only
> > >> seeing it with specific kernels.  What browser are you using and do
> > >> you have hw accelerated webgl, etc. enabled?  If so, what version of
> > >> mesa are you using?
> > >
> > > This is a regression, because it is caused by yesterdays merge of
> > > drm-next by Linus. IOW I only see this bug when running a
> > > v3.7-9432-g9360b53 kernel.
> > 
> > Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
> 
> Yes, the commit above causes the issue. 
> 
>  2d6cc72  GPU lockups

With 2d6cc72 reverted I get:

Dec 17 23:09:35 x4 kernel: ------------[ cut here ]------------
Dec 17 23:09:35 x4 kernel: WARNING: at include/linux/kref.h:42 radeon_fence_ref+0x2c/0x40()
Dec 17 23:09:35 x4 kernel: Hardware name: System Product Name
Dec 17 23:09:35 x4 kernel: Pid: 182, comm: X Not tainted 3.7.0-09433-ge033059 #155
Dec 17 23:09:35 x4 kernel: Call Trace:
Dec 17 23:09:35 x4 kernel: [<ffffffff81059c94>] ? warn_slowpath_common+0x74/0xb0
Dec 17 23:09:35 x4 kernel: [<ffffffff8129de0c>] ? radeon_fence_ref+0x2c/0x40
Dec 17 23:09:35 x4 kernel: [<ffffffff8126a02c>] ? ttm_bo_cleanup_refs_and_unlock+0x17c/0x2c0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126a6f4>] ? ttm_mem_evict_first+0x94/0x1d0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126f9c2>] ? ttm_bo_man_get_node+0x62/0xb0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126aaa1>] ? ttm_bo_mem_space+0x271/0x320
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b0bd>] ? ttm_bo_move_buffer+0xdd/0x150
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b1b9>] ? ttm_bo_validate+0x89/0xf0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b509>] ? ttm_bo_init+0x2e9/0x3a0
Dec 17 23:09:35 x4 kernel: [<ffffffff8129f84a>] ? radeon_bo_create+0x18a/0x200
Dec 17 23:09:35 x4 kernel: [<ffffffff8129f510>] ? radeon_bo_clear_va+0x40/0x40
Dec 17 23:09:35 x4 kernel: [<ffffffff812b0d42>] ? radeon_gem_object_create+0x92/0x160
Dec 17 23:09:35 x4 kernel: [<ffffffff812b113c>] ? radeon_gem_create_ioctl+0x6c/0x150
Dec 17 23:09:35 x4 kernel: [<ffffffff81252250>] ? drm_ioctl+0x420/0x4f0
Dec 17 23:09:35 x4 kernel: [<ffffffff812b10d0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
Dec 17 23:09:35 x4 kernel: [<ffffffff810521a9>] ? __do_page_fault+0x1a9/0x490
Dec 17 23:09:35 x4 kernel: [<ffffffff810d1ac9>] ? mmap_region+0x169/0x560
Dec 17 23:09:35 x4 kernel: [<ffffffff810f7f84>] ? do_vfs_ioctl+0x2e4/0x4e0
Dec 17 23:09:35 x4 kernel: [<ffffffff810c0e19>] ? vm_mmap_pgoff+0x69/0x80
Dec 17 23:09:35 x4 kernel: [<ffffffff810f81cc>] ? sys_ioctl+0x4c/0xa0
Dec 17 23:09:35 x4 kernel: [<ffffffff814c2a12>] ? system_call_fastpath+0x16/0x1b
Dec 17 23:09:35 x4 kernel: ---[ end trace eb6036661a77c177 ]---
Dec 17 23:09:35 x4 kernel: BUG: unable to handle kernel paging request at ffff8803d9ee4bd8
Dec 17 23:09:35 x4 kernel: IP: [<ffffffff8129d395>] radeon_fence_wait_seq+0x85/0x440
Dec 17 23:09:35 x4 kernel: PGD 180c063 PUD 0
Dec 17 23:09:35 x4 kernel: Oops: 0000 [#1] SMP
Dec 17 23:09:35 x4 kernel: CPU 3
Dec 17 23:09:35 x4 kernel: Pid: 182, comm: X Tainted: G        W    3.7.0-09433-ge033059 #155 System manufacturer System Product Name/M4A78T-E
Dec 17 23:09:35 x4 kernel: RIP: 0010:[<ffffffff8129d395>]  [<ffffffff8129d395>] radeon_fence_wait_seq+0x85/0x440
Dec 17 23:09:35 x4 kernel: RSP: 0018:ffff880210cc7a38  EFLAGS: 00010282
Dec 17 23:09:35 x4 kernel: RAX: ffff880210cc7a90 RBX: ffff88020674c970 RCX: 0000000000000001
Dec 17 23:09:35 x4 kernel: RDX: 000000000605b580 RSI: 0000000000000058 RDI: ffff8801c7f7dc80
Dec 17 23:09:35 x4 kernel: RBP: ffff8803d9ee4bd8 R08: 0000000000000001 R09: 00000000000002a9
Dec 17 23:09:35 x4 kernel: R10: 00000000000002a8 R11: 0000000000000006 R12: ffff880210ee6981
Dec 17 23:09:35 x4 kernel: R13: 000000000605b580 R14: ffff8801c7f7dc80 R15: ffff8802161864f8
Dec 17 23:09:35 x4 kernel: FS:  00007f5ee88f4880(0000) GS:ffff88021fd80000(0000) knlGS:0000000000000000
Dec 17 23:09:35 x4 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 17 23:09:35 x4 kernel: CR2: ffff8803d9ee4bd8 CR3: 0000000210c63000 CR4: 00000000000007e0
Dec 17 23:09:35 x4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 17 23:09:35 x4 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 17 23:09:35 x4 kernel: Process X (pid: 182, threadinfo ffff880210cc6000, task ffff880215f45730)
Dec 17 23:09:35 x4 kernel: Stack:
Dec 17 23:09:35 x4 kernel: ffffffff8129de0c 000000000605b580 ffff8803d9ee4080 0000000000000010
Dec 17 23:09:35 x4 kernel: ffff880210cc7aa8 ffff880201cc7a68 ffff880210cc7a90 000000010177c177
Dec 17 23:09:35 x4 kernel: 00000000000000c7 0000000000000001 ffff88020674c890 0000000000000286
Dec 17 23:09:35 x4 kernel: Call Trace:
Dec 17 23:09:35 x4 kernel: [<ffffffff8129de0c>] ? radeon_fence_ref+0x2c/0x40
Dec 17 23:09:35 x4 kernel: [<ffffffff8129dc32>] ? radeon_fence_wait+0x22/0x60
Dec 17 23:09:35 x4 kernel: [<ffffffff8126a06d>] ? ttm_bo_cleanup_refs_and_unlock+0x1bd/0x2c0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126a6f4>] ? ttm_mem_evict_first+0x94/0x1d0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126f9c2>] ? ttm_bo_man_get_node+0x62/0xb0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126aaa1>] ? ttm_bo_mem_space+0x271/0x320
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b0bd>] ? ttm_bo_move_buffer+0xdd/0x150
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b1b9>] ? ttm_bo_validate+0x89/0xf0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b509>] ? ttm_bo_init+0x2e9/0x3a0
Dec 17 23:09:35 x4 kernel: [<ffffffff8129f84a>] ? radeon_bo_create+0x18a/0x200
Dec 17 23:09:35 x4 kernel: [<ffffffff8129f510>] ? radeon_bo_clear_va+0x40/0x40
Dec 17 23:09:35 x4 kernel: [<ffffffff812b0d42>] ? radeon_gem_object_create+0x92/0x160
Dec 17 23:09:35 x4 kernel: [<ffffffff812b113c>] ? radeon_gem_create_ioctl+0x6c/0x150
Dec 17 23:09:35 x4 kernel: [<ffffffff81252250>] ? drm_ioctl+0x420/0x4f0
Dec 17 23:09:35 x4 kernel: [<ffffffff812b10d0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
Dec 17 23:09:35 x4 kernel: [<ffffffff810521a9>] ? __do_page_fault+0x1a9/0x490
Dec 17 23:09:35 x4 kernel: [<ffffffff810d1ac9>] ? mmap_region+0x169/0x560
Dec 17 23:09:35 x4 kernel: [<ffffffff810f7f84>] ? do_vfs_ioctl+0x2e4/0x4e0
Dec 17 23:09:35 x4 kernel: [<ffffffff810c0e19>] ? vm_mmap_pgoff+0x69/0x80
Dec 17 23:09:35 x4 kernel: [<ffffffff810f81cc>] ? sys_ioctl+0x4c/0xa0
Dec 17 23:09:35 x4 kernel: [<ffffffff814c2a12>] ? system_call_fastpath+0x16/0x1b
Dec 17 23:09:35 x4 kernel: Code: c4 0f 87 77 01 00 00 41 89 df bb 01 00 00 00 44 89 ee 4c 89 f7 e8 ec 5a 01 00 45 85 ff 0f 88 43 03 00 00 84 db 0f 84 57 02 00 00 <48> 8b 45 00 4c 39 e0 0f 83 19 02 00 00 48 8b 44 24 08 48 c1 e0
Dec 17 23:09:35 x4 kernel: RIP  [<ffffffff8129d395>] radeon_fence_wait_seq+0x85/0x440
Dec 17 23:09:35 x4 kernel: RSP <ffff880210cc7a38>
Dec 17 23:09:35 x4 kernel: CR2: ffff8803d9ee4bd8
Dec 17 23:09:35 x4 kernel: ---[ end trace eb6036661a77c178 ]---
Dec 17 23:09:35 x4 kernel: [drm:drm_release] *ERROR* Device busy: 1
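
One quick check on an oops like this is to see which general-purpose register holds the faulting address reported in CR2. A minimal sketch, assuming the x86-64 register dump layout shown above; the function name and both regexes are illustrative assumptions:

```python
import re

# CR2 holds the address that caused the page fault.
CR2_RE = re.compile(r"\bCR2: ([0-9a-f]{16})")
# General-purpose registers (RAX..RDI, RBP, R08..R15) followed by a plain
# 16-digit value. RSP and RIP are printed with a segment prefix ("0018:...")
# and are deliberately not matched here.
GPR_RE = re.compile(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})")

def registers_holding_fault_address(oops_text):
    """Return the names of registers whose value equals CR2."""
    cr2 = CR2_RE.search(oops_text)
    if cr2 is None:
        return []
    fault = cr2.group(1)
    return [name for name, value in GPR_RE.findall(oops_text) if value == fault]

# Register lines copied from the oops above.
oops = """\
RSP: 0018:ffff880210cc7a38  EFLAGS: 00010282
RAX: ffff880210cc7a90 RBX: ffff88020674c970 RCX: 0000000000000001
RDX: 000000000605b580 RSI: 0000000000000058 RDI: ffff8801c7f7dc80
RBP: ffff8803d9ee4bd8 R08: 0000000000000001 R09: 00000000000002a9
R10: 00000000000002a8 R11: 0000000000000006 R12: ffff880210ee6981
R13: 000000000605b580 R14: ffff8801c7f7dc80 R15: ffff8802161864f8
CR2: ffff8803d9ee4bd8 CR3: 0000000210c63000 CR4: 00000000000007e0
"""
print(registers_holding_fault_address(oops))
```

For this dump the only match is RBP.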

-- 
Markus


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-17 22:55         ` Markus Trippelsdorf
@ 2012-12-18 11:20           ` Michel Dänzer
  2012-12-18 13:38             ` Markus Trippelsdorf
  0 siblings, 1 reply; 20+ messages in thread
From: Michel Dänzer @ 2012-12-18 11:20 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: dri-devel

On Mon, 2012-12-17 at 23:55 +0100, Markus Trippelsdorf wrote: 
> On 2012.12.17 at 23:25 +0100, Markus Trippelsdorf wrote:
> > On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
> > > On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
> > > <markus@trippelsdorf.de> wrote:
> > > > On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
> > > >> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
> > > >> <markus@trippelsdorf.de> wrote:
> > > >> > As soon as I open the following website:
> > > >> > http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
> > > >> >
> > > >> > my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
> > > >>
> > > >> Is this a regression?  Most likely a 3D driver bug unless you are only
> > > >> seeing it with specific kernels.  What browser are you using and do
> > > >> you have hw accelerated webgl, etc. enabled?  If so, what version of
> > > >> mesa are you using?
> > > >
> > > > This is a regression, because it is caused by yesterdays merge of
> > > > drm-next by Linus. IOW I only see this bug when running a
> > > > v3.7-9432-g9360b53 kernel.
> > > 
> > > Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
> > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
> > 
> > Yes, the commit above causes the issue. 
> > 
> >  2d6cc72  GPU lockups
> 
> With 2d6cc72 reverted I get:
> 
> Dec 17 23:09:35 x4 kernel: ------------[ cut here ]------------

Probably a separate issue. Can you bisect this one as well?


-- 
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Debian, X and DRI developer


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-18 11:20           ` Michel Dänzer
@ 2012-12-18 13:38             ` Markus Trippelsdorf
  2012-12-18 13:51               ` Markus Trippelsdorf
  2012-12-18 15:24               ` Maarten Lankhorst
  0 siblings, 2 replies; 20+ messages in thread
From: Markus Trippelsdorf @ 2012-12-18 13:38 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: dri-devel

On 2012.12.18 at 12:20 +0100, Michel Dänzer wrote:
> On Mon, 2012-12-17 at 23:55 +0100, Markus Trippelsdorf wrote: 
> > On 2012.12.17 at 23:25 +0100, Markus Trippelsdorf wrote:
> > > On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
> > > > On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
> > > > <markus@trippelsdorf.de> wrote:
> > > > > On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
> > > > >> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
> > > > >> <markus@trippelsdorf.de> wrote:
> > > > >> > As soon as I open the following website:
> > > > >> > http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
> > > > >> >
> > > > >> > my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
> > > > >>
> > > > >> Is this a regression?  Most likely a 3D driver bug unless you are only
> > > > >> seeing it with specific kernels.  What browser are you using and do
> > > > >> you have hw accelerated webgl, etc. enabled?  If so, what version of
> > > > >> mesa are you using?
> > > > >
> > > > > This is a regression, because it is caused by yesterdays merge of
> > > > > drm-next by Linus. IOW I only see this bug when running a
> > > > > v3.7-9432-g9360b53 kernel.
> > > > 
> > > > Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
> > > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
> > > 
> > > Yes, the commit above causes the issue. 
> > > 
> > >  2d6cc72  GPU lockups
> > 
> > With 2d6cc72 reverted I get:
> > 
> > Dec 17 23:09:35 x4 kernel: ------------[ cut here ]------------
> 
> Probably a separate issue, can you bisect this one as well?

Yes. Git-bisect points to:

85b144f860176ec18db927d6d9ecdfb24d9c6483 is the first bad commit
commit 85b144f860176ec18db927d6d9ecdfb24d9c6483
Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Date:   Thu Nov 29 11:36:54 2012 +0000

    drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock
    held, v3

(Please note that this bug is a little harder to reproduce, but
scrolling up and down for ~10 seconds on the webpage mentioned above
will trigger the oops.
So while I'm not 100% sure that the issue is caused by exactly this
commit, the vicinity should be right.)

Dec 18 14:29:07 x4 kernel: ------------[ cut here ]------------
Dec 18 14:29:07 x4 kernel: WARNING: at include/linux/kref.h:42 radeon_fence_ref+0x2c/0x40()
Dec 18 14:29:07 x4 kernel: Hardware name: System Product Name
Dec 18 14:29:07 x4 kernel: Pid: 161, comm: X Not tainted 3.7.0-rc7-00520-g85b144f #168
Dec 18 14:29:07 x4 kernel: Call Trace:
Dec 18 14:29:07 x4 kernel: [<ffffffff81058c84>] ? warn_slowpath_common+0x74/0xb0
Dec 18 14:29:07 x4 kernel: [<ffffffff812926fc>] ? radeon_fence_ref+0x2c/0x40
Dec 18 14:29:07 x4 kernel: [<ffffffff8125e91c>] ? ttm_bo_cleanup_refs_and_unlock+0x17c/0x2c0
Dec 18 14:29:07 x4 kernel: [<ffffffff8125f13c>] ? ttm_mem_evict_first+0x1dc/0x2a0
Dec 18 14:29:07 x4 kernel: [<ffffffff81264412>] ? ttm_bo_man_get_node+0x62/0xb0
Dec 18 14:29:07 x4 kernel: [<ffffffff8125f48e>] ? ttm_bo_mem_space+0x28e/0x340
Dec 18 14:29:07 x4 kernel: [<ffffffff8125facc>] ? ttm_bo_move_buffer+0xfc/0x170
Dec 18 14:29:07 x4 kernel: [<ffffffff810de172>] ? kmem_cache_alloc+0xb2/0xc0
Dec 18 14:29:07 x4 kernel: [<ffffffff8125fbd5>] ? ttm_bo_validate+0x95/0x110
Dec 18 14:29:07 x4 kernel: [<ffffffff8125ff3c>] ? ttm_bo_init+0x2ec/0x3b0
Dec 18 14:29:07 x4 kernel: [<ffffffff8129415a>] ? radeon_bo_create+0x18a/0x200
Dec 18 14:29:07 x4 kernel: [<ffffffff81293e40>] ? radeon_bo_clear_va+0x40/0x40
Dec 18 14:29:07 x4 kernel: [<ffffffff812a5302>] ? radeon_gem_object_create+0x92/0x160
Dec 18 14:29:07 x4 kernel: [<ffffffff812a571c>] ? radeon_gem_create_ioctl+0x6c/0x150
Dec 18 14:29:07 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
Dec 18 14:29:07 x4 kernel: [<ffffffff812a56b0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
Dec 18 14:29:07 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
Dec 18 14:29:07 x4 kernel: [<ffffffff810e5588>] ? vfs_read+0x118/0x160
Dec 18 14:29:07 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
Dec 18 14:29:07 x4 kernel: [<ffffffff810e5851>] ? sys_read+0x51/0xa0
Dec 18 14:29:07 x4 kernel: [<ffffffff814b05d2>] ? system_call_fastpath+0x16/0x1b
Dec 18 14:29:07 x4 kernel: ---[ end trace c5e6f68fefd3a70b ]---
Dec 18 14:29:07 x4 kernel: BUG: unable to handle kernel paging request at 0000000100000077
Dec 18 14:29:07 x4 kernel: IP: [<ffffffff814afa15>] _raw_spin_lock+0x5/0x30
Dec 18 14:29:07 x4 kernel: PGD 2156c4067 PUD 0
Dec 18 14:29:07 x4 kernel: Oops: 0002 [#1] SMP
Dec 18 14:29:07 x4 kernel: CPU 1
Dec 18 14:29:07 x4 kernel: Pid: 161, comm: X Tainted: G        W    3.7.0-rc7-00520-g85b144f #168 System manufacturer System Product Name/M4A78T-E
Dec 18 14:29:07 x4 kernel: RIP: 0010:[<ffffffff814afa15>]  [<ffffffff814afa15>] _raw_spin_lock+0x5/0x30
Dec 18 14:29:07 x4 kernel: RSP: 0018:ffff880211645d58  EFLAGS: 00010286
Dec 18 14:29:07 x4 kernel: RAX: 0000000000000100 RBX: ffff8801c0e29448 RCX: 0000000000000000
Dec 18 14:29:07 x4 kernel: RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000100000077
Dec 18 14:29:07 x4 kernel: RBP: 00000000ffffffff R08: 0000000000000000 R09: ffffffff81838370
Dec 18 14:29:07 x4 kernel: R10: ffffffff812a5960 R11: 0000000000000246 R12: 0000000000000001
Dec 18 14:29:07 x4 kernel: R13: 0000000000000001 R14: 0000000000000000 R15: 00007fff0723dba0
Dec 18 14:29:07 x4 kernel: FS:  00007f958542f880(0000) GS:ffff88021fc80000(0000) knlGS:0000000000000000
Dec 18 14:29:07 x4 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 18 14:29:07 x4 kernel: CR2: 0000000100000077 CR3: 000000021161a000 CR4: 00000000000007e0
Dec 18 14:29:07 x4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 18 14:29:07 x4 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 18 14:29:07 x4 kernel: Process X (pid: 161, threadinfo ffff880211644000, task ffff880215ab85d0)
Dec 18 14:29:07 x4 kernel: Stack:
Dec 18 14:29:07 x4 kernel: ffffffff8125d9ba 0000000015c83600 ffff8801c0e29400 ffff880211645e30
Dec 18 14:29:07 x4 kernel: ffff8801c0e29448 ffff880211645dcc 0000000000000001 ffffffff81294bff
Dec 18 14:29:07 x4 kernel: ffff8801c0e29608 ffff880211645e30 ffff880216a76000 ffff880211645e30
Dec 18 14:29:07 x4 kernel: Call Trace:
Dec 18 14:29:07 x4 kernel: [<ffffffff8125d9ba>] ? ttm_bo_reserve+0x3a/0x110
Dec 18 14:29:07 x4 kernel: [<ffffffff81294bff>] ? radeon_bo_wait+0x3f/0xc0
Dec 18 14:29:07 x4 kernel: [<ffffffff812a59b7>] ? radeon_gem_busy_ioctl+0x57/0x100
Dec 18 14:29:07 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
Dec 18 14:29:07 x4 kernel: [<ffffffff812a5960>] ? radeon_gem_mmap_ioctl+0x20/0x20
Dec 18 14:29:07 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
Dec 18 14:29:07 x4 kernel: [<ffffffff810e55ad>] ? vfs_read+0x13d/0x160
Dec 18 14:29:07 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
Dec 18 14:29:07 x4 kernel: [<ffffffff810e5851>] ? sys_read+0x51/0xa0
Dec 18 14:29:07 x4 kernel: [<ffffffff814b05d2>] ? system_call_fastpath+0x16/0x1b
Dec 18 14:29:07 x4 kernel: Code: 31 c0 5b c3 66 90 8d 8a 00 01 00 00 89 d0 f0 66 0f b1 0b 66 39 d0 75 de b8 01 00 00 00 5b c3 0f 1f 80 00 00 00 00 b8 00 01 00 00 <f0> 66 0f c1 07 0f b6 d4 38 c2 74 10 0f 1f 80 00 00 00 00 f3 90
Dec 18 14:29:07 x4 kernel: RIP  [<ffffffff814afa15>] _raw_spin_lock+0x5/0x30
Dec 18 14:29:07 x4 kernel: RSP <ffff880211645d58>
Dec 18 14:29:07 x4 kernel: CR2: 0000000100000077
Dec 18 14:29:07 x4 kernel: ---[ end trace c5e6f68fefd3a70c ]---
Dec 18 14:29:28 x4 kernel: BUG: unable to handle kernel paging request at 0000000100000023
Dec 18 14:29:28 x4 kernel: IP: [<ffffffff81296448>] radeon_vm_bo_invalidate+0x18/0x30
Dec 18 14:29:28 x4 kernel: PGD 205289067 PUD 0
Dec 18 14:29:28 x4 kernel: Oops: 0002 [#2] SMP
Dec 18 14:29:28 x4 kernel: CPU 1
Dec 18 14:29:28 x4 kernel: Pid: 13, comm: kworker/1:0 Tainted: G      D W    3.7.0-rc7-00520-g85b144f #168 System manufacturer System Product Name/M4A78T-E
Dec 18 14:29:28 x4 kernel: RIP: 0010:[<ffffffff81296448>]  [<ffffffff81296448>] radeon_vm_bo_invalidate+0x18/0x30
Dec 18 14:29:28 x4 kernel: RSP: 0018:ffff8802168b3d78  EFLAGS: 00010207
Dec 18 14:29:28 x4 kernel: RAX: 00000000ffffffff RBX: ffff8801c0e29048 RCX: ffff8801c0e2b928
Dec 18 14:29:28 x4 kernel: RDX: 0000000000000001 RSI: ffff8801c0e291f0 RDI: 00000000ffffffff
Dec 18 14:29:28 x4 kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
Dec 18 14:29:28 x4 kernel: R10: ffffea0007038a00 R11: dead000000100100 R12: ffff880216a76590
Dec 18 14:29:28 x4 kernel: R13: ffffffff818383e0 R14: 0000000000000000 R15: ffff880215c83678
Dec 18 14:29:28 x4 kernel: FS:  00007f4bb2b64740(0000) GS:ffff88021fc80000(0000) knlGS:0000000000000000
Dec 18 14:29:28 x4 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 18 14:29:28 x4 kernel: CR2: 0000000100000023 CR3: 000000020698f000 CR4: 00000000000007e0
Dec 18 14:29:28 x4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 18 14:29:28 x4 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 18 14:29:28 x4 kernel: Process kworker/1:0 (pid: 13, threadinfo ffff8802168b2000, task ffff88021687d730)
Dec 18 14:29:28 x4 kernel: Stack:
Dec 18 14:29:28 x4 kernel: ffffffff8125d2e9 ffff8801c0e29048 ffffffff8125e8cb ffff880216a769b8
Dec 18 14:29:28 x4 kernel: ffffffff810de82f ffff8801c0e2b848 ffff880215c83678 ffff8801c0e2b900
Dec 18 14:29:28 x4 kernel: 0000000000000001 ffff880216a76a80 ffff8801c0e29048 ffffffff8125eb7d
Dec 18 14:29:28 x4 kernel: Call Trace:
Dec 18 14:29:28 x4 kernel: [<ffffffff8125d2e9>] ? ttm_bo_cleanup_memtype_use+0x19/0x90
Dec 18 14:29:28 x4 kernel: [<ffffffff8125e8cb>] ? ttm_bo_cleanup_refs_and_unlock+0x12b/0x2c0
Dec 18 14:29:28 x4 kernel: [<ffffffff810de82f>] ? kfree+0xf/0xb0
Dec 18 14:29:28 x4 kernel: [<ffffffff8125eb7d>] ? ttm_bo_delayed_delete+0x11d/0x1a0
Dec 18 14:29:28 x4 kernel: [<ffffffff8125ec12>] ? ttm_bo_delayed_workqueue+0x12/0x30
Dec 18 14:29:28 x4 kernel: [<ffffffff8106e5f9>] ? process_one_work+0x179/0x480
Dec 18 14:29:28 x4 kernel: [<ffffffff8125ec00>] ? ttm_bo_delayed_delete+0x1a0/0x1a0
Dec 18 14:29:28 x4 kernel: [<ffffffff8106f5b1>] ? worker_thread+0x1b1/0x540
Dec 18 14:29:28 x4 kernel: [<ffffffff8106f400>] ? busy_worker_rebind_fn+0x100/0x100
Dec 18 14:29:28 x4 kernel: [<ffffffff810741cf>] ? kthread+0xaf/0xc0
Dec 18 14:29:28 x4 kernel: [<ffffffff81074120>] ? __kthread_bind+0x30/0x30
Dec 18 14:29:28 x4 kernel: [<ffffffff814b052c>] ? ret_from_fork+0x7c/0xb0
Dec 18 14:29:28 x4 kernel: [<ffffffff81074120>] ? __kthread_bind+0x30/0x30
Dec 18 14:29:28 x4 kernel: Code: 8b 44 24 04 48 83 c4 08 5b 5d 41 5c c3 66 0f 1f 44 00 00 48 8b 86 f0 01 00 00 48 81 c6 f0 01 00 00 48 39 f0 74 11 0f 1f 44 00 00 <c6> 40 24 00 48 8b 00 48 39 f0 75 f4 f3 c3 66 2e 0f 1f 84 00 00
Dec 18 14:29:28 x4 kernel: RIP  [<ffffffff81296448>] radeon_vm_bo_invalidate+0x18/0x30
Dec 18 14:29:28 x4 kernel: RSP <ffff8802168b3d78>
Dec 18 14:29:28 x4 kernel: CR2: 0000000100000023
Dec 18 14:29:28 x4 kernel: ---[ end trace c5e6f68fefd3a70d ]---
Dec 18 14:29:28 x4 kernel: BUG: unable to handle kernel paging request at ffffffffffffffd8
Dec 18 14:29:28 x4 kernel: IP: [<ffffffff81074257>] kthread_data+0x7/0x10
Dec 18 14:29:28 x4 kernel: PGD 180d067 PUD 180e067 PMD 0
Dec 18 14:29:28 x4 kernel: Oops: 0000 [#3] SMP
Dec 18 14:29:28 x4 kernel: CPU 1
Dec 18 14:29:28 x4 kernel: Pid: 13, comm: kworker/1:0 Tainted: G      D W    3.7.0-rc7-00520-g85b144f #168 System manufacturer System Product Name/M4A78T-E
Dec 18 14:29:28 x4 kernel: RIP: 0010:[<ffffffff81074257>]  [<ffffffff81074257>] kthread_data+0x7/0x10
Dec 18 14:29:28 x4 kernel: RSP: 0018:ffff8802168b3aa0  EFLAGS: 00010002
Dec 18 14:29:28 x4 kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000015c7992d1
Dec 18 14:29:28 x4 kernel: RDX: ffffffffff8a8b63 RSI: 0000000000000001 RDI: ffff88021687d730
Dec 18 14:29:28 x4 kernel: RBP: ffff88021687d730 R08: 0000000000000000 R09: 0000000000000000
Dec 18 14:29:28 x4 kernel: R10: ffff880216887980 R11: 0000000000000000 R12: ffff88021fc912c0
Dec 18 14:29:28 x4 kernel: R13: 0000000000000001 R14: ffff88021687d720 R15: ffff88021687d730
Dec 18 14:29:28 x4 kernel: FS:  00007f4bb2b64740(0000) GS:ffff88021fc80000(0000) knlGS:0000000000000000
Dec 18 14:29:28 x4 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 18 14:29:28 x4 kernel: CR2: ffffffffffffffd8 CR3: 000000020698f000 CR4: 00000000000007e0
Dec 18 14:29:28 x4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 18 14:29:28 x4 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 18 14:29:28 x4 kernel: Process kworker/1:0 (pid: 13, threadinfo ffff8802168b2000, task ffff88021687d730)
Dec 18 14:29:28 x4 kernel: Stack:
Dec 18 14:29:28 x4 kernel: ffffffff8106fb98 ffff88021687d9d0 ffffffff814ae8b5 00000000000112c0
Dec 18 14:29:28 x4 kernel: ffff8802168b3fd8 00000000000112c0 ffff8802168b3fd8 0000000000000001
Dec 18 14:29:28 x4 kernel: ffff88021687d8d8 ffff88021687d720 ffff880216878000 ffff88021687d720
Dec 18 14:29:28 x4 kernel: Call Trace:
Dec 18 14:29:28 x4 kernel: [<ffffffff8106fb98>] ? wq_worker_sleeping+0x8/0xb0
Dec 18 14:29:28 x4 kernel: [<ffffffff814ae8b5>] ? __schedule+0x3a5/0x5f0
Dec 18 14:29:28 x4 kernel: [<ffffffff8105dbba>] ? do_exit+0x52a/0x830
Dec 18 14:29:28 x4 kernel: [<ffffffff8103785e>] ? oops_end+0x8e/0xd0
Dec 18 14:29:28 x4 kernel: [<ffffffff814a94c8>] ? no_context+0x251/0x25d
Dec 18 14:29:28 x4 kernel: [<ffffffff810512ce>] ? __do_page_fault+0x2ee/0x490
Dec 18 14:29:28 x4 kernel: [<ffffffff81083e18>] ? find_busiest_group+0x28/0x480
Dec 18 14:29:28 x4 kernel: [<ffffffff814b00af>] ? page_fault+0x1f/0x30
Dec 18 14:29:28 x4 kernel: [<ffffffff81296448>] ? radeon_vm_bo_invalidate+0x18/0x30
Dec 18 14:29:28 x4 kernel: [<ffffffff8125d2e9>] ? ttm_bo_cleanup_memtype_use+0x19/0x90
Dec 18 14:29:28 x4 kernel: [<ffffffff8125e8cb>] ? ttm_bo_cleanup_refs_and_unlock+0x12b/0x2c0
Dec 18 14:29:28 x4 kernel: [<ffffffff810de82f>] ? kfree+0xf/0xb0
Dec 18 14:29:28 x4 kernel: [<ffffffff8125eb7d>] ? ttm_bo_delayed_delete+0x11d/0x1a0
Dec 18 14:29:28 x4 kernel: [<ffffffff8125ec12>] ? ttm_bo_delayed_workqueue+0x12/0x30
Dec 18 14:29:28 x4 kernel: [<ffffffff8106e5f9>] ? process_one_work+0x179/0x480
Dec 18 14:29:28 x4 kernel: [<ffffffff8125ec00>] ? ttm_bo_delayed_delete+0x1a0/0x1a0
Dec 18 14:29:28 x4 kernel: [<ffffffff8106f5b1>] ? worker_thread+0x1b1/0x540
Dec 18 14:29:28 x4 kernel: [<ffffffff8106f400>] ? busy_worker_rebind_fn+0x100/0x100
Dec 18 14:29:28 x4 kernel: [<ffffffff810741cf>] ? kthread+0xaf/0xc0
Dec 18 14:29:28 x4 kernel: [<ffffffff81074120>] ? __kthread_bind+0x30/0x30
Dec 18 14:29:28 x4 kernel: [<ffffffff814b052c>] ? ret_from_fork+0x7c/0xb0
Dec 18 14:29:28 x4 kernel: [<ffffffff81074120>] ? __kthread_bind+0x30/0x30
Dec 18 14:29:28 x4 kernel: Code: 74 03 c6 03 00 65 48 8b 04 25 c0 b9 00 00 48 8b 80 48 02 00 00 5b 48 8b 40 c8 48 d1 e8 83 e0 01 c3 0f 1f 00 48 8b 87 48 02 00 00 <48> 8b 40 d8 c3 0f 1f 40 00 65 48 8b 04 25 c0 b9 00 00 48 8b b8
Dec 18 14:29:28 x4 kernel: RIP  [<ffffffff81074257>] kthread_data+0x7/0x10
Dec 18 14:29:28 x4 kernel: RSP <ffff8802168b3aa0>
Dec 18 14:29:28 x4 kernel: CR2: ffffffffffffffd8
Dec 18 14:29:28 x4 kernel: ---[ end trace c5e6f68fefd3a70e ]---
Dec 18 14:29:28 x4 kernel: Fixing recursive fault but reboot is needed!
Dec 18 14:29:28 x4 kernel: SysRq : Emergency Sync

-- 
Markus

* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-18 13:38             ` Markus Trippelsdorf
@ 2012-12-18 13:51               ` Markus Trippelsdorf
  2012-12-18 15:24               ` Maarten Lankhorst
  1 sibling, 0 replies; 20+ messages in thread
From: Markus Trippelsdorf @ 2012-12-18 13:51 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: dri-devel

On 2012.12.18 at 14:38 +0100, Markus Trippelsdorf wrote:
> On 2012.12.18 at 12:20 +0100, Michel Dänzer wrote:
> > On Mon, 2012-12-17 at 23:55 +0100, Markus Trippelsdorf wrote: 
> > > On 2012.12.17 at 23:25 +0100, Markus Trippelsdorf wrote:
> > > > On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
> > > > > On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
> > > > > <markus@trippelsdorf.de> wrote:
> > > > > > On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
> > > > > >> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
> > > > > >> <markus@trippelsdorf.de> wrote:
> > > > > >> > As soon as I open the following website:
> > > > > >> > http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
> > > > > >> >
> > > > > >> > my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
> > > > > >>
> > > > > >> Is this a regression?  Most likely a 3D driver bug unless you are only
> > > > > >> seeing it with specific kernels.  What browser are you using and do
> > > > > >> you have hw accelerated webgl, etc. enabled?  If so, what version of
> > > > > >> mesa are you using?
> > > > > >
> > > > > > > This is a regression, because it is caused by yesterday's merge of
> > > > > > drm-next by Linus. IOW I only see this bug when running a
> > > > > > v3.7-9432-g9360b53 kernel.
> > > > > 
> > > > > Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
> > > > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
> > > > 
> > > > Yes, the commit above causes the issue. 
> > > > 
> > > >  2d6cc72  GPU lockups
> > > 
> > > With 2d6cc72 reverted I get:
> > > 
> > > Dec 17 23:09:35 x4 kernel: ------------[ cut here ]------------
> > 
> > Probably a separate issue, can you bisect this one as well?
> 
> Yes. Git-bisect points to:
> 
> 85b144f860176ec18db927d6d9ecdfb24d9c6483 is the first bad commit
> commit 85b144f860176ec18db927d6d9ecdfb24d9c6483
> Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> Date:   Thu Nov 29 11:36:54 2012 +0000
> 
>     drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock
>     held, v3
> 
> (Please note that this bug is a little bit harder to reproduce. But
> when you scroll up and down for ~10 seconds on the webpage mentioned
> above it will trigger the oops.
> So while I'm not 100% sure that the issue is caused by exactly this
> commit, the vicinity should be right)
> 

CCing Maarten

-- 
Markus

* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-18 13:38             ` Markus Trippelsdorf
  2012-12-18 13:51               ` Markus Trippelsdorf
@ 2012-12-18 15:24               ` Maarten Lankhorst
  2012-12-18 16:12                 ` Markus Trippelsdorf
  1 sibling, 1 reply; 20+ messages in thread
From: Maarten Lankhorst @ 2012-12-18 15:24 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: Michel Dänzer, dri-devel

Op 18-12-12 14:38, Markus Trippelsdorf schreef:
> On 2012.12.18 at 12:20 +0100, Michel Dänzer wrote:
>> On Mon, 2012-12-17 at 23:55 +0100, Markus Trippelsdorf wrote: 
>>> On 2012.12.17 at 23:25 +0100, Markus Trippelsdorf wrote:
>>>> On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
>>>>> On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
>>>>> <markus@trippelsdorf.de> wrote:
>>>>>> On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
>>>>>>> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
>>>>>>> <markus@trippelsdorf.de> wrote:
>>>>>>>> As soon as I open the following website:
>>>>>>>> http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
>>>>>>>>
>>>>>>>> my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
>>>>>>> Is this a regression?  Most likely a 3D driver bug unless you are only
>>>>>>> seeing it with specific kernels.  What browser are you using and do
>>>>>>> you have hw accelerated webgl, etc. enabled?  If so, what version of
>>>>>>> mesa are you using?
>>>>>> This is a regression, because it is caused by yesterday's merge of
>>>>>> drm-next by Linus. IOW I only see this bug when running a
>>>>>> v3.7-9432-g9360b53 kernel.
>>>>> Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
>>>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
>>>> Yes, the commit above causes the issue. 
>>>>
>>>>  2d6cc72  GPU lockups
>>> With 2d6cc72 reverted I get:
>>>
>>> Dec 17 23:09:35 x4 kernel: ------------[ cut here ]------------
>> Probably a separate issue, can you bisect this one as well?
> Yes. Git-bisect points to:
>
> 85b144f860176ec18db927d6d9ecdfb24d9c6483 is the first bad commit
> commit 85b144f860176ec18db927d6d9ecdfb24d9c6483
> Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> Date:   Thu Nov 29 11:36:54 2012 +0000
>
>     drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock
>     held, v3
>
> (Please note that this bug is a little bit harder to reproduce. But
> when you scroll up and down for ~10 seconds on the webpage mentioned
> above it will trigger the oops.
> So while I'm not 100% sure that the issue is caused by exactly this
> commit, the vicinity should be right)
>
Those dmesg warnings look suspicious; something is going very wrong there.

Can you revert the commit before it, "drm/radeon: allow move_notify to be called without reservation"?
The reservation should be held at this point; that commit got in accidentally.

I doubt that not holding a reservation is causing it, though. Then again, I don't really see how that commit
could cause it, so can you please double-check that it never happened before that point and only started at that commit?

Also slap a BUG_ON(!ttm_bo_is_reserved(bo)) into ttm_bo_cleanup_refs_and_unlock for good measure,
and add a BUG_ON(spin_trylock(&bdev->fence_lock)); to ttm_bo_wait.
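
For readers unfamiliar with the second assertion: in the kernel, spin_trylock() returns
nonzero when it manages to take the lock, so BUG_ON(spin_trylock(&lock)) trips when
nobody held the lock on entry. Below is a minimal userspace sketch of that idiom, assuming
C11 atomics. The names toy_spinlock, toy_spin_trylock() and lock_was_free() are hypothetical
stand-ins for illustration only; they are not the kernel's spinlock_t API and not TTM code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy userspace spinlock (hypothetical; NOT the kernel's spinlock_t). */
typedef struct {
    atomic_flag locked;
} toy_spinlock;

#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Mirrors spin_trylock() semantics: returns true if we acquired the lock. */
static bool toy_spin_trylock(toy_spinlock *l)
{
    /* test_and_set returns the previous value: false means it was free. */
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
static void toy_spin_lock(toy_spinlock *l)
{
    while (!toy_spin_trylock(l))
        ;
}

/* Release the lock. */
static void toy_spin_unlock(toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/*
 * Probe for a violated "lock must be held" assumption.
 * Returns true if the lock was free, i.e. the trylock succeeded.
 * The lock is released again so the probe leaves no side effect.
 */
static bool lock_was_free(toy_spinlock *l)
{
    if (toy_spin_trylock(l)) {
        toy_spin_unlock(l);
        return true;
    }
    return false;
}
```

A function that expects its caller to hold the lock could start with
assert(!lock_was_free(&lock)). Unlike the kernel one-liner, the sketch drops the lock again
when the probe succeeds; in the kernel that does not matter because BUG_ON() never returns
in that case.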

I really don't see how that specific commit can be wrong, though, so I'll wait for your results before digging into it further.

~Maarten

* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-18 15:24               ` Maarten Lankhorst
@ 2012-12-18 16:12                 ` Markus Trippelsdorf
  2012-12-18 18:10                   ` Maarten Lankhorst
  2012-12-19 13:57                   ` Maarten Lankhorst
  0 siblings, 2 replies; 20+ messages in thread
From: Markus Trippelsdorf @ 2012-12-18 16:12 UTC (permalink / raw)
  To: Maarten Lankhorst; +Cc: Michel Dänzer, dri-devel

On 2012.12.18 at 16:24 +0100, Maarten Lankhorst wrote:
> Op 18-12-12 14:38, Markus Trippelsdorf schreef:
> > On 2012.12.18 at 12:20 +0100, Michel Dänzer wrote:
> >> On Mon, 2012-12-17 at 23:55 +0100, Markus Trippelsdorf wrote: 
> >>> On 2012.12.17 at 23:25 +0100, Markus Trippelsdorf wrote:
> >>>> On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
> >>>>> On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
> >>>>> <markus@trippelsdorf.de> wrote:
> >>>>>> On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
> >>>>>>> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
> >>>>>>> <markus@trippelsdorf.de> wrote:
> >>>>>>>> As soon as I open the following website:
> >>>>>>>> http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
> >>>>>>>>
> >>>>>>>> my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
> >>>>>>> Is this a regression?  Most likely a 3D driver bug unless you are only
> >>>>>>> seeing it with specific kernels.  What browser are you using and do
> >>>>>>> you have hw accelerated webgl, etc. enabled?  If so, what version of
> >>>>>>> mesa are you using?
> >>>>>> This is a regression, because it is caused by yesterday's merge of
> >>>>>> drm-next by Linus. IOW I only see this bug when running a
> >>>>>> v3.7-9432-g9360b53 kernel.
> >>>>> Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
> >>>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
> >>>> Yes, the commit above causes the issue. 
> >>>>
> >>>>  2d6cc72  GPU lockups
> >>> With 2d6cc72 reverted I get:
> >>>
> >>> Dec 17 23:09:35 x4 kernel: ------------[ cut here ]------------
> >> Probably a separate issue, can you bisect this one as well?
> > Yes. Git-bisect points to:
> >
> > 85b144f860176ec18db927d6d9ecdfb24d9c6483 is the first bad commit
> > commit 85b144f860176ec18db927d6d9ecdfb24d9c6483
> > Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> > Date:   Thu Nov 29 11:36:54 2012 +0000
> >
> >     drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock
> >     held, v3
> >
> > (Please note that this bug is a little bit harder to reproduce. But
> > when you scroll up and down for ~10 seconds on the webpage mentioned
> > above it will trigger the oops.
> > So while I'm not 100% sure that the issue is caused by exactly this
> > commit, the vicinity should be right)
> >
> Those dmesg warnings look suspicious; something is going very wrong
> there.
> 
> Can you revert the commit before it, "drm/radeon: allow move_notify to
> be called without reservation"? The reservation should be held at this
> point; that commit got in accidentally.
> 
> I doubt that not holding a reservation is causing it, though. Then
> again, I don't really see how that commit could cause it, so can you
> please double-check that it never happened before that point and only
> started at that commit?
> 
> Also slap a BUG_ON(!ttm_bo_is_reserved(bo)) into
> ttm_bo_cleanup_refs_and_unlock for good measure, and add a
> BUG_ON(spin_trylock(&bdev->fence_lock)); to ttm_bo_wait.
> 
> I really don't see how that specific commit can be wrong, though, so
> I'll wait for your results before digging into it further.

I just reran git-bisect just on your commits (from 1a1494def to 97a875cbd)
and I landed on the same commit as above:

commit 85b144f86 (drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock held, v3)

So now I'm pretty sure it's specifically this commit that started the
issue.

With your suggested debugging BUG_ONs added I still get:

Dec 18 17:01:15 x4 kernel: ------------[ cut here ]------------
Dec 18 17:01:15 x4 kernel: WARNING: at include/linux/kref.h:42 radeon_fence_ref+0x2c/0x40()
Dec 18 17:01:15 x4 kernel: Hardware name: System Product Name
Dec 18 17:01:15 x4 kernel: Pid: 157, comm: X Not tainted 3.7.0-rc7-00520-g85b144f-dirty #174
Dec 18 17:01:15 x4 kernel: Call Trace:
Dec 18 17:01:15 x4 kernel: [<ffffffff81058c84>] ? warn_slowpath_common+0x74/0xb0
Dec 18 17:01:15 x4 kernel: [<ffffffff8129273c>] ? radeon_fence_ref+0x2c/0x40
Dec 18 17:01:15 x4 kernel: [<ffffffff8125e95c>] ? ttm_bo_cleanup_refs_and_unlock+0x18c/0x2d0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125f17c>] ? ttm_mem_evict_first+0x1dc/0x2a0
Dec 18 17:01:15 x4 kernel: [<ffffffff81264452>] ? ttm_bo_man_get_node+0x62/0xb0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125f4ce>] ? ttm_bo_mem_space+0x28e/0x340
Dec 18 17:01:15 x4 kernel: [<ffffffff8125fb0c>] ? ttm_bo_move_buffer+0xfc/0x170
Dec 18 17:01:15 x4 kernel: [<ffffffff810de172>] ? kmem_cache_alloc+0xb2/0xc0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125fc15>] ? ttm_bo_validate+0x95/0x110
Dec 18 17:01:15 x4 kernel: [<ffffffff8125ff7c>] ? ttm_bo_init+0x2ec/0x3b0
Dec 18 17:01:15 x4 kernel: [<ffffffff8129419a>] ? radeon_bo_create+0x18a/0x200
Dec 18 17:01:15 x4 kernel: [<ffffffff81293e80>] ? radeon_bo_clear_va+0x40/0x40
Dec 18 17:01:15 x4 kernel: [<ffffffff812a5342>] ? radeon_gem_object_create+0x92/0x160
Dec 18 17:01:15 x4 kernel: [<ffffffff812a575c>] ? radeon_gem_create_ioctl+0x6c/0x150
Dec 18 17:01:15 x4 kernel: [<ffffffff812a529f>] ? radeon_gem_object_free+0x2f/0x40
Dec 18 17:01:15 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
Dec 18 17:01:15 x4 kernel: [<ffffffff812a56f0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
Dec 18 17:01:15 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
Dec 18 17:01:15 x4 kernel: [<ffffffff810e5588>] ? vfs_read+0x118/0x160
Dec 18 17:01:15 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
Dec 18 17:01:15 x4 kernel: [<ffffffff810e5851>] ? sys_read+0x51/0xa0
Dec 18 17:01:15 x4 kernel: [<ffffffff814b0612>] ? system_call_fastpath+0x16/0x1b
Dec 18 17:01:15 x4 kernel: ---[ end trace 485a2dd5755db51e ]---
Dec 18 17:01:15 x4 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000024
Dec 18 17:01:15 x4 kernel: IP: [<ffffffff81296488>] radeon_vm_bo_invalidate+0x18/0x30
Dec 18 17:01:15 x4 kernel: PGD 211d09067 PUD 211d52067 PMD 0
Dec 18 17:01:15 x4 kernel: Oops: 0002 [#1] SMP
Dec 18 17:01:15 x4 kernel: CPU 1
Dec 18 17:01:15 x4 kernel: Pid: 157, comm: X Tainted: G        W    3.7.0-rc7-00520-g85b144f-dirty #174 System manufacturer System Product Name/M4A78T-E
Dec 18 17:01:15 x4 kernel: RIP: 0010:[<ffffffff81296488>]  [<ffffffff81296488>] radeon_vm_bo_invalidate+0x18/0x30
Dec 18 17:01:15 x4 kernel: RSP: 0018:ffff880211ddfaa8  EFLAGS: 00010203
Dec 18 17:01:15 x4 kernel: RAX: 0000000000000000 RBX: ffff8801f94e1c48 RCX: ffff880205de3128
Dec 18 17:01:15 x4 kernel: RDX: 0000000000000001 RSI: ffff8801f94e1df0 RDI: ffff8801f94e1df8
Dec 18 17:01:15 x4 kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
Dec 18 17:01:15 x4 kernel: R10: 0000000000000000 R11: ffff880216a766b8 R12: ffff880216a76590
Dec 18 17:01:15 x4 kernel: R13: ffffffff818383e0 R14: 0000000000000001 R15: ffff880215c83678
Dec 18 17:01:15 x4 kernel: FS:  00007fbcabc8c880(0000) GS:ffff88021fc80000(0000) knlGS:0000000000000000
Dec 18 17:01:15 x4 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 18 17:01:15 x4 kernel: CR2: 0000000000000024 CR3: 0000000211d07000 CR4: 00000000000007e0
Dec 18 17:01:15 x4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 18 17:01:15 x4 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 18 17:01:15 x4 kernel: Process X (pid: 157, threadinfo ffff880211dde000, task ffff880211dc0ba0)
Dec 18 17:01:15 x4 kernel: Stack:
Dec 18 17:01:15 x4 kernel: ffffffff8125d2e9 ffff8801f94e1c48 ffffffff8125e909 ffff880216a769b8
Dec 18 17:01:15 x4 kernel: 01ff880200000001 ffff8801f94e1c84 0000000000000001 ffff880216a766b8
Dec 18 17:01:15 x4 kernel: 0000000000000000 ffff880215c83678 ffff8801f94e1c48 ffffffff8125f17c
Dec 18 17:01:15 x4 kernel: Call Trace:
Dec 18 17:01:15 x4 kernel: [<ffffffff8125d2e9>] ? ttm_bo_cleanup_memtype_use+0x19/0x90
Dec 18 17:01:15 x4 kernel: [<ffffffff8125e909>] ? ttm_bo_cleanup_refs_and_unlock+0x139/0x2d0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125f17c>] ? ttm_mem_evict_first+0x1dc/0x2a0
Dec 18 17:01:15 x4 kernel: [<ffffffff81264452>] ? ttm_bo_man_get_node+0x62/0xb0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125f4ce>] ? ttm_bo_mem_space+0x28e/0x340
Dec 18 17:01:15 x4 kernel: [<ffffffff8125fb0c>] ? ttm_bo_move_buffer+0xfc/0x170
Dec 18 17:01:15 x4 kernel: [<ffffffff810de172>] ? kmem_cache_alloc+0xb2/0xc0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125fc15>] ? ttm_bo_validate+0x95/0x110
Dec 18 17:01:15 x4 kernel: [<ffffffff8125ff7c>] ? ttm_bo_init+0x2ec/0x3b0
Dec 18 17:01:15 x4 kernel: [<ffffffff8129419a>] ? radeon_bo_create+0x18a/0x200
Dec 18 17:01:15 x4 kernel: [<ffffffff81293e80>] ? radeon_bo_clear_va+0x40/0x40
Dec 18 17:01:15 x4 kernel: [<ffffffff812a5342>] ? radeon_gem_object_create+0x92/0x160
Dec 18 17:01:15 x4 kernel: [<ffffffff812a575c>] ? radeon_gem_create_ioctl+0x6c/0x150
Dec 18 17:01:15 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
Dec 18 17:01:15 x4 kernel: [<ffffffff812a56f0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
Dec 18 17:01:15 x4 kernel: [<ffffffff8111c310>] ? fsnotify_clear_marks_by_inode+0x20/0xd0
Dec 18 17:01:15 x4 kernel: [<ffffffff810fbc35>] ? __destroy_inode+0x15/0x60
Dec 18 17:01:15 x4 kernel: [<ffffffff810de220>] ? kmem_cache_free+0x10/0x90
Dec 18 17:01:15 x4 kernel: [<ffffffff810f8eaf>] ? dput+0x2f/0x300
Dec 18 17:01:15 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
Dec 18 17:01:15 x4 kernel: [<ffffffff811005fb>] ? mntput_no_expire+0x7b/0x170
Dec 18 17:01:15 x4 kernel: [<ffffffff8107bb6b>] ? lg_global_unlock+0x3b/0x50
Dec 18 17:01:15 x4 kernel: [<ffffffff81071b9c>] ? task_work_run+0x8c/0xc0
Dec 18 17:01:15 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
Dec 18 17:01:15 x4 kernel: [<ffffffff814b0612>] ? system_call_fastpath+0x16/0x1b
Dec 18 17:01:15 x4 kernel: Code: 8b 44 24 04 48 83 c4 08 5b 5d 41 5c c3 66 0f 1f 44 00 00 48 8b 86 f0 01 00 00 48 81 c6 f0 01 00 00 48 39 f0 74 11 0f 1f 44 00 00 <c6> 40 24 00 48 8b 00 48 39 f0 75 f4 f3 c3 66 2e 0f 1f 84 00 00
Dec 18 17:01:15 x4 kernel: RIP  [<ffffffff81296488>] radeon_vm_bo_invalidate+0x18/0x30
Dec 18 17:01:15 x4 kernel: RSP <ffff880211ddfaa8>
Dec 18 17:01:15 x4 kernel: CR2: 0000000000000024
Dec 18 17:01:15 x4 kernel: ---[ end trace 485a2dd5755db51f ]---
Dec 18 17:01:15 x4 kernel: [drm:drm_release] *ERROR* Device busy: 1

-- 
Markus


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-18 16:12                 ` Markus Trippelsdorf
@ 2012-12-18 18:10                   ` Maarten Lankhorst
  2012-12-19 13:57                   ` Maarten Lankhorst
  1 sibling, 0 replies; 20+ messages in thread
From: Maarten Lankhorst @ 2012-12-18 18:10 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: Michel Dänzer, dri-devel

Op 18-12-12 17:12, Markus Trippelsdorf schreef:
> On 2012.12.18 at 16:24 +0100, Maarten Lankhorst wrote:
>> Op 18-12-12 14:38, Markus Trippelsdorf schreef:
>>> On 2012.12.18 at 12:20 +0100, Michel Dänzer wrote:
>>>> On Mon, 2012-12-17 at 23:55 +0100, Markus Trippelsdorf wrote: 
>>>>> On 2012.12.17 at 23:25 +0100, Markus Trippelsdorf wrote:
>>>>>> On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
>>>>>>> On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
>>>>>>> <markus@trippelsdorf.de> wrote:
>>>>>>>> On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
>>>>>>>>> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
>>>>>>>>> <markus@trippelsdorf.de> wrote:
>>>>>>>>>> As soon as I open the following website:
>>>>>>>>>> http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
>>>>>>>>>>
>>>>>>>>>> my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
>>>>>>>>> Is this a regression?  Most likely a 3D driver bug unless you are only
>>>>>>>>> seeing it with specific kernels.  What browser are you using and do
>>>>>>>>> you have hw accelerated webgl, etc. enabled?  If so, what version of
>>>>>>>>> mesa are you using?
>>>>>>>> This is a regression, because it is caused by yesterdays merge of
>>>>>>>> drm-next by Linus. IOW I only see this bug when running a
>>>>>>>> v3.7-9432-g9360b53 kernel.
>>>>>>> Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
>>>>>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
>>>>>> Yes, the commit above causes the issue. 
>>>>>>
>>>>>>  2d6cc72  GPU lockups
>>>>> With 2d6cc72 reverted I get:
>>>>>
>>>>> Dec 17 23:09:35 x4 kernel: ------------[ cut here ]------------
>>>> Probably a separate issue, can you bisect this one as well?
>>> Yes. Git-bisect points to:
>>>
>>> 85b144f860176ec18db927d6d9ecdfb24d9c6483 is the first bad commit
>>> commit 85b144f860176ec18db927d6d9ecdfb24d9c6483
>>> Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>> Date:   Thu Nov 29 11:36:54 2012 +0000
>>>
>>>     drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock
>>>     held, v3
>>>
>>> (Please note that this bug is a little bit harder to reproduce. But
>>> when you scroll up and down for ~10 seconds on the webpage mentioned
>>> above it will trigger the oops.
>>> So while I'm not 100% sure that the issue is caused by exactly this
>>> commit, the vicinity should be right)
>>>
>> Those dmesg warnings sound suspicious, looks like something is going
>> very wrong there.
>>
>> Can you revert the one before it? "drm/radeon: allow move_notify to be
>> called without reservation" Reservation should be held at this point,
>> that commit got in accidentally.
>>
>> I doubt not holding a reservation is causing it though, I don't really
>> see how that commit could cause it however, so can you please double
>> check it never happened before that point, and only started at that
>> commit?
>>
>> also slap in a BUG_ON(!ttm_bo_is_reserved(bo)) in
>> ttm_bo_cleanup_refs_and_unlock for good measure, and a
>> BUG_ON(spin_trylock(&bdev->fence_lock)); to ttm_bo_wait.
>>
>> I really don't see how that specific commit can be wrong though, so
>> awaiting your results first before I try to dig more into it.
> I just reran git-bisect just on your commits (from 1a1494def to 97a875cbd)
> and I landed on the same commit as above:
>
> commit 85b144f86 (drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock held, v3)
>
> So now I'm pretty sure it's specifically this commit that started the
> issue.
>
> With your supposed debugging BUG_ONs added I still get:
>
> Dec 18 17:01:15 x4 kernel: ------------[ cut here ]------------
> Dec 18 17:01:15 x4 kernel: WARNING: at include/linux/kref.h:42 radeon_fence_ref+0x2c/0x40()
> Dec 18 17:01:15 x4 kernel: Hardware name: System Product Name
> Dec 18 17:01:15 x4 kernel: Pid: 157, comm: X Not tainted 3.7.0-rc7-00520-g85b144f-dirty #174
> Dec 18 17:01:15 x4 kernel: Call Trace:
> Dec 18 17:01:15 x4 kernel: [<ffffffff81058c84>] ? warn_slowpath_common+0x74/0xb0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8129273c>] ? radeon_fence_ref+0x2c/0x40
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125e95c>] ? ttm_bo_cleanup_refs_and_unlock+0x18c/0x2d0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125f17c>] ? ttm_mem_evict_first+0x1dc/0x2a0
> Dec 18 17:01:15 x4 kernel: [<ffffffff81264452>] ? ttm_bo_man_get_node+0x62/0xb0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125f4ce>] ? ttm_bo_mem_space+0x28e/0x340
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125fb0c>] ? ttm_bo_move_buffer+0xfc/0x170
> Dec 18 17:01:15 x4 kernel: [<ffffffff810de172>] ? kmem_cache_alloc+0xb2/0xc0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125fc15>] ? ttm_bo_validate+0x95/0x110
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125ff7c>] ? ttm_bo_init+0x2ec/0x3b0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8129419a>] ? radeon_bo_create+0x18a/0x200
> Dec 18 17:01:15 x4 kernel: [<ffffffff81293e80>] ? radeon_bo_clear_va+0x40/0x40
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a5342>] ? radeon_gem_object_create+0x92/0x160
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a575c>] ? radeon_gem_create_ioctl+0x6c/0x150
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a529f>] ? radeon_gem_object_free+0x2f/0x40
> Dec 18 17:01:15 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a56f0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
> Dec 18 17:01:15 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
> Dec 18 17:01:15 x4 kernel: [<ffffffff810e5588>] ? vfs_read+0x118/0x160
> Dec 18 17:01:15 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
> Dec 18 17:01:15 x4 kernel: [<ffffffff810e5851>] ? sys_read+0x51/0xa0
> Dec 18 17:01:15 x4 kernel: [<ffffffff814b0612>] ? system_call_fastpath+0x16/0x1b
So nothing changed. Did you revert the drm/radeon patch before it yet? And wtf is going on here?

That patch shouldn't cause such issues by itself, and I don't see how the refcount on bo->sync_obj can be zero while bo->sync_obj is non-null.

Refcounting seems to be messed up on the fence somewhere, but I don't think it's caused by this patch.
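
To make the warning concrete: as far as I recall, the check at include/linux/kref.h:42 in this tree is the WARN_ON in kref_get() that fires when a reference is taken on an object whose count has already dropped to zero. The following is only a user-space sketch of that rule; struct model_kref, model_warnings and the model_* helpers are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for struct kref; not the kernel type. */
struct model_kref {
	atomic_int refcount;
};

/* Counts how often the zero-refcount check fired (stands in for WARN_ON). */
static int model_warnings;

static void model_kref_init(struct model_kref *k)
{
	atomic_init(&k->refcount, 1);
}

/* Taking a reference on an object whose count is already zero is a bug. */
static void model_kref_get(struct model_kref *k)
{
	if (atomic_load(&k->refcount) == 0)
		model_warnings++;
	atomic_fetch_add(&k->refcount, 1);
}

/* Returns 1 when the caller dropped the last reference. */
static int model_kref_put(struct model_kref *k)
{
	int old = atomic_fetch_sub(&k->refcount, 1);

	assert(old > 0);
	return old == 1;
}
```

In this model the check only fires when someone still holds a pointer to the object after the last put, which is the situation described above: a non-null pointer to something whose count is zero.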

~Maarten


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-18 16:12                 ` Markus Trippelsdorf
  2012-12-18 18:10                   ` Maarten Lankhorst
@ 2012-12-19 13:57                   ` Maarten Lankhorst
  2012-12-19 14:20                     ` Markus Trippelsdorf
  1 sibling, 1 reply; 20+ messages in thread
From: Maarten Lankhorst @ 2012-12-19 13:57 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: Michel Dänzer, dri-devel

Op 18-12-12 17:12, Markus Trippelsdorf schreef:
> On 2012.12.18 at 16:24 +0100, Maarten Lankhorst wrote:
>> Op 18-12-12 14:38, Markus Trippelsdorf schreef:
>>> On 2012.12.18 at 12:20 +0100, Michel Dänzer wrote:
>>>> On Mon, 2012-12-17 at 23:55 +0100, Markus Trippelsdorf wrote: 
>>>>> On 2012.12.17 at 23:25 +0100, Markus Trippelsdorf wrote:
>>>>>> On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
>>>>>>> On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
>>>>>>> <markus@trippelsdorf.de> wrote:
>>>>>>>> On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
>>>>>>>>> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
>>>>>>>>> <markus@trippelsdorf.de> wrote:
>>>>>>>>>> As soon as I open the following website:
>>>>>>>>>> http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
>>>>>>>>>>
>>>>>>>>>> my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
>>>>>>>>> Is this a regression?  Most likely a 3D driver bug unless you are only
>>>>>>>>> seeing it with specific kernels.  What browser are you using and do
>>>>>>>>> you have hw accelerated webgl, etc. enabled?  If so, what version of
>>>>>>>>> mesa are you using?
>>>>>>>> This is a regression, because it is caused by yesterdays merge of
>>>>>>>> drm-next by Linus. IOW I only see this bug when running a
>>>>>>>> v3.7-9432-g9360b53 kernel.
>>>>>>> Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
>>>>>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
>>>>>> Yes, the commit above causes the issue. 
>>>>>>
>>>>>>  2d6cc72  GPU lockups
>>>>> With 2d6cc72 reverted I get:
>>>>>
>>>>> Dec 17 23:09:35 x4 kernel: ------------[ cut here ]------------
>>>> Probably a separate issue, can you bisect this one as well?
>>> Yes. Git-bisect points to:
>>>
>>> 85b144f860176ec18db927d6d9ecdfb24d9c6483 is the first bad commit
>>> commit 85b144f860176ec18db927d6d9ecdfb24d9c6483
>>> Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>> Date:   Thu Nov 29 11:36:54 2012 +0000
>>>
>>>     drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock
>>>     held, v3
>>>
>>> (Please note that this bug is a little bit harder to reproduce. But
>>> when you scroll up and down for ~10 seconds on the webpage mentioned
>>> above it will trigger the oops.
>>> So while I'm not 100% sure that the issue is caused by exactly this
>>> commit, the vicinity should be right)
>>>
>> Those dmesg warnings sound suspicious, looks like something is going
>> very wrong there.
>>
>> Can you revert the one before it? "drm/radeon: allow move_notify to be
>> called without reservation" Reservation should be held at this point,
>> that commit got in accidentally.
>>
>> I doubt not holding a reservation is causing it though, I don't really
>> see how that commit could cause it however, so can you please double
>> check it never happened before that point, and only started at that
>> commit?
>>
>> also slap in a BUG_ON(!ttm_bo_is_reserved(bo)) in
>> ttm_bo_cleanup_refs_and_unlock for good measure, and a
>> BUG_ON(spin_trylock(&bdev->fence_lock)); to ttm_bo_wait.
>>
>> I really don't see how that specific commit can be wrong though, so
>> awaiting your results first before I try to dig more into it.
> I just reran git-bisect just on your commits (from 1a1494def to 97a875cbd)
> and I landed on the same commit as above:
>
> commit 85b144f86 (drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock held, v3)
>
> So now I'm pretty sure it's specifically this commit that started the
> issue.
>
> With your supposed debugging BUG_ONs added I still get:
>
> Dec 18 17:01:15 x4 kernel: ------------[ cut here ]------------
> Dec 18 17:01:15 x4 kernel: WARNING: at include/linux/kref.h:42 radeon_fence_ref+0x2c/0x40()
> Dec 18 17:01:15 x4 kernel: Hardware name: System Product Name
> Dec 18 17:01:15 x4 kernel: Pid: 157, comm: X Not tainted 3.7.0-rc7-00520-g85b144f-dirty #174
> Dec 18 17:01:15 x4 kernel: Call Trace:
> Dec 18 17:01:15 x4 kernel: [<ffffffff81058c84>] ? warn_slowpath_common+0x74/0xb0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8129273c>] ? radeon_fence_ref+0x2c/0x40
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125e95c>] ? ttm_bo_cleanup_refs_and_unlock+0x18c/0x2d0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125f17c>] ? ttm_mem_evict_first+0x1dc/0x2a0
> Dec 18 17:01:15 x4 kernel: [<ffffffff81264452>] ? ttm_bo_man_get_node+0x62/0xb0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125f4ce>] ? ttm_bo_mem_space+0x28e/0x340
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125fb0c>] ? ttm_bo_move_buffer+0xfc/0x170
> Dec 18 17:01:15 x4 kernel: [<ffffffff810de172>] ? kmem_cache_alloc+0xb2/0xc0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125fc15>] ? ttm_bo_validate+0x95/0x110
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125ff7c>] ? ttm_bo_init+0x2ec/0x3b0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8129419a>] ? radeon_bo_create+0x18a/0x200
> Dec 18 17:01:15 x4 kernel: [<ffffffff81293e80>] ? radeon_bo_clear_va+0x40/0x40
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a5342>] ? radeon_gem_object_create+0x92/0x160
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a575c>] ? radeon_gem_create_ioctl+0x6c/0x150
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a529f>] ? radeon_gem_object_free+0x2f/0x40
> Dec 18 17:01:15 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a56f0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
> Dec 18 17:01:15 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
> Dec 18 17:01:15 x4 kernel: [<ffffffff810e5588>] ? vfs_read+0x118/0x160
> Dec 18 17:01:15 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
> Dec 18 17:01:15 x4 kernel: [<ffffffff810e5851>] ? sys_read+0x51/0xa0
> Dec 18 17:01:15 x4 kernel: [<ffffffff814b0612>] ? system_call_fastpath+0x16/0x1b
So the refcount of the fence's kref is zero here. This should be impossible and indicates a bug in refcounting somewhere, or possibly memory corruption.

Let's first look at where things could go wrong.

Access to the sync_obj member requires fence_lock to be held, but radeon code in general doesn't do that, hm..

I think radeon_cs_sync_rings needs to take fence_lock during the iteration and then take a refcount on the fence,
and radeon_crtc_page_flip and radeon_move_blit are likewise missing fence_lock and a refcount on the fence.
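
The locking rule in the paragraph above could look roughly like this. It is a user-space sketch only: the spinlock is modelled with a C11 atomic_flag, and struct model_bo, struct model_fence and all model_* functions are hypothetical stand-ins, not the real TTM or radeon code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the real TTM or radeon types. */
struct model_spinlock {
	atomic_flag locked;
};

struct model_fence {
	atomic_int refcount;
};

struct model_bo {
	struct model_spinlock fence_lock; /* models bdev->fence_lock */
	struct model_fence *sync_obj;     /* models bo->sync_obj */
};

static void model_spin_init(struct model_spinlock *l)
{
	atomic_flag_clear(&l->locked);
}

static void model_spin_lock(struct model_spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
		; /* busy-wait until the holder clears the flag */
}

static void model_spin_unlock(struct model_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/*
 * Read sync_obj and take the reference while fence_lock is held, so the
 * fence cannot be dropped between the pointer read and the refcount bump.
 */
static struct model_fence *model_bo_get_fence(struct model_bo *bo)
{
	struct model_fence *fence;

	model_spin_lock(&bo->fence_lock);
	fence = bo->sync_obj;
	if (fence)
		atomic_fetch_add(&fence->refcount, 1);
	model_spin_unlock(&bo->fence_lock);
	return fence;
}

/* Drop the reference taken by model_bo_get_fence(); NULL is a no-op. */
static void model_fence_put(struct model_fence *fence)
{
	if (fence)
		atomic_fetch_sub(&fence->refcount, 1);
}
```

The point of the sketch is the ordering: a caller that reads the pointer without the lock and only later takes a reference leaves a window in which the fence can go away.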

But that would probably still not explain why it crashes in radeon_vm_bo_invalidate shortly after,
so it seems just as likely that it's operating on freed memory there or something.
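
One detail from the oops that fits this: the faulting instruction bytes marked in the Code: line are c6 40 24 00, which decode to a byte store of 0 at offset 0x24 from RAX, and RAX is 0 while CR2 is 0x24. That is what a list walk would produce if the list head's next pointer were NULL. The sketch below only does the address arithmetic; the struct layout is an assumption (modelled loosely on what I believe struct radeon_bo_va looks like on a 64-bit build), and all model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_list_head {
	struct model_list_head *next, *prev;
};

/* Assumed layout, LP64 only: 'valid' ends up 0x24 bytes into the struct. */
struct model_bo_va {
	struct model_list_head bo_list; /* 16 bytes on LP64 */
	uint64_t soffset;
	uint64_t eoffset;
	uint32_t flags;
	bool valid;
};

static_assert(offsetof(struct model_bo_va, bo_list) == 0,
	      "list node is assumed to be the first member");

/*
 * Address that a store to entry->valid would touch, given the node pointer
 * read from the list. Pure arithmetic; nothing is dereferenced.
 */
static uintptr_t model_store_address(uintptr_t node)
{
	uintptr_t entry = node - offsetof(struct model_bo_va, bo_list);

	return entry + offsetof(struct model_bo_va, valid);
}
```

With a NULL node pointer the computed store address is 0x24 under this assumed layout, matching CR2 in the oops. That does not prove anything about who zeroed or freed the memory, it only shows the crash is consistent with walking a list whose head no longer contains valid pointers.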

But none of the code touches refcounting for that bo, and I really don't see how I messed up anything there.

I seem to be able to reproduce it if I add a hack though, can you test
if you get the exact same issues if you apply this patch?

I call it "aggressively evict MRU buffer, and never call ddestroy", and for me it triggers by merely starting X. :-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 0bf66f9..9a8f0d8 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -512,6 +512,7 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
 	spin_lock(&glob->lru_lock);
 	ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
 
+	goto skip;
 	spin_lock(&bdev->fence_lock);
 	(void) ttm_bo_wait(bo, false, false, true);
 	if (!ret && !bo->sync_obj && 0) {
@@ -529,6 +530,7 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
 		sync_obj = driver->sync_obj_ref(bo->sync_obj);
 	spin_unlock(&bdev->fence_lock);
 
+skip:
 	if (!ret) {
 		atomic_set(&bo->reserved, 0);
 		wake_up_all(&bo->event_queue);
@@ -542,8 +544,7 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
 		driver->sync_obj_flush(sync_obj);
 		driver->sync_obj_unref(&sync_obj);
 	}
-	schedule_delayed_work(&bdev->wq,
-			      ((HZ / 100) < 1) ? 1 : HZ / 100);
+	schedule_delayed_work(&bdev->wq, HZ * 100);
 }
 
 /**
@@ -699,8 +700,7 @@ static void ttm_bo_delayed_workqueue(struct work_struct *work)
 	    container_of(work, struct ttm_bo_device, wq.work);
 
 	if (ttm_bo_delayed_delete(bdev, false)) {
-		schedule_delayed_work(&bdev->wq,
-				      ((HZ / 100) < 1) ? 1 : HZ / 100);
+		schedule_delayed_work(&bdev->wq, HZ * 100);
 	}
 }
 
@@ -743,8 +743,7 @@ EXPORT_SYMBOL(ttm_bo_lock_delayed_workqueue);
 void ttm_bo_unlock_delayed_workqueue(struct ttm_bo_device *bdev, int resched)
 {
 	if (resched)
-		schedule_delayed_work(&bdev->wq,
-				      ((HZ / 100) < 1) ? 1 : HZ / 100);
+		schedule_delayed_work(&bdev->wq, HZ * 100);
 }
 EXPORT_SYMBOL(ttm_bo_unlock_delayed_workqueue);
 
@@ -815,12 +814,15 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
 
 retry:
 	spin_lock(&glob->lru_lock);
-	if (list_empty(&man->lru)) {
-		spin_unlock(&glob->lru_lock);
-		return -EBUSY;
-	}
+	if (list_empty(&bdev->ddestroy)) {
+		if (list_empty(&man->lru)) {
+			spin_unlock(&glob->lru_lock);
+			return -EBUSY;
+		}
+		bo = list_entry(man->lru.prev, struct ttm_buffer_object, lru);
+	} else
+		bo = list_entry(bdev->ddestroy.prev, struct ttm_buffer_object, ddestroy);
 
-	bo = list_first_entry(&man->lru, struct ttm_buffer_object, lru);
 	kref_get(&bo->list_kref);
 
 	if (!list_empty(&bo->ddestroy)) {
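
For reference, the rescheduling interval changed by the schedule_delayed_work() hunks above can be worked out as follows. This is a stand-alone sketch: HZ is a build-time kernel constant, so it is passed in as a parameter here, and the model_* function names are hypothetical.

```c
#include <assert.h>

/* Interval used before the hack: HZ / 100 jiffies, but never less than 1. */
static long model_normal_delay(long hz)
{
	return ((hz / 100) < 1) ? 1 : hz / 100;
}

/* Interval used by the stress-test hack: HZ * 100 jiffies. */
static long model_hack_delay(long hz)
{
	return hz * 100;
}

/* Convert a jiffies count to milliseconds for a given tick rate. */
static long model_jiffies_to_ms(long jiffies, long hz)
{
	assert(hz > 0);
	return jiffies * 1000 / hz;
}
```

With HZ=1000 the normal interval is 10 jiffies (10 ms), with HZ=250 it is 2 jiffies (8 ms), and with HZ=50 the clamp gives 1 jiffy (20 ms). The hacked interval is 100 seconds for any HZ, so the delayed-destroy worker effectively never runs during a short test and dead buffers stay on the ddestroy list for the eviction path to find.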


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-19 13:57                   ` Maarten Lankhorst
@ 2012-12-19 14:20                     ` Markus Trippelsdorf
  2012-12-19 14:31                       ` Maarten Lankhorst
  0 siblings, 1 reply; 20+ messages in thread
From: Markus Trippelsdorf @ 2012-12-19 14:20 UTC (permalink / raw)
  To: Maarten Lankhorst; +Cc: Michel Dänzer, dri-devel

On 2012.12.19 at 14:57 +0100, Maarten Lankhorst wrote:
> Op 18-12-12 17:12, Markus Trippelsdorf schreef:
> > With your supposed debugging BUG_ONs added I still get:
> >
> > Dec 18 17:01:15 x4 kernel: ------------[ cut here ]------------
> > Dec 18 17:01:15 x4 kernel: WARNING: at include/linux/kref.h:42 radeon_fence_ref+0x2c/0x40()
> > Dec 18 17:01:15 x4 kernel: Hardware name: System Product Name
> > Dec 18 17:01:15 x4 kernel: Pid: 157, comm: X Not tainted 3.7.0-rc7-00520-g85b144f-dirty #174
> > Dec 18 17:01:15 x4 kernel: Call Trace:
> > Dec 18 17:01:15 x4 kernel: [<ffffffff81058c84>] ? warn_slowpath_common+0x74/0xb0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8129273c>] ? radeon_fence_ref+0x2c/0x40
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125e95c>] ? ttm_bo_cleanup_refs_and_unlock+0x18c/0x2d0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125f17c>] ? ttm_mem_evict_first+0x1dc/0x2a0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff81264452>] ? ttm_bo_man_get_node+0x62/0xb0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125f4ce>] ? ttm_bo_mem_space+0x28e/0x340
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125fb0c>] ? ttm_bo_move_buffer+0xfc/0x170
> > Dec 18 17:01:15 x4 kernel: [<ffffffff810de172>] ? kmem_cache_alloc+0xb2/0xc0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125fc15>] ? ttm_bo_validate+0x95/0x110
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125ff7c>] ? ttm_bo_init+0x2ec/0x3b0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8129419a>] ? radeon_bo_create+0x18a/0x200
> > Dec 18 17:01:15 x4 kernel: [<ffffffff81293e80>] ? radeon_bo_clear_va+0x40/0x40
> > Dec 18 17:01:15 x4 kernel: [<ffffffff812a5342>] ? radeon_gem_object_create+0x92/0x160
> > Dec 18 17:01:15 x4 kernel: [<ffffffff812a575c>] ? radeon_gem_create_ioctl+0x6c/0x150
> > Dec 18 17:01:15 x4 kernel: [<ffffffff812a529f>] ? radeon_gem_object_free+0x2f/0x40
> > Dec 18 17:01:15 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff812a56f0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
> > Dec 18 17:01:15 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff810e5588>] ? vfs_read+0x118/0x160
> > Dec 18 17:01:15 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff810e5851>] ? sys_read+0x51/0xa0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff814b0612>] ? system_call_fastpath+0x16/0x1b
> so the kref to fence is null here. This should be impossible and
> indicates a bug in refcounting somewhere, or possibly memory
> corruption.
> 
> Lets first look where things could go wrong..
> 
> sync_obj member requires fence_lock to be taken, but radeon code in
> general doesn't do that, hm..
> 
> I think radeon_cs_sync_rings needs to take fence_lock during the
> iteration, then taking on a refcount to the fence, and
> radeon_crtc_page_flip and radeon_move_blit are lacking refcount on
> fence_lock as well.
> 
> But that would probably still not explain why it crashes in
> radeon_vm_bo_invalidate shortly after, so it seems just as likely that
> it's operating on freed memory there or something.
> 
> But none of the code touches refcounting for that bo, and I really
> don't see how I messed up anything there.
> 
> I seem to be able to reproduce it if I add a hack though, can you test
> if you get the exact same issues if you apply this patch?

Your patch doesn't apply unfortunately:

markus@x4 linux % patch -p1 --dry-run < ~/maarten.patch
checking file drivers/gpu/drm/ttm/ttm_bo.c
Hunk #1 succeeded at 512 with fuzz 1.
Hunk #6 FAILED at 814.
1 out of 6 hunks FAILED
markus@x4 linux % git describe
v3.7-10833-g752451f
markus@x4 linux % 

-- 
Markus


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-19 14:20                     ` Markus Trippelsdorf
@ 2012-12-19 14:31                       ` Maarten Lankhorst
  0 siblings, 0 replies; 20+ messages in thread
From: Maarten Lankhorst @ 2012-12-19 14:31 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: Michel Dänzer, dri-devel

Op 19-12-12 15:20, Markus Trippelsdorf schreef:
> On 2012.12.19 at 14:57 +0100, Maarten Lankhorst wrote:
>> Op 18-12-12 17:12, Markus Trippelsdorf schreef:
>>> With your supposed debugging BUG_ONs added I still get:
>>>
>>> Dec 18 17:01:15 x4 kernel: ------------[ cut here ]------------
>>> Dec 18 17:01:15 x4 kernel: WARNING: at include/linux/kref.h:42 radeon_fence_ref+0x2c/0x40()
>>> Dec 18 17:01:15 x4 kernel: Hardware name: System Product Name
>>> Dec 18 17:01:15 x4 kernel: Pid: 157, comm: X Not tainted 3.7.0-rc7-00520-g85b144f-dirty #174
>>> Dec 18 17:01:15 x4 kernel: Call Trace:
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff81058c84>] ? warn_slowpath_common+0x74/0xb0
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff8129273c>] ? radeon_fence_ref+0x2c/0x40
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff8125e95c>] ? ttm_bo_cleanup_refs_and_unlock+0x18c/0x2d0
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff8125f17c>] ? ttm_mem_evict_first+0x1dc/0x2a0
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff81264452>] ? ttm_bo_man_get_node+0x62/0xb0
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff8125f4ce>] ? ttm_bo_mem_space+0x28e/0x340
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff8125fb0c>] ? ttm_bo_move_buffer+0xfc/0x170
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff810de172>] ? kmem_cache_alloc+0xb2/0xc0
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff8125fc15>] ? ttm_bo_validate+0x95/0x110
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff8125ff7c>] ? ttm_bo_init+0x2ec/0x3b0
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff8129419a>] ? radeon_bo_create+0x18a/0x200
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff81293e80>] ? radeon_bo_clear_va+0x40/0x40
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff812a5342>] ? radeon_gem_object_create+0x92/0x160
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff812a575c>] ? radeon_gem_create_ioctl+0x6c/0x150
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff812a529f>] ? radeon_gem_object_free+0x2f/0x40
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff812a56f0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff810e5588>] ? vfs_read+0x118/0x160
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff810e5851>] ? sys_read+0x51/0xa0
>>> Dec 18 17:01:15 x4 kernel: [<ffffffff814b0612>] ? system_call_fastpath+0x16/0x1b
>> so the kref to fence is null here. This should be impossible and
>> indicates a bug in refcounting somewhere, or possibly memory
>> corruption.
>>
>> Lets first look where things could go wrong..
>>
>> sync_obj member requires fence_lock to be taken, but radeon code in
>> general doesn't do that, hm..
>>
>> I think radeon_cs_sync_rings needs to take fence_lock during the
>> iteration, then taking on a refcount to the fence, and
>> radeon_crtc_page_flip and radeon_move_blit are lacking refcount on
>> fence_lock as well.
>>
>> But that would probably still not explain why it crashes in
>> radeon_vm_bo_invalidate shortly after, so it seems just as likely that
>> it's operating on freed memory there or something.
>>
>> But none of the code touches refcounting for that bo, and I really
>> don't see how I messed up anything there.
>>
>> I seem to be able to reproduce it if I add a hack though, can you test
>> if you get the exact same issues if you apply this patch?
> Your patch doesn't apply unfortunately:
>
> markus@x4 linux % patch -p1 --dry-run < ~/maarten.patch
> checking file drivers/gpu/drm/ttm/ttm_bo.c
> Hunk #1 succeeded at 512 with fuzz 1.
> Hunk #6 FAILED at 814.
> 1 out of 6 hunks FAILED
> markus@x4 linux % git describe
> v3.7-10833-g752451f
> markus@x4 linux % 
It applies on top of the regressed commit. It should probably not be too hard to make it apply
manually on whatever you're using.

But the real fix will be "drm/ttm: fix delayed ttm_bo_cleanup_refs_and_unlock delayed handling",
which I cc'd you on. The patch I posted earlier in this thread will just aggressively stress test the codepath.

~Maarten


* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-17 22:25       ` Markus Trippelsdorf
  2012-12-17 22:55         ` Markus Trippelsdorf
@ 2012-12-23  1:46         ` Alex Deucher
  2012-12-23  8:43           ` Markus Trippelsdorf
  1 sibling, 1 reply; 20+ messages in thread
From: Alex Deucher @ 2012-12-23  1:46 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: dri-devel

On Mon, Dec 17, 2012 at 5:25 PM, Markus Trippelsdorf
<markus@trippelsdorf.de> wrote:
> On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
>> On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
>> <markus@trippelsdorf.de> wrote:
>> > On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
>> >> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
>> >> <markus@trippelsdorf.de> wrote:
>> >> > As soon as I open the following website:
>> >> > http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
>> >> >
>> >> > my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
>> >>
>> >> Is this a regression?  Most likely a 3D driver bug unless you are only
>> >> seeing it with specific kernels.  What browser are you using and do
>> >> you have hw accelerated webgl, etc. enabled?  If so, what version of
>> >> mesa are you using?
>> >
>> > This is a regression, because it is caused by yesterday's merge of
>> > drm-next by Linus. IOW I only see this bug when running a
>> > v3.7-9432-g9360b53 kernel.
>>
>> Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
>
> Yes, the commit above causes the issue.
>

Does booting with radeon.wb=0 fix the issue?  Please make sure your
kernel has this patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=86a1881d08f65a42c17071a59c0088dbe2870246

Alex

>  2d6cc72  GPU lockups
>  009ee7a  runs fine
>
> --
> Markus

* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-23  1:46         ` Alex Deucher
@ 2012-12-23  8:43           ` Markus Trippelsdorf
  2012-12-23 10:09             ` Andy Furniss
  0 siblings, 1 reply; 20+ messages in thread
From: Markus Trippelsdorf @ 2012-12-23  8:43 UTC (permalink / raw)
  To: Alex Deucher; +Cc: dri-devel

On 2012.12.22 at 20:46 -0500, Alex Deucher wrote:
> On Mon, Dec 17, 2012 at 5:25 PM, Markus Trippelsdorf
> <markus@trippelsdorf.de> wrote:
> > On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
> >> On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
> >> <markus@trippelsdorf.de> wrote:
> >> > On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
> >> >> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
> >> >> <markus@trippelsdorf.de> wrote:
> >> >> > As soon as I open the following website:
> >> >> > http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
> >> >> >
> >> >> > my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
> >> >>
> >> >> Is this a regression?  Most likely a 3D driver bug unless you are only
> >> >> seeing it with specific kernels.  What browser are you using and do
> >> >> you have hw accelerated webgl, etc. enabled?  If so, what version of
> >> >> mesa are you using?
> >> >
> >> > This is a regression, because it is caused by yesterday's merge of
> >> > drm-next by Linus. IOW I only see this bug when running a
> >> > v3.7-9432-g9360b53 kernel.
> >>
> >> Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
> >> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
> >
> > Yes, the commit above causes the issue.
> >
> 
> Does booting with radeon.wb=0 fix the issue?  Please make sure your
> kernel has this patch:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=86a1881d08f65a42c17071a59c0088dbe2870246

My kernel has this patch and radeon.wb=0 doesn't help. It still freezes
the machine as soon as you scroll on a website with many big images.

-- 
Markus

* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-23  8:43           ` Markus Trippelsdorf
@ 2012-12-23 10:09             ` Andy Furniss
  2012-12-23 10:21               ` Markus Trippelsdorf
  0 siblings, 1 reply; 20+ messages in thread
From: Andy Furniss @ 2012-12-23 10:09 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: dri-devel

Markus Trippelsdorf wrote:

>> Does booting with radeon.wb=0 fix the issue?  Please make sure your
>> kernel has this patch:
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=86a1881d08f65a42c17071a59c0088dbe2870246
>
> My kernel has this patch and radeon.wb=0 doesn't help.

I think that should be no_wb=1

* Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
  2012-12-23 10:09             ` Andy Furniss
@ 2012-12-23 10:21               ` Markus Trippelsdorf
  0 siblings, 0 replies; 20+ messages in thread
From: Markus Trippelsdorf @ 2012-12-23 10:21 UTC (permalink / raw)
  To: Andy Furniss; +Cc: dri-devel

On 2012.12.23 at 10:09 +0000, Andy Furniss wrote:
> Markus Trippelsdorf wrote:
> 
> >> Does booting with radeon.wb=0 fix the issue?  Please make sure your
> >> kernel has this patch:
> >> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=86a1881d08f65a42c17071a59c0088dbe2870246
> >
> > My kernel has this patch and radeon.wb=0 doesn't help.
> 
> I think that should be no_wb=1

Yes, you're right. But even with radeon.no_wb=1 it still hangs:


...
Dec 23 11:15:02 x4 kernel: radeon 0000:01:05.0: WB disabled
Dec 23 11:15:02 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000004 and cpu addr 0xffff8802163ad004
Dec 23 11:15:02 x4 kernel: radeon 0000:01:05.0: fence driver on ring 3 use gpu addr 0x00000000a0000c0c and cpu addr 0xffff8802163adc0c
Dec 23 11:15:02 x4 kernel: radeon 0000:01:05.0: setting latency timer to 64
...
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0: GPU lockup CP stall for more than 10000msec
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0: GPU lockup (waiting for 0x000000000000089c last fence id 0x000000000000089b)
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0: Saved 217 dwords of commands on ring 0.
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0: GPU softreset 
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS=0xA000B030
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2=0x00000003
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS=0x20005040
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000002
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x0000D086
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80098645
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_008020_GRBM_SOFT_RESET=0x00007FEE
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00000001
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS=0xA000B030
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2=0x00000003
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS=0x2000C040
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80100000
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
Dec 23 11:16:04 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0040000).
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0: WB disabled
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000004 and cpu addr 0xffff8802163ad004
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0: fence driver on ring 3 use gpu addr 0x00000000a0000c0c and cpu addr 0xffff8802163adc0c
Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0: setting latency timer to 64
Dec 23 11:16:04 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
Dec 23 11:16:05 x4 kernel: [drm:r600_dma_ring_test] *ERROR* radeon: ring 3 test failed (0xCAFEDEAD)
Dec 23 11:16:05 x4 kernel: [drm:r600_resume] *ERROR* r600 startup failed on resume
Dec 23 11:16:09 x4 kernel: SysRq : Emergency Sync
Dec 23 11:16:09 x4 kernel: Emergency Sync complete
Dec 23 11:16:15 x4 kernel: SysRq : Emergency Remount R/O
Dec 23 11:16:15 x4 kernel: EXT4-fs (sdb2): re-mounted. Opts: (null)
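
For anyone comparing lockup lines across logs, here is a minimal sketch (not
part of the original report) that pulls the two fence ids out of the
"GPU lockup (waiting for ... last fence id ...)" message and reports how far
apart they are. The names LOCKUP_RE and fence_gap are hypothetical, and the
regex assumes the message format exactly as it appears in the log above.

```python
import re

# Assumed format, copied from the log line above:
#   GPU lockup (waiting for 0x... last fence id 0x...)
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for (0x[0-9a-fA-F]+) last fence id (0x[0-9a-fA-F]+)\)"
)


def fence_gap(line):
    """Return (waiting, last, waiting - last), or None if the line does not match."""
    m = LOCKUP_RE.search(line)
    if m is None:
        return None
    waiting = int(m.group(1), 16)
    last = int(m.group(2), 16)
    return waiting, last, waiting - last


line = (
    "Dec 23 11:16:04 x4 kernel: radeon 0000:01:05.0: GPU lockup "
    "(waiting for 0x000000000000089c last fence id 0x000000000000089b)"
)
print(fence_gap(line))  # (2204, 2203, 1)
```

In this log the gap is 1, which would suggest the ring stalled on the fence
right after the last one reported, rather than falling many submissions behind.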

-- 
Markus

end of thread, other threads:[~2012-12-23 10:33 UTC | newest]

Thread overview: 20+ messages
2012-12-17 18:27 GPU lockup CP stall for more than 10000msec on latest vanilla git Markus Trippelsdorf
2012-12-17 21:32 ` Alex Deucher
2012-12-17 21:48   ` Markus Trippelsdorf
2012-12-17 21:58     ` Markus Trippelsdorf
2012-12-17 22:00     ` Alex Deucher
2012-12-17 22:25       ` Markus Trippelsdorf
2012-12-17 22:55         ` Markus Trippelsdorf
2012-12-18 11:20           ` Michel Dänzer
2012-12-18 13:38             ` Markus Trippelsdorf
2012-12-18 13:51               ` Markus Trippelsdorf
2012-12-18 15:24               ` Maarten Lankhorst
2012-12-18 16:12                 ` Markus Trippelsdorf
2012-12-18 18:10                   ` Maarten Lankhorst
2012-12-19 13:57                   ` Maarten Lankhorst
2012-12-19 14:20                     ` Markus Trippelsdorf
2012-12-19 14:31                       ` Maarten Lankhorst
2012-12-23  1:46         ` Alex Deucher
2012-12-23  8:43           ` Markus Trippelsdorf
2012-12-23 10:09             ` Andy Furniss
2012-12-23 10:21               ` Markus Trippelsdorf
