From: bugzilla-daemon@bugzilla.kernel.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
Date: Tue, 09 Sep 2014 03:09:08 +0000
Message-ID: <bug-78221-2300-JPm57nnuTz@https.bugzilla.kernel.org/>
In-Reply-To: <bug-78221-2300@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #22 from t3st3r@mail.ru ---
Tested on 3.17-rc4. Result: it crashed after about 3 minutes of running (see
below).

Are some stability fixes missing from 3.17-rc4 mainline? At first glance I do
not see any radeon-related commits in drm-fixes that haven't made it into -rc4.
Am I missing something?

===cut===
 kernel: [  599.949295] radeon 0000:01:00.0: ring 3 stalled for more than
10167msec
 kernel: [  599.949305] radeon 0000:01:00.0: GPU lockup (waiting for
0x0000000000001eb0 last fence id 0x0000000000001eaf on ring 3)
 kernel: [  599.949312] radeon 0000:01:00.0: scheduling IB failed (-35).
 kernel: [  600.507409] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x000000008040a840 flags=0x0010]
 kernel: [  600.507420] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x000000008040a870 flags=0x0030]
 kernel: [  600.507426] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x0000000080000100 flags=0x0030]
 kernel: [  600.507431] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x000000008040a700 flags=0x0010]
 kernel: [  600.507460] radeon 0000:01:00.0: Saved 19308 dwords of commands on
ring 0.
 kernel: [  600.507590] radeon 0000:01:00.0: GPU softreset: 0x0000006C
 kernel: [  600.507593] radeon 0000:01:00.0:   GRBM_STATUS               =
0xA0003028
 kernel: [  600.507596] radeon 0000:01:00.0:   GRBM_STATUS_SE0           =
0x00000006
 kernel: [  600.507598] radeon 0000:01:00.0:   GRBM_STATUS_SE1           =
0x00000006
 kernel: [  600.507600] radeon 0000:01:00.0:   SRBM_STATUS               =
0x200000C0
 kernel: [  600.507711] radeon 0000:01:00.0:   SRBM_STATUS2              =
0x00000000
 kernel: [  600.507714] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 =
0x00000000
 kernel: [  600.507716] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 =
0x00010000
 kernel: [  600.507718] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     =
0x00000002
 kernel: [  600.507720] radeon 0000:01:00.0:   R_008680_CP_STAT          =
0x80010243
 kernel: [  600.507723] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   =
0x44483106
 kernel: [  600.507725] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   =
0x44E84266
 kernel: [  600.507728] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
 kernel: [  600.507730] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
 kernel: [  601.054357] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
 kernel: [  601.054411] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100140
 kernel: [  601.055568] radeon 0000:01:00.0:   GRBM_STATUS               =
0x00003028
 kernel: [  601.055571] radeon 0000:01:00.0:   GRBM_STATUS_SE0           =
0x00000006
 kernel: [  601.055573] radeon 0000:01:00.0:   GRBM_STATUS_SE1           =
0x00000006
 kernel: [  601.055575] radeon 0000:01:00.0:   SRBM_STATUS               =
0x20000AC0
 kernel: [  601.055686] radeon 0000:01:00.0:   SRBM_STATUS2              =
0x00000000
 kernel: [  601.055689] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 =
0x00000000
 kernel: [  601.055691] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 =
0x00000000
 kernel: [  601.055693] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     =
0x00000000
 kernel: [  601.055695] radeon 0000:01:00.0:   R_008680_CP_STAT          =
0x00000000
 kernel: [  601.055698] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  601.055700] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  601.055951] radeon 0000:01:00.0: GPU reset succeeded, trying to
resume
 kernel: [  601.083744] [drm] probing gen 2 caps for device 1002:5a16 =
31cd02/0
 kernel: [  601.083747] [drm] PCIE gen 2 link speeds already enabled
 kernel: [  601.084938] [drm] PCIE GART of 1024M enabled (table at
0x0000000000276000).
 kernel: [  601.085046] radeon 0000:01:00.0: WB enabled
 kernel: [  601.085049] radeon 0000:01:00.0: fence driver on ring 0 use gpu
addr 0x0000000080000c00 and cpu addr 0xffff880413fbec00
 kernel: [  601.085052] radeon 0000:01:00.0: fence driver on ring 1 use gpu
addr 0x0000000080000c04 and cpu addr 0xffff880413fbec04
 kernel: [  601.085054] radeon 0000:01:00.0: fence driver on ring 2 use gpu
addr 0x0000000080000c08 and cpu addr 0xffff880413fbec08
 kernel: [  601.085056] radeon 0000:01:00.0: fence driver on ring 3 use gpu
addr 0x0000000080000c0c and cpu addr 0xffff880413fbec0c
 kernel: [  601.085057] radeon 0000:01:00.0: fence driver on ring 4 use gpu
addr 0x0000000080000c10 and cpu addr 0xffff880413fbec10
 kernel: [  601.086030] radeon 0000:01:00.0: fence driver on ring 5 use gpu
addr 0x0000000000075a18 and cpu addr 0xffffc90011db5a18
 kernel: [  601.271000] [drm] ring test on 0 succeeded in 3 usecs
 kernel: [  601.271006] [drm] ring test on 1 succeeded in 1 usecs
 kernel: [  601.271011] [drm] ring test on 2 succeeded in 1 usecs
 kernel: [  601.271075] [drm] ring test on 3 succeeded in 2 usecs
 kernel: [  601.271084] [drm] ring test on 4 succeeded in 1 usecs
 kernel: [  601.448164] [drm] ring test on 5 succeeded in 2 usecs
 kernel: [  601.448172] [drm] UVD initialized successfully.
 kernel: [  611.444226] radeon 0000:01:00.0: ring 0 stalled for more than
10000msec
 kernel: [  611.444237] radeon 0000:01:00.0: GPU lockup (waiting for
0x000000000001a60a last fence id 0x000000000001a4dd on ring 0)
 kernel: [  611.444244] [drm:r600_ib_test] *ERROR* radeon: fence wait failed
(-35).
 kernel: [  611.444252] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed
testing IB on GFX ring (-35).
 kernel: [  611.444257] radeon 0000:01:00.0: ib ring test failed (-35).
 kernel: [  611.997330] radeon 0000:01:00.0: GPU softreset: 0x00000048
 kernel: [  611.997333] radeon 0000:01:00.0:   GRBM_STATUS               =
0xA0003028
 kernel: [  611.997336] radeon 0000:01:00.0:   GRBM_STATUS_SE0           =
0x00000006
 kernel: [  611.997338] radeon 0000:01:00.0:   GRBM_STATUS_SE1           =
0x00000006
 kernel: [  611.997341] radeon 0000:01:00.0:   SRBM_STATUS               =
0x200000C0
 kernel: [  611.997452] radeon 0000:01:00.0:   SRBM_STATUS2              =
0x00000000
 kernel: [  611.997454] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 =
0x00000000
 kernel: [  611.997456] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 =
0x00010000
 kernel: [  611.997458] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     =
0x00400002
 kernel: [  611.997461] radeon 0000:01:00.0:   R_008680_CP_STAT          =
0x84010243
 kernel: [  611.997463] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  611.997465] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  611.997468] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
 kernel: [  611.997470] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
 kernel: [  612.542126] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
 kernel: [  612.542180] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
 kernel: [  612.543338] radeon 0000:01:00.0:   GRBM_STATUS               =
0x00003028
 kernel: [  612.543340] radeon 0000:01:00.0:   GRBM_STATUS_SE0           =
0x00000006
 kernel: [  612.543343] radeon 0000:01:00.0:   GRBM_STATUS_SE1           =
0x00000006
 kernel: [  612.543345] radeon 0000:01:00.0:   SRBM_STATUS               =
0x200000C0
 kernel: [  612.543456] radeon 0000:01:00.0:   SRBM_STATUS2              =
0x00000000
 kernel: [  612.543458] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 =
0x00000000
 kernel: [  612.543460] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 =
0x00000000
 kernel: [  612.543462] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     =
0x00000000
 kernel: [  612.543465] radeon 0000:01:00.0:   R_008680_CP_STAT          =
0x00000000
 kernel: [  612.543467] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  612.543469] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  612.543724] radeon 0000:01:00.0: GPU reset succeeded, trying to
resume
 kernel: [  612.556911] [drm] probing gen 2 caps for device 1002:5a16 =
31cd02/0
 kernel: [  612.556915] [drm] PCIE gen 2 link speeds already enabled
 kernel: [  612.558107] [drm] PCIE GART of 1024M enabled (table at
0x0000000000276000).
 kernel: [  612.558216] radeon 0000:01:00.0: WB enabled
 kernel: [  612.558219] radeon 0000:01:00.0: fence driver on ring 0 use gpu
addr 0x0000000080000c00 and cpu addr 0xffff880413fbec00
 kernel: [  612.558222] radeon 0000:01:00.0: fence driver on ring 1 use gpu
addr 0x0000000080000c04 and cpu addr 0xffff880413fbec04
 kernel: [  612.558224] radeon 0000:01:00.0: fence driver on ring 2 use gpu
addr 0x0000000080000c08 and cpu addr 0xffff880413fbec08
 kernel: [  612.558226] radeon 0000:01:00.0: fence driver on ring 3 use gpu
addr 0x0000000080000c0c and cpu addr 0xffff880413fbec0c
 kernel: [  612.558228] radeon 0000:01:00.0: fence driver on ring 4 use gpu
addr 0x0000000080000c10 and cpu addr 0xffff880413fbec10
 kernel: [  612.559203] radeon 0000:01:00.0: fence driver on ring 5 use gpu
addr 0x0000000000075a18 and cpu addr 0xffffc90011db5a18
 kernel: [  612.744297] [drm] ring test on 0 succeeded in 3 usecs
 kernel: [  612.744302] [drm] ring test on 1 succeeded in 1 usecs
 kernel: [  612.744308] [drm] ring test on 2 succeeded in 1 usecs
 kernel: [  612.744371] [drm] ring test on 3 succeeded in 2 usecs
 kernel: [  612.744380] [drm] ring test on 4 succeeded in 1 usecs
 kernel: [  612.921464] [drm] ring test on 5 succeeded in 2 usecs
 kernel: [  612.921472] [drm] UVD initialized successfully.
 kernel: [  612.921539] [drm] ib test on ring 0 succeeded in 0 usecs
 kernel: [  612.921634] [drm] ib test on ring 1 succeeded in 0 usecs
 kernel: [  612.921722] [drm] ib test on ring 2 succeeded in 0 usecs
 kernel: [  612.921762] [drm] ib test on ring 3 succeeded in 0 usecs
 kernel: [  612.921796] [drm] ib test on ring 4 succeeded in 0 usecs
 kernel: [  623.068910] radeon 0000:01:00.0: ring 5 stalled for more than
10000msec
 kernel: [  623.068921] radeon 0000:01:00.0: GPU lockup (waiting for
0x0000000000000004 last fence id 0x0000000000000002 on ring 5)
 kernel: [  623.068927] [drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait
failed (-35).
 kernel: [  623.068935] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed
testing IB on ring 5 (-35).
 kernel: [  623.098333] radeon 0000:01:00.0: GPU fault detected: 146 0x07a23d0c
 kernel: [  623.098342] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0000BDBD
 kernel: [  623.098347] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0203D00C
 kernel: [  623.098352] VM fault (0x0c, vmid 1) at page 48573, read from DMA1
(61)
 kernel: [  623.098364] radeon 0000:01:00.0: GPU fault detected: 146 0x07c23d0c
 kernel: [  623.098368] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
 kernel: [  623.098372] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0208400C
 kernel: [  623.098377] VM fault (0x0c, vmid 1) at page 0, read from TC (132)
 kernel: [  623.098383] radeon 0000:01:00.0: GPU fault detected: 146 0x07e23d0c
 kernel: [  623.098387] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0000BDBC
 kernel: [  623.098391] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0200800C
 kernel: [  623.098395] VM fault (0x0c, vmid 1) at page 48572, read from TC (8)
 kernel: [  623.128770] radeon 0000:01:00.0: GPU fault detected: 146 0x06033d14
 kernel: [  623.128781] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0000BDB0
 kernel: [  623.128787] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0303D014
 kernel: [  623.128793] VM fault (0x04, vmid 1) at page 48560, write from DMA1
(61)
 kernel: [  623.128820] radeon 0000:01:00.0: GPU fault detected: 146 0x06033d14
 kernel: [  623.128825] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
 kernel: [  623.128830] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0204400C
 kernel: [  623.128835] VM fault (0x0c, vmid 1) at page 0, read from TC (68)
 kernel: [  623.128842] radeon 0000:01:00.0: GPU fault detected: 146 0x06033d14
 kernel: [  623.128847] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0000BDB8
 kernel: [  623.128852] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0204400C
 kernel: [  623.128857] VM fault (0x0c, vmid 1) at page 48568, read from TC
(68)
 kernel: [  623.129932] radeon 0000:01:00.0: GPU fault detected: 146 0x06033d14
 kernel: [  623.129940] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0000BDB0
 kernel: [  623.129944] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0303D014
 kernel: [  623.129948] VM fault (0x04, vmid 1) at page 48560, write from DMA1
(61)
 kernel: [  623.129965] radeon 0000:01:00.0: GPU fault detected: 146 0x06233d14
===cut===
Note: several megabytes of similar "VM fault" messages omitted.
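
For reference, the decoded "VM fault" lines in the log can be reproduced from
the raw register values. The sketch below is only an illustration, not the
driver code: the bit positions are inferred from the values in this log (for
example, status 0x0203D00C is reported as 0x0c, vmid 1, read, client 61), and
the function names are hypothetical. It does not map client IDs to block names
such as DMA1 or TC.

```python
def decode_vm_fault(status, addr):
    """Split VM_CONTEXT1_PROTECTION_FAULT_STATUS/ADDR into fields.

    Bit layout is an assumption inferred from the log above:
      bits 0-3   protection fault bits
      bits 12-19 memory client id
      bit  24    1 = write, 0 = read
      bits 25-28 vmid
    """
    return {
        "protections": status & 0xF,
        "mc_id": (status >> 12) & 0xFF,
        "write": bool((status >> 24) & 0x1),
        "vmid": (status >> 25) & 0xF,
        "page": addr,  # the log prints the ADDR register as a decimal page number
    }


def format_fault(status, addr):
    """Render the fields in roughly the same form as the kernel message."""
    f = decode_vm_fault(status, addr)
    rw = "write" if f["write"] else "read"
    return "VM fault (0x%02x, vmid %d) at page %d, %s from client %d" % (
        f["protections"], f["vmid"], f["page"], rw, f["mc_id"])


if __name__ == "__main__":
    # First fault pair from the log: ADDR 0x0000BDBD, STATUS 0x0203D00C
    print(format_fault(0x0203D00C, 0x0000BDBD))
    # VM fault (0x0c, vmid 1) at page 48573, read from client 61
```

Running the same function over the other STATUS/ADDR pairs in the log gives
client IDs 132, 8, 68 and 61, matching the numbers in parentheses printed by
the kernel.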

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
