From: bugzilla-daemon@bugzilla.kernel.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 215494] New: [radeon, rv370] Running piglit shaders@glsl-vs-raytrace-bug26691 test causes hard lockup & reboot
Date: Fri, 14 Jan 2022 19:33:26 +0000
Message-ID: <bug-215494-2300@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=215494

            Bug ID: 215494
           Summary: [radeon, rv370] Running piglit
                    shaders@glsl-vs-raytrace-bug26691 test causes hard
                    lockup & reboot
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.16.0
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: erhard_f@mailbox.org
                CC: alexdeucher@gmail.com
        Regression: No

Created attachment 300268
  --> https://bugzilla.kernel.org/attachment.cgi?id=300268&action=edit
kernel dmesg (kernel 5.16.0, Ryzen 9 5950X)

Running the piglit test suite (git-11ee10ba04) for
https://gitlab.freedesktop.org/mesa/mesa/-/issues/3152 via './piglit run -1
quick -l verbose -s --dmesg' on a Radeon X600 causes the X600 to hard lockup &
reboot. On my system this happens with kernels 5.15.11 and 5.16.0, and with
mesa 21.3.4 and mesa 22 (git-8b3d947267).

I had a closer look and found that shaders@glsl-vs-raytrace-bug26691 causes
the lockup. Running "./piglit/bin/glsl-vs-raytrace-bug26691 -auto -fbo" as a
single test sometimes works the first time, but re-running it a second or
third time always causes the lockup:

[...]
[  518.794824] radeon: wait for empty RBBM fifo failed! Bad things might happen.
[  519.110152] Failed to wait GUI idle while programming pipes. Bad things might happen.
[  519.111220] radeon 0000:07:00.0: Saved 59 dwords of commands on ring 0.
[  519.111247] radeon 0000:07:00.0: (r300_asic_reset:426) RBBM_STATUS=0x8411C100
[  519.616733] radeon 0000:07:00.0: (r300_asic_reset:445) RBBM_STATUS=0x8401C100
[  520.118160] radeon 0000:07:00.0: (r300_asic_reset:457) RBBM_STATUS=0x8400C100
[  520.118231] radeon 0000:07:00.0: failed to reset GPU
[  520.319694] pcieport 0000:00:03.1: AER: Corrected error received: 0000:00:03.1
[  520.319723] pcieport 0000:00:03.1: PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
[  520.319729] pcieport 0000:00:03.1:   device [1022:1483] error status/mask=00002000/00004000
[  520.319735] pcieport 0000:00:03.1:    [13] NonFatalErr
[  520.722345] pcieport 0000:00:03.1: AER: Corrected error received: 0000:00:03.1
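
The three RBBM_STATUS values logged by r300_asic_reset differ only slightly, so
it may help to see which bits clear at each reset step and which stay set when
the reset is reported as failed. The sketch below only does the bit arithmetic
on the values from the log above; the helper names are made up for this
illustration, and it does not assign meanings to the bits (that would need the
register definitions in the radeon driver headers).

```python
# RBBM_STATUS values copied from the r300_asic_reset lines in the dmesg excerpt.
readings = [
    ("r300_asic_reset:426", 0x8411C100),
    ("r300_asic_reset:445", 0x8401C100),
    ("r300_asic_reset:457", 0x8400C100),
]


def set_bits(value):
    """Return the positions of all bits set in a 32-bit value, lowest first."""
    return [bit for bit in range(32) if value & (1 << bit)]


def cleared_between(before, after):
    """Return the bit positions that were set in 'before' but not in 'after'."""
    return set_bits(before & ~after)


# Compare each reading with the one that follows it.
for (prev_tag, prev), (tag, cur) in zip(readings, readings[1:]):
    print(f"{prev_tag} -> {tag}: cleared bits {cleared_between(prev, cur)}")

# Bits still set in the last reading, just before "failed to reset GPU".
print(f"still set after last step: {set_bits(readings[-1][1])}")
```

With the values above this reports bit 20 clearing at the first step and bit 16
at the second, while bits 8, 14, 15, 26 and 31 remain set in the final reading.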


For regular desktop usage the X600 seems ok so far. Some data about the system:

 $ inxi -b
System:
  Host: prototype Kernel: 5.16.0-Zen3 x86_64 bits: 64 Desktop: Openbox 3.6.1 
  Distro: Gentoo Base System release 2.7 
Machine:
  Type: Desktop Mobo: ASRock model: B450M Steel Legend 
  serial: <superuser/root required> UEFI: American Megatrends v: P4.20 
  date: 08/03/2021 
CPU:
  Info: 16-Core AMD Ryzen 9 5950X [MT MCP] speed: 3685 MHz 
  min/max: 2200/3400 MHz 
Graphics:
  Device-1: AMD RV370 [Radeon X600/X600 SE] driver: radeon v: kernel 
  Display: x11 server: X.Org 1.20.14 driver: ati,radeon 
  unloaded: fbdev,modesetting resolution: 1920x1080~60Hz 
  OpenGL: renderer: ATI RV370 v: 2.1 Mesa 22.0.0-devel (git-8b3d947267) 
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet 
  driver: r8169 

 # lspci 
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root
Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Starship/Matisse IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe
Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP
Bridge
00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP
Bridge
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe
Dummy Host Bridge
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe
Dummy Host Bridge
00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP
Bridge
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe
Dummy Host Bridge
00:05.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe
Dummy Host Bridge
00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe
Dummy Host Bridge
00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse
Internal PCIe GPP Bridge 0 to bus[E:B]
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe
Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse
Internal PCIe GPP Bridge 0 to bus[E:B]
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data
Fabric: Device 18h; Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data
Fabric: Device 18h; Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data
Fabric: Device 18h; Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data
Fabric: Device 18h; Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data
Fabric: Device 18h; Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data
Fabric: Device 18h; Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data
Fabric: Device 18h; Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data
Fabric: Device 18h; Function 7
01:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe SSD
(rev 01)
02:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset
USB 3.1 XHCI Controller (rev 01)
02:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset
SATA Controller (rev 01)
02:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe
Bridge (rev 01)
03:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe
Port (rev 01)
03:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe
Port (rev 01)
03:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe
Port (rev 01)
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411
PCI Express Gigabit Ethernet Controller (rev 15)
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV370
[Radeon X600/X600 SE]
07:00.1 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] RV380
[Radeon X300/X550/X1050 Series] (Secondary)
08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc.
[AMD] Starship/Matisse PCIe Dummy Function
09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc.
[AMD] Starship/Matisse Reserved SPP
09:00.1 Encryption controller: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse Cryptographic Coprocessor PSPCPP
09:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host
Controller

 # lspci -s 07:00.0 -vv
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV370
[Radeon X600/X600 SE] (prog-if 00 [VGA controller])
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] RV370 [Radeon
X600/X600 SE]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 59
        IOMMU group: 2
        Region 0: Memory at e8000000 (64-bit, prefetchable) [size=128M]
        Region 2: Memory at fce30000 (64-bit, non-prefetchable) [size=64K]
        Region 4: I/O ports at e000 [size=256]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <256ns,
L1 <4us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset-
SlotPowerLimit 75.000W
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr-
TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit
Latency L0s <256ns, L1 <2us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s (ok), Width x16 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee01000  Data: 0022
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr-
                AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn-
ECRCChkCap- ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 04000001 0000200f 07070000 b8cdf5fd
        Kernel driver in use: radeon
        Kernel modules: radeon

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
