From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 93015] Tonga Elemental segfault + VM faults since radeon: implement r600_query_hw_get_result via function pointers
Date: Thu, 19 Nov 2015 14:45:28 +0000

https://bugs.freedesktop.org/show_bug.cgi?id=93015

Bug ID: 93015
Summary: Tonga Elemental segfault + VM faults since radeon: implement r600_query_hw_get_result via function pointers
Product: DRI
Version: DRI git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: adf.lists@gmail.com

Running the Unreal 4.5 Elemental demo on an R9 285 (Tonga) with a powerplay kernel.

Since this mesa commit:

commit 50f0f938e3a577647fdfb6bdbb4ad3da252aa791
Author: Nicolai Hähnle <nhaehnle@gmail.com>
Date:   Fri Nov 13 00:27:34 2015 +0100

    radeon: implement r600_query_hw_get_result via function pointers

    We will need the clear_result override for the batch query implementation.

About a minute into the demo (always at the same place) the demo catches a
segfault and quits.

In dmesg I see a few VM faults.

While confirming the bisect I found that the commit before the one above
does not crash:

commit c207c55fc08a1bf3dd40e79b3aaec34afbee2e55
Author: Nicolai Hähnle <nhaehnle@gmail.com>
Date:   Wed Nov 18 12:05:11 2015 +0100

    radeon: split hw query buffer handling from cs emit

    The idea here is that driver queries implemented outside of common code
    will use the same query buffer handling with different logic for starting
    and stopping the corresponding counters.

At the point where it would have crashed, I instead start getting flooded
with VM faults:

[17771.298259] VM fault (0x14, vmid 5) at page 1204016, write from 'TC0' (0x54433000) (8)
[17771.330661] amdgpu 0000:01:00.0: GPU fault detected: 146 0x04c20814
[17771.330665] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00125E98
[17771.330666] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0B008014
[17771.330668] VM fault (0x14, vmid 5) at page 1203864, write from 'TC0' (0x54433000) (8)
[17771.363320] amdgpu 0000:01:00.0: GPU fault detected: 146 0x05e20814
[17771.363323] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x001264BC
[17771.363325] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0B008014
[17771.363326] VM fault (0x14, vmid 5) at page 1205436, write from 'TC0' (0x54433000) (8)
[17771.395828] amdgpu 0000:01:00.0: GPU fault detected: 146 0x06620814
[17771.395832] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x001260CC
[17771.395833] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0B008014
[17771.395834] VM fault (0x14, vmid 5) at page 1204428, write from 'TC0' (0x54433000) (8)
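
For anyone reading the dmesg excerpt, here is a minimal Python sketch of how the raw register values relate to the decoded "VM fault" lines. Two parts are plain arithmetic on the log: the FAULT_ADDR value is the page number in hex, and the client ID 0x54433000 is the ASCII tag 'TC0'. The STATUS bit positions in decode_status() are an assumption; they are chosen because they reproduce the values printed in this log (0x14, vmid 5, write, 8) and should be checked against the kernel source before being relied on. All function names are hypothetical helpers, not part of any driver or tool.

```python
def fault_page(fault_addr_hex: str) -> int:
    """Convert the VM_CONTEXT1_PROTECTION_FAULT_ADDR hex string to the
    decimal page number shown in the 'VM fault ... at page N' line."""
    return int(fault_addr_hex, 16)


def client_tag(client_id: int) -> str:
    """Interpret the 32-bit client ID as big-endian ASCII, dropping
    trailing NUL padding (0x54433000 -> 'TC0')."""
    return client_id.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")


def decode_status(status: int) -> dict:
    """Split VM_CONTEXT1_PROTECTION_FAULT_STATUS into fields.

    ASSUMPTION: the bit positions below are inferred from the values in
    this log, not taken from register documentation.
    """
    return {
        "protections": status & 0xFF,      # printed as 0x14 in the log
        "mc_id": (status >> 12) & 0xFF,    # printed as (8) in the log
        "write": bool((status >> 24) & 1),  # 'write from' in the log
        "vmid": (status >> 25) & 0xF,      # printed as vmid 5 in the log
    }


if __name__ == "__main__":
    # Values copied from the dmesg excerpt in this report.
    for addr in ("0x00125E98", "0x001264BC", "0x001260CC"):
        print(addr, "-> page", fault_page(addr))
    print(client_tag(0x54433000))
    print(decode_status(0x0B008014))
```

Read this way, each FAULT_ADDR/FAULT_STATUS pair in the excerpt corresponds to the "VM fault" line that follows it, and every fault shown is a write from the same client at a different page.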

