From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 101946] Rebinding AMDGPU causes initialization errors [R9 290 / 4.10 kernel]
Date: Thu, 27 Jul 2017 11:46:28 +0000

https://bugs.freedesktop.org/show_bug.cgi?id=101946

Bug ID: 101946
Summary: Rebinding AMDGPU causes initialization errors [R9 290 / 4.10 kernel]
Product: DRI
Version: XOrg git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: beanow@oscp.info

Created attachment 133068
--> https://bugs.freedesktop.org/attachment.cgi?id=133068&action=edit
The script used to reproduce the error.

As I attempted to hotplug my R9 290 for a VM gaming setup, I stumbled on this
issue.

The main kern.log error to come up is:

> [  160.013733] [drm:ci_dpm_enable [amdgpu]] *ERROR* ci_start_dpm failed
> [  160.014134] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <amdgpu_powerplay> failed -22
> [  160.014531] amdgpu 0000:01:00.0: amdgpu_init failed
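
To spot this failure quickly in a long log, a small filter along these lines may help. It is only a sketch: the function name amdgpu_init_errors is made up for illustration, and the pattern is derived solely from the three lines quoted above, so other amdgpu failures may look different.

```shell
# amdgpu_init_errors: read a kernel log on stdin and print only the lines
# that look like the amdgpu failures quoted in this report.
# The pattern is an assumption based on those three lines, nothing more.
amdgpu_init_errors() {
    grep -E 'amdgpu.*(\*ERROR\*|amdgpu_init failed)'
}
```

For example, `amdgpu_init_errors < /var/log/kern.log` or `dmesg | amdgpu_init_errors`.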


For my setup I use a Kaby Lake iGPU running i915, with the R9 290 bound to
either vfio-pci or amdgpu.
Ubuntu 17.04 (4.10.0-28-generic).
Mesa 17.1.4 from the padoka stable PPA.


I'm able to reproduce this as follows.

1. Boot with vfio-pci capturing the card and amdgpu blacklisted. Kernel flags:
> intel_iommu=on iommu=pt vfio-pci.ids=1002:67b1,1002:aac8

2. Since I run GNOME 3 on Ubuntu 17.04, this brings me to a Wayland greeter
which uses my iGPU. Drop to a free TTY, without logging in. This prevents Xorg
from responding to the AMD card becoming available.

3. Run the attached script "rebind-amd.sh" as root to bind back and forth
between vfio-pci and amdgpu in an infinite loop.

This will:

A. modprobe both drivers to be sure they're loaded.
B. Print information about the driver and card usage.
C. Use the new_id > unbind > bind > remove_id sequence to switch drivers.
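
The sequence in step C can be sketched roughly as follows. This is not the attached rebind-amd.sh, only a minimal illustration that assumes the standard sysfs PCI interface (new_id, unbind, bind and remove_id under /sys/bus/pci). The function name switch_driver and the SYSFS_ROOT override are hypothetical; the override exists so the sequence can be dry-run against a scratch directory.

```shell
#!/bin/sh
# Sketch of the new_id > unbind > bind > remove_id sequence.
# SYSFS_ROOT is a hypothetical override for dry runs; normally it is /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# switch_driver <pci-address> <vendor-id> <device-id> <target-driver>
switch_driver() {
    addr="$1"; vendor="$2"; device="$3"; target="$4"
    dev_dir="$SYSFS_ROOT/bus/pci/devices/$addr"
    drv_dir="$SYSFS_ROOT/bus/pci/drivers/$target"

    # 1. new_id: tell the target driver about this vendor/device pair.
    echo "$vendor $device" > "$drv_dir/new_id" || return 1

    # 2. unbind: release the device from its current driver, if it has one.
    if [ -e "$dev_dir/driver/unbind" ]; then
        echo "$addr" > "$dev_dir/driver/unbind" || return 1
    fi

    # 3. bind: attach the device to the target driver.
    #    Caveat: if the device was unbound in step 1, the kernel may already
    #    have bound it there, and this write can then fail on a real system.
    echo "$addr" > "$drv_dir/bind" || return 1

    # 4. remove_id: drop the dynamic ID again so the next switch starts clean.
    echo "$vendor $device" > "$drv_dir/remove_id" || return 1
}
```

With the address and IDs from this report, a call would look like `switch_driver 0000:01:00.0 1002 67b1 amdgpu`, run as root after both modules have been loaded.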

What happens is:

vfio-pci -> vfio-pci, Gives no problems, of course.
vfio-pci -> amdgpu, This works and the amdgpu driver initializes the card.
Attached monitor(s) start searching for signals.
amdgpu -> vfio-pci, Since no Xorg is using the dGPU this works without
problems.
vfio-pci -> amdgpu, Fails to initialize dGPU with the kernel error above.
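
To confirm which driver actually holds the card after each transition, a helper like the one below could be called between switches. It is a sketch under the assumption that the device's sysfs "driver" entry is a symlink to the bound driver's directory; current_driver and SYSFS_ROOT are illustrative names, not taken from the attached script.

```shell
# SYSFS_ROOT is a hypothetical override for dry runs; normally it is /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# current_driver <pci-address>: print the name of the bound driver,
# or "none" when the device has no driver attached.
current_driver() {
    link="$SYSFS_ROOT/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver's name, e.g. .../drivers/amdgpu
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

For the card in this report that would be `current_driver 0000:01:00.0`.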


I've attached the script, the output of the script and the full kern.log.
        

