From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 97500] Cannot unbind GPU from AMDGPU
Date: Sun, 25 Sep 2016 21:06:31 +0000

https://bugs.freedesktop.org/show_bug.cgi?id=97500

Comment #6 on bug 97500 from Grazvydas Ignotas
Created attachment 126782
dmesg of powerplay crash

I've sent some patches with fixes, but there seem to be multiple other issues.

One of the problems is that struct amdgpu_i2c_chan contains struct drm_dp_aux,
and when amdgpu_i2c_fini() is called, which frees amdgpu_i2c_chan, drm_dp_aux is
still in use. This causes memory corruption. I don't know how to solve this;
perhaps somebody knows this code better?
A hack can be used to trade this corruption for a leak:

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index 34bab61..8beaee0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -221,6 +221,8 @@ void amdgpu_i2c_destroy(struct amdgpu_i2c_chan *i2c)
        if (!i2c)
                return;
        i2c_del_adapter(&i2c->adapter);
+       if (i2c->has_aux)
+               return;
        kfree(i2c);
 }
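
For discussion, the lifetime problem described above can be modelled in plain userspace C. This is only a sketch of one possible direction (reference counting the container), not the driver's code and not a proposed patch: struct chan_sketch, struct aux_sketch and every function below are hypothetical stand-ins, and real kernel code would more likely use struct kref than a bare int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct drm_dp_aux and struct amdgpu_i2c_chan. */
struct aux_sketch {
	int transfers;
};

struct chan_sketch {
	int refcount;           /* one for the owner, one per aux user */
	bool has_aux;
	struct aux_sketch aux;  /* embedded, so it shares the channel's lifetime */
};

/* Number of channels currently allocated; lets callers observe leaks. */
static int live_channels;

struct chan_sketch *chan_create(bool has_aux)
{
	struct chan_sketch *chan = calloc(1, sizeof(*chan));

	if (!chan)
		return NULL;
	chan->refcount = 1;
	chan->has_aux = has_aux;
	live_channels++;
	return chan;
}

/* An aux user takes a reference before keeping a pointer to chan->aux. */
struct aux_sketch *chan_get_aux(struct chan_sketch *chan)
{
	if (!chan || !chan->has_aux)
		return NULL;
	chan->refcount++;
	return &chan->aux;
}

/* Drop one reference; the memory is freed only when the last user is gone. */
void chan_put(struct chan_sketch *chan)
{
	if (!chan)
		return;
	assert(chan->refcount > 0);
	if (--chan->refcount == 0) {
		free(chan);
		live_channels--;
	}
}

/* Recover the channel from its embedded aux, as container_of() would. */
struct chan_sketch *aux_to_chan(struct aux_sketch *aux)
{
	return (struct chan_sketch *)((char *)aux -
				      offsetof(struct chan_sketch, aux));
}

int chan_live_count(void)
{
	return live_channels;
}
```

With this shape, the owner's teardown calls chan_put() instead of freeing unconditionally, and the aux user calls chan_put(aux_to_chan(aux)) once it is done, so neither the corruption nor the leak occurs as long as every chan_get_aux() is paired with a put.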

---
Another one is a TTM leak, which can also be seen in this attachment.
CONFIG_DMA_API_DEBUG reports:

WARNING: CPU: 3 PID: 1666 at lib/dma-debug.c:976 dma_debug_device_change+0x1ca/0x240
pci 0000:01:00.0: DMA-API: device driver has pending DMA allocations while released from device [count=202]
One of leaked entries details: [device address=0x00000003dcfe9000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as coherent]

Mapped at:
 [<ffffffff8163d941>] debug_dma_alloc_coherent+0x41/0x110
 [<ffffffffa0728d84>] ttm_dma_populate+0xb64/0x1150 [ttm]
 [<ffffffffa0b770ac>] amdgpu_ttm_tt_populate+0x35c/0x510 [amdgpu]
 [<ffffffffa0719141>] ttm_tt_bind+0x71/0xd0 [ttm]
 [<ffffffffa071c9d8>] ttm_bo_handle_move_mem+0xa08/0xaa0 [ttm]
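
The kind of check behind that warning can be illustrated with a small userspace sketch: count allocations per device and report whatever is still outstanding when the device is released. This is an assumption-laden model, not how lib/dma-debug.c is implemented; struct dev_sketch and the dev_* functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-device bookkeeping for outstanding allocations. */
struct dev_sketch {
	size_t pending;        /* allocations not yet freed */
	size_t pending_bytes;  /* total size of those allocations */
};

/* Allocate a buffer on behalf of the device and record it. */
void *dev_alloc(struct dev_sketch *dev, size_t size)
{
	void *buf = malloc(size);

	if (buf) {
		dev->pending++;
		dev->pending_bytes += size;
	}
	return buf;
}

/* Free a buffer and remove it from the device's bookkeeping. */
void dev_free(struct dev_sketch *dev, void *buf, size_t size)
{
	if (!buf)
		return;
	assert(dev->pending > 0 && dev->pending_bytes >= size);
	dev->pending--;
	dev->pending_bytes -= size;
	free(buf);
}

/*
 * Called when the driver is unbound: a non-zero result means the driver
 * left allocations behind, which is what count=202 reports above.
 */
size_t dev_release_check(const struct dev_sketch *dev)
{
	return dev->pending;
}
```

In this model, a teardown path that skips one dev_free() makes dev_release_check() return a non-zero count, mirroring the pending allocations reported for the populate path in the trace.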

---
The next one is a powerplay crash in
drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c:3336:
dpm_table->sclk_table.count is 0, so the array access ends up badly. It could be
related to the "DPM is already running right now, no need to enable DPM!" message;
full dmesg attached.
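
As an illustration of why a zero count is dangerous, here is a minimal sketch of a guarded lookup. It assumes the failing access has the form levels[count - 1], which this comment does not confirm; struct table_sketch, MAX_LEVELS_SKETCH and table_highest_level() are hypothetical and do not mirror the real powerplay structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LEVELS_SKETCH 8

/* Hypothetical stand-ins for a DPM level and a single DPM table. */
struct level_sketch {
	uint32_t value;
};

struct table_sketch {
	uint32_t count;
	struct level_sketch levels[MAX_LEVELS_SKETCH];
};

/*
 * Store the highest level's value in *out and return 0, or return -1 if
 * the table is empty or its count is out of range. Without the check,
 * count - 1 wraps around to UINT32_MAX when count is 0.
 */
int table_highest_level(const struct table_sketch *table, uint32_t *out)
{
	if (!table || !out)
		return -1;
	if (table->count == 0 || table->count > MAX_LEVELS_SKETCH)
		return -1;
	*out = table->levels[table->count - 1].value;
	return 0;
}
```

A guard like this would only hide the symptom; why the table is empty in the first place would still need to be found.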

I won't have time to work on this for a while, but maybe somebody else does.

