public inbox for linux-kernel@vger.kernel.org
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Dennis Li <Dennis.Li@amd.com>,
	Hawking Zhang <Hawking.Zhang@amd.com>,
	Alex Deucher <alexander.deucher@amd.com>,
	Sasha Levin <sashal@kernel.org>,
	amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: [PATCH AUTOSEL 5.10 42/51] drm/amdgpu: fix a GPU hang issue when remove device
Date: Tue, 12 Jan 2021 07:55:24 -0500
Message-ID: <20210112125534.70280-42-sashal@kernel.org>
In-Reply-To: <20210112125534.70280-1-sashal@kernel.org>

From: Dennis Li <Dennis.Li@amd.com>

[ Upstream commit 88e21af1b3f887d217f2fb14fc7e7d3cd87ebf57 ]

When GFXOFF is enabled and the GPU is idle, the driver fails to access
some registers. Therefore, disable power gating before any register
access over MMIO.

The dmesg log is as follows:
amdgpu 0000:03:00.0: amdgpu: amdgpu: finishing device.
amdgpu: cp queue pipe 4 queue 0 preemption failed
amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2
amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2
amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
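
The ordering problem described in the commit message can be illustrated
with a small user-space mock. This is only a sketch under stated
assumptions: every name below (mock_gpu, mock_mmio_write, mock_kfd_fini,
and so on) is hypothetical and is not part of the amdgpu driver, and the
register offset is borrowed from the log purely for illustration. It
assumes a register write fails while the block is power gated, which is
the behaviour the commit message reports.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical mock of a GPU; NOT the real struct amdgpu_device. */
struct mock_gpu {
	bool power_gated;   /* true while GFXOFF-style gating is active */
	int failed_writes;  /* number of MMIO writes that did not land */
};

/* Mock MMIO write: assumed to fail while the block is power gated. */
static int mock_mmio_write(struct mock_gpu *gpu, unsigned int reg)
{
	if (gpu->power_gated) {
		gpu->failed_writes++;
		fprintf(stderr, "mock: failed to write reg %x\n", reg);
		return -1;
	}
	return 0;
}

/* Stand-in for ungating power before teardown. */
static void mock_ungate(struct mock_gpu *gpu)
{
	gpu->power_gated = false;
}

/* Stand-in for compute-queue teardown, which touches a register. */
static int mock_kfd_fini(struct mock_gpu *gpu)
{
	return mock_mmio_write(gpu, 0x2890);
}

/* Old order: tear down first, ungate afterwards. */
static int mock_fini_old_order(struct mock_gpu *gpu)
{
	int ret = mock_kfd_fini(gpu);

	mock_ungate(gpu);
	return ret;
}

/* New order: ungate first, then tear down. */
static int mock_fini_new_order(struct mock_gpu *gpu)
{
	mock_ungate(gpu);
	return mock_kfd_fini(gpu);
}
```

Starting from a gated mock device, mock_fini_old_order() returns an
error and records one failed write, while mock_fini_new_order() returns
0. The diff makes the same kind of reordering in
amdgpu_device_ip_fini().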
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 026789b466db9..30c9d60c9b515 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2524,11 +2524,11 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
 	if (adev->gmc.xgmi.num_physical_nodes > 1)
 		amdgpu_xgmi_remove_device(adev);
 
-	amdgpu_amdkfd_device_fini(adev);
-
 	amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
 	amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
 
+	amdgpu_amdkfd_device_fini(adev);
+
 	/* need to disable SMC first */
 	for (i = 0; i < adev->num_ip_blocks; i++) {
 		if (!adev->ip_blocks[i].status.hw)
-- 
2.27.0

