From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sasha Levin
To: stable@vger.kernel.org, mario.limonciello@amd.com
Cc: Cal Peake, Alex Deucher, amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: FAILED: Patch "drm/amd: Fix hang on amdgpu unload by using pci_dev_is_disconnected()" failed to apply to 6.6-stable tree
Date: Sat, 28 Feb 2026 20:36:37 -0500
Message-ID: <20260301013638.1696641-1-sashal@kernel.org>
X-Mailer: git-send-email 2.51.0
Precedence: bulk
X-Mailing-List: stable@vger.kernel.org
MIME-Version: 1.0
X-Patchwork-Hint: ignore
X-stable: review
Content-Transfer-Encoding: 8bit

The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to .

Thanks,
Sasha

------------------ original commit in Linus's tree ------------------

>From f7afda7fcd169a9168695247d07ad94cf7b9798f Mon Sep 17 00:00:00 2001
From: Mario Limonciello
Date: Thu, 5 Feb 2026 10:42:54 -0600
Subject: [PATCH] drm/amd: Fix hang on amdgpu unload by using pci_dev_is_disconnected()

The commit 6a23e7b4332c ("drm/amd: Clean up kfd node on surprise
disconnect") introduced early KFD cleanup when drm_dev_is_unplugged()
returns true. However, this causes hangs during normal module unload
(rmmod amdgpu).

The issue occurs because drm_dev_unplug() is called in amdgpu_pci_remove()
for all removal scenarios, not just surprise disconnects. This was done
intentionally in commit 39934d3ed572 ("Revert "drm/amdgpu: TA unload
messages are not actually sent to psp when amdgpu is uninstalled"") to
fix IGT PCI software unplug test failures. As a result,
drm_dev_is_unplugged() returns true even during normal module unload,
triggering the early KFD cleanup inappropriately.

The correct check should distinguish between:
- Actual surprise disconnect (eGPU unplugged): pci_dev_is_disconnected()
  returns true
- Normal module unload (rmmod): pci_dev_is_disconnected() returns false

Replace drm_dev_is_unplugged() with pci_dev_is_disconnected() to ensure
the early cleanup only happens during true hardware disconnect events.

Cc: stable@vger.kernel.org
Reported-by: Cal Peake
Closes: https://lore.kernel.org/all/b0c22deb-c0fa-3343-33cf-fd9a77d7db99@absolutedigital.net/
Fixes: 6a23e7b4332c ("drm/amd: Clean up kfd node on surprise disconnect")
Acked-by: Alex Deucher
Signed-off-by: Mario Limonciello
Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 528990a595ec9..9758221413814 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4924,7 +4924,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 	 * before ip_fini_early to prevent kfd locking refcount issues by calling
 	 * amdgpu_amdkfd_suspend()
 	 */
-	if (drm_dev_is_unplugged(adev_to_drm(adev)))
+	if (pci_dev_is_disconnected(adev->pdev))
 		amdgpu_amdkfd_device_fini_sw(adev);
 
 	amdgpu_device_ip_fini_early(adev);
@@ -4936,7 +4936,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 
 	amdgpu_gart_dummy_page_fini(adev);
 
-	if (drm_dev_is_unplugged(adev_to_drm(adev)))
+	if (pci_dev_is_disconnected(adev->pdev))
 		amdgpu_device_unmap_mmio(adev);
 
 }
-- 
2.51.0