From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 716C5C433E0 for ; Tue, 12 Jan 2021 12:57:20 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 235A12312D for ; Tue, 12 Jan 2021 12:57:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 235A12312D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=amd-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AD5DD6E20C; Tue, 12 Jan 2021 12:57:19 +0000 (UTC) Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by gabe.freedesktop.org (Postfix) with ESMTPS id A1EBB6E20C; Tue, 12 Jan 2021 12:57:18 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6311A23136; Tue, 12 Jan 2021 12:57:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1610456238; bh=KVEfEUWOLEQmT7+CQxXQJ3XRAZoF8/ksCh1myHnPp1c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MNq/x0gx3q5FhtdvXASqolw4zJEHmNR+uWzL1cH+3BUjubDh8ZXr+JZRKWFOta1SN S8nN5SaMjJpy82EI1xPQ1gSu4hE2JJIkZ9oMJOqo9pS4aTUUFGi7dY+0X3YrMgSdIH /+OVjJzWSDKcUsSAZvt2WZLsVmlj3pjXv+y8bD4HlN68gxEYjWkzpdI69v034C0Kr0 FLKti5sJx2UnRVjp5ux+9GmejxvbFHGhtPglncrf5wc/d5QY4o4nlBGHmQx3VtfbNy lPnDEaVHLOp+Rt0Q+XzLS4E/gpKcG7aSCm9IrAL5OFkD4sPaSZWpTZjjHGXaJIY5hA uuWlYF93VTHuA== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Subject: [PATCH AUTOSEL 5.4 24/28] drm/amdgpu: fix a GPU hang issue when remove device Date: Tue, 12 Jan 2021 07:56:40 -0500 Message-Id: <20210112125645.70739-24-sashal@kernel.org> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20210112125645.70739-1-sashal@kernel.org> References: <20210112125645.70739-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-BeenThere: amd-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Discussion list for AMD gfx List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Sasha Levin , dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, Alex Deucher , Dennis Li , Hawking Zhang Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: amd-gfx-bounces@lists.freedesktop.org Sender: "amd-gfx" From: Dennis Li [ Upstream commit 88e21af1b3f887d217f2fb14fc7e7d3cd87ebf57 ] When GFXOFF is enabled and GPU is idle, driver will fail to access some registers. Therefore change to disable power gating before all access registers with MMIO. Dmesg log is as following: amdgpu 0000:03:00.0: amdgpu: amdgpu: finishing device. amdgpu: cp queue pipe 4 queue 0 preemption failed amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2 amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706 amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2 amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706 Signed-off-by: Dennis Li Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 29141bff4b572..3b3fc9a426e91 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2057,11 +2057,11 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev) if (adev->gmc.xgmi.num_physical_nodes > 1) amdgpu_xgmi_remove_device(adev); - amdgpu_amdkfd_device_fini(adev); - amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE); amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE); + amdgpu_amdkfd_device_fini(adev); + /* need to disable SMC first */ for (i = 0; i < adev->num_ip_blocks; i++) { if (!adev->ip_blocks[i].status.hw) -- 2.27.0 _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3866EC433DB for ; Tue, 12 Jan 2021 12:57:21 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DF97423129 for ; Tue, 12 Jan 2021 12:57:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DF97423129 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 227CC6E210; Tue, 12 Jan 2021 12:57:20 +0000 (UTC) Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by gabe.freedesktop.org (Postfix) with ESMTPS id A1EBB6E20C; Tue, 12 Jan 2021 12:57:18 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6311A23136; Tue, 12 Jan 2021 12:57:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1610456238; bh=KVEfEUWOLEQmT7+CQxXQJ3XRAZoF8/ksCh1myHnPp1c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MNq/x0gx3q5FhtdvXASqolw4zJEHmNR+uWzL1cH+3BUjubDh8ZXr+JZRKWFOta1SN S8nN5SaMjJpy82EI1xPQ1gSu4hE2JJIkZ9oMJOqo9pS4aTUUFGi7dY+0X3YrMgSdIH /+OVjJzWSDKcUsSAZvt2WZLsVmlj3pjXv+y8bD4HlN68gxEYjWkzpdI69v034C0Kr0 FLKti5sJx2UnRVjp5ux+9GmejxvbFHGhtPglncrf5wc/d5QY4o4nlBGHmQx3VtfbNy lPnDEaVHLOp+Rt0Q+XzLS4E/gpKcG7aSCm9IrAL5OFkD4sPaSZWpTZjjHGXaJIY5hA uuWlYF93VTHuA== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Subject: [PATCH AUTOSEL 5.4 24/28] drm/amdgpu: fix a GPU hang issue when remove device Date: Tue, 12 Jan 2021 07:56:40 -0500 Message-Id: <20210112125645.70739-24-sashal@kernel.org> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20210112125645.70739-1-sashal@kernel.org> References: <20210112125645.70739-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Sasha Levin , dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, Alex Deucher , Dennis Li , Hawking Zhang Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Dennis Li [ Upstream commit 88e21af1b3f887d217f2fb14fc7e7d3cd87ebf57 ] When GFXOFF is enabled and GPU is idle, driver will fail to access some registers. Therefore change to disable power gating before all access registers with MMIO. Dmesg log is as following: amdgpu 0000:03:00.0: amdgpu: amdgpu: finishing device. amdgpu: cp queue pipe 4 queue 0 preemption failed amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2 amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706 amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2 amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706 Signed-off-by: Dennis Li Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 29141bff4b572..3b3fc9a426e91 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2057,11 +2057,11 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev) if (adev->gmc.xgmi.num_physical_nodes > 1) amdgpu_xgmi_remove_device(adev); - amdgpu_amdkfd_device_fini(adev); - amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE); amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE); + amdgpu_amdkfd_device_fini(adev); + /* need to disable SMC first */ for (i = 0; i < adev->num_ip_blocks; i++) { if (!adev->ip_blocks[i].status.hw) -- 2.27.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7953FC4332B for ; Tue, 12 Jan 2021 13:10:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4B42723104 for ; Tue, 12 Jan 2021 13:10:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731873AbhALNJy (ORCPT ); Tue, 12 Jan 2021 08:09:54 -0500 Received: from mail.kernel.org ([198.145.29.99]:54586 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2404205AbhALM6M (ORCPT ); Tue, 12 Jan 2021 07:58:12 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 6311A23136; Tue, 12 Jan 2021 12:57:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1610456238; bh=KVEfEUWOLEQmT7+CQxXQJ3XRAZoF8/ksCh1myHnPp1c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MNq/x0gx3q5FhtdvXASqolw4zJEHmNR+uWzL1cH+3BUjubDh8ZXr+JZRKWFOta1SN S8nN5SaMjJpy82EI1xPQ1gSu4hE2JJIkZ9oMJOqo9pS4aTUUFGi7dY+0X3YrMgSdIH /+OVjJzWSDKcUsSAZvt2WZLsVmlj3pjXv+y8bD4HlN68gxEYjWkzpdI69v034C0Kr0 FLKti5sJx2UnRVjp5ux+9GmejxvbFHGhtPglncrf5wc/d5QY4o4nlBGHmQx3VtfbNy lPnDEaVHLOp+Rt0Q+XzLS4E/gpKcG7aSCm9IrAL5OFkD4sPaSZWpTZjjHGXaJIY5hA uuWlYF93VTHuA== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Dennis Li , Hawking Zhang , Alex Deucher , Sasha Levin , amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Subject: [PATCH AUTOSEL 5.4 24/28] drm/amdgpu: fix a GPU hang issue when remove device Date: Tue, 12 Jan 2021 07:56:40 -0500 Message-Id: <20210112125645.70739-24-sashal@kernel.org> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20210112125645.70739-1-sashal@kernel.org> References: <20210112125645.70739-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Dennis Li [ Upstream commit 88e21af1b3f887d217f2fb14fc7e7d3cd87ebf57 ] When GFXOFF is enabled and GPU is idle, driver will fail to access some registers. Therefore change to disable power gating before all access registers with MMIO. Dmesg log is as following: amdgpu 0000:03:00.0: amdgpu: amdgpu: finishing device. amdgpu: cp queue pipe 4 queue 0 preemption failed amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2 amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706 amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2 amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706 Signed-off-by: Dennis Li Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 29141bff4b572..3b3fc9a426e91 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2057,11 +2057,11 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev) if (adev->gmc.xgmi.num_physical_nodes > 1) amdgpu_xgmi_remove_device(adev); - amdgpu_amdkfd_device_fini(adev); - amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE); amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE); + amdgpu_amdkfd_device_fini(adev); + /* need to disable SMC first */ for (i = 0; i < adev->num_ip_blocks; i++) { if (!adev->ip_blocks[i].status.hw) -- 2.27.0