From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 94CEDC3B1A2 for ; Fri, 14 Feb 2020 15:54:03 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6CC0A2465D for ; Fri, 14 Feb 2020 15:54:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="vjIFgXDZ" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6CC0A2465D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=amd-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C4B416F9CF; Fri, 14 Feb 2020 15:54:02 +0000 (UTC) Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1DC156F9CC; Fri, 14 Feb 2020 15:54:00 +0000 (UTC) Received: from sasha-vm.mshome.net (c-73-47-72-35.hsd1.nh.comcast.net [73.47.72.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 1B20C24681; Fri, 14 Feb 2020 15:53:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1581695640; bh=YPGPEDMa2swDKgTCQRDKlLnyvqjdbgcYK9BNY5SuAws=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=vjIFgXDZuwtn8ZsTT1jqbfOMfdSZ6+WjkW3AZogYFACLBaIW6n2V7YXcOMNX4VNmH yoiTh9mtyrHfm64zLBMWq4FyddILky7VcoJvnAqPbU2YJSWYgrb0fQ1z2TWX+FWmjb dHfpJ796eQUU92+D3/IYp6ChZ/CkUiufoYD8LmDw= From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Subject: [PATCH AUTOSEL 5.5 235/542] drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV Date: Fri, 14 Feb 2020 10:43:47 -0500 Message-Id: <20200214154854.6746-235-sashal@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200214154854.6746-1-sashal@kernel.org> References: <20200214154854.6746-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-BeenThere: amd-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Discussion list for AMD gfx List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Sasha Levin , dri-devel@lists.freedesktop.org, Emily Deng , amd-gfx@lists.freedesktop.org, Alex Deucher , Monk Liu Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: amd-gfx-bounces@lists.freedesktop.org Sender: "amd-gfx" From: Monk Liu [ Upstream commit 5a7489a7e189ee2be889485f90c8cf24ea4b9a40 ] issues: MEC is ruined by the amdkfd_pre_reset after VF FLR done fix: amdkfd_pre_reset() would ruin MEC after hypervisor finished the VF FLR, the correct sequence is do amdkfd_pre_reset before VF FLR but there is a limitation to block this sequence: if we do pre_reset() before VF FLR, it would go KIQ way to do register access and stuck there, because KIQ probably won't work by that time (e.g. you already made GFX hang) so the best way right now is to simply remove it. Signed-off-by: Monk Liu Reviewed-by: Emily Deng Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index c17505fba9884..332b9c24a2cd0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3639,8 +3639,6 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev, if (r) return r; - amdgpu_amdkfd_pre_reset(adev); - /* Resume IP prior to SMC */ r = amdgpu_device_ip_reinit_early_sriov(adev); if (r) -- 2.20.1 _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DB402C3B1A1 for ; Fri, 14 Feb 2020 15:54:02 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B31C92465D for ; Fri, 14 Feb 2020 15:54:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="vjIFgXDZ" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B31C92465D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 437286F9CC; Fri, 14 Feb 2020 15:54:01 +0000 (UTC) Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1DC156F9CC; Fri, 14 Feb 2020 15:54:00 +0000 (UTC) Received: from sasha-vm.mshome.net (c-73-47-72-35.hsd1.nh.comcast.net [73.47.72.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 1B20C24681; Fri, 14 Feb 2020 15:53:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1581695640; bh=YPGPEDMa2swDKgTCQRDKlLnyvqjdbgcYK9BNY5SuAws=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=vjIFgXDZuwtn8ZsTT1jqbfOMfdSZ6+WjkW3AZogYFACLBaIW6n2V7YXcOMNX4VNmH yoiTh9mtyrHfm64zLBMWq4FyddILky7VcoJvnAqPbU2YJSWYgrb0fQ1z2TWX+FWmjb dHfpJ796eQUU92+D3/IYp6ChZ/CkUiufoYD8LmDw= From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Subject: [PATCH AUTOSEL 5.5 235/542] drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV Date: Fri, 14 Feb 2020 10:43:47 -0500 Message-Id: <20200214154854.6746-235-sashal@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200214154854.6746-1-sashal@kernel.org> References: <20200214154854.6746-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Sasha Levin , dri-devel@lists.freedesktop.org, Emily Deng , amd-gfx@lists.freedesktop.org, Alex Deucher , Monk Liu Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Monk Liu [ Upstream commit 5a7489a7e189ee2be889485f90c8cf24ea4b9a40 ] issues: MEC is ruined by the amdkfd_pre_reset after VF FLR done fix: amdkfd_pre_reset() would ruin MEC after hypervisor finished the VF FLR, the correct sequence is do amdkfd_pre_reset before VF FLR but there is a limitation to block this sequence: if we do pre_reset() before VF FLR, it would go KIQ way to do register access and stuck there, because KIQ probably won't work by that time (e.g. you already made GFX hang) so the best way right now is to simply remove it. Signed-off-by: Monk Liu Reviewed-by: Emily Deng Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index c17505fba9884..332b9c24a2cd0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3639,8 +3639,6 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev, if (r) return r; - amdgpu_amdkfd_pre_reset(adev); - /* Resume IP prior to SMC */ r = amdgpu_device_ip_reinit_early_sriov(adev); if (r) -- 2.20.1 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.1 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 59600C3B19F for ; Fri, 14 Feb 2020 15:54:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2746E24673 for ; Fri, 14 Feb 2020 15:54:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1581695646; bh=YPGPEDMa2swDKgTCQRDKlLnyvqjdbgcYK9BNY5SuAws=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=F4jsmeNXFfI0T1Qr7Kxp4G/1z3YT+7450Vxe3yZaT1j6kbi1q5AWZA8IUURnPe7pZ vtekSk5I7nbYsL2gQrl043O8ybmJh7rShZ+d70z1Nom7gWvqkLxxwzNClqFk0uh5m1 s6tcBeZt5MjJ5bmLB9tnhlcyEMdtYjAlcOLdS7uQ= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731411AbgBNPyF (ORCPT ); Fri, 14 Feb 2020 10:54:05 -0500 Received: from mail.kernel.org ([198.145.29.99]:33578 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730982AbgBNPyA (ORCPT ); Fri, 14 Feb 2020 10:54:00 -0500 Received: from sasha-vm.mshome.net (c-73-47-72-35.hsd1.nh.comcast.net [73.47.72.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 1B20C24681; Fri, 14 Feb 2020 15:53:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1581695640; bh=YPGPEDMa2swDKgTCQRDKlLnyvqjdbgcYK9BNY5SuAws=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=vjIFgXDZuwtn8ZsTT1jqbfOMfdSZ6+WjkW3AZogYFACLBaIW6n2V7YXcOMNX4VNmH yoiTh9mtyrHfm64zLBMWq4FyddILky7VcoJvnAqPbU2YJSWYgrb0fQ1z2TWX+FWmjb dHfpJ796eQUU92+D3/IYp6ChZ/CkUiufoYD8LmDw= From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Monk Liu , Emily Deng , Alex Deucher , Sasha Levin , amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Subject: [PATCH AUTOSEL 5.5 235/542] drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV Date: Fri, 14 Feb 2020 10:43:47 -0500 Message-Id: <20200214154854.6746-235-sashal@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200214154854.6746-1-sashal@kernel.org> References: <20200214154854.6746-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Monk Liu [ Upstream commit 5a7489a7e189ee2be889485f90c8cf24ea4b9a40 ] issues: MEC is ruined by the amdkfd_pre_reset after VF FLR done fix: amdkfd_pre_reset() would ruin MEC after hypervisor finished the VF FLR, the correct sequence is do amdkfd_pre_reset before VF FLR but there is a limitation to block this sequence: if we do pre_reset() before VF FLR, it would go KIQ way to do register access and stuck there, because KIQ probably won't work by that time (e.g. you already made GFX hang) so the best way right now is to simply remove it. Signed-off-by: Monk Liu Reviewed-by: Emily Deng Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index c17505fba9884..332b9c24a2cd0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3639,8 +3639,6 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev, if (r) return r; - amdgpu_amdkfd_pre_reset(adev); - /* Resume IP prior to SMC */ r = amdgpu_device_ip_reinit_early_sriov(adev); if (r) -- 2.20.1