From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Xiang Liu, Christian König, "Stanley . Yang", Alex Deucher,
	Sasha Levin, Xinhui.Pan@amd.com, airlied@gmail.com, simona@ffwll.ch,
	sunil.khatri@amd.com, lijo.lazar@amd.com, Jane.Jian@amd.com,
	David.Wu3@amd.com, sathishkumar.sundararaju@amd.com,
	amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.6 07/10] drm/amdgpu/vcn: reset fw_shared when VCPU buffers corrupted on vcn v4.0.3
Date: Wed, 4 Dec 2024 17:18:05 -0500
Message-ID: <20241204221820.2248367-7-sashal@kernel.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20241204221820.2248367-1-sashal@kernel.org>
References: <20241204221820.2248367-1-sashal@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
X-stable: review
X-Patchwork-Hint: Ignore
X-stable-base: Linux 6.6.63
Content-Transfer-Encoding: 8bit

From: Xiang Liu

[ Upstream commit 928cd772e18ffbd7723cb2361db4a8ccf2222235 ]

It is not necessarily corrupted. When there is RAS fatal error, device
memory access is blocked. Hence vcpu bo cannot be saved to system memory
as in a regular suspend sequence before going for reset. In other full
device reset cases, that gets saved and restored during resume.

v2: Remove redundant code like vcn_v4_0 did
v2: Refine commit message
v3: Drop the volatile
v3: Refine commit message

Signed-off-by: Xiang Liu
Acked-by: Christian König
Reviewed-by: Stanley.Yang
Signed-off-by: Alex Deucher
Signed-off-by: Sasha Levin
---
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 30 ++++++++++++++++++-------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
index f85d18cd74eca..e80c4f5b4f402 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
@@ -77,6 +77,20 @@ static int vcn_v4_0_3_early_init(void *handle)
 	return amdgpu_vcn_early_init(adev);
 }
 
+static int vcn_v4_0_3_fw_shared_init(struct amdgpu_device *adev, int inst_idx)
+{
+	struct amdgpu_vcn4_fw_shared *fw_shared;
+
+	fw_shared = adev->vcn.inst[inst_idx].fw_shared.cpu_addr;
+	fw_shared->present_flag_0 = cpu_to_le32(AMDGPU_FW_SHARED_FLAG_0_UNIFIED_QUEUE);
+	fw_shared->sq.is_enabled = 1;
+
+	if (amdgpu_vcnfw_log)
+		amdgpu_vcn_fwlog_init(&adev->vcn.inst[inst_idx]);
+
+	return 0;
+}
+
 /**
  * vcn_v4_0_3_sw_init - sw init for VCN block
  *
@@ -107,8 +121,6 @@ static int vcn_v4_0_3_sw_init(void *handle)
 		return r;
 
 	for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
-		volatile struct amdgpu_vcn4_fw_shared *fw_shared;
-
 		vcn_inst = GET_INST(VCN, i);
 
 		ring = &adev->vcn.inst[i].ring_enc[0];
@@ -131,12 +143,7 @@ static int vcn_v4_0_3_sw_init(void *handle)
 		if (r)
 			return r;
 
-		fw_shared = adev->vcn.inst[i].fw_shared.cpu_addr;
-		fw_shared->present_flag_0 = cpu_to_le32(AMDGPU_FW_SHARED_FLAG_0_UNIFIED_QUEUE);
-		fw_shared->sq.is_enabled = true;
-
-		if (amdgpu_vcnfw_log)
-			amdgpu_vcn_fwlog_init(&adev->vcn.inst[i]);
+		vcn_v4_0_3_fw_shared_init(adev, i);
 	}
 
 	if (amdgpu_sriov_vf(adev)) {
@@ -221,6 +228,8 @@ static int vcn_v4_0_3_hw_init(void *handle)
 		}
 	} else {
 		for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
+			struct amdgpu_vcn4_fw_shared *fw_shared;
+
 			vcn_inst = GET_INST(VCN, i);
 			ring = &adev->vcn.inst[i].ring_enc[0];
 
@@ -244,6 +253,11 @@ static int vcn_v4_0_3_hw_init(void *handle)
 					regVCN_RB1_DB_CTRL);
 			}
 
+			/* Re-init fw_shared when RAS fatal error occurred */
+			fw_shared = adev->vcn.inst[i].fw_shared.cpu_addr;
+			if (!fw_shared->sq.is_enabled)
+				vcn_v4_0_3_fw_shared_init(adev, i);
+
 			r = amdgpu_ring_test_helper(ring);
 			if (r)
 				goto done;
-- 
2.43.0
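
For readers without the driver tree at hand, the control flow this patch adds to hw_init can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct demo_fw_shared`, `DEMO_FLAG_UNIFIED_QUEUE`, and every `demo_*` function are hypothetical stand-ins, not amdgpu symbols, and modelling the lost buffer contents as all zeroes is an assumption made for the sketch, not something the patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the firmware-shared buffer; this is NOT the
 * layout of struct amdgpu_vcn4_fw_shared. */
struct demo_fw_shared {
	uint32_t present_flag_0;
	uint8_t sq_is_enabled;
};

#define DEMO_FLAG_UNIFIED_QUEUE 0x1u /* hypothetical value */

/* Write the fields the firmware expects; safe to call more than once. */
static void demo_fw_shared_init(struct demo_fw_shared *fw)
{
	assert(fw != NULL);
	fw->present_flag_0 = DEMO_FLAG_UNIFIED_QUEUE;
	fw->sq_is_enabled = 1;
}

/* Assumption for this sketch: a reset without save/restore leaves the
 * buffer zeroed. */
static void demo_lose_contents(struct demo_fw_shared *fw)
{
	memset(fw, 0, sizeof(*fw));
}

/* Mirrors the check added to hw_init: re-initialise only when the
 * is_enabled sentinel is clear. Returns true if a re-init happened. */
static bool demo_hw_init(struct demo_fw_shared *fw)
{
	if (!fw->sq_is_enabled) {
		demo_fw_shared_init(fw);
		return true;
	}
	return false;
}
```

Calling `demo_hw_init()` on a buffer that kept its contents leaves it untouched, while calling it after `demo_lose_contents()` restores both fields. The design point is that one shared init helper serves both the sw_init path and the recovery path, so the two cannot drift apart.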