From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2AA7D1865A; Sat, 3 Feb 2024 04:13:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706933637; cv=none; b=Aj1L6PNv/gN06GSUrHZPOrCNIKEkyDZ/Bpj4tJxSjeKr1na83JPU3CrkyWOhjAAYd3WaFb7aXDocb+wQaCA3xSpS9Ydweki5vtw/lSSfoxFz//GWyhV4+uM6hEzrwiphPZZE+eLEmU0eH2rtBX3s3Y31fRtImKi5HHp7sAXoSEI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706933637; c=relaxed/simple; bh=bNP7F3Ak2h50Yi576JSTZBreNvWIKy1Ohkj8desAb2Y=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=urDs2NKF2O9hqsgmLPXpKY9fvViAN+R8JRtOecBcqNeVH0JLWm964cKGnwnixbhv7rQvSxrz8zBBNqY75BaacvLY8+FOyF6MVZFXgx1XQzZ5wniDEWR7F+aE+YbuwjUieYVmbBxdx9IbK9sscal8GPtm98r2IBIFpV3fOMe1wYA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=DgQzgL8n; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="DgQzgL8n" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E516DC43390; Sat, 3 Feb 2024 04:13:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1706933637; bh=bNP7F3Ak2h50Yi576JSTZBreNvWIKy1Ohkj8desAb2Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=DgQzgL8n/iFKCmgr8RXfbtaeIEj+lshqPcckAaZ/ZAcES0uDoiKWqz9osa5reUQ8O TuBU0N3NZprMmVWJhVWig9vynAcE9oDbHzcDxz2LsrWwVZUxAiZ5Ba1mscsVYOchNi ZYXf+shBW6B9QTdP4S0zCmVK6DONiZNV+PgdY6VI= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, "Stanley.Yang" , Tao Zhou , Hawking Zhang , Alex Deucher , Sasha Levin Subject: [PATCH 6.6 209/322] drm/amdgpu: Fix ecc irq enable/disable unpaired Date: Fri, 2 Feb 2024 20:05:06 -0800 Message-ID: <20240203035406.001414300@linuxfoundation.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240203035359.041730947@linuxfoundation.org> References: <20240203035359.041730947@linuxfoundation.org> User-Agent: quilt/0.67 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 6.6-stable review patch. If anyone has any objections, please let me know. ------------------ From: Stanley.Yang [ Upstream commit a32c6f7f5737cc7e31cd7ad5133f0d96fca12ea6 ] The ecc_irq is disabled while GPU mode2 reset suspending process, but not be enabled during GPU mode2 reset resume process. Changed from V1: only do sdma/gfx ras_late_init in aldebaran_mode2_restore_ip delete amdgpu_ras_late_resume function Changed from V2: check umc ras supported before put ecc_irq Signed-off-by: Stanley.Yang Reviewed-by: Tao Zhou Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/aldebaran.c | 26 +++++++++++++++++++++++++- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 4 ++++ drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 5 +++++ drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 4 ++++ 4 files changed, 38 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/aldebaran.c b/drivers/gpu/drm/amd/amdgpu/aldebaran.c index 2b97b8a96fb4..fa6193535d48 100644 --- a/drivers/gpu/drm/amd/amdgpu/aldebaran.c +++ b/drivers/gpu/drm/amd/amdgpu/aldebaran.c @@ -333,6 +333,7 @@ aldebaran_mode2_restore_hwcontext(struct amdgpu_reset_control *reset_ctl, { struct list_head *reset_device_list = reset_context->reset_device_list; struct amdgpu_device *tmp_adev = NULL; + struct amdgpu_ras *con; int r; if (reset_device_list == NULL) @@ -358,7 +359,30 @@ aldebaran_mode2_restore_hwcontext(struct amdgpu_reset_control *reset_ctl, */ amdgpu_register_gpu_instance(tmp_adev); - /* Resume RAS */ + /* Resume RAS, ecc_irq */ + con = amdgpu_ras_get_context(tmp_adev); + if (!amdgpu_sriov_vf(tmp_adev) && con) { + if (tmp_adev->sdma.ras && + tmp_adev->sdma.ras->ras_block.ras_late_init) { + r = tmp_adev->sdma.ras->ras_block.ras_late_init(tmp_adev, + &tmp_adev->sdma.ras->ras_block.ras_comm); + if (r) { + dev_err(tmp_adev->dev, "SDMA failed to execute ras_late_init! ret:%d\n", r); + goto end; + } + } + + if (tmp_adev->gfx.ras && + tmp_adev->gfx.ras->ras_block.ras_late_init) { + r = tmp_adev->gfx.ras->ras_block.ras_late_init(tmp_adev, + &tmp_adev->gfx.ras->ras_block.ras_comm); + if (r) { + dev_err(tmp_adev->dev, "GFX failed to execute ras_late_init! ret:%d\n", r); + goto end; + } + } + } + amdgpu_ras_resume(tmp_adev); /* Update PSP FW topology after reset */ diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c index fa87a85e1017..62ecf4d89cb9 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -1141,6 +1141,10 @@ static int gmc_v10_0_hw_fini(void *handle) amdgpu_irq_put(adev, &adev->gmc.vm_fault, 0); + if (adev->gmc.ecc_irq.funcs && + amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__UMC)) + amdgpu_irq_put(adev, &adev->gmc.ecc_irq, 0); + return 0; } diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c index e3b76fd28d15..3d797a1adef3 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c @@ -974,6 +974,11 @@ static int gmc_v11_0_hw_fini(void *handle) } amdgpu_irq_put(adev, &adev->gmc.vm_fault, 0); + + if (adev->gmc.ecc_irq.funcs && + amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__UMC)) + amdgpu_irq_put(adev, &adev->gmc.ecc_irq, 0); + gmc_v11_0_gart_disable(adev); return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 89550d3df68d..f9f43742e9ce 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -2413,6 +2413,10 @@ static int gmc_v9_0_hw_fini(void *handle) amdgpu_irq_put(adev, &adev->gmc.vm_fault, 0); + if (adev->gmc.ecc_irq.funcs && + amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__UMC)) + amdgpu_irq_put(adev, &adev->gmc.ecc_irq, 0); + return 0; } -- 2.43.0