public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 6.1.y] drm/amdgpu/vcn: Disable indirect SRAM on Vangogh broken BIOSes
@ 2023-04-18 22:15 Guilherme G. Piccoli
  2023-04-19 13:16 ` Deucher, Alexander
  2023-04-22 15:19 ` Greg KH
  0 siblings, 2 replies; 13+ messages in thread
From: Guilherme G. Piccoli @ 2023-04-18 22:15 UTC (permalink / raw)
  To: stable
  Cc: gregkh, sashal, amd-gfx, alexander.deucher, James.Zhu, leo.liu,
	kernel, kernel-dev

commit 542a56e8eb4467ae654eefab31ff194569db39cd upstream.

The VCN firmware loading path enables the indirect SRAM mode if it's
advertised as supported. We might have some cases of FW issues that
prevents this mode to working properly though, ending-up in a failed
probe. An example below, observed in the Steam Deck:

[...]
[drm] failed to load ucode VCN0_RAM(0x3A)
[drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0000)
amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec_0 test failed (-110)
[drm:amdgpu_device_init.cold [amdgpu]] *ERROR* hw_init of IP block <vcn_v3_0> failed -110
amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_init failed
amdgpu 0000:04:00.0: amdgpu: Fatal error during GPU init
[...]

Disabling the VCN block circumvents this, but it's a very invasive
workaround that turns off the entire feature. So, let's add a quirk
on VCN loading that checks for known problematic BIOSes on Vangogh,
so we can proactively disable the indirect SRAM mode and allow the
HW proper probe and VCN IP block to work fine.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2385
Fixes: 82132ecc5432 ("drm/amdgpu: enable Vangogh VCN indirect sram mode")
Fixes: 9a8cc8cabc1e ("drm/amdgpu: enable Vangogh VCN indirect sram mode")
Cc: stable@vger.kernel.org
Cc: James Zhu <James.Zhu@amd.com>
Cc: Leo Liu <leo.liu@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---

Hi folks, this was build/boot tested on Deck. I've also adjusted the
context, function was reworked on 6.2.

But what a surprise was for me not see this fix already in 6.1.y, since
I've CCed stable, and the reason for that is really peculiar:


$ git log -1 --pretty="%an <%ae>: %s" 82132ecc5432
Leo Liu <leo.liu@amd.com>: drm/amdgpu: enable Vangogh VCN indirect sram mode

$ git describe --contains 82132ecc5432
v6.2-rc1~124^2~1^2~13


$ git log -1 --pretty="%an <%ae>: %s" 9a8cc8cabc1e
Leo Liu <leo.liu@amd.com>: drm/amdgpu: enable Vangogh VCN indirect sram mode

$ git describe --contains 9a8cc8cabc1e
v6.1-rc8~16^2^2


This is quite strange for me, we have 2 commit hashes pointing to the *same*
commit, and each one is present..in a different release !!?!
Since I've marked this patch as fixing 82132ecc5432 originally, 6.1.y stable
misses it, since it only contains 9a8cc8cabc1e (which is the same patch!).

Alex, do you have an idea why sometimes commits from the AMD tree appear
duplicate in mainline? Specially in different releases, this could cause
some confusion I guess.

Thanks in advance,


Guilherme


 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index ce64ca1c6e66..5c1193dd7d88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -26,6 +26,7 @@
 
 #include <linux/firmware.h>
 #include <linux/module.h>
+#include <linux/dmi.h>
 #include <linux/pci.h>
 #include <linux/debugfs.h>
 #include <drm/drm_drv.h>
@@ -84,6 +85,7 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)
 {
 	unsigned long bo_size;
 	const char *fw_name;
+	const char *bios_ver;
 	const struct common_firmware_header *hdr;
 	unsigned char fw_check;
 	unsigned int fw_shared_size, log_offset;
@@ -159,6 +161,21 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)
 		if ((adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) &&
 		    (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG))
 			adev->vcn.indirect_sram = true;
+		/*
+		 * Some Steam Deck's BIOS versions are incompatible with the
+		 * indirect SRAM mode, leading to amdgpu being unable to get
+		 * properly probed (and even potentially crashing the kernel).
+		 * Hence, check for these versions here - notice this is
+		 * restricted to Vangogh (Deck's APU).
+		 */
+		bios_ver = dmi_get_system_info(DMI_BIOS_VERSION);
+
+		if (bios_ver && (!strncmp("F7A0113", bios_ver, 7) ||
+		     !strncmp("F7A0114", bios_ver, 7))) {
+			adev->vcn.indirect_sram = false;
+			dev_info(adev->dev,
+				"Steam Deck quirk: indirect SRAM disabled on BIOS %s\n", bios_ver);
+		}
 		break;
 	case IP_VERSION(3, 0, 16):
 		fw_name = FIRMWARE_DIMGREY_CAVEFISH;
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-04-22 15:19 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-18 22:15 [PATCH 6.1.y] drm/amdgpu/vcn: Disable indirect SRAM on Vangogh broken BIOSes Guilherme G. Piccoli
2023-04-19 13:16 ` Deucher, Alexander
2023-04-19 14:14   ` Guilherme G. Piccoli
2023-04-19 20:04     ` Deucher, Alexander
2023-04-20 12:36       ` Guilherme G. Piccoli
2023-04-20 12:42         ` gregkh
2023-04-20 12:59           ` Guilherme G. Piccoli
2023-04-20 15:02             ` gregkh
2023-04-20 15:26               ` Alex Deucher
2023-04-20 15:36               ` Guilherme G. Piccoli
2023-04-20 15:56                 ` gregkh
2023-04-20 16:45                   ` Guilherme G. Piccoli
2023-04-22 15:19 ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox