[PATCH 00/11] VCE1 fixes (v2)

public inbox for amd-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed

* [PATCH 00/11] VCE1 fixes (v2)
@ 2026-04-23  1:16 Timur Kristóf
  2026-04-23  1:16 ` [PATCH 01/11] drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti Timur Kristóf
                   ` (10 more replies)
  0 siblings, 11 replies; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

Fix various small issues regarding VCE1
and the workaround to ensure the VCPU BO
has a low 32-bit address.

Also fix an issue around firmware size
and offsets on all VCE versions.

Timur Kristóf (11):
  drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti
  drm/amdgpu/vce1: Check that the GPU address is < 128 MiB
  drm/amdgpu/vce1: Remove superfluous address check
  drm/amdgpu/vce1: Check if VRAM address is lower than GART.
  drm/amdgpu/vce1: Don't repeat GTT MGR node allocation
  drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets
  drm/amdgpu/vce1: Stop using amdgpu_vce_resume
  drm/amdgpu/vce: Check maximum ucode size in amdgpu_vce_resume()
  drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets
  drm/amdgpu/vce3: Fix VCE 3 firmware size and offsets
  drm/amdgpu/vce4: Fix VCE 4 firmware size and offsets

 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c |  9 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c     |  5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h     |  2 +-
 drivers/gpu/drm/amd/amdgpu/vce_v1_0.c       | 64 +++++++++++++--------
 drivers/gpu/drm/amd/amdgpu/vce_v2_0.c       |  6 +-
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c       |  6 +-
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c       |  6 +-
 7 files changed, 61 insertions(+), 37 deletions(-)

-- 
2.53.0

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 01/11] drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti
  2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
@ 2026-04-23  1:16 ` Timur Kristóf
  2026-04-23 11:04   ` Christian König
  2026-04-23  1:16 ` [PATCH 02/11] drm/amdgpu/vce1: Check that the GPU address is < 128 MiB Timur Kristóf
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

The TLB is organized in groups of 8 entries, each one is 4K.
On Tahiti, the HW requires these GART entries to be 32K-aligned.

This fixes a VCE 1 firmware validation failure that can happen
after suspend/resume since we use amdgpu_gtt_mgr for VCE 1.

Fixes: 698fa62f56aa ("drm/amdgpu: Add helper to alloc GART entries")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index 9b0bcf6aca445..673e9e08c66a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -198,12 +198,19 @@ int amdgpu_gtt_mgr_alloc_entries(struct amdgpu_gtt_mgr *mgr,
 				 u64 num_pages,
 				 enum drm_mm_insert_mode mode)
 {
+	u32 alignment = 0;
 	struct amdgpu_device *adev = container_of(mgr, typeof(*adev), mman.gtt_mgr);
 	int r;
 
+	/* Align to TLB size on Tahiti */
+	if (adev->asic_type == CHIP_TAHITI) {
+		alignment = 32 * 1024 / AMDGPU_GPU_PAGE_SIZE;
+		num_pages = ALIGN(num_pages, alignment);
+	}
+
 	spin_lock(&mgr->lock);
 	r = drm_mm_insert_node_in_range(&mgr->mm, mm_node, num_pages,
-					0, GART_ENTRY_WITHOUT_BO_COLOR, 0,
+					alignment, GART_ENTRY_WITHOUT_BO_COLOR, 0,
 					adev->gmc.gart_size >> PAGE_SHIFT,
 					mode);
 	spin_unlock(&mgr->lock);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 02/11] drm/amdgpu/vce1: Check that the GPU address is < 128 MiB
  2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
  2026-04-23  1:16 ` [PATCH 01/11] drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti Timur Kristóf
@ 2026-04-23  1:16 ` Timur Kristóf
  2026-04-23 11:06   ` Christian König
  2026-04-23  1:16 ` [PATCH 03/11] drm/amdgpu/vce1: Remove superfluous address check Timur Kristóf
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

When ensuring the low 32-bit address, make sure it is
less than 128 MiB, otherwise the VCE seems to fail to initialize.
This seems to be an undocumented limitation of the firmware
validation mechanism. Note that in case of VCE1 the BAR
address is zero and we can't change it also due to the
firmware validator.

When programming the mmVCE_VCPU_CACHE_OFFSETn registers,
don't AND them with a mask. This is incorrect because
the register mask is actually 0x0fffffff and useless because
we already ensure the addresses are below the limit.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/vce_v1_0.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
index 5b7b46d242c6d..edabec442cb63 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
@@ -313,17 +313,17 @@ static int vce_v1_0_mc_resume(struct amdgpu_device *adev)
 
 	offset =  adev->vce.gpu_addr + AMDGPU_VCE_FIRMWARE_OFFSET;
 	size = VCE_V1_0_FW_SIZE;
-	WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset & 0x7fffffff);
+	WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset);
 	WREG32(mmVCE_VCPU_CACHE_SIZE0, size);
 
 	offset += size;
 	size = VCE_V1_0_STACK_SIZE;
-	WREG32(mmVCE_VCPU_CACHE_OFFSET1, offset & 0x7fffffff);
+	WREG32(mmVCE_VCPU_CACHE_OFFSET1, offset);
 	WREG32(mmVCE_VCPU_CACHE_SIZE1, size);
 
 	offset += size;
 	size = VCE_V1_0_DATA_SIZE;
-	WREG32(mmVCE_VCPU_CACHE_OFFSET2, offset & 0x7fffffff);
+	WREG32(mmVCE_VCPU_CACHE_OFFSET2, offset);
 	WREG32(mmVCE_VCPU_CACHE_SIZE2, size);
 
 	WREG32_P(mmVCE_LMI_CTRL2, 0x0, ~0x100);
@@ -527,11 +527,15 @@ static int vce_v1_0_early_init(struct amdgpu_ip_block *ip_block)
  * To accomodate that, we put GART to the LOW address range
  * and reserve some GART pages where we map the VCPU BO,
  * so that it gets a 32-bit address.
+ *
+ * The BAR address is zero and we can't change it
+ * due to the firmware validation mechanism.
+ * It seems that it fails to initialize if the address is >= 128 MiB.
  */
 static int vce_v1_0_ensure_vcpu_bo_32bit_addr(struct amdgpu_device *adev)
 {
 	u64 bo_size = amdgpu_bo_size(adev->vce.vcpu_bo);
-	u64 max_vcpu_bo_addr = 0xffffffff - bo_size;
+	u64 max_vcpu_bo_addr = 0x07ffffff - bo_size;
 	u64 num_pages = ALIGN(bo_size, AMDGPU_GPU_PAGE_SIZE) / AMDGPU_GPU_PAGE_SIZE;
 	u64 pa = amdgpu_gmc_vram_pa(adev, adev->vce.vcpu_bo);
 	u64 flags = AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE | AMDGPU_PTE_VALID;
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 03/11] drm/amdgpu/vce1: Remove superfluous address check
  2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
  2026-04-23  1:16 ` [PATCH 01/11] drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti Timur Kristóf
  2026-04-23  1:16 ` [PATCH 02/11] drm/amdgpu/vce1: Check that the GPU address is < 128 MiB Timur Kristóf
@ 2026-04-23  1:16 ` Timur Kristóf
  2026-04-23  1:16 ` [PATCH 04/11] drm/amdgpu/vce1: Check if VRAM address is lower than GART Timur Kristóf
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

The same thing is already checked a few lines above.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/vce_v1_0.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
index edabec442cb63..884f24be36859 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
@@ -557,8 +557,6 @@ static int vce_v1_0_ensure_vcpu_bo_32bit_addr(struct amdgpu_device *adev)
 	amdgpu_gart_map_vram_range(adev, pa, adev->vce.gart_node.start,
 				   num_pages, flags, adev->gart.ptr);
 	adev->vce.gpu_addr = adev->gmc.gart_start + vce_gart_start_offs;
-	if (adev->vce.gpu_addr > max_vcpu_bo_addr)
-		return -EINVAL;
 
 	return 0;
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 04/11] drm/amdgpu/vce1: Check if VRAM address is lower than GART.
  2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
                   ` (2 preceding siblings ...)
  2026-04-23  1:16 ` [PATCH 03/11] drm/amdgpu/vce1: Remove superfluous address check Timur Kristóf
@ 2026-04-23  1:16 ` Timur Kristóf
  2026-04-23  1:16 ` [PATCH 05/11] drm/amdgpu/vce1: Don't repeat GTT MGR node allocation Timur Kristóf
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

Previously, I had assumed this was not possible
so it was OK to not handle it, but now we got a report
from a user who has a board that is configured this way.

When the VCPU BO is already located in a low 32-bit address
in VRAM (eg. when VRAM is mapped to the low address space),
don't do the workaround.

Fixes: 66a80158aa2a ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/vce_v1_0.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
index 884f24be36859..a49f11be74b20 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
@@ -542,6 +542,9 @@ static int vce_v1_0_ensure_vcpu_bo_32bit_addr(struct amdgpu_device *adev)
 	u64 vce_gart_start_offs;
 	int r;
 
+	if (adev->gmc.vram_start < adev->gmc.gart_start)
+		return amdgpu_bo_gpu_offset(adev->vce.vcpu_bo) <= max_vcpu_bo_addr ? 0 : -EINVAL;
+
 	r = amdgpu_gtt_mgr_alloc_entries(&adev->mman.gtt_mgr,
 					 &adev->vce.gart_node, num_pages,
 					 DRM_MM_INSERT_LOW);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 05/11] drm/amdgpu/vce1: Don't repeat GTT MGR node allocation
  2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
                   ` (3 preceding siblings ...)
  2026-04-23  1:16 ` [PATCH 04/11] drm/amdgpu/vce1: Check if VRAM address is lower than GART Timur Kristóf
@ 2026-04-23  1:16 ` Timur Kristóf
  2026-04-23  1:16 ` [PATCH 06/11] drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets Timur Kristóf
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

Only allocate entries from the GTT manager when the
VCE GTT node is not allocated yet. This prevents the
possibility of allocating them multiple times, which
causes issues during GPU reset and suspend/resume.

Fixes: 66a80158aa2a ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/vce_v1_0.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
index a49f11be74b20..92c3cf3fce4f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
@@ -545,11 +545,13 @@ static int vce_v1_0_ensure_vcpu_bo_32bit_addr(struct amdgpu_device *adev)
 	if (adev->gmc.vram_start < adev->gmc.gart_start)
 		return amdgpu_bo_gpu_offset(adev->vce.vcpu_bo) <= max_vcpu_bo_addr ? 0 : -EINVAL;
 
-	r = amdgpu_gtt_mgr_alloc_entries(&adev->mman.gtt_mgr,
-					 &adev->vce.gart_node, num_pages,
-					 DRM_MM_INSERT_LOW);
-	if (r)
-		return r;
+	if (!drm_mm_node_allocated(&adev->vce.gart_node)) {
+		r = amdgpu_gtt_mgr_alloc_entries(&adev->mman.gtt_mgr,
+						 &adev->vce.gart_node, num_pages,
+						 DRM_MM_INSERT_LOW);
+		if (r)
+			return r;
+	}
 
 	vce_gart_start_offs = amdgpu_gtt_node_to_byte_offset(&adev->vce.gart_node);
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 06/11] drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets
  2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
                   ` (4 preceding siblings ...)
  2026-04-23  1:16 ` [PATCH 05/11] drm/amdgpu/vce1: Don't repeat GTT MGR node allocation Timur Kristóf
@ 2026-04-23  1:16 ` Timur Kristóf
  2026-04-23 11:12   ` Christian König
  2026-04-23  1:16 ` [PATCH 07/11] drm/amdgpu/vce1: Stop using amdgpu_vce_resume Timur Kristóf
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.

Make sure the stack and data offsets are aligned to
the 32K TLB size.

Check that the FW microcode actually fits in the
space that is reserved for it.

Fixes: d4a640d4b9f3 ("drm/amdgpu/vce1: Implement VCE1 IP block (v2)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/vce_v1_0.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
index 92c3cf3fce4f0..c8e7297fd7ca3 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
@@ -42,9 +42,10 @@
 #include "oss/oss_1_0_d.h"
 #include "oss/oss_1_0_sh_mask.h"
 
+#define VCE_V1_0_ALIGNMENT	(32 * 1024)
 #define VCE_V1_0_FW_SIZE	(256 * 1024)
 #define VCE_V1_0_STACK_SIZE	(64 * 1024)
-#define VCE_V1_0_DATA_SIZE	(7808 * (AMDGPU_MAX_VCE_HANDLES + 1))
+#define VCE_V1_0_DATA_SIZE	(ALIGN(7808 * (AMDGPU_MAX_VCE_HANDLES + 1), VCE_V1_0_ALIGNMENT))
 #define VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK	0x02
 
 static void vce_v1_0_set_ring_funcs(struct amdgpu_device *adev);
@@ -189,17 +190,22 @@ static int vce_v1_0_load_fw_signature(struct amdgpu_device *adev)
 {
 	const struct common_firmware_header *hdr;
 	struct vce_v1_0_fw_signature *sign;
-	unsigned int ucode_offset;
+	u32 ucode_offset;
+	u32 ucode_size;
 	uint32_t chip_id;
 	u32 *cpu_addr;
 	int i;
 
 	hdr = (const struct common_firmware_header *)adev->vce.fw->data;
 	ucode_offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
+	ucode_size = hdr->ucode_size_bytes - sizeof(struct vce_v1_0_fw_signature *);
 	cpu_addr = adev->vce.cpu_addr;
 
 	sign = (void *)adev->vce.fw->data + ucode_offset;
 
+	if (ucode_size > VCE_V1_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET)
+		return -EINVAL;
+
 	switch (adev->asic_type) {
 	case CHIP_TAHITI:
 		chip_id = 0x01000014;
@@ -231,7 +237,7 @@ static int vce_v1_0_load_fw_signature(struct amdgpu_device *adev)
 	cpu_addr[4] = cpu_to_le32(le32_to_cpu(sign->length) + 64);
 
 	memset_io(&cpu_addr[5], 0, 44);
-	memcpy_toio(&cpu_addr[16], &sign[1], hdr->ucode_size_bytes - sizeof(*sign));
+	memcpy_toio(&cpu_addr[16], &sign[1], ucode_size);
 
 	cpu_addr += (le32_to_cpu(sign->length) + 64) / 4;
 	memcpy_toio(&cpu_addr[0], &sign->val[i].sigval[0], 16);
@@ -312,17 +318,22 @@ static int vce_v1_0_mc_resume(struct amdgpu_device *adev)
 	WREG32(mmVCE_VCPU_SCRATCH7, AMDGPU_MAX_VCE_HANDLES);
 
 	offset =  adev->vce.gpu_addr + AMDGPU_VCE_FIRMWARE_OFFSET;
-	size = VCE_V1_0_FW_SIZE;
+	size = VCE_V1_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET;
 	WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset);
 	WREG32(mmVCE_VCPU_CACHE_SIZE0, size);
 
 	offset += size;
 	size = VCE_V1_0_STACK_SIZE;
+	WARN_ON(!IS_ALIGNED(offset, VCE_V1_0_ALIGNMENT));
+	WARN_ON(!IS_ALIGNED(size, VCE_V1_0_ALIGNMENT));
 	WREG32(mmVCE_VCPU_CACHE_OFFSET1, offset);
 	WREG32(mmVCE_VCPU_CACHE_SIZE1, size);
 
 	offset += size;
 	size = VCE_V1_0_DATA_SIZE;
+	WARN_ON(!IS_ALIGNED(offset, VCE_V1_0_ALIGNMENT));
+	WARN_ON(!IS_ALIGNED(size, VCE_V1_0_ALIGNMENT));
+	WARN_ON(offset + size > amdgpu_bo_size(adev->vce.vcpu_bo));
 	WREG32(mmVCE_VCPU_CACHE_OFFSET2, offset);
 	WREG32(mmVCE_VCPU_CACHE_SIZE2, size);
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 07/11] drm/amdgpu/vce1: Stop using amdgpu_vce_resume
  2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
                   ` (5 preceding siblings ...)
  2026-04-23  1:16 ` [PATCH 06/11] drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets Timur Kristóf
@ 2026-04-23  1:16 ` Timur Kristóf
  2026-04-23 11:13   ` Christian König
  2026-04-23  1:16 ` [PATCH 08/11] drm/amdgpu/vce: Check maximum ucode size in amdgpu_vce_resume() Timur Kristóf
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

The VCE1 firmware works slightly differently and is already
loaded by vce_v1_0_load_fw(). It doesn't actually need to
call amdgpu_vce_resume().

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/vce_v1_0.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
index c8e7297fd7ca3..db8cc97a72d8e 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
@@ -178,7 +178,7 @@ static void vce_v1_0_init_cg(struct amdgpu_device *adev)
 }
 
 /**
- * vce_v1_0_load_fw_signature - load firmware signature into VCPU BO
+ * vce_v1_0_load_fw() - load firmware signature into VCPU BO
  *
  * @adev: amdgpu_device pointer
  *
@@ -186,7 +186,7 @@ static void vce_v1_0_init_cg(struct amdgpu_device *adev)
  * This function finds the signature appropriate for the current
  * ASIC and writes that into the VCPU BO.
  */
-static int vce_v1_0_load_fw_signature(struct amdgpu_device *adev)
+static int vce_v1_0_load_fw(struct amdgpu_device *adev)
 {
 	const struct common_firmware_header *hdr;
 	struct vce_v1_0_fw_signature *sign;
@@ -232,6 +232,8 @@ static int vce_v1_0_load_fw_signature(struct amdgpu_device *adev)
 		return -EINVAL;
 	}
 
+	memset_io(&cpu_addr[0], 0, amdgpu_bo_size(adev->vce.vcpu_bo));
+
 	cpu_addr += (256 - 64) / 4;
 	memcpy_toio(&cpu_addr[0], &sign->val[i].nonce[0], 16);
 	cpu_addr[4] = cpu_to_le32(le32_to_cpu(sign->length) + 64);
@@ -592,10 +594,7 @@ static int vce_v1_0_sw_init(struct amdgpu_ip_block *ip_block)
 	if (r)
 		return r;
 
-	r = amdgpu_vce_resume(adev);
-	if (r)
-		return r;
-	r = vce_v1_0_load_fw_signature(adev);
+	r = vce_v1_0_load_fw(adev);
 	if (r)
 		return r;
 	r = vce_v1_0_ensure_vcpu_bo_32bit_addr(adev);
@@ -714,10 +713,7 @@ static int vce_v1_0_resume(struct amdgpu_ip_block *ip_block)
 	struct amdgpu_device *adev = ip_block->adev;
 	int r;
 
-	r = amdgpu_vce_resume(adev);
-	if (r)
-		return r;
-	r = vce_v1_0_load_fw_signature(adev);
+	r = vce_v1_0_load_fw(adev);
 	if (r)
 		return r;
 	r = vce_v1_0_ensure_vcpu_bo_32bit_addr(adev);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 08/11] drm/amdgpu/vce: Check maximum ucode size in amdgpu_vce_resume()
  2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
                   ` (6 preceding siblings ...)
  2026-04-23  1:16 ` [PATCH 07/11] drm/amdgpu/vce1: Stop using amdgpu_vce_resume Timur Kristóf
@ 2026-04-23  1:16 ` Timur Kristóf
  2026-04-23  1:16 ` [PATCH 09/11] drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets Timur Kristóf
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

Verify that the ucode fits the part of the BO that is
specifically meant for it to avoid overflowing it.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 5 ++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h | 2 +-
 drivers/gpu/drm/amd/amdgpu/vce_v2_0.c   | 4 ++--
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c   | 4 ++--
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c   | 4 ++--
 5 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index efdebd9c0a1f3..8c620254f0374 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -324,7 +324,7 @@ int amdgpu_vce_suspend(struct amdgpu_device *adev)
  * @adev: amdgpu_device pointer
  *
  */
-int amdgpu_vce_resume(struct amdgpu_device *adev)
+int amdgpu_vce_resume(struct amdgpu_device *adev, const unsigned long max_size)
 {
 	const struct common_firmware_header *hdr;
 	unsigned int offset;
@@ -336,6 +336,9 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
 	hdr = (const struct common_firmware_header *)adev->vce.fw->data;
 	offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
 
+	if (adev->vce.fw->size - offset > max_size)
+		return -EINVAL;
+
 	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
 		memset_io(adev->vce.cpu_addr, 0, amdgpu_bo_size(adev->vce.vcpu_bo));
 		memcpy_toio(adev->vce.cpu_addr, adev->vce.fw->data + offset,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
index 778c714c8385d..a57e2f6f5f930 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
@@ -60,7 +60,7 @@ int amdgpu_vce_sw_init(struct amdgpu_device *adev, unsigned long size);
 int amdgpu_vce_sw_fini(struct amdgpu_device *adev);
 int amdgpu_vce_entity_init(struct amdgpu_device *adev, struct amdgpu_ring *ring);
 int amdgpu_vce_suspend(struct amdgpu_device *adev);
-int amdgpu_vce_resume(struct amdgpu_device *adev);
+int amdgpu_vce_resume(struct amdgpu_device *adev, const unsigned long max_size);
 void amdgpu_vce_free_handles(struct amdgpu_device *adev, struct drm_file *filp);
 int amdgpu_vce_ring_parse_cs(struct amdgpu_cs_parser *p, struct amdgpu_job *job,
 			     struct amdgpu_ib *ib);
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
index db149eda62044..00b4037d4bc89 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
@@ -437,7 +437,7 @@ static int vce_v2_0_sw_init(struct amdgpu_ip_block *ip_block)
 	if (r)
 		return r;
 
-	r = amdgpu_vce_resume(adev);
+	r = amdgpu_vce_resume(adev, VCE_V2_0_FW_SIZE);
 	if (r)
 		return r;
 
@@ -533,7 +533,7 @@ static int vce_v2_0_resume(struct amdgpu_ip_block *ip_block)
 {
 	int r;
 
-	r = amdgpu_vce_resume(ip_block->adev);
+	r = amdgpu_vce_resume(ip_block->adev, VCE_V2_0_FW_SIZE);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 03d79e464f04f..2e97376ff30e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -440,7 +440,7 @@ static int vce_v3_0_sw_init(struct amdgpu_ip_block *ip_block)
 	if (adev->vce.fw_version < FW_52_8_3)
 		adev->vce.num_rings = 2;
 
-	r = amdgpu_vce_resume(adev);
+	r = amdgpu_vce_resume(adev, VCE_V3_0_FW_SIZE);
 	if (r)
 		return r;
 
@@ -544,7 +544,7 @@ static int vce_v3_0_resume(struct amdgpu_ip_block *ip_block)
 {
 	int r;
 
-	r = amdgpu_vce_resume(ip_block->adev);
+	r = amdgpu_vce_resume(ip_block->adev, VCE_V3_0_FW_SIZE);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index ee445d8abe474..3309e7b8f2a2e 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -462,7 +462,7 @@ static int vce_v4_0_sw_init(struct amdgpu_ip_block *ip_block)
 			ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
 		drm_info(adev_to_drm(adev), "PSP loading VCE firmware\n");
 	} else {
-		r = amdgpu_vce_resume(adev);
+		r = amdgpu_vce_resume(adev, VCE_V4_0_FW_SIZE);
 		if (r)
 			return r;
 	}
@@ -624,7 +624,7 @@ static int vce_v4_0_resume(struct amdgpu_ip_block *ip_block)
 			drm_dev_exit(idx);
 		}
 	} else {
-		r = amdgpu_vce_resume(adev);
+		r = amdgpu_vce_resume(adev, VCE_V4_0_FW_SIZE);
 		if (r)
 			return r;
 	}
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 09/11] drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets
  2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
                   ` (7 preceding siblings ...)
  2026-04-23  1:16 ` [PATCH 08/11] drm/amdgpu/vce: Check maximum ucode size in amdgpu_vce_resume() Timur Kristóf
@ 2026-04-23  1:16 ` Timur Kristóf
  2026-04-23 11:28   ` Christian König
  2026-04-23 18:10   ` John Olender
  2026-04-23  1:16 ` [PATCH 10/11] drm/amdgpu/vce3: Fix VCE 3 " Timur Kristóf
  2026-04-23  1:16 ` [PATCH 11/11] drm/amdgpu/vce4: Fix VCE 4 " Timur Kristóf
  10 siblings, 2 replies; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.

This may fix VM faults when using VCE 2.

Cc: John Olender <john.olender@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4802
Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
index 00b4037d4bc89..3b493a2e94dd0 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
@@ -183,7 +183,7 @@ static void vce_v2_0_mc_resume(struct amdgpu_device *adev)
 	WREG32(mmVCE_LMI_VCPU_CACHE_40BIT_BAR, (adev->vce.gpu_addr >> 8));
 
 	offset = AMDGPU_VCE_FIRMWARE_OFFSET;
-	size = VCE_V2_0_FW_SIZE;
+	size = VCE_V2_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET;
 	WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset & 0x7fffffff);
 	WREG32(mmVCE_VCPU_CACHE_SIZE0, size);
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 10/11] drm/amdgpu/vce3: Fix VCE 3 firmware size and offsets
  2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
                   ` (8 preceding siblings ...)
  2026-04-23  1:16 ` [PATCH 09/11] drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets Timur Kristóf
@ 2026-04-23  1:16 ` Timur Kristóf
  2026-04-23 11:29   ` Christian König
  2026-04-23  1:16 ` [PATCH 11/11] drm/amdgpu/vce4: Fix VCE 4 " Timur Kristóf
  10 siblings, 1 reply; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.

This may fix VM faults when using VCE 3.

Cc: John Olender <john.olender@gmail.com>
Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 2e97376ff30e5..2b6ddb6bec3b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -574,7 +574,7 @@ static void vce_v3_0_mc_resume(struct amdgpu_device *adev, int idx)
 	} else
 		WREG32(mmVCE_LMI_VCPU_CACHE_40BIT_BAR, (adev->vce.gpu_addr >> 8));
 	offset = AMDGPU_VCE_FIRMWARE_OFFSET;
-	size = VCE_V3_0_FW_SIZE;
+	size = VCE_V3_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET;
 	WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset & 0x7fffffff);
 	WREG32(mmVCE_VCPU_CACHE_SIZE0, size);
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 11/11] drm/amdgpu/vce4: Fix VCE 4 firmware size and offsets
  2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
                   ` (9 preceding siblings ...)
  2026-04-23  1:16 ` [PATCH 10/11] drm/amdgpu/vce3: Fix VCE 3 " Timur Kristóf
@ 2026-04-23  1:16 ` Timur Kristóf
  2026-04-23 11:31   ` Christian König
  10 siblings, 1 reply; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23  1:16 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, christian.koenig, John Olender
  Cc: Timur Kristóf

The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.

This may fix VM faults when using VCE 4.

Cc: John Olender <john.olender@gmail.com>
Fixes: c1dc356a116c ("drm/amdgpu: add initial vce 4.0 support for vega10")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index 3309e7b8f2a2e..eaa3e05a52e59 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -281,7 +281,7 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev)
 						mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
 						(adev->vce.gpu_addr >> 40) & 0xff);
 
-		size = VCE_V4_0_FW_SIZE;
+		size = VCE_V4_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET;
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_SIZE0), size);
 
 		offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? offset + size : 0;
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 01/11] drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti
  2026-04-23  1:16 ` [PATCH 01/11] drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti Timur Kristóf
@ 2026-04-23 11:04   ` Christian König
  2026-04-23 12:18     ` Timur Kristóf
  0 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2026-04-23 11:04 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx, alexander.deucher, John Olender

On 4/23/26 03:16, Timur Kristóf wrote:
> The TLB is organized in groups of 8 entries, each one is 4K.
> On Tahiti, the HW requires these GART entries to be 32K-aligned.
> 
> This fixes a VCE 1 firmware validation failure that can happen
> after suspend/resume since we use amdgpu_gtt_mgr for VCE 1.
> 
> Fixes: 698fa62f56aa ("drm/amdgpu: Add helper to alloc GART entries")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> index 9b0bcf6aca445..673e9e08c66a0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> @@ -198,12 +198,19 @@ int amdgpu_gtt_mgr_alloc_entries(struct amdgpu_gtt_mgr *mgr,
>  				 u64 num_pages,
>  				 enum drm_mm_insert_mode mode)
>  {
> +	u32 alignment = 0;
>  	struct amdgpu_device *adev = container_of(mgr, typeof(*adev), mman.gtt_mgr);
>  	int r;

Only a nit, but reverse xmas tree order please for variable declarations.

>  
> +	/* Align to TLB size on Tahiti */

Oh that needs improvement.

Maybe something like /* Align to TLB L2 cache entry size to work around V bit HW bug */

Mostly nobody will know what that "V bit HW bug" is, but at least AMD people can search for that in the HW docs.

With that fixed Reviewed-by: Christian König <christian.koenig@amd.com>.

Thanks,
Christian.

> +	if (adev->asic_type == CHIP_TAHITI) {
> +		alignment = 32 * 1024 / AMDGPU_GPU_PAGE_SIZE;
> +		num_pages = ALIGN(num_pages, alignment);
> +	}
> +
>  	spin_lock(&mgr->lock);
>  	r = drm_mm_insert_node_in_range(&mgr->mm, mm_node, num_pages,
> -					0, GART_ENTRY_WITHOUT_BO_COLOR, 0,
> +					alignment, GART_ENTRY_WITHOUT_BO_COLOR, 0,
>  					adev->gmc.gart_size >> PAGE_SHIFT,
>  					mode);
>  	spin_unlock(&mgr->lock);


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 02/11] drm/amdgpu/vce1: Check that the GPU address is < 128 MiB
  2026-04-23  1:16 ` [PATCH 02/11] drm/amdgpu/vce1: Check that the GPU address is < 128 MiB Timur Kristóf
@ 2026-04-23 11:06   ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2026-04-23 11:06 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx, alexander.deucher, John Olender



On 4/23/26 03:16, Timur Kristóf wrote:
> When ensuring the low 32-bit address, make sure it is
> less than 128 MiB, otherwise the VCE seems to fail to initialize.
> This seems to be an undocumented limitation of the firmware
> validation mechanism. Note that in case of VCE1 the BAR
> address is zero and we can't change it also due to the
> firmware validator.
> 
> When programming the mmVCE_VCPU_CACHE_OFFSETn registers,
> don't AND them with a mask. This is incorrect because
> the register mask is actually 0x0fffffff and useless because
> we already ensure the addresses are below the limit.
> 
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/vce_v1_0.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> index 5b7b46d242c6d..edabec442cb63 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> @@ -313,17 +313,17 @@ static int vce_v1_0_mc_resume(struct amdgpu_device *adev)
>  
>  	offset =  adev->vce.gpu_addr + AMDGPU_VCE_FIRMWARE_OFFSET;
>  	size = VCE_V1_0_FW_SIZE;
> -	WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset & 0x7fffffff);
> +	WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset);
>  	WREG32(mmVCE_VCPU_CACHE_SIZE0, size);
>  
>  	offset += size;
>  	size = VCE_V1_0_STACK_SIZE;
> -	WREG32(mmVCE_VCPU_CACHE_OFFSET1, offset & 0x7fffffff);
> +	WREG32(mmVCE_VCPU_CACHE_OFFSET1, offset);
>  	WREG32(mmVCE_VCPU_CACHE_SIZE1, size);
>  
>  	offset += size;
>  	size = VCE_V1_0_DATA_SIZE;
> -	WREG32(mmVCE_VCPU_CACHE_OFFSET2, offset & 0x7fffffff);
> +	WREG32(mmVCE_VCPU_CACHE_OFFSET2, offset);
>  	WREG32(mmVCE_VCPU_CACHE_SIZE2, size);
>  
>  	WREG32_P(mmVCE_LMI_CTRL2, 0x0, ~0x100);
> @@ -527,11 +527,15 @@ static int vce_v1_0_early_init(struct amdgpu_ip_block *ip_block)
>   * To accomodate that, we put GART to the LOW address range
>   * and reserve some GART pages where we map the VCPU BO,
>   * so that it gets a 32-bit address.
> + *
> + * The BAR address is zero and we can't change it
> + * due to the firmware validation mechanism.
> + * It seems that it fails to initialize if the address is >= 128 MiB.
>   */
>  static int vce_v1_0_ensure_vcpu_bo_32bit_addr(struct amdgpu_device *adev)
>  {
>  	u64 bo_size = amdgpu_bo_size(adev->vce.vcpu_bo);
> -	u64 max_vcpu_bo_addr = 0xffffffff - bo_size;
> +	u64 max_vcpu_bo_addr = 0x07ffffff - bo_size;
>  	u64 num_pages = ALIGN(bo_size, AMDGPU_GPU_PAGE_SIZE) / AMDGPU_GPU_PAGE_SIZE;
>  	u64 pa = amdgpu_gmc_vram_pa(adev, adev->vce.vcpu_bo);
>  	u64 flags = AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE | AMDGPU_PTE_VALID;


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 06/11] drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets
  2026-04-23  1:16 ` [PATCH 06/11] drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets Timur Kristóf
@ 2026-04-23 11:12   ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2026-04-23 11:12 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx, alexander.deucher, John Olender

On 4/23/26 03:16, Timur Kristóf wrote:
> The VCPU BO contains the actual FW at an offset, but
> it was not calculated into the VCPU BO size.
> Subtract this from the FW size to make sure there is
> no out of bounds access.
> 
> Make sure the stack and data offsets are aligned to
> the 32K TLB size.
> 
> Check that the FW microcode actually fits in the
> space that is reserved for it.
> 
> Fixes: d4a640d4b9f3 ("drm/amdgpu/vce1: Implement VCE1 IP block (v2)")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/vce_v1_0.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> index 92c3cf3fce4f0..c8e7297fd7ca3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> @@ -42,9 +42,10 @@
>  #include "oss/oss_1_0_d.h"
>  #include "oss/oss_1_0_sh_mask.h"
>  
> +#define VCE_V1_0_ALIGNMENT	(32 * 1024)
>  #define VCE_V1_0_FW_SIZE	(256 * 1024)
>  #define VCE_V1_0_STACK_SIZE	(64 * 1024)
> -#define VCE_V1_0_DATA_SIZE	(7808 * (AMDGPU_MAX_VCE_HANDLES + 1))
> +#define VCE_V1_0_DATA_SIZE	(ALIGN(7808 * (AMDGPU_MAX_VCE_HANDLES + 1), VCE_V1_0_ALIGNMENT))
>  #define VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK	0x02
>  
>  static void vce_v1_0_set_ring_funcs(struct amdgpu_device *adev);
> @@ -189,17 +190,22 @@ static int vce_v1_0_load_fw_signature(struct amdgpu_device *adev)
>  {
>  	const struct common_firmware_header *hdr;
>  	struct vce_v1_0_fw_signature *sign;
> -	unsigned int ucode_offset;
> +	u32 ucode_offset;
> +	u32 ucode_size;
>  	uint32_t chip_id;
>  	u32 *cpu_addr;
>  	int i;
>  
>  	hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>  	ucode_offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
> +	ucode_size = hdr->ucode_size_bytes - sizeof(struct vce_v1_0_fw_signature *);
>  	cpu_addr = adev->vce.cpu_addr;
>  
>  	sign = (void *)adev->vce.fw->data + ucode_offset;
>  
> +	if (ucode_size > VCE_V1_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET)
> +		return -EINVAL;
> +
>  	switch (adev->asic_type) {
>  	case CHIP_TAHITI:
>  		chip_id = 0x01000014;
> @@ -231,7 +237,7 @@ static int vce_v1_0_load_fw_signature(struct amdgpu_device *adev)
>  	cpu_addr[4] = cpu_to_le32(le32_to_cpu(sign->length) + 64);
>  
>  	memset_io(&cpu_addr[5], 0, 44);
> -	memcpy_toio(&cpu_addr[16], &sign[1], hdr->ucode_size_bytes - sizeof(*sign));
> +	memcpy_toio(&cpu_addr[16], &sign[1], ucode_size);
>  
>  	cpu_addr += (le32_to_cpu(sign->length) + 64) / 4;
>  	memcpy_toio(&cpu_addr[0], &sign->val[i].sigval[0], 16);
> @@ -312,17 +318,22 @@ static int vce_v1_0_mc_resume(struct amdgpu_device *adev)
>  	WREG32(mmVCE_VCPU_SCRATCH7, AMDGPU_MAX_VCE_HANDLES);
>  
>  	offset =  adev->vce.gpu_addr + AMDGPU_VCE_FIRMWARE_OFFSET;
> -	size = VCE_V1_0_FW_SIZE;
> +	size = VCE_V1_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET;
>  	WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset);
>  	WREG32(mmVCE_VCPU_CACHE_SIZE0, size);
>  
>  	offset += size;
>  	size = VCE_V1_0_STACK_SIZE;
> +	WARN_ON(!IS_ALIGNED(offset, VCE_V1_0_ALIGNMENT));
> +	WARN_ON(!IS_ALIGNED(size, VCE_V1_0_ALIGNMENT));
>  	WREG32(mmVCE_VCPU_CACHE_OFFSET1, offset);
>  	WREG32(mmVCE_VCPU_CACHE_SIZE1, size);
>  
>  	offset += size;
>  	size = VCE_V1_0_DATA_SIZE;
> +	WARN_ON(!IS_ALIGNED(offset, VCE_V1_0_ALIGNMENT));
> +	WARN_ON(!IS_ALIGNED(size, VCE_V1_0_ALIGNMENT));
> +	WARN_ON(offset + size > amdgpu_bo_size(adev->vce.vcpu_bo));
>  	WREG32(mmVCE_VCPU_CACHE_OFFSET2, offset);
>  	WREG32(mmVCE_VCPU_CACHE_SIZE2, size);
>  


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/11] drm/amdgpu/vce1: Stop using amdgpu_vce_resume
  2026-04-23  1:16 ` [PATCH 07/11] drm/amdgpu/vce1: Stop using amdgpu_vce_resume Timur Kristóf
@ 2026-04-23 11:13   ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2026-04-23 11:13 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx, alexander.deucher, John Olender



On 4/23/26 03:16, Timur Kristóf wrote:
> The VCE1 firmware works slightly differently and is already
> loaded by vce_v1_0_load_fw(). It doesn't actually need to
> call amdgpu_vce_resume().
> 
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/vce_v1_0.c | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> index c8e7297fd7ca3..db8cc97a72d8e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> @@ -178,7 +178,7 @@ static void vce_v1_0_init_cg(struct amdgpu_device *adev)
>  }
>  
>  /**
> - * vce_v1_0_load_fw_signature - load firmware signature into VCPU BO
> + * vce_v1_0_load_fw() - load firmware signature into VCPU BO
>   *
>   * @adev: amdgpu_device pointer
>   *
> @@ -186,7 +186,7 @@ static void vce_v1_0_init_cg(struct amdgpu_device *adev)
>   * This function finds the signature appropriate for the current
>   * ASIC and writes that into the VCPU BO.
>   */
> -static int vce_v1_0_load_fw_signature(struct amdgpu_device *adev)
> +static int vce_v1_0_load_fw(struct amdgpu_device *adev)
>  {
>  	const struct common_firmware_header *hdr;
>  	struct vce_v1_0_fw_signature *sign;
> @@ -232,6 +232,8 @@ static int vce_v1_0_load_fw_signature(struct amdgpu_device *adev)
>  		return -EINVAL;
>  	}
>  
> +	memset_io(&cpu_addr[0], 0, amdgpu_bo_size(adev->vce.vcpu_bo));
> +
>  	cpu_addr += (256 - 64) / 4;
>  	memcpy_toio(&cpu_addr[0], &sign->val[i].nonce[0], 16);
>  	cpu_addr[4] = cpu_to_le32(le32_to_cpu(sign->length) + 64);
> @@ -592,10 +594,7 @@ static int vce_v1_0_sw_init(struct amdgpu_ip_block *ip_block)
>  	if (r)
>  		return r;
>  
> -	r = amdgpu_vce_resume(adev);
> -	if (r)
> -		return r;
> -	r = vce_v1_0_load_fw_signature(adev);
> +	r = vce_v1_0_load_fw(adev);
>  	if (r)
>  		return r;
>  	r = vce_v1_0_ensure_vcpu_bo_32bit_addr(adev);
> @@ -714,10 +713,7 @@ static int vce_v1_0_resume(struct amdgpu_ip_block *ip_block)
>  	struct amdgpu_device *adev = ip_block->adev;
>  	int r;
>  
> -	r = amdgpu_vce_resume(adev);
> -	if (r)
> -		return r;
> -	r = vce_v1_0_load_fw_signature(adev);
> +	r = vce_v1_0_load_fw(adev);
>  	if (r)
>  		return r;
>  	r = vce_v1_0_ensure_vcpu_bo_32bit_addr(adev);


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 09/11] drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets
  2026-04-23  1:16 ` [PATCH 09/11] drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets Timur Kristóf
@ 2026-04-23 11:28   ` Christian König
  2026-04-23 18:10   ` John Olender
  1 sibling, 0 replies; 23+ messages in thread
From: Christian König @ 2026-04-23 11:28 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx, alexander.deucher, John Olender

On 4/23/26 03:16, Timur Kristóf wrote:
> The VCPU BO contains the actual FW at an offset, but
> it was not calculated into the VCPU BO size.
> Subtract this from the FW size to make sure there is
> no out of bounds access.
> 
> This may fix VM faults when using VCE 2.
> 
> Cc: John Olender <john.olender@gmail.com>
> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4802
> Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> index 00b4037d4bc89..3b493a2e94dd0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> @@ -183,7 +183,7 @@ static void vce_v2_0_mc_resume(struct amdgpu_device *adev)
>  	WREG32(mmVCE_LMI_VCPU_CACHE_40BIT_BAR, (adev->vce.gpu_addr >> 8));
>  
>  	offset = AMDGPU_VCE_FIRMWARE_OFFSET;
> -	size = VCE_V2_0_FW_SIZE;
> +	size = VCE_V2_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET;
>  	WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset & 0x7fffffff);
>  	WREG32(mmVCE_VCPU_CACHE_SIZE0, size);
>  


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 10/11] drm/amdgpu/vce3: Fix VCE 3 firmware size and offsets
  2026-04-23  1:16 ` [PATCH 10/11] drm/amdgpu/vce3: Fix VCE 3 " Timur Kristóf
@ 2026-04-23 11:29   ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2026-04-23 11:29 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx, alexander.deucher, John Olender

On 4/23/26 03:16, Timur Kristóf wrote:
> The VCPU BO contains the actual FW at an offset, but
> it was not calculated into the VCPU BO size.
> Subtract this from the FW size to make sure there is
> no out of bounds access.
> 
> This may fix VM faults when using VCE 3.
> 
> Cc: John Olender <john.olender@gmail.com>
> Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> index 2e97376ff30e5..2b6ddb6bec3b0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> @@ -574,7 +574,7 @@ static void vce_v3_0_mc_resume(struct amdgpu_device *adev, int idx)
>  	} else
>  		WREG32(mmVCE_LMI_VCPU_CACHE_40BIT_BAR, (adev->vce.gpu_addr >> 8));
>  	offset = AMDGPU_VCE_FIRMWARE_OFFSET;
> -	size = VCE_V3_0_FW_SIZE;
> +	size = VCE_V3_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET;
>  	WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset & 0x7fffffff);
>  	WREG32(mmVCE_VCPU_CACHE_SIZE0, size);
>  


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 11/11] drm/amdgpu/vce4: Fix VCE 4 firmware size and offsets
  2026-04-23  1:16 ` [PATCH 11/11] drm/amdgpu/vce4: Fix VCE 4 " Timur Kristóf
@ 2026-04-23 11:31   ` Christian König
  2026-04-23 11:50     ` Timur Kristóf
  0 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2026-04-23 11:31 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx, alexander.deucher, John Olender,
	Liu, Leo



On 4/23/26 03:16, Timur Kristóf wrote:
> The VCPU BO contains the actual FW at an offset, but
> it was not calculated into the VCPU BO size.
> Subtract this from the FW size to make sure there is
> no out of bounds access.
> 
> This may fix VM faults when using VCE 4.
> 
> Cc: John Olender <john.olender@gmail.com>
> Fixes: c1dc356a116c ("drm/amdgpu: add initial vce 4.0 support for vega10")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>

Leo can you take a look at this? VCE4 doesn't use the classic VCE FW validation any more.

So I'm not sure we have nor need that here.

Regards,
Christian.

> ---
>  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> index 3309e7b8f2a2e..eaa3e05a52e59 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> @@ -281,7 +281,7 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev)
>  						mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
>  						(adev->vce.gpu_addr >> 40) & 0xff);
>  
> -		size = VCE_V4_0_FW_SIZE;
> +		size = VCE_V4_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET;
>  		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_SIZE0), size);
>  
>  		offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? offset + size : 0;


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 11/11] drm/amdgpu/vce4: Fix VCE 4 firmware size and offsets
  2026-04-23 11:31   ` Christian König
@ 2026-04-23 11:50     ` Timur Kristóf
  0 siblings, 0 replies; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23 11:50 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, John Olender, Liu, Leo,
	Christian König

On Thursday, April 23, 2026 1:31:03 PM Central European Summer Time Christian 
König wrote:
> On 4/23/26 03:16, Timur Kristóf wrote:
> > The VCPU BO contains the actual FW at an offset, but
> > it was not calculated into the VCPU BO size.
> > Subtract this from the FW size to make sure there is
> > no out of bounds access.
> > 
> > This may fix VM faults when using VCE 4.
> > 
> > Cc: John Olender <john.olender@gmail.com>
> > Fixes: c1dc356a116c ("drm/amdgpu: add initial vce 4.0 support for vega10")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> 
> Leo can you take a look at this? VCE4 doesn't use the classic VCE FW
> validation any more.
> 
> So I'm not sure we have nor need that here.
> 
> Regards,
> Christian.

Hi  Leo & Christian,

If you take a look at vce_v4_0_mc_resume() you can see that it initializes the 
firmware offset like this:

offset = AMDGPU_VCE_FIRMWARE_OFFSET;

Specifically for the non-PSP code path, this triggers exactly the same issue as 
VCE2-3, that this causes the STACK and DATA to be also offset, and hence the 
DATA will be out of bounds of the VCPU BO.

For the PSP code path, the STACK and DATA no longer depend on the firmware 
offset and size, so that is fortunately not an issue anymore. In that case 
however the driver still uses the same offset for the ucode.

Best regards,
Timur

> 
> > ---
> > 
> >  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> > b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c index
> > 3309e7b8f2a2e..eaa3e05a52e59 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> > @@ -281,7 +281,7 @@ static int vce_v4_0_sriov_start(struct amdgpu_device
> > *adev)> 
> >  						
mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
> >  						(adev-
>vce.gpu_addr >> 40) & 0xff);
> > 
> > -		size = VCE_V4_0_FW_SIZE;
> > +		size = VCE_V4_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET;
> > 
> >  		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
> >  		mmVCE_VCPU_CACHE_SIZE0), size);
> >  		
> >  		offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) 
? offset +
> >  		size : 0;





^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 01/11] drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti
  2026-04-23 11:04   ` Christian König
@ 2026-04-23 12:18     ` Timur Kristóf
  2026-04-23 13:32       ` Christian König
  0 siblings, 1 reply; 23+ messages in thread
From: Timur Kristóf @ 2026-04-23 12:18 UTC (permalink / raw)
  To: amd-gfx, alexander.deucher, John Olender, Christian König

On Thursday, April 23, 2026 1:04:53 PM Central European Summer Time Christian 
König wrote:
> On 4/23/26 03:16, Timur Kristóf wrote:
> > The TLB is organized in groups of 8 entries, each one is 4K.
> > On Tahiti, the HW requires these GART entries to be 32K-aligned.
> > 
> > This fixes a VCE 1 firmware validation failure that can happen
> > after suspend/resume since we use amdgpu_gtt_mgr for VCE 1.
> > 
> > Fixes: 698fa62f56aa ("drm/amdgpu: Add helper to alloc GART entries")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> > 
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c index
> > 9b0bcf6aca445..673e9e08c66a0 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> > @@ -198,12 +198,19 @@ int amdgpu_gtt_mgr_alloc_entries(struct
> > amdgpu_gtt_mgr *mgr,> 
> >  				 u64 num_pages,
> >  				 enum drm_mm_insert_mode mode)
> >  
> >  {
> > 
> > +	u32 alignment = 0;
> > 
> >  	struct amdgpu_device *adev = container_of(mgr, typeof(*adev),
> >  	mman.gtt_mgr); int r;
> 
> Only a nit, but reverse xmas tree order please for variable declarations.

I haven't found this in the Linux coding style guide, can you elaborate what 
you are referring to exactly?

> 
> > +	/* Align to TLB size on Tahiti */
> 
> Oh that needs improvement.
> 
> Maybe something like /* Align to TLB L2 cache entry size to work around V
> bit HW bug */
> 
> Mostly nobody will know what that "V bit HW bug" is, but at least AMD people
> can search for that in the HW docs.

Sounds good, will add those details to the comments (and commit message)
in the next version of the series.

> 
> With that fixed Reviewed-by: Christian König <christian.koenig@amd.com>.
> 
> Thanks,
> Christian.

Thanks!

> 
> > +	if (adev->asic_type == CHIP_TAHITI) {
> > +		alignment = 32 * 1024 / AMDGPU_GPU_PAGE_SIZE;
> > +		num_pages = ALIGN(num_pages, alignment);
> > +	}
> > +
> > 
> >  	spin_lock(&mgr->lock);
> >  	r = drm_mm_insert_node_in_range(&mgr->mm, mm_node, num_pages,
> > 
> > -					0, 
GART_ENTRY_WITHOUT_BO_COLOR, 0,
> > +					alignment, 
GART_ENTRY_WITHOUT_BO_COLOR, 0,
> > 
> >  					adev->gmc.gart_size >> 
PAGE_SHIFT,
> >  					mode);
> >  	
> >  	spin_unlock(&mgr->lock);





^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 01/11] drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti
  2026-04-23 12:18     ` Timur Kristóf
@ 2026-04-23 13:32       ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2026-04-23 13:32 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx, alexander.deucher, John Olender

On 4/23/26 14:18, Timur Kristóf wrote:
> On Thursday, April 23, 2026 1:04:53 PM Central European Summer Time Christian 
> König wrote:
>> On 4/23/26 03:16, Timur Kristóf wrote:
>>> The TLB is organized in groups of 8 entries, each one is 4K.
>>> On Tahiti, the HW requires these GART entries to be 32K-aligned.
>>>
>>> This fixes a VCE 1 firmware validation failure that can happen
>>> after suspend/resume since we use amdgpu_gtt_mgr for VCE 1.
>>>
>>> Fixes: 698fa62f56aa ("drm/amdgpu: Add helper to alloc GART entries")
>>> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
>>> ---
>>>
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 9 ++++++++-
>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c index
>>> 9b0bcf6aca445..673e9e08c66a0 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
>>> @@ -198,12 +198,19 @@ int amdgpu_gtt_mgr_alloc_entries(struct
>>> amdgpu_gtt_mgr *mgr,> 
>>>  				 u64 num_pages,
>>>  				 enum drm_mm_insert_mode mode)
>>>  
>>>  {
>>>
>>> +	u32 alignment = 0;
>>>
>>>  	struct amdgpu_device *adev = container_of(mgr, typeof(*adev),
>>>  	mman.gtt_mgr); int r;
>>
>> Only a nit, but reverse xmas tree order please for variable declarations.
> 
> I haven't found this in the Linux coding style guide, can you elaborate what 
> you are referring to exactly?

It's not general coding style rule, but some subsystem maintainers usually suggest to have longer lines with constant initializers (e.g. like adev in this example) first. Then variables with longer names like "alignment" and finally temporary variables like "i", "r", etc...

I think it keeps code a bit more readable sometimes.

Regards,
Christian.

> 
>>
>>> +	/* Align to TLB size on Tahiti */
>>
>> Oh that needs improvement.
>>
>> Maybe something like /* Align to TLB L2 cache entry size to work around V
>> bit HW bug */
>>
>> Mostly nobody will know what that "V bit HW bug" is, but at least AMD people
>> can search for that in the HW docs.
> 
> Sounds good, will add those details to the comments (and commit message)
> in the next version of the series.
> 
>>
>> With that fixed Reviewed-by: Christian König <christian.koenig@amd.com>.
>>
>> Thanks,
>> Christian.
> 
> Thanks!
> 
>>
>>> +	if (adev->asic_type == CHIP_TAHITI) {
>>> +		alignment = 32 * 1024 / AMDGPU_GPU_PAGE_SIZE;
>>> +		num_pages = ALIGN(num_pages, alignment);
>>> +	}
>>> +
>>>
>>>  	spin_lock(&mgr->lock);
>>>  	r = drm_mm_insert_node_in_range(&mgr->mm, mm_node, num_pages,
>>>
>>> -					0, 
> GART_ENTRY_WITHOUT_BO_COLOR, 0,
>>> +					alignment, 
> GART_ENTRY_WITHOUT_BO_COLOR, 0,
>>>
>>>  					adev->gmc.gart_size >> 
> PAGE_SHIFT,
>>>  					mode);
>>>  	
>>>  	spin_unlock(&mgr->lock);
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 09/11] drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets
  2026-04-23  1:16 ` [PATCH 09/11] drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets Timur Kristóf
  2026-04-23 11:28   ` Christian König
@ 2026-04-23 18:10   ` John Olender
  1 sibling, 0 replies; 23+ messages in thread
From: John Olender @ 2026-04-23 18:10 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx, alexander.deucher, christian.koenig

On 4/22/26 9:16 PM, Timur Kristóf wrote:
> The VCPU BO contains the actual FW at an offset, but
> it was not calculated into the VCPU BO size.
> Subtract this from the FW size to make sure there is
> no out of bounds access.
> 
> This may fix VM faults when using VCE 2.

I gave this patchset a test on Hawaii and Ellesmere.

I'm still seeing VM faults and card lockup on Hawaii.

Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/4802

Adding additional padding to the VCE VCPU BO (e.g.: rounding up to 1MB)
does seem like an effective workaround, both when the the VCPU BO is
placed in VRAM or in GTT.

I don't see this problem on Ellesmere both with and without this
patchset.

Thanks,
John

> 
> Cc: John Olender <john.olender@gmail.com>
> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4802
> Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> index 00b4037d4bc89..3b493a2e94dd0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> @@ -183,7 +183,7 @@ static void vce_v2_0_mc_resume(struct amdgpu_device *adev)
>  	WREG32(mmVCE_LMI_VCPU_CACHE_40BIT_BAR, (adev->vce.gpu_addr >> 8));
>  
>  	offset = AMDGPU_VCE_FIRMWARE_OFFSET;
> -	size = VCE_V2_0_FW_SIZE;
> +	size = VCE_V2_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET;
>  	WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset & 0x7fffffff);
>  	WREG32(mmVCE_VCPU_CACHE_SIZE0, size);
>  


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2026-04-23 18:10 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-23  1:16 [PATCH 00/11] VCE1 fixes (v2) Timur Kristóf
2026-04-23  1:16 ` [PATCH 01/11] drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti Timur Kristóf
2026-04-23 11:04   ` Christian König
2026-04-23 12:18     ` Timur Kristóf
2026-04-23 13:32       ` Christian König
2026-04-23  1:16 ` [PATCH 02/11] drm/amdgpu/vce1: Check that the GPU address is < 128 MiB Timur Kristóf
2026-04-23 11:06   ` Christian König
2026-04-23  1:16 ` [PATCH 03/11] drm/amdgpu/vce1: Remove superfluous address check Timur Kristóf
2026-04-23  1:16 ` [PATCH 04/11] drm/amdgpu/vce1: Check if VRAM address is lower than GART Timur Kristóf
2026-04-23  1:16 ` [PATCH 05/11] drm/amdgpu/vce1: Don't repeat GTT MGR node allocation Timur Kristóf
2026-04-23  1:16 ` [PATCH 06/11] drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets Timur Kristóf
2026-04-23 11:12   ` Christian König
2026-04-23  1:16 ` [PATCH 07/11] drm/amdgpu/vce1: Stop using amdgpu_vce_resume Timur Kristóf
2026-04-23 11:13   ` Christian König
2026-04-23  1:16 ` [PATCH 08/11] drm/amdgpu/vce: Check maximum ucode size in amdgpu_vce_resume() Timur Kristóf
2026-04-23  1:16 ` [PATCH 09/11] drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets Timur Kristóf
2026-04-23 11:28   ` Christian König
2026-04-23 18:10   ` John Olender
2026-04-23  1:16 ` [PATCH 10/11] drm/amdgpu/vce3: Fix VCE 3 " Timur Kristóf
2026-04-23 11:29   ` Christian König
2026-04-23  1:16 ` [PATCH 11/11] drm/amdgpu/vce4: Fix VCE 4 " Timur Kristóf
2026-04-23 11:31   ` Christian König
2026-04-23 11:50     ` Timur Kristóf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox