[PATCH v4 0/2] drm/amd: Add support for non-4K page size systems

public inbox for amd-gfx@lists.freedesktop.org

* [PATCH v4 0/2] drm/amd: Add support for non-4K page size systems
@ 2026-03-26 12:21 Donet Tom
  2026-03-26 12:21 ` [PATCH v4 1/2] drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB Donet Tom
  2026-03-26 12:21 ` [PATCH v4 2/2] drm/amdgpu: Fix AMDGPU_GTT_MAX_TRANSFER_SIZE for non-4K page size Donet Tom
  0 siblings, 2 replies; 7+ messages in thread
From: Donet Tom @ 2026-03-26 12:21 UTC
  To: amd-gfx, Felix Kuehling, Alex Deucher, Alex Deucher,
	christian.koenig, Philip Yang
  Cc: David.YatSin, Kent.Russell, Ritesh Harjani,
	Vaidyanathan Srinivasan, donettom

This is v4 of the patch series enabling 64 KB system page size
support.

v3 of this series [1] contained 6 patches, of which 4 have been
picked up and applied to drm-next. The initial minimal
infrastructure required for 64 KB page size support has already
been merged upstream [2].

This series includes the remaining fixes:

- Patch 1 fixes a kernel crash observed when running rocminfo
  on systems with a 64 KB page size by updating the trap
  reservation size.

- Patch 2 updates AMDGPU_GTT_MAX_TRANSFER_SIZE to always match
  the PMD size across all page sizes.

Setup details:
==============
System details: Power10 LPAR using a 64K page size.
AMD GPU:
Name:                    gfx90a
Marketing Name:          AMD Instinct MI210

Changes since v3:
-----------------
- Based on feedback from Felix and Christian,
  AMDGPU_VA_RESERVED_TRAP_SIZE has been updated. The virtual
  address space now reserves 64 KB for the trap, while only
  8 KB is allocated for both 4 KB and 64 KB page sizes. This
  ensures that the allocation remains within the reserved
  region.

Links:
------
[1] https://lore.kernel.org/all/cover.1774239489.git.donettom@linux.ibm.com/
[2] https://lore.kernel.org/all/cover.1765519875.git.donettom@linux.ibm.com/

Previous versions:
------------------
RFC v3 resend:
https://lore.kernel.org/all/cover.1774239489.git.donettom@linux.ibm.com/
RFC v3:
https://lore.kernel.org/all/cover.1771656655.git.donettom@linux.ibm.com/
RFC v2:
https://lore.kernel.org/all/cover.1769612973.git.donettom@linux.ibm.com/
RFC v1:
https://lore.kernel.org/all/cover.1765519875.git.donettom@linux.ibm.com/

Donet Tom (2):
  drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB
  drm/amdgpu: Fix AMDGPU_GTT_MAX_TRANSFER_SIZE for non-4K page size

 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 +++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  | 2 +-
 drivers/gpu/drm/amd/amdgpu/vce_v1_0.c   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h   | 4 ++--
 5 files changed, 11 insertions(+), 8 deletions(-)

-- 
2.52.0



* [PATCH v4 1/2] drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB
  2026-03-26 12:21 [PATCH v4 0/2] drm/amd: Add support for non-4K page size systems Donet Tom
@ 2026-03-26 12:21 ` Donet Tom
  2026-03-26 12:36   ` Christian König
  2026-03-26 12:21 ` [PATCH v4 2/2] drm/amdgpu: Fix AMDGPU_GTT_MAX_TRANSFER_SIZE for non-4K page size Donet Tom
  1 sibling, 1 reply; 7+ messages in thread
From: Donet Tom @ 2026-03-26 12:21 UTC
  To: amd-gfx, Felix Kuehling, Alex Deucher, Alex Deucher,
	christian.koenig, Philip Yang
  Cc: David.YatSin, Kent.Russell, Ritesh Harjani,
	Vaidyanathan Srinivasan, donettom, stable, Felix Kuehling

Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while
KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with
4K pages, both values match (8KB), so allocation and reserved space
are consistent.

However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB,
while the reserved trap area remains 8KB. This mismatch causes the
kernel to crash when running rocminfo or rccl unit tests.

Kernel attempted to read user page (2) - exploit attempt? (uid: 1001)
BUG: Kernel NULL pointer dereference on read at 0x00000002
Faulting instruction address: 0xc0000000002c8a64
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E
6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY
Tainted: [E]=UNSIGNED_MODULE
Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006
of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries
NIP:  c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730
REGS: c0000001e0957580 TRAP: 0300 Tainted: G E
MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268
XER: 00000036
CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000
IRQMASK: 1
GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100
c00000013d814540
GPR04: 0000000000000002 c00000013d814550 0000000000000045
0000000000000000
GPR08: c00000013444d000 c00000013d814538 c00000013d814538
0000000084002268
GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff
0000000000020000
GPR16: 0000000000000000 0000000000000002 c00000015f653000
0000000000000000
GPR20: c000000138662400 c00000013d814540 0000000000000000
c00000013d814500
GPR24: 0000000000000000 0000000000000002 c0000001e0957888
c0000001e0957878
GPR28: c00000013d814548 0000000000000000 c00000013d814540
c0000001e0957888
NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0
LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00
Call Trace:
0xc0000001e0957890 (unreliable)
__mutex_lock.constprop.0+0x58/0xd00
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu]
kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu]
kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu]
kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu]
kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu]
kfd_ioctl+0x514/0x670 [amdgpu]
sys_ioctl+0x134/0x180
system_call_exception+0x114/0x300
system_call_vectored_common+0x15c/0x2ec

This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and
redefines KFD_CWSR_TBA_TMA_SIZE in terms of the AMD GPU page size
(2 * AMDGPU_GPU_PAGE_SIZE). This means we reserve 64 KB for the trap
in the address space, but only allocate 8 KB within it. With this
approach, the allocation size never exceeds the reserved area.
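
The size relationship described above can be checked with a small
user-space sketch. It is not driver code: the helper names are made up
for illustration, and the 4 KB value for AMDGPU_GPU_PAGE_SIZE is an
assumption inferred from the 8 KB allocation mentioned in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of AMDGPU_GPU_PAGE_SIZE (4 KB). */
#define GPU_PAGE_SIZE		4096ULL
/* Old and new values of AMDGPU_VA_RESERVED_TRAP_SIZE from the diff. */
#define OLD_TRAP_RESERVED	(2ULL << 12)	/* 8 KB */
#define NEW_TRAP_RESERVED	(1ULL << 16)	/* 64 KB */

/* Old KFD_CWSR_TBA_TMA_SIZE: two CPU pages (hypothetical helper). */
uint64_t old_cwsr_size(uint64_t cpu_page_size)
{
	return cpu_page_size * 2;
}

/* New KFD_CWSR_TBA_TMA_SIZE: two GPU pages (hypothetical helper). */
uint64_t new_cwsr_size(void)
{
	return GPU_PAGE_SIZE * 2;
}

/* Nonzero when the allocation fits inside the reserved VA range. */
int fits(uint64_t alloc, uint64_t reserved)
{
	return alloc <= reserved;
}

/* The new layout no longer depends on the CPU page size. */
void check_new_layout(void)
{
	assert(fits(new_cwsr_size(), NEW_TRAP_RESERVED));
}
```

With a 64 KB CPU page size the old definition asks for 128 KB against
an 8 KB reservation, which is the mismatch described above.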

cc: stable@vger.kernel.org
Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole")
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index bb276c0ad06d..d5b7061556ba 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -173,7 +173,7 @@ struct amdgpu_bo_vm;
 #define AMDGPU_VA_RESERVED_SEQ64_SIZE		(2ULL << 20)
 #define AMDGPU_VA_RESERVED_SEQ64_START(adev)	(AMDGPU_VA_RESERVED_CSA_START(adev) \
 						 - AMDGPU_VA_RESERVED_SEQ64_SIZE)
-#define AMDGPU_VA_RESERVED_TRAP_SIZE		(2ULL << 12)
+#define AMDGPU_VA_RESERVED_TRAP_SIZE		(1ULL << 16)
 #define AMDGPU_VA_RESERVED_TRAP_START(adev)	(AMDGPU_VA_RESERVED_SEQ64_START(adev) \
 						 - AMDGPU_VA_RESERVED_TRAP_SIZE)
 #define AMDGPU_VA_RESERVED_BOTTOM		(1ULL << 16)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index e5b56412931b..035687a17d89 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -102,8 +102,8 @@
  * The first chunk is the TBA used for the CWSR ISA code. The second
  * chunk is used as TMA for user-mode trap handler setup in daisy-chain mode.
  */
-#define KFD_CWSR_TBA_TMA_SIZE (PAGE_SIZE * 2)
-#define KFD_CWSR_TMA_OFFSET (PAGE_SIZE + 2048)
+#define KFD_CWSR_TBA_TMA_SIZE (AMDGPU_GPU_PAGE_SIZE * 2)
+#define KFD_CWSR_TMA_OFFSET (AMDGPU_GPU_PAGE_SIZE + 2048)
 
 #define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE		\
 	(KFD_MAX_NUM_OF_PROCESSES *			\
-- 
2.52.0



* [PATCH v4 2/2] drm/amdgpu: Fix AMDGPU_GTT_MAX_TRANSFER_SIZE for non-4K page size
  2026-03-26 12:21 [PATCH v4 0/2] drm/amd: Add support for non-4K page size systems Donet Tom
  2026-03-26 12:21 ` [PATCH v4 1/2] drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB Donet Tom
@ 2026-03-26 12:21 ` Donet Tom
  2026-04-03 13:43   ` Donet Tom
  1 sibling, 1 reply; 7+ messages in thread
From: Donet Tom @ 2026-03-26 12:21 UTC
  To: amd-gfx, Felix Kuehling, Alex Deucher, Alex Deucher,
	christian.koenig, Philip Yang
  Cc: David.YatSin, Kent.Russell, Ritesh Harjani,
	Vaidyanathan Srinivasan, donettom

AMDGPU_GTT_MAX_TRANSFER_SIZE represents the maximum number of
system-page-sized pages that can be transferred in a single
operation. The effective maximum transfer size is intended to be
one PMD-sized mapping.

In the existing code, AMDGPU_GTT_MAX_TRANSFER_SIZE is hard-coded
to 512 pages. This corresponds to 2 MB on 4 KB page-size systems,
matching the PMD size. However, on systems with a non-4 KB page
size, this value no longer matches the PMD size (for example,
512 pages of 64 KB cover 32 MB).

This patch changes the calculation of AMDGPU_GTT_MAX_TRANSFER_SIZE
to derive it from PMD_SHIFT and PAGE_SHIFT, ensuring that the
maximum transfer size remains PMD-sized across all system page
sizes.
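
As a rough illustration of the new formula, the sketch below takes the
two shifts as parameters instead of using the kernel macros. The helper
names are hypothetical, and a PMD_SHIFT of 21 for the 64 KB case is an
assumption about the test system; other architectures may use a
different value.

```c
#include <assert.h>
#include <stdint.h>

/* Pages per transfer window: 1 << (PMD_SHIFT - PAGE_SHIFT). */
uint64_t gtt_max_transfer_pages(unsigned int pmd_shift,
				unsigned int page_shift)
{
	assert(pmd_shift >= page_shift);
	return 1ULL << (pmd_shift - page_shift);
}

/* Bytes covered by a given number of CPU pages. */
uint64_t transfer_bytes(uint64_t pages, unsigned int page_shift)
{
	return pages << page_shift;
}
```

With shifts of 21 and 12 this yields the previous constant of 512
pages. With shifts of 21 and 16 it yields 32 pages, which again covers
2 MB, whereas the hard-coded 512 would have covered 32 MB.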

Additionally, in some places, AMDGPU_GTT_MAX_TRANSFER_SIZE is
implicitly assumed to be based on 4 KB pages, which results in
incorrect address offset calculations. This patch updates the
address calculations to correctly handle non-4 KB system page
sizes as well.
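
The corrected offset arithmetic can be sketched as follows. This is a
stand-alone approximation of the expressions in the diff, not the
driver code itself: the helper names are invented, the 4 KB GPU page
size is an assumption, and the factor 8 is taken from the diff and
assumed here to be the size of one GART entry in bytes.

```c
#include <assert.h>
#include <stdint.h>

#define GPU_PAGE_SIZE		4096ULL	/* assumed AMDGPU_GPU_PAGE_SIZE */
#define GART_ENTRY_BYTES	8ULL	/* the "* 8" factor in the diff */

/* Mirrors AMDGPU_GPU_PAGES_IN_CPU_PAGE for a given CPU page size. */
uint64_t gpu_pages_in_cpu_page(uint64_t cpu_page_size)
{
	assert(cpu_page_size % GPU_PAGE_SIZE == 0);
	return cpu_page_size / GPU_PAGE_SIZE;
}

/* Byte offset of a transfer window from the start of the GART range. */
uint64_t window_addr_offset(uint64_t window, uint64_t max_transfer_pages,
			    uint64_t cpu_page_size)
{
	return window * max_transfer_pages *
	       gpu_pages_in_cpu_page(cpu_page_size) * GPU_PAGE_SIZE;
}

/* Byte offset of the window's first entry inside the GART table. */
uint64_t window_pte_offset(uint64_t window, uint64_t max_transfer_pages,
			   uint64_t cpu_page_size)
{
	return window * max_transfer_pages *
	       gpu_pages_in_cpu_page(cpu_page_size) * GART_ENTRY_BYTES;
}
```

For window 1 with 32 CPU pages of 64 KB, the address offset comes out
at 2 MB, the same as for 512 CPU pages of 4 KB. Without the extra
factor the 64 KB case would give only 32 * 4 KB = 128 KB.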

amdgpu_ttm_map_buffer() can create both GTT GART entries and
VRAM GART entries. For GTT mappings, amdgpu_gart_map() takes
system-page-sized PFNs, and the mappings are created correctly.

However, for VRAM GART mappings, amdgpu_gart_map_vram_range() expects
GPU-page-sized PFNs, but CPU-page-sized PFNs were being passed,
resulting in incorrect mappings.

This patch updates the code to pass GPU-page-sized PFNs to
amdgpu_gart_map_vram_range(), ensuring that VRAM GART mappings are
created correctly.
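
A minimal sketch of the page-count conversion applied before the VRAM
mapping call is shown below. The function name is hypothetical and the
4 KB GPU page size is an assumption; in the driver the scaling is done
with AMDGPU_GPU_PAGES_IN_CPU_PAGE.

```c
#include <assert.h>
#include <stdint.h>

#define GPU_PAGE_SIZE	4096ULL	/* assumed AMDGPU_GPU_PAGE_SIZE */

/* Convert a count of CPU pages into the equivalent count of GPU pages. */
uint64_t cpu_pages_to_gpu_pages(uint64_t num_cpu_pages,
				uint64_t cpu_page_size)
{
	assert(cpu_page_size % GPU_PAGE_SIZE == 0);
	return num_cpu_pages * (cpu_page_size / GPU_PAGE_SIZE);
}
```

On a 4 KB system the count is unchanged, so existing behaviour is
preserved. On a 64 KB system 32 CPU pages become 512 GPU pages, which
cover the same 2 MB.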

Signed-off-by: Donet Tom <donettom@linux.ibm.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 +++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 2 +-
 drivers/gpu/drm/amd/amdgpu/vce_v1_0.c   | 3 ++-
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 0ccb31788b20..f9f534119cbe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -204,7 +204,7 @@ static int amdgpu_ttm_map_buffer(struct amdgpu_ttm_buffer_entity *entity,
 	int r;
 
 	BUG_ON(adev->mman.buffer_funcs->copy_max_bytes <
-	       AMDGPU_GTT_MAX_TRANSFER_SIZE * 8);
+	       AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GPU_PAGES_IN_CPU_PAGE * 8);
 
 	if (WARN_ON(mem->mem_type == AMDGPU_PL_PREEMPT))
 		return -EINVAL;
@@ -230,7 +230,7 @@ static int amdgpu_ttm_map_buffer(struct amdgpu_ttm_buffer_entity *entity,
 
 	*addr = adev->gmc.gart_start;
 	*addr += (u64)window * AMDGPU_GTT_MAX_TRANSFER_SIZE *
-		AMDGPU_GPU_PAGE_SIZE;
+		AMDGPU_GPU_PAGES_IN_CPU_PAGE * AMDGPU_GPU_PAGE_SIZE;
 	*addr += offset;
 
 	num_dw = ALIGN(adev->mman.buffer_funcs->copy_num_dw, 8);
@@ -248,7 +248,8 @@ static int amdgpu_ttm_map_buffer(struct amdgpu_ttm_buffer_entity *entity,
 	src_addr += job->ibs[0].gpu_addr;
 
 	dst_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
-	dst_addr += window * AMDGPU_GTT_MAX_TRANSFER_SIZE * 8;
+	dst_addr += window * AMDGPU_GTT_MAX_TRANSFER_SIZE *
+		AMDGPU_GPU_PAGES_IN_CPU_PAGE * 8;
 	amdgpu_emit_copy_buffer(adev, &job->ibs[0], src_addr,
 				dst_addr, num_bytes, 0);
 
@@ -266,6 +267,7 @@ static int amdgpu_ttm_map_buffer(struct amdgpu_ttm_buffer_entity *entity,
 	} else {
 		u64 pa = mm_cur->start + adev->vm_manager.vram_base_offset;
 
+		num_pages *= AMDGPU_GPU_PAGES_IN_CPU_PAGE;
 		amdgpu_gart_map_vram_range(adev, pa, 0, num_pages, flags, cpu_addr);
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 143201ecea3f..15aff225af1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -38,7 +38,7 @@
 #define AMDGPU_PL_MMIO_REMAP	(TTM_PL_PRIV + 5)
 #define __AMDGPU_PL_NUM	(TTM_PL_PRIV + 6)
 
-#define AMDGPU_GTT_MAX_TRANSFER_SIZE	512
+#define AMDGPU_GTT_MAX_TRANSFER_SIZE	(1 << (PMD_SHIFT - PAGE_SHIFT))
 #define AMDGPU_GTT_NUM_TRANSFER_WINDOWS	2
 
 extern const struct attribute_group amdgpu_vram_mgr_attr_group;
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
index 9ae424618556..b2d4114c258c 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
@@ -48,7 +48,8 @@
 #define VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK	0x02
 
 #define VCE_V1_0_GART_PAGE_START \
-	(AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GTT_NUM_TRANSFER_WINDOWS)
+	(AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GPU_PAGES_IN_CPU_PAGE * \
+	 AMDGPU_GTT_NUM_TRANSFER_WINDOWS)
 #define VCE_V1_0_GART_ADDR_START \
 	(VCE_V1_0_GART_PAGE_START * AMDGPU_GPU_PAGE_SIZE)
 
-- 
2.52.0



* Re: [PATCH v4 1/2] drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB
  2026-03-26 12:21 ` [PATCH v4 1/2] drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB Donet Tom
@ 2026-03-26 12:36   ` Christian König
  2026-03-26 13:47     ` Alex Deucher
  2026-03-26 19:38     ` Kuehling, Felix
  0 siblings, 2 replies; 7+ messages in thread
From: Christian König @ 2026-03-26 12:36 UTC
  To: Donet Tom, amd-gfx, Felix Kuehling, Alex Deucher, Alex Deucher,
	Philip Yang
  Cc: David.YatSin, Kent.Russell, Ritesh Harjani,
	Vaidyanathan Srinivasan, stable

On 3/26/26 13:21, Donet Tom wrote:
> Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while
> KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with
> 4K pages, both values match (8KB), so allocation and reserved space
> are consistent.
> 
> However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB,
> while the reserved trap area remains 8KB. This mismatch causes the
> kernel to crash when running rocminfo or rccl unit tests.
> 
> Kernel attempted to read user page (2) - exploit attempt? (uid: 1001)
> BUG: Kernel NULL pointer dereference on read at 0x00000002
> Faulting instruction address: 0xc0000000002c8a64
> Oops: Kernel access of bad area, sig: 11 [#1]
> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E
> 6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY
> Tainted: [E]=UNSIGNED_MODULE
> Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006
> of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries
> NIP:  c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730
> REGS: c0000001e0957580 TRAP: 0300 Tainted: G E
> MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268
> XER: 00000036
> CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000
> IRQMASK: 1
> GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100
> c00000013d814540
> GPR04: 0000000000000002 c00000013d814550 0000000000000045
> 0000000000000000
> GPR08: c00000013444d000 c00000013d814538 c00000013d814538
> 0000000084002268
> GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff
> 0000000000020000
> GPR16: 0000000000000000 0000000000000002 c00000015f653000
> 0000000000000000
> GPR20: c000000138662400 c00000013d814540 0000000000000000
> c00000013d814500
> GPR24: 0000000000000000 0000000000000002 c0000001e0957888
> c0000001e0957878
> GPR28: c00000013d814548 0000000000000000 c00000013d814540
> c0000001e0957888
> NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0
> LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00
> Call Trace:
> 0xc0000001e0957890 (unreliable)
> __mutex_lock.constprop.0+0x58/0xd00
> amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu]
> kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu]
> kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu]
> kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu]
> kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu]
> kfd_ioctl+0x514/0x670 [amdgpu]
> sys_ioctl+0x134/0x180
> system_call_exception+0x114/0x300
> system_call_vectored_common+0x15c/0x2ec
> 
> This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and
> KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve
> 64 KB for the trap in the address space, but only allocate 8 KB within
> it. With this approach, the allocation size never exceeds the reserved
> area.
> 
> cc: stable@vger.kernel.org
> Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole")
> Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
> Suggested-by: Christian König <christian.koenig@amd.com>
> Signed-off-by: Donet Tom <donettom@linux.ibm.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h  | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index bb276c0ad06d..d5b7061556ba 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -173,7 +173,7 @@ struct amdgpu_bo_vm;
>  #define AMDGPU_VA_RESERVED_SEQ64_SIZE		(2ULL << 20)
>  #define AMDGPU_VA_RESERVED_SEQ64_START(adev)	(AMDGPU_VA_RESERVED_CSA_START(adev) \
>  						 - AMDGPU_VA_RESERVED_SEQ64_SIZE)
> -#define AMDGPU_VA_RESERVED_TRAP_SIZE		(2ULL << 12)
> +#define AMDGPU_VA_RESERVED_TRAP_SIZE		(1ULL << 16)
>  #define AMDGPU_VA_RESERVED_TRAP_START(adev)	(AMDGPU_VA_RESERVED_SEQ64_START(adev) \
>  						 - AMDGPU_VA_RESERVED_TRAP_SIZE)
>  #define AMDGPU_VA_RESERVED_BOTTOM		(1ULL << 16)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index e5b56412931b..035687a17d89 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -102,8 +102,8 @@
>   * The first chunk is the TBA used for the CWSR ISA code. The second
>   * chunk is used as TMA for user-mode trap handler setup in daisy-chain mode.
>   */
> -#define KFD_CWSR_TBA_TMA_SIZE (PAGE_SIZE * 2)
> -#define KFD_CWSR_TMA_OFFSET (PAGE_SIZE + 2048)
> +#define KFD_CWSR_TBA_TMA_SIZE (AMDGPU_GPU_PAGE_SIZE * 2)
> +#define KFD_CWSR_TMA_OFFSET (AMDGPU_GPU_PAGE_SIZE + 2048)
>  
>  #define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE		\
>  	(KFD_MAX_NUM_OF_PROCESSES *			\



* Re: [PATCH v4 1/2] drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB
  2026-03-26 12:36   ` Christian König
@ 2026-03-26 13:47     ` Alex Deucher
  2026-03-26 19:38     ` Kuehling, Felix
  1 sibling, 0 replies; 7+ messages in thread
From: Alex Deucher @ 2026-03-26 13:47 UTC
  To: Christian König
  Cc: Donet Tom, amd-gfx, Felix Kuehling, Alex Deucher, Philip Yang,
	David.YatSin, Kent.Russell, Ritesh Harjani,
	Vaidyanathan Srinivasan, stable

Applied.  Thanks!

Alex

On Thu, Mar 26, 2026 at 8:36 AM Christian König
<christian.koenig@amd.com> wrote:
>
> On 3/26/26 13:21, Donet Tom wrote:
> > Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while
> > KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with
> > 4K pages, both values match (8KB), so allocation and reserved space
> > are consistent.
> >
> > However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB,
> > while the reserved trap area remains 8KB. This mismatch causes the
> > kernel to crash when running rocminfo or rccl unit tests.
> >
> > Kernel attempted to read user page (2) - exploit attempt? (uid: 1001)
> > BUG: Kernel NULL pointer dereference on read at 0x00000002
> > Faulting instruction address: 0xc0000000002c8a64
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> > CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E
> > 6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY
> > Tainted: [E]=UNSIGNED_MODULE
> > Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006
> > of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries
> > NIP:  c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730
> > REGS: c0000001e0957580 TRAP: 0300 Tainted: G E
> > MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268
> > XER: 00000036
> > CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000
> > IRQMASK: 1
> > GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100
> > c00000013d814540
> > GPR04: 0000000000000002 c00000013d814550 0000000000000045
> > 0000000000000000
> > GPR08: c00000013444d000 c00000013d814538 c00000013d814538
> > 0000000084002268
> > GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff
> > 0000000000020000
> > GPR16: 0000000000000000 0000000000000002 c00000015f653000
> > 0000000000000000
> > GPR20: c000000138662400 c00000013d814540 0000000000000000
> > c00000013d814500
> > GPR24: 0000000000000000 0000000000000002 c0000001e0957888
> > c0000001e0957878
> > GPR28: c00000013d814548 0000000000000000 c00000013d814540
> > c0000001e0957888
> > NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0
> > LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00
> > Call Trace:
> > 0xc0000001e0957890 (unreliable)
> > __mutex_lock.constprop.0+0x58/0xd00
> > amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu]
> > kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu]
> > kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu]
> > kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu]
> > kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu]
> > kfd_ioctl+0x514/0x670 [amdgpu]
> > sys_ioctl+0x134/0x180
> > system_call_exception+0x114/0x300
> > system_call_vectored_common+0x15c/0x2ec
> >
> > This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and
> > KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve
> > 64 KB for the trap in the address space, but only allocate 8 KB within
> > it. With this approach, the allocation size never exceeds the reserved
> > area.
> >
> > cc: stable@vger.kernel.org
> > Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole")
> > Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
> > Suggested-by: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>
> Reviewed-by: Christian König <christian.koenig@amd.com>
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_priv.h  | 4 ++--
> >  2 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > index bb276c0ad06d..d5b7061556ba 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > @@ -173,7 +173,7 @@ struct amdgpu_bo_vm;
> >  #define AMDGPU_VA_RESERVED_SEQ64_SIZE                (2ULL << 20)
> >  #define AMDGPU_VA_RESERVED_SEQ64_START(adev) (AMDGPU_VA_RESERVED_CSA_START(adev) \
> >                                                - AMDGPU_VA_RESERVED_SEQ64_SIZE)
> > -#define AMDGPU_VA_RESERVED_TRAP_SIZE         (2ULL << 12)
> > +#define AMDGPU_VA_RESERVED_TRAP_SIZE         (1ULL << 16)
> >  #define AMDGPU_VA_RESERVED_TRAP_START(adev)  (AMDGPU_VA_RESERVED_SEQ64_START(adev) \
> >                                                - AMDGPU_VA_RESERVED_TRAP_SIZE)
> >  #define AMDGPU_VA_RESERVED_BOTTOM            (1ULL << 16)
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> > index e5b56412931b..035687a17d89 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> > @@ -102,8 +102,8 @@
> >   * The first chunk is the TBA used for the CWSR ISA code. The second
> >   * chunk is used as TMA for user-mode trap handler setup in daisy-chain mode.
> >   */
> > -#define KFD_CWSR_TBA_TMA_SIZE (PAGE_SIZE * 2)
> > -#define KFD_CWSR_TMA_OFFSET (PAGE_SIZE + 2048)
> > +#define KFD_CWSR_TBA_TMA_SIZE (AMDGPU_GPU_PAGE_SIZE * 2)
> > +#define KFD_CWSR_TMA_OFFSET (AMDGPU_GPU_PAGE_SIZE + 2048)
> >
> >  #define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE             \
> >       (KFD_MAX_NUM_OF_PROCESSES *                     \
>


* Re: [PATCH v4 1/2] drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB
  2026-03-26 12:36   ` Christian König
  2026-03-26 13:47     ` Alex Deucher
@ 2026-03-26 19:38     ` Kuehling, Felix
  1 sibling, 0 replies; 7+ messages in thread
From: Kuehling, Felix @ 2026-03-26 19:38 UTC
  To: Christian König, Donet Tom, amd-gfx, Alex Deucher,
	Alex Deucher, Philip Yang
  Cc: David.YatSin, Kent.Russell, Ritesh Harjani,
	Vaidyanathan Srinivasan, stable


On 2026-03-26 08:36, Christian König wrote:
> On 3/26/26 13:21, Donet Tom wrote:
>> Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while
>> KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with
>> 4K pages, both values match (8KB), so allocation and reserved space
>> are consistent.
>>
>> However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB,
>> while the reserved trap area remains 8KB. This mismatch causes the
>> kernel to crash when running rocminfo or rccl unit tests.
>>
>> Kernel attempted to read user page (2) - exploit attempt? (uid: 1001)
>> BUG: Kernel NULL pointer dereference on read at 0x00000002
>> Faulting instruction address: 0xc0000000002c8a64
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>> CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E
>> 6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY
>> Tainted: [E]=UNSIGNED_MODULE
>> Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006
>> of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries
>> NIP:  c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730
>> REGS: c0000001e0957580 TRAP: 0300 Tainted: G E
>> MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268
>> XER: 00000036
>> CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000
>> IRQMASK: 1
>> GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100
>> c00000013d814540
>> GPR04: 0000000000000002 c00000013d814550 0000000000000045
>> 0000000000000000
>> GPR08: c00000013444d000 c00000013d814538 c00000013d814538
>> 0000000084002268
>> GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff
>> 0000000000020000
>> GPR16: 0000000000000000 0000000000000002 c00000015f653000
>> 0000000000000000
>> GPR20: c000000138662400 c00000013d814540 0000000000000000
>> c00000013d814500
>> GPR24: 0000000000000000 0000000000000002 c0000001e0957888
>> c0000001e0957878
>> GPR28: c00000013d814548 0000000000000000 c00000013d814540
>> c0000001e0957888
>> NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0
>> LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00
>> Call Trace:
>> 0xc0000001e0957890 (unreliable)
>> __mutex_lock.constprop.0+0x58/0xd00
>> amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu]
>> kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu]
>> kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu]
>> kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu]
>> kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu]
>> kfd_ioctl+0x514/0x670 [amdgpu]
>> sys_ioctl+0x134/0x180
>> system_call_exception+0x114/0x300
>> system_call_vectored_common+0x15c/0x2ec
>>
>> This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and
>> KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve
>> 64 KB for the trap in the address space, but only allocate 8 KB within
>> it. With this approach, the allocation size never exceeds the reserved
>> area.
>>
>> cc: stable@vger.kernel.org
>> Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole")
>> Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
>> Suggested-by: Christian König <christian.koenig@amd.com>
>> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>


>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
>>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h  | 4 ++--
>>   2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> index bb276c0ad06d..d5b7061556ba 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> @@ -173,7 +173,7 @@ struct amdgpu_bo_vm;
>>   #define AMDGPU_VA_RESERVED_SEQ64_SIZE		(2ULL << 20)
>>   #define AMDGPU_VA_RESERVED_SEQ64_START(adev)	(AMDGPU_VA_RESERVED_CSA_START(adev) \
>>   						 - AMDGPU_VA_RESERVED_SEQ64_SIZE)
>> -#define AMDGPU_VA_RESERVED_TRAP_SIZE		(2ULL << 12)
>> +#define AMDGPU_VA_RESERVED_TRAP_SIZE		(1ULL << 16)
>>   #define AMDGPU_VA_RESERVED_TRAP_START(adev)	(AMDGPU_VA_RESERVED_SEQ64_START(adev) \
>>   						 - AMDGPU_VA_RESERVED_TRAP_SIZE)
>>   #define AMDGPU_VA_RESERVED_BOTTOM		(1ULL << 16)
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> index e5b56412931b..035687a17d89 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> @@ -102,8 +102,8 @@
>>    * The first chunk is the TBA used for the CWSR ISA code. The second
>>    * chunk is used as TMA for user-mode trap handler setup in daisy-chain mode.
>>    */
>> -#define KFD_CWSR_TBA_TMA_SIZE (PAGE_SIZE * 2)
>> -#define KFD_CWSR_TMA_OFFSET (PAGE_SIZE + 2048)
>> +#define KFD_CWSR_TBA_TMA_SIZE (AMDGPU_GPU_PAGE_SIZE * 2)
>> +#define KFD_CWSR_TMA_OFFSET (AMDGPU_GPU_PAGE_SIZE + 2048)
>>   
>>   #define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE		\
>>   	(KFD_MAX_NUM_OF_PROCESSES *			\


* Re: [PATCH v4 2/2] drm/amdgpu: Fix AMDGPU_GTT_MAX_TRANSFER_SIZE for non-4K page size
  2026-03-26 12:21 ` [PATCH v4 2/2] drm/amdgpu: Fix AMDGPU_GTT_MAX_TRANSFER_SIZE for non-4K page size Donet Tom
@ 2026-04-03 13:43   ` Donet Tom
  0 siblings, 0 replies; 7+ messages in thread
From: Donet Tom @ 2026-04-03 13:43 UTC
  To: amd-gfx, Felix Kuehling, Alex Deucher, Alex Deucher,
	christian.koenig, Philip Yang
  Cc: David.YatSin, Kent.Russell, Ritesh Harjani,
	Vaidyanathan Srinivasan


Hi Christian, Felix, Alex,

Thank you for your help in reviewing this series. All the patches except
this one have been picked up. Could you please share your thoughts on
this patch?

-Donet

On 3/26/26 5:51 PM, Donet Tom wrote:
> AMDGPU_GTT_MAX_TRANSFER_SIZE represented the maximum number of
> system-page-sized pages that could be transferred in a single
> operation. The effective maximum transfer size was intended to be
> one PMD-sized mapping.
>
> In the existing code, AMDGPU_GTT_MAX_TRANSFER_SIZE was hard-coded
> to 512 pages. This corresponded to 2 MB on 4 KB page-size systems,
> matching the PMD size. However, on systems with a non-4 KB page
> size, this value no longer matched the PMD size.
>
> This patch changed the calculation of AMDGPU_GTT_MAX_TRANSFER_SIZE
> to derive it from PMD_SHIFT and PAGE_SHIFT, ensuring that the
> maximum transfer size remained PMD-sized across all system page
> sizes.
>
> Additionally, in some places, AMDGPU_GTT_MAX_TRANSFER_SIZE was
> implicitly assumed to be based on 4 KB pages. This resulted in
> incorrect address offset calculations. This patch updated the
> address calculations to correctly handle non-4 KB system page
> sizes as well.
>
> amdgpu_ttm_map_buffer() can create both GTT GART entries and
> VRAM GART entries. For GTT mappings, amdgpu_gart_map() takes
> system page-sized PFNs, and the mappings are created correctly.
>
> However, for VRAM GART mappings, amdgpu_gart_map_vram_range() expects
> GPU page-sized PFNs, but CPU page-sized PFNs are being passed,
> resulting in incorrect mappings.
>
> This patch updates the code to pass GPU page-sized PFNs to
> amdgpu_gart_map_vram_range(), ensuring that VRAM GART mappings are
> created correctly.
>
> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 +++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 2 +-
>   drivers/gpu/drm/amd/amdgpu/vce_v1_0.c   | 3 ++-
>   3 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 0ccb31788b20..f9f534119cbe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -204,7 +204,7 @@ static int amdgpu_ttm_map_buffer(struct amdgpu_ttm_buffer_entity *entity,
>   	int r;
>   
>   	BUG_ON(adev->mman.buffer_funcs->copy_max_bytes <
> -	       AMDGPU_GTT_MAX_TRANSFER_SIZE * 8);
> +	       AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GPU_PAGES_IN_CPU_PAGE * 8);
>   
>   	if (WARN_ON(mem->mem_type == AMDGPU_PL_PREEMPT))
>   		return -EINVAL;
> @@ -230,7 +230,7 @@ static int amdgpu_ttm_map_buffer(struct amdgpu_ttm_buffer_entity *entity,
>   
>   	*addr = adev->gmc.gart_start;
>   	*addr += (u64)window * AMDGPU_GTT_MAX_TRANSFER_SIZE *
> -		AMDGPU_GPU_PAGE_SIZE;
> +		AMDGPU_GPU_PAGES_IN_CPU_PAGE * AMDGPU_GPU_PAGE_SIZE;
>   	*addr += offset;
>   
>   	num_dw = ALIGN(adev->mman.buffer_funcs->copy_num_dw, 8);
> @@ -248,7 +248,8 @@ static int amdgpu_ttm_map_buffer(struct amdgpu_ttm_buffer_entity *entity,
>   	src_addr += job->ibs[0].gpu_addr;
>   
>   	dst_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
> -	dst_addr += window * AMDGPU_GTT_MAX_TRANSFER_SIZE * 8;
> +	dst_addr += window * AMDGPU_GTT_MAX_TRANSFER_SIZE *
> +		AMDGPU_GPU_PAGES_IN_CPU_PAGE * 8;
>   	amdgpu_emit_copy_buffer(adev, &job->ibs[0], src_addr,
>   				dst_addr, num_bytes, 0);
>   
> @@ -266,6 +267,7 @@ static int amdgpu_ttm_map_buffer(struct amdgpu_ttm_buffer_entity *entity,
>   	} else {
>   		u64 pa = mm_cur->start + adev->vm_manager.vram_base_offset;
>   
> +		num_pages *= AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>   		amdgpu_gart_map_vram_range(adev, pa, 0, num_pages, flags, cpu_addr);
>   	}
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
> index 143201ecea3f..15aff225af1d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
> @@ -38,7 +38,7 @@
>   #define AMDGPU_PL_MMIO_REMAP	(TTM_PL_PRIV + 5)
>   #define __AMDGPU_PL_NUM	(TTM_PL_PRIV + 6)
>   
> -#define AMDGPU_GTT_MAX_TRANSFER_SIZE	512
> +#define AMDGPU_GTT_MAX_TRANSFER_SIZE	(1 << (PMD_SHIFT - PAGE_SHIFT))
>   #define AMDGPU_GTT_NUM_TRANSFER_WINDOWS	2
>   
>   extern const struct attribute_group amdgpu_vram_mgr_attr_group;
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> index 9ae424618556..b2d4114c258c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c
> @@ -48,7 +48,8 @@
>   #define VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK	0x02
>   
>   #define VCE_V1_0_GART_PAGE_START \
> -	(AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GTT_NUM_TRANSFER_WINDOWS)
> +	(AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GPU_PAGES_IN_CPU_PAGE * \
> +	 AMDGPU_GTT_NUM_TRANSFER_WINDOWS)
>   #define VCE_V1_0_GART_ADDR_START \
>   	(VCE_V1_0_GART_PAGE_START * AMDGPU_GPU_PAGE_SIZE)
>   

Thread overview: 7+ messages
2026-03-26 12:21 [PATCH v4 0/2] drm/amd: Add support for non-4K page size systems Donet Tom
2026-03-26 12:21 ` [PATCH v4 1/2] drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB Donet Tom
2026-03-26 12:36   ` Christian König
2026-03-26 13:47     ` Alex Deucher
2026-03-26 19:38     ` Kuehling, Felix
2026-03-26 12:21 ` [PATCH v4 2/2] drm/amdgpu: Fix AMDGPU_GTT_MAX_TRANSFER_SIZE for non-4K page size Donet Tom
2026-04-03 13:43   ` Donet Tom
