From: Alex Deucher <alexdeucher@gmail.com>
To: "Christian König" <christian.koenig@amd.com>
Cc: Donet Tom <donettom@linux.ibm.com>,
amd-gfx@lists.freedesktop.org,
Felix Kuehling <Felix.Kuehling@amd.com>,
Alex Deucher <alexander.deucher@amd.com>,
Philip Yang <yangp@amd.com>,
David.YatSin@amd.com, Kent.Russell@amd.com,
Ritesh Harjani <ritesh.list@gmail.com>,
Vaidyanathan Srinivasan <svaidy@linux.ibm.com>,
stable@vger.kernel.org
Subject: Re: [PATCH v4 1/2] drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB
Date: Thu, 26 Mar 2026 09:47:01 -0400
Message-ID: <CADnq5_NWkGCb_WtaOk6Q4T4eG4EZc8ZNoLtxQkXowhYh3NaCVQ@mail.gmail.com>
In-Reply-To: <9c9c73e1-abe4-4307-9d44-37544fbd1596@amd.com>
Applied. Thanks!
Alex
On Thu, Mar 26, 2026 at 8:36 AM Christian König
<christian.koenig@amd.com> wrote:
>
> On 3/26/26 13:21, Donet Tom wrote:
> > Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while
> > KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with
> > 4K pages, both values match (8KB), so allocation and reserved space
> > are consistent.
> >
> > However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB,
> > while the reserved trap area remains 8KB. This mismatch causes the
> > kernel to crash when running rocminfo or rccl unit tests.
> >
> > Kernel attempted to read user page (2) - exploit attempt? (uid: 1001)
> > BUG: Kernel NULL pointer dereference on read at 0x00000002
> > Faulting instruction address: 0xc0000000002c8a64
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> > CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E
> > 6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY
> > Tainted: [E]=UNSIGNED_MODULE
> > Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006
> > of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries
> > NIP: c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730
> > REGS: c0000001e0957580 TRAP: 0300 Tainted: G E
> > MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268
> > XER: 00000036
> > CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000
> > IRQMASK: 1
> > GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100
> > c00000013d814540
> > GPR04: 0000000000000002 c00000013d814550 0000000000000045
> > 0000000000000000
> > GPR08: c00000013444d000 c00000013d814538 c00000013d814538
> > 0000000084002268
> > GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff
> > 0000000000020000
> > GPR16: 0000000000000000 0000000000000002 c00000015f653000
> > 0000000000000000
> > GPR20: c000000138662400 c00000013d814540 0000000000000000
> > c00000013d814500
> > GPR24: 0000000000000000 0000000000000002 c0000001e0957888
> > c0000001e0957878
> > GPR28: c00000013d814548 0000000000000000 c00000013d814540
> > c0000001e0957888
> > NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0
> > LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00
> > Call Trace:
> > 0xc0000001e0957890 (unreliable)
> > __mutex_lock.constprop.0+0x58/0xd00
> > amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu]
> > kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu]
> > kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu]
> > kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu]
> > kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu]
> > kfd_ioctl+0x514/0x670 [amdgpu]
> > sys_ioctl+0x134/0x180
> > system_call_exception+0x114/0x300
> > system_call_vectored_common+0x15c/0x2ec
> >
> > This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and defines
> > KFD_CWSR_TBA_TMA_SIZE as two AMD GPU pages (8 KB) instead of two CPU
> > pages. This means we reserve 64 KB for the trap area in the address
> > space, but only allocate 8 KB within it. With this approach, the
> > allocation size never exceeds the reserved area, regardless of the
> > CPU page size.
> >
> > cc: stable@vger.kernel.org
> > Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole")
> > Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
> > Suggested-by: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>
> Reviewed-by: Christian König <christian.koenig@amd.com>
>
> > ---
> > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
> > drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 4 ++--
> > 2 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > index bb276c0ad06d..d5b7061556ba 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > @@ -173,7 +173,7 @@ struct amdgpu_bo_vm;
> >  #define AMDGPU_VA_RESERVED_SEQ64_SIZE          (2ULL << 20)
> >  #define AMDGPU_VA_RESERVED_SEQ64_START(adev)   (AMDGPU_VA_RESERVED_CSA_START(adev) \
> >                                                  - AMDGPU_VA_RESERVED_SEQ64_SIZE)
> > -#define AMDGPU_VA_RESERVED_TRAP_SIZE           (2ULL << 12)
> > +#define AMDGPU_VA_RESERVED_TRAP_SIZE           (1ULL << 16)
> >  #define AMDGPU_VA_RESERVED_TRAP_START(adev)    (AMDGPU_VA_RESERVED_SEQ64_START(adev) \
> >                                                  - AMDGPU_VA_RESERVED_TRAP_SIZE)
> >  #define AMDGPU_VA_RESERVED_BOTTOM              (1ULL << 16)
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> > index e5b56412931b..035687a17d89 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> > @@ -102,8 +102,8 @@
> >   * The first chunk is the TBA used for the CWSR ISA code. The second
> >   * chunk is used as TMA for user-mode trap handler setup in daisy-chain mode.
> >   */
> > -#define KFD_CWSR_TBA_TMA_SIZE (PAGE_SIZE * 2)
> > -#define KFD_CWSR_TMA_OFFSET (PAGE_SIZE + 2048)
> > +#define KFD_CWSR_TBA_TMA_SIZE (AMDGPU_GPU_PAGE_SIZE * 2)
> > +#define KFD_CWSR_TMA_OFFSET (AMDGPU_GPU_PAGE_SIZE + 2048)
> >
> >  #define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE \
> >         (KFD_MAX_NUM_OF_PROCESSES * \
>
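
For readers following the size arithmetic in the quoted commit message, here
is a rough standalone sketch of it in plain C. It is not driver code. The
helper names (old_cwsr_size, new_cwsr_size, new_tma_offset, fits_reserved)
and the SKETCH_* macros are made up for illustration, and the 4 KB value for
AMDGPU_GPU_PAGE_SIZE is an assumption inferred from the "allocate 8 KB"
wording above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for AMDGPU_GPU_PAGE_SIZE (4 KB), not copied from the driver. */
#define SKETCH_GPU_PAGE_SIZE      4096ULL
/* Reserved trap VA size before and after the patch, as shown in the diff. */
#define SKETCH_OLD_RESERVED_TRAP  (2ULL << 12)   /* 8 KB  */
#define SKETCH_NEW_RESERVED_TRAP  (1ULL << 16)   /* 64 KB */

/* Old definition: KFD_CWSR_TBA_TMA_SIZE = PAGE_SIZE * 2 (depends on CPU page size). */
static inline uint64_t old_cwsr_size(uint64_t cpu_page_size)
{
	return cpu_page_size * 2;
}

/* New definition: two GPU pages, independent of the CPU page size. */
static inline uint64_t new_cwsr_size(void)
{
	return SKETCH_GPU_PAGE_SIZE * 2;
}

/* New KFD_CWSR_TMA_OFFSET: one GPU page plus 2048 bytes. */
static inline uint64_t new_tma_offset(void)
{
	return SKETCH_GPU_PAGE_SIZE + 2048;
}

/* Nonzero when the allocation fits inside the reserved VA range. */
static inline int fits_reserved(uint64_t alloc_size, uint64_t reserved_size)
{
	return alloc_size <= reserved_size;
}
```

With a 4K CPU page the old allocation (8 KB) fits the old 8 KB reservation.
With a 64K CPU page it grows to 128 KB and no longer fits. After the patch the
allocation is 8 KB on either configuration and sits inside the 64 KB
reservation, with the TMA offset still inside the allocation.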