public inbox for amd-gfx@lists.freedesktop.org
From: Donet Tom <donettom@linux.ibm.com>
To: "Kuehling, Felix" <felix.kuehling@amd.com>,
	amd-gfx@lists.freedesktop.org,
	 Alex Deucher <alexander.deucher@amd.com>,
	Alex Deucher <alexdeucher@gmail.com>,
	christian.koenig@amd.com, Philip Yang <yangp@amd.com>
Cc: David.YatSin@amd.com, Kent.Russell@amd.com,
	Ritesh Harjani <ritesh.list@gmail.com>,
	Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Subject: Re: [RESEND RFC PATCH v3 0/6] drm/amd: Add support for non-4K page size systems
Date: Wed, 25 Mar 2026 13:32:40 +0530
Message-ID: <a5cae3db-4b67-4a64-80ea-14bbde51d7f2@linux.ibm.com>
In-Reply-To: <ff78dfa4-f16b-4313-af73-1e63db67ddca@amd.com>



On 3/25/26 7:57 AM, Kuehling, Felix wrote:
> On 2026-03-23 00:28, Donet Tom wrote:
>> This is v3 of the patch series enabling 64 KB system page size support
>> in AMDGPU. v2, part 1 of this series [1] has already been merged
>> upstream and provides the minimal infrastructure required for 64 KB
>> page support.
>>
>> This series addresses additional issues uncovered in AMDGPU when
>> running RCCL unit tests and rocr-debug-agent tests on 64 KB page-size
>> systems.
>>
>> With this series applied, all RCCL unit tests and rocr-debug-agent
>> tests pass on systems using a 64 KB system page size, across
>> multi-GPU configurations, with XNACK both enabled and disabled.
>>
>> Patch 1 in this series (drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE
>> to 2 * PAGE_SIZE) fixes a kernel crash observed when running rocminfo
>> on systems with a 64 KB page size. This patch is required to enable
>> minimal support for 64 KB system page sizes.
>>
>> Since RFC v2, we observed AQL queue creation failures while running
>> certain workloads on 64K page-size systems due to an expected queue size
>> mismatch. This issue is addressed in patch 2 of this series.
>>
>> The questions we have about this series are:
>> ============================================
>> 1. When the control stack size is aligned to 64 KB, we consistently
>>    observe queue preemption or eviction failures on gfx9, on both
>>    4 KB and 64 KB system page-size configurations.
>>
>>    The control stack size is calculated based on the number of CUs and
>>    waves and is then aligned to PAGE_SIZE. On systems with a 64 KB
>>    system page size, this alignment always results in a 64 KB-aligned
>>    control stack size, after which queue preemption fails.
>>
>>    Is there any hardware-imposed limitation on gfx9 that prevents the
>>    control stack size from being 64 KB? For gfx10, I see explicit
>>    hardware limitations on the control stack size in the code [2].
>>    Is there anything similar for gfx9?
>>
>>    What is the correct or recommended control stack size for gfx9?
>>    With a 4 KB system page size, I observe a control stack size of
>>    around 44 KB—can it grow beyond this? If the control stack size
>>    is fixed for a given gfx version, do you see any issues with
>>    aligning the control stack size to the GPU page size?


Thank you, Felix, for your time and for reviewing this patch series.


> I think there is a bug in user mode that uses its own calculation of 
> the ctl_stack_size to calculate the total context save area size. If 
> kernel mode increases the ctl_stack_size, the context save area 
> allocated by user mode will be too small.
>
> This is in 
> https://github.com/ROCm/rocm-systems/blob/3a8bafb6a60f4cfa1047a5516fa7212beef4c98f/projects/rocr-runtime/libhsakmt/src/queues.c#L349
>
>                  /* Keep calculating it in case we are using an older kernel, but if we have
>                   * the CtlStackSize and CwsrSize from KFD, use that as the definitive value
>                   */
>                  q->ctx_save_restore_size = node.CwsrSize > 0 ? node.CwsrSize :
>                                             q->ctl_stack_size + PAGE_ALIGN_UP(wg_data_size);
>                  q->ctl_stack_size = node.CtlStackSize > 0 ? node.CtlStackSize : q->ctl_stack_size;
>
> ctx_save_restore_size should be calculated after correcting 
> ctl_stack_size with the one from the kernel mode driver.
>

Yes, rocr-runtime needs a fix as well. In rocr-runtime, I used the 
same approach as in the kernel (patch 6/6) to calculate ctl_stack_size 
and ctx_save_restore_size. Without this library change, queue creation 
failed. With the library change and this series applied, all RCCL tests 
pass on both 4K and 64K page sizes.
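
As a rough illustration of the ordering you describe, here is a minimal
sketch. It is not the actual rocr-runtime change: struct node_props,
struct queue, update_ctx_save_sizes() and the fixed 4 KB SKETCH_PAGE_SIZE
are hypothetical stand-ins, and only the field names come from the
snippet quoted above. The point is that ctl_stack_size is corrected with
the kernel value first, and the fallback ctx_save_restore_size is derived
from that corrected value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the libhsakmt types; only the fields used here. */
struct node_props {
    uint32_t CtlStackSize;  /* 0 if the kernel does not report it */
    uint32_t CwsrSize;      /* 0 if the kernel does not report it */
};

struct queue {
    uint32_t ctl_stack_size;         /* user-mode estimate on entry */
    uint32_t ctx_save_restore_size;
};

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE 4096u
#define PAGE_ALIGN_UP(x) \
    (((x) + SKETCH_PAGE_SIZE - 1u) & ~(SKETCH_PAGE_SIZE - 1u))

static void update_ctx_save_sizes(struct queue *q,
                                  const struct node_props *node,
                                  uint32_t wg_data_size)
{
    /* Step 1: prefer the control stack size reported by the kernel. */
    if (node->CtlStackSize > 0)
        q->ctl_stack_size = node->CtlStackSize;

    /* Step 2: only now derive the total size, so the fallback path
     * uses the corrected control stack size. */
    q->ctx_save_restore_size = node->CwsrSize > 0 ?
        node->CwsrSize :
        q->ctl_stack_size + PAGE_ALIGN_UP(wg_data_size);
}
```

With a user-mode estimate of 0x5000, a kernel-reported CtlStackSize of
0x10000, no CwsrSize and 0x1800 bytes of workgroup data, this yields a
save area of 0x12000 bytes rather than the undersized 0x7000.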


-Donet


> Regards,
>    Felix
>
>> This series has 6 patches
>> =========================
>> 1. AMDGPU_VA_RESERVED_TRAP_SIZE was hard-coded to 8 KB while
>>     KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE, which matches on
>>     4 KB page-size systems but results in a size mismatch on 64 KB
>>     systems, leading to kernel crashes when running rocminfo or RCCL
>>     unit tests.
>>     This patch updates AMDGPU_VA_RESERVED_TRAP_SIZE to 2 * PAGE_SIZE so
>>     that the reserved trap area matches the allocation size across all
>>     system page sizes. This patch is required to enable minimal
>>     support for 64 KB system page sizes.
>>
>> 2. Aligned expected_queue_size to PAGE_SIZE to fix AQL queue creation
>>     failure.
>>
>> 3. Fix amdgpu page fault handler (for xnack) to pass the corresponding
>>     system pfn (instead of gpu pfn) for restoring SVM range mapping.
>>
>> 4. Updated AMDGPU_GTT_MAX_TRANSFER_SIZE to always match the PMD size
>>     across all page sizes.
>>
>> 5. On systems where the CPU page size is larger than the GPU’s 4 KB page
>>     size, the MQD and control stack were aligned to the CPU PAGE_SIZE,
>>     causing multiple GPU pages to incorrectly inherit the UC attribute.
>>     This change aligns both regions to the GPU page size, ensuring that
>>     the MQD is mapped as UC and the control stack as NC, restoring the
>>     correct behavior.
>>
>> 6. Queue preemption fails when the control stack size is aligned to
>>     64 KB. This patch fixes the issue by aligning the control stack
>>     size to the GPU page size.
>>
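
The size arithmetic behind items 1, 5 and 6 can be sketched as below.
This is an illustration only, not code from the series: align_up(),
trap_sizes_match() and gpu_pages_spanned() are hypothetical helpers.
The 4 KB GPU page size and the 44 KB control stack come from the cover
letter. The 4 KB region used for item 5 is an assumed size.

```c
#include <assert.h>
#include <stdint.h>

#define GPU_PAGE_SIZE 4096u  /* GPU page size, as stated in the cover letter */

/* Round size up to a power-of-two alignment. */
static uint32_t align_up(uint32_t size, uint32_t align)
{
    return (size + align - 1u) & ~(align - 1u);
}

/* Item 1: a hard-coded 8 KB reservation equals 2 * PAGE_SIZE only
 * when the CPU page size is 4 KB. */
static int trap_sizes_match(uint32_t cpu_page_size)
{
    return 8u * 1024u == 2u * cpu_page_size;
}

/* Items 5 and 6: number of GPU pages covered once a region of the
 * given size has been padded to the given alignment. */
static uint32_t gpu_pages_spanned(uint32_t size, uint32_t align)
{
    return align_up(size, align) / GPU_PAGE_SIZE;
}
```

For example, trap_sizes_match(4096) holds while trap_sizes_match(65536)
does not. align_up(44 * 1024, 65536) gives 64 KB, whereas
align_up(44 * 1024, 4096) leaves the size at 44 KB. A 4 KB region
padded to a 64 KB CPU page spans 16 GPU pages instead of 1.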
>> Setup details:
>> ============
>> System details: Power10 LPAR using 64K pagesize.
>> AMD GPU:
>> Name:                    gfx90a
>> Marketing Name:          AMD Instinct MI210
>>
>> [1] https://lore.kernel.org/all/cover.1765519875.git.donettom@linux.ibm.com/
>> [2] https://elixir.bootlin.com/linux/v6.19-rc5/source/drivers/gpu/drm/amd/amdkfd/kfd_queue.c#L457
>>
>> RFC V3 - https://lore.kernel.org/all/cover.1771656655.git.donettom@linux.ibm.com/
>> RFC V2 - https://lore.kernel.org/all/cover.1769612973.git.donettom@linux.ibm.com/
>> RFC V1 - https://lore.kernel.org/all/cover.1765519875.git.donettom@linux.ibm.com/
>>
>>
>> Donet Tom (6):
>>    drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 2 PAGE_SIZE pages
>>    drm/amdkfd: Align expected_queue_size to PAGE_SIZE
>>    drm/amdgpu: Handle GPU page faults correctly on non-4K page systems
>>    drm/amdgpu: Fix AMDGPU_GTT_MAX_TRANSFER_SIZE for non-4K page size
>>    drm/amd: Fix MQD and control stack alignment for non-4K
>>    drm/amdkfd: Fix queue preemption/eviction failures by aligning control
>>      stack size to GPU page size
>>
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c      | 44 +++++++++++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h      |  2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       | 24 ++++------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h       |  2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  6 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |  2 +-
>>   drivers/gpu/drm/amd/amdgpu/vce_v1_0.c         |  3 +-
>>   .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 23 ++++++----
>>   drivers/gpu/drm/amd/amdkfd/kfd_queue.c        | 11 ++---
>>   9 files changed, 82 insertions(+), 35 deletions(-)
>>


