AMD-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Deucher <alexander.deucher@amd.com>
To: <amd-gfx@lists.freedesktop.org>
Cc: Mukul Joshi <mukul.joshi@amd.com>, Feifei Xu <Feifei.Xu@amd.com>,
	"Alex Deucher" <alexander.deucher@amd.com>
Subject: [PATCH] drm/amdkfd: Update CWSR area calculations for GFX 12.1
Date: Wed, 10 Dec 2025 02:13:57 -0500	[thread overview]
Message-ID: <20251210071415.19983-3-alexander.deucher@amd.com> (raw)
In-Reply-To: <20251210071415.19983-1-alexander.deucher@amd.com>

From: Mukul Joshi <mukul.joshi@amd.com>

Update the SGPR, VGPR, HWREG size and number of waves supported
for GFX 12.1 CWSR memory limits. The CU calculation changed in
topology; as a result, the values need to be updated.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c | 63 ++++++++++++++++++++++----
 1 file changed, 54 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
index 80c4fa2b0975d..56c97189e7f12 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
@@ -392,12 +392,20 @@ int kfd_queue_unref_bo_vas(struct kfd_process_device *pdd,
 	return 0;
 }
 
-#define SGPR_SIZE_PER_CU	0x4000
-#define LDS_SIZE_PER_CU		0x10000
-#define HWREG_SIZE_PER_CU	0x1000
 #define DEBUGGER_BYTES_ALIGN	64
 #define DEBUGGER_BYTES_PER_WAVE	32
 
+/* Return the per-CU SGPR save-area size in bytes for the CWSR region. */
+static u32 kfd_get_sgpr_size_per_cu(u32 gfxv)
+{
+	u32 sgpr_size = 0x4000;
+
+	/* GFX1250/GFX1251 double the SGPR save area versus older ASICs. */
+	if (gfxv == 120500 ||
+	    gfxv == 120501)
+		sgpr_size = 0x8000;
+
+	return sgpr_size;
+}
+
 static u32 kfd_get_vgpr_size_per_cu(u32 gfxv)
 {
 	u32 vgpr_size = 0x40000;
@@ -413,14 +421,53 @@ static u32 kfd_get_vgpr_size_per_cu(u32 gfxv)
 		 gfxv == 120000 ||		/* GFX_VERSION_GFX1200 */
 		 gfxv == 120001)		/* GFX_VERSION_GFX1201 */
 		vgpr_size = 0x60000;
+	else if (gfxv == 120500 ||		/* GFX_VERSION_GFX1250 */
+		 gfxv == 120501)		/* GFX_VERSION_GFX1251 */
+		vgpr_size = 0x80000;
 
 	return vgpr_size;
 }
 
+/* Return the per-CU HW register save-area size in bytes for the CWSR region. */
+static u32 kfd_get_hwreg_size_per_cu(u32 gfxv)
+{
+	u32 hwreg_size = 0x1000;
+
+	/* GFX1250/GFX1251 need a larger (0x8000) HWREG save area per CU. */
+	if (gfxv == 120500 || gfxv == 120501)
+		hwreg_size = 0x8000;
+
+	return hwreg_size;
+}
+
+/* Return the per-CU LDS save-area size in bytes for the CWSR region. */
+static u32 kfd_get_lds_size_per_cu(u32 gfxv, struct kfd_node_properties *props)
+{
+	u32 lds_size = 0x10000;
+
+	/*
+	 * GFX 9.5.0 and GFX1250/GFX1251 take the LDS size from topology
+	 * (reported in KB, hence the << 10) instead of the fixed default.
+	 */
+	if (gfxv == 90500 || gfxv == 120500 || gfxv == 120501)
+		lds_size = props->lds_size_in_kb << 10;
+
+	return lds_size;
+}
+
+/*
+ * Return the number of in-flight waves to size the CWSR control stack
+ * for, given the GFX version and the CU count.
+ */
+static u32 get_num_waves(struct kfd_node_properties *props, u32 gfxv, u32 cu_num)
+{
+	u32 wave_num = 0;
+
+	if (gfxv < 100100)		/* pre-GFX_VERSION_NAVI10 */
+		wave_num = min(cu_num * 40,
+				props->array_count / props->simd_arrays_per_engine * 512);
+	else if (gfxv < 120500)		/* NAVI10 .. GFX120x: 32 waves per CU */
+		wave_num = cu_num * 32;
+	else if (gfxv <= 120501)	/* GFX1250/GFX1251: 64 waves per CU */
+		wave_num = cu_num * 64;
+
+	/* Flag GFX versions above 120501 that have not been handled here. */
+	WARN_ON(wave_num == 0);
+
+	return wave_num;
+}
+
 #define WG_CONTEXT_DATA_SIZE_PER_CU(gfxv, props)	\
-	(kfd_get_vgpr_size_per_cu(gfxv) + SGPR_SIZE_PER_CU +\
-	 (((gfxv) == 90500) ? (props->lds_size_in_kb << 10) : LDS_SIZE_PER_CU) +\
-	 HWREG_SIZE_PER_CU)
+	(kfd_get_vgpr_size_per_cu(gfxv) + kfd_get_sgpr_size_per_cu(gfxv) +\
+	 kfd_get_lds_size_per_cu(gfxv, props) + kfd_get_hwreg_size_per_cu(gfxv))
 
 #define CNTL_STACK_BYTES_PER_WAVE(gfxv)	\
 	((gfxv) >= 100100 ? 12 : 8)	/* GFX_VERSION_NAVI10*/
@@ -440,9 +487,7 @@ void kfd_queue_ctx_save_restore_size(struct kfd_topology_device *dev)
 		return;
 
 	cu_num = props->simd_count / props->simd_per_cu / NUM_XCC(dev->gpu->xcc_mask);
-	wave_num = (gfxv < 100100) ?	/* GFX_VERSION_NAVI10 */
-		    min(cu_num * 40, props->array_count / props->simd_arrays_per_engine * 512)
-		    : cu_num * 32;
+	wave_num = get_num_waves(props, gfxv, cu_num);
 
 	wg_data_size = ALIGN(cu_num * WG_CONTEXT_DATA_SIZE_PER_CU(gfxv, props), PAGE_SIZE);
 	ctl_stack_size = wave_num * CNTL_STACK_BYTES_PER_WAVE(gfxv) + 8;
-- 
2.52.0


  parent reply	other threads:[~2025-12-10  7:15 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-10  7:13 [PATCH] drm/amdgpu: Flush TLB on all XCCs on GFX 12.1 Alex Deucher
2025-12-10  7:13 ` [PATCH] drm/amdgpu: Add soc v1_0 ih client id table Alex Deucher
2025-12-10  7:13 ` Alex Deucher [this message]
2025-12-10  7:13 ` [PATCH] drm/amdgpu: Fix CU info calculations for GFX 12.1 Alex Deucher
2025-12-10  7:13 ` [PATCH] drm/amdgpu: init RS64_MEC_P2/P3_STACK for gfx12.1 Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: Enable 5-level page table for GFX 12.1.0 Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdkfd: Update LDS, Scratch base for 57bit address Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: Add pde3 table invalidation request for GFX 12.1.0 Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: Support 57bit fault address " Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: Fix CP_MEC_MDBASE in multi-xcc for gfx v12_1 Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: Correct xcc_id input to GET_INST from physical to logic Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: use physical xcc id to get rrmt Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: Correct inst_id input from physical to logic Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: support xcc harvest for ih translate Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: normalize reg addr as local xcc for gfx v12_1 Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu/mes_v12_1: fix mes access xcd register Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: add gfx sysfs support for gfx_v12_1 Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: correct rlc autoload for xcc harvest Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdkfd: Override KFD SVM mappings for GFX 12.1 Alex Deucher
2025-12-10  7:14 ` [PATCH] drm/amdgpu: Add gfx v12_1 interrupt source header Alex Deucher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251210071415.19983-3-alexander.deucher@amd.com \
    --to=alexander.deucher@amd.com \
    --cc=Feifei.Xu@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=mukul.joshi@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox