From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Mukul Joshi <mukul.joshi@amd.com>,
Felix Kuehling <Felix.Kuehling@amd.com>,
Alex Deucher <alexander.deucher@amd.com>,
Sasha Levin <sashal@kernel.org>,
christian.koenig@amd.com, Xinhui.Pan@amd.com, airlied@gmail.com,
daniel@ffwll.ch, Hawking.Zhang@amd.com, le.ma@amd.com,
lijo.lazar@amd.com, tao.zhou1@amd.com, YiPeng.Chai@amd.com,
Likun.Gao@amd.com, rajneesh.bhardwaj@amd.com,
tom.stdenis@amd.com, guchun.chen@amd.com, David.Francis@amd.com,
James.Zhu@amd.com, evan.quan@amd.com, marek.olsak@amd.com,
Graham.Sider@amd.com, srinivasan.shanmugam@amd.com,
Lang.Yu@amd.com, bas@basnieuwenhuizen.nl,
sukrut.bellary@linux.com, mario.limonciello@amd.com,
jonathan.kim@amd.com, kenneth.feng@amd.com,
Hongkun.Zhang@amd.com, sunran001@208suo.com, error27@gmail.com,
Jiadong.Zhu@amd.com, jesse.zhang@amd.com, shiwu.zhang@amd.com,
amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.5 24/41] drm/amdgpu: Store CU info from all XCCs for GFX v9.4.3
Date: Sun, 24 Sep 2023 09:15:12 -0400 [thread overview]
Message-ID: <20230924131529.1275335-24-sashal@kernel.org> (raw)
In-Reply-To: <20230924131529.1275335-1-sashal@kernel.org>
From: Mukul Joshi <mukul.joshi@amd.com>
[ Upstream commit 97e3c6a853f2af9145daf0c6ca25bcdf55c759d4 ]
Currently, we store CU info only for a single XCC assuming
that it is the same for all XCCs. However, that may not be
true. As a result, store CU info for all XCCs. This info is
later used for CU masking.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 3 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +-
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 76 +++++++++----------
drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 3 +-
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c | 8 +-
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 11 ++-
.../gpu/drm/amd/include/kgd_kfd_interface.h | 6 +-
14 files changed, 60 insertions(+), 65 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index b4fcad0e62f7e..a7c8beff1647c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -492,7 +492,7 @@ void amdgpu_amdkfd_get_cu_info(struct amdgpu_device *adev, struct kfd_cu_info *c
cu_info->cu_active_number = acu_info.number;
cu_info->cu_ao_mask = acu_info.ao_cu_mask;
memcpy(&cu_info->cu_bitmap[0], &acu_info.bitmap[0],
- sizeof(acu_info.bitmap));
+ sizeof(cu_info->cu_bitmap));
cu_info->num_shader_engines = adev->gfx.config.max_shader_engines;
cu_info->num_shader_arrays_per_engine = adev->gfx.config.max_sh_per_se;
cu_info->num_cu_per_sh = adev->gfx.config.max_cu_per_sh;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index a4ff515ce8966..59ba03d387fcc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -43,6 +43,7 @@
#define AMDGPU_GFX_LBPW_DISABLED_MODE 0x00000008L
#define AMDGPU_MAX_GC_INSTANCES 8
+#define KGD_MAX_QUEUES 128
#define AMDGPU_MAX_GFX_QUEUES KGD_MAX_QUEUES
#define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES
@@ -254,7 +255,7 @@ struct amdgpu_cu_info {
uint32_t number;
uint32_t ao_cu_mask;
uint32_t ao_cu_bitmap[4][4];
- uint32_t bitmap[4][4];
+ uint32_t bitmap[AMDGPU_MAX_GC_INSTANCES][4][4];
};
struct amdgpu_gfx_ras {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index d4ca19ba5a289..f678bdd5f353d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -839,7 +839,7 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
memcpy(&dev_info->cu_ao_bitmap[0], &adev->gfx.cu_info.ao_cu_bitmap[0],
sizeof(adev->gfx.cu_info.ao_cu_bitmap));
memcpy(&dev_info->cu_bitmap[0], &adev->gfx.cu_info.bitmap[0],
- sizeof(adev->gfx.cu_info.bitmap));
+ sizeof(dev_info->cu_bitmap));
dev_info->vram_type = adev->gmc.vram_type;
dev_info->vram_bit_width = adev->gmc.vram_width;
dev_info->vce_harvest_config = adev->vce.harvest_config;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 44af8022b89fa..f743bf2c92877 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -9448,7 +9448,7 @@ static int gfx_v10_0_get_cu_info(struct amdgpu_device *adev,
gfx_v10_0_set_user_wgp_inactive_bitmap_per_sh(
adev, disable_masks[i * 2 + j]);
bitmap = gfx_v10_0_get_cu_active_bitmap_per_sh(adev);
- cu_info->bitmap[i][j] = bitmap;
+ cu_info->bitmap[0][i][j] = bitmap;
for (k = 0; k < adev->gfx.config.max_cu_per_sh; k++) {
if (bitmap & mask) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 0451533ddde41..a82cba884c48f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -6394,7 +6394,7 @@ static int gfx_v11_0_get_cu_info(struct amdgpu_device *adev,
* SE6: {SH0,SH1} --> {bitmap[2][2], bitmap[2][3]}
* SE7: {SH0,SH1} --> {bitmap[3][2], bitmap[3][3]}
*/
- cu_info->bitmap[i % 4][j + (i / 4) * 2] = bitmap;
+ cu_info->bitmap[0][i % 4][j + (i / 4) * 2] = bitmap;
for (k = 0; k < adev->gfx.config.max_cu_per_sh; k++) {
if (bitmap & mask)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
index da6caff78c22b..34f9211b26793 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
@@ -3577,7 +3577,7 @@ static void gfx_v6_0_get_cu_info(struct amdgpu_device *adev)
gfx_v6_0_set_user_cu_inactive_bitmap(
adev, disable_masks[i * 2 + j]);
bitmap = gfx_v6_0_get_cu_enabled(adev);
- cu_info->bitmap[i][j] = bitmap;
+ cu_info->bitmap[0][i][j] = bitmap;
for (k = 0; k < adev->gfx.config.max_cu_per_sh; k++) {
if (bitmap & mask) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 8c174c11eaee0..6feae2548e8ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -5122,7 +5122,7 @@ static void gfx_v7_0_get_cu_info(struct amdgpu_device *adev)
gfx_v7_0_set_user_cu_inactive_bitmap(
adev, disable_masks[i * 2 + j]);
bitmap = gfx_v7_0_get_cu_active_bitmap(adev);
- cu_info->bitmap[i][j] = bitmap;
+ cu_info->bitmap[0][i][j] = bitmap;
for (k = 0; k < adev->gfx.config.max_cu_per_sh; k ++) {
if (bitmap & mask) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 51c1745c83697..885ebd703260f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -7121,7 +7121,7 @@ static void gfx_v8_0_get_cu_info(struct amdgpu_device *adev)
gfx_v8_0_set_user_cu_inactive_bitmap(
adev, disable_masks[i * 2 + j]);
bitmap = gfx_v8_0_get_cu_active_bitmap(adev);
- cu_info->bitmap[i][j] = bitmap;
+ cu_info->bitmap[0][i][j] = bitmap;
for (k = 0; k < adev->gfx.config.max_cu_per_sh; k ++) {
if (bitmap & mask) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 65577eca58f1c..e511e49d3023f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1497,7 +1497,7 @@ static void gfx_v9_0_init_always_on_cu_mask(struct amdgpu_device *adev)
amdgpu_gfx_select_se_sh(adev, i, j, 0xffffffff, 0);
for (k = 0; k < adev->gfx.config.max_cu_per_sh; k ++) {
- if (cu_info->bitmap[i][j] & mask) {
+ if (cu_info->bitmap[0][i][j] & mask) {
if (counter == pg_always_on_cu_num)
WREG32_SOC15(GC, 0, mmRLC_PG_ALWAYS_ON_CU_MASK, cu_bitmap);
if (counter < always_on_cu_num)
@@ -7234,7 +7234,7 @@ static int gfx_v9_0_get_cu_info(struct amdgpu_device *adev,
* SE6,SH0 --> bitmap[2][1]
* SE7,SH0 --> bitmap[3][1]
*/
- cu_info->bitmap[i % 4][j + i / 4] = bitmap;
+ cu_info->bitmap[0][i % 4][j + i / 4] = bitmap;
for (k = 0; k < adev->gfx.config.max_cu_per_sh; k ++) {
if (bitmap & mask) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
index 4f883b94f98ef..84a74a6c6b2de 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
@@ -4228,7 +4228,7 @@ static void gfx_v9_4_3_set_gds_init(struct amdgpu_device *adev)
}
static void gfx_v9_4_3_set_user_cu_inactive_bitmap(struct amdgpu_device *adev,
- u32 bitmap)
+ u32 bitmap, int xcc_id)
{
u32 data;
@@ -4238,15 +4238,15 @@ static void gfx_v9_4_3_set_user_cu_inactive_bitmap(struct amdgpu_device *adev,
data = bitmap << GC_USER_SHADER_ARRAY_CONFIG__INACTIVE_CUS__SHIFT;
data &= GC_USER_SHADER_ARRAY_CONFIG__INACTIVE_CUS_MASK;
- WREG32_SOC15(GC, GET_INST(GC, 0), regGC_USER_SHADER_ARRAY_CONFIG, data);
+ WREG32_SOC15(GC, GET_INST(GC, xcc_id), regGC_USER_SHADER_ARRAY_CONFIG, data);
}
-static u32 gfx_v9_4_3_get_cu_active_bitmap(struct amdgpu_device *adev)
+static u32 gfx_v9_4_3_get_cu_active_bitmap(struct amdgpu_device *adev, int xcc_id)
{
u32 data, mask;
- data = RREG32_SOC15(GC, GET_INST(GC, 0), regCC_GC_SHADER_ARRAY_CONFIG);
- data |= RREG32_SOC15(GC, GET_INST(GC, 0), regGC_USER_SHADER_ARRAY_CONFIG);
+ data = RREG32_SOC15(GC, GET_INST(GC, xcc_id), regCC_GC_SHADER_ARRAY_CONFIG);
+ data |= RREG32_SOC15(GC, GET_INST(GC, xcc_id), regGC_USER_SHADER_ARRAY_CONFIG);
data &= CC_GC_SHADER_ARRAY_CONFIG__INACTIVE_CUS_MASK;
data >>= CC_GC_SHADER_ARRAY_CONFIG__INACTIVE_CUS__SHIFT;
@@ -4259,7 +4259,7 @@ static u32 gfx_v9_4_3_get_cu_active_bitmap(struct amdgpu_device *adev)
static int gfx_v9_4_3_get_cu_info(struct amdgpu_device *adev,
struct amdgpu_cu_info *cu_info)
{
- int i, j, k, counter, active_cu_number = 0;
+ int i, j, k, counter, xcc_id, active_cu_number = 0;
u32 mask, bitmap, ao_bitmap, ao_cu_mask = 0;
unsigned disable_masks[4 * 4];
@@ -4278,46 +4278,38 @@ static int gfx_v9_4_3_get_cu_info(struct amdgpu_device *adev,
adev->gfx.config.max_sh_per_se);
mutex_lock(&adev->grbm_idx_mutex);
- for (i = 0; i < adev->gfx.config.max_shader_engines; i++) {
- for (j = 0; j < adev->gfx.config.max_sh_per_se; j++) {
- mask = 1;
- ao_bitmap = 0;
- counter = 0;
- gfx_v9_4_3_xcc_select_se_sh(adev, i, j, 0xffffffff, 0);
- gfx_v9_4_3_set_user_cu_inactive_bitmap(
- adev, disable_masks[i * adev->gfx.config.max_sh_per_se + j]);
- bitmap = gfx_v9_4_3_get_cu_active_bitmap(adev);
-
- /*
- * The bitmap(and ao_cu_bitmap) in cu_info structure is
- * 4x4 size array, and it's usually suitable for Vega
- * ASICs which has 4*2 SE/SH layout.
- * But for Arcturus, SE/SH layout is changed to 8*1.
- * To mostly reduce the impact, we make it compatible
- * with current bitmap array as below:
- * SE4,SH0 --> bitmap[0][1]
- * SE5,SH0 --> bitmap[1][1]
- * SE6,SH0 --> bitmap[2][1]
- * SE7,SH0 --> bitmap[3][1]
- */
- cu_info->bitmap[i % 4][j + i / 4] = bitmap;
-
- for (k = 0; k < adev->gfx.config.max_cu_per_sh; k++) {
- if (bitmap & mask) {
- if (counter < adev->gfx.config.max_cu_per_sh)
- ao_bitmap |= mask;
- counter++;
+ for (xcc_id = 0; xcc_id < NUM_XCC(adev->gfx.xcc_mask); xcc_id++) {
+ for (i = 0; i < adev->gfx.config.max_shader_engines; i++) {
+ for (j = 0; j < adev->gfx.config.max_sh_per_se; j++) {
+ mask = 1;
+ ao_bitmap = 0;
+ counter = 0;
+ gfx_v9_4_3_xcc_select_se_sh(adev, i, j, 0xffffffff, xcc_id);
+ gfx_v9_4_3_set_user_cu_inactive_bitmap(
+ adev,
+ disable_masks[i * adev->gfx.config.max_sh_per_se + j],
+ xcc_id);
+ bitmap = gfx_v9_4_3_get_cu_active_bitmap(adev, xcc_id);
+
+ cu_info->bitmap[xcc_id][i][j] = bitmap;
+
+ for (k = 0; k < adev->gfx.config.max_cu_per_sh; k++) {
+ if (bitmap & mask) {
+ if (counter < adev->gfx.config.max_cu_per_sh)
+ ao_bitmap |= mask;
+ counter++;
+ }
+ mask <<= 1;
}
- mask <<= 1;
+ active_cu_number += counter;
+ if (i < 2 && j < 2)
+ ao_cu_mask |= (ao_bitmap << (i * 16 + j * 8));
+ cu_info->ao_cu_bitmap[i][j] = ao_bitmap;
}
- active_cu_number += counter;
- if (i < 2 && j < 2)
- ao_cu_mask |= (ao_bitmap << (i * 16 + j * 8));
- cu_info->ao_cu_bitmap[i % 4][j + i / 4] = ao_bitmap;
}
+ gfx_v9_4_3_xcc_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff,
+ xcc_id);
}
- gfx_v9_4_3_xcc_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff,
- 0);
mutex_unlock(&adev->grbm_idx_mutex);
cu_info->number = active_cu_number;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index f5a6f562e2a80..11b9837292536 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -2154,7 +2154,8 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
amdgpu_amdkfd_get_cu_info(kdev->adev, &cu_info);
cu->num_simd_per_cu = cu_info.simd_per_cu;
- cu->num_simd_cores = cu_info.simd_per_cu * cu_info.cu_active_number;
+ cu->num_simd_cores = cu_info.simd_per_cu *
+ (cu_info.cu_active_number / kdev->kfd->num_nodes);
cu->max_waves_simd = cu_info.max_waves_per_simd;
cu->wave_front_size = cu_info.wave_front_size;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index 863cf060af484..35e05ee89eac5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
@@ -104,11 +104,13 @@ void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
bool wgp_mode_req = KFD_GC_VERSION(mm->dev) >= IP_VERSION(10, 0, 0);
uint32_t en_mask = wgp_mode_req ? 0x3 : 0x1;
int i, se, sh, cu, cu_bitmap_sh_mul, inc = wgp_mode_req ? 2 : 1;
+ uint32_t cu_active_per_node;
amdgpu_amdkfd_get_cu_info(mm->dev->adev, &cu_info);
- if (cu_mask_count > cu_info.cu_active_number)
- cu_mask_count = cu_info.cu_active_number;
+ cu_active_per_node = cu_info.cu_active_number / mm->dev->kfd->num_nodes;
+ if (cu_mask_count > cu_active_per_node)
+ cu_mask_count = cu_active_per_node;
/* Exceeding these bounds corrupts the stack and indicates a coding error.
* Returning with no CU's enabled will hang the queue, which should be
@@ -141,7 +143,7 @@ void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
for (se = 0; se < cu_info.num_shader_engines; se++)
for (sh = 0; sh < cu_info.num_shader_arrays_per_engine; sh++)
cu_per_sh[se][sh] = hweight32(
- cu_info.cu_bitmap[se % 4][sh + (se / 4) * cu_bitmap_sh_mul]);
+ cu_info.cu_bitmap[0][se % 4][sh + (se / 4) * cu_bitmap_sh_mul]);
/* Symmetrically map cu_mask to all SEs & SHs:
* se_mask programs up to 2 SH in the upper and lower 16 bits.
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 4a17bb7c7b27d..ea67a353beb00 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -450,8 +450,7 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr,
sysfs_show_32bit_prop(buffer, offs, "cpu_cores_count",
dev->node_props.cpu_cores_count);
sysfs_show_32bit_prop(buffer, offs, "simd_count",
- dev->gpu ? (dev->node_props.simd_count *
- NUM_XCC(dev->gpu->xcc_mask)) : 0);
+ dev->gpu ? dev->node_props.simd_count : 0);
sysfs_show_32bit_prop(buffer, offs, "mem_banks_count",
dev->node_props.mem_banks_count);
sysfs_show_32bit_prop(buffer, offs, "caches_count",
@@ -1658,7 +1657,7 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext,
int i, j, k;
struct kfd_cache_properties *pcache = NULL;
- cu_sibling_map_mask = cu_info->cu_bitmap[0][0];
+ cu_sibling_map_mask = cu_info->cu_bitmap[0][0][0];
cu_sibling_map_mask &=
((1 << pcache_info[cache_type].num_cu_shared) - 1);
first_active_cu = ffs(cu_sibling_map_mask);
@@ -1701,7 +1700,7 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext,
pcache->sibling_map[k+3] = (uint8_t)((cu_sibling_map_mask >> 24) & 0xFF);
k += 4;
- cu_sibling_map_mask = cu_info->cu_bitmap[i % 4][j + i / 4];
+ cu_sibling_map_mask = cu_info->cu_bitmap[0][i % 4][j + i / 4];
cu_sibling_map_mask &= ((1 << pcache_info[cache_type].num_cu_shared) - 1);
}
}
@@ -1762,8 +1761,8 @@ static void kfd_fill_cache_non_crat_info(struct kfd_topology_device *dev, struct
for (k = 0; k < pcu_info->num_cu_per_sh; k += pcache_info[ct].num_cu_shared) {
ret = fill_in_l1_pcache(&props_ext, pcache_info, pcu_info,
- pcu_info->cu_bitmap[i % 4][j + i / 4], ct,
- cu_processor_id, k);
+ pcu_info->cu_bitmap[0][i % 4][j + i / 4], ct,
+ cu_processor_id, k);
if (ret < 0)
break;
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index d0df3381539f0..74cc545085a02 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -31,12 +31,12 @@
#include <linux/types.h>
#include <linux/bitmap.h>
#include <linux/dma-fence.h>
+#include "amdgpu_irq.h"
+#include "amdgpu_gfx.h"
struct pci_dev;
struct amdgpu_device;
-#define KGD_MAX_QUEUES 128
-
struct kfd_dev;
struct kgd_mem;
@@ -68,7 +68,7 @@ struct kfd_cu_info {
uint32_t wave_front_size;
uint32_t max_scratch_slots_per_cu;
uint32_t lds_size;
- uint32_t cu_bitmap[4][4];
+ uint32_t cu_bitmap[AMDGPU_MAX_GC_INSTANCES][4][4];
};
/* For getting GPU local memory information from KGD */
--
2.40.1
next prev parent reply other threads:[~2023-09-24 13:17 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-24 13:14 [PATCH AUTOSEL 6.5 01/41] nvme-fc: Prevent null pointer dereference in nvme_fc_io_getuuid() Sasha Levin
2023-09-24 13:14 ` [PATCH AUTOSEL 6.5 02/41] parisc: sba: Fix compile warning wrt list of SBA devices Sasha Levin
2023-09-24 13:14 ` [PATCH AUTOSEL 6.5 03/41] parisc: sba-iommu: Fix sparse warnigs Sasha Levin
2023-09-24 13:14 ` [PATCH AUTOSEL 6.5 04/41] parisc: ccio-dma: Fix sparse warnings Sasha Levin
2023-09-24 13:14 ` [PATCH AUTOSEL 6.5 05/41] parisc: iosapic.c: " Sasha Levin
2023-09-24 13:14 ` [PATCH AUTOSEL 6.5 06/41] parisc: drivers: Fix sparse warning Sasha Levin
2023-09-24 13:14 ` [PATCH AUTOSEL 6.5 07/41] parisc: irq: Make irq_stack_union static to avoid " Sasha Levin
2023-09-24 13:14 ` [PATCH AUTOSEL 6.5 08/41] scsi: qedf: Add synchronization between I/O completions and abort Sasha Levin
2023-09-24 13:14 ` [PATCH AUTOSEL 6.5 09/41] scsi: ufs: core: Move __ufshcd_send_uic_cmd() outside host_lock Sasha Levin
2023-09-24 13:14 ` [PATCH AUTOSEL 6.5 10/41] scsi: ufs: core: Poll HCS.UCRDY before issuing a UIC command Sasha Levin
2023-09-24 13:14 ` [PATCH AUTOSEL 6.5 11/41] selftests/ftrace: Correctly enable event in instance-event.tc Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 12/41] ring-buffer: Avoid softlockup in ring_buffer_resize() Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 13/41] btrfs: do not block starts waiting on previous transaction commit Sasha Levin
2023-09-25 13:01 ` David Sterba
2023-09-25 17:47 ` Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 14/41] btrfs: improve error message after failure to add delayed dir index item Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 15/41] btrfs: assert delayed node locked when removing delayed item Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 16/41] selftests: fix dependency checker script Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 17/41] ring-buffer: Do not attempt to read past "commit" Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 18/41] net/smc: bugfix for smcr v2 server connect success statistic Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 19/41] ata: sata_mv: Fix incorrect string length computation in mv_dump_mem() Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 20/41] efi/x86: Ensure that EFI_RUNTIME_MAP is enabled for kexec Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 21/41] platform/mellanox: mlxbf-bootctl: add NET dependency into Kconfig Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 22/41] platform/x86: asus-wmi: Support 2023 ROG X16 tablet mode Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 23/41] thermal/of: add missing of_node_put() Sasha Levin
2023-09-24 13:15 ` Sasha Levin [this message]
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 25/41] drm/amdkfd: Update cache info reporting for GFX v9.4.3 Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 26/41] drm/amdkfd: Update CU masking for GFX 9.4.3 Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 27/41] drm/amd/display: Add dirty rect support for Replay Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 28/41] drm/amd/display: Don't check registers, if using AUX BL control Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 29/41] drm/amdgpu/soc21: don't remap HDP registers for SR-IOV Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 30/41] drm/amdgpu/nbio4.3: set proper rmmio_remap.reg_offset " Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 31/41] drm/amdgpu: fallback to old RAS error message for aqua_vanjaram Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 32/41] drm/amdkfd: Checkpoint and restore queues on GFX11 Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 33/41] drm/amdgpu: Handle null atom context in VBIOS info ioctl Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 34/41] objtool: Fix _THIS_IP_ detection for cold functions Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 35/41] nvme-pci: do not set the NUMA node of device if it has none Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 36/41] riscv: errata: fix T-Head dcache.cva encoding Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 37/41] scsi: pm80xx: Use phy-specific SAS address when sending PHY_START command Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 38/41] scsi: pm80xx: Avoid leaking tags when processing OPC_INB_SET_CONTROLLER_CONFIG command Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 39/41] smb3: correct places where ENOTSUPP is used instead of preferred EOPNOTSUPP Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 40/41] ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset() Sasha Levin
2023-09-24 13:15 ` [PATCH AUTOSEL 6.5 41/41] ata: libata-eh: do not thaw the port twice " Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230924131529.1275335-24-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=David.Francis@amd.com \
--cc=Felix.Kuehling@amd.com \
--cc=Graham.Sider@amd.com \
--cc=Hawking.Zhang@amd.com \
--cc=Hongkun.Zhang@amd.com \
--cc=James.Zhu@amd.com \
--cc=Jiadong.Zhu@amd.com \
--cc=Lang.Yu@amd.com \
--cc=Likun.Gao@amd.com \
--cc=Xinhui.Pan@amd.com \
--cc=YiPeng.Chai@amd.com \
--cc=airlied@gmail.com \
--cc=alexander.deucher@amd.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=bas@basnieuwenhuizen.nl \
--cc=christian.koenig@amd.com \
--cc=daniel@ffwll.ch \
--cc=dri-devel@lists.freedesktop.org \
--cc=error27@gmail.com \
--cc=evan.quan@amd.com \
--cc=guchun.chen@amd.com \
--cc=jesse.zhang@amd.com \
--cc=jonathan.kim@amd.com \
--cc=kenneth.feng@amd.com \
--cc=le.ma@amd.com \
--cc=lijo.lazar@amd.com \
--cc=linux-kernel@vger.kernel.org \
--cc=marek.olsak@amd.com \
--cc=mario.limonciello@amd.com \
--cc=mukul.joshi@amd.com \
--cc=rajneesh.bhardwaj@amd.com \
--cc=shiwu.zhang@amd.com \
--cc=srinivasan.shanmugam@amd.com \
--cc=stable@vger.kernel.org \
--cc=sukrut.bellary@linux.com \
--cc=sunran001@208suo.com \
--cc=tao.zhou1@amd.com \
--cc=tom.stdenis@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox