From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yury Norov
To: Felix Kuehling, Alex Deucher, Christian König, David Airlie, Simona Vetter, amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Cc: Yury Norov
Subject: [PATCH] drm/amdkfd: simplify kgd_gfx_v9_get_cu_occupancy()
Date: Mon, 27 Apr 2026 22:26:04 -0400
Message-ID: <20260428022605.447901-1-ynorov@nvidia.com>

Switch the inner loop in kgd_gfx_v9_get_cu_occupancy() to the dedicated
for_each_andnot_bit() iterator, and drop most of the housekeeping code.

Signed-off-by: Yury Norov
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 32 +++++++------------
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index 2e116c06d5be..08f4c4bf20c3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -1026,9 +1026,8 @@ void kgd_gfx_v9_get_cu_occupancy(struct amdgpu_device *adev,
 	int qidx;
 	int se_idx;
 	int se_cnt;
-	int queue_map;
-	int max_queue_cnt;
-	DECLARE_BITMAP(cp_queue_bitmap, AMDGPU_MAX_QUEUES);
+	DECLARE_BITMAP(queue_map, 32);
+	int max_queue_cnt, queue_reg;
 
 	lock_spi_csq_mutexes(adev);
 	soc15_grbm_select(adev, 1, 0, 0, 0, GET_INST(GC, inst));
@@ -1037,35 +1036,26 @@ void kgd_gfx_v9_get_cu_occupancy(struct amdgpu_device *adev,
 	 * Iterate through the shader engines and arrays of the device
 	 * to get number of waves in flight
 	 */
-	bitmap_complement(cp_queue_bitmap, adev->gfx.mec_bitmap[0].queue_bitmap,
-			  AMDGPU_MAX_QUEUES);
 	max_queue_cnt = adev->gfx.mec.num_pipe_per_mec *
 			adev->gfx.mec.num_queue_per_pipe;
 	se_cnt = adev->gfx.config.max_shader_engines;
 	for (se_idx = 0; se_idx < se_cnt; se_idx++) {
 		amdgpu_gfx_select_se_sh(adev, se_idx, 0, 0xffffffff, inst);
-		queue_map = RREG32_SOC15(GC, GET_INST(GC, inst), mmSPI_CSQ_WF_ACTIVE_STATUS);
+		queue_reg = RREG32_SOC15(GC, GET_INST(GC, inst), mmSPI_CSQ_WF_ACTIVE_STATUS);
+		bitmap_from_arr32(queue_map, &queue_reg, 32);
 
 		/*
 		 * Assumption: queue map encodes following schema: four
 		 * pipes per each micro-engine, with each pipe mapping
 		 * eight queues. This schema is true for GFX9 devices
-		 * and must be verified for newer device families
+		 * and must be verified for newer device families.
+		 *
+		 * Get number of waves in flight and aggregate them, and skip
+		 * queues that are not associated with compute functions.
 		 */
-		for (qidx = 0; qidx < max_queue_cnt; qidx++) {
-			/* Skip qeueus that are not associated with
-			 * compute functions
-			 */
-			if (!test_bit(qidx, cp_queue_bitmap))
-				continue;
-
-			if (!(queue_map & (1 << qidx)))
-				continue;
-
-			/* Get number of waves in flight and aggregate them */
-			get_wave_count(adev, qidx, &cu_occupancy[qidx],
-					inst);
-		}
+		for_each_andnot_bit(qidx, adev->gfx.mec_bitmap[0].queue_bitmap,
+				    queue_map, max_queue_cnt)
+			get_wave_count(adev, qidx, &cu_occupancy[qidx], inst);
 	}
 
 	amdgpu_gfx_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff, inst);
-- 
2.51.0
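
The iteration semantics behind the loop change can be checked in user space. The sketch below is a minimal model, not the kernel implementation. select_old() mirrors the two `continue` tests of the removed loop. select_andnot() is a hypothetical stand-in that assumes for_each_andnot_bit(bit, addr1, addr2, size) visits the bits that are set in addr1 and clear in addr2, as find_next_andnot_bit() is documented in include/linux/find.h. The names `active` and `reserved` are illustrative labels for the mmSPI_CSQ_WF_ACTIVE_STATUS value and for mec_bitmap[0].queue_bitmap; they do not appear in the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the removed loop: a queue index is selected when its bit is
 * clear in 'reserved' (the complemented-bitmap test) and set in 'active'
 * (the register test). Returns the selected indices as a bit mask.
 */
static uint32_t select_old(uint32_t active, uint32_t reserved,
			   unsigned int nbits)
{
	uint32_t out = 0;

	assert(nbits <= 32);
	for (unsigned int q = 0; q < nbits; q++) {
		if (reserved & (UINT32_C(1) << q))
			continue;	/* was: !test_bit(qidx, cp_queue_bitmap) */
		if (!(active & (UINT32_C(1) << q)))
			continue;	/* was: !(queue_map & (1 << qidx)) */
		out |= UINT32_C(1) << q;
	}
	return out;
}

/*
 * Hypothetical stand-in for an and-not iteration over the first 'nbits'
 * bits: the result has exactly the bits set in 'a' and clear in 'b'.
 */
static uint32_t select_andnot(uint32_t a, uint32_t b, unsigned int nbits)
{
	uint32_t mask;

	assert(nbits <= 32);
	mask = nbits == 32 ? UINT32_MAX : (UINT32_C(1) << nbits) - 1;
	return a & ~b & mask;
}
```

With active = 0xf0f0 and reserved = 0x00ff, select_old(active, reserved, 32) and select_andnot(active, reserved, 32) both give 0xf000, while select_andnot(reserved, active, 32) gives 0x000f. Under the stated assumption, the operand order decides which set of queues reaches get_wave_count(), so it is worth comparing against the argument order in the hunk above.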