From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bharata B Rao
Subject: Re: [RFC PATCH v3 0/8] mm: Hot page tracking and promotion infrastructure
Date: Wed, 19 Nov 2025 18:36:24 +0530
Message-ID: <20251119130624.74880-1-bharata@amd.com>
In-Reply-To: <20251110052343.208768-1-bharata@amd.com>
References: <20251110052343.208768-1-bharata@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain

On 10-Nov-25 10:53 AM, Bharata B Rao wrote:
> Results
> =======

Earlier I included results from the scenario where there was enough
free memory in the toptier node and hence demotions weren't getting
triggered. Here I am including results from a similar microbenchmark
that results in demotion too.

System details
--------------
3-node AMD Zen5 system with 2 regular NUMA nodes (0, 1) and a CXL node (2)

$ numactl -H
available: 3 nodes (0-2)
node 0 cpus: 0-95,192-287
node 0 size: 128460 MB
node 1 cpus: 96-191,288-383
node 1 size: 128893 MB
node 2 cpus:
node 2 size: 257993 MB
node distances:
node   0   1   2
  0:  10  32  50
  1:  32  10  60
  2: 255 255  10

Microbenchmark details
----------------------
Single-threaded application that allocates memory on both DRAM and CXL
nodes using mmap(MAP_POPULATE). Every 1G region of allocated memory on
the CXL node is accessed at 4K granularity, randomly and repetitively,
to build up the notion of hotness in the 1G region that is under
access. This should drive promotion. For promotion to work
successfully, the DRAM memory that has been provisioned (and is not
being accessed) should be demoted first. There is enough free memory
in the CXL node for demotions.

In summary, this benchmark creates memory pressure on the DRAM node
and does CXL memory accesses to drive both demotion and promotion.

The number of accesses is fixed and hence, the quicker the accessed
pages get promoted to DRAM, the sooner the benchmark is expected to
finish.

DRAM-node                = 1
CXL-node                 = 2
Initial DRAM alloc ratio = 75%
Allocation-size          = 171798691840
Initial DRAM Alloc-size  = 128849018880
Initial CXL Alloc-size   = 42949672960
Hot-region-size          = 1073741824
Nr-regions               = 160
Nr-regions DRAM          = 120 (provisioned but not accessed)
Nr-hot-regions CXL       = 40
Access pattern           = random
Access granularity       = 4096
Delay b/n accesses       = 0
Load/store ratio         = 50l50s
THP used                 = no
Nr accesses              = 42949672960
Nr repetitions           = 1024

Hotness sources
---------------
NUMAB0  - Without NUMA Balancing in base case and with no source
          enabled in the patched case. No migrations.
NUMAB2  - Existing hot page promotion for the base case and use of
          hint faults as source in the patched case.
pgtscan - Klruscand (MGLRU based PTE A bit scanning) source
hwhints - IBS as source

Time taken (microseconds, lower is better)
----------------------------------------------
Source    Base          Patched       Change
----------------------------------------------
NUMAB0    63,036,030    64,441,675    +2.2%
NUMAB2    62,286,691    68,786,394    +10.4%(#)
pgtscan   NA            68,702,226
hwhints   NA            67,455,607
----------------------------------------------

Pages migrated (pgpromote_success)
----------------------------------------------
Source    Base          Patched
----------------------------------------------
NUMAB0    0             0
NUMAB2    82,134(*)     0(#)
pgtscan   NA            6,561,136
hwhints   NA            3,293($)
----------------------------------------------
(#) Unlike base NUMAB2, pghot migrates after 2 accesses. Getting two
    successive accesses within the observation window is hard with
    NUMA hint faults. The default sysctl_numa_balancing_scan_size of
    256MB is too small to obtain a significant number of hint faults.
(*) High run-to-run variation, so the average isn't really
    representative. Hint fault latency mostly comes out higher than
    the default 1s threshold, preventing migrations.
($) Sampling limitation

Pages demoted (pgdemote_kswapd+pgdemote_direct)
(This data is not really a comparison point; these numbers are
provided only to show that the workload results in both promotion
and demotion)
----------------------------------------------
Source    Base          Patched
----------------------------------------------
NUMAB0    5,222,366     5,341,502
NUMAB2    5,256,310     5,325,845
pgtscan   NA            5,317,709
hwhints   NA            5,287,091
----------------------------------------------

Promotion candidate pages (pgpromote_candidate)
----------------------------------------------
Source    Base          Patched
----------------------------------------------
NUMAB0    0             0
NUMAB2    82,848        0
pgtscan   NA            0
hwhints   NA            0
----------------------------------------------

Non-rate limited Promotion candidate pages (pgpromote_candidate_nrl)
----------------------------------------------
Source    Base          Patched
----------------------------------------------
NUMAB0    0             0
NUMAB2    0             0
pgtscan   NA            6,561,147
hwhints   NA            3,292
----------------------------------------------
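
For anyone who wants to collect the same kind of numbers, here is a
minimal sketch of how the counters in the tables above could be read
from /proc/vmstat before and after a run. It is not the script used
for the results in this mail. The helper names (parse_vmstat,
counter_deltas, pages_demoted, percent_change, read_vmstat) are
made up for illustration, and it assumes that pgpromote_candidate_nrl
is present only on a kernel carrying this series, so a missing counter
is read as 0.

```python
# Counters reported in the tables above. pgpromote_candidate_nrl is assumed
# to exist only with this patch series applied; missing names read as 0.
COUNTERS = (
    "pgpromote_success",
    "pgpromote_candidate",
    "pgpromote_candidate_nrl",
    "pgdemote_kswapd",
    "pgdemote_direct",
)


def parse_vmstat(text):
    """Parse the 'name value' lines of /proc/vmstat into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            stats[fields[0]] = int(fields[1])
    return stats


def counter_deltas(before, after, names=COUNTERS):
    """Per-counter difference between two parsed snapshots."""
    return {name: after.get(name, 0) - before.get(name, 0) for name in names}


def pages_demoted(deltas):
    """Demotions as summed in the table: pgdemote_kswapd + pgdemote_direct."""
    return deltas["pgdemote_kswapd"] + deltas["pgdemote_direct"]


def percent_change(base, patched):
    """Change of the patched run time relative to the base run time, in %."""
    return (patched - base) / base * 100.0


def read_vmstat(path="/proc/vmstat"):
    """Take one snapshot of the system-wide counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())
```

Usage would be to call read_vmstat() once before starting the
benchmark and once after it exits, pass both snapshots to
counter_deltas(), and feed the measured run times of the base and
patched kernels to percent_change(). Since the counters are
system-wide, the system should otherwise be idle during the run.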