From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 19 Dec 2024 20:43:52 +0100
From: Andrea Righi
To: Yury Norov
Cc: Tejun Heo, David Vernet, Changwoo Min, Ingo Molnar, Peter Zijlstra,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/6] sched/topology: introduce for_each_numa_hop_node() / sched_numa_hop_node()
References: <20241217094156.577262-1-arighi@nvidia.com> <20241217094156.577262-2-arighi@nvidia.com>
Content-Type: text/plain; charset=us-ascii
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Dec 19, 2024 at 10:26:59AM -0800, Yury Norov wrote:
> On Wed, Dec 18, 2024 at 06:04:53AM -1000, Tejun Heo wrote:
> > Hello,
> >
> > On Wed, Dec 18, 2024 at 11:23:40AM +0100, Andrea Righi wrote:
> > ...
> > > > So, this would work but given that there is nothing dynamic about this
> > > > ordering, would it make more sense to build the ordering and store it
> > > > per-node? Then, the iteration just becomes walking that array.
> > >
> > > I've also considered doing that. I don't know if it'd work with offline
> > > nodes, but maybe we can just check node_online(node) at each iteration and
> > > skip those that are not online.
>
> for_each_numa_hop_mask() only traverses N_CPU nodes, and N_CPU nodes have
> proper distances.
>
> I think that for_each_numa_hop_node() should match for_each_numa_hop_mask().
> It would be good to cross-test them to ensure that they generate the same
> order at least for N_CPU nodes.

It'd be nice to have a KUnit test for this. I can take a look, but in a
separate patch; I think we can add it later.

>
> If you think that for_each_numa_hop_node() should traverse non-N_CPU nodes,
> you need a 'node_state' parameter. This will allow to make sure that at
> least N_CPU portion works correctly.
>
> > Yeah, there can be e.g. for_each_possible_node_by_dist() where nodes with
> > unknown distances (offline ones?) are put at the end and then there's also
> > for_each_online_node_by_dist() which filters out offline ones, and the
> > ordering can be updated from a CPU hotplug callback.
>
> We can assign UINT_MAX for those nodes I guess?
>
> > The ordering can be
> > probably put in an rcu protected array? I'm not sure what's the
> > synchronization convention around node on/offlining. Is that protected
> > together with CPU on/offlining?
>
> The machinery is already there, we just need another array of nodemasks -
> sched_domains_numa_nodes in addition to sched_domains_numa_masks. The
> last one is already protected by RCU, and we need to update new array every
> time when sched_domains_numa_masks updated.
>
> > Given that there usually aren't that many nodes, the current implementation
> > is probably fine too, so please feel free to ignore this suggestion for now
> > too.
>
> I agree. The number of nodes on typical system is 1 or 2. Even if
> it's 8, the Andrea's bubble sort will be still acceptable. So, I'm
> OK with O(N^2) if you guys OK with it. I only would like to have
> this choice explained in commit message.

Good point, I'll add a comment about that.

Thanks,
-Andrea
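
P.S. For reference, the "build the ordering once and store it per-node" idea
discussed above could look roughly like the userspace sketch below. This is
only an illustration under assumptions, not code from the series: the names
build_node_order() and DIST_UNKNOWN and the flattened distance matrix are
made up, nodes with unknown distance are given UINT_MAX so that they sort
last, and the sort is a plain O(N^2) insertion sort rather than the bubble
sort in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical marker for nodes without a known distance (e.g. offline). */
#define DIST_UNKNOWN UINT_MAX

/*
 * Fill order[] with every node id, sorted by increasing distance from
 * node 'from'. 'dist' is a flattened nr_nodes x nr_nodes matrix, so the
 * distance from a to b is dist[a * nr_nodes + b].
 *
 * Nodes at equal distance keep ascending id order, and DIST_UNKNOWN
 * nodes end up at the tail. Insertion sort, O(N^2) in the node count.
 */
static void build_node_order(const unsigned int *dist, size_t nr_nodes,
			     size_t from, size_t *order)
{
	const unsigned int *row;

	assert(from < nr_nodes);
	row = &dist[from * nr_nodes];

	for (size_t node = 0; node < nr_nodes; node++) {
		size_t pos = node;

		/* Shift farther nodes right to make room for 'node'. */
		while (pos > 0 && row[order[pos - 1]] > row[node]) {
			order[pos] = order[pos - 1];
			pos--;
		}
		order[pos] = node;
	}
}
```

With an array like this computed once per node (and recomputed from a
hotplug callback), the iterator would just walk order[] and skip whatever
state it does not want, instead of searching for the next closest node at
every step.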