From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 21 May 2026 22:13:43 +0200
From: Andrea Righi
To: Marek Szyprowski
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, K Prateek Nayak, Christian Loehle, Phil Auld,
 Koba Ko, Felix Abecassis, Balbir Singh, Joel Fernandes,
 Shrikanth Hegde, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] sched/fair: Drop redundant RCU read lock in NOHZ kick path
References: <20260509180955.1840064-1-arighi@nvidia.com>
 <20260509180955.1840064-2-arighi@nvidia.com>
 <38fe0a1d-1a48-435a-910a-c278024d9ac9@samsung.com>
In-Reply-To: <38fe0a1d-1a48-435a-910a-c278024d9ac9@samsung.com>
Content-Type: text/plain; charset=iso-8859-1
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Marek,

On Thu, May 21, 2026 at 09:47:03PM +0200, Marek Szyprowski wrote:
> On 09.05.2026 20:07, Andrea Righi wrote:
> > nohz_balancer_kick() is reached from sched_balance_trigger(), which is
> > called from sched_tick(). sched_tick() runs with IRQs disabled, so the
> > additional rcu_read_lock/unlock() used around sched_domain accesses in
> > this path is redundant. Rely on the existing IRQ-disabled context (and
> > the rcu_dereference_all() checking) instead.
> >
> > The same applies to set_cpu_sd_state_idle(), called from the idle entry
> > path with IRQs disabled, and to set_cpu_sd_state_busy(), reachable via
> > nohz_balance_exit_idle() from two contexts: nohz_balancer_kick() (IRQs
> > disabled, as above) and sched_cpu_deactivate() (the CPUHP_AP_ACTIVE
> > teardown, which runs under cpus_write_lock(), so it cannot race with
> > sched-domain rebuilds). In both cases the rcu_dereference_all()
> > validation is sufficient.
> >
> > No functional change intended.
> >
> > Cc: Vincent Guittot
> > Cc: Dietmar Eggemann
> > Suggested-by: K Prateek Nayak
> > Reviewed-by: K Prateek Nayak
> > Signed-off-by: Andrea Righi
> This patch landed in today's linux-next as commit c9d93a73ce87 ("sched/fair: Drop
> redundant RCU read lock in NOHZ kick path"). In my tests I found that it introduced
> the following warning during the CPU hot-plug tests:
>
>
> root@target:~# for i in /sys/devices/system/cpu/cpu[1-9]; do echo 0 >$i/online; done
>
> =============================
> WARNING: suspicious RCU usage
> 7.1.0-rc2+ #12775 Not tainted
> -----------------------------
> kernel/sched/fair.c:12793 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
>
> rcu_scheduler_active = 2, debug_locks = 1
> 2 locks held by cpuhp/1/20:
>  #0: ffffffff81a16220 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x42/0x1ae
>  #1: ffffffff81a16270 (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x72/0x1ae
>
> stack backtrace:
> CPU: 1 UID: 0 PID: 20 Comm: cpuhp/1 Not tainted 7.1.0-rc2+ #12775 PREEMPTLAZY
> Hardware name: StarFive VisionFive 2 v1.2A (DT)
> Call Trace:
> [] dump_backtrace+0x1c/0x24
> [] show_stack+0x28/0x34
> [] dump_stack_lvl+0x5e/0x86
> [] dump_stack+0x14/0x1c
> [] lockdep_rcu_suspicious+0x14c/0x1b8
> [] nohz_balance_exit_idle+0xf4/0xf6
> [] sched_cpu_deactivate+0x6c/0x1c8
> [] cpuhp_invoke_callback+0xf8/0x1ce
> [] cpuhp_thread_fun+0x150/0x1ae
> [] smpboot_thread_fn+0x138/0x2a4
> [] kthread+0xea/0x10c
> [] ret_from_fork_kernel+0x22/0x386
> [] ret_from_fork_kernel_asm+0x16/0x18
> CPU1: off
> CPU2: off
> CPU3: off
>
> This issue is observed on most of my ARM 32bit, ARM 64bit and RiscV64 based boards.
>

Ah, yes, makes sense. We missed the CPU hotplug case. When CPUs are taken
offline, set_cpu_sd_state_busy() is invoked via:

  cpuhp/N kthread
    cpuhp_thread_fun()
      cpuhp_invoke_callback()
        sched_cpu_deactivate()
          nohz_balance_exit_idle()
            set_cpu_sd_state_busy()
              rcu_dereference_all(per_cpu(sd_llc, cpu))

The cpuhp kthread holds cpu_hotplug_lock, but runs with preemption and IRQs
enabled.

I think we should just restore the RCU read lock in
set_cpu_sd_state_{busy,idle}() to fix this. I'll send a patch soon.

Thanks,
-Andrea
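
For readers following the thread, the difference between the tick path and
the hotplug path can be modelled in userspace. This is only a minimal sketch
and not kernel code: every toy_* name is hypothetical, and it assumes that
the rcu_dereference_all() check is satisfied by an RCU read-side critical
section or by a context with IRQs or preemption disabled, and that holding
cpu_hotplug_lock alone does not satisfy it.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy userspace model of the calling context that a lockdep-style RCU
 * check inspects. Hypothetical names throughout; this is not the kernel
 * implementation.
 */
struct toy_ctx {
	int  rcu_read_depth;          /* nesting of toy_rcu_read_lock() */
	bool irqs_disabled;           /* e.g. the sched_tick() path */
	bool preempt_disabled;
	bool holds_cpu_hotplug_lock;  /* e.g. the cpuhp/N kthread */
};

static inline void toy_rcu_read_lock(struct toy_ctx *c)
{
	c->rcu_read_depth++;
}

static inline void toy_rcu_read_unlock(struct toy_ctx *c)
{
	assert(c->rcu_read_depth > 0);  /* unlock must pair with a lock */
	c->rcu_read_depth--;
}

/*
 * Returns true when a dereference would be treated as protected under the
 * assumption stated above. Note that holds_cpu_hotplug_lock is ignored on
 * purpose: in this model it may prevent concurrent updates, but it does
 * not make the caller an RCU reader.
 */
static inline bool toy_rcu_deref_ok(const struct toy_ctx *c)
{
	return c->rcu_read_depth > 0 ||
	       c->irqs_disabled ||
	       c->preempt_disabled;
}
```

In this model, a context that sets only irqs_disabled passes
toy_rcu_deref_ok(), while a context that sets only holds_cpu_hotplug_lock
fails it until toy_rcu_read_lock() is called. That mirrors the reasoning
above for restoring the RCU read lock in set_cpu_sd_state_{busy,idle}().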