Date: Wed, 13 May 2026 13:13:49 +0200
From: Andrea Righi
To: Juri Lelli
Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	K Prateek Nayak, Frederic Weisbecker, linux-kernel@vger.kernel.org,
	David Haufe, Cao Ruichuang, Furkan Çalışkan
Subject: Re: [PATCH v2] sched/deadline: Make dl-server nohz full aware
References: <20260513-upstream-fix-dlserver-nohzfull-b4-v2-1-d3e9cbe5c845@redhat.com>
In-Reply-To: <20260513-upstream-fix-dlserver-nohzfull-b4-v2-1-d3e9cbe5c845@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Juri,

On Wed, May 13, 2026 at 11:13:03AM +0200, Juri Lelli wrote:
> The dl_server_timer() originally caused spurious IPIs on nohz_full
> cores, breaking isolation guarantees. While such IPIs cannot be observed
> on recent kernels, dl-server timers for tick-stopped isolated CPUs still
> fire unnecessarily on housekeeping cores.
>
> The problem is that dl-servers are not coordinated with nohz_full tick
> state. Even when the tick stops on an isolated CPU, its dl-server timer
> continues to fire on housekeeping, wasting cycles and potentially
> affecting housekeeping CPU performance.
>
> Fix by managing servers in sched_can_stop_tick():
>
> - When RT tasks run with CFS/SCX tasks, start the appropriate server(s)
>   and keep the tick running
> - When only RT tasks remain, stop all servers and allow tick to stop
>   (except for >1 RR tasks which need the tick for round-robin)
> - When only CFS/SCX tasks remain, stop all servers before stopping tick
>
> Introduce dl_servers_stop_all() to reduce duplication and abstract
> server management from core.c. Unify RT handling into one block that
> handles both RR and FIFO cases.
>
> Note on SCX: While SCX is incompatible with isolcpus=domain, it does
> support nohz_full. The ext_server handling in this patch targets
> nohz_full configurations without domain isolation.
>
> Fixes: 557a6bfc662c ("sched/fair: Add trivial fair server")
> Reported-by: David Haufe
> Closes: https://lore.kernel.org/lkml/CAKJHwtOw_G67edzuHVtL1xC5Vyt6StcZzihtDd0yaKudW=rwVw@mail.gmail.com
> Signed-off-by: Juri Lelli

From a sched_ext perspective LGTM.

Reviewed-by: Andrea Righi

Thanks,
-Andrea

> ---
> Changes from v1 [1]
>
> - Fix CFS/SCX server start logic to handle both simultaneously in
>   partial switch mode (Furkan)
> - Clarify in commit message that SCX supports nohz_full despite
>   isolcpus=domain incompatibility (Andrea)
>
> 1 - https://lore.kernel.org/lkml/20260512-upstream-fix-dlserver-nohzfull-b4-v1-1-a94844387ae7@redhat.com/
> ---
>  kernel/sched/core.c     | 46 +++++++++++++++++++++++++++-------------------
>  kernel/sched/deadline.c | 14 ++++++++++++++
>  kernel/sched/sched.h    |  1 +
>  3 files changed, 42 insertions(+), 19 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index b905805bbcbe4..6d05ce9b1dfe6 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1414,30 +1414,40 @@ static inline bool __need_bw_check(struct rq *rq, struct task_struct *p)
>
>  bool sched_can_stop_tick(struct rq *rq)
>  {
> -	int fifo_nr_running;
> -
>  	/* Deadline tasks, even if single, need the tick */
>  	if (rq->dl.dl_nr_running)
>  		return false;
>
>  	/*
> -	 * If there are more than one RR tasks, we need the tick to affect the
> -	 * actual RR behaviour.
> +	 * If there are RT tasks, we may need the tick (for >1 RR tasks),
> +	 * but we must also service lower-priority CFS/SCX tasks via dl-servers.
>  	 */
> -	if (rq->rt.rr_nr_running) {
> -		if (rq->rt.rr_nr_running == 1)
> -			return true;
> -		else
> +	if (rq->rt.rt_nr_running) {
> +		bool cfs_or_scx_queued = false;
> +
> +		if (rq->cfs.h_nr_queued) {
> +			dl_server_start(&rq->fair_server);
> +			cfs_or_scx_queued = true;
> +		}
> +#ifdef CONFIG_SCHED_CLASS_EXT
> +		if (rq->scx.nr_running) {
> +			dl_server_start(&rq->ext_server);
> +			cfs_or_scx_queued = true;
> +		}
> +#endif
> +		if (cfs_or_scx_queued)
>  			return false;
> -	}
>
> -	/*
> -	 * If there's no RR tasks, but FIFO tasks, we can skip the tick, no
> -	 * forced preemption between FIFO tasks.
> -	 */
> -	fifo_nr_running = rq->rt.rt_nr_running - rq->rt.rr_nr_running;
> -	if (fifo_nr_running)
> +		/*
> +		 * Only RT tasks, no CFS/SCX. Stop servers to prevent spurious
> +		 * wakeups. Tick can stop for single RR or any FIFO, but must
> +		 * run for multiple RR (round-robin behavior).
> +		 */
> +		dl_servers_stop_all(rq);
> +		if (rq->rt.rr_nr_running > 1)
> +			return false;
>  		return true;
> +	}
>
>  	/*
>  	 * If there are no DL,RR/FIFO tasks, there must only be CFS or SCX tasks
> @@ -1462,6 +1472,7 @@ bool sched_can_stop_tick(struct rq *rq)
>  			return false;
>  	}
>
> +	dl_servers_stop_all(rq);
>  	return true;
>  }
>  #endif /* CONFIG_NO_HZ_FULL */
> @@ -8810,10 +8821,7 @@ int sched_cpu_dying(unsigned int cpu)
>  		WARN(true, "Dying CPU not properly vacated!");
>  		dump_rq_tasks(rq, KERN_WARNING);
>  	}
> -	dl_server_stop(&rq->fair_server);
> -#ifdef CONFIG_SCHED_CLASS_EXT
> -	dl_server_stop(&rq->ext_server);
> -#endif
> +	dl_servers_stop_all(rq);
>  	rq_unlock_irqrestore(rq, &rf);
>
>  	calc_load_migrate(rq);
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index edca7849b165d..c2b3d6bbe4828 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1826,6 +1826,20 @@ void dl_server_stop(struct sched_dl_entity *dl_se)
>  	dl_se->dl_server_active = 0;
>  }
>
> +/*
> + * Stop all dl-servers on this runqueue. Called when transitioning to a state
> + * where the tick can be stopped (e.g., single RR/FIFO task, or no RT tasks).
> + * This ensures server timers are disarmed and won't cause spurious wakeups on
> + * nohz_full isolated cores.
> + */
> +void dl_servers_stop_all(struct rq *rq)
> +{
> +	dl_server_stop(&rq->fair_server);
> +#ifdef CONFIG_SCHED_CLASS_EXT
> +	dl_server_stop(&rq->ext_server);
> +#endif
> +}
> +
>  void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq,
>  		    dl_server_pick_f pick_task)
>  {
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 9f63b15d309d1..26cf1d14efde5 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -412,6 +412,7 @@ extern void dl_server_update_idle(struct sched_dl_entity *dl_se, s64 delta_exec)
>  extern void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec);
>  extern void dl_server_start(struct sched_dl_entity *dl_se);
>  extern void dl_server_stop(struct sched_dl_entity *dl_se);
> +extern void dl_servers_stop_all(struct rq *rq);
>  extern void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq,
>  		    dl_server_pick_f pick_task);
>  extern void sched_init_dl_servers(void);
>
> ---
> base-commit: 4ac4d6549a6563878d7c19c154e017f6cb7114d3
> change-id: 20260513-upstream-fix-dlserver-nohzfull-b4-fa741a2b6189
>
> Best regards,
> --
> Juri Lelli
>
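
For anyone following the thread, the decision order in the quoted
sched_can_stop_tick() hunks can be sketched as a small userspace model.
This is only an illustration under stated assumptions: struct rq_model, its
field names, servers_stop_all_model() and can_stop_tick_model() are
hypothetical stand-ins, not kernel API, and the extra checks in the
CFS/SCX-only path that the quoted hunk does not show are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the run-queue counters the quoted patch reads. */
struct rq_model {
	unsigned int dl_nr_running;	/* deadline tasks */
	unsigned int rt_nr_running;	/* all RT tasks (FIFO + RR) */
	unsigned int rr_nr_running;	/* RR subset of rt_nr_running */
	unsigned int cfs_queued;	/* models rq->cfs.h_nr_queued */
	unsigned int scx_running;	/* models rq->scx.nr_running */
	bool fair_server_active;	/* models the fair dl-server state */
	bool ext_server_active;		/* models the ext dl-server state */
};

/* Models dl_servers_stop_all(): mark both servers as stopped. */
static void servers_stop_all_model(struct rq_model *rq)
{
	rq->fair_server_active = false;
	rq->ext_server_active = false;
}

/* Returns true when, in this simplified model, the tick may be stopped. */
static bool can_stop_tick_model(struct rq_model *rq)
{
	/* RR tasks are counted inside the RT total. */
	assert(rq->rr_nr_running <= rq->rt_nr_running);

	/* Deadline tasks, even a single one, need the tick. */
	if (rq->dl_nr_running)
		return false;

	if (rq->rt_nr_running) {
		bool cfs_or_scx_queued = false;

		/* RT plus lower-priority work: start servers, keep the tick. */
		if (rq->cfs_queued) {
			rq->fair_server_active = true;
			cfs_or_scx_queued = true;
		}
		if (rq->scx_running) {
			rq->ext_server_active = true;
			cfs_or_scx_queued = true;
		}
		if (cfs_or_scx_queued)
			return false;

		/* Only RT tasks: stop servers; more than one RR task needs the tick. */
		servers_stop_all_model(rq);
		return rq->rr_nr_running <= 1;
	}

	/*
	 * Only CFS/SCX tasks. The real function performs further checks
	 * here that the quoted hunk elides; this model skips them.
	 */
	servers_stop_all_model(rq);
	return true;
}
```

Calling can_stop_tick_model() on a model run queue with one RT task and
queued CFS work returns false and marks the fair server active, which is the
first case in the commit message; with only FIFO tasks it returns true and
both servers end up stopped.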