Date: Thu, 12 Feb 2026 15:33:59 +0100
From: Frederic Weisbecker
To: Shubhang Kaushik
Cc: Anna-Maria Behnsen, Ingo Molnar, Thomas Gleixner, Vincent Guittot,
	Valentin Schneider, dietmar.eggemann@arm.com, bsegall@google.com,
	mgorman@suse.de, rostedt@goodmis.org, Shubhang Kaushik,
	Christoph Lameter, linux-kernel@vger.kernel.org, Adam Li
Subject: Re: [RESEND PATCH] tick/nohz: Fix wrong NOHZ idle CPU state
In-Reply-To: <20260203-fix-nohz-idle-v1-1-ad05a5872080@os.amperecomputing.com>

On Tue, Feb 03, 2026 at 04:49:03PM -0800, Shubhang Kaushik wrote:
> Under CONFIG_NO_HZ_FULL, the scheduler tick can get stopped earlier via
> tick_nohz_full_stop_tick() before the CPU subsequently enters the idle
> path. In this case, tick_nohz_idle_stop_tick() observes TS_FLAG_STOPPED
> already set and skips nohz_balance_enter_idle() because the !was_stopped
> condition assumes tick-stop and idle-entry are coupled.
> This leaves a tickless idle CPU absent from nohz.idle_cpus_mask, making
> it invisible to NOHZ idle load balancing while periodic balancing is
> also suppressed.
>
> The patch fixes this by decoupling tick-stop transition accounting from
> scheduler bookkeeping. idle_jiffies remains updated only on the
> tick-stop transition, while nohz_balance_enter_idle() is invoked
> whenever a CPU enters idle with the tick already stopped, relying on its
> existing idempotent guard to avoid duplicate registration.
>
> Tested on Ampere Altra on 6.19.0-rc8 with CONFIG_NO_HZ_FULL enabled:
> - This change improves load distribution by ensuring that tickless idle
>   CPUs are visible to NOHZ idle load balancing. In llama-batched-bench,
>   throughput improves by up to ~14% across multiple thread counts.
> - Hackbench single-process results improve by 5% and multi-process
>   results improve by up to ~26%, consistent with reduced scheduler
>   jitter and earlier utilization of fully idle cores.
> No regressions observed.

That is because you rely on the scheduler to place isolated tasks
dynamically across isolated CPUs.

But nohz_full is designed for running only one task per isolated CPU
without any disturbance, and migration is a significant disturbance.
This is why nohz_full tries not to be too smart and assumes that task
placement is entirely in the hands of the user.

So I have to ask: what prevents you from using static task placement in
your workload?

I'm not saying it's undesirable or impossible to do adaptive userspace
dyntick for users that don't rely on ultra-low latency but rather on high
CPU-bound performance. In fact the initial purpose of nohz_full was HPC,
not real-time. It turns out that real-time is the only use case I have
seen so far, and yours is the first HPC one.

But adapting nohz_full dynamically for that will involve much more than
just load balancing.

Static affinity, on the other hand, should work for everyone.

Thanks.

>
> Signed-off-by: Shubhang Kaushik
> Signed-off-by: Adam Li
> Reviewed-by: Christoph Lameter (Ampere)
> Reviewed-by: Shubhang Kaushik
> ---
> This is a resend of the original patch to ensure visibility.
> Previous resend: https://lkml.org/lkml/2025/8/21/170
> Original thread: https://lkml.org/lkml/2025/8/21/171
>
> The patch addresses a performance regression in NOHZ idle load balancing
> observed under CONFIG_NO_HZ_FULL, where idle CPUs were becoming
> invisible to the balancer.
> ---
>  kernel/time/tick-sched.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 2f8a7923fa279409ffe950f770ff2eac868f6ece..eee6fcebe78c2f8d93464a55fe332e12fe9c164e 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1250,8 +1250,9 @@ void tick_nohz_idle_stop_tick(void)
>  		ts->idle_sleeps++;
>  		ts->idle_expires = expires;
>  
> -		if (!was_stopped && tick_sched_flag_test(ts, TS_FLAG_STOPPED)) {
> -			ts->idle_jiffies = ts->last_jiffies;
> +		if (tick_sched_flag_test(ts, TS_FLAG_STOPPED)) {
> +			if (!was_stopped)
> +				ts->idle_jiffies = ts->last_jiffies;
>  			nohz_balance_enter_idle(cpu);
>  		}
>  	} else {
>
> ---
> base-commit: 18f7fcd5e69a04df57b563360b88be72471d6b62
> change-id: 20260203-fix-nohz-idle-b2838276cb91
>
> Best regards,
> --
> Shubhang Kaushik
>

--
Frederic Weisbecker
SUSE Labs
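
For readers following the thread, the behavioural difference in the quoted hunk can be sketched as a small userspace model. This is only an illustration and not kernel code: `struct tick_model`, `model_balance_enter_idle()`, and the two `model_idle_stop_tick_*()` helpers are hypothetical names. The model also assumes that the idle path always ends with the tick stopped, and that registration is idempotent, as the commit message describes.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct tick_model {
	bool stopped;			/* stands in for TS_FLAG_STOPPED */
	bool in_idle_mask;		/* stands in for nohz.idle_cpus_mask membership */
	unsigned long idle_jiffies;
	unsigned long last_jiffies;
};

/* Stand-in for nohz_balance_enter_idle(): a second call is a no-op. */
static void model_balance_enter_idle(struct tick_model *ts)
{
	if (ts->in_idle_mask)
		return;
	ts->in_idle_mask = true;
}

/* Idle entry using the condition from the removed ("-") lines. */
static void model_idle_stop_tick_old(struct tick_model *ts)
{
	bool was_stopped = ts->stopped;

	ts->stopped = true;	/* assumption: the idle path leaves the tick stopped */
	if (!was_stopped && ts->stopped) {
		ts->idle_jiffies = ts->last_jiffies;
		model_balance_enter_idle(ts);
	}
}

/* Idle entry using the condition from the added ("+") lines. */
static void model_idle_stop_tick_new(struct tick_model *ts)
{
	bool was_stopped = ts->stopped;

	ts->stopped = true;	/* same assumption as above */
	if (ts->stopped) {
		/* idle_jiffies is still updated only on the stop transition */
		if (!was_stopped)
			ts->idle_jiffies = ts->last_jiffies;
		/* registration now happens on every idle entry */
		model_balance_enter_idle(ts);
	}
}
```

Starting from a model whose `stopped` field is already true (the tick was stopped earlier on the nohz_full path), the old helper leaves `in_idle_mask` false while the new helper sets it, and neither touches `idle_jiffies`. Starting from `stopped == false`, both helpers behave the same.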