Date: Fri, 13 Feb 2026 13:56:08 +0100
From: Frederic Weisbecker
To: Shubhang Kaushik
Cc: Anna-Maria Behnsen, Ingo Molnar, Thomas Gleixner, Vincent Guittot, Valentin Schneider, dietmar.eggemann@arm.com, bsegall@google.com, mgorman@suse.de, rostedt@goodmis.org, Christoph Lameter, linux-kernel@vger.kernel.org, Adam Li
Subject: Re: [RESEND PATCH] tick/nohz: Fix wrong NOHZ idle CPU state
References: <20260203-fix-nohz-idle-v1-1-ad05a5872080@os.amperecomputing.com>

On Thu, Feb 12, 2026 at 11:36:06AM -0800, Shubhang Kaushik wrote:
> Hi Frederic,
>
> On Thu, 12 Feb 2026, Frederic Weisbecker wrote:
>
> > > Tested on Ampere Altra on 6.19.0-rc8 with CONFIG_NO_HZ_FULL enabled:
> > > - This change improves load distribution by ensuring that tickless idle
> > >   CPUs are visible to NOHZ idle load balancing. In llama-batched-bench,
> > >   throughput improves by up to ~14% across multiple thread counts.
> > > - Hackbench single-process results improve by 5% and multi-process
> > >   results improve by up to ~26%, consistent with reduced scheduler
> > >   jitter and earlier utilization of fully idle cores.
> > > No regressions observed.
> >
> > Because you rely on dynamic placement of isolated tasks throughout
> > isolated CPUs by the scheduler.
> >
> > But nohz_full is designed for running only one task per isolated CPU
> > without any disturbance. And migration is a significant disturbance.
> > This is why nohz_full tries not to be too smart and assumes that task
> > placement is entirely within the hands of the user.
> >
> > So I have to ask, what prevents you from using static task placement
> > in your workload?
>
> Actually, the llama-batched-bench results I shared already included
> static affinity testing via numactl -C.
>
> Even with static placement, we observe this ~14% throughput improvement.
> This suggests that the issue isn't about the scheduler trying to be smart
> with task migration, but rather about the side effects of an idle CPU
> being absent from nohz.idle_cpus_mask.
>
> When nohz_full CPUs enter idle but aren't correctly accounted for in the
> idle mask, it appears to cause unnecessary overhead or interference in
> the NOHZ load balancing logic for the CPUs that are still running tasks.
> By ensuring the idle state is correctly tracked, we're not encouraging
> migration, but rather ensuring the scheduler's global state accurately
> reflects reality.

Then there seems to be something else going on that we don't fully
understand, because isolated CPUs run one pinned task per CPU and the only
housekeeping CPU is CPU 0. So there is nothing to balance here.

Perhaps some CPUs spend too much time scanning through all isolated CPUs
to see if there is balancing to do. I don't know; this needs further
investigation.

But if the nohz_full CPUs are correctly domain isolated, as they should be
(through isolcpus=domain or cpuset isolated partitions), they should be
invisible to the ilb anyway.

Thanks.

-- 
Frederic Weisbecker
SUSE Labs
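[Editor's note: as a concrete illustration of the setup discussed above
(domain isolation via isolcpus=domain or a cpuset isolated partition, plus
static placement with numactl -C), a sketch follows. The CPU range 1-79 and
the ./worker binary are assumptions for an 80-CPU machine with CPU 0 as the
sole housekeeping CPU, not details taken from this thread.]

```shell
# Boot-time isolation: CPU 0 stays housekeeping; CPUs 1-79 are tickless
# and removed from the scheduler domains, so they are invisible to the
# NOHZ idle load balancer (ilb). Add to the kernel command line:
#
#   nohz_full=1-79 isolcpus=domain,1-79

# Runtime alternative: a cgroup v2 cpuset isolated partition (needs root
# and a cgroup2 mount at /sys/fs/cgroup).
echo "+cpuset" > /sys/fs/cgroup/cgroup.subtree_control
mkdir /sys/fs/cgroup/isolated
echo 1-79 > /sys/fs/cgroup/isolated/cpuset.cpus
echo isolated > /sys/fs/cgroup/isolated/cpuset.cpus.partition

# Static placement: pin exactly one worker per isolated CPU so the
# scheduler never has to migrate anything.
numactl -C 1 ./worker &
numactl -C 2 ./worker &
```

Either form takes the isolated CPUs out of the scheduler domains, which is
why Frederic notes that, done correctly, those CPUs should never be scanned
by the idle load balancer in the first place.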