From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 27 Jan 2026 16:17:48 +0100
From: Peter Zijlstra
To: Mario Roy
Cc: Chris Mason, Joseph Salisbury, Adam Li, Hazem Mohamed Abuelfotoh,
	Josh Don, mingo@redhat.com, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vschneid@redhat.com, linux-kernel@vger.kernel.org,
	kprateek.nayak@amd.com
Subject: Re: [PATCH 4/4] sched/fair: Proportional newidle balance
Message-ID: <20260127151748.GA1079264@noisy.programming.kicks-ass.net>
References: <20251107160645.929564468@infradead.org>
	<20251107161739.770122091@infradead.org>
	<8760001e-0274-454c-a4e4-1f38a9695b88@gmail.com>
	<20260123105046.GM171111@noisy.programming.kicks-ass.net>
	<20260123110306.GA217302@noisy.programming.kicks-ass.net>
	<20260127104041.GD217302@noisy.programming.kicks-ass.net>
In-Reply-To: <20260127104041.GD217302@noisy.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Jan 27,
2026 at 11:40:41AM +0100, Peter Zijlstra wrote:
> On Fri, Jan 23, 2026 at 12:03:06PM +0100, Peter Zijlstra wrote:
> > On Fri, Jan 23, 2026 at 11:50:46AM +0100, Peter Zijlstra wrote:
> > > On Sun, Jan 18, 2026 at 03:46:22PM -0500, Mario Roy wrote:
> > > > The patch "Proportional newidle balance" introduced a regression
> > > > with Linux 6.12.65 and 6.18.5. There is noticeable regression with
> > > > easyWave testing. [1]
> > > >
> > > > The CPU is AMD Threadripper 9960X CPU (24/48). I followed the source
> > > > to install easyWave [2]. That is fetching the two tar.gz archives.
> > >
> > > What is the actual configuration of that chip? Is it like 3*8 or 4*6
> > > (CCX wise). A quick google couldn't find me the answer :/
> >
> > Obviously I found it right after sending this. It's a 4x6 config.
> > Meaning it needs newidle to balance between those 4 domains.
>
> So with the below patch on top of my Xeon w7-2495X (which is 24-core
> 48-thread) I too have 4 LLC :-)
>
> And I think I can see a slight difference, but nowhere near as terrible.
>
> Let me go stick some tracing on.

Does this help some?

Turns out, this easywave thing has a very low newidle rate, but then also
a fairly low success rate. But since it doesn't do it that often, the
cost isn't that significant, so we might as well always do it.

This adds a second term to the ratio computation that takes time into
account. For low rate newidle this term will dominate, while for higher
rate the success ratio is more important.

Chris, afaict this still DTRT for schbench, but if this works for Mario,
could you also re-run things at your end?

[ the 4 'second' thing is a bit random, but looking at the timings
  between easywave and schbench this seems to be a reasonable middle
  ground. Although I think 8 'seconds' -- 23 shift -- would also work.
  That would give:

    1024 -  8 s   -   64 Hz
     512 -  4 s   -  128 Hz
     256 -  2 s   -  256 Hz
     128 -  1 s   -  512 Hz
      64 - .5 s   - 1024 Hz
      32 - .25 s  - 2048 Hz
]

---
diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 45c0022b91ce..a1e1032426dc 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -95,6 +95,7 @@ struct sched_domain {
 	unsigned int newidle_call;
 	unsigned int newidle_success;
 	unsigned int newidle_ratio;
+	u64 newidle_stamp;
 	u64 max_newidle_lb_cost;
 	unsigned long last_decay_max_lb_cost;
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index eca642295c4b..ab9cf06c6a76 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12224,8 +12224,31 @@ static inline void update_newidle_stats(struct sched_domain *sd, unsigned int su
 	sd->newidle_call++;
 	sd->newidle_success += success;
 
 	if (sd->newidle_call >= 1024) {
-		sd->newidle_ratio = sd->newidle_success;
+		u64 now = sched_clock();
+		s64 delta = now - sd->newidle_stamp;
+		sd->newidle_stamp = now;
+		int ratio = 0;
+
+		if (delta < 0)
+			delta = 0;
+
+		if (sched_feat(NI_RATE)) {
+			/*
+			 * ratio  delta   freq
+			 *
+			 * 1024 -  4 s   -  128 Hz
+			 *  512 -  2 s   -  256 Hz
+			 *  256 -  1 s   -  512 Hz
+			 *  128 - .5 s   - 1024 Hz
+			 *   64 - .25 s  - 2048 Hz
+			 */
+			ratio = delta >> 22;
+		}
+
+		ratio += sd->newidle_success;
+
+		sd->newidle_ratio = min(1024, ratio);
 		sd->newidle_call /= 2;
 		sd->newidle_success /= 2;
 	}
@@ -12932,7 +12959,7 @@ static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf)
 		if (sd->flags & SD_BALANCE_NEWIDLE) {
 			unsigned int weight = 1;
 
-			if (sched_feat(NI_RANDOM)) {
+			if (sched_feat(NI_RANDOM) && sd->newidle_ratio < 1024) {
 				/*
 				 * Throw a 1k sided dice; and only run
 				 * newidle_balance according to the success
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index 980d92bab8ab..7aba7523c6c1 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -126,3 +126,4 @@ SCHED_FEAT(LATENCY_WARN, false)
  * Do newidle balancing proportional to its success rate using randomization.
  */
 SCHED_FEAT(NI_RANDOM, true)
+SCHED_FEAT(NI_RATE, true)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index cf643a5ddedd..05741f18f334 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -4,6 +4,7 @@
  */
 
 #include
+#include
 #include
 
 #include "sched.h"
@@ -1637,6 +1638,7 @@ sd_init(struct sched_domain_topology_level *tl,
 	struct sched_domain *sd = *per_cpu_ptr(sdd->sd, cpu);
 	int sd_id, sd_weight, sd_flags = 0;
 	struct cpumask *sd_span;
+	u64 now = sched_clock();
 
 	sd_weight = cpumask_weight(tl->mask(tl, cpu));
 
@@ -1674,6 +1676,7 @@ sd_init(struct sched_domain_topology_level *tl,
 		.newidle_call = 512,
 		.newidle_success = 256,
 		.newidle_ratio = 512,
+		.newidle_stamp = now,
 
 		.max_newidle_lb_cost = 0,
 		.last_decay_max_lb_cost = jiffies,
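
For anyone who wants to poke at the numbers without booting a kernel, the
ratio arithmetic from the fair.c hunk can be modelled in userspace roughly
as below. This is only a sketch under a few assumptions:
newidle_ratio_sketch(), newidle_ratio_examples(), NI_RATIO_MAX and
NI_RATE_SHIFT are names made up here and are not kernel symbols, the
NI_RATE feature bit is reduced to a plain int argument, and the sum is
kept in 64 bits where the patch uses an int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; none of these exist in the kernel. */
#define NI_RATIO_MAX	1024	/* ratio at which newidle balance always runs */
#define NI_RATE_SHIFT	22	/* 2^22 ns per ratio unit, so 2^32 ns saturates */

/*
 * Userspace model of the value stored in sd->newidle_ratio when
 * newidle_call reaches 1024: a time term (elapsed ns >> shift) plus the
 * success count, clamped to NI_RATIO_MAX.
 */
static inline int newidle_ratio_sketch(int64_t delta_ns, unsigned int success,
				       int ni_rate)
{
	int64_t ratio = 0;

	/* Mirror the patch: a backwards clock is treated as zero elapsed time. */
	if (delta_ns < 0)
		delta_ns = 0;

	if (ni_rate)
		ratio = delta_ns >> NI_RATE_SHIFT;

	ratio += success;

	return ratio < NI_RATIO_MAX ? (int)ratio : NI_RATIO_MAX;
}

/* A few worked cases. */
static inline void newidle_ratio_examples(void)
{
	/* 2^32 ns (about 4.3 s) between updates saturates on time alone. */
	assert(newidle_ratio_sketch((int64_t)1 << 32, 0, 1) == 1024);
	/* 2^30 ns (about 1.07 s) contributes 256 on top of 100 successes. */
	assert(newidle_ratio_sketch((int64_t)1 << 30, 100, 1) == 356);
	/* With the time term disabled only the success count remains. */
	assert(newidle_ratio_sketch((int64_t)1 << 30, 100, 0) == 100);
	/* Negative deltas are clamped to zero. */
	assert(newidle_ratio_sketch(-5, 300, 1) == 300);
}
```

In this model a window of about 4.3 s or longer pins the ratio at 1024 by
itself, which corresponds to the low-rate case described above; for
shorter windows the success count is the main contributor.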