From: Masahito S
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org
Cc: dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
	mgorman@suse.de, vschneid@redhat.com, kprateek.nayak@amd.com,
	linux-kernel@vger.kernel.org, Masahito S
Subject: [PATCH] sched/idle: Fix avg_idle saturation by establishing symmetric idle entry hook
Date: Fri, 17 Apr 2026 11:06:54 +0900
Message-Id: <20260417020654.911709-1-firelzrd@gmail.com>

update_rq_avg_idle(), called from put_prev_task_idle(), updates
rq->avg_idle from the delta rq_clock() - rq->idle_stamp. However,
idle_stamp is only set by sched_balance_newidle() when a CPU enters
CPU_NEWLY_IDLE through the fair class path. When the idle task is
preempted without sched_balance_newidle() having run (boot, hotplug,
sched class transitions), idle_stamp remains 0, producing a delta equal
to rq_clock() (a value in the billions of nanoseconds), which saturates
avg_idle at 2 * max_idle_balance_cost.

This inflated avg_idle prevents sched_balance_newidle() from
early-returning (fair.c: avg_idle < max_newidle_lb_cost check), making
it overly aggressive.
The resulting excess newidle migrations override wake-time placement
decisions made by select_idle_sibling(), degrading cache locality that
careful placement (recent_used_cpu, select_idle_core, etc.) is designed
to preserve.

Fix this by:

1. Adding an idle_stamp validity guard to update_rq_avg_idle(), so that
   a zero idle_stamp is never used as a timestamp.

2. Setting idle_stamp in set_next_task_idle() when it has not already
   been set by sched_balance_newidle().

This establishes a symmetric idle entry/exit contract:
set_next_task_idle() marks the start of the idle period,
put_prev_task_idle() measures and records it via update_rq_avg_idle().
The entry hook preserves idle_stamp if sched_balance_newidle() has
already set it, maintaining the existing semantics where balance-attempt
duration is included in the idle measurement.
---
 kernel/sched/core.c | 3 +++
 kernel/sched/idle.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 496dff740d..ec801f731c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3633,6 +3633,9 @@ static inline void ttwu_do_wakeup(struct task_struct *p)
 
 void update_rq_avg_idle(struct rq *rq)
 {
+	if (!rq->idle_stamp)
+		return;
+
 	u64 delta = rq_clock(rq) - rq->idle_stamp;
 	u64 max = 2*rq->max_idle_balance_cost;
 
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index a83be0c834..9ceb7e6224 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -491,6 +491,9 @@ static void set_next_task_idle(struct rq *rq, struct task_struct *next, bool fir
 	schedstat_inc(rq->sched_goidle);
 	next->se.exec_start = rq_clock_task(rq);
 
+	if (!rq->idle_stamp)
+		rq->idle_stamp = rq_clock(rq);
+
 	/*
 	 * rq is about to be idle, check if we need to update the
 	 * lost_idle_time of clock_pelt
-- 
2.34.1
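
For readers who want to see the saturation numerically, the following is
a minimal user-space sketch of the behaviour described in the changelog.
It is not kernel code: struct model_rq and every model_* function are
hypothetical stand-ins, the 1/8-weight averaging rule is an assumption
about how the average is updated, and the clock and cost values are
made-up examples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct rq fields the patch touches. */
struct model_rq {
	uint64_t clock;			/* models rq_clock(rq), in ns */
	uint64_t idle_stamp;		/* 0 means "no idle period recorded" */
	uint64_t avg_idle;		/* running average of idle time, in ns */
	uint64_t max_idle_balance_cost;	/* avg_idle is clamped to twice this */
};

/* Assumed averaging rule: move 1/8 of the way toward the new sample. */
static void model_update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg += diff / 8;
}

/* Models the idle exit path; 'guard' selects the patched behaviour. */
static void model_update_rq_avg_idle(struct model_rq *rq, int guard)
{
	uint64_t delta, max;

	if (guard && !rq->idle_stamp)
		return;

	delta = rq->clock - rq->idle_stamp;
	max = 2 * rq->max_idle_balance_cost;

	model_update_avg(&rq->avg_idle, delta);
	if (rq->avg_idle > max)
		rq->avg_idle = max;
	rq->idle_stamp = 0;
}

/* Models the idle entry hook: keep a stamp that is already set. */
static void model_set_next_task_idle(struct model_rq *rq)
{
	if (!rq->idle_stamp)
		rq->idle_stamp = rq->clock;
}

/* Idle exit with idle_stamp == 0, five seconds after boot. */
static uint64_t model_exit_without_entry(int guard)
{
	struct model_rq rq = {
		.clock = 5000000000ULL,
		.idle_stamp = 0,
		.avg_idle = 100000,
		.max_idle_balance_cost = 500000,
	};

	model_update_rq_avg_idle(&rq, guard);
	/* The clamp (or the early return) keeps the average bounded. */
	assert(rq.avg_idle <= 2 * rq.max_idle_balance_cost);
	return rq.avg_idle;
}

/* Entry hook sets the stamp, the CPU idles for 200 us, then exits. */
static uint64_t model_entry_then_exit(void)
{
	struct model_rq rq = {
		.clock = 5000000000ULL,
		.idle_stamp = 0,
		.avg_idle = 100000,
		.max_idle_balance_cost = 500000,
	};

	model_set_next_task_idle(&rq);
	rq.clock += 200000;
	model_update_rq_avg_idle(&rq, 1);
	return rq.avg_idle;
}
```

With these example values, model_exit_without_entry(0) returns 1000000
(the clamp at twice the balance cost), model_exit_without_entry(1)
leaves the average at 100000, and model_entry_then_exit() returns
112500, which reflects the 200 us that were actually spent idle.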