From: K Prateek Nayak
Date: Thu, 23 Apr 2026 10:40:54 +0530
Subject: Re: [PATCH v2] sched/idle: Fix avg_idle saturation by establishing symmetric idle entry hook
To: Masahito S, John Stultz
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20260417020654.911709-1-firelzrd@gmail.com> <20260423023322.1293923-1-firelzrd@gmail.com>
In-Reply-To: <20260423023322.1293923-1-firelzrd@gmail.com>

Hello Masahito,

On 4/23/2026 8:03 AM, Masahito S wrote:
> update_rq_avg_idle(), called from put_prev_task_idle(), computes
> rq->avg_idle as rq_clock() - rq->idle_stamp. However, idle_stamp is
> only set by sched_balance_newidle() when a CPU enters CPU_NEWLY_IDLE
> through the fair class path. When the idle task is preempted without
> sched_balance_newidle() having run (boot, hotplug, sched class
> transitions), idle_stamp remains 0, producing a delta equal to
> rq_clock() — a value in the billions of nanoseconds — which saturates
> avg_idle at 2 * max_idle_balance_cost.

But these are rare cases right? Hotplug would anyways trigger a load
balance when the CPU comes online and the avg_idle will stabilize
thereafter. Boot happens once so it should be fine to saturate the
counter at boot, and idle task being preempted implies we are calling
put_prev_task_idle().
For the idle task to be picked again, it has to go through the pick for
the rest of the scheduler classes, which would do a newidle balance at
fair, right?

Maybe there is some case with sched-ext where this saturates but then
the counter only becomes relevant when the sched-ext scheduler is
unloaded.

With PROXY_EXEC, we do switch to idle between schedule() to take the
current tasks off the CPU so this does have some merit.

>
> This inflated avg_idle prevents sched_balance_newidle() from
> early-returning (fair.c: avg_idle < max_newidle_lb_cost check),
> making it overly aggressive. The resulting excess newidle migrations
> override wake-time placement decisions made by select_idle_sibling(),
> degrading cache locality that careful placement (recent_used_cpu,
> select_idle_core, etc.) is designed to preserve.
>
> Fix this by:
>
> 1. Adding an idle_stamp validity guard to update_rq_avg_idle(), so
>    that a zero idle_stamp is never used as a timestamp.
>
> 2. Setting idle_stamp in set_next_task_idle() when it has not already
>    been set by sched_balance_newidle(). This establishes a symmetric
>    idle entry/exit contract: set_next_task_idle() marks the start of
>    the idle period, put_prev_task_idle() measures and records it via
>    update_rq_avg_idle().
>
> The entry hook preserves idle_stamp if sched_balance_newidle() has
> already set it, maintaining the existing semantic where balance-attempt
> duration is included in the idle measurement.
>
> Signed-off-by: Masahito Suzuki
> ---
> Changes in v2:
> - Added missing Signed-off-by tag (no functional changes).
>   Thanks to Eric Naim and Christian Loehle for pointing this out.
>
>  kernel/sched/core.c | 3 +++
>  kernel/sched/idle.c | 3 +++
>  2 files changed, 6 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 496dff740d..ec801f731c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3633,6 +3633,9 @@ static inline void ttwu_do_wakeup(struct task_struct *p)
>
>  void update_rq_avg_idle(struct rq *rq)
>  {
> +	if (!rq->idle_stamp)
> +		return;
> +

I think this makes sense since we can be forced into idle and we don't
want to account that.

>  	u64 delta = rq_clock(rq) - rq->idle_stamp;
>  	u64 max = 2*rq->max_idle_balance_cost;
>
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index a83be0c834..9ceb7e6224 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -491,6 +491,9 @@ static void set_next_task_idle(struct rq *rq, struct task_struct *next, bool fir
>  	schedstat_inc(rq->sched_goidle);
>  	next->se.exec_start = rq_clock_task(rq);
>
> +	if (!rq->idle_stamp)
> +		rq->idle_stamp = rq_clock(rq);
> +

I don't think this is required because we can switch the donor to idle
task for PROXY_EXEC and we don't want to account that as a short idle
time unless there is another case I'm missing.

>  	/*
>  	 * rq is about to be idle, check if we need to update the
>  	 * lost_idle_time of clock_pelt

-- 
Thanks and Regards,
Prateek
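
For anyone following the arithmetic in the thread, here is a minimal
userspace sketch of the saturation and of the zero-idle_stamp guard from
the first hunk. It is not the kernel code: struct rq_model and the
model_* helpers are hypothetical names made up for illustration, and the
1/8 weighting of a new sample is an assumption about how the running
average is updated, not something the thread states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the rq fields named in the thread. */
struct rq_model {
	uint64_t clock;                 /* models rq_clock(rq), in ns */
	uint64_t idle_stamp;            /* 0 means no idle entry was recorded */
	uint64_t avg_idle;              /* running average of idle durations */
	uint64_t max_idle_balance_cost;
};

/* Running average; the 1/8 weight for a new sample is an assumption. */
static void model_update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg = (uint64_t)((int64_t)*avg + diff / 8);
}

/*
 * Models the update described in the patch. With guard == 0 a zero
 * idle_stamp is used as a timestamp, so delta equals the whole clock
 * value. With guard != 0 the update is skipped, as in the first hunk.
 */
static void model_update_rq_avg_idle(struct rq_model *rq, int guard)
{
	uint64_t delta, max;

	if (guard && !rq->idle_stamp)
		return;

	assert(rq->clock >= rq->idle_stamp);
	delta = rq->clock - rq->idle_stamp;
	max = 2 * rq->max_idle_balance_cost;

	model_update_avg(&rq->avg_idle, delta);
	if (rq->avg_idle > max)
		rq->avg_idle = max;     /* clamp: this is the saturation */
	rq->idle_stamp = 0;
}
```

With clock = 5000000000, idle_stamp = 0, avg_idle = 100000 and
max_idle_balance_cost = 500000, the unguarded call clamps avg_idle to
1000000 in a single step, while the guarded call leaves it at 100000.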