From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <14af463a-d21e-4239-9091-a667f200acff@amd.com>
Date: Fri, 9 Jan 2026 15:51:30 +0530
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH] sched/fair: Fix vruntime drift by preventing double lag scaling during reweight
To: Zicheng Qu
References: <20251226001731.3730586-1-quzicheng@huawei.com> <0615d2c6-c963-46ff-9088-d85e3821eec8@amd.com> <23574b3c-e990-45bc-b3f5-8664781adddf@huawei.com>
Content-Language: en-US
From: K Prateek Nayak
In-Reply-To: <23574b3c-e990-45bc-b3f5-8664781adddf@huawei.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

Hello Zicheng,

On 1/9/2026 2:10 PM, Zicheng Qu wrote:
> Hi Prateek,
>
> On 1/9/2026 12:50 PM, K Prateek Nayak wrote:
>> If I'm not mistaken, the problem is that we'll see "curr->on_rq" and
>> then do:
>>
>>      if (curr && curr->on_rq)
>>          load += scale_load_down(curr->load.weight);
>>
>>      lag *= load + scale_load_down(se->load.weight);
>>
>>
>> which shouldn't be the case since we are accounting "se" twice when
>> it is also the "curr" and avg_vruntime() would have also accounted it
>> already since "curr->on_rq" and then we do everything twice for "se".
> Thanks for the analysis - I agree your concern is reasonable, but I
> think the issue here is slightly different from "accounting se twice",
> but a semantic mismatch in how place_entity() is used.
>
> place_entity() is meant to compensate lag for entities being inserted
> into the runqueue, accounting for the effect of a new entity on the
> weighted average vruntime. That assumption holds when an se is joining
> the rq. However, when se == cfs_rq->curr, the entity never left the
> runqueue and avg_vruntime() has not changed, so applying enqueue-style
> lag scaling is not appropriate.

I believe the intention is to discount the contribution of the task
and then re-account it after the reweight. I don't think se being the
"curr" makes it any different, except for the fact that its vruntime
and load contribution aren't reflected in the sum and are added in by
avg_vruntime().

>> I'm wondering if instead of adding a flag, we can do:
> Yes, I totally agree that adding a new flag is unnecessary. We
> can handle this directly in place_entity() by skipping lag scaling in
> case of `se == cfs_rq->curr`, for example:
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index da46c3164537..1b279bf43f38 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5123,6 +5123,15 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>
>                  lag = se->vlag;
>
> +                /*
> +                 * place_entity() compensates lag for entities being inserted into the
> +                 * runqueue. When se == cfs_rq->curr, the entity never left the rq and
> +                 * avg_vruntime() did not change, so enqueue-style lag scaling does not
> +                 * apply.
> +                 */
> +                if (se == cfs_rq->curr)
> +                        goto skip_lag_scale;

This affects the place_entity() call from enqueue_task() where se is
dequeued (se->on_rq == 0) but is still the curr. That can happen when
the rq drops its lock for newidle balance and a concurrent wakeup is
queuing the task.

You need to check for "se == cfs_rq->curr && se->on_rq" here and then
this bit should be good. Let me stare more at avg_vruntime() since
"curr->on_rq" would add its vruntime too in that calculation.

> +
>                  /*
>                   * If we want to place a task and preserve lag, we have to
>                   * consider the effect of the new entity on the weighted
> @@ -5185,6 +5194,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>                  lag = div_s64(lag, load);
>          }
>
> +skip_lag_scale:
>          se->vruntime = vruntime - lag;
>
>          if (se->rel_deadline) {
>
> Best regards,
> Zicheng

-- 
Thanks and Regards,
Prateek
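
For readers following the thread, here is a minimal user-space sketch of the arithmetic under discussion. It is not kernel code: `struct ent` and the `toy_*` helpers are hypothetical names invented for illustration, and the model assumes that the enqueue-style compensation is `lag * (W + w) / W`, with `W` being the load already on the queue, as in the quoted snippet. It only illustrates the concern raised above and is not a verified trace of fair.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scheduling entity; NOT struct sched_entity. */
struct ent {
	int64_t vruntime;
	int64_t weight;
	bool on_rq;
};

/* Toy weighted-average vruntime over n entities. */
static int64_t toy_avg_vruntime(const struct ent *e, size_t n)
{
	int64_t sum = 0, load = 0;

	for (size_t i = 0; i < n; i++) {
		sum += e[i].vruntime * e[i].weight;
		load += e[i].weight;
	}
	return load ? sum / load : 0;
}

/*
 * Enqueue-style compensation: inflate the lag so that, once the entity
 * joins a queue of total load `load`, its lag against the new average
 * is the lag that was asked for. `load` must not include the entity.
 */
static int64_t toy_scale_lag(int64_t lag, int64_t load, int64_t weight)
{
	if (!load)	/* empty queue: leave lag as-is (this sketch's choice) */
		return lag;
	return lag * (load + weight) / load;
}

/* Condition suggested in the mail: skip only if se is curr AND still queued. */
static bool toy_skip_lag_scale(const struct ent *se, const struct ent *curr)
{
	return se == curr && se->on_rq;
}
```

With two queued entities (vruntime 1000, weight 2) and (vruntime 1600, weight 1), the average is 1200. Placing a weight-3 entity that wants lag 100 gives `toy_scale_lag(100, 3, 3) == 200`, so it lands at 1000; the new average is 1100 and its lag is exactly 100. If the same entity is instead already counted in the load (load 6), `toy_scale_lag(100, 6, 3)` yields 150 and it would be moved to 950 although nothing changed, which is the kind of drift the thread is about.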