From: "Aaron Lu"
To: "Bezdeka, Florian"
Cc: "bsegall@google.com", "vschneid@redhat.com", "xii@google.com",
    "chengming.zhou@linux.dev", "mingo@redhat.com", "joshdon@google.com",
    "vincent.guittot@linaro.org", "kprateek.nayak@amd.com",
    "peterz@infradead.org", "bigeasy@linutronix.de", "yu.c.chen@intel.com",
    "dietmar.eggemann@arm.com", "rostedt@goodmis.org",
    "juri.lelli@redhat.com", "linux-kernel@vger.kernel.org",
    "mkoutny@suse.com", "mgorman@suse.de", "zhouchuyi@bytedance.com",
    "Kiszka, Jan", "liusongtang@bytedance.com",
    "matteo.martelli@codethink.co.uk"
Date: Tue, 2 Dec 2025 17:43:22 +0800
Subject: Re: [PATCH v4 0/5] Defer throttle when task exits to user
Message-Id: <20251202094322.GA3378032@bytedance.com>
References: <20250829081120.806-1-ziqianlu@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 02, 2025 at 08:59:15AM +0000, Bezdeka, Florian wrote:
> On Fri, 2025-08-29 at 16:11 +0800, Aaron Lu wrote:
> > v4:
> > - Add cfs_bandwidth_used() in task_is_throttled() and remove unlikely
> >   for task_is_throttled(), suggested by Valentin Schneider;
> > - Add a warn for non empty throttle_node in enqueue_throttled_task(),
> >   suggested by Valentin Schneider;
> > - Improve comments in enqueue_throttled_task() by Valentin Schneider;
> > - Clear throttled for to-be-unthrottled tasks in tg_unthrottle_up();
> > - Change throttled and pelt_clock_throttled fields in cfs_rq from int to
> >   bool, reported by LKP;
> > - Improve changelog for patch4 by Valentin Schneider.
> >
> > Thanks a lot for all the reviews and tests, I hope I didn't miss any of
> > them but if I do, please let me know. I've also run Jan's rt reproducer
> > and songtang's stress test and didn't notice any problem.
> >
> > Apply on top of sched/core, head commit 1b5f1454091e("sched/idle: Remove
> > play_idle()").
> >
>
> Hi all,
>
> as this all has arrived in 6.18 now - thanks for all the work - I would
> like to start a discussion about backporting this series - and some more
> related work, see below - to older stable releases. Especially
> PREEMPT_RT enabled systems are of interest as this series fixes a
> serious system freeze.
>
> Has someone already looked into the backporting topic?
>
> I can remember from the previous discussion that everything below 6.12
> is hard, as scheduler internals have changed (EEVDF, vlag). Still, 6.12
> would be valuable.
>
> I have the following commits on my radar:
>
> This series:
>
> 2cd571245b43 ("sched/fair: Add related data structure for task based throttle")
> 7fc2d1439247 ("sched/fair: Implement throttle task work and related helpers")
> e1fad12dcb66 ("sched/fair: Switch to task based throttle model")
> eb962f251fbb ("sched/fair: Task based throttle time accounting")
> 5b726e9bf954 ("sched/fair: Get rid of throttled_lb_pair()")
>
> Follow up series:
> https://lore.kernel.org/all/20250910095044.278-1-ziqianlu@bytedance.com/
>
> fe8d238e646e ("sched/fair: Propagate load for throttled cfs_rq")
> fcd394866e3d ("sched/fair: update_cfs_group() for throttled cfs_rqs")
> 253b3f587241 ("sched/fair: Do not special case tasks in throttled hierarchy")
> 0d4eaf8caf8c ("sched/fair: Do not balance task to a throttled cfs_rq")
>

There is one more fix that goes in before the next follow-up:
https://lore.kernel.org/all/20251021053522.37583-1-kprateek.nayak@amd.com/
0e4a169d1a2b ("sched/fair: Start a cfs_rq on throttled hierarchy with PELT clock throttled")

> Another follow up:
> https://lore.kernel.org/all/20250929074645.416-1-ziqianlu@bytedance.com/
>
> 956dfda6a708 ("sched/fair: Prevent cfs_rq from being unthrottled with zero runtime_remaining")
>
>
> That should hopefully be enough, right?
>

I think so.

> Any concerns, additional thoughts, missing pieces? Please let me know!

1 If the base does not have Josh's async unthrottle:
  8ad075c2eb1f ("sched: Async unthrottling for cfs bandwidth"),
  make sure to backport that too, or the timer handler that distributes
  runtime can take a long time.

2 If the base uses CFS (pre-EEVDF), the task's vruntime has to be adjusted
  in dequeue_throttled_task(), like below:

static void dequeue_throttled_task(struct task_struct *p, int flags)
{
	WARN_ON_ONCE(p->se.on_rq);
	list_del_init(&p->throttle_node);

	/* task blocked after throttled */
	if (flags & DEQUEUE_SLEEP)
		p->throttled = false;
	else {
		struct sched_entity *se = &p->se;
		struct cfs_rq *cfs_rq;

		/*
		 * We are leaving this cfs_rq but our vruntime is not
		 * normalized yet as that is only done for tasks dequeued
		 * with !DEQUEUE_SLEEP in dequeue_entity(), so we have to:
		 * Fix up our vruntime so that the current sleep doesn't
		 * cause 'unlimited' sleep bonus.
		 */
		cfs_rq = cfs_rq_of(se);
		place_entity(cfs_rq, se, 0);
		se->vruntime -= cfs_rq->min_vruntime;
	}
}

3 Also in dequeue_throttled_task(): if the base doesn't have commit
  e1f078f50478 ("sched/fair: Combine detach into dequeue when migrating task"),
  then the following is not necessary, because migrate_task_rq_fair() has
  already dealt with it:

	/*
	 * task is migrating off its old cfs_rq, detach
	 * the task's load from its old cfs_rq.
	 */
	if (task_on_rq_migrating(p))
		detach_task_cfs_rq(p);

That's what I can think of right now.

I did a backport for a 5.15-based kernel. I can probably post it somewhere
if that would be useful, just let me know.
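
To illustrate the vruntime fix-up in item 2, here is a toy user-space sketch
of the two steps in the else branch: clamp a stale vruntime so it cannot lag
arbitrarily far behind the queue, then make it relative to the old queue so it
can be rebased onto a new one. This is not kernel code. All toy_* names are
hypothetical stand-ins, and the thresh parameter is an assumed simplification
of the sleeper threshold that place_entity() applies in pre-EEVDF kernels.

```c
#include <stdint.h>

/* Toy stand-ins, NOT the kernel's struct cfs_rq / struct sched_entity. */
struct toy_cfs_rq {
	uint64_t min_vruntime;
};

struct toy_entity {
	uint64_t vruntime;
};

/* Wrap-safe max: compare through a signed delta. */
static uint64_t toy_max_vruntime(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) > 0 ? b : a;
}

/*
 * Step 1 (assumed model of place_entity(cfs_rq, se, 0)): never let the
 * entity sit more than 'thresh' behind the queue's min_vruntime.
 */
static void toy_place(struct toy_entity *se, const struct toy_cfs_rq *rq,
		      uint64_t thresh)
{
	se->vruntime = toy_max_vruntime(se->vruntime,
					rq->min_vruntime - thresh);
}

/* Step 2: leaving 'rq', store vruntime relative to its min_vruntime. */
static void toy_normalize(struct toy_entity *se, const struct toy_cfs_rq *rq)
{
	se->vruntime -= rq->min_vruntime;
}

/* Joining 'rq': turn the relative value back into an absolute one. */
static void toy_rebase(struct toy_entity *se, const struct toy_cfs_rq *rq)
{
	se->vruntime += rq->min_vruntime;
}

/* Signed distance of the entity from the queue's min_vruntime. */
static int64_t toy_lead(const struct toy_entity *se,
			const struct toy_cfs_rq *rq)
{
	return (int64_t)(se->vruntime - rq->min_vruntime);
}
```

In this model, an entity whose vruntime fell far behind while it was
throttled ends up at most thresh behind the destination queue after
toy_place(), toy_normalize() and toy_rebase(), instead of carrying its whole
stale lag along.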