public inbox for linux-kernel@vger.kernel.org
From: "Aaron Lu" <ziqianlu@bytedance.com>
To: "Bezdeka,  Florian" <florian.bezdeka@siemens.com>
Cc: "bsegall@google.com" <bsegall@google.com>,
	 "vschneid@redhat.com" <vschneid@redhat.com>,
	 "xii@google.com" <xii@google.com>,
	 "chengming.zhou@linux.dev" <chengming.zhou@linux.dev>,
	 "mingo@redhat.com" <mingo@redhat.com>,
	 "joshdon@google.com" <joshdon@google.com>,
	 "vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	 "kprateek.nayak@amd.com" <kprateek.nayak@amd.com>,
	 "peterz@infradead.org" <peterz@infradead.org>,
	 "bigeasy@linutronix.de" <bigeasy@linutronix.de>,
	 "yu.c.chen@intel.com" <yu.c.chen@intel.com>,
	 "dietmar.eggemann@arm.com" <dietmar.eggemann@arm.com>,
	 "rostedt@goodmis.org" <rostedt@goodmis.org>,
	 "juri.lelli@redhat.com" <juri.lelli@redhat.com>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	 "mkoutny@suse.com" <mkoutny@suse.com>,
	 "mgorman@suse.de" <mgorman@suse.de>,
	 "zhouchuyi@bytedance.com" <zhouchuyi@bytedance.com>,
	 "Kiszka,  Jan" <jan.kiszka@siemens.com>,
	 "liusongtang@bytedance.com" <liusongtang@bytedance.com>,
	 "matteo.martelli@codethink.co.uk"
	<matteo.martelli@codethink.co.uk>
Subject: Re: [PATCH v4 0/5] Defer throttle when task exits to user
Date: Tue, 2 Dec 2025 17:43:22 +0800
Message-ID: <20251202094322.GA3378032@bytedance.com>
In-Reply-To: <e65ed1e2308f268265a45bed4c569d3687e720f0.camel@siemens.com>

On Tue, Dec 02, 2025 at 08:59:15AM +0000, Bezdeka, Florian wrote:
> On Fri, 2025-08-29 at 16:11 +0800, Aaron Lu wrote:
> > v4:
> > - Add cfs_bandwidth_used() in task_is_throttled() and remove unlikely
> >   for task_is_throttled(), suggested by Valentin Schneider;
> > - Add a warn for non empty throttle_node in enqueue_throttled_task(),
> >   suggested by Valentin Schneider;
> > - Improve comments in enqueue_throttled_task() by Valentin Schneider;
> > - Clear throttled for to-be-unthrottled tasks in tg_unthrottle_up();
> > - Change throttled and pelt_clock_throttled fields in cfs_rq from int to
> >   bool, reported by LKP;
> > - Improve changelog for patch4 by Valentin Schneider.
> > 
> > Thanks a lot for all the reviews and tests, I hope I didn't miss any of
> > them but if I did, please let me know. I've also run Jan's rt reproducer
> > and songtang's stress test and didn't notice any problem.
> > 
> > Apply on top of sched/core, head commit 1b5f1454091e ("sched/idle: Remove
> > play_idle()").
> > 
> 
> Hi all,
> 
> as this all has arrived in 6.18 now - thanks for all the work - I would
> like to start a discussion about backporting this series - and some more
> related work, see below - to older stable releases. Especially
> PREEMPT_RT enabled systems are of interest as this series fixes a
> serious system freeze.
> 
> Has someone already looked into the backporting topic?
> 
> I can remember from the previous discussion that everything below 6.12
> is hard, as scheduler internals have changed (EEVDF, vlag). Still, 6.12
> would be valuable.
> 
> I have the following commits on my radar:
> 
> This series:
> 
> 2cd571245b43 ("sched/fair: Add related data structure for task based throttle")
> 7fc2d1439247 ("sched/fair: Implement throttle task work and related helpers")
> e1fad12dcb66 ("sched/fair: Switch to task based throttle model")
> eb962f251fbb ("sched/fair: Task based throttle time accounting")
> 5b726e9bf954 ("sched/fair: Get rid of throttled_lb_pair()")
> 
> Follow up series:
> https://lore.kernel.org/all/20250910095044.278-1-ziqianlu@bytedance.com/
> 
> fe8d238e646e ("sched/fair: Propagate load for throttled cfs_rq")
> fcd394866e3d ("sched/fair: update_cfs_group() for throttled cfs_rqs")
> 253b3f587241 ("sched/fair: Do not special case tasks in throttled hierarchy")
> 0d4eaf8caf8c ("sched/fair: Do not balance task to a throttled cfs_rq")
>

One more fix belongs here, before the next one:
https://lore.kernel.org/all/20251021053522.37583-1-kprateek.nayak@amd.com/

0e4a169d1a2b ("sched/fair: Start a cfs_rq on throttled hierarchy with
PELT clock throttled")

> Another follow up:
> https://lore.kernel.org/all/20250929074645.416-1-ziqianlu@bytedance.com/
> 
> 956dfda6a708 ("sched/fair: Prevent cfs_rq from being unthrottled with zero runtime_remaining")
> 
> 
> That should hopefully be enough, right?
> 

I think so.

> Any concerns, additional thoughts, missing pieces? Please let me know!

1 If the base does not have Josh's async unthrottle:
  8ad075c2eb1f ("sched: Async unthrottling for cfs bandwidth"),
  make sure to backport that too, otherwise the timer handler that
  distributes runtime can take a long time.

2 If the base still uses CFS (i.e. it predates EEVDF), the task's
  vruntime has to be adjusted in dequeue_throttled_task() like below:

static void dequeue_throttled_task(struct task_struct *p, int flags)
{
	WARN_ON_ONCE(p->se.on_rq);
	list_del_init(&p->throttle_node);

	/* task blocked after throttled */
	if (flags & DEQUEUE_SLEEP) {
		p->throttled = false;
	} else {
		struct sched_entity *se = &p->se;
		struct cfs_rq *cfs_rq;

		/*
		 * We are leaving this cfs_rq but our vruntime is not
		 * normalized yet as that is only done for tasks dequeued
		 * with !DEQUEUE_SLEEP in dequeue_entity(), so we have to:
		 * Fix up our vruntime so that the current sleep doesn't
		 * cause 'unlimited' sleep bonus.
		 */
		cfs_rq = cfs_rq_of(se);
		place_entity(cfs_rq, se, 0);
		se->vruntime -= cfs_rq->min_vruntime;
	}
}

3 Also in this dequeue_throttled_task() function, if the base doesn't
  have commit e1f078f50478 ("sched/fair: Combine detach into dequeue
  when migrating task"), then it's not necessary to do the following
  because migrate_task_rq_fair() has already dealt with that:
	/*
	 * task is migrating off its old cfs_rq, detach
	 * the task's load from its old cfs_rq.
	 */
	if (task_on_rq_migrating(p))
		detach_task_cfs_rq(p);
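
To make the vruntime adjustment in point 2 a bit more concrete, here is a small userspace sketch of the idea. It is a toy model only, not kernel code: the toy_* names, the structures and the TOY_SLEEPER_BONUS value are all made up for illustration, and it assumes the pre-EEVDF convention that a migrating task carries a vruntime relative to the source min_vruntime, which the destination adds its own min_vruntime to on enqueue.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for illustration only, NOT the kernel's structures. */
struct toy_cfs_rq { uint64_t min_vruntime; };
struct toy_se { uint64_t vruntime; };

/* Hypothetical cap on how far behind min_vruntime a task may be placed (ns). */
#define TOY_SLEEPER_BONUS 3000000ULL

/* Wraparound-safe max, comparing through a signed delta. */
static uint64_t toy_max_vruntime(uint64_t a, uint64_t b)
{
	return ((int64_t)(b - a) > 0) ? b : a;
}

/* Clamp a stale vruntime so it is at most TOY_SLEEPER_BONUS behind the queue. */
static void toy_place_entity(const struct toy_cfs_rq *rq, struct toy_se *se)
{
	uint64_t floor = rq->min_vruntime - TOY_SLEEPER_BONUS;

	se->vruntime = toy_max_vruntime(se->vruntime, floor);
}

/* Leaving the source queue: clamp first, then make vruntime relative. */
static void toy_dequeue_for_migration(const struct toy_cfs_rq *src,
				      struct toy_se *se)
{
	toy_place_entity(src, se);
	se->vruntime -= src->min_vruntime;
}

/* Joining the destination queue: make vruntime absolute again. */
static void toy_enqueue_after_migration(const struct toy_cfs_rq *dst,
					struct toy_se *se)
{
	se->vruntime += dst->min_vruntime;
}

/* Signed distance from the destination's min_vruntime, with the fixup. */
static int64_t toy_lag_after_migration(uint64_t stale_vruntime,
				       uint64_t src_min, uint64_t dst_min)
{
	struct toy_cfs_rq src = { .min_vruntime = src_min };
	struct toy_cfs_rq dst = { .min_vruntime = dst_min };
	struct toy_se se = { .vruntime = stale_vruntime };

	toy_dequeue_for_migration(&src, &se);
	toy_enqueue_after_migration(&dst, &se);
	return (int64_t)(se.vruntime - dst.min_vruntime);
}

/* Same distance if the stale absolute vruntime were carried over untouched. */
static int64_t toy_lag_without_fixup(uint64_t stale_vruntime, uint64_t dst_min)
{
	return (int64_t)(stale_vruntime - dst_min);
}
```

In this model, a task with a stale vruntime of 1000000 leaving a queue at min_vruntime 500000000 for one at 900000000 ends up 3000000 behind the destination with the fixup, versus 899000000 behind without it. The second number is the kind of 'unlimited' sleep bonus the comment in the real function is talking about.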

That's what I can think of right now.

I did a backport for a 5.15-based kernel; I can probably post it
somewhere if that would be useful, just let me know.
