From: Aaron Lu <ziqianlu@bytedance.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>,
Valentin Schneider <vschneid@redhat.com>,
Ben Segall <bsegall@google.com>,
Chengming Zhou <chengming.zhou@linux.dev>,
Josh Don <joshdon@google.com>, Ingo Molnar <mingo@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Xi Wang <xii@google.com>,
linux-kernel@vger.kernel.org, Juri Lelli <juri.lelli@redhat.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Mel Gorman <mgorman@suse.de>,
Chuyi Zhou <zhouchuyi@bytedance.com>,
Jan Kiszka <jan.kiszka@siemens.com>,
Florian Bezdeka <florian.bezdeka@siemens.com>,
Songtang Liu <liusongtang@bytedance.com>,
Chen Yu <yu.c.chen@intel.com>,
Matteo Martelli <matteo.martelli@codethink.co.uk>,
Michal Koutný <mkoutny@suse.com>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: [PATCH v4 3/5] sched/fair: Switch to task based throttle model
Date: Fri, 5 Sep 2025 19:37:19 +0800
Message-ID: <20250905113719.GL42@bytedance>
In-Reply-To: <20250904070407.GD42@bytedance>
Hi Peter,
On Thu, Sep 04, 2025 at 03:04:07PM +0800, Aaron Lu wrote:
> On Thu, Sep 04, 2025 at 11:14:31AM +0530, K Prateek Nayak wrote:
> > On 9/4/2025 1:57 AM, Peter Zijlstra wrote:
> > > So this is mostly tasks leaving/joining the class/cgroup. And its
> > > purpose seems to be to remove/add the blocked load component.
> > >
> > > Previously throttle/unthrottle would {de,en}queue the whole subtree from
> > > PELT, see how {en,de}queue would also stop at throttle.
> > >
> > > But now none of that is done; PELT is fully managed by the tasks
> > > {de,en}queueing.
> > >
> > > So I'm thinking that when a task joins fair (deboost from RT or
> > > whatever), we add the blocking load and fully propagate it. If the task
> > > is subject to throttling, that will then happen 'naturally' and it will
> > > dequeue itself again.
> >
> > That seems like the correct thing to do yes. Those throttled_cfs_rq()
> > checks in propagate_entity_cfs_rq() can be removed then.
> >
>
> Not sure if I understand correctly, I've come to the below code
> according to your discussion:
>
Does the diff below look sane to you? If so, should I send it as a
separate patch on top, or fold it into patch 3 and send an updated
patch 3?
Thanks.
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3e927b9b7eeb6..97ae561c60f5b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5234,6 +5234,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>
>  static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
>  static inline int cfs_rq_throttled(struct cfs_rq *cfs_rq);
> +static inline int cfs_rq_pelt_clock_throttled(struct cfs_rq *cfs_rq);
>
>  static void
>  requeue_delayed_entity(struct sched_entity *se);
> @@ -5729,6 +5730,11 @@ static inline int cfs_rq_throttled(struct cfs_rq *cfs_rq)
>  	return cfs_bandwidth_used() && cfs_rq->throttled;
>  }
>
> +static inline int cfs_rq_pelt_clock_throttled(struct cfs_rq *cfs_rq)
> +{
> +	return cfs_bandwidth_used() && cfs_rq->pelt_clock_throttled;
> +}
> +
>  /* check whether cfs_rq, or any parent, is throttled */
>  static inline int throttled_hierarchy(struct cfs_rq *cfs_rq)
>  {
> @@ -6721,6 +6727,11 @@ static inline int cfs_rq_throttled(struct cfs_rq *cfs_rq)
>  	return 0;
>  }
>
> +static inline int cfs_rq_pelt_clock_throttled(struct cfs_rq *cfs_rq)
> +{
> +	return 0;
> +}
> +
>  static inline int throttled_hierarchy(struct cfs_rq *cfs_rq)
>  {
>  	return 0;
> @@ -13154,10 +13165,7 @@ static void propagate_entity_cfs_rq(struct sched_entity *se)
>  {
>  	struct cfs_rq *cfs_rq = cfs_rq_of(se);
>
> -	if (cfs_rq_throttled(cfs_rq))
> -		return;
> -
> -	if (!throttled_hierarchy(cfs_rq))
> +	if (!cfs_rq_pelt_clock_throttled(cfs_rq))
>  		list_add_leaf_cfs_rq(cfs_rq);
>
>  	/* Start to propagate at parent */
> @@ -13168,10 +13176,7 @@ static void propagate_entity_cfs_rq(struct sched_entity *se)
>
>  		update_load_avg(cfs_rq, se, UPDATE_TG);
>
> -		if (cfs_rq_throttled(cfs_rq))
> -			break;
> -
> -		if (!throttled_hierarchy(cfs_rq))
> +		if (!cfs_rq_pelt_clock_throttled(cfs_rq))
>  			list_add_leaf_cfs_rq(cfs_rq);
>  	}
>  }
>
> So this means that when a task leaves or joins a cfs_rq, we propagate
> immediately, regardless of whether the cfs_rq is throttled or has its
> pelt clock stopped. If the cfs_rq's pelt clock is still running, it is
> added to the leaf cfs_rq list so that its load can be decayed. If the
> cfs_rq's pelt clock is stopped, it is added to the leaf cfs_rq list when
> necessary by enqueue_task_fair() or when it is unthrottled.
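
To make the intended behaviour easier to check, here is a small
userspace model of the walk described above. It is only a sketch and not
kernel code: struct toy_cfs_rq, its on_leaf_list and load_updates fields
and toy_propagate() are hypothetical stand-ins for struct cfs_rq,
list_add_leaf_cfs_rq(), update_load_avg() and propagate_entity_cfs_rq().
Only the pelt_clock_throttled name is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct cfs_rq; everything except the name of
 * pelt_clock_throttled is an assumption made for illustration. */
struct toy_cfs_rq {
	struct toy_cfs_rq *parent;	/* NULL at the root */
	bool pelt_clock_throttled;	/* pelt clock stopped for this cfs_rq */
	bool on_leaf_list;		/* models list_add_leaf_cfs_rq() having run */
	int load_updates;		/* models update_load_avg() calls */
};

/* Models the walk in propagate_entity_cfs_rq() after the diff: it never
 * stops early at a throttled level, and only the leaf list insertion
 * depends on whether the pelt clock is running. */
void toy_propagate(struct toy_cfs_rq *cfs_rq)
{
	assert(cfs_rq != NULL);

	/* The task's own cfs_rq: the caller is assumed to have updated
	 * its load already, so only the leaf list is handled here. */
	if (!cfs_rq->pelt_clock_throttled)
		cfs_rq->on_leaf_list = true;

	/* Start to propagate at the parent and go all the way to the root. */
	for (cfs_rq = cfs_rq->parent; cfs_rq; cfs_rq = cfs_rq->parent) {
		cfs_rq->load_updates++;
		if (!cfs_rq->pelt_clock_throttled)
			cfs_rq->on_leaf_list = true;
	}
}
```

With a three level hierarchy where only the lowest cfs_rq has its pelt
clock stopped, the model updates both ancestors and puts them on the
leaf list, while the lowest cfs_rq stays off the list until something
else (enqueue or unthrottle in the real code) adds it.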