public inbox for linux-kernel@vger.kernel.org
From: Aaron Lu <ziqianlu@bytedance.com>
To: Valentin Schneider <vschneid@redhat.com>
Cc: Ben Segall <bsegall@google.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Chengming Zhou <chengming.zhou@linux.dev>,
	Josh Don <joshdon@google.com>, Ingo Molnar <mingo@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Xi Wang <xii@google.com>,
	linux-kernel@vger.kernel.org, Juri Lelli <juri.lelli@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mel Gorman <mgorman@suse.de>,
	Chuyi Zhou <zhouchuyi@bytedance.com>,
	Jan Kiszka <jan.kiszka@siemens.com>,
	Florian Bezdeka <florian.bezdeka@siemens.com>,
	Songtang Liu <liusongtang@bytedance.com>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH v3 4/5] sched/fair: Task based throttle time accounting
Date: Tue, 19 Aug 2025 17:34:27 +0800
Message-ID: <20250819093427.GC38@bytedance>
In-Reply-To: <xhsmhbjociso8.mognet@vschneid-thinkpadt14sgen2i.remote.csb>

On Mon, Aug 18, 2025 at 04:57:27PM +0200, Valentin Schneider wrote:
> On 15/07/25 15:16, Aaron Lu wrote:
> > @@ -5287,19 +5287,12 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> >               check_enqueue_throttle(cfs_rq);
> >               list_add_leaf_cfs_rq(cfs_rq);
> >  #ifdef CONFIG_CFS_BANDWIDTH
> > -		if (throttled_hierarchy(cfs_rq)) {
> > +		if (cfs_rq->pelt_clock_throttled) {
> >                       struct rq *rq = rq_of(cfs_rq);
> >
> > -			if (cfs_rq_throttled(cfs_rq) && !cfs_rq->throttled_clock)
> > -				cfs_rq->throttled_clock = rq_clock(rq);
> > -			if (!cfs_rq->throttled_clock_self)
> > -				cfs_rq->throttled_clock_self = rq_clock(rq);
> > -
> > -			if (cfs_rq->pelt_clock_throttled) {
> > -				cfs_rq->throttled_clock_pelt_time += rq_clock_pelt(rq) -
> > -					cfs_rq->throttled_clock_pelt;
> > -				cfs_rq->pelt_clock_throttled = 0;
> > -			}
> > +			cfs_rq->throttled_clock_pelt_time += rq_clock_pelt(rq) -
> > +				cfs_rq->throttled_clock_pelt;
> > +			cfs_rq->pelt_clock_throttled = 0;
> 
> This is the only hunk of the patch that affects the PELT stuff; should this
> have been included in patch 3 which does the rest of the PELT accounting changes?
> 

Yes, your suggestion makes sense. I'll move it to patch 3 in the next
version, thanks.

> > @@ -7073,6 +7073,9 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
> >               if (cfs_rq_is_idle(cfs_rq))
> >                       h_nr_idle = h_nr_queued;
> >
> > +		if (throttled_hierarchy(cfs_rq) && task_throttled)
> > +			record_throttle_clock(cfs_rq);
> > +
> 
> Apologies if this has been discussed before.
> 
> So the throttled time (as reported by cpu.stat.local) is now accounted as
> the time from which the first task in the hierarchy gets effectively
> throttled - IOW the first time a task in a throttled hierarchy reaches
> resume_user_mode_work() - as opposed to as soon as the hierarchy runs out
> of quota.

Right.

> 
> The gap between the two shouldn't be much, but that should at the very
> least be highlighted in the changelog.
>

Got it. Do the words added below make this clear?

    With the task based throttle model, the previous approach of checking
    a cfs_rq's nr_queued to decide whether throttled time should be
    accounted no longer works as expected. For example, when a cfs_rq
    with a single task is throttled, that task could later block in
    kernel mode instead of being dequeued to the limbo list, and
    accounting that period as throttled time would not be accurate.

    Rework throttle time accounting for a cfs_rq as follows:
    - start accounting when the first task gets throttled in its hierarchy;
    - stop accounting on unthrottle.

    Note that there will be a time gap between when a cfs_rq is throttled
    and when a task in its hierarchy is actually throttled. This accounting
    mechanism only starts accounting in the latter case.
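
To illustrate the intended semantics, here is a minimal userspace sketch of
the accounting rule described above. It is not the kernel code: struct
model_cfs_rq and the model_*() helpers are hypothetical names made up for
this example, timestamps are plain integers, and a clock value of 0 is
assumed to mean "not accounting" (so real timestamps must be nonzero).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; this is NOT the kernel's struct cfs_rq. */
struct model_cfs_rq {
	bool throttled;           /* hierarchy has run out of quota */
	uint64_t throttled_clock; /* start of accounting, 0 = not accounting */
	uint64_t throttled_time;  /* accumulated throttled time */
};

/* Quota runs out: mark the cfs_rq throttled, but do not start the clock. */
static void model_throttle(struct model_cfs_rq *cfs_rq)
{
	cfs_rq->throttled = true;
}

/*
 * A task in the hierarchy is actually throttled at time @now.
 * Only the first such task starts the clock; later ones are no-ops.
 */
static void model_record_throttle_clock(struct model_cfs_rq *cfs_rq,
					uint64_t now)
{
	if (cfs_rq->throttled && !cfs_rq->throttled_clock)
		cfs_rq->throttled_clock = now;
}

/* Unthrottle at time @now: stop accounting and fold in the elapsed time. */
static void model_unthrottle(struct model_cfs_rq *cfs_rq, uint64_t now)
{
	if (cfs_rq->throttled_clock) {
		cfs_rq->throttled_time += now - cfs_rq->throttled_clock;
		cfs_rq->throttled_clock = 0;
	}
	cfs_rq->throttled = false;
}
```

In this model, if quota runs out at t=100, the first task is throttled at
t=130 and unthrottle happens at t=200, the accounted time is 70 rather than
100. If no task is ever throttled in between (e.g. the only task blocks in
kernel mode), nothing is accounted.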


