From: Chengming Zhou <chengming.zhou@linux.dev>
To: Aaron Lu <ziqianlu@bytedance.com>
Cc: Valentin Schneider <vschneid@redhat.com>,
Ben Segall <bsegall@google.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Peter Zijlstra <peterz@infradead.org>,
Josh Don <joshdon@google.com>, Ingo Molnar <mingo@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
linux-kernel@vger.kernel.org, Juri Lelli <juri.lelli@redhat.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Mel Gorman <mgorman@suse.de>,
Chuyi Zhou <zhouchuyi@bytedance.com>
Subject: Re: [RFC PATCH 2/7] sched/fair: Handle throttle path for task based throttle
Date: Mon, 31 Mar 2025 17:14:50 +0800 [thread overview]
Message-ID: <08d75be1-e8e4-434e-a9d8-6a4503043872@linux.dev> (raw)
In-Reply-To: <20250331064204.GB1571554@bytedance>
On 2025/3/31 14:42, Aaron Lu wrote:
> Hi Chengming,
>
> On Fri, Mar 14, 2025 at 07:07:10PM +0800, Chengming Zhou wrote:
>> On 2025/3/14 17:42, Aaron Lu wrote:
>>> On Fri, Mar 14, 2025 at 04:39:41PM +0800, Chengming Zhou wrote:
>>>> On 2025/3/13 15:21, Aaron Lu wrote:
>>>>> From: Valentin Schneider <vschneid@redhat.com>
>>>>>
>>>>> Once a cfs_rq gets throttled, for all tasks belonging to this cfs_rq,
>>>>> add a task work to them so that when those tasks return to user, the
>>>>> actual throttle/dequeue can happen.
>>>>>
>>>>> Note that since the throttle/dequeue always happens on a task basis when
>>>>> it returns to user, it's no longer necessary for check_cfs_rq_runtime()
>>>>> to return a value and pick_task_fair() acts differently according to that
>>>>> return value, so check_cfs_rq_runtime() is changed to not return a
>>>>> value.
>>>>
>>>> Previously with the per-cfs_rq throttling, we use update_curr() -> put() path
>>>> to throttle the cfs_rq and dequeue it from the cfs_rq tree.
>>>>
>>>> Now with your per-task throttling, maybe things can become simpler. That we
>>>> can just throttle_cfs_rq() (cfs_rq subtree) when curr accouting to mark these
>>>> throttled.
>>>
>>> Do I understand correctly that now in throttle_cfs_rq(), we just mark
>>> this hierarchy as throttled, but do not add any throttle work to these
>>> tasks in this hierarchy and leave the throttle work add job to pick
>>> time?
>>
>> Right, we can move throttle_cfs_rq() forward to the curr accouting time, which
>> just mark these throttled.
>
> While preparing the next version, I found that if I move
> throttle_cfs_rq() to accounting time, like in __account_cfs_rq_runtime(),
> then it is possible on unthrottle path, the following can happen:
> unthrottle_cfs_rq() -> enqueue_task_fair() -> update_curr() ->
> account_cfs_rq_runtime() -> throttle_cfs_rq()...
Ah, right, then it's best to leave throttle_cfs_rq() where it is.
>
> Initially I was confused why update_curr() can notice a non-null curr
> when this cfs_rq is being unthrottled but then I realized in this task
> based throttling model, it is possible some task woke up in that
> throttled cfs_rq and have cfs_rq->curr set and then cfs_rq gets
> unthrottled.
>
> So I suppose I'll keep the existing way of marking a cfs_rq as
> throttled by calling check_cfs_rq_runtime() in the following two places:
> - in pick_task_fair(), so that the to-be-picked cfs_rq can be marked for
> throttle;
> - in put_prev_entity() for prev runnable task's cfs_rq.
>
>> And move setup_task_work() afterward to the pick task time, which make that task
>> dequeue when ret2user.
>
> No problem for this part as far as my test goes :-)
Good to hear.
Thanks!
>
> Thanks,
> Aaron
>
>>>
>>>> Then then if we pick a task from a throttled cfs_rq subtree, we can setup task work
>>>> for it, so we don't botter with the delayed_dequeue task case that Prateek mentioned.
>>>
>>> If we add a check point in pick time, maybe we can also avoid the check
>>> in enqueue time. One thing I'm thinking is, for a task, it may be picked
>>> multiple times with only a single enqueue so if we do the check in pick,
>>> the overhead can be larger?
>>
>> As Prateek already mentioned, this check cost is negligeable.
>>
>>>
>>>> WDYT?
>>>
>>> Thanks for your suggestion. I'll try this approach and see how it turned
>>> out.
next prev parent reply other threads:[~2025-03-31 9:15 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-17 10:56 [RFC PATCH 0/7] Defer throttle when task exits to user Aaron Lu
2025-03-13 7:21 ` [RFC PATCH 1/7] sched/fair: Add related data structure for task based throttle Aaron Lu
2025-03-17 10:28 ` Valentin Schneider
2025-03-17 11:02 ` Aaron Lu
2025-03-13 7:21 ` [RFC PATCH 2/7] sched/fair: Handle throttle path " Aaron Lu
2025-03-13 18:14 ` K Prateek Nayak
2025-03-14 8:48 ` Aaron Lu
2025-03-14 9:00 ` K Prateek Nayak
2025-03-14 3:28 ` K Prateek Nayak
2025-03-14 8:57 ` Aaron Lu
2025-03-14 9:12 ` K Prateek Nayak
2025-03-14 15:10 ` Aaron Lu
2025-03-14 8:39 ` Chengming Zhou
2025-03-14 8:49 ` K Prateek Nayak
2025-03-14 9:42 ` Aaron Lu
2025-03-14 10:26 ` K Prateek Nayak
2025-03-14 11:47 ` Aaron Lu
2025-03-14 15:58 ` Chengming Zhou
2025-03-14 18:04 ` K Prateek Nayak
2025-03-14 11:07 ` Chengming Zhou
2025-03-31 6:42 ` Aaron Lu
2025-03-31 9:14 ` Chengming Zhou [this message]
2025-03-16 3:25 ` Josh Don
2025-03-17 2:54 ` Chengming Zhou
2025-03-20 6:59 ` K Prateek Nayak
2025-03-20 8:39 ` Chengming Zhou
2025-03-20 18:40 ` Xi Wang
2025-03-24 8:58 ` Aaron Lu
2025-03-25 10:02 ` Aaron Lu
2025-03-28 0:11 ` Xi Wang
2025-03-28 3:11 ` Aaron Lu
2025-03-28 22:47 ` Benjamin Segall
2025-03-19 13:43 ` Aaron Lu
2025-03-20 1:06 ` Josh Don
2025-03-20 6:53 ` K Prateek Nayak
2025-03-13 7:21 ` [RFC PATCH 3/7] sched/fair: Handle unthrottle " Aaron Lu
2025-03-14 3:53 ` K Prateek Nayak
2025-03-14 4:06 ` K Prateek Nayak
2025-03-14 10:43 ` Aaron Lu
2025-03-14 17:52 ` K Prateek Nayak
2025-03-17 5:48 ` Aaron Lu
2025-04-02 9:25 ` Aaron Lu
2025-04-02 17:24 ` K Prateek Nayak
2025-03-13 7:21 ` [RFC PATCH 4/7] sched/fair: Take care of migrated task " Aaron Lu
2025-03-14 4:03 ` K Prateek Nayak
2025-03-14 9:49 ` [External] " Aaron Lu
2025-03-13 7:21 ` [RFC PATCH 5/7] sched/fair: Take care of group/affinity/sched_class change for throttled task Aaron Lu
2025-03-14 4:51 ` K Prateek Nayak
2025-03-14 11:40 ` [External] " Aaron Lu
2025-03-13 7:22 ` [RFC PATCH 6/7] sched/fair: fix tasks_rcu with task based throttle Aaron Lu
2025-03-14 4:14 ` K Prateek Nayak
2025-03-14 11:37 ` [External] " Aaron Lu
2025-03-31 6:19 ` Aaron Lu
2025-04-01 3:17 ` K Prateek Nayak
2025-04-01 8:48 ` Aaron Lu
2025-03-13 7:22 ` [RFC PATCH 7/7] sched/fair: Make sure cfs_rq has enough runtime_remaining on unthrottle path Aaron Lu
2025-03-14 4:18 ` K Prateek Nayak
2025-03-14 11:39 ` [External] " Aaron Lu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=08d75be1-e8e4-434e-a9d8-6a4503043872@linux.dev \
--to=chengming.zhou@linux.dev \
--cc=bsegall@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=joshdon@google.com \
--cc=juri.lelli@redhat.com \
--cc=kprateek.nayak@amd.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
--cc=zhouchuyi@bytedance.com \
--cc=ziqianlu@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox