From: Aaron Lu <ziqianlu@bytedance.com>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Valentin Schneider <vschneid@redhat.com>,
Ben Segall <bsegall@google.com>,
Peter Zijlstra <peterz@infradead.org>,
Josh Don <joshdon@google.com>, Ingo Molnar <mingo@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
linux-kernel@vger.kernel.org, Juri Lelli <juri.lelli@redhat.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Mel Gorman <mgorman@suse.de>,
Chengming Zhou <chengming.zhou@linux.dev>,
Chuyi Zhou <zhouchuyi@bytedance.com>
Subject: Re: [RFC PATCH 6/7] sched/fair: fix tasks_rcu with task based throttle
Date: Tue, 1 Apr 2025 16:48:26 +0800 [thread overview]
Message-ID: <20250401084826.GA13530@bytedance> (raw)
In-Reply-To: <e9aae247-5347-4748-ac5e-9f54f733e230@amd.com>
On Tue, Apr 01, 2025 at 08:47:02AM +0530, K Prateek Nayak wrote:
> Hello Aaron,
>
> On 3/31/2025 11:49 AM, Aaron Lu wrote:
> > > > Taskes throttled on exit to user path are scheduled by cond_resched() in
> > > > task_work_run() but that is a preempt schedule and doesn't mark a task
> > > > rcu quiescent state.
>
> So browsing through kernel/rcu/task.h, I found the
> cond_resched_tasks_rcu_qs() that basically clears task holdout before
> calling schedule(). The question is, is it safe to be used in the
> task_work_run() context? I have no idea but you can do a resched_curr()
> followed by a cond_resched_tasks_rcu_qs() in throttle_cfs_rq_work() and
> that should give you the same effect as doing a schedule().
>
> Thoughts?
>
Looks good to me.
> > > >
> > > > Fix this by directly calling schedule() in throttle_cfs_rq_work().
> > > Perhaps that can be gotten around by just using set_ti_thread_flag()
> > > resched_curr() will also call set_preempt_need_resched() which allows
> > > cond_resched() to resched the task.
> > >
> > > Since exit_to_user_mode_loop() will run once again seeing that
> > > TIF_NEED_RESCHED is set, schedule() should follow soon. Thoughts?
> > >
> > I tried this and noticed an unexpected consequence: get_signal() will
> > also run task works and if the signal is kill, then it can happen:
> > exit_to_user_mode_loop() -> get_signal() -> throttle_task_work() ->
> > do_exit() -> exit_signals() -> percpu_rwsem_wait() -> schedule() ->
> > try_to_block_task() -> dequeue_task_fair().
> >
> > I would like to avoid this path, at least for now. I want to make sure
> > for throttled tasks, only events like task group change, affinity change
> > etc. can cause it dequeue from core, that's why I added
> > SCHED_WARN_ON(flags & DEQUEUE_SLEEP) in dequeue_task_fair(). I think
> > this can help me capture any unexpected events like this.
> >
> > Besides, I think explicitely calling schedule() has the advantage of not
> > relying on other code, i.e. it doesn't matter if someone removed
> > cond_resched() in task_work_run() some time later or someone changed the
> > logic in exit_to_user_mode_loop().
> >
> > So I hope you don't mind I keep schedule() in throttle_cfs_rq_work(),
> > but if you see anything wrong of doing this, feel free to let me know,
> > thanks.
>
> I don't have any strong feelings. Just that the open-coded schedule()
> struck out like a sore thumb and since you mention future changes, the
> "local_irq_enable_exit_to_user(ti_work)" could perhaps one day be
> extended to not disable IRQs in exit_to_user_mode_loop() in which case
> a direct call to schedule() can cause scheduling while atomic warnings.
>
> IMO using cond_resched_tasks_rcu_qs() is cleaner and it is obvious that
> the throttle work wants to call schedule() asap while also clearing the
> RCU holdout but I'll wait for others to comment if it is safe to do so
> or if we are missing something.
I'm running some tests and I'll follow your suggestion if no problem
found, thanks for the suggestion.
next prev parent reply other threads:[~2025-04-01 8:48 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-17 10:56 [RFC PATCH 0/7] Defer throttle when task exits to user Aaron Lu
2025-03-13 7:21 ` [RFC PATCH 1/7] sched/fair: Add related data structure for task based throttle Aaron Lu
2025-03-17 10:28 ` Valentin Schneider
2025-03-17 11:02 ` Aaron Lu
2025-03-13 7:21 ` [RFC PATCH 2/7] sched/fair: Handle throttle path " Aaron Lu
2025-03-13 18:14 ` K Prateek Nayak
2025-03-14 8:48 ` Aaron Lu
2025-03-14 9:00 ` K Prateek Nayak
2025-03-14 3:28 ` K Prateek Nayak
2025-03-14 8:57 ` Aaron Lu
2025-03-14 9:12 ` K Prateek Nayak
2025-03-14 15:10 ` Aaron Lu
2025-03-14 8:39 ` Chengming Zhou
2025-03-14 8:49 ` K Prateek Nayak
2025-03-14 9:42 ` Aaron Lu
2025-03-14 10:26 ` K Prateek Nayak
2025-03-14 11:47 ` Aaron Lu
2025-03-14 15:58 ` Chengming Zhou
2025-03-14 18:04 ` K Prateek Nayak
2025-03-14 11:07 ` Chengming Zhou
2025-03-31 6:42 ` Aaron Lu
2025-03-31 9:14 ` Chengming Zhou
2025-03-16 3:25 ` Josh Don
2025-03-17 2:54 ` Chengming Zhou
2025-03-20 6:59 ` K Prateek Nayak
2025-03-20 8:39 ` Chengming Zhou
2025-03-20 18:40 ` Xi Wang
2025-03-24 8:58 ` Aaron Lu
2025-03-25 10:02 ` Aaron Lu
2025-03-28 0:11 ` Xi Wang
2025-03-28 3:11 ` Aaron Lu
2025-03-28 22:47 ` Benjamin Segall
2025-03-19 13:43 ` Aaron Lu
2025-03-20 1:06 ` Josh Don
2025-03-20 6:53 ` K Prateek Nayak
2025-03-13 7:21 ` [RFC PATCH 3/7] sched/fair: Handle unthrottle " Aaron Lu
2025-03-14 3:53 ` K Prateek Nayak
2025-03-14 4:06 ` K Prateek Nayak
2025-03-14 10:43 ` Aaron Lu
2025-03-14 17:52 ` K Prateek Nayak
2025-03-17 5:48 ` Aaron Lu
2025-04-02 9:25 ` Aaron Lu
2025-04-02 17:24 ` K Prateek Nayak
2025-03-13 7:21 ` [RFC PATCH 4/7] sched/fair: Take care of migrated task " Aaron Lu
2025-03-14 4:03 ` K Prateek Nayak
2025-03-14 9:49 ` [External] " Aaron Lu
2025-03-13 7:21 ` [RFC PATCH 5/7] sched/fair: Take care of group/affinity/sched_class change for throttled task Aaron Lu
2025-03-14 4:51 ` K Prateek Nayak
2025-03-14 11:40 ` [External] " Aaron Lu
2025-03-13 7:22 ` [RFC PATCH 6/7] sched/fair: fix tasks_rcu with task based throttle Aaron Lu
2025-03-14 4:14 ` K Prateek Nayak
2025-03-14 11:37 ` [External] " Aaron Lu
2025-03-31 6:19 ` Aaron Lu
2025-04-01 3:17 ` K Prateek Nayak
2025-04-01 8:48 ` Aaron Lu [this message]
2025-03-13 7:22 ` [RFC PATCH 7/7] sched/fair: Make sure cfs_rq has enough runtime_remaining on unthrottle path Aaron Lu
2025-03-14 4:18 ` K Prateek Nayak
2025-03-14 11:39 ` [External] " Aaron Lu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250401084826.GA13530@bytedance \
--to=ziqianlu@bytedance.com \
--cc=bsegall@google.com \
--cc=chengming.zhou@linux.dev \
--cc=dietmar.eggemann@arm.com \
--cc=joshdon@google.com \
--cc=juri.lelli@redhat.com \
--cc=kprateek.nayak@amd.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
--cc=zhouchuyi@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox