All of lore.kernel.org
 help / color / mirror / Atom feed
From: Aaron Lu <ziqianlu@bytedance.com>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Valentin Schneider <vschneid@redhat.com>,
	Ben Segall <bsegall@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Josh Don <joshdon@google.com>, Ingo Molnar <mingo@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	linux-kernel@vger.kernel.org, Juri Lelli <juri.lelli@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mel Gorman <mgorman@suse.de>,
	Chengming Zhou <chengming.zhou@linux.dev>,
	Chuyi Zhou <zhouchuyi@bytedance.com>
Subject: Re: [RFC PATCH 6/7] sched/fair: fix tasks_rcu with task based throttle
Date: Tue, 1 Apr 2025 16:48:26 +0800	[thread overview]
Message-ID: <20250401084826.GA13530@bytedance> (raw)
In-Reply-To: <e9aae247-5347-4748-ac5e-9f54f733e230@amd.com>

On Tue, Apr 01, 2025 at 08:47:02AM +0530, K Prateek Nayak wrote:
> Hello Aaron,
> 
> On 3/31/2025 11:49 AM, Aaron Lu wrote:
> > > > Taskes throttled on exit to user path are scheduled by cond_resched() in
> > > > task_work_run() but that is a preempt schedule and doesn't mark a task
> > > > rcu quiescent state.
> 
> So browsing through kernel/rcu/task.h, I found the
> cond_resched_tasks_rcu_qs() that basically clears task holdout before
> calling schedule(). The question is, is it safe to be used in the
> task_work_run() context? I have no idea but you can do a resched_curr()
> followed by a cond_resched_tasks_rcu_qs() in throttle_cfs_rq_work() and
> that should give you the same effect as doing a schedule().
> 
> Thoughts?
>

Looks good to me.

> > > > 
> > > > Fix this by directly calling schedule() in throttle_cfs_rq_work().
> > > Perhaps that can be gotten around by just using set_ti_thread_flag()
> > > resched_curr() will also call set_preempt_need_resched() which allows
> > > cond_resched() to resched the task.
> > > 
> > > Since exit_to_user_mode_loop() will run once again seeing that
> > > TIF_NEED_RESCHED is set, schedule() should follow soon. Thoughts?
> > > 
> > I tried this and noticed an unexpected consequence: get_signal() will
> > also run task works and if the signal is kill, then it can happen:
> > exit_to_user_mode_loop() -> get_signal() -> throttle_task_work() ->
> > do_exit() -> exit_signals() -> percpu_rwsem_wait() -> schedule() ->
> > try_to_block_task() -> dequeue_task_fair().
> > 
> > I would like to avoid this path, at least for now. I want to make sure
> > for throttled tasks, only events like task group change, affinity change
> > etc. can cause it dequeue from core, that's why I added
> > SCHED_WARN_ON(flags & DEQUEUE_SLEEP) in dequeue_task_fair(). I think
> > this can help me capture any unexpected events like this.
> > 
> > Besides, I think explicitely calling schedule() has the advantage of not
> > relying on other code, i.e. it doesn't matter if someone removed
> > cond_resched() in task_work_run() some time later or someone changed the
> > logic in exit_to_user_mode_loop().
> > 
> > So I hope you don't mind I keep schedule() in throttle_cfs_rq_work(),
> > but if you see anything wrong of doing this, feel free to let me know,
> > thanks.
> 
> I don't have any strong feelings. Just that the open-coded schedule()
> struck out like a sore thumb and since you mention future changes, the
> "local_irq_enable_exit_to_user(ti_work)" could perhaps one day be
> extended to not disable IRQs in exit_to_user_mode_loop() in which case
> a direct call to schedule() can cause scheduling while atomic warnings.
> 
> IMO using cond_resched_tasks_rcu_qs() is cleaner and it is obvious that
> the throttle work wants to call schedule() asap while also clearing the
> RCU holdout but I'll wait for others to comment if it is safe to do so
> or if we are missing something.

I'm running some tests and I'll follow your suggestion if no problem
found, thanks for the suggestion.

  reply	other threads:[~2025-04-01  8:48 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-17 10:56 [RFC PATCH 0/7] Defer throttle when task exits to user Aaron Lu
2025-03-13  7:21 ` [RFC PATCH 1/7] sched/fair: Add related data structure for task based throttle Aaron Lu
2025-03-17 10:28   ` Valentin Schneider
2025-03-17 11:02     ` Aaron Lu
2025-03-13  7:21 ` [RFC PATCH 2/7] sched/fair: Handle throttle path " Aaron Lu
2025-03-13 18:14   ` K Prateek Nayak
2025-03-14  8:48     ` Aaron Lu
2025-03-14  9:00       ` K Prateek Nayak
2025-03-14  3:28   ` K Prateek Nayak
2025-03-14  8:57     ` Aaron Lu
2025-03-14  9:12       ` K Prateek Nayak
2025-03-14 15:10         ` Aaron Lu
2025-03-14  8:39   ` Chengming Zhou
2025-03-14  8:49     ` K Prateek Nayak
2025-03-14  9:42     ` Aaron Lu
2025-03-14 10:26       ` K Prateek Nayak
2025-03-14 11:47         ` Aaron Lu
2025-03-14 15:58           ` Chengming Zhou
2025-03-14 18:04           ` K Prateek Nayak
2025-03-14 11:07       ` Chengming Zhou
2025-03-31  6:42         ` Aaron Lu
2025-03-31  9:14           ` Chengming Zhou
2025-03-16  3:25   ` Josh Don
2025-03-17  2:54     ` Chengming Zhou
2025-03-20  6:59       ` K Prateek Nayak
2025-03-20  8:39         ` Chengming Zhou
2025-03-20 18:40           ` Xi Wang
2025-03-24  8:58             ` Aaron Lu
2025-03-25 10:02               ` Aaron Lu
2025-03-28  0:11                 ` Xi Wang
2025-03-28  3:11                   ` Aaron Lu
2025-03-28 22:47         ` Benjamin Segall
2025-03-19 13:43     ` Aaron Lu
2025-03-20  1:06       ` Josh Don
2025-03-20  6:53     ` K Prateek Nayak
2025-03-13  7:21 ` [RFC PATCH 3/7] sched/fair: Handle unthrottle " Aaron Lu
2025-03-14  3:53   ` K Prateek Nayak
2025-03-14  4:06     ` K Prateek Nayak
2025-03-14 10:43     ` Aaron Lu
2025-03-14 17:52       ` K Prateek Nayak
2025-03-17  5:48         ` Aaron Lu
2025-04-02  9:25         ` Aaron Lu
2025-04-02 17:24           ` K Prateek Nayak
2025-03-13  7:21 ` [RFC PATCH 4/7] sched/fair: Take care of migrated task " Aaron Lu
2025-03-14  4:03   ` K Prateek Nayak
2025-03-14  9:49     ` [External] " Aaron Lu
2025-03-13  7:21 ` [RFC PATCH 5/7] sched/fair: Take care of group/affinity/sched_class change for throttled task Aaron Lu
2025-03-14  4:51   ` K Prateek Nayak
2025-03-14 11:40     ` [External] " Aaron Lu
2025-03-13  7:22 ` [RFC PATCH 6/7] sched/fair: fix tasks_rcu with task based throttle Aaron Lu
2025-03-14  4:14   ` K Prateek Nayak
2025-03-14 11:37     ` [External] " Aaron Lu
2025-03-31  6:19     ` Aaron Lu
2025-04-01  3:17       ` K Prateek Nayak
2025-04-01  8:48         ` Aaron Lu [this message]
2025-03-13  7:22 ` [RFC PATCH 7/7] sched/fair: Make sure cfs_rq has enough runtime_remaining on unthrottle path Aaron Lu
2025-03-14  4:18   ` K Prateek Nayak
2025-03-14 11:39     ` [External] " Aaron Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250401084826.GA13530@bytedance \
    --to=ziqianlu@bytedance.com \
    --cc=bsegall@google.com \
    --cc=chengming.zhou@linux.dev \
    --cc=dietmar.eggemann@arm.com \
    --cc=joshdon@google.com \
    --cc=juri.lelli@redhat.com \
    --cc=kprateek.nayak@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=zhouchuyi@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.