From: K Prateek Nayak <kprateek.nayak@amd.com>
To: John Stultz <jstultz@google.com>, LKML <linux-kernel@vger.kernel.org>
Cc: Joel Fernandes <joelagnelf@nvidia.com>,
	Qais Yousef <qyousef@layalina.io>, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>,
	Zimuzo Ezeozue <zezeozue@google.com>,
	Mel Gorman <mgorman@suse.de>, Will Deacon <will@kernel.org>,
	Waiman Long <longman@redhat.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Metin Kaya <Metin.Kaya@arm.com>,
	Xuewen Yan <xuewen.yan94@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Suleiman Souhlal <suleiman@google.com>, <kernel-team@android.com>
Subject: Re: [RFC PATCH v15 5/7] sched: Add an initial sketch of the find_proxy_task() function
Date: Sat, 15 Mar 2025 22:05:01 +0530
Message-ID: <cb735bbd-e4db-41b4-95fd-b7b85f040e4e@amd.com>
In-Reply-To: <20250312221147.1865364-6-jstultz@google.com>

Hello John,

On 3/13/2025 3:41 AM, John Stultz wrote:
> Add a find_proxy_task() function which doesn't do much.
> 
> When we select a blocked task to run, we will just deactivate it
> and pick again. The exception being if it has become unblocked
> after find_proxy_task() was called.
> 
> Greatly simplified from patch by:
>    Peter Zijlstra (Intel) <peterz@infradead.org>
>    Juri Lelli <juri.lelli@redhat.com>
>    Valentin Schneider <valentin.schneider@arm.com>
>    Connor O'Brien <connoro@google.com>
> 
> Cc: Joel Fernandes <joelagnelf@nvidia.com>
> Cc: Qais Yousef <qyousef@layalina.io>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Cc: Valentin Schneider <vschneid@redhat.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ben Segall <bsegall@google.com>
> Cc: Zimuzo Ezeozue <zezeozue@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Will Deacon <will@kernel.org>
> Cc: Waiman Long <longman@redhat.com>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Metin Kaya <Metin.Kaya@arm.com>
> Cc: Xuewen Yan <xuewen.yan94@gmail.com>
> Cc: K Prateek Nayak <kprateek.nayak@amd.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> Cc: Suleiman Souhlal <suleiman@google.com>
> Cc: kernel-team@android.com
> [jstultz: Split out from larger proxy patch and simplified
>   for review and testing.]
> Signed-off-by: John Stultz <jstultz@google.com>
> ---
> v5:
> * Split out from larger proxy patch
> v7:
> * Fixed unused function arguments, spelling nits, and tweaks for
>    clarity, pointed out by Metin Kaya
> * Fix build warning Reported-by: kernel test robot <lkp@intel.com>
>    Closes: https://lore.kernel.org/oe-kbuild-all/202311081028.yDLmCWgr-lkp@intel.com/
> v8:
> * Fixed case where we might return a blocked task from find_proxy_task()
> * Continued tweaks to handle avoiding returning blocked tasks
> v9:
> * Add zap_balance_callbacks helper to unwind balance_callbacks
>    when we will re-call pick_next_task() again.
> * Add extra comment suggested by Metin
> * Typo fixes from Metin
> * Moved adding proxy_resched_idle earlier in the series, as suggested
>    by Metin
> * Fix to call proxy_resched_idle() *prior* to deactivating next, to avoid
>    crashes caused by stale references to next
> * s/PROXY/SCHED_PROXY_EXEC/ as suggested by Metin
> * Number of tweaks and cleanups suggested by Metin
> * Simplify proxy_deactivate as suggested by Metin
> v11:
> * Tweaks for earlier simplification in try_to_deactivate_task
> v13:
> * Rename "next" to "donor" in find_proxy_task() for clarity
> * Similarly use "donor" instead of next in proxy_deactivate
> * Refactor/simplify proxy_resched_idle
> * Moved up a needed fix from later in the series
> v15:
> * Tweaked some comments to better explain the initial sketch of
>    find_proxy_task(), suggested by Qais
> * Build fixes for !CONFIG_SMP
> * Slight rework for blocked_on_state being added later in the
>    series.
> * Move the zap_balance_callbacks to later in the patch series
> ---
>   kernel/sched/core.c  | 103 +++++++++++++++++++++++++++++++++++++++++--
>   kernel/sched/rt.c    |  15 ++++++-
>   kernel/sched/sched.h |  10 ++++-
>   3 files changed, 122 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 3968c3967ec38..b4f7b14f62a24 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6600,7 +6600,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>    * Otherwise marks the task's __state as RUNNING
>    */
>   static bool try_to_block_task(struct rq *rq, struct task_struct *p,
> -			      unsigned long task_state)
> +			      unsigned long task_state, bool deactivate_cond)
>   {
>   	int flags = DEQUEUE_NOCLOCK;
>   
> @@ -6609,6 +6609,9 @@ static bool try_to_block_task(struct rq *rq, struct task_struct *p,
>   		return false;
>   	}
>   
> +	if (!deactivate_cond)
> +		return false;
> +
>   	p->sched_contributes_to_load =
>   		(task_state & TASK_UNINTERRUPTIBLE) &&
>   		!(task_state & TASK_NOLOAD) &&
> @@ -6632,6 +6635,93 @@ static bool try_to_block_task(struct rq *rq, struct task_struct *p,
>   	return true;
>   }
>   
> +#ifdef CONFIG_SCHED_PROXY_EXEC
> +
> +static inline struct task_struct *
> +proxy_resched_idle(struct rq *rq)

nit: Any reason the function name was put on its own line? The whole
declaration fits on a single line.

> +{
> +	put_prev_task(rq, rq->donor);

Any reason we cannot do:

     put_prev_set_next_task(rq, rq->donor, rq->idle);

here, ahead of the rq_set_donor() call? I don't see any dependency on
rq->donor in set_next_task_idle(), so setting the next task before
updating the donor should be safe.
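
To illustrate the ordering question, here is a minimal userspace toy model
(not kernel code). The struct layouts, the log buffer, and the
resched_idle_posted(), resched_idle_combined() and variants_agree() names
are made up for illustration. The model ignores the dl_server bookkeeping
and any extra callback arguments in the real helpers, and keeps only the
rq->donor sanity check this patch adds. Under those assumptions, both
variants issue the same put/set sequence as long as the combined helper
runs before the donor is switched:

```c
#include <assert.h>
#include <string.h>

/* Toy stand-ins for the kernel types; NOT the real definitions. */
struct task { const char *name; };
struct rq {
	struct task *donor;
	struct task *idle;
	char log[64];		/* records the order of the class callbacks */
};

static void log_call(struct rq *rq, const char *what, const struct task *p)
{
	strcat(rq->log, what);
	strcat(rq->log, p->name);
	strcat(rq->log, ";");
}

static void put_prev_task(struct rq *rq, struct task *p)
{
	log_call(rq, "put:", p);
}

static void set_next_task(struct rq *rq, struct task *p)
{
	log_call(rq, "set:", p);
}

/* Simplified model of the combined helper, with this patch's donor check. */
static void put_prev_set_next_task(struct rq *rq, struct task *prev,
				   struct task *next)
{
	assert(rq->donor == prev);	/* models WARN_ON_ONCE(rq->donor != prev) */
	if (next == prev)
		return;
	put_prev_task(rq, prev);
	set_next_task(rq, next);
}

/* Ordering as posted: put, switch donor, then set. */
static struct task *resched_idle_posted(struct rq *rq)
{
	put_prev_task(rq, rq->donor);
	rq->donor = rq->idle;
	set_next_task(rq, rq->idle);
	return rq->idle;
}

/* Suggested ordering: combined helper first, then switch donor. */
static struct task *resched_idle_combined(struct rq *rq)
{
	put_prev_set_next_task(rq, rq->donor, rq->idle);
	rq->donor = rq->idle;
	return rq->idle;
}

/* Returns 1 if both variants log the same calls and end with donor == idle. */
static int variants_agree(void)
{
	struct task idle = { "idle" }, blocked = { "blocked" };
	struct rq a = { &blocked, &idle, "" };
	struct rq b = { &blocked, &idle, "" };

	resched_idle_posted(&a);
	resched_idle_combined(&b);
	return strcmp(a.log, b.log) == 0 &&
	       strcmp(a.log, "put:blocked;set:idle;") == 0 &&
	       a.donor == &idle && b.donor == &idle;
}
```

Note that in the combined variant the helper has to run while rq->donor
still points at the old donor, otherwise the donor check in the helper
would trip.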

> +	rq_set_donor(rq, rq->idle);
> +	set_next_task(rq, rq->idle);
> +	set_tsk_need_resched(rq->idle);
> +	return rq->idle;
> +}
> +
> +static bool proxy_deactivate(struct rq *rq, struct task_struct *donor)
> +{
> +	unsigned long state = READ_ONCE(donor->__state);
> +
> +	/* Don't deactivate if the state has been changed to TASK_RUNNING */
> +	if (state == TASK_RUNNING)
> +		return false;
> +	/*
> +	 * Because we got donor from pick_next_task, it is *crucial*
> +	 * that we call proxy_resched_idle before we deactivate it.
> +	 * As once we deactivate donor, donor->on_rq is set to zero,
> +	 * which allows ttwu to immediately try to wake the task on
> +	 * another rq. So we cannot use *any* references to donor
> +	 * after that point. So things like cfs_rq->curr or rq->donor
> +	 * need to be changed from next *before* we deactivate.
> +	 */
> +	proxy_resched_idle(rq);
> +	return try_to_block_task(rq, donor, state, true);
> +}
> +
> +/*
> + * Initial simple sketch that just deactivates the blocked task
> + * chosen by pick_next_task() so we can then pick something that
> + * isn't blocked.
> + */
> +static struct task_struct *
> +find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
> +{
> +	struct task_struct *p = donor;
> +	struct mutex *mutex;
> +
> +	mutex = p->blocked_on;
> +	/* Something changed in the chain, so pick again */
> +	if (!mutex)
> +		return NULL;
> +	/*
> +	 * By taking mutex->wait_lock we hold off concurrent mutex_unlock()
> +	 * and ensure @owner sticks around.
> +	 */
> +	raw_spin_lock(&mutex->wait_lock);
> +
> +	/* Check again that p is blocked with blocked_lock held */
> +	if (!task_is_blocked(p) || mutex != __get_task_blocked_on(p)) {
> +		/*
> +		 * Something changed in the blocked_on chain and
> +		 * we don't know if only at this level. So, let's
> +		 * just bail out completely and let __schedule
> +		 * figure things out (pick_again loop).
> +		 */
> +		goto out;
> +	}
> +
> +	if (!proxy_deactivate(rq, donor)) {
> +		/*
> +		 * XXX: For now, if deactivation failed, set donor
> +		 * as not blocked, as we aren't doing proxy-migrations
> +		 * yet (more logic will be needed then).
> +		 */
> +		__clear_task_blocked_on(donor, mutex);
> +		raw_spin_unlock(&mutex->wait_lock);
> +		return NULL;
> +	}
> +out:
> +	raw_spin_unlock(&mutex->wait_lock);
> +	return NULL; /* do pick_next_task again */
> +}
> +#else /* SCHED_PROXY_EXEC */
> +static struct task_struct *
> +find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
> +{
> +	WARN_ONCE(1, "This should never be called in the !SCHED_PROXY_EXEC case\n");
> +	return donor;
> +}
> +#endif /* SCHED_PROXY_EXEC */
> +
>   /*
>    * __schedule() is the main scheduler function.
>    *
> @@ -6739,12 +6829,19 @@ static void __sched notrace __schedule(int sched_mode)
>   			goto picked;
>   		}
>   	} else if (!preempt && prev_state) {
> -		try_to_block_task(rq, prev, prev_state);
> +		try_to_block_task(rq, prev, prev_state,
> +				  !task_is_blocked(prev));
>   		switch_count = &prev->nvcsw;
>   	}
>   
> -	next = pick_next_task(rq, prev, &rf);
> +pick_again:
> +	next = pick_next_task(rq, rq->donor, &rf);
>   	rq_set_donor(rq, next);
> +	if (unlikely(task_is_blocked(next))) {
> +		next = find_proxy_task(rq, next, &rf);
> +		if (!next)
> +			goto pick_again;
> +	}
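
For anyone following the control flow, the pick_again loop with this
initial find_proxy_task() can be modelled in userspace roughly as below.
This is only a sketch under simplifying assumptions: the toy_* names and
struct fields are hypothetical, the runqueue is a plain priority-ordered
array, locking is left out entirely, and every reason deactivation can
fail is folded into a single "running" flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; NOT the real task_struct or rq. */
struct toy_task {
	bool on_rq;
	bool blocked;	/* models p->blocked_on != NULL */
	bool running;	/* models __state == TASK_RUNNING */
};

struct toy_rq {
	struct toy_task *tasks;	/* highest priority first */
	size_t nr;
	struct toy_task idle;	/* never blocked, so the loop terminates */
};

/* Pick the first queued task, falling back to idle. */
static struct toy_task *toy_pick_next(struct toy_rq *rq)
{
	for (size_t i = 0; i < rq->nr; i++)
		if (rq->tasks[i].on_rq)
			return &rq->tasks[i];
	return &rq->idle;
}

/* Models the initial find_proxy_task(): it always asks for a re-pick. */
static struct toy_task *toy_find_proxy(struct toy_task *donor)
{
	if (donor->running)
		donor->blocked = false;	/* deactivation refused: mark unblocked */
	else
		donor->on_rq = false;	/* deactivated: dequeued from the rq */
	return NULL;
}

static struct toy_task *toy_schedule(struct toy_rq *rq, int *picks)
{
	struct toy_task *next;

	*picks = 0;
pick_again:
	next = toy_pick_next(rq);
	(*picks)++;
	if (next->blocked) {
		next = toy_find_proxy(next);
		if (!next)
			goto pick_again;
	}
	return next;
}

/* Returns 1 if the third pick returns the task that raced with a wakeup. */
static int toy_demo(void)
{
	struct toy_task t[3] = {
		{ .on_rq = true, .blocked = true,  .running = false },
		{ .on_rq = true, .blocked = true,  .running = true  },
		{ .on_rq = true, .blocked = false, .running = true  },
	};
	struct toy_rq rq = { .tasks = t, .nr = 3, .idle = { .on_rq = true } };
	int picks;
	struct toy_task *next = toy_schedule(&rq, &picks);

	return next == &t[1] && picks == 3 && !t[0].on_rq;
}
```

In the model, each trip through the loop either dequeues one blocked task
or clears its blocked flag, so the number of re-picks is bounded by the
number of blocked tasks on the runqueue.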
>   picked:
>   	clear_tsk_need_resched(prev);
>   	clear_preempt_need_resched();
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 4b8e33c615b12..2d418e0efecc5 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1479,8 +1479,19 @@ enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags)
>   
>   	enqueue_rt_entity(rt_se, flags);
>   
> -	if (!task_current(rq, p) && p->nr_cpus_allowed > 1)
> -		enqueue_pushable_task(rq, p);
> +	/*
> +	 * Current can't be pushed away. Selected is tied to current,
> +	 * so don't push it either.
> +	 */
> +	if (task_current(rq, p) || task_current_donor(rq, p))
> +		return;
> +	/*
> +	 * Pinned tasks can't be pushed.
> +	 */
> +	if (p->nr_cpus_allowed == 1)
> +		return;
> +
> +	enqueue_pushable_task(rq, p);
>   }
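
The restructured condition can be read as a small predicate. The sketch
below is a userspace illustration only: the toy_* types and
toy_rt_pushable() are hypothetical names, and it assumes
nr_cpus_allowed is always at least 1, in which case the "== 1" early
return matches the old "> 1" test.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; NOT the real task_struct or rq. */
struct toy_task { int nr_cpus_allowed; };
struct toy_rq {
	struct toy_task *curr;	/* execution context */
	struct toy_task *donor;	/* scheduling context */
};

/* True if p would be added to the pushable list on enqueue. */
static bool toy_rt_pushable(const struct toy_rq *rq, const struct toy_task *p)
{
	/* Neither the running task nor the current donor may be pushed. */
	if (rq->curr == p || rq->donor == p)
		return false;
	/* Pinned tasks can't be pushed. */
	if (p->nr_cpus_allowed == 1)
		return false;
	return true;
}

/* Returns 1 if only the unpinned, non-curr, non-donor task is pushable. */
static int toy_rt_demo(void)
{
	struct toy_task curr = { 4 }, donor = { 4 }, pinned = { 1 }, other = { 4 };
	struct toy_rq rq = { &curr, &donor };

	return !toy_rt_pushable(&rq, &curr) &&
	       !toy_rt_pushable(&rq, &donor) &&
	       !toy_rt_pushable(&rq, &pinned) &&
	       toy_rt_pushable(&rq, &other);
}
```

The only behavioural change relative to the old check is the extra donor
test; the pinned-task test is the same condition written as an early
return.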
>   
>   static bool dequeue_task_rt(struct rq *rq, struct task_struct *p, int flags)
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 05d2122533619..3e49d77ce2cdd 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -2311,6 +2311,14 @@ static inline int task_current_donor(struct rq *rq, struct task_struct *p)
>   	return rq->donor == p;
>   }
>   
> +static inline bool task_is_blocked(struct task_struct *p)
> +{
> +	if (!sched_proxy_exec())
> +		return false;
> +
> +	return !!p->blocked_on;
> +}
> +
>   static inline int task_on_cpu(struct rq *rq, struct task_struct *p)
>   {
>   #ifdef CONFIG_SMP
> @@ -2520,7 +2528,7 @@ static inline void put_prev_set_next_task(struct rq *rq,
>   					  struct task_struct *prev,
>   					  struct task_struct *next)
>   {
> -	WARN_ON_ONCE(rq->curr != prev);
> +	WARN_ON_ONCE(rq->donor != prev);
>   
>   	__put_prev_set_next_dl_server(rq, prev, next);
>   

-- 
Thanks and Regards,
Prateek


