Linux Documentation
From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: Yury Norov <yury.norov@gmail.com>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, kprateek.nayak@amd.com,
	iii@linux.ibm.com, corbet@lwn.net, tglx@kernel.org,
	gregkh@linuxfoundation.org, pbonzini@redhat.com,
	seanjc@google.com, vschneid@redhat.com, huschle@linux.ibm.com,
	rostedt@goodmis.org, dietmar.eggemann@arm.com,
	maddy@linux.ibm.com, srikar@linux.ibm.com, hdanton@sina.com,
	chleroy@kernel.org, vineeth@bitbyteword.org, frederic@kernel.org,
	arighi@nvidia.com, pauld@redhat.com, christian.loehle@arm.com,
	tj@kernel.org, tommaso.cucinotta@gmail.com, maz@kernel.org,
	rafael@kernel.org, rdunlap@infradead.org, kernellwp@gmail.com,
	linux-doc@vger.kernel.org
Subject: Re: [PATCH v6 05/23] sched/core: Try to use a preferred CPU in is_cpu_allowed
Date: Wed, 1 Jul 2026 22:19:50 +0530
Message-ID: <0b50873f-90aa-4b1c-913c-475c80fa21c8@linux.ibm.com>
In-Reply-To: <akU7vr4cpAhPRFeL@yury>

Hi Yury,

On 7/1/26 9:39 PM, Yury Norov wrote:
> On Wed, Jul 01, 2026 at 07:46:36PM +0530, Shrikanth Hegde wrote:
>> When possible, pick a preferred CPU.
>>
>> The push task mechanism uses the stopper thread, which calls
>> select_fallback_rq and relies on this check to pick only a preferred CPU.
>>
>> When a task is affined only to non-preferred CPUs, it should continue to
>> run there. Detect that by checking whether cpus_ptr and cpu_preferred_mask
>> intersect.
>>
>> This takes care of wakeup path optimization for FAIR tasks.
>> is_cpu_allowed is called to ensure wakeup happens on preferred CPUs.
>> With that, additional checks in available_idle_cpu are not necessary.
>>
>> Add a comment on the rare O(N**2) case in select_fallback_rq.
>>
>> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
>> ---
>> v5->v6:
>> - Drop optimization for select_fallback_rq
>> - Keep comment on N**2
>>
>>   kernel/sched/core.c  | 29 ++++++++++++++++++++++++++++-
>>   kernel/sched/sched.h |  9 +++++++++
>>   2 files changed, 37 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index a45f7c308329..1fb1c17e8387 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2500,6 +2500,8 @@ static inline bool rq_has_pinned_tasks(struct rq *rq)
>>    */
>>   static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
>>   {
>> +	bool task_has_preferred_cpu;
>> +
>>   	/* When not in the task's cpumask, no point in looking further. */
>>   	if (!task_allowed_on_cpu(p, cpu))
>>   		return false;
>> @@ -2508,9 +2510,30 @@ static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
>>   	if (is_migration_disabled(p))
>>   		return cpu_online(cpu);
>>   
>> +	/*
>> +	 * This is essential to maintain user affinities when preferred
>> +	 * CPUs change. A task pinned on non-preferred CPU should continue
>> +	 * to run there, since this is non-user triggered.
>> +	 *
>> +	 * If CPU is non-preferred and task can run on other CPUs which are
>> +	 * currently preferred, then choose those other CPUs instead.
>> +	 * Overhead is minimal when CPU is preferred.
>> +	 *
>> +	 * For the majority of cases this still keeps select_fallback_rq
>> +	 * as O(N). task_has_preferred_cpus, which is O(N), is called only if
>> +	 * !cpu_preferred. The task running there is then expected to move
>> +	 * out, so subsequently it runs on a preferred CPU. This becomes
>> +	 * O(N**2) only for tasks pinned only to non-preferred CPUs, which is rare.
>> +	 */
> 
> is_cpu_allowed() is ~20 lines now, and your patch doubles that count.
> Can you keep this kind of reasoning in the commit message? 90% of setups
> will disable preferred CPUs, and I guess 99% of developers don't care.
> 
> This is the code, not a scientific paper, after all.
> 

OK. I will update the comments and send the updated version
as a reply to this mail.

>> +	task_has_preferred_cpu = !cpu_preferred(cpu) &&
>> +				 task_has_preferred_cpus(p);
> 
> Maybe it's just me, but the name looks illogical. Because if
> 'cpu' is preferred, the task indeed has some preferred CPUs.
> 
> Maybe 'can_sched_on_preferred' or something like that?
> 

ok.

>> +
>>   	/* Non kernel threads are not allowed during either online or offline. */
>> -	if (!(p->flags & PF_KTHREAD))
>> +	if (!(p->flags & PF_KTHREAD)) {
>> +		if (task_has_preferred_cpu)
>> +			return false;
>>   		return cpu_active(cpu);
>> +	}
> 
> The comment on top of the block seems to be applicable to the 2nd
> return only, right?
> 

The first return is for !kthread and the second return is for kthreads.
It applies to both, since kthreads are FAIR class too.

On that note, do unbound kthreads run often, or are they usually
bound to a CPU? If it is the latter, we can even drop
that second return.

>>   
>>   	/* KTHREAD_IS_PER_CPU is always allowed. */
>>   	if (kthread_is_per_cpu(p))
>> @@ -2520,6 +2543,10 @@ static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
>>   	if (cpu_dying(cpu))
>>   		return false;
>>   
>> +	/* Try on preferred CPU first if possible */
>> +	if (task_has_preferred_cpu)
>> +		return false;
> 
> Would it look better if you drop the comment and:
>          
>          if (need_sched_on_preferred)
>                  return false;
> 
>> +
>>   	/* But are allowed during online. */
> 
> This comment is the continuation of the cpu_dying() case. With your
> change it's not anymore, and it needs to be reworded.
> 
>>   	return cpu_online(cpu);
>>   }
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index 26ae13c86b69..36ae20310891 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -4230,4 +4230,13 @@ DEFINE_CLASS_IS_UNCONDITIONAL(sched_change)
>>   
>>   #include "ext/ext.h"
>>   
>> +static inline bool task_has_preferred_cpus(struct task_struct *p)
>> +{
>> +	/* Only FAIR tasks honor preferred CPU state */
>> +	if (unlikely(p->sched_class != &fair_sched_class))
>> +		return false;
>> +
>> +	return cpumask_intersects(p->cpus_ptr, cpu_preferred_mask);
>> +}
>> +
>>   #endif /* _KERNEL_SCHED_SCHED_H */
>> -- 
>> 2.47.3
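
For reference, here is a minimal userspace sketch of the decision flow
discussed above, with the flag renamed to can_sched_on_preferred as
suggested in review. It is only a model under stated assumptions: CPU
sets are plain 64-bit masks, and struct task_model, struct cpu_state,
mask_test() and is_cpu_allowed_model() are hypothetical stand-ins, not
kernel API. The migration-disabled, per-CPU kthread and cpu_dying checks
are left out to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model: bit n of a mask stands for CPU n (at most 64 CPUs). */
struct task_model {
	uint64_t cpus_allowed;	/* stands in for p->cpus_ptr */
	bool is_fair;		/* stands in for sched_class == &fair_sched_class */
	bool is_kthread;	/* stands in for p->flags & PF_KTHREAD */
};

struct cpu_state {
	uint64_t preferred;	/* stands in for cpu_preferred_mask */
	uint64_t active;	/* stands in for cpu_active_mask */
	uint64_t online;	/* stands in for cpu_online_mask */
};

static bool mask_test(uint64_t mask, int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	return (mask >> cpu) & 1;
}

/* Mirrors task_has_preferred_cpus(): only FAIR tasks honor preferred CPUs. */
static bool task_has_preferred_cpus(const struct task_model *p,
				    const struct cpu_state *s)
{
	if (!p->is_fair)
		return false;

	return (p->cpus_allowed & s->preferred) != 0;
}

/* Simplified model of is_cpu_allowed() with the preferred-CPU checks. */
static bool is_cpu_allowed_model(const struct task_model *p,
				 const struct cpu_state *s, int cpu)
{
	bool can_sched_on_preferred;

	/* When not in the task's cpumask, no point in looking further. */
	if (!mask_test(p->cpus_allowed, cpu))
		return false;

	/* True only if this CPU is non-preferred and a preferred one exists. */
	can_sched_on_preferred = !mask_test(s->preferred, cpu) &&
				 task_has_preferred_cpus(p, s);

	/* User tasks: steer to a preferred CPU, else require an active CPU. */
	if (!p->is_kthread) {
		if (can_sched_on_preferred)
			return false;
		return mask_test(s->active, cpu);
	}

	/* Kthreads: same steering, then fall back to the online check. */
	if (can_sched_on_preferred)
		return false;

	return mask_test(s->online, cpu);
}
```

With CPUs 0-1 preferred out of 0-3, a FAIR task allowed on all four CPUs
is rejected on CPU 2, while a FAIR task pinned to CPUs 2-3 is still
accepted there, which is the affinity-preserving behaviour the patch
aims for. Dropping the second can_sched_on_preferred check in this model
would correspond to not steering unbound kthreads at all.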


