From: Yury Norov <ynorov@nvidia.com>
To: Shrikanth Hegde <sshegde@linux.ibm.com>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, tglx@linutronix.de,
yury.norov@gmail.com, gregkh@linuxfoundation.org,
pbonzini@redhat.com, seanjc@google.com, kprateek.nayak@amd.com,
vschneid@redhat.com, iii@linux.ibm.com, huschle@linux.ibm.com,
rostedt@goodmis.org, dietmar.eggemann@arm.com, mgorman@suse.de,
bsegall@google.com, maddy@linux.ibm.com, srikar@linux.ibm.com,
hdanton@sina.com, chleroy@kernel.org, vineeth@bitbyteword.org,
joelagnelf@nvidia.com
Subject: Re: [PATCH v2 05/17] sched/core: allow only preferred CPUs in is_cpu_allowed
Date: Tue, 7 Apr 2026 21:05:47 -0400 [thread overview]
Message-ID: <adWp6xXjpmeEDHeY@yury> (raw)
In-Reply-To: <20260407191950.643549-6-sshegde@linux.ibm.com>
On Wed, Apr 08, 2026 at 12:49:38AM +0530, Shrikanth Hegde wrote:
>
> When possible, choose a preferred CPUs to pick.
>
> Push task mechanism uses stopper thread which going to call
> select_fallback_rq and use this mechanism to pick only a preferred CPU.
>
> When task is affined only to non-preferred CPUs it should continue to
> run there. Detect that by checking if cpus_ptr and cpu_preferred_mask
> interesect or not.
>
> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
> ---
> kernel/sched/core.c | 17 ++++++++++++++---
> kernel/sched/sched.h | 12 ++++++++++++
> 2 files changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 7ea05a7a717b..336e7c694eb7 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2463,9 +2463,16 @@ static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
> if (is_migration_disabled(p))
> return cpu_online(cpu);
>
> - /* Non kernel threads are not allowed during either online or offline. */
> - if (!(p->flags & PF_KTHREAD))
> - return cpu_active(cpu);
> + /*
> + * Non kernel threads are not allowed during either online or offline.
> + * Ensure it is a preferred CPU to avoid further contention
> + */
> + if (!(p->flags & PF_KTHREAD)) {
> + if (!cpu_active(cpu))
> + return false;
> + if (!cpu_preferred(cpu) && task_can_run_on_preferred_cpu(p))
> + return false;
> + }
>
> /* KTHREAD_IS_PER_CPU is always allowed. */
> if (kthread_is_per_cpu(p))
> @@ -2475,6 +2482,10 @@ static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
> if (cpu_dying(cpu))
> return false;
>
> + /* Try on preferred CPU first */
> + if (!cpu_preferred(cpu) && task_can_run_on_preferred_cpu(p))
> + return false;
You repeat this for the 2nd time. The cpu_preferred() call should go
inside task_can_run_on_preferred_cpu().
And can you please pick some shorter name?
> +
> /* But are allowed during online. */
> return cpu_online(cpu);
> }
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 88e0c93b9e21..7271af2ca64f 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -4130,4 +4130,16 @@ DEFINE_CLASS_IS_UNCONDITIONAL(sched_change)
>
> #include "ext.h"
>
> +#ifdef CONFIG_PARAVIRT
> +static inline bool task_can_run_on_preferred_cpu(struct task_struct *p)
> +{
> + return cpumask_intersects(p->cpus_ptr, cpu_preferred_mask);
This makes is_cpu_allowed() O(N). Even if CONFIG_PARAVIRT is enabled,
I think some people would prefer to avoid this. Also, select_fallback_rq()
calls it in a loop, and this makes it O(N^2).
/* Any allowed, online CPU? */
for_each_cpu(dest_cpu, p->cpus_ptr) {
if (!is_cpu_allowed(p, dest_cpu))
continue;
goto out;
}
You can keep it O(N):
for_each_cpu_and(dest_cpu, p->cpus_ptr, cpu_preferred_mask) {
...
}
Not sure how critical that path is, but this looks suspicious.
> +}
> +#else
> +static inline bool task_can_run_on_preferred_cpu(struct task_struct *p)
> +{
> + return true;
> +}
> +#endif
Same comment as in patch 3. I believe, it's worth to declare cpu_preferred_mask
unrelated to CONFIG_PARAVIRT, so that you'll not have to spread this
ifdefery around.
> +
> #endif /* _KERNEL_SCHED_SCHED_H */
> --
> 2.47.3
next prev parent reply other threads:[~2026-04-08 1:05 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-07 19:19 [PATCH v2 00/17] sched/paravirt: Introduce cpu_preferred_mask and steal-driven vCPU backoff Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 01/17] sched/debug: Remove unused schedstats Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 02/17] sched/docs: Document cpu_preferred_mask and Preferred CPU concept Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 03/17] cpumask: Introduce cpu_preferred_mask Shrikanth Hegde
2026-04-07 20:27 ` Yury Norov
2026-04-08 9:16 ` Shrikanth Hegde
2026-04-08 17:57 ` Yury Norov
2026-05-05 4:07 ` Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 04/17] sysfs: Add preferred CPU file Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 05/17] sched/core: allow only preferred CPUs in is_cpu_allowed Shrikanth Hegde
2026-04-08 1:05 ` Yury Norov [this message]
2026-04-08 12:56 ` Shrikanth Hegde
2026-04-08 18:09 ` Yury Norov
2026-04-07 19:19 ` [PATCH v2 06/17] sched/fair: Select preferred CPU at wakeup when possible Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 07/17] sched/fair: load balance only among preferred CPUs Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 08/17] sched/rt: Select a preferred CPU for wakeup and pulling rt task Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 09/17] sched/core: Keep tick on non-preferred CPUs until tasks are out Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 10/17] sched/core: Push current task from non preferred CPU Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 11/17] sched/debug: Add migration stats due to non preferred CPUs Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 12/17] sched/feature: Add STEAL_MONITOR feature Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 13/17] sched/core: Introduce a simple steal monitor Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 14/17] sched/core: Compute steal values at regular intervals Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 15/17] sched/core: Handle steal values and mark CPUs as preferred Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 16/17] sched/core: Mark the direction of steal values to avoid oscillations Shrikanth Hegde
2026-04-07 19:19 ` [PATCH v2 17/17] sched/debug: Add debug knobs for steal monitor Shrikanth Hegde
2026-04-07 19:50 ` [PATCH v2 00/17] sched/paravirt: Introduce cpu_preferred_mask and steal-driven vCPU backoff Shrikanth Hegde
2026-04-08 10:14 ` Hillf Danton
2026-04-08 13:49 ` Shrikanth Hegde
2026-04-09 5:15 ` Hillf Danton
2026-04-09 10:27 ` Shrikanth Hegde
2026-04-10 9:47 ` Shrikanth Hegde
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=adWp6xXjpmeEDHeY@yury \
--to=ynorov@nvidia.com \
--cc=bsegall@google.com \
--cc=chleroy@kernel.org \
--cc=dietmar.eggemann@arm.com \
--cc=gregkh@linuxfoundation.org \
--cc=hdanton@sina.com \
--cc=huschle@linux.ibm.com \
--cc=iii@linux.ibm.com \
--cc=joelagnelf@nvidia.com \
--cc=juri.lelli@redhat.com \
--cc=kprateek.nayak@amd.com \
--cc=linux-kernel@vger.kernel.org \
--cc=maddy@linux.ibm.com \
--cc=mgorman@suse.de \
--cc=mingo@kernel.org \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=seanjc@google.com \
--cc=srikar@linux.ibm.com \
--cc=sshegde@linux.ibm.com \
--cc=tglx@linutronix.de \
--cc=vincent.guittot@linaro.org \
--cc=vineeth@bitbyteword.org \
--cc=vschneid@redhat.com \
--cc=yury.norov@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.