From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, tglx@linutronix.de,
yury.norov@gmail.com, maddy@linux.ibm.com,
linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
gregkh@linuxfoundation.org
Cc: sshegde@linux.ibm.com, vschneid@redhat.com, iii@linux.ibm.com,
huschle@linux.ibm.com, rostedt@goodmis.org,
dietmar.eggemann@arm.com, vineeth@bitbyteword.org,
jgross@suse.com, pbonzini@redhat.com, seanjc@google.com
Subject: [RFC PATCH v3 07/10] sched/core: Push current task from paravirt CPU
Date: Wed, 10 Sep 2025 23:12:07 +0530 [thread overview]
Message-ID: <20250910174210.1969750-8-sshegde@linux.ibm.com> (raw)
In-Reply-To: <20250910174210.1969750-1-sshegde@linux.ibm.com>
Actively push out any task running on a paravirt CPU. Since the task is
running on the CPU need to spawn a stopper thread and push the task out.
If task is sleeping, when it wakes up it is expected to move out. In
case it still chooses a paravirt CPU, next tick will move it out.
However, if the task in pinned only to paravirt CPUs, it will continue
running there.
Though code is almost same as __balance_push_cpu_stop and quite close to
push_cpu_stop, it provides a cleaner implementation w.r.t to PARAVIRT
config.
Add push_task_work_done flag to protect pv_push_task_work buffer. This has
been placed at the empty slot available considering 64/128 byte
cacheline.
This currently works only FAIR and RT.
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
---
kernel/sched/core.c | 84 ++++++++++++++++++++++++++++++++++++++++++++
kernel/sched/sched.h | 9 ++++-
2 files changed, 92 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 279b0dd72b5e..1f9df5b8a3a2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5629,6 +5629,10 @@ void sched_tick(void)
sched_clock_tick();
+ /* push the current task out if a paravirt CPU */
+ if (is_cpu_paravirt(cpu))
+ push_current_from_paravirt_cpu(rq);
+
rq_lock(rq, &rf);
donor = rq->donor;
@@ -10977,4 +10981,84 @@ void sched_enq_and_set_task(struct sched_enq_and_set_ctx *ctx)
struct cpumask __cpu_paravirt_mask __read_mostly;
EXPORT_SYMBOL(__cpu_paravirt_mask);
DEFINE_STATIC_KEY_FALSE(cpu_paravirt_push_tasks);
+
+static DEFINE_PER_CPU(struct cpu_stop_work, pv_push_task_work);
+
+static int paravirt_push_cpu_stop(void *arg)
+{
+ struct task_struct *p = arg;
+ struct rq *rq = this_rq();
+ struct rq_flags rf;
+ int cpu;
+
+ raw_spin_lock_irq(&p->pi_lock);
+ rq_lock(rq, &rf);
+ rq->push_task_work_done = 0;
+
+ update_rq_clock(rq);
+
+ if (task_rq(p) == rq && task_on_rq_queued(p)) {
+ cpu = select_fallback_rq(rq->cpu, p);
+ rq = __migrate_task(rq, &rf, p, cpu);
+ }
+
+ rq_unlock(rq, &rf);
+ raw_spin_unlock_irq(&p->pi_lock);
+ put_task_struct(p);
+
+ return 0;
+}
+
+/* A CPU is marked as Paravirt when there is contention for underlying
+ * physical CPU and using this CPU will lead to hypervisor preemptions.
+ * It is better not to use this CPU.
+ *
+ * In case any task is scheduled on such CPU, move it out. In
+ * select_fallback_rq a non paravirt CPU will be chosen and henceforth
+ * task shouldn't come back to this CPU
+ */
+void push_current_from_paravirt_cpu(struct rq *rq)
+{
+ struct task_struct *push_task = rq->curr;
+ unsigned long flags;
+ struct rq_flags rf;
+
+ if (!is_cpu_paravirt(rq->cpu))
+ return;
+
+ /* Idle task can't be pused out */
+ if (rq->curr == rq->idle)
+ return;
+
+ /* Do for only SCHED_NORMAL AND RT for now */
+ if (push_task->sched_class != &fair_sched_class &&
+ push_task->sched_class != &rt_sched_class)
+ return;
+
+ if (kthread_is_per_cpu(push_task) ||
+ is_migration_disabled(push_task))
+ return;
+
+ /* Is it affine to only paravirt cpus? */
+ if (cpumask_subset(push_task->cpus_ptr, cpu_paravirt_mask))
+ return;
+
+ /* There is already a stopper thread for this. Dont race with it */
+ if (rq->push_task_work_done == 1)
+ return;
+
+ local_irq_save(flags);
+ preempt_disable();
+
+ get_task_struct(push_task);
+
+ rq_lock(rq, &rf);
+ rq->push_task_work_done = 1;
+ rq_unlock(rq, &rf);
+
+ stop_one_cpu_nowait(rq->cpu, paravirt_push_cpu_stop, push_task,
+ this_cpu_ptr(&pv_push_task_work));
+ preempt_enable();
+ local_irq_restore(flags);
+}
#endif
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 8f9991453d36..5077a32593da 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1187,7 +1187,9 @@ struct rq {
unsigned char nohz_idle_balance;
unsigned char idle_balance;
-
+#ifdef CONFIG_PARAVIRT
+ bool push_task_work_done;
+#endif
unsigned long misfit_task_load;
/* For active balancing */
@@ -3890,11 +3892,16 @@ static inline bool is_cpu_paravirt(int cpu)
return false;
}
+
+void push_current_from_paravirt_cpu(struct rq *rq);
+
#else /* !CONFIG_PARAVIRT */
static inline bool is_cpu_paravirt(int cpu)
{
return false;
}
+
+static inline void push_current_from_paravirt_cpu(struct rq *rq) { }
#endif /* !CONFIG_PARAVIRT */
#endif /* _KERNEL_SCHED_SCHED_H */
--
2.47.3
next prev parent reply other threads:[~2025-09-10 17:44 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-10 17:42 [RFC PATCH v3 00/10] paravirt CPUs and push task for less vCPU preemption Shrikanth Hegde
2025-09-10 17:42 ` [RFC PATCH v3 01/10] sched/docs: Document cpu_paravirt_mask and Paravirt CPU concept Shrikanth Hegde
2025-09-10 17:42 ` [RFC PATCH v3 02/10] cpumask: Introduce cpu_paravirt_mask Shrikanth Hegde
2025-09-10 17:42 ` [RFC PATCH v3 03/10] sched: Static key to check paravirt cpu push Shrikanth Hegde
2025-09-11 1:53 ` Yury Norov
2025-09-11 14:37 ` Shrikanth Hegde
2025-09-11 15:29 ` Yury Norov
2025-09-10 17:42 ` [RFC PATCH v3 04/10] sched/core: Dont allow to use CPU marked as paravirt Shrikanth Hegde
2025-09-11 5:16 ` K Prateek Nayak
2025-09-11 14:44 ` Shrikanth Hegde
2025-09-10 17:42 ` [RFC PATCH v3 05/10] sched/fair: Don't consider paravirt CPUs for wakeup and load balance Shrikanth Hegde
2025-09-11 5:23 ` K Prateek Nayak
2025-09-11 15:56 ` Shrikanth Hegde
2025-09-11 16:55 ` K Prateek Nayak
2025-11-08 12:04 ` Shrikanth Hegde
2025-09-10 17:42 ` [RFC PATCH v3 06/10] sched/rt: Don't select paravirt CPU for wakeup and push/pull rt task Shrikanth Hegde
2025-09-10 17:42 ` Shrikanth Hegde [this message]
2025-09-11 5:40 ` [RFC PATCH v3 07/10] sched/core: Push current task from paravirt CPU K Prateek Nayak
2025-09-11 16:52 ` Shrikanth Hegde
2025-09-11 17:06 ` K Prateek Nayak
2025-09-12 5:22 ` Shrikanth Hegde
2025-09-12 8:48 ` K Prateek Nayak
2025-09-12 12:49 ` Shrikanth Hegde
2025-11-10 4:54 ` Shrikanth Hegde
2025-09-10 17:42 ` [RFC PATCH v3 08/10] sysfs: Add paravirt CPU file Shrikanth Hegde
2025-09-10 17:42 ` [RFC PATCH v3 09/10] powerpc: Add debug file for set/unset paravirt CPUs Shrikanth Hegde
2025-09-10 17:42 ` [HELPER PATCH] sysfs: Provide write method for paravirt Shrikanth Hegde
2025-10-20 14:32 ` [RFC PATCH v3 00/10] paravirt CPUs and push task for less vCPU preemption Sean Christopherson
2025-10-20 15:05 ` Paolo Bonzini
2025-10-23 4:03 ` Shrikanth Hegde
2025-10-21 6:10 ` Shrikanth Hegde
2025-10-22 18:46 ` Sean Christopherson
2025-10-30 17:43 ` Shrikanth Hegde
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250910174210.1969750-8-sshegde@linux.ibm.com \
--to=sshegde@linux.ibm.com \
--cc=dietmar.eggemann@arm.com \
--cc=gregkh@linuxfoundation.org \
--cc=huschle@linux.ibm.com \
--cc=iii@linux.ibm.com \
--cc=jgross@suse.com \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=maddy@linux.ibm.com \
--cc=mingo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=seanjc@google.com \
--cc=tglx@linutronix.de \
--cc=vincent.guittot@linaro.org \
--cc=vineeth@bitbyteword.org \
--cc=vschneid@redhat.com \
--cc=yury.norov@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).