From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: linux-kernel@vger.kernel.org, mingo@kernel.org,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, yury.norov@gmail.com,
kprateek.nayak@amd.com, iii@linux.ibm.com
Cc: sshegde@linux.ibm.com, tglx@kernel.org,
gregkh@linuxfoundation.org, pbonzini@redhat.com,
seanjc@google.com, vschneid@redhat.com, huschle@linux.ibm.com,
rostedt@goodmis.org, dietmar.eggemann@arm.com, mgorman@suse.de,
bsegall@google.com, maddy@linux.ibm.com, srikar@linux.ibm.com,
hdanton@sina.com, chleroy@kernel.org, vineeth@bitbyteword.org,
frederic@kernel.org, arighi@nvidia.com, pauld@redhat.com,
christian.loehle@arm.com, tj@kernel.org,
tommaso.cucinotta@gmail.com, maz@kernel.org, rafael@kernel.org
Subject: [PATCH v3 16/20] sched/core: Compute steal values at regular intervals
Date: Thu, 14 May 2026 20:52:00 +0530
Message-ID: <20260514152204.481115-17-sshegde@linux.ibm.com>
In-Reply-To: <20260514152204.481115-1-sshegde@linux.ibm.com>
Kick off the work to compute steal time at regular intervals.
The trigger is gated by the steal monitor static key check, so it adds
no overhead when the monitor is disabled.
The sampling period can be changed at runtime using steal_mon/sampling_period.
The default is 1000 milliseconds, i.e. 1 second.
This work is done by the first online housekeeping CPU only. Hence it
does not need any complicated synchronization.
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
---
include/linux/sched.h | 2 ++
kernel/sched/core.c | 26 ++++++++++++++++++++++++++
kernel/sched/debug.c | 1 +
kernel/sched/sched.h | 7 +++++++
4 files changed, 36 insertions(+)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ee5f19a96118..738f17d63943 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2527,6 +2527,8 @@ struct steal_monitor_t {
unsigned int high_threshold;
unsigned int sampling_period_ms;
};
+
+extern struct steal_monitor_t steal_mon;
#endif
#endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 907c6b38460b..a3f65e9c7d30 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5719,6 +5719,9 @@ void sched_tick(void)
rq->idle_balance = idle_cpu(cpu);
sched_balance_trigger(rq);
}
+
+ if (sched_steal_mon_enabled())
+ sched_trigger_steal_computation(cpu);
}
#ifdef CONFIG_NO_HZ_FULL
@@ -11375,4 +11378,27 @@ void sched_steal_detection_work(struct work_struct *work)
now = ktime_get();
sm->prev_time = now;
}
+
+void sched_trigger_steal_computation(int cpu)
+{
+ int first_hk_cpu = cpumask_first_and(housekeeping_cpumask(HK_TYPE_KERNEL_NOISE),
+ cpu_online_mask);
+ ktime_t now;
+
+ /* Done by first online housekeeping CPU only */
+ if (likely(cpu != first_hk_cpu))
+ return;
+
+ /*
+ * Since everything is updated by the first housekeeping CPU,
+ * there is no need for complex synchronization.
+ */
+ now = ktime_get();
+
+ /* Default is once per second */
+ if (likely(ktime_ms_delta(now, steal_mon.prev_time) < steal_mon.sampling_period_ms))
+ return;
+
+ schedule_work_on(first_hk_cpu, &steal_mon.work);
+}
#endif
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index be8d223b43fd..f00c08581253 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -606,6 +606,7 @@ static ssize_t sched_sm_en_write(struct file *filp, const char __user *ubuf,
static_branch_enable(&__sched_sm_enable);
} else if (!sched_sm_wr_enable && orig) {
static_branch_disable(&__sched_sm_enable);
+ cancel_work_sync(&steal_mon.work);
cpumask_copy(&__cpu_preferred_mask, cpu_online_mask);
}
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index d674f8e8e854..cc90012a85fc 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -4145,9 +4145,16 @@ DECLARE_STATIC_KEY_FALSE(__sched_sm_enable);
void sched_push_current_non_preferred_cpu(struct rq *rq);
void sched_init_steal_monitor(void);
void sched_steal_detection_work(struct work_struct *work);
+void sched_trigger_steal_computation(int cpu);
+static inline bool sched_steal_mon_enabled(void)
+{
+ return static_branch_unlikely(&__sched_sm_enable);
+}
#else /* !CONFIG_PREFERRED_CPU */
static inline void sched_push_current_non_preferred_cpu(struct rq *rq) { }
static inline void sched_init_steal_monitor(void) { }
+static inline void sched_trigger_steal_computation(int cpu) { }
+static inline bool sched_steal_mon_enabled(void) { return false; }
#endif
#endif /* _KERNEL_SCHED_SCHED_H */
--
2.47.3
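For readers following the series, the tick-side gating described in the commit message can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the kernel code: struct steal_monitor_model, first_common_cpu(), should_queue_work() and count_samples() are hypothetical names, CPU masks are modelled as 64-bit words, -1 stands in for "no CPU found", and the model assumes the queued work runs immediately and records its timestamp, as sched_steal_detection_work() does with sm->prev_time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the patch's steal_mon state. */
struct steal_monitor_model {
	int64_t prev_time_ms;            /* time of the last completed sample */
	unsigned int sampling_period_ms; /* 1000 by default, per the commit message */
};

/*
 * Lowest CPU set in both masks, or -1 if there is none.
 * Models the role of cpumask_first_and() on a 64-bit mask (assumption:
 * at most 64 CPUs, and -1 instead of the kernel's "not found" value).
 */
static int first_common_cpu(uint64_t housekeeping, uint64_t online)
{
	uint64_t both = housekeeping & online;

	for (int cpu = 0; cpu < 64; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	return -1;
}

/* True when a tick on 'cpu' at 'now_ms' would queue the steal work. */
static bool should_queue_work(const struct steal_monitor_model *sm,
			      int cpu, int first_hk_cpu, int64_t now_ms)
{
	/* Only the first online housekeeping CPU does the work. */
	if (cpu != first_hk_cpu)
		return false;

	/* Too early: less than one sampling period since the last sample. */
	if (now_ms - sm->prev_time_ms < (int64_t)sm->sampling_period_ms)
		return false;

	return true;
}

/*
 * Replay ticks every 'tick_ms' up to and including 'duration_ms' and
 * count how often the work would be queued. Model assumption: the work
 * runs at once and stores its run time in prev_time_ms.
 */
static int count_samples(struct steal_monitor_model *sm, int cpu,
			 int first_hk_cpu, int64_t tick_ms,
			 int64_t duration_ms)
{
	int queued = 0;

	assert(tick_ms > 0);
	for (int64_t now = tick_ms; now <= duration_ms; now += tick_ms) {
		if (should_queue_work(sm, cpu, first_hk_cpu, now)) {
			sm->prev_time_ms = now;
			queued++;
		}
	}
	return queued;
}
```

In this model, a 4 ms tick with the default 1000 ms period queues the work three times over 3000 ms on the first housekeeping CPU and never on any other CPU. In the kernel the work runs asynchronously, so several ticks may pass before prev_time is updated; schedule_work_on() returns false for a work item that is already pending, so those repeated calls do not queue it twice.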