From: Wanpeng Li <kernellwp@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
Juri Lelli <juri.lelli@redhat.com>,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Wanpeng Li <wanpengli@tencent.com>
Subject: [PATCH 02/10] sched/fair: Add rate-limiting and validation helpers
Date: Mon, 10 Nov 2025 11:32:23 +0800 [thread overview]
Message-ID: <20251110033232.12538-3-kernellwp@gmail.com> (raw)
In-Reply-To: <20251110033232.12538-1-kernellwp@gmail.com>
From: Wanpeng Li <wanpengli@tencent.com>
From: Wanpeng Li <wanpengli@tencent.com>
Implement core safety mechanisms for yield deboost operations.
Add yield_deboost_rate_limit() for high-frequency gating to prevent
excessive overhead on compute-intensive workloads. Use 6ms threshold
with lockless READ_ONCE/WRITE_ONCE to minimize cache line contention
while providing effective rate limiting.
Add yield_deboost_validate_tasks() for comprehensive validation
ensuring feature is enabled via sysctl, both tasks are valid and
distinct, both belong to fair_sched_class, entities are on the same
runqueue, and tasks are runnable.
The rate limiter prevents pathological high-frequency cases while
validation ensures only appropriate task pairs proceed. Both functions
are static and will be integrated in subsequent patches.
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
kernel/sched/fair.c | 68 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 68 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5b7fcc86ccff..a7dc21c2dbdb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8990,6 +8990,74 @@ static void put_prev_task_fair(struct rq *rq, struct task_struct *prev, struct t
}
}
+/*
+ * High-frequency yield gating to reduce overhead on compute-intensive workloads.
+ * Returns true if the yield should be skipped due to frequency limits.
+ *
+ * Optimized: single threshold with READ_ONCE/WRITE_ONCE, refresh timestamp on every call.
+ */
+static bool yield_deboost_rate_limit(struct rq *rq, u64 now_ns)
+{
+ u64 last = READ_ONCE(rq->yield_deboost_last_time_ns);
+ bool limited = false;
+
+ if (last) {
+ u64 delta = now_ns - last;
+ limited = (delta <= 6000ULL * NSEC_PER_USEC);
+ }
+
+ WRITE_ONCE(rq->yield_deboost_last_time_ns, now_ns);
+ return limited;
+}
+
+/*
+ * Validate tasks and basic parameters for yield deboost operation.
+ * Performs comprehensive safety checks including feature enablement,
+ * NULL pointer validation, task state verification, and same-rq requirement.
+ * Returns false with appropriate debug logging if any validation fails,
+ * ensuring only safe and meaningful yield operations proceed.
+ */
+static bool __maybe_unused yield_deboost_validate_tasks(struct rq *rq, struct task_struct *p_target,
+ struct task_struct **p_yielding_out,
+ struct sched_entity **se_y_out,
+ struct sched_entity **se_t_out)
+{
+ struct task_struct *p_yielding;
+ struct sched_entity *se_y, *se_t;
+ u64 now_ns;
+
+ if (!sysctl_sched_vcpu_debooster_enabled)
+ return false;
+
+ if (!rq || !p_target)
+ return false;
+
+ now_ns = rq->clock;
+
+ if (yield_deboost_rate_limit(rq, now_ns))
+ return false;
+
+ p_yielding = rq->curr;
+ if (!p_yielding || p_yielding == p_target ||
+ p_target->sched_class != &fair_sched_class ||
+ p_yielding->sched_class != &fair_sched_class)
+ return false;
+
+ se_y = &p_yielding->se;
+ se_t = &p_target->se;
+
+ if (!se_t || !se_y || !se_t->on_rq || !se_y->on_rq)
+ return false;
+
+ if (task_rq(p_yielding) != rq || task_rq(p_target) != rq)
+ return false;
+
+ *p_yielding_out = p_yielding;
+ *se_y_out = se_y;
+ *se_t_out = se_t;
+ return true;
+}
+
/*
* sched_yield() is very simple
*/
--
2.43.0
next prev parent reply other threads:[~2025-11-10 3:32 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-10 3:32 [PATCH 00/10] sched/kvm: Semantics-aware vCPU scheduling for oversubscribed KVM Wanpeng Li
2025-11-10 3:32 ` [PATCH 01/10] sched: Add vCPU debooster infrastructure Wanpeng Li
2025-11-10 3:32 ` Wanpeng Li [this message]
2025-11-12 6:40 ` [PATCH 02/10] sched/fair: Add rate-limiting and validation helpers K Prateek Nayak
2025-11-12 6:44 ` K Prateek Nayak
2025-11-13 13:36 ` Wanpeng Li
2025-11-13 12:00 ` Wanpeng Li
2025-11-10 3:32 ` [PATCH 03/10] sched/fair: Add cgroup LCA finder for hierarchical yield Wanpeng Li
2025-11-12 6:50 ` K Prateek Nayak
2025-11-13 8:59 ` Wanpeng Li
2025-11-10 3:32 ` [PATCH 04/10] sched/fair: Add penalty calculation and application logic Wanpeng Li
2025-11-12 7:25 ` K Prateek Nayak
2025-11-13 13:25 ` Wanpeng Li
2025-11-10 3:32 ` [PATCH 05/10] sched/fair: Wire up yield deboost in yield_to_task_fair() Wanpeng Li
2025-11-10 5:16 ` kernel test robot
2025-11-10 5:16 ` kernel test robot
2025-11-10 3:32 ` [PATCH 06/10] KVM: Fix last_boosted_vcpu index assignment bug Wanpeng Li
2025-11-21 0:35 ` Sean Christopherson
2025-11-21 0:38 ` Sean Christopherson
2025-11-21 11:46 ` Wanpeng Li
2025-11-10 3:32 ` [PATCH 07/10] KVM: x86: Add IPI tracking infrastructure Wanpeng Li
2025-11-10 3:32 ` [PATCH 08/10] KVM: x86/lapic: Integrate IPI tracking with interrupt delivery Wanpeng Li
2025-11-10 3:32 ` [PATCH 09/10] KVM: Implement IPI-aware directed yield candidate selection Wanpeng Li
2025-11-10 3:39 ` [PATCH 10/10] KVM: Relaxed boost as safety net Wanpeng Li
2025-11-10 12:02 ` [PATCH 00/10] sched/kvm: Semantics-aware vCPU scheduling for oversubscribed KVM Christian Borntraeger
2025-11-12 5:01 ` Wanpeng Li
2025-11-18 8:11 ` Christian Borntraeger
2025-11-18 14:19 ` Wanpeng Li
2025-11-11 6:28 ` K Prateek Nayak
2025-11-12 4:54 ` Wanpeng Li
2025-11-12 6:07 ` K Prateek Nayak
2025-11-13 5:37 ` Wanpeng Li
2025-11-13 4:42 ` K Prateek Nayak
2025-11-13 8:33 ` Wanpeng Li
2025-11-13 9:48 ` K Prateek Nayak
2025-11-13 13:56 ` Wanpeng Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251110033232.12538-3-kernellwp@gmail.com \
--to=kernellwp@gmail.com \
--cc=juri.lelli@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=seanjc@google.com \
--cc=tglx@linutronix.de \
--cc=vincent.guittot@linaro.org \
--cc=wanpengli@tencent.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox