From: Wanpeng Li <kernellwp@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Wanpeng Li <wanpengli@tencent.com>,
	Richie Buturla <richie@linux.ibm.com>
Subject: [PATCH v3 02/10] sched/fair: Credit a persistent, queue-depth-scaled vlag margin
Date: Fri, 12 Jun 2026 09:33:47 +0800
Message-ID: <20260612013355.59231-3-kernellwp@gmail.com>
In-Reply-To: <20260612013355.59231-1-kernellwp@gmail.com>

From: Wanpeng Li <wanpengli@tencent.com>

Crediting only up to vlag = 0 makes the buddy eligible for a single
pick_eevdf() pass. The next update_curr() can push it below zero again
before PICK_BUDDY consumes the hint.

Credit to a bounded positive-vlag margin instead, so the buddy stays
eligible across several scheduling decisions. Scale the margin with
runqueue depth, because deeper runqueues dilute eligibility faster, and
clamp it to entity_lag()'s legal positive-lag bound to preserve EEVDF
fairness.

The helper is not called in this change; there is no functional change.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
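Illustration only, not part of the patch: the tiering and clamp described
above can be modelled in userspace roughly as follows. This is a sketch
under stated assumptions. The names margin_for_depth() and credit_needed()
are hypothetical, the base slice is passed in rather than read from
sysctl_sched_base_slice, and the calc_delta_fair() weight scaling is
assumed to be the identity (a nice-0 entity).

```c
#include <stdint.h>

/*
 * Hypothetical userspace model of the margin logic; these helpers are
 * illustrative names, not kernel API.
 */

/* Tiered margin: 4x, 6x, 8x or 12x the base slice, by runqueue depth. */
static uint64_t margin_for_depth(uint64_t base_ns, unsigned int nr_queued)
{
	if (nr_queued <= 4)
		return base_ns * 4;
	if (nr_queued <= 8)
		return base_ns * 6;
	if (nr_queued <= 16)
		return base_ns * 8;
	return base_ns * 12;
}

/*
 * Credit needed to raise vlag to min(margin, lag_limit).
 * A lag_limit of 0 means "no clamp", mirroring the patch's guard.
 * Returns 0 when vlag already meets the (clamped) margin.
 */
static uint64_t credit_needed(int64_t vlag, uint64_t margin, uint64_t lag_limit)
{
	if (lag_limit && margin > lag_limit)
		margin = lag_limit;

	if (vlag >= 0) {
		if ((uint64_t)vlag >= margin)
			return 0;
		return margin - (uint64_t)vlag;
	}

	/* Unsigned negation avoids signed overflow for INT64_MIN. */
	return margin + ((uint64_t)0 - (uint64_t)vlag);
}
```

With a base of 1000 units and five queued entities the margin is 6000. An
entity at vlag = -500 with a margin of 4000 and a lag limit of 3000 needs a
credit of 3500, and an entity already at or above the clamped margin needs
none, which is the idempotence the helper's comment refers to.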
 kernel/sched/fair.c | 58 ++++++++++++++++++++++++++++++++++++---------
 1 file changed, 47 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e7f5ea25fdae..c6502db62cd3 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9342,21 +9342,45 @@ static void put_prev_task_fair(struct rq *rq, struct task_struct *prev, struct t
 }

 /*
- * eevdf_credit_entity_vlag - credit a nominated next-buddy to eligibility
+ * Positive-vlag target margin for a credited buddy, scaled by runqueue
+ * depth so it stays eligible across several picks. The caller clamps it to
+ * entity_lag()'s legal bound, so EEVDF fairness is preserved.
+ */
+static u64 __maybe_unused
+eevdf_persistent_margin(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	u64 base = sysctl_sched_base_slice;
+	unsigned int n = cfs_rq->h_nr_queued;
+	u64 raw;
+
+	if (n <= 4)
+		raw = base * 4;
+	else if (n <= 8)
+		raw = base * 6;
+	else if (n <= 16)
+		raw = base * 8;
+	else
+		raw = base * 12;
+
+	return calc_delta_fair(raw, se);
+}
+
+/*
+ * eevdf_credit_entity_vlag - credit bounded vlag to a nominated next-buddy
  *
  * Advance @se (already nominated by set_next_buddy(), so cfs_rq->next == se)
- * just enough negative vlag to reach the eligibility boundary (vlag = 0) so
- * pick_eevdf()'s PICK_BUDDY branch returns it. cfs_rq->curr is shifted in
- * place (off-tree, carrying any vprot window). Queued entities are left
- * unchanged.
+ * to a bounded positive-vlag margin so pick_eevdf()'s PICK_BUDDY branch
+ * keeps returning it across several picks, without exceeding entity_lag()'s
+ * legal bound. cfs_rq->curr is shifted in place (off-tree, carrying any
+ * vprot window). Queued entities are left unchanged.
  *
- * Idempotent: a no-op once @se is already eligible. Caller must hold
+ * Idempotent once @se holds the margin. Caller must hold
  * rq_of(cfs_rq)->lock with rq_clock up to date.
  */
 static void __maybe_unused
 eevdf_credit_entity_vlag(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	u64 avruntime, credit;
+	u64 avruntime, credit, want, margin, max_slice, lag_limit;
 	s64 vlag;

 	/* Callers gate this helper with YIELD_TO_LAG_CREDIT. */
@@ -9371,11 +9395,23 @@ eevdf_credit_entity_vlag(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	avruntime = avg_vruntime(cfs_rq);
 	vlag = entity_lag(cfs_rq, se, avruntime);

-	/* Already eligible: nothing to do. */
-	if (vlag >= 0)
-		return;
+	/* Clamp the margin to entity_lag()'s bound so place_entity() keeps it. */
+	max_slice = cfs_rq_max_slice(cfs_rq) + TICK_NSEC;
+	lag_limit = calc_delta_fair(max_slice, se);
+	margin = eevdf_persistent_margin(cfs_rq, se);
+	if (lag_limit && margin > lag_limit)
+		margin = lag_limit;
+	if (vlag >= 0) {
+		if ((u64)vlag >= margin)
+			return;
+		want = margin - (u64)vlag;
+	} else {
+		want = margin + (u64)(-vlag);
+	}

-	credit = (u64)(-vlag);
+	credit = want;
+	if (!credit)
+		return;

 	if (cfs_rq->curr == se) {
 		/* curr is off-tree: in-place shift, carrying any vprot window. */
--
2.43.0