From: Wanpeng Li <kernellwp@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Wanpeng Li <wanpengli@tencent.com>
Subject: [PATCH v2 3/9] sched/fair: Add cgroup LCA finder for hierarchical yield
Date: Fri, 19 Dec 2025 11:53:27 +0800
Message-ID: <20251219035334.39790-4-kernellwp@gmail.com>
In-Reply-To: <20251219035334.39790-1-kernellwp@gmail.com>

From: Wanpeng Li <wanpengli@tencent.com>

Implement yield_deboost_find_lca() to locate the lowest common ancestor
(LCA) in the cgroup hierarchy for EEVDF-aware yield operations.

The LCA is the hierarchy level at which the yielding and target entities
become siblings on the same cfs_rq. vruntime adjustments must be applied
at that level so that fairness is maintained across cgroup boundaries.
This is critical for virtualization workloads where vCPUs may be
organized in nested cgroups.

Key aspects:

- With CONFIG_FAIR_GROUP_SCHED: walk up both entity hierarchies by
  first aligning their depths, then ascending together until a common
  cfs_rq is found
- With a flat hierarchy: simply verify that both entities share the
  same cfs_rq
- Validate that meaningful contention exists (h_nr_queued > 1)
- Ensure the yielding entity has a non-zero slice for safe penalty
  calculation

The function must be called with rq->lock held. It is a static helper,
marked __maybe_unused until subsequent patches in the series integrate
it.

v1 -> v2:
- Change nr_queued to h_nr_queued for accurate hierarchical task
  counting that includes tasks in child cgroups
- Improve comments to clarify the LCA algorithm

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
kernel/sched/fair.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
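
For readers unfamiliar with find_matching_se(), the depth-aligning walk
described in the changelog can be modelled in userspace roughly as
follows. This is a minimal sketch under stated assumptions, not the
kernel code: struct toy_se, struct toy_cfs_rq and the toy_* helpers are
hypothetical stand-ins that keep only a parent pointer, a depth and
h_nr_queued, and they assume a well-formed tree in which every depth-0
entity sits on the same root runqueue.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for cfs_rq and sched_entity. */
struct toy_cfs_rq {
	unsigned int h_nr_queued;	/* runnable tasks at and below this level */
};

struct toy_se {
	struct toy_se *parent;		/* group entity one level up, NULL at root */
	struct toy_cfs_rq *cfs_rq;	/* runqueue this entity is queued on */
	int depth;			/* 0 for entities on the root runqueue */
};

/* Align depths, then climb in lockstep until both share a runqueue. */
static void toy_find_matching_se(struct toy_se **a, struct toy_se **b)
{
	while ((*a)->depth > (*b)->depth)
		*a = (*a)->parent;
	while ((*b)->depth > (*a)->depth)
		*b = (*b)->parent;
	while ((*a)->cfs_rq != (*b)->cfs_rq) {
		*a = (*a)->parent;
		*b = (*b)->parent;
	}
}

/*
 * Mirror of the patch's control flow: find the sibling entities, then
 * reject the result unless the common runqueue has real contention.
 * Output parameters are written only on success.
 */
static bool toy_find_lca(struct toy_se *se_y, struct toy_se *se_t,
			 struct toy_se **se_y_lca_out,
			 struct toy_se **se_t_lca_out,
			 struct toy_cfs_rq **cfs_rq_out)
{
	struct toy_cfs_rq *cfs_rq;

	toy_find_matching_se(&se_y, &se_t);

	cfs_rq = se_y->cfs_rq;
	if (cfs_rq->h_nr_queued <= 1)
		return false;

	*se_y_lca_out = se_y;
	*se_t_lca_out = se_t;
	*cfs_rq_out = cfs_rq;
	return true;
}
```

In this model, for a task nested two levels deep under group A and a
task directly under group B, the walk first lifts the deeper entity by
one level, then climbs both until they meet on the root runqueue, so the
entities returned are the group entities for A and B rather than the
tasks themselves.
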
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2f327882bf4d..39dbdd222687 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9102,6 +9102,36 @@ yield_deboost_validate_tasks(struct rq *rq, struct task_struct *p_target)
 	return p_yielding;
 }
 
+/*
+ * Find the lowest common ancestor (LCA) in the cgroup hierarchy.
+ * Uses find_matching_se() to locate sibling entities at the same level,
+ * then returns their common cfs_rq for vruntime adjustments.
+ *
+ * Returns true if a valid LCA with meaningful contention (h_nr_queued > 1)
+ * is found, storing the LCA entities and common cfs_rq in output parameters.
+ */
+static bool __maybe_unused
+yield_deboost_find_lca(struct sched_entity *se_y, struct sched_entity *se_t,
+		       struct sched_entity **se_y_lca_out,
+		       struct sched_entity **se_t_lca_out,
+		       struct cfs_rq **cfs_rq_out)
+{
+	struct sched_entity *se_y_lca = se_y;
+	struct sched_entity *se_t_lca = se_t;
+	struct cfs_rq *cfs_rq;
+
+	find_matching_se(&se_y_lca, &se_t_lca);
+
+	cfs_rq = cfs_rq_of(se_y_lca);
+	if (cfs_rq->h_nr_queued <= 1)
+		return false;
+
+	*se_y_lca_out = se_y_lca;
+	*se_t_lca_out = se_t_lca;
+	*cfs_rq_out = cfs_rq;
+	return true;
+}
+
 /*
  * sched_yield() is very simple
  */
--
2.43.0

Thread overview: 22+ messages
2025-12-19 3:53 [PATCH v2 0/9] sched/kvm: Semantics-aware vCPU scheduling for oversubscribed KVM Wanpeng Li
2025-12-19 3:53 ` [PATCH v2 1/9] sched: Add vCPU debooster infrastructure Wanpeng Li
2025-12-19 3:53 ` [PATCH v2 2/9] sched/fair: Add rate-limiting and validation helpers Wanpeng Li
2025-12-22 21:12 ` kernel test robot
2026-01-04 4:09 ` Hillf Danton
2025-12-19 3:53 ` Wanpeng Li [this message]
2025-12-19 3:53 ` [PATCH v2 4/9] sched/fair: Add penalty calculation and application logic Wanpeng Li
2025-12-22 23:36 ` kernel test robot
2025-12-19 3:53 ` [PATCH v2 5/9] sched/fair: Wire up yield deboost in yield_to_task_fair() Wanpeng Li
2025-12-22 7:06 ` kernel test robot
2025-12-22 9:31 ` kernel test robot
2025-12-19 3:53 ` [PATCH v2 6/9] KVM: x86: Add IPI tracking infrastructure Wanpeng Li
2025-12-19 3:53 ` [PATCH v2 7/9] KVM: x86/lapic: Integrate IPI tracking with interrupt delivery Wanpeng Li
2025-12-19 3:53 ` [PATCH v2 8/9] KVM: Implement IPI-aware directed yield candidate selection Wanpeng Li
2025-12-19 3:53 ` [PATCH v2 9/9] KVM: Relaxed boost as safety net Wanpeng Li
2026-01-04 2:40 ` [PATCH v2 0/9] sched/kvm: Semantics-aware vCPU scheduling for oversubscribed KVM Wanpeng Li
2026-01-05 6:26 ` K Prateek Nayak
2026-03-13 1:13 ` Sean Christopherson
2026-04-01 9:48 ` Wanpeng Li
2026-04-02 23:43 ` Sean Christopherson
2026-03-26 14:41 ` Christian Borntraeger
2026-04-01 9:34 ` Wanpeng Li