public inbox for kvm@vger.kernel.org
From: Sean Christopherson <seanjc@google.com>
To: Wanpeng Li <kernellwp@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	 Thomas Gleixner <tglx@linutronix.de>,
	Paolo Bonzini <pbonzini@redhat.com>,
	 K Prateek Nayak <kprateek.nayak@amd.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	 Steven Rostedt <rostedt@goodmis.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	 Juri Lelli <juri.lelli@redhat.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	 Wanpeng Li <wanpengli@tencent.com>
Subject: Re: [PATCH v2 0/9] sched/kvm: Semantics-aware vCPU scheduling for oversubscribed KVM
Date: Thu, 12 Mar 2026 18:13:31 -0700
Message-ID: <abNku6Vgx2eAn-ki@google.com>
In-Reply-To: <20251219035334.39790-1-kernellwp@gmail.com>

On Fri, Dec 19, 2025, Wanpeng Li wrote:
> Part 2: KVM IPI-Aware Directed Yield (patches 6-9)
> 
> Enhance kvm_vcpu_on_spin() with lightweight IPI tracking to improve
> directed yield candidate selection. Track sender/receiver relationships
> when IPIs are delivered and use this information to prioritize yield
> targets.
> 
> The tracking mechanism:
> 
> - Hooks into kvm_irq_delivery_to_apic() to detect unicast fixed IPIs (the
>   common case for inter-processor synchronization). When exactly one
>   destination vCPU receives an IPI, record the sender->receiver relationship
>   with a monotonic timestamp.
> 
>   In high VM density scenarios, software-based IPI tracking through
>   interrupt delivery interception becomes particularly valuable. It
>   captures precise sender/receiver relationships that can be leveraged
>   for intelligent scheduling decisions, providing performance benefits
>   that complement or even exceed hardware-accelerated interrupt delivery
>   in overcommitted environments.
> 
> - Uses lockless READ_ONCE/WRITE_ONCE accessors for minimal overhead. The
>   per-vCPU ipi_context structure is carefully designed to avoid cache line
>   bouncing.
> 
> - Implements a short recency window (50ms default) to avoid stale IPI
>   information inflating boost priority on throughput-sensitive workloads.
>   Old IPI relationships are naturally aged out.
> 
> - Clears IPI context on EOI with two-stage precision: unconditionally clear
>   the receiver's context (it processed the interrupt), but only clear the
>   sender's pending flag if the receiver matches and the IPI is recent. This
>   prevents unrelated EOIs from prematurely clearing valid IPI state.
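
For illustration only, the tracking rules quoted above can be sketched in
plain userspace C. This is not code from the series: struct ipi_ctx,
track_ipi(), track_eoi() and is_yield_candidate() are hypothetical names, the
field layout is an assumption, and plain loads/stores stand in for the
READ_ONCE/WRITE_ONCE accessors the cover letter describes. Only the 50ms
window and the two-stage EOI rule are taken from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 50ms recency window, per the cover letter's stated default. */
#define IPI_WINDOW_NS (50ULL * 1000 * 1000)
#define NO_VCPU (-1)

/* Hypothetical per-vCPU tracking state; the real layout may differ. */
struct ipi_ctx {
	int last_target;   /* receiver of this vCPU's last unicast IPI */
	int last_sender;   /* sender of the last IPI delivered to this vCPU */
	uint64_t send_ns;  /* monotonic timestamp of the last send */
	bool pending;      /* sender is still waiting on last_target */
};

static void ipi_ctx_init(struct ipi_ctx *c)
{
	c->last_target = NO_VCPU;
	c->last_sender = NO_VCPU;
	c->send_ns = 0;
	c->pending = false;
}

/* An IPI is "recent" if it was sent within the window ending at now. */
static bool ipi_is_recent(const struct ipi_ctx *c, uint64_t now)
{
	return now - c->send_ns <= IPI_WINDOW_NS;
}

/* Delivery path: record a unicast fixed IPI from sender to receiver. */
static void track_ipi(struct ipi_ctx *v, int sender, int receiver, uint64_t now)
{
	assert(sender >= 0 && receiver >= 0);
	v[sender].last_target = receiver;
	v[sender].send_ns = now;
	v[sender].pending = true;
	v[receiver].last_sender = sender;
}

/*
 * EOI path, two stages: always clear the receiver's context, but clear the
 * sender's pending flag only if it still targets this receiver and the IPI
 * is recent.
 */
static void track_eoi(struct ipi_ctx *v, int receiver, uint64_t now)
{
	int sender = v[receiver].last_sender;

	v[receiver].last_sender = NO_VCPU;
	if (sender == NO_VCPU)
		return;
	if (v[sender].pending && v[sender].last_target == receiver &&
	    ipi_is_recent(&v[sender], now))
		v[sender].pending = false;
}

/* Directed yield: prefer the vCPU the spinner recently sent an IPI to. */
static bool is_yield_candidate(const struct ipi_ctx *v, int spinner,
			       int candidate, uint64_t now)
{
	return v[spinner].pending && v[spinner].last_target == candidate &&
	       ipi_is_recent(&v[spinner], now);
}
```

In this sketch, a spinning vCPU 0 that just sent an IPI to vCPU 2 would
prefer vCPU 2 as its yield target until either vCPU 2's EOI arrives or the
window expires. Every step hangs off the software delivery and EOI paths.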

That all relies on lack of IPI and EOI virtualization, which seems very
counter-productive given the way hardware is headed.

My reaction to all of this is that in the long run, we'd be far better off getting
the guest to "cooperate" in the sense of communicating intent, status, etc.  As
a stop-gap for older hardware, this obviously is beneficial.  But AFAICT the IPI
tracking is going to be dead weight in the near future.

And there are many, many use cases that want PV scheduling, i.e. if people band
together, I suspect it's feasible to get Linux-as-a-guest to provide hints to the
host that can be used to make scheduling decisions.

> Performance Results
> -------------------
> 
> Test environment: Intel Xeon, 16 physical cores, 16 vCPUs per VM

What generation of CPU?
