From: Xiaoyao Li <xiaoyao.li@intel.com>
To: Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Marc Zyngier <maz@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>
Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org,
	Jim Mattson <jmattson@google.com>
Subject: Re: [PATCH v5 2/5] KVM: x86: Provide a capability to disable APERF/MPERF read intercepts
Date: Thu, 26 Jun 2025 16:59:00 +0800
Message-ID: <bc7aea45-f254-4cbc-8dc0-5435417d8577@intel.com>
In-Reply-To: <20250626001225.744268-3-seanjc@google.com>

On 6/26/2025 8:12 AM, Sean Christopherson wrote:
> From: Jim Mattson <jmattson@google.com>
> 
> Allow a guest to read the physical IA32_APERF and IA32_MPERF MSRs
> without interception.
> 
> The IA32_APERF and IA32_MPERF MSRs are not virtualized. Writes are not
> handled at all. The MSR values are not zeroed on vCPU creation, saved
> on suspend, or restored on resume. No accommodation is made for
> processor migration or for sharing a logical processor with other
> tasks. No adjustments are made for non-unit TSC multipliers. The MSRs
> do not account for time the same way as the comparable PMU events,
> whether the PMU is virtualized by the traditional emulation method or
> the new mediated pass-through approach.
> 
> Nonetheless, in a properly constrained environment, this capability
> can be combined with a guest CPUID table that advertises support for
> CPUID.6:ECX.APERFMPERF[bit 0] to induce a Linux guest to report the
> effective physical CPU frequency in /proc/cpuinfo. Moreover, there is
> no performance cost for this capability.
> 
> Signed-off-by: Jim Mattson <jmattson@google.com>
> Link: https://lore.kernel.org/r/20250530185239.2335185-3-jmattson@google.com
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   Documentation/virt/kvm/api.rst | 23 +++++++++++++++++++++++
>   arch/x86/kvm/svm/nested.c      |  4 +++-
>   arch/x86/kvm/svm/svm.c         |  5 +++++
>   arch/x86/kvm/vmx/nested.c      |  6 ++++++
>   arch/x86/kvm/vmx/vmx.c         |  4 ++++
>   arch/x86/kvm/x86.c             |  6 +++++-
>   arch/x86/kvm/x86.h             |  5 +++++
>   include/uapi/linux/kvm.h       |  1 +
>   tools/include/uapi/linux/kvm.h |  1 +
>   9 files changed, 53 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 43ed57e048a8..27ced3ee2b53 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7844,6 +7844,7 @@ Valid bits in args[0] are::
>     #define KVM_X86_DISABLE_EXITS_HLT              (1 << 1)
>     #define KVM_X86_DISABLE_EXITS_PAUSE            (1 << 2)
>     #define KVM_X86_DISABLE_EXITS_CSTATE           (1 << 3)
> +  #define KVM_X86_DISABLE_EXITS_APERFMPERF       (1 << 4)
>   
>   Enabling this capability on a VM provides userspace with a way to no
>   longer intercept some instructions for improved latency in some
> @@ -7854,6 +7855,28 @@ all such vmexits.
>   
>   Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.
>   
> +Virtualizing the ``IA32_APERF`` and ``IA32_MPERF`` MSRs requires more
> +than just disabling APERF/MPERF exits. While both Intel and AMD
> +document strict usage conditions for these MSRs--emphasizing that only
> +the ratio of their deltas over a time interval (T0 to T1) is
> +architecturally defined--simply passing through the MSRs can still
> +produce an incorrect ratio.
> +
> +This erroneous ratio can occur if, between T0 and T1:
> +
> +1. The vCPU thread migrates between logical processors.
> +2. Live migration or suspend/resume operations take place.
> +3. Another task shares the vCPU's logical processor.
> +4. C-states lower thean C0 are emulated (e.g., via HLT interception).

s/thean/than/

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
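
For context on the quoted documentation text: it states that only the ratio
of the APERF and MPERF deltas over an interval (T0 to T1) is architecturally
defined. A minimal sketch of that arithmetic follows. It is an illustration
only, not code from the patch: effective_khz() and its base_khz parameter are
hypothetical names, and the sketch assumes both samples were taken on the same
logical processor with none of the disturbances listed above occurring in
between. The 128-bit intermediate relies on the GCC/Clang unsigned __int128
extension.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): scale a base frequency by
 * the ratio of the APERF delta to the MPERF delta over one interval.
 *
 * Assumption: the T0 and T1 samples come from the same logical processor,
 * with no migration, suspend/resume, or sharing in between. Otherwise the
 * ratio can be wrong, as the quoted documentation warns.
 */
static uint64_t effective_khz(uint64_t base_khz,
                              uint64_t aperf_t0, uint64_t aperf_t1,
                              uint64_t mperf_t0, uint64_t mperf_t1)
{
    /* Unsigned subtraction yields the right delta across a counter wrap. */
    uint64_t d_aperf = aperf_t1 - aperf_t0;
    uint64_t d_mperf = mperf_t1 - mperf_t0;

    /* No reference cycles elapsed: the ratio is undefined, report 0. */
    if (d_mperf == 0)
        return 0;

    /* 128-bit intermediate (compiler extension) avoids overflow. */
    return (uint64_t)(((unsigned __int128)base_khz * d_aperf) / d_mperf);
}
```

With a base of 2000000 kHz, an APERF delta of 3000, and an MPERF delta of
2000, the helper returns 3000000 kHz, i.e. the CPU ran at 1.5x the base
frequency on average while in C0 during that interval.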
