From: Paolo Bonzini <pbonzini@redhat.com>
To: Marcelo Tosatti <mtosatti@redhat.com>, qemu-devel@nongnu.org
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH 0/2] read kvmclock from guest memory if !correct_tsc_shift
Date: Fri, 20 Jan 2023 09:54:03 +0100
Message-ID: <4c924730-b77e-146d-55e6-c29588adc61c@redhat.com>
In-Reply-To: <20230120011116.134437211@redhat.com>
On 1/20/23 02:11, Marcelo Tosatti wrote:
> Before kernel commit 78db6a5037965429c04d708281f35a6e5562d31b,
> kvm_guest_time_update() would use vcpu->virtual_tsc_khz to calculate
> tsc_shift value in the vcpus pvclock structure written to guest memory.
To clarify, the problem is that kvm_guest_time_update() uses the guest
TSC frequency *that userspace desired* instead of the *actual* TSC
frequency. When the two differ by less than the 250 ppm tolerance, TSC
scaling is not enabled, so the kvmclock published to the guest is
incorrect. KVM_GET_CLOCK, by contrast, returns the correct value, and the
bug occurs when migrating from a host that is publishing a buggy kvmclock
to the guest.
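A minimal sketch of why the two values diverge, assuming the usual
guest-side pvclock arithmetic (system_time plus the TSC delta scaled by
tsc_to_system_mul and tsc_shift). The struct is a reduced subset of the
pvclock fields, and scale_for_khz() and kvmclock_error_ns() are
hypothetical helpers written for this illustration; they are not the
kernel's scale computation, and the fixed shift of -1 only holds for
frequencies above 2 GHz.

```c
#include <assert.h>
#include <stdint.h>

/* Reduced subset of the pvclock fields (illustration only, not the ABI layout). */
struct pvclock_sketch {
    uint64_t tsc_timestamp;     /* guest TSC when the host filled the page */
    uint64_t system_time;       /* kvmclock in ns at tsc_timestamp */
    uint32_t tsc_to_system_mul; /* 32.32 fixed-point ns per shifted tick */
    int8_t   tsc_shift;         /* pre-shift applied to the TSC delta */
};

/* Guest-side read: system_time + ((delta shifted by tsc_shift) * mul >> 32). */
static uint64_t pvclock_read_ns(const struct pvclock_sketch *p, uint64_t tsc)
{
    uint64_t delta = tsc - p->tsc_timestamp;

    if (p->tsc_shift >= 0)
        delta <<= p->tsc_shift;
    else
        delta >>= -p->tsc_shift;

    /* (delta * mul) >> 32, split in halves to avoid a 128-bit product. */
    uint64_t hi = delta >> 32;
    uint64_t lo = delta & 0xffffffffu;
    return p->system_time + hi * p->tsc_to_system_mul +
           ((lo * p->tsc_to_system_mul) >> 32);
}

/* Hypothetical helper: derive mul for a frequency, with shift fixed at -1. */
static void scale_for_khz(struct pvclock_sketch *p, uint32_t tsc_khz)
{
    assert(tsc_khz > 2000000); /* keeps the multiplier below 2^32 */
    p->tsc_shift = -1;
    p->tsc_to_system_mul = (uint32_t)((2000000ull << 32) / tsc_khz);
}

/*
 * Hypothetical helper: ns difference between a page built from the
 * requested frequency and one built from the actual frequency, after
 * `seconds` of guest run time at the actual TSC rate.
 */
static int64_t kvmclock_error_ns(uint32_t actual_khz, uint32_t requested_khz,
                                 uint64_t seconds)
{
    struct pvclock_sketch wrong = {0}, right = {0};

    scale_for_khz(&wrong, requested_khz);
    scale_for_khz(&right, actual_khz);

    uint64_t tsc = (uint64_t)actual_khz * 1000 * seconds;
    return (int64_t)(pvclock_read_ns(&wrong, tsc) - pvclock_read_ns(&right, tsc));
}
```

With an actual TSC of 3000000 kHz and a requested 3000600 kHz (200 ppm
apart, so inside the tolerance), kvmclock_error_ns(3000000, 3000600, 1)
comes out at roughly -200 microseconds per second of guest run time.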
> For those kernels, if vcpu->virtual_tsc_khz != tsc_khz (which can be the
> case when guest state is restored via migration, or if tsc-khz option is
> passed to QEMU), and TSC scaling is not enabled (which happens if the
> difference between the frequency requested via KVM_SET_TSC_KHZ and the
> host TSC KHZ is smaller than 250ppm), then there can be a difference
> between what KVM_GET_CLOCK would return and what the guest reads as
> kvmclock value.
In practice, to trigger the bug you need to do two migrations from a
six-year-old kernel; I just can't see too many people stumbling upon
this in the wild, and I don't think it makes sense to hobble _all_
migrations from a kernel that is less than six years old for such an
edge case. New versions of QEMU do not even support running with such
old kernels (they will, for example, complain about missing support for
certain KVM PV features).
It is not too much to ask for users to know whether they are in the
problematic case. The easiest approach is to use a custom QEMU on the
destination that always computes the kvmclock value from guest memory
if the page is valid.
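As a rough sketch of that destination-side choice, assuming "valid"
means the enable bit (bit 0) of the guest's kvmclock system-time MSR is
set: the function and macro names below are hypothetical and do not
correspond to existing QEMU code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for bit 0 of the kvmclock system-time MSR value. */
#define SKETCH_PVCLOCK_PAGE_ENABLED 1ull

/*
 * Hypothetical policy for the custom destination QEMU: prefer the value
 * computed from the guest-visible pvclock page, and fall back to the
 * KVM_GET_CLOCK value saved by the source when the page is not enabled.
 */
static uint64_t incoming_kvmclock_ns(uint64_t system_time_msr,
                                     uint64_t ns_from_guest_page,
                                     uint64_t ns_from_get_clock)
{
    bool page_valid = (system_time_msr & SKETCH_PVCLOCK_PAGE_ENABLED) != 0;

    return page_valid ? ns_from_guest_page : ns_from_get_clock;
}
```

Because the guest keeps seeing the value it was already reading from the
page, this choice avoids the visible jump without needing any flag from
the source kernel.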
Once you do a migration to the custom QEMU + a fixed kernel, the bug is
gone for good and there is no need to introduce a new user API for that.
Paolo
> The effect is that the guest sees a jump in kvmclock value
> (either forwards or backwards) in such case.
>
> To fix incoming migration from pre-78db6a5037965 hosts,
> read kvmclock value from guest memory.
>
> Unless the KVM_CLOCK_CORRECT_TSC_SHIFT bit indicates
> that the value retrieved by KVM_GET_CLOCK on the source
> is safe to be used.