From: Eduardo Habkost <ehabkost@redhat.com>
To: Roman Kagan <rkagan@virtuozzo.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Richard Henderson <rth@twiddle.net>,
Marcelo Tosatti <mtosatti@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
Date: Thu, 22 Mar 2018 15:38:13 -0300
Message-ID: <20180322183813.GA28161@localhost.localdomain>
In-Reply-To: <20180322135803.GC4500@rkaganb.sw.ru>

On Thu, Mar 22, 2018 at 04:58:03PM +0300, Roman Kagan wrote:
> On Thu, Mar 22, 2018 at 10:22:18AM -0300, Eduardo Habkost wrote:
> > On Thu, Mar 22, 2018 at 04:00:14PM +0300, Roman Kagan wrote:
> > > On Wed, Mar 21, 2018 at 05:19:24PM -0300, Eduardo Habkost wrote:
> > > > On Wed, Mar 21, 2018 at 07:57:29PM +0300, Roman Kagan wrote:
> > > > > On Wed, Mar 21, 2018 at 02:18:54PM +0100, Vitaly Kuznetsov wrote:
> > > > > > Roman Kagan <rkagan@virtuozzo.com> writes:
> > > > > >
> > > > > > > On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> > > > > > > >> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTSC
> > > > > > >> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> > > > > > >> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> > > > > > >> these frequencies are stable, it just means they're available for reading.
> > > > > > >>
> > > > > > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > > > > >> ---
> > > > > > >> target/i386/kvm.c | 2 +-
> > > > > > >> 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > >>
> > > > > > >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > > > > > >> index 7d9f9ca0b1..74fc3d3b2c 100644
> > > > > > >> --- a/target/i386/kvm.c
> > > > > > >> +++ b/target/i386/kvm.c
> > > > > > >> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
> > > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> > > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> > > > > > > >>
> > > > > > > >> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> > > > > > > >> +        if (has_msr_hv_frequencies && env->tsc_khz) {
> > > > > > > >>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> > > > > > > >>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> > > > > > > >>          }
> > > > > > >
> > > > > > > I suggest that we add a corresponding cpu property here, too. The guest
> > > > > > > may legitimately rely on these msrs when it sees the support in CPUID,
> > > > > > > and migrating from a kernel with the feature supported (4.14+) to an
> > > > > > > older one will make it crash.
> > > > > > >
> > > > > >
> > > > > > This can be arranged, but what happens to people who use these features
> > > > > > today? Assuming they also passed 'invtsc' they have stable TSC page
> > > > > > clocksource already (when Hyper-V role is enabled) but when we start
> > > > > > requesting a new 'hv_frequency' cpu property they'll suddenly lose what
> > > > > > they have...
> > > > >
> > > > > I see two cases here:
> > > > >
> > > > > > 1) people start a new VM, and discover that their old configuration is
> > > > > >    not enough for this feature to work.
> > > > > >
> > > > > >    They need to reconfigure and restart the VM. This costs them some
> > > > > >    time investigating and restarting, but not data.
> > > >
> > > > If we keep machine-type compatibility, people will need to do
> > > > that only if they change the machine-type (or use the "pc" or
> > > > "q35" aliases). If they copy the old configuration, it will keep
> > > > working.
> > >
> > > The problem is that the feature is not fixed by the machine-type, due to
> > > the forgotten property: it only depends on the KVM version. So, once
> > > (if) we add the property and make the feature deterministic, we'll lose
> > > compatibility one way or another.
> > >
> > > Or are you suggesting that for pre-2.12 machine types we leave the
> > > property at "decided by your KVM" state?
> >
> > Yes, that's what I mean. This looks like the only way to avoid
> > losing features by just cold-rebooting an existing VM.
> >
> > The scenario I'm thinking is this:
> >
> > 1) pc-2.11 VM started on host running QEMU 2.11
> > 2) VM migrated to a host containing this patch
> > 3) 1 year later, the VM is shut down and booted again.
> > 4) Things stop working inside the VM because hv-frequency is
> >    unexpectedly gone.
> >
> > Machine-type compatibility code would avoid (4).
>
> Right, but (4) typically means that you fail to start your workload from
> a clean state, so you just go and fix it; no data is lost.
>
> Compare this to a migration to an older KVM which results in your guest
> crashing, where you risk data loss and still have to meddle with
> configs.

True. To make it worse, we are already unable to avoid this crash
on existing VMs without a reboot. The only case where we can fix
this is if live-migration to older KVM happens after the guest
was rebooted when running on a newer QEMU version. :(

>
> > > > machine-type compatibility also makes the following case a bit
> > > > safer:
> > > >
> > > > >
> > > > > > 2) people migrate from a QEMU without ->hv_frequency, to a new one with
> > > > > >    ->hv_frequency=off (assuming on both ends KVM supports the frequency
> > > > > >    MSRs).
> > > > > >
> > > > > >    With the current implementation in KVM, this will only result in the
> > > > > >    feature bits disappearing from the respective CPUID leaf, but the
> > > > > >    MSRs themselves will continue working as they used to. So the guest
> > > > > >    either won't notice or will check the CPUID and adjust.
> > > >
> > > > If we keep machine-type compatibility, the CPUID bit won't
> > > > disappear for the guest while the MSRs keep working.
> > > >
> > > >
> > > > Whichever solution we choose, we can still have guests crashing
> > > > if migrating a pc-2.11 machine from a 4.14+ host kernel to a host
> > > > with an older kernel. But I don't think there's a way out of
> > > > this, except requiring an explicit "hv-frequencies" CPU option on
> > > > newer machine-types.
> > >
> > > What's wrong with requiring it, as we do for all other hv_* properties?
> >
> > On new machine-types, nothing wrong.
> >
> > On existing machine-types, see above.
>
> I wonder if the following can cater to all relevant cases:
>
> - hv_frequencies property is added, defaulting to "off", so that new
>   users of this feature would need to explicitly turn it on;
>
> - on pre-2.12 machine types, it's set to the value of hv_time property
>   by the compat code, so that on VMs where this feature could
>   potentially be present it would become required; as a result, these
>   configurations will refuse to start on insufficiently capable KVM,
>   preventing the migration attempts.

This sounds like the safest option. The cost will be the
inconvenience of being unable to run pc-2.11 on hosts with older
KVM (Linux < v4.14, without commit
72c139bacfa386145d7bbb68c47c8824716153b6), and the need to
explicitly enable hv-frequencies on pc-2.12 and newer.

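
To make sure I read the proposal correctly, here is a rough sketch of the
decision logic as a standalone model. It is not QEMU code: the struct, the
field names and both functions are made up for illustration, and it assumes
that on pre-2.12 machine types the compat value simply follows hv_time
(an explicit override by the user is ignored here).

```c
#include <stdbool.h>

/* Hypothetical model of the proposal; none of these names exist in QEMU. */
struct hv_config {
    bool pre_2_12_machine;  /* machine type older than pc-2.12 */
    bool hv_time;           /* value of the hv_time property */
    bool hv_frequencies;    /* explicit hv_frequencies setting, default off */
};

/* Value of hv_frequencies after the (assumed) compat code has run. */
static bool effective_hv_frequencies(const struct hv_config *c)
{
    if (c->pre_2_12_machine) {
        /* Old machine types: feature may already be visible, follow hv_time. */
        return c->hv_time;
    }
    /* New machine types: only enabled when explicitly requested. */
    return c->hv_frequencies;
}

/* A VM that requires the frequency MSRs refuses to start without KVM support. */
static bool vm_can_start(const struct hv_config *c, bool kvm_has_freq_msrs)
{
    return !effective_hv_frequencies(c) || kvm_has_freq_msrs;
}
```

In this model a pc-2.11 VM with hv-time refuses to start on a KVM without
the frequency MSRs, while a pc-2.12 VM with only hv-time starts anywhere
and just doesn't get the feature.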
>
> Am I missing any scenarios that aren't covered?
>

It looks like the guest can still crash if we migrate
"QEMU-2.12 -machine pc-2.11 -cpu ...,+hv-time" to a host running
QEMU 2.11 and Linux < 4.14. I wonder if there's a way to avoid
that? If there's a way to avoid that with extra migration
subsections, is it worth the effort/complexity?

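
The remaining hole can be written down as a small standalone model. This is
only a sketch of the reasoning above, with invented names (enum outcome,
struct dest_host, migrate()); it assumes a destination QEMU that knows the
property would refuse the incoming VM, and that one that doesn't know it
cannot.

```c
#include <stdbool.h>

/* Hypothetical model; these names are illustrative, not QEMU APIs. */
enum outcome { MIGRATION_OK, MIGRATION_REFUSED, GUEST_MAY_CRASH };

struct dest_host {
    bool qemu_knows_hv_frequencies;  /* false for QEMU 2.11 */
    bool kvm_has_frequency_msrs;     /* false for Linux < 4.14 */
};

static enum outcome migrate(bool guest_saw_frequency_msrs,
                            const struct dest_host *dst)
{
    /* Nothing to lose, or the destination KVM still provides the MSRs. */
    if (!guest_saw_frequency_msrs || dst->kvm_has_frequency_msrs) {
        return MIGRATION_OK;
    }
    /* A QEMU aware of the property can fail cleanly before the guest runs. */
    if (dst->qemu_knows_hv_frequencies) {
        return MIGRATION_REFUSED;
    }
    /* Old QEMU on old KVM: the guest resumes and the MSRs are gone. */
    return GUEST_MAY_CRASH;
}
```

Only the combination of an old QEMU and an old kernel on the destination
ends in GUEST_MAY_CRASH, which is the case extra migration subsections
would have to cover.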
--
Eduardo