qemu-devel.nongnu.org archive mirror
From: Eduardo Habkost <ehabkost@redhat.com>
To: Roman Kagan <rkagan@virtuozzo.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Richard Henderson <rth@twiddle.net>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
Date: Fri, 23 Mar 2018 16:48:27 -0300
Message-ID: <20180323194827.GM3417@localhost.localdomain>
In-Reply-To: <20180323094529.GA28085@rkaganb.sw.ru>

On Fri, Mar 23, 2018 at 12:45:30PM +0300, Roman Kagan wrote:
> On Thu, Mar 22, 2018 at 03:38:13PM -0300, Eduardo Habkost wrote:
> > On Thu, Mar 22, 2018 at 04:58:03PM +0300, Roman Kagan wrote:
> > > On Thu, Mar 22, 2018 at 10:22:18AM -0300, Eduardo Habkost wrote:
> > > > On Thu, Mar 22, 2018 at 04:00:14PM +0300, Roman Kagan wrote:
> > > > > On Wed, Mar 21, 2018 at 05:19:24PM -0300, Eduardo Habkost wrote:
> > > > > > On Wed, Mar 21, 2018 at 07:57:29PM +0300, Roman Kagan wrote:
> > > > > > > On Wed, Mar 21, 2018 at 02:18:54PM +0100, Vitaly Kuznetsov wrote:
> > > > > > > > Roman Kagan <rkagan@virtuozzo.com> writes:
> > > > > > > > 
> > > > > > > > > On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> > > > > > > > > >> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTSC
> > > > > > > > >> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> > > > > > > > >> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> > > > > > > > >> these frequencies are stable, it just means they're available for reading.
> > > > > > > > >> 
> > > > > > > > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > > > > > > >> ---
> > > > > > > > >>  target/i386/kvm.c | 2 +-
> > > > > > > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > > >> 
> > > > > > > > >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > > > > > > > >> index 7d9f9ca0b1..74fc3d3b2c 100644
> > > > > > > > >> --- a/target/i386/kvm.c
> > > > > > > > >> +++ b/target/i386/kvm.c
> > > > > > > > >> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
> > > > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> > > > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> > > > > > > > >>  
> > > > > > > > >> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> > > > > > > > >> +        if (has_msr_hv_frequencies && env->tsc_khz) {
> > > > > > > > >>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> > > > > > > > >>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> > > > > > > > >>          }
> > > > > > > > >
> > > > > > > > > I suggest that we add a corresponding cpu property here, too.  The guest
> > > > > > > > > may legitimately rely on these msrs when it sees the support in CPUID,
> > > > > > > > > and migrating from a kernel with the feature supported (4.14+) to an
> > > > > > > > > older one will make it crash.
> > > > > > > > >
> > > > > > > > 
> > > > > > > > This can be arranged, but what happens to people who use these features
> > > > > > > > today? Assuming they also passed 'invtsc' they have stable TSC page
> > > > > > > > clocksource already (when Hyper-V role is enabled) but when we start
> > > > > > > > requesting a new 'hv_frequency' cpu property they'll suddenly lose what
> > > > > > > > they have...
> > > > > > > 
> > > > > > > I see two cases here:
> > > > > > > 
> > > > > > > 1) people start a new VM, and discover that their old configuration is
> > > > > > >    not enough for this feature to work.
> > > > > > > 
> > > > > > >    They need to reconfigure and restart the VM.  This costs them some
> > > > > > >    time investigating and restarting, but not data.
> > > > > > 
> > > > > > If we keep machine-type compatibility, people will need to do
> > > > > > that only if they change the machine-type (or use the "pc" or
> > > > > > "q35" aliases).  If they copy the old configuration, it will keep
> > > > > > working.
> > > > > 
> > > > > The problem is that the feature is not fixed by the machine-type, due to
> > > > > the forgotten property: it only depends on the KVM version.  So, once
> > > > > (if) we add the property and make the feature deterministic, we'll lose
> > > > > compatibility one way or another.
> > > > > 
> > > > > Or are you suggesting that for pre-2.12 machine types we leave the
> > > > > property at "decided by your KVM" state?
> > > > 
> > > > Yes, that's what I mean.  This looks like the only way to avoid
> > > > losing features by just cold-rebooting an existing VM.
> > > > 
> > > > The scenario I'm thinking is this:
> > > > 
> > > > 1) pc-2.11 VM started on host running QEMU 2.11
> > > > 2) VM migrated to a host containing this patch
> > > > 3) 1 year later, the VM is shut down and booted again.
> > > > 4) Things stop working inside the VM because hv-frequency is
> > > >    unexpectedly gone.
> > > > 
> > > > Machine-type compatibility code would avoid (4).
> > > 
> > > Right, but (4) typically means that you fail to start your workload from
> > > a clean state, so you just go and fix it; no data is lost.
> > > 
> > > Compare this to a migration to an older KVM which results in your guest
> > > crashing, where you risk data loss and still have to meddle with
> > > configs.
> > 
> > True. To make it worse, we are already unable to avoid this crash
> > on existing VMs without a reboot.  The only case where we can fix
> > this is if live-migration to older KVM happens after the guest
> > was rebooted when running on a newer QEMU version.  :(
> 
> Hmm, I thought the scheme I outlined below covered (== blocked) live
> migration QEMU-2.11/KVM-4.14+ -> QEMU-2.12(machine-2.11)/KVM-4.13-,
> didn't it?

It should, but what about migration
QEMU-2.12(pc-2.11)/KVM-4.14 -> QEMU-2.11(pc-2.11)/KVM-4.13?

Or, more specifically:
QEMU-2.11(pc-2.11)/KVM-4.14 ->
QEMU-2.12(pc-2.11)/KVM-4.14 -> QEMU-2.11(machine-2.11)/KVM-4.13?
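
To make the chain above concrete, here is a toy model in C of why the
last hop is the dangerous one.  It is only a sketch, not QEMU code: the
ToyHost/ToyGuest types and both functions are hypothetical, and it
assumes that on a QEMU without an explicit property the CPUID bits
simply follow what the host KVM offers (ignoring the TSC condition).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy description of one host; the field name is hypothetical. */
typedef struct {
    bool kvm_has_freq_msrs;   /* true for Linux 4.14+, per this thread */
} ToyHost;

/* Guest-visible state that travels with the VM across migration. */
typedef struct {
    bool guest_saw_freq_cpuid;   /* CPUID advertised the frequency MSRs */
} ToyGuest;

/*
 * Boot on a QEMU that has no property controlling the feature
 * (assumption: exposure is decided by the host KVM alone).
 */
static ToyGuest toy_boot_uncontrolled(const ToyHost *host)
{
    assert(host);
    ToyGuest g = { .guest_saw_freq_cpuid = host->kvm_has_freq_msrs };
    return g;
}

/*
 * A migration hop is unsafe when the guest may rely on MSRs that the
 * destination KVM does not implement.
 */
static bool toy_hop_is_unsafe(const ToyGuest *g, const ToyHost *dst)
{
    assert(g && dst);
    return g->guest_saw_freq_cpuid && !dst->kvm_has_freq_msrs;
}
```

In this model, a guest booted on a KVM-4.14 host passes through the
intermediate KVM-4.14 hop safely, and only the final hop to KVM-4.13 is
flagged, regardless of which QEMU version runs on either side.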

> 
> > > > > > machine-type compatibility also makes the following case a bit
> > > > > > safer:
> > > > > > 
> > > > > > > 
> > > > > > > 2) people migrate from a QEMU without ->hv_frequency, to a new one with
> > > > > > >    ->hv_frequency=off (assuming on both ends KVM supports the frequency
> > > > > > >    MSRs).
> > > > > > > 
> > > > > > >    With the current implementation in KVM, this will only result in the
> > > > > > >    feature bits disappearing from the respective CPUID leaf, but the
> > > > > > >    MSRs themselves will continue working as they used to.  So the guest
> > > > > > >    either won't notice or will check the CPUID and adjust.
> > > > > > 
> > > > > > If we keep machine-type compatibility, the CPUID bit won't
> > > > > > disappear for the guest while the MSRs keep working.
> > > > > > 
> > > > > > 
> > > > > > Whichever solution we choose, we can still have guests crashing
> > > > > > if migrating a pc-2.11 machine from a 4.14+ host kernel to a host
> > > > > > with an older kernel.  But I don't think there's a way out of
> > > > > > this, except requiring an explicit "hv-frequencies" CPU option on
> > > > > > newer machine-types.
> > > > > 
> > > > > What's wrong with requiring it, as we do for all other hv_* properties?
> > > > 
> > > > On new machine-types, nothing wrong.
> > > > 
> > > > On existing machine-types, see above.
> > > 
> > > I wonder if the following can cater to all relevant cases:
> > > 
> > > - hv_frequencies property is added, defaulting to "off", so that new
> > >   users of this feature would need to explicitly turn it on;
> > > 
> > > - on pre-2.12 machine types, it's set to the value of hv_time property
> > >   by the compat code, so that on VMs where this feature could
> > >   potentially be present it would become required; as a result, these
> > >   configurations will refuse to start on insufficiently capable KVM,
> > >   preventing the migration attempts.
> > 
> > This sounds like the safest option.  The cost will be the
> > inconvenience of being unable to run pc-2.11 on hosts with older
> > KVM (Linux < v4.14, without commit
> > 72c139bacfa386145d7bbb68c47c8824716153b6),
> 
> not completely unable: people will have to add "hv_frequencies=off" to
> their cpu spec

This is different from the patch you sent, which sets
hv-frequencies=off by default on pc-2.11 too.

(And now I see you described this approach in the last paragraph below. :)

> 
> > and the need to explicitly enable hv-frequencies on pc-2.12 and newer.
> 
> which is the standard situation for all new features.

Yes, no question on what we want to do on pc-2.12.

> 
> > > Am I missing any scenarios that aren't covered?
> > > 
> > 
> > It looks like the guest can still crash if we migrate
> > "QEMU-2.12 -machine pc-2.11 -cpu ...,+hv-time" to a host running
> > QEMU 2.11 and Linux < 4.14.
> 
> Indeed :(

Well, your patch fixes it by not enabling hv-frequencies by
default on any machine-type.  Do you see any gotchas?
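
For reference, this is roughly how I picture the resulting behavior.
It is a minimal sketch in C under stated assumptions, not the actual
patch: SketchCPU, SketchResult and sketch_hv_frequencies() are
hypothetical names, the property is assumed to default to off on every
machine-type, and the TSC frequency check is left out for brevity.
Refusing to start when the property is requested but KVM lacks the MSRs
is also an assumption, borrowed from the scheme quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU object; not QEMU's real structure. */
typedef struct {
    bool hv_frequencies;   /* assumed CPU property, default off */
} SketchCPU;

typedef enum {
    SKETCH_HIDDEN,         /* feature bits are not advertised */
    SKETCH_EXPOSED,        /* feature bits are advertised */
    SKETCH_REFUSE_START    /* requested, but the host cannot provide it */
} SketchResult;

/*
 * Decide what happens at CPU setup time.  The outcome depends only on
 * the explicit property and on host support, never on the machine-type.
 */
static SketchResult sketch_hv_frequencies(const SketchCPU *cpu,
                                          bool kvm_has_freq_msrs)
{
    assert(cpu);

    if (!cpu->hv_frequencies) {
        /* Default: never exposed implicitly, even on a capable KVM. */
        return SKETCH_HIDDEN;
    }
    if (!kvm_has_freq_msrs) {
        /* Explicitly requested but unsupported: fail early. */
        return SKETCH_REFUSE_START;
    }
    return SKETCH_EXPOSED;
}
```

With the property off, the result is the same on a 4.13 and a 4.14
host, which is what makes the configuration safe to migrate between
them.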


> 
> > I wonder if there's a way to avoid that?  If there's a way to avoid
> > that with extra migration subsections,
> 
> I guess this should work.
> 
> > is it worth the effort/complexity?
> 
> This is a judgement call.  For vendors this is a non-issue because most
> of them haven't even started shipping 2.11, so they just don't have VMs
> with this problem in the field.
> 
> So, taking the effort/complexity vs safety tradeoff into account, we can
> consider an alternative approach: just add hv_frequencies (default=off)
> cpu property to 2.12 and 2.11-stable, and ignore the cases where it's
> run on QEMU versions without explicit control over this feature.  Would
> it be too much against the current policy?
> 

With this, we can just declare that QEMU v2.11.0 + Linux 4.14+
was broken, and advise people to upgrade QEMU.

I think this is the most reasonable option we have.  See my reply to
the patch you sent.

-- 
Eduardo
