From: Eduardo Habkost <ehabkost@redhat.com>
To: Roman Kagan <rkagan@virtuozzo.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Richard Henderson <rth@twiddle.net>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
Date: Fri, 23 Mar 2018 16:48:27 -0300
Message-ID: <20180323194827.GM3417@localhost.localdomain>
In-Reply-To: <20180323094529.GA28085@rkaganb.sw.ru>

On Fri, Mar 23, 2018 at 12:45:30PM +0300, Roman Kagan wrote:
> On Thu, Mar 22, 2018 at 03:38:13PM -0300, Eduardo Habkost wrote:
> > On Thu, Mar 22, 2018 at 04:58:03PM +0300, Roman Kagan wrote:
> > > On Thu, Mar 22, 2018 at 10:22:18AM -0300, Eduardo Habkost wrote:
> > > > On Thu, Mar 22, 2018 at 04:00:14PM +0300, Roman Kagan wrote:
> > > > > On Wed, Mar 21, 2018 at 05:19:24PM -0300, Eduardo Habkost wrote:
> > > > > > On Wed, Mar 21, 2018 at 07:57:29PM +0300, Roman Kagan wrote:
> > > > > > > On Wed, Mar 21, 2018 at 02:18:54PM +0100, Vitaly Kuznetsov wrote:
> > > > > > > > Roman Kagan <rkagan@virtuozzo.com> writes:
> > > > > > > > 
> > > > > > > > > On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> > > > > > > > >> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTSC
> > > > > > > > >> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> > > > > > > > >> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> > > > > > > > >> these frequencies are stable, it just means they're available for reading.
> > > > > > > > >> 
> > > > > > > > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > > > > > > >> ---
> > > > > > > > >>  target/i386/kvm.c | 2 +-
> > > > > > > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > > >> 
> > > > > > > > >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > > > > > > > >> index 7d9f9ca0b1..74fc3d3b2c 100644
> > > > > > > > >> --- a/target/i386/kvm.c
> > > > > > > > >> +++ b/target/i386/kvm.c
> > > > > > > > >> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
> > > > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> > > > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> > > > > > > > >>  
> > > > > > > > >> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> > > > > > > > >> +        if (has_msr_hv_frequencies && env->tsc_khz) {
> > > > > > > > >>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> > > > > > > > >>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> > > > > > > > >>          }
> > > > > > > > >
> > > > > > > > > I suggest that we add a corresponding cpu property here, too.  The guest
> > > > > > > > > may legitimately rely on these msrs when it sees the support in CPUID,
> > > > > > > > > and migrating from a kernel with the feature supported (4.14+) to an
> > > > > > > > > older one will make it crash.
> > > > > > > > >
> > > > > > > > 
> > > > > > > > This can be arranged, but what happens to people who use these features
> > > > > > > > today? Assuming they also passed 'invtsc' they have stable TSC page
> > > > > > > > clocksource already (when Hyper-V role is enabled) but when we start
> > > > > > > > requesting a new 'hv_frequency' cpu property they'll suddenly lose what
> > > > > > > > they have...
> > > > > > > 
> > > > > > > I see two cases here:
> > > > > > > 
> > > > > > > 1) people start a new VM, and discover that their old configuration is
> > > > > > >    not enough for this feature to work.
> > > > > > > 
> > > > > > >    They need to reconfigure and restart the VM.  This costs them some
> > > > > > >    time investigating and restarting, but not data.
> > > > > > 
> > > > > > If we keep machine-type compatibility, people will need to do
> > > > > > that only if they change the machine-type (or use the "pc" or
> > > > > > "q35" aliases).  If they copy the old configuration, it will keep
> > > > > > working.
> > > > > 
> > > > > The problem is that the feature is not fixed by the machine-type, due to
> > > > > the forgotten property: it only depends on the KVM version.  So, once
> > > > > (if) we add the property and make the feature deterministic, we'll lose
> > > > > compatibility one way or another.
> > > > > 
> > > > > Or are you suggesting that for pre-2.12 machine types we leave the
> > > > > property at "decided by your KVM" state?
> > > > 
> > > > Yes, that's what I mean.  This looks like the only way to avoid
> > > > losing features by just cold-rebooting an existing VM.
> > > > 
> > > > The scenario I'm thinking is this:
> > > > 
> > > > 1) pc-2.11 VM started on host running QEMU 2.11
> > > > 2) VM migrated to a host containing this patch
> > > > 3) 1 year later, the VM is shut down and booted again.
> > > > 4) Things stop working inside the VM because hv-frequency is
> > > >    unexpectedly gone.
> > > > 
> > > > Machine-type compatibility code would avoid (4).
> > > 
> > > Right, but (4) typically means that you fail to start your workload from
> > > a clean state, so you just go and fix it; no data is lost.
> > > 
> > > Compare this to a migration to an older KVM which results in your guest
> > > crashing, where you risk data loss and still have to meddle with
> > > configs.
> > 
> > True. To make it worse, we are already unable to avoid this crash
> > on existing VMs without a reboot.  The only case where we can fix
> > this is if live-migration to older KVM happens after the guest
> > was rebooted when running on a newer QEMU version.  :(
> 
> Hmm, I thought the scheme I outlined below covered (== blocked) live
> migration QEMU-2.11/KVM-4.14+ -> QEMU-2.12(machine-2.11)/KVM-4.13-,
> didn't it?

It should, but what about migration
QEMU-2.12(pc-2.11)/KVM-4.14 -> QEMU-2.11(pc-2.11)/KVM-4.13?

Or, more specifically:
QEMU-2.11(pc-2.11)/KVM-4.14 ->
QEMU-2.12(pc-2.11)/KVM-4.14 -> QEMU-2.11(machine-2.11)/KVM-4.13?
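
To make the chain above concrete, here is a rough sketch in plain C of
how I think about each hop.  Everything in it (struct host,
migrate_to(), the outcome names) is hypothetical and exists only for
illustration; none of it is QEMU code.  It assumes that a destination
QEMU which knows about the property refuses to start the VM on a KVM
without the frequency MSRs, while an older QEMU has nothing to check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a migration destination. */
struct host {
    bool qemu_knows_property; /* QEMU has an explicit hv-frequencies knob */
    bool kvm_has_freq_msrs;   /* host KVM implements the frequency MSRs  */
};

enum outcome { MIGRATION_OK, MIGRATION_REFUSED, GUEST_MAY_CRASH };

/*
 * guest_saw_freq_bits: the guest was shown the frequency-MSR CPUID bits
 * on some earlier host and may legitimately read the MSRs.
 */
static enum outcome migrate_to(bool guest_saw_freq_bits,
                               const struct host *dst)
{
    assert(dst != NULL);

    if (!guest_saw_freq_bits || dst->kvm_has_freq_msrs) {
        return MIGRATION_OK;
    }
    /* Destination KVM lacks the MSRs the guest may rely on. */
    return dst->qemu_knows_property ? MIGRATION_REFUSED : GUEST_MAY_CRASH;
}
```

In this model the last hop of the chain (QEMU 2.11 on KVM 4.13) comes
out as GUEST_MAY_CRASH, because the destination QEMU has no property
to check.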

> 
> > > > > > machine-type compatibility also makes the following case a bit
> > > > > > safer:
> > > > > > 
> > > > > > > 
> > > > > > > 2) people migrate from a QEMU without ->hv_frequency, to a new one with
> > > > > > >    ->hv_frequency=off (assuming on both ends KVM supports the frequency
> > > > > > >    MSRs).
> > > > > > > 
> > > > > > >    With the current implementation in KVM, this will only result in the
> > > > > > >    feature bits disappearing from the respective CPUID leaf, but the
> > > > > > >    MSRs themselves will continue working as they used to.  So the guest
> > > > > > >    either won't notice or will check the CPUID and adjust.
> > > > > > 
> > > > > > If we keep machine-type compatibility, the CPUID bit won't
> > > > > > disappear for the guest while the MSRs keep working.
> > > > > > 
> > > > > > 
> > > > > > Whichever solution we choose, we can still have guests crashing
> > > > > > if migrating a pc-2.11 machine from a 4.14+ host kernel to a host
> > > > > > with an older kernel.  But I don't think there's a way out of
> > > > > > this, except requiring an explicit "hv-frequencies" CPU option on
> > > > > > newer machine-types.
> > > > > 
> > > > > What's wrong with requiring it, as we do for all other hv_* properties?
> > > > 
> > > > On new machine-types, nothing wrong.
> > > > 
> > > > On existing machine-types, see above.
> > > 
> > > I wonder if the following can cater to all relevant cases:
> > > 
> > > - hv_frequencies property is added, defaulting to "off", so that new
> > >   users of this feature would need to explicitly turn it on;
> > > 
> > > - on pre-2.12 machine types, it's set to the value of hv_time property
> > >   by the compat code, so that on VMs where this feature could
> > >   potentially be present it would become required; as a result, these
> > >   configurations will refuse to start on insufficiently capable KVM,
> > >   preventing the migration attempts.
> > 
> > This sounds like the safest option.  The cost will be the
> > inconvenience of being unable to run pc-2.11 on hosts with older
> > KVM (Linux < v4.14, without commit
> > 72c139bacfa386145d7bbb68c47c8824716153b6),
> 
> not completely unable: people will have to add "hv_frequencies=off" to
> their cpu spec

This is different from the patch you sent, which sets
hv-frequencies=off by default on pc-2.11 too.

(And now I see you described this approach in the last paragraph below. :)

> 
> > and the need to explicitly enable hv-frequencies on pc-2.12 and newer.
> 
> which is the standard situation for all new features.

Yes, no question on what we want to do on pc-2.12.

> 
> > > Am I missing any scenarios that aren't covered?
> > > 
> > 
> > It looks like the guest can still crash if we migrate
> > "QEMU-2.12 -machine pc-2.11 -cpu ...,+hv-time" to a host running
> > QEMU 2.11 and Linux < 4.14.
> 
> Indeed :(

Well, your patch fixes it by not enabling hv-frequencies by
default on any machine-type.  Do you see any gotchas?


> 
> > I wonder if there's a way to avoid that?  If there's a way to avoid
> > that with extra migration subsections,
> 
> I guess this should work.
> 
> > is it worth the effort/complexity?
> 
> This is a judgement call.  For vendors this is a non-issue because most
> of them haven't even started shipping 2.11, so they just don't have VMs
> with this problem in the field.
> 
> So, taking the effort/complexity vs safety tradeoff into account, we can
> consider an alternative approach: just add hv_frequencies (default=off)
> cpu property to 2.12 and 2.11-stable, and ignore the cases where it's
> run on QEMU versions without explicit control over this feature.  Would
> it be too much against the current policy?
> 

With this, we can just declare that QEMU v2.11.0 + Linux 4.14+
was broken, and advise people to upgrade QEMU.

I think this is the most reasonable option we have.  See my reply to
the patch you sent.
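
As a minimal sketch of that behaviour, assuming an hv_frequencies
property that defaults to off on every machine-type, the check at CPU
setup could look roughly like the code below.  The struct, the enum
and the function name are hypothetical stand-ins, not the real QEMU
structures, and treating an unknown TSC frequency as "unsupported" is
my assumption based on the env->tsc_khz test in the patch under
review:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the relevant CPU/KVM state. */
struct hv_cfg {
    bool hv_frequencies;    /* proposed CPU property, default off      */
    bool kvm_has_freq_msrs; /* host KVM implements the frequency MSRs  */
    int64_t tsc_khz;        /* 0 when the TSC frequency is not known   */
};

enum hv_result { HV_DISABLED, HV_ENABLED, HV_ERR_UNSUPPORTED };

/*
 * Expose the frequency MSRs only when explicitly requested, and fail
 * instead of silently dropping the feature when the host cannot
 * provide it.
 */
static enum hv_result hv_frequencies_check(const struct hv_cfg *cfg)
{
    assert(cfg != NULL);

    if (!cfg->hv_frequencies) {
        return HV_DISABLED;          /* never enabled implicitly */
    }
    if (!cfg->kvm_has_freq_msrs || cfg->tsc_khz == 0) {
        return HV_ERR_UNSUPPORTED;   /* refuse to start the VM */
    }
    return HV_ENABLED;               /* set the CPUID feature bits */
}
```

The point of returning an error rather than quietly clearing the bits
is that the guest-visible CPUID then depends only on the command line,
not on which KVM the VM happens to land on.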

-- 
Eduardo
