qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Igor Mammedov <imammedo@redhat.com>
To: Eduardo Habkost <ehabkost@redhat.com>
Cc: Roman Kagan <rkagan@virtuozzo.com>,
	qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
	Evgeny Yakovlev <eyakovlev@virtuozzo.com>,
	"Denis V . Lunev" <den@openvz.org>
Subject: Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
Date: Wed, 14 Jun 2017 15:24:50 +0200	[thread overview]
Message-ID: <20170614152450.199121ee@nial.brq.redhat.com> (raw)
In-Reply-To: <20170614130015.GC5016@thinpad.lan.raisama.net>

On Wed, 14 Jun 2017 10:00:15 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Wed, Jun 14, 2017 at 02:25:07PM +0300, Roman Kagan wrote:
> > On Tue, Jun 13, 2017 at 03:57:52PM -0300, Eduardo Habkost wrote:  
> > > On Tue, Jun 06, 2017 at 09:19:30PM +0300, Roman Kagan wrote:  
> > > > Hyper-V identifies vcpus by the virtual processor (VP) index which is
> > > > normally queried by the guest via HV_X64_MSR_VP_INDEX msr.
> > > > 
> > > > It has to be owned by QEMU in order to preserve it across migration.
> > > > 
> > > > However, the current implementation in KVM doesn't allow to set this
> > > > msr, and KVM uses its own notion of VP index.  Fortunately, the way
> > > > vcpus are created in QEMU/KVM basically guarantees that the KVM value is
> > > > equal to QEMU cpu_index.  
> > > 
> > > This might not be true in the future.  cpu_index is not a good
> > > identifier for CPUs, and we would like to remove it in the
> > > future.
> > > 
> > > But it looks like we have no choice, see extra comments below:  
> >   
> > > > +static void hyperv_set_vp_index(CPUState *cs)
> > > > +{
> > > > +    struct {
> > > > +        struct kvm_msrs info;
> > > > +        struct kvm_msr_entry entries[1];
> > > > +    } msr_data;
> > > > +    int ret;
> > > > +
> > > > +    msr_data.info.nmsrs = 1;
> > > > +    msr_data.entries[0].index = HV_X64_MSR_VP_INDEX;
> > > > +
> > > > +    /*
> > > > +     * Some kernels don't support setting HV_X64_MSR_VP_INDEX.  However,
> > > > +     * they use VP_INDEX equal to cpu_index due to the way vcpus are created.
> > > > +     * So ignore the errors from SET_MSRS, but verify that GET_MSRS returns the
> > > > +     * expected value.
> > > > +     */  
> > > 
> > > Oh.  This sounds like a problem.  As reference for others, this
> > > is the KVM code in kvm_hv_get_msr():
> > > 
> > > 	case HV_X64_MSR_VP_INDEX: {
> > > 		int r;
> > > 		struct kvm_vcpu *v;
> > > 
> > > 		kvm_for_each_vcpu(r, v, vcpu->kvm) {
> > > 			if (v == vcpu) {
> > > 				data = r;
> > > 				break;
> > > 			}
> > > 		}
> > > 		break;
> > > 	}
> > > 
> > > The problem with that is that it will break as soon as we create
> > > VCPUs in a different order.  Unsolvable on hosts that don't allow
> > > HV_X64_MSR_VP_INDEX to be set, however.  
> > 
> > Right, thanks for putting together a detailed explanation.
> > 
> > This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> > by QEMU.  I'm going to post a patch to KVM fixing that.
> > 
> > Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> >   1) in sync with kernel's notion
> >   2) also with kernels that don't support setting the msr
> >   3) persistent across migrations
> > 
> > cpu_index looked like a perfect candidate.  
> 
> I would like to be able to stop using cpu_index as a persistent
> VCPU identifier in the future.  I would also like to be able to
> create VCPUs in any order without breaking guest ABI.  But it
> seems to be impossible to do that without breaking vp_index on
> older kernels.
> 
> So cpu_index looks like the only candidate we have.
> 
> >   
> > > > +    msr_data.entries[0].data = cs->cpu_index;
> > > > +    kvm_vcpu_ioctl(cs, KVM_SET_MSRS, &msr_data);
> > > > +    ret = kvm_vcpu_ioctl(cs, KVM_GET_MSRS, &msr_data);
> > > > +    assert(ret == 1);
> > > > +    assert(msr_data.entries[0].data == cs->cpu_index);  
> > > 
> > > If KVM already initializes the MSR to cpu_index and we will abort
> > > if it was not set to anything except cpu_index here, why exactly
> > > do we need the KVM_SET_MSRS call?  
> > 
> > The kernel made no obligations to keep initializing its vp_index
> > identically to QEMU's cpu_index.  So we'd better check and abort if that
> > got out of sync.  Once KVM gains the ability to learn vp_index from QEMU
> > we'll no longer depend on that as we'll do explicit synchronization.  
> 
> I think the kernel now has an obligation to keep initializing
> HV_X64_MSR_VP_INDEX in a compatible way, or it will break older
> QEMU versions that don't set the MSR.
> 
> But if you don't think we can rely on that, then the KVM_SET_MSRS
> call won't hurt.

if we can tell apart old index based vp_index kernel and new that supports
MSR setting (add cap check perhaps) then we should be able to
 - leave out old vp_index broken as it is now (for old kernels/old QEMU and/ new QEMU old machine types)
     i.e. do not set vp_index MSR in QEMU
 - in case of (new QEMU new machine type/new kernel) use APIC ID which would work
   with out of order CPU creation

  reply	other threads:[~2017-06-14 13:25 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 01/23] hyperv: add header with protocol definitions Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 02/23] update-linux-headers: prepare for hyperv.h removal Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 03/23] hyperv: set partition-wide MSRs only on first vcpu Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 04/23] hyperv: ensure msrs are inited properly Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index Roman Kagan
2017-06-13 18:57   ` Eduardo Habkost
2017-06-14 11:25     ` Roman Kagan
2017-06-14 11:26       ` Paolo Bonzini
2017-06-14 13:00         ` Igor Mammedov
2017-06-15 12:41           ` Roman Kagan
2017-06-15 13:22             ` Paolo Bonzini
2017-06-15 13:27             ` Igor Mammedov
2017-06-15 16:05               ` Roman Kagan
2017-06-18 15:29                 ` Eduardo Habkost
2017-06-14 13:01         ` Eduardo Habkost
2017-06-14 13:11           ` Igor Mammedov
2017-06-14 13:17             ` Paolo Bonzini
2017-06-14 13:22               ` Eduardo Habkost
2017-06-14 13:37                 ` Paolo Bonzini
2017-06-14 13:38                 ` Igor Mammedov
2017-06-14 13:45                   ` Eduardo Habkost
2017-06-14 18:40                     ` Roman Kagan
2017-06-14 18:59                       ` Eduardo Habkost
2017-06-15  8:26                         ` Paolo Bonzini
2017-06-15 11:40                           ` Roman Kagan
2017-06-15 11:42                             ` Paolo Bonzini
2017-06-15 12:03                               ` Roman Kagan
2017-06-14 13:19             ` Eduardo Habkost
2017-06-14 13:00       ` Eduardo Habkost
2017-06-14 13:24         ` Igor Mammedov [this message]
2017-06-14 13:35           ` Eduardo Habkost
2017-06-14 15:31             ` Igor Mammedov
2017-06-06 18:19 ` [Qemu-devel] [PATCH 06/23] hyperv: helper to find vcpu by VP index Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 07/23] hyperv_testdev: refactor for readability Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 08/23] hyperv: cosmetic: g_malloc -> g_new Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 09/23] hyperv: synic: only setup ack notifier if there's a callback Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 10/23] hyperv: allow passing arbitrary data to sint ack callback Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 12/23] hyperv: make HvSintRoute reference-counted Roman Kagan
2017-06-14 13:53   ` Eduardo Habkost
2017-06-14 16:23     ` Roman Kagan
2017-06-23 12:44       ` Eduardo Habkost
2017-06-06 18:19 ` [Qemu-devel] [PATCH 13/23] hyperv: qdev-ify SynIC Roman Kagan
2017-06-13 18:34   ` Eduardo Habkost
2017-06-14  9:58     ` Roman Kagan
2017-06-14 12:46       ` Eduardo Habkost
2017-06-14 15:11         ` Roman Kagan
2017-06-14 15:21           ` Eduardo Habkost
2017-06-06 18:19 ` [Qemu-devel] [PATCH 14/23] kvm-all: make async_safe_run_on_cpu safe on kvm too Roman Kagan
2017-06-08 14:47   ` Paolo Bonzini
2017-06-06 18:19 ` [Qemu-devel] [PATCH 15/23] hyperv: make overlay pages for SynIC Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 16/23] hyperv: map overlay pages after updating msrs Roman Kagan
2017-06-14 11:12   ` Paolo Bonzini
2017-06-14 11:54     ` Roman Kagan
2017-06-14 12:11       ` Paolo Bonzini
2017-06-14 12:41         ` Roman Kagan
2017-06-14 12:46           ` Paolo Bonzini
2017-06-06 18:19 ` [Qemu-devel] [PATCH 17/23] hyperv: add synic message delivery Roman Kagan
2017-06-14 15:08   ` Paolo Bonzini
2017-06-14 15:28     ` Roman Kagan
2017-06-14 15:32       ` Paolo Bonzini
2017-06-14 15:39         ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 18/23] hyperv: add synic event flag signaling Roman Kagan
2017-06-14 15:07   ` Paolo Bonzini
2017-06-06 18:19 ` [Qemu-devel] [PATCH 19/23] hyperv: process SIGNAL_EVENT hypercall Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 20/23] hyperv: process POST_MESSAGE hypercall Roman Kagan
2017-06-14 11:19   ` Paolo Bonzini
2017-06-14 14:20     ` Roman Kagan
2017-06-14 14:30       ` Paolo Bonzini
2017-06-06 18:19 ` [Qemu-devel] [PATCH 21/23] hyperv_testdev: add SynIC message and event testmodes Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 22/23] MAINTAINERS: add myself and eyakovlev@ for hyperv* Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 23/23] hyperv: update copyright notices Roman Kagan
     [not found] ` <20170606181948.16238-12-rkagan@virtuozzo.com>
2017-06-13 19:02   ` [Qemu-devel] [PATCH 11/23] hyperv: address HvSintRoute by X86CPU pointer Eduardo Habkost
2017-06-14 11:08     ` Paolo Bonzini
2017-06-14 12:14       ` Roman Kagan
2017-06-14 12:17         ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170614152450.199121ee@nial.brq.redhat.com \
    --to=imammedo@redhat.com \
    --cc=den@openvz.org \
    --cc=ehabkost@redhat.com \
    --cc=eyakovlev@virtuozzo.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rkagan@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).