All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	qemu-devel@nongnu.org, Eduardo Habkost <ehabkost@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>
Subject: Re: [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID
Date: Tue, 15 Oct 2019 12:02:53 +0100	[thread overview]
Message-ID: <20191015110253.GF3073@work-vm> (raw)
In-Reply-To: <20191015101641.GD8666@xz-x1>

* Peter Xu (peterx@redhat.com) wrote:
> On Tue, Oct 15, 2019 at 10:22:18AM +0100, Dr. David Alan Gilbert wrote:
> > * Peter Xu (peterx@redhat.com) wrote:
> > > Migration is silently broken now with x2apic config like this:
> > > 
> > >      -smp 200,maxcpus=288,sockets=2,cores=72,threads=2 \
> > >      -device intel-iommu,intremap=on,eim=on
> > > 
> > > After migration, the guest kernel could hang at anything, due to
> > > x2apic bit not migrated correctly in IA32_APIC_BASE on some vcpus, so
> > > any operations related to x2apic could be broken then (e.g., RDMSR on
> > > x2apic MSRs could fail because KVM would think that the vcpu hasn't
> > > enabled x2apic at all).
> > > 
> > > The issue is that the x2apic bit was never applied correctly for vcpus
> > > whose ID > 255 when migrate completes, and that's because when we
> > > migrate APIC we use the APICCommonState.id as instance ID of the
> > > migration stream, while that's too short for x2apic.
> > > 
> > > Let's use the newly introduced initial_apic_id for that.
> > 
> > I'd like to understand a few things:
> >    a) Does this change the instance ID of existing APICs on the
> > migration stream? 
> >      a1) Ever for <256 CPUs?
> 
> No.
> 
> >      a2) For >=256 CPUs?
> 
> Yes.
> 
> > 
> >     [Because changing the ID breaks migration]
> 
> But if we don't change it, the stream is broken too. :)
> 
> Then the destination VM will receive e.g. two apic_id==0 instances (I
> think the apic_id==256 instance will wrongly overwrite the apic_id==0
> one), while the vcpu with apic_id==256 will use the initial apic
> values.
> 
> So IMHO we should still fix this, even if it changes the migration
> stream.  At least we start to make it right.

Yes, that makes sense.
It deserves a doc mention somewhere.

> > 
> >   b) Is the instance ID constant - I can see it's a property on the
> >      APIC, but I cna't see who sets it
> 
> For each vcpu, I think yes it should be a constant as long as the
> topology is the same.  This is how I understand it to be set:
> 
> (1) In pc_cpus_init(), we init these:
> 
>     possible_cpus = mc->possible_cpu_arch_ids(ms);
>     for (i = 0; i < ms->smp.cpus; i++) {
>         pc_new_cpu(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
>     }
> 
> (2) In x86_cpu_apic_create(), we apply the apic_id to "id" property:
> 
>     qdev_prop_set_uint32(cpu->apic_state, "id", cpu->apic_id);

OK, that's fine - as long as it's constaatn and not guest influenced.

> > 
> >   c) In the case where it fails, did we end up registering two
> >      devices with the same name and instance ID?  If so, is it worth
> >      adding a check that would error if we tried?
> 
> Sounds doable.
> 

Great,

Dave

> Thanks,
> 
> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


  reply	other threads:[~2019-10-15 11:04 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-15  7:54 [PATCH 0/2] apic: Fix migration breakage of >255 vcpus Peter Xu
2019-10-15  7:54 ` [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits Peter Xu
2019-10-15  8:34   ` Juan Quintela
2019-10-15 10:28     ` Peter Xu
2019-10-15  8:45   ` Juan Quintela
2019-10-15  8:57     ` Dr. David Alan Gilbert
2019-10-15 12:57       ` Juan Quintela
2019-10-15 10:23     ` Peter Xu
2019-10-15  7:54 ` [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID Peter Xu
2019-10-15  8:30   ` Juan Quintela
2019-10-15  9:22   ` Dr. David Alan Gilbert
2019-10-15 10:16     ` Peter Xu
2019-10-15 11:02       ` Dr. David Alan Gilbert [this message]
2019-10-15 19:49         ` Eduardo Habkost

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191015110253.GF3073@work-vm \
    --to=dgilbert@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=imammedo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.