* [Qemu-devel] [BUG] Migrate failes between boards with different PMC counts
@ 2017-04-24 6:26 Zhuangyanying
2017-04-24 9:23 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 7+ messages in thread
From: Zhuangyanying @ 2017-04-24 6:26 UTC (permalink / raw)
To: pbonzini@redhat.com
Cc: Gonglei (Arei), Huangzhichao, wangxin (U), Zhanghailiang,
qemu-devel@nongnu.org
Hi all,
Recently I found that migration fails when vPMU is enabled.
Migration of vPMU state was introduced in linux-3.10 + qemu-1.7.
Whenever vPMU is enabled, qemu saves / loads the
vmstate_msr_architectural_pmu (msr_global_ctrl) register during migration.
But global_ctrl is generated from cpuid(0xA), and the number of general-purpose performance
monitoring counters (PMCs) can vary according to the Intel SDM. The number of PMCs presented
to the VM is currently not configurable: it depends on the host cpuid, and KVM enables all PMCs
by default. This causes migration to fail between boards with different PMC counts.
The return value of cpuid(0xA) differs between CPUs; according to the Intel SDM, 18-10 Vol. 3B:
Note: The number of general-purpose performance monitoring counters (i.e. N in Figure 18-9)
can vary across processor generations within a processor family, across processor families, or
could be different depending on the configuration chosen at boot time in the BIOS regarding
Intel Hyper Threading Technology, (e.g. N=2 for 45 nm Intel Atom processors; N =4 for processors
based on the Nehalem microarchitecture; for processors based on the Sandy Bridge
microarchitecture, N = 4 if Intel Hyper Threading Technology is active and N=8 if not active).
I also found that N=8 when HT is not active on Broadwell,
such as the CPU E7-8890 v4 @ 2.20GHz.
# ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 -hda
/data/zyy/test_qemu.img.sles12sp1 -vnc :99 -cpu kvm64,pmu=true -incoming tcp::8888
Completed 100 %
qemu-system-x86_64: error: failed to set MSR 0x38f to 0x7000000ff
qemu-system-x86_64: /data/zyy/git/test/qemu/target/i386/kvm.c:1833: kvm_put_msrs:
Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
Aborted
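
To illustrate why this write can be rejected, here is a minimal Python sketch (not QEMU or KVM code). It assumes the architectural layout of IA32_PERF_GLOBAL_CTRL (MSR 0x38f), in which the low bits enable the general-purpose counters and the bits from 32 upward enable the fixed-function counters. The helper name `global_ctrl_mask`, the count of 3 fixed counters, and the destination having 4 general-purpose counters are assumptions made for the example.

```python
def global_ctrl_mask(num_gp: int, num_fixed: int) -> int:
    """Bits that may be set in IA32_PERF_GLOBAL_CTRL for the given counter counts."""
    gp_bits = (1 << num_gp) - 1                  # bits 0 .. num_gp-1
    fixed_bits = ((1 << num_fixed) - 1) << 32    # bits 32 .. 32+num_fixed-1
    return gp_bits | fixed_bits


saved = 0x7000000ff                  # value taken from the error message above
src_mask = global_ctrl_mask(8, 3)    # assumed source: 8 general-purpose counters
dst_mask = global_ctrl_mask(4, 3)    # assumed destination: 4 general-purpose counters

# Bits in the saved value that the assumed destination has no counter for.
rejected = saved & ~dst_mask

print(hex(src_mask))
print(hex(dst_mask))
print(hex(rejected))
```

Under these assumptions the saved value matches the source mask exactly, while on the destination it carries enable bits for counters 4..7 that do not exist there, which would be consistent with the failed MSR write.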
So should we make the number of PMCs presented to the VM configurable? Any better ideas?
Regards,
-Zhuang Yanying
* Re: [Qemu-devel] [BUG] Migrate failes between boards with different PMC counts
2017-04-24 6:26 [Qemu-devel] [BUG] Migrate failes between boards with different PMC counts Zhuangyanying
@ 2017-04-24 9:23 ` Dr. David Alan Gilbert
2017-04-24 9:52 ` Daniel P. Berrange
0 siblings, 1 reply; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2017-04-24 9:23 UTC (permalink / raw)
To: Zhuangyanying
Cc: pbonzini@redhat.com, wangxin (U), Gonglei (Arei), Huangzhichao,
Zhanghailiang, qemu-devel@nongnu.org
Coincidentally we hit a similar problem a few days ago with -cpu host - it took me
quite a while to spot that the difference between the machines was that the source
had hyperthreading disabled.
An option to set the number of counters makes sense to me; but I wonder
how many other options we need as well. Also, I'm not sure there's any
easy way for libvirt etc to figure out how many counters a host supports - it's
not in /proc/cpuinfo.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] [BUG] Migrate failes between boards with different PMC counts
2017-04-24 9:23 ` Dr. David Alan Gilbert
@ 2017-04-24 9:52 ` Daniel P. Berrange
2017-04-24 10:27 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 7+ messages in thread
From: Daniel P. Berrange @ 2017-04-24 9:52 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Zhuangyanying, Zhanghailiang, wangxin (U), qemu-devel@nongnu.org,
Gonglei (Arei), Huangzhichao, pbonzini@redhat.com
On Mon, Apr 24, 2017 at 10:23:21AM +0100, Dr. David Alan Gilbert wrote:
>
> Coincidentally we hit a similar problem a few days ago with -cpu host - it took me
> quite a while to spot that the difference between the machines was that the source
> had hyperthreading disabled.
>
> An option to set the number of counters makes sense to me; but I wonder
> how many other options we need as well. Also, I'm not sure there's any
> easy way for libvirt etc to figure out how many counters a host supports -
> it's not in /proc/cpuinfo.
We actually try to avoid /proc/cpuinfo wherever possible. We do direct
CPUID asm instructions to identify features, and prefer to use
/sys/devices/system/cpu if that has suitable data.
Where do the PMC counts come from originally? CPUID or something else?
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Re: [Qemu-devel] [BUG] Migrate failes between boards with different PMC counts
2017-04-24 9:52 ` Daniel P. Berrange
@ 2017-04-24 10:27 ` Dr. David Alan Gilbert
2017-04-24 10:34 ` Daniel P. Berrange
0 siblings, 1 reply; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2017-04-24 10:27 UTC (permalink / raw)
To: Daniel P. Berrange
Cc: Zhuangyanying, Zhanghailiang, wangxin (U), qemu-devel@nongnu.org,
Gonglei (Arei), Huangzhichao, pbonzini@redhat.com
* Daniel P. Berrange (berrange@redhat.com) wrote:
>
> We actually try to avoid /proc/cpuinfo wherever possible. We do direct
> CPUID asm instructions to identify features, and prefer to use
> /sys/devices/system/cpu if that has suitable data.
>
> Where do the PMC counts come from originally? CPUID or something else?
Yes, they're bits 8..15 of CPUID leaf 0xa
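
As a rough illustration of that field extraction, the following Python sketch decodes an EAX value from CPUID leaf 0xA. It does not execute CPUID itself; `sample_eax` is a made-up value, and `decode_cpuid_0xa_eax` is a hypothetical helper name. Only the bits 8..15 field comes from this thread; the other three fields follow the leaf 0xA layout as described in the Intel SDM.

```python
def decode_cpuid_0xa_eax(eax: int) -> dict:
    """Split the EAX value of CPUID leaf 0xA into its byte-wide fields."""
    return {
        "version": eax & 0xFF,                    # bits 0..7
        "num_gp_counters": (eax >> 8) & 0xFF,     # bits 8..15 (the PMC count)
        "gp_counter_width": (eax >> 16) & 0xFF,   # bits 16..23
        "ebx_vector_length": (eax >> 24) & 0xFF,  # bits 24..31
    }


sample_eax = 0x07300803  # made-up example value, not read from real hardware
info = decode_cpuid_0xa_eax(sample_eax)
print(info["num_gp_counters"])
```

A management layer could obtain the real EAX value by executing CPUID directly, as described earlier in the thread, and then apply the same shift and mask.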
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] [BUG] Migrate failes between boards with different PMC counts
2017-04-24 10:27 ` Dr. David Alan Gilbert
@ 2017-04-24 10:34 ` Daniel P. Berrange
2017-04-24 12:57 ` Zhuangyanying
0 siblings, 1 reply; 7+ messages in thread
From: Daniel P. Berrange @ 2017-04-24 10:34 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Zhuangyanying, Zhanghailiang, wangxin (U), qemu-devel@nongnu.org,
Gonglei (Arei), Huangzhichao, pbonzini@redhat.com
On Mon, Apr 24, 2017 at 11:27:16AM +0100, Dr. David Alan Gilbert wrote:
> Yes, they're bits 8..15 of CPUID leaf 0xa
Ok, that's easy enough for libvirt to detect then. More a question of what
libvirt should then do with this info....
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Re: [Qemu-devel] [BUG] Migrate failes between boards with different PMC counts
2017-04-24 10:34 ` Daniel P. Berrange
@ 2017-04-24 12:57 ` Zhuangyanying
2017-04-25 17:20 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 7+ messages in thread
From: Zhuangyanying @ 2017-04-24 12:57 UTC (permalink / raw)
To: Daniel P. Berrange, Dr. David Alan Gilbert
Cc: Zhanghailiang, wangxin (U), qemu-devel@nongnu.org, Gonglei (Arei),
Huangzhichao, pbonzini@redhat.com, Zhangbo (Oscar)
> -----Original Message-----
> From: Daniel P. Berrange [mailto:berrange@redhat.com]
> Sent: Monday, April 24, 2017 6:34 PM
> To: Dr. David Alan Gilbert
> Cc: Zhuangyanying; Zhanghailiang; wangxin (U); qemu-devel@nongnu.org;
> Gonglei (Arei); Huangzhichao; pbonzini@redhat.com
> Subject: Re: [Qemu-devel] [BUG] Migrate failes between boards with different
> PMC counts
>
> Ok, that's easy enough for libvirt to detect then. More a question of what libvirt
> should then do with this info....
>
Do you mean doing a validation at the beginning of migration, in qemuMigrationBakeCookie() & qemuMigrationEatCookie(), and simply aborting the migration if the PMC counts are not equal?
That may be a good enough first version.
But for a better follow-up version, I think it would be preferable to support heterogeneous migration, so we would need to make the PMC count configurable, which means modifying KVM/qemu as well.
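
The first-version check described above could look roughly like the following Python sketch. It is not libvirt code: `PmcMismatch` and `check_pmc_counts` are hypothetical names, and how the source's count would travel inside the migration cookie is left out entirely.

```python
class PmcMismatch(Exception):
    """Raised when source and destination report different PMC counts."""


def check_pmc_counts(src_count: int, dst_count: int) -> None:
    """Refuse the migration up front unless both hosts expose the same PMC count."""
    if src_count != dst_count:
        raise PmcMismatch(
            f"source has {src_count} PMCs, destination has {dst_count}; "
            "refusing to migrate"
        )


check_pmc_counts(8, 8)  # equal counts: returns without raising

try:
    check_pmc_counts(8, 4)  # mismatch: migration is refused before it starts
    refused = False
except PmcMismatch as err:
    refused = True
    print(err)
```

Failing early like this only turns the destination-side abort into a clean refusal; it does not make the migration possible, which is why a configurable PMC count is the follow-up step.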
Regards,
-Zhuang Yanying
* Re: [Qemu-devel] [BUG] Migrate failes between boards with different PMC counts
2017-04-24 12:57 ` Zhuangyanying
@ 2017-04-25 17:20 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2017-04-25 17:20 UTC (permalink / raw)
To: Zhuangyanying
Cc: Daniel P. Berrange, Zhanghailiang, wangxin (U),
qemu-devel@nongnu.org, Gonglei (Arei), Huangzhichao,
pbonzini@redhat.com, Zhangbo (Oscar), ehabkost
* Zhuangyanying (ann.zhuangyanying@huawei.com) wrote:
> Do you mean doing a validation at the beginning of migration, in qemuMigrationBakeCookie() & qemuMigrationEatCookie(), and simply aborting the migration if the PMC counts are not equal?
> That may be a good enough first version.
> But for a better follow-up version, I think it would be preferable to support heterogeneous migration, so we would need to make the PMC count configurable, which means modifying KVM/qemu as well.
Yes, agreed; the only thing I wanted to check was that libvirt would have enough
information to be able to use any feature we added to QEMU.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK