Date: Fri, 23 Oct 2015 12:34:50 +1100
From: David Gibson
To: Sam Bobroff
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [Qemu-ppc] PPC VCPU ID packing via KVM_CAP_PPC_SMT
Message-ID: <20151023013450.GA27149@voom.redhat.com>
In-Reply-To: <20151019033447.GA5977@tungsten.ozlabs.ibm.com>
References: <20151019033447.GA5977@tungsten.ozlabs.ibm.com>

On Mon, Oct 19, 2015 at 02:34:47PM +1100, Sam Bobroff wrote:
> Hi everyone,
>
> It's currently possible to configure QEMU and KVM such that (on a Power 7 or 8
> host) users are unable to create as many VCPUs as they might reasonably expect.
> I'll outline one fairly straightforward solution (below) and I would welcome
> feedback: Does this seem a reasonable approach? Are there alternatives?
>
> The issue:
>
> The behaviour is caused by three things:
> * QEMU limits the total number (count) of VCPUs based on the machine type (hard
>   coded to 256 for pseries).
>   * See hw/ppc/spapr.c spapr_machine_class_init()
> * KVM limits the highest VCPU ID to CONFIG_NR_CPUS (2048 for
>   pseries_defconfig).
>   * See arch/powerpc/configs/pseries_defconfig
>   * and arch/powerpc/include/asm/kvm_host.h
> * If the host SMT mode is higher than the guest SMT mode when creating VCPUs,
>   QEMU must "pad out" the VCPU IDs to align the VCPUs with physical cores (KVM
>   doesn't know which SMT mode the guest wants).
>   * See target-ppc/translate_init.c ppc_cpu_realizefn().
>
> In the most pathological case the guest is SMT 1 (smp_threads = 1) and the host
> SMT 8 (max_smt = 8), which causes the VCPU IDs to be spaced 8 apart (e.g. 0, 8,
> 16, ...).
>
> This doesn't produce any strange behaviour with default limits, but consider
> the case where CONFIG_NR_CPUS is set to 1024 (with the same SMT modes as
> above): as the 129th VCPU is created, its VCPU ID will be 128 * 8 = 1024,
> which will be rejected by KVM. This could be surprising because only 128 VCPUs
> can be created when max_cpus = 256 and CONFIG_NR_CPUS = 1024.
>
> Proposal:
>
> One solution is to provide a way for QEMU to inform KVM of the guest's SMT
> mode. This would allow KVM to place the VCPUs correctly within physical cores
> without any VCPU ID padding.

I think that's a good idea. In fact it's what we should have done in
the first place. Controlling the guest SMT mode implicitly with the
vcpu IDs was a case of too-clever-by-half on my part.

> And one way to do that would be for KVM to allow QEMU to set the (currently
> read-only) KVM_CAP_PPC_SMT capability to the required guest SMT mode.

Sounds ok.

> The simplest implementation would seem to be to add a new version of the
> pseries machine and have it require that the kernel support setting
> KVM_CAP_PPC_SMT, but would this be a reasonable restriction?

It's.. not great.

> Should we add a
> property (where?) to allow the new machine version to run without the new
> kernel feature? Could that property default to "on" or "on if supported by the
> kernel" without it becoming too complicated or causing trouble during
> migration?

So, migration is the issue, yes. But.. I thought we already
disconnected the KVM vcpu IDs from the qemu internal cpu IDs, which is
what we need for migration. If that's so (check, please), then it
should be sufficient to make sure that the KVM vcpu ID is included in
the migration stream - it might be already.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
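
For reference, the VCPU ID arithmetic from the quoted example can be sketched roughly as below. This is only an illustration in Python, not QEMU or KVM code: the function names are hypothetical, the padding formula is an assumption based on the description of aligning each guest core to a host core, and the rule that KVM rejects any ID greater than or equal to the limit is inferred from the 128 * 8 = 1024 case above.

```python
def padded_vcpu_id(index, guest_threads, host_threads):
    """Hypothetical model of the padded VCPU ID for the index-th VCPU.

    Assumes each guest core starts on a host-core boundary, so IDs
    advance by host_threads per guest core.
    """
    core, thread = divmod(index, guest_threads)
    return core * host_threads + thread


def max_creatable_vcpus(id_limit, guest_threads, host_threads, count_limit):
    """Count VCPUs that can be created before either limit is hit.

    Assumes an ID is rejected when it is >= id_limit (standing in for
    CONFIG_NR_CPUS) and that count_limit stands in for max_cpus.
    """
    created = 0
    while (created < count_limit
           and padded_vcpu_id(created, guest_threads, host_threads) < id_limit):
        created += 1
    return created


# Guest SMT 1 on a host with SMT 8: IDs are spaced 8 apart.
ids = [padded_vcpu_id(i, 1, 8) for i in range(3)]  # [0, 8, 16]

# With an ID limit of 1024 and max_cpus = 256, padding stops creation early.
padded = max_creatable_vcpus(1024, 1, 8, 256)

# If KVM knew the guest SMT mode, no padding would be needed (modelled
# here by setting host_threads equal to guest_threads).
packed = max_creatable_vcpus(1024, 1, 1, 256)
```

Under these assumptions, `padded` comes out as 128 and `packed` as 256, which matches the numbers in the quoted description.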