From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:45990) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZwL7l-0004JP-B1 for qemu-devel@nongnu.org; Tue, 10 Nov 2015 21:27:04 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZwL7j-0001Jq-6A for qemu-devel@nongnu.org; Tue, 10 Nov 2015 21:27:01 -0500 Date: Wed, 11 Nov 2015 12:41:26 +1100 From: David Gibson Message-ID: <20151111014126.GD5852@voom.redhat.com> References: <1446678366-15082-1-git-send-email-sukadev@linux.vnet.ibm.com> <20151109045812.GE18558@voom.redhat.com> <20151110042232.GB20030@us.ibm.com> <20151111001758.GK18558@voom.redhat.com> <20151111005638.GB4644@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="UoPmpPX/dBe4BELn" Content-Disposition: inline In-Reply-To: <20151111005638.GB4644@linux.vnet.ibm.com> Subject: Re: [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO) List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Nishanth Aravamudan Cc: stewart@linux.vnet.ibm.com, benh@au1.ibm.com, aik@ozlabs.ru, agraf@suse.de, qemu-devel@nongnu.org, qemu-ppc@nongnu.org, paulus@au1.ibm.com, Sukadev Bhattiprolu --UoPmpPX/dBe4BELn Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Nov 10, 2015 at 04:56:38PM -0800, Nishanth Aravamudan wrote: > On 11.11.2015 [11:17:58 +1100], David Gibson wrote: > > On Mon, Nov 09, 2015 at 08:22:32PM -0800, Sukadev Bhattiprolu wrote: > > > David Gibson [david@gibson.dropbear.id.au] wrote: > > > | On Wed, Nov 04, 2015 at 03:06:05PM -0800, Sukadev Bhattiprolu wrote: > > > | > Implement RTAS_SYSPARM_PROCESSOR_MODULE_INFO parameter to rtas_ge= t_sysparm() > > > | > call in qemu. This call returns the processor module (socket), ch= ip and core > > > | > information as specified in section 7.3.16.18 of PAPR v2.7. > > > |=20 > > > | PAPR v2.7 isn't available publically. For upstream patches, please > > > | reference LoPAPR instead (where it's section 7.3.16.17 AFAICT). > > >=20 > > > Ok. > > >=20 > > > |=20 > > > | > We walk the /proc/device-tree to determine the number of chips, c= ores and > > > | > modules in the _host_ system and return this info to the guest ap= plication > > > | > that makes the rtas_get_sysparm() call. > > > | >=20 > > > | > We currently hard code the number of module_types to 1, but we sh= ould determine > > > | > that dynamically somehow later. > > > | >=20 > > > | > Thanks to input from Nishanth Aravamudan and Alexey Kardashevsk. > > > | >=20 > > > | > Signed-off-by: Sukadev Bhattiprolu > > > |=20 > > > | This isn't ready to go yet - you need to put some more consideration > > > | into the uncommon cases: PR KVM, TCG and non-Power hosts. > > >=20 > > > Ok. Is there a we can make this code applicable only a Powerpc host? > > > (would moving this code to target-ppc/kvm.c do that?) > >=20 > > Yes, moving it to target-ppc/kvm.c would mostly do that. You'd need > > some logic to make sure it fails gracefully in other cases, of course. > >=20 > > [snip] > > > | > switch (parameter) { > > > | > + case RTAS_SYSPARM_PROCESSOR_MODULE_INFO: { > > > | > + int i; > > > | > + int offset =3D 0; > > > | > + int size; > > > | > + struct rtas_module_info modinfo; > > > | > + > > > | > + if (rtas_get_module_info(&modinfo)) { > > > | > + break; > > > | > + } > > > |=20 > > > | So, you handle the variable size of this structure before sending it > > > | to the guest, but you don't handle it in allocation of the structure > > > | right here. You'll get away with that because for now you only ever > > > | have one entry in the sockets array, but it's a bit icky. > > >=20 > > > Can we assume that the size is static for now... > > > |=20 > > > | > + > > > | > + size =3D sizeof(modinfo); > > > | > + size +=3D (modinfo.module_types - 1) * sizeof(struct rta= s_socket_info); > > > |=20 > > > | More seriously, this calculation will break horribly if you change = the > > > | size of the array in struct rtas_module_info. > > >=20 > > > and just set 'size' to sizeof(modinfo)?. > >=20 > > For purposes of allocation you could just use a fixed size. But the > > guest might get confused by additional data beyond the declared size, > > so you do need to get the value correct that you send back to the guest. > >=20 > > [snip] > > > | > +/* > > > | > + * Each module's (aka socket's) id is contained in the 'ibm,hw-m= odule-id' > > > | > + * file in the "xscom" directory (/proc/device-tree/xscom*). Sim= ilarly each > > > | > + * chip's id is contained in the 'ibm,chip-id' file in the xscom= directory. > > > | > + * > > > | > + * A module can contain more than one chip and a chip can contai= n more > > > | > + * than one core. So there are likely to be duplicates in the mo= dule > > > | > + * and chip identifiers (i.e more than one xscom directory can c= ontain > > > | > + * the same module/chip id). > > > | > + * > > > | > + * Search the xscom directories and count the number of _UNIQUE_= module > > > | > + * and chip identifiers in the system. > > > |=20 > > > | There's no direct way to go from a core > > > | (i.e. /proc/device-tree/cpus/cpu@NNN) to the corresponding chip and= /or > > > | module? > > >=20 > > > Yes, it would logical to find the chip and module from the core :-) > > >=20 > > > While 'ibm,chip-id' is in the core dir (/proc/device-tree/cpus/PowerP= C,*/),=20 > > > the 'ibm,hw-module-id' is not there (on my Tuleta system). Maybe the > > > 'ibm,hw-module-id' will be added in the future? > >=20 > > Hm, I see. Is there any device node that represents the "chip"? > >=20 > > > I am using the xscom node to be consistent in counting chips and modu= les. > >=20 > > The trouble with xscom is that it's extremely specific to the way the > > current IBM servers present things. It won't work on other types of > > host machine (which could happen with PR KVM), and could even break if > > IBM changes the way it organizes the SCOMs in a future machine. > >=20 > > Working from the nodes in /cpus still has some dependencies on IBM > > specific properties, but it's at least partially based on OF > > standards. > >=20 > > There's also another possible approach here, though I don't know if it > > will work. Instead of looking directly in the device tree, try to get > > the information from lscpu, or libosinfo. That would at least give > > you some hope of providing meaningful information on other host types. >=20 > Heh, the issue that is underlying all of this, is that `lscpu` itself is > quite wrong. >=20 > On PAPR-compliant hypervisors (well, PowerVM, at least), the only > supported means of determining the underlying hardware CPU information > (which is what licensing models want in the end), is to use this RTAS > call in an LPAR. `lscpu` is explicitly incorrect in these environments > (it's values are "derived" from sysfs and some are adjusted to ensure > the division of values works out). So.. I'm not sure if you're just saying that lscpu is wrong because it gives the guest information, or because of other problems. What I was suggesting is implementing the RTAS call so that it effectively lets the guest get lscpu information from the host. > So, we are trying to at least resolve what PowerKVM guest can see by > supporting this RTAS call there. We should report *something* to the > guest, if possible, and we can adjust what is reported to the guests as > we go, from the host perspective. >=20 > I haven't followed along too closely in this thread, but woudl it be > reasonable to only report this RTAS call as being supported under > KVM? Possibly, yes. > How are other RTAS calls dealt with for PR and non-IBM models > currently? Most of them still make sense in PR or TCG. A few do look in the host device tree, in which case they're likely to fail on non-KVM. --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --UoPmpPX/dBe4BELn Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJWQpzGAAoJEGw4ysog2bOSjp4QAIVA87GeP4QJHSBZ2jkgxnar LoMaHiadgj3Ze8VeqfOUjtljrCTleinAKdeq2SgRc3uXOiM8Ntk8I4m8MKOP4D2g 7rjY9diSqy0ABE2VQbugitKpR9YjnegiYQBJyYceTVil2gkPYPaLHXxspwtEn7fQ LJOdwDHl1JaFw6vzdDuk3sNlyqvAV3SYsjb3tVJU86FA5QPAkBQSycLCpk78RHlc 9gfqiCeKNHm9a7d/M+TuSsKmAKcUS92uBZqu3+wT+cB7XE1pDfdpKIXOhJa0FBVF Obq4dOgrzL9VnjxzI6A0EBC5P+poyUgPauMWxdCGugDhSGvpM+T1igJrFxospKYy 9+XJfU68LSZhg1a55GyKWVr/KM1AFX/+7m7p5cPiMB1BZCUKhbkZD7IIbGZlKTnz JnzJHCr6hWv9o2a532lgIb9vBIrbDFBvq9s/oYLxrjw7PL2HmGu4dc32V9GF3uEl J8KWuVNw5SKaCJ+1KQ8wYAeZYPATTTL47nvBD8wXYivLarv1oP+xRpDWthLdLE0O vwooHuFLmMRA9qtrIz45aw/rF3gmDvocfECQdxCtMy1JdE7AYTjZNrVVeOg0yQXk 43exKZJNcjxUazj9Gp17tkaF9h6jI7aUORtfjI4DaWjDLExkpesD+NLw8aaRpcgR 7TvVZWN/skAdGITV94sV =vvLM -----END PGP SIGNATURE----- --UoPmpPX/dBe4BELn--