Date: Thu, 12 Nov 2015 15:47:15 +1100
From: David Gibson
To: Nishanth Aravamudan
Cc: stewart@linux.vnet.ibm.com, benh@au1.ibm.com, aik@ozlabs.ru, agraf@suse.de, qemu-devel@nongnu.org, qemu-ppc@nongnu.org, paulus@au1.ibm.com, Sukadev Bhattiprolu
Subject: Re: [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO)
Message-ID: <20151112044715.GB4886@voom.redhat.com>
In-Reply-To: <20151111221048.GF4644@linux.vnet.ibm.com>

On Wed, Nov 11, 2015 at 02:10:48PM -0800, Nishanth Aravamudan wrote:
> On 11.11.2015 [12:41:26 +1100], David Gibson wrote:
> > On Tue, Nov 10, 2015 at 04:56:38PM -0800, Nishanth Aravamudan wrote:
> > > On 11.11.2015 [11:17:58 +1100], David Gibson wrote:
> > > > On Mon, Nov 09, 2015 at 08:22:32PM -0800, Sukadev Bhattiprolu wrote:
>
> >
> > > > The trouble with xscom is that it's extremely specific to the way the
> > > > current IBM
> > > > servers present things.  It won't work on other types of
> > > > host machine (which could happen with PR KVM), and could even break if
> > > > IBM changes the way it organizes the SCOMs in a future machine.
> > > >
> > > > Working from the nodes in /cpus still has some dependencies on IBM
> > > > specific properties, but it's at least partially based on OF
> > > > standards.
> > > >
> > > > There's also another possible approach here, though I don't know if it
> > > > will work.  Instead of looking directly in the device tree, try to get
> > > > the information from lscpu, or libosinfo.  That would at least give
> > > > you some hope of providing meaningful information on other host types.
> > >
> > > Heh, the issue that is underlying all of this, is that `lscpu` itself is
> > > quite wrong.
> > >
> > > On PAPR-compliant hypervisors (well, PowerVM, at least), the only
> > > supported means of determining the underlying hardware CPU information
> > > (which is what licensing models want in the end), is to use this RTAS
> > > call in an LPAR. `lscpu` is explicitly incorrect in these environments
> > > (its values are "derived" from sysfs and some are adjusted to ensure
> > > the division of values works out).
> >
> > So.. I'm not sure if you're just saying that lscpu is wrong because it
> > gives the guest information, or because of other problems.
>
> `lscpu`'s man-page specifically says that on virtualized platforms, the
> output may be inaccurate. And, in fact, on Power, in a KVM guest (and
> in a LPAR), `lscpu` is outputting the guest CPU information, which is
> completely fake. This is true on x86 KVM guests too, afaict.

Um.. yes, I was assuming lscpu reporting information about virtual
cpus and sockets was intended and correct behaviour.

> *If* we have a valid RTAS implementation on PowerKVM (or under qemu
> generally), I think we can modify `lscpu` to do the right thing in at
> least those two environments.
>
> > What I was suggesting is implementing the RTAS call so that it
> > effectively lets the guest get lscpu information from the host.
>
> A bit of a chicken & egg problem, I'd say. The `lscpu` output in PowerNV
> is also wrong :)

Ok.. why is it wrong in PowerNV?  This sounds like something you'd
want to fix anyway.

> > > So, we are trying to at least resolve what a PowerKVM guest can see by
> > > supporting this RTAS call there. We should report *something* to the
> > > guest, if possible, and we can adjust what is reported to the guests as
> > > we go, from the host perspective.
> > >
> > > I haven't followed along too closely in this thread, but would it be
> > > reasonable to only report this RTAS call as being supported under
> > > KVM?
> >
> > Possibly, yes.
>
> At least, as a first step, I guess.
>
> > > How are other RTAS calls dealt with for PR and non-IBM models
> > > currently?
> >
> > Most of them still make sense in PR or TCG.  A few do look in the host
> > device tree, in which case they're likely to fail on non-KVM.
>
> Got it, thanks.
>
> So my investigation overall led me to this set of conclusions:
>
> 1) Under PowerVM, we do not use this RTAS call, which is the only (as
> asserted by pHyp developers) valid way to get hardware information about
> the machine. Therefore, the PowerVM `lscpu` output is the "virtual" CPU
> information -- where cores are as defined by sharing of the L2-cache.
>
> 2) Under PowerKVM, we do not use this RTAS call, because it's not
> supported, and just spit out whatever the qemu topology is (which has no
> connection to the host (physical) CPU information).

Right.. so does that mean nothing is using this call yet?

> --> so if we implement the RTAS call of some sort under PowerKVM, then
> we can update `lscpu` to use that RTAS call.

Yeah, I'm not convinced that's correct.  Shouldn't lscpu return the
virtual cpu information, at least by default?

> 3) Under PowerNV, there is a dependency on the hack that is ibm,chip-id
> from OPAL, which leads to twice as many sockets potentially being
> reported. `lscpu` also uses the sysfs files directly, which may or may
> not be the physical topology (I'm still tracking all of this down).
>
> *Also* `lscpu` has no knowledge of offline/online CPUs, so as you
> online/offline CPUs, the output of `lscpu` starts to change.

Ah, true.

> I think what we eventually want to do is add some fields to `lscpu` to
> indicate the "physical" data vs. the "virtual" data.

Ok.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
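
A footnote on the ibm,chip-id point in 3): below is a minimal Python sketch of how one might count the distinct ibm,chip-id values on the nodes under /cpus in a host device tree. It is an illustration only, not what lscpu or the patch under discussion does. The function names are made up for the example, and it assumes the device tree is exposed as a directory tree (for instance /proc/device-tree/cpus on Linux) with ibm,chip-id stored as a single 32-bit big-endian cell on each cpu node. Given the caveat quoted above, the count should not be read as the number of physical sockets.

```python
import struct
from pathlib import Path


def decode_cell(raw: bytes) -> int:
    """Decode one 32-bit big-endian device-tree cell."""
    if len(raw) != 4:
        raise ValueError(f"expected 4 bytes, got {len(raw)}")
    return struct.unpack(">I", raw)[0]


def count_chip_ids(cpus_dir: Path) -> int:
    """Count distinct ibm,chip-id values among the child nodes of cpus_dir.

    Assumption: each cpu node is a subdirectory and the property, when
    present, is a regular file named "ibm,chip-id" holding one cell.
    Nodes without the property are skipped.
    """
    chip_ids = set()
    for node in cpus_dir.iterdir():
        prop = node / "ibm,chip-id"
        if node.is_dir() and prop.is_file():
            chip_ids.add(decode_cell(prop.read_bytes()))
    return len(chip_ids)
```

On a PowerNV host one would call count_chip_ids(Path("/proc/device-tree/cpus")). Because nodes lacking the property are skipped, the result on other host types is simply 0, which is the same portability limit raised earlier in the thread about IBM-specific properties.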