From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:47835) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zvea7-0003wE-Qu for qemu-devel@nongnu.org; Mon, 09 Nov 2015 00:01:30 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Zvea4-0008HQ-Ro for qemu-devel@nongnu.org; Mon, 09 Nov 2015 00:01:27 -0500 Date: Mon, 9 Nov 2015 15:58:12 +1100 From: David Gibson Message-ID: <20151109045812.GE18558@voom.redhat.com> References: <1446678366-15082-1-git-send-email-sukadev@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="CGDBiGfvSTbxKZlW" Content-Disposition: inline In-Reply-To: <1446678366-15082-1-git-send-email-sukadev@linux.vnet.ibm.com> Subject: Re: [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO) List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Sukadev Bhattiprolu Cc: stewart@linux.vnet.ibm.com, benh@au1.ibm.com, aik@ozlabs.ru, nacc@linux.vnet.ibm.com, agraf@suse.de, qemu-devel@nongnu.org, qemu-ppc@nongnu.org, paulus@au1.ibm.com --CGDBiGfvSTbxKZlW Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Nov 04, 2015 at 03:06:05PM -0800, Sukadev Bhattiprolu wrote: > Implement RTAS_SYSPARM_PROCESSOR_MODULE_INFO parameter to rtas_get_syspar= m() > call in qemu. This call returns the processor module (socket), chip and c= ore > information as specified in section 7.3.16.18 of PAPR v2.7. PAPR v2.7 isn't available publically. For upstream patches, please reference LoPAPR instead (where it's section 7.3.16.17 AFAICT). > We walk the /proc/device-tree to determine the number of chips, cores and > modules in the _host_ system and return this info to the guest application > that makes the rtas_get_sysparm() call. >=20 > We currently hard code the number of module_types to 1, but we should det= ermine > that dynamically somehow later. >=20 > Thanks to input from Nishanth Aravamudan and Alexey Kardashevsk. >=20 > Signed-off-by: Sukadev Bhattiprolu This isn't ready to go yet - you need to put some more consideration into the uncommon cases: PR KVM, TCG and non-Power hosts. > --- > Changelog[v2]: > [Alexey Kardashevsk] Use existing interfaces like ldl_be_p(), > stw_be_phys(), g_hash_table_new_full(), error_report() ra= ther > than re-inventing; fix indentation, function prottypes et= c; > Drop the fts() interface (which doesn't seem to be availa= ble > on mingw32/mingw64) and use opendir() to walk specific > directories in the directory tree. > --- > hw/ppc/Makefile.objs | 1 + > hw/ppc/spapr_rtas.c | 35 +++++++ > hw/ppc/spapr_rtas_modinfo.c | 230 ++++++++++++++++++++++++++++++++++++++= ++++++ > hw/ppc/spapr_rtas_modinfo.h | 12 +++ > include/hw/ppc/spapr.h | 1 + > 5 files changed, 279 insertions(+) > create mode 100644 hw/ppc/spapr_rtas_modinfo.c > create mode 100644 hw/ppc/spapr_rtas_modinfo.h >=20 > diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs > index c1ffc77..57c6b02 100644 > --- a/hw/ppc/Makefile.objs > +++ b/hw/ppc/Makefile.objs > @@ -4,6 +4,7 @@ obj-y +=3D ppc.o ppc_booke.o > obj-$(CONFIG_PSERIES) +=3D spapr.o spapr_vio.o spapr_events.o > obj-$(CONFIG_PSERIES) +=3D spapr_hcall.o spapr_iommu.o spapr_rtas.o > obj-$(CONFIG_PSERIES) +=3D spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng= =2Eo > +obj-$(CONFIG_PSERIES) +=3D spapr_rtas_modinfo.o > ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy) > obj-y +=3D spapr_pci_vfio.o > endif > diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c > index 34b12a3..41fd8a6 100644 > --- a/hw/ppc/spapr_rtas.c > +++ b/hw/ppc/spapr_rtas.c > @@ -34,6 +34,8 @@ > #include "hw/ppc/spapr.h" > #include "hw/ppc/spapr_vio.h" > #include "qapi-event.h" > + > +#include "spapr_rtas_modinfo.h" > #include "hw/boards.h" > =20 > #include > @@ -240,6 +242,39 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU= *cpu, > target_ulong ret =3D RTAS_OUT_SUCCESS; > =20 > switch (parameter) { > + case RTAS_SYSPARM_PROCESSOR_MODULE_INFO: { > + int i; > + int offset =3D 0; > + int size; > + struct rtas_module_info modinfo; > + > + if (rtas_get_module_info(&modinfo)) { > + break; > + } So, you handle the variable size of this structure before sending it to the guest, but you don't handle it in allocation of the structure right here. You'll get away with that because for now you only ever have one entry in the sockets array, but it's a bit icky. > + > + size =3D sizeof(modinfo); > + size +=3D (modinfo.module_types - 1) * sizeof(struct rtas_socket= _info); More seriously, this calculation will break horribly if you change the size of the array in struct rtas_module_info. > + stw_be_phys(&address_space_memory, buffer+offset, size); > + offset +=3D 2; > + > + stw_be_phys(&address_space_memory, buffer+offset, modinfo.module= _types); > + offset +=3D 2; > + > + for (i =3D 0; i < modinfo.module_types; i++) { > + stw_be_phys(&address_space_memory, buffer+offset, > + modinfo.si[i].sockets); > + offset +=3D 2; > + stw_be_phys(&address_space_memory, buffer+offset, > + modinfo.si[i].chips); > + offset +=3D 2; > + stw_be_phys(&address_space_memory, buffer+offset, > + modinfo.si[i].cores_per_chip); > + offset +=3D 2; > + } > + break; > + } > + > case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: { > char *param_val =3D g_strdup_printf("MaxEntCap=3D%d," > "DesMem=3D%llu," > diff --git a/hw/ppc/spapr_rtas_modinfo.c b/hw/ppc/spapr_rtas_modinfo.c > new file mode 100644 > index 0000000..068fc2c > --- /dev/null > +++ b/hw/ppc/spapr_rtas_modinfo.c > @@ -0,0 +1,230 @@ > +/* > + * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Em= ulator > + * > + * Hypercall based emulated RTAS > + * > + * Copyright (c) 2015 Sukadev Bhattiprolu, IBM Corporation. > + * > + * Permission is hereby granted, free of charge, to any person obtaining= a copy > + * of this software and associated documentation files (the "Software"),= to deal > + * in the Software without restriction, including without limitation the= rights > + * to use, copy, modify, merge, publish, distribute, sublicense, and/or = sell > + * copies of the Software, and to permit persons to whom the Software is > + * furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be includ= ed in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRE= SS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILI= TY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHA= LL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR = OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISI= NG FROM, > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALING= S IN > + * THE SOFTWARE. > + * > + */ > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > +#include "spapr_rtas_modinfo.h" > +#include "qemu/error-report.h" > +#include "qemu/bswap.h" This entirely file assumes that (a) you have a ppc64 Linux host, which may not be true, and (b) that it's ok to expose host details to the guest. > +static int file_read_buf(char *file_name, char *buf, int len) > +{ > + int rc; > + FILE *fp; > + > + fp =3D fopen(file_name, "r"); > + if (!fp) { > + error_report("%s: Error opening %s\n", __func__, file_name); > + return -1; > + } > + > + rc =3D fread(buf, 1, len, fp); > + fclose(fp); > + > + if (rc !=3D len) { > + return -1; > + } > + > + return 0; > +} > + > +/* > + * Read an identifier from the file @path and add the identifier > + * to the hash table @gt unless its already in the table. > + */ > +static int hash_table_add_contents(GHashTable *gt, char *path) > +{ > + int idx; > + char buf[16]; > + int *key; > + > + if (file_read_buf(path, buf, sizeof(int))) { If you're just reading sizeof(int), why is the buf 16 bytes? You should allocate buf to be the right size and use sizeof(buf). Also since this is a value you're getting directly from externally, you should use a fixed width type, rather than 'int'. > + return -1; > + } > + > + idx =3D ldl_be_p(buf); Easier to use the 'beNN_to_cpu' functions than ldl_be here. > + if (g_hash_table_contains(gt, &idx)) { > + return 0; > + } > + > + key =3D g_malloc(sizeof(int)); > + > + *key =3D idx; > + if (!g_hash_table_insert(gt, key, NULL)) { > + error_report("Unable to insert key %d into chips hash table\n", = idx); > + g_free(key); > + return -1; > + } > + > + return 0; > +} > + > +/* > + * Each core in the system is represented by a directory with the > + * prefix 'PowerPC,POWER' in the directory /proc/device-tree/cpus/. > + * Process that directory and count the number of cores in the system. True on IBM POWER systems, but not necessarily everywhere - e.g. PR KVM on an embedded PowerPC host. > + */ > +static int count_cores(int *num_cores) > +{ > + int rc; > + DIR *dir; > + struct dirent *ent; > + const char *cpus_dir =3D "/proc/device-tree/cpus"; > + const char *ppc_prefix =3D "PowerPC,POWER"; > + > + dir =3D opendir(cpus_dir); > + if (!dir) { > + error_report("Unable to open %s, error %d\n", cpus_dir, errno); > + return -1; > + } You could probably do this more simply with glob(3). > + *num_cores =3D 0; > + while (1) { > + errno =3D 0; > + ent =3D readdir(dir); > + if (!ent) { > + break; > + } > + > + if (!strncmp(ppc_prefix, ent->d_name, strlen(ppc_prefix))) { > + (*num_cores)++; > + } > + } > + > + rc =3D 0; > + if (errno) { > + rc =3D -1; > + } > + > + closedir(dir); > + return rc; > +} > + > +/* > + * Each module's (aka socket's) id is contained in the 'ibm,hw-module-id' > + * file in the "xscom" directory (/proc/device-tree/xscom*). Similarly e= ach > + * chip's id is contained in the 'ibm,chip-id' file in the xscom directo= ry. > + * > + * A module can contain more than one chip and a chip can contain more > + * than one core. So there are likely to be duplicates in the module > + * and chip identifiers (i.e more than one xscom directory can contain > + * the same module/chip id). > + * > + * Search the xscom directories and count the number of _UNIQUE_ module > + * and chip identifiers in the system. There's no direct way to go from a core (i.e. /proc/device-tree/cpus/cpu@NNN) to the corresponding chip and/or module? > + */ > +static int count_modules_chips(int *num_modules, int *num_chips) > +{ > + int rc; > + DIR *dir; > + struct dirent *ent; > + char path[PATH_MAX]; > + const char *xscom_dir =3D "/proc/device-tree"; > + const char *xscom_prefix =3D "xscom@"; > + GHashTable *chips_tbl, *modules_tbl; > + > + dir =3D opendir(xscom_dir); > + if (!dir) { > + error_report("Unable to open %s, error %d\n", xscom_dir, errno); > + return -1; > + } > + > + modules_tbl =3D g_hash_table_new_full(g_int_hash, g_int_equal, g_fre= e, NULL); > + chips_tbl =3D g_hash_table_new_full(g_int_hash, g_int_equal, g_free,= NULL); > + > + rc =3D -1; > + while (1) { > + errno =3D 0; > + ent =3D readdir(dir); > + if (!ent) { > + break; > + } Again glob(3) could simplify this. > + > + if (strncmp(xscom_prefix, ent->d_name, strlen(xscom_prefix))) { > + continue; > + } > + > + sprintf(path, "%s/%s/ibm,chip-id", xscom_dir, ent->d_name); > + if (hash_table_add_contents(chips_tbl, path)) { > + goto cleanup; > + } > + > + sprintf(path, "%s/%s/ibm,hw-module-id", xscom_dir, ent->d_name); > + if (hash_table_add_contents(modules_tbl, path)) { > + goto cleanup; > + } > + } > + > + if (errno) { > + goto cleanup; > + } > + > + *num_modules =3D g_hash_table_size(modules_tbl); > + *num_chips =3D g_hash_table_size(chips_tbl); > + rc =3D 0; > + > +cleanup: > + g_hash_table_remove_all(modules_tbl); > + g_hash_table_destroy(modules_tbl); > + > + g_hash_table_remove_all(chips_tbl); > + g_hash_table_destroy(chips_tbl); > + > + closedir(dir); > + > + return rc; > +} > + > +int rtas_get_module_info(struct rtas_module_info *modinfo) > +{ > + int cores; > + int chips; > + int modules; > + > + memset(modinfo, 0, sizeof(struct rtas_module_info)); > + > + if (count_modules_chips(&modules, &chips) || count_cores(&cores)) { > + return -1; > + } > + > + /* > + * TODO: Hard code number of module_types for now till > + * we figure out how to determine it dynamically. > + */ > + modinfo->module_types =3D 1; > + modinfo->si[0].sockets =3D modules; > + modinfo->si[0].chips =3D chips; > + modinfo->si[0].cores_per_chip =3D cores / chips; > + > + return 0; > +} > diff --git a/hw/ppc/spapr_rtas_modinfo.h b/hw/ppc/spapr_rtas_modinfo.h > new file mode 100644 > index 0000000..356a2b5 > --- /dev/null > +++ b/hw/ppc/spapr_rtas_modinfo.h > @@ -0,0 +1,12 @@ > + > +struct rtas_socket_info { > + unsigned short sockets; > + unsigned short chips; > + unsigned short cores_per_chip; > +}; > +struct rtas_module_info { > + unsigned short module_types; > + struct rtas_socket_info si[1]; You probably want a "MAX_MODULE_TYPES" #define or similar instead of harcoding this to 1. > +}; > + > +extern int rtas_get_module_info(struct rtas_module_info *topo); > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h > index 5baa906..cfe7fa2 100644 > --- a/include/hw/ppc/spapr.h > +++ b/include/hw/ppc/spapr.h > @@ -463,6 +463,7 @@ int spapr_allocate_irq_block(int num, bool lsi, bool = msi); > /* RTAS ibm,get-system-parameter token values */ > #define RTAS_SYSPARM_SPLPAR_CHARACTERISTICS 20 > #define RTAS_SYSPARM_DIAGNOSTICS_RUN_MODE 42 > +#define RTAS_SYSPARM_PROCESSOR_MODULE_INFO 43 > #define RTAS_SYSPARM_UUID 48 > =20 > /* RTAS indicator/sensor types --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --CGDBiGfvSTbxKZlW Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJWQCfjAAoJEGw4ysog2bOSpS4P/2kevlGvxnyUUpk2q9ATDG0r mf9XAyLeh2p8U2BNEjf4Uu10fhc53sMjFy235Q9FSLYRLcq9lfIKwlWI3fdss0is FrZRlUHnbmahFPpT7rV9Dh+Cace2TqcsHn5MgGLzShXmMWCLJUCYVzRQmGOZPEmO /Faqw9NdWBuFYIlPkazHLrsjHqBv9ENjyVs8XbMt5fDbtpG7Oyjc5LRgZXKJMNDI Q+gW/JY6rfVZ9yELgc610dKSnSc4k+nAlZhKh4gfjlbV4uUrFXO1bLrPCL+LapvR rKM3HkRmZuUuyvulGblM1rjDnNeTmHXpy1rJtoDS5eOfn4r6f78rQJn0grFbDDJv z1r7EAmRGqxG0OxhbjH0Lr46kOZoQzvQrWi79fSxshNKYhlkCC9gFYNT7EqIo3sR vHO/N0rcIJ4y9XqTblI07ZqGHpdyjX8g5bcgaReq5WCOmt55HUzBe/s4LE9r3IbB fR/sqVLqtMUAMmbozfSo8r8ED+CmeafdIUZy95K0chXZf3taP8di2kwRCQCAZPYC TX0nWvGC5gt0U9SJ9gAyXjih5U9Bk1/1Obqac36RkeZUZK44gbgUEpKbu/I6JD4n bp1GANazkHhf+HA/DD7Jp0suqL8u2FI+9z0S78pA4GN4mHg9Tp25r/uN8SHYhxsY 5mwMq6yTsug8DTXaTVmb =g9pE -----END PGP SIGNATURE----- --CGDBiGfvSTbxKZlW--