From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52456) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zw0VD-0004ZM-9S for qemu-devel@nongnu.org; Mon, 09 Nov 2015 23:25:53 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Zw0V9-0003KD-4C for qemu-devel@nongnu.org; Mon, 09 Nov 2015 23:25:51 -0500 Received: from mail-pa0-x234.google.com ([2607:f8b0:400e:c03::234]:35076) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zw0V8-0003K2-JB for qemu-devel@nongnu.org; Mon, 09 Nov 2015 23:25:47 -0500 Received: by pasz6 with SMTP id z6so228193653pas.2 for ; Mon, 09 Nov 2015 20:25:45 -0800 (PST) References: <1446678366-15082-1-git-send-email-sukadev@linux.vnet.ibm.com> <563FFD9C.7070407@ozlabs.ru> <20151110035757.GA20030@us.ibm.com> From: Alexey Kardashevskiy Message-ID: <564171C2.8040700@ozlabs.ru> Date: Tue, 10 Nov 2015 15:25:38 +1100 MIME-Version: 1.0 In-Reply-To: <20151110035757.GA20030@us.ibm.com> Content-Type: text/plain; charset=koi8-r; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO) List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Sukadev Bhattiprolu Cc: stewart@linux.vnet.ibm.com, benh@au1.ibm.com, nacc@linux.vnet.ibm.com, agraf@suse.de, qemu-devel@nongnu.org, qemu-ppc@nongnu.org, paulus@au1.ibm.com, david@gibson.dropbear.id.au On 11/10/2015 02:57 PM, Sukadev Bhattiprolu wrote: > Alexey Kardashevskiy [aik@ozlabs.ru] wrote: > | On 11/05/2015 10:06 AM, Sukadev Bhattiprolu wrote: > | >Implement RTAS_SYSPARM_PROCESSOR_MODULE_INFO parameter to rtas_get_sysparm() > | >call in qemu. This call returns the processor module (socket), chip and core > | >information as specified in section 7.3.16.18 of PAPR v2.7. > | > > | >We walk the /proc/device-tree to determine the number of chips, cores and > | >modules in the _host_ system and return this info to the guest application > | >that makes the rtas_get_sysparm() call. > | > > | >We currently hard code the number of module_types to 1, but we should determine > | >that dynamically somehow later. > | > > | >Thanks to input from Nishanth Aravamudan and Alexey Kardashevsk. > | > | "iy" is missing :) > > Argh. Sorry :-) > > Agree with your other comments. Couple of questions/comments below. > > | > | checkpatch.pl does not warn on this but new lines start under > | opening brace in the previous line. > | > | In terms on vim, it would be: > | set expandtab > | set tabstop=4 > | set shiftwidth=4 > | set cino=:0,(0 > | > | > | > | >+ offset += 2; > | >+ stw_be_phys(&address_space_memory, buffer+offset, > | >+ modinfo.si[i].chips); > | >+ offset += 2; > | >+ stw_be_phys(&address_space_memory, buffer+offset, > | >+ modinfo.si[i].cores_per_chip); > | >+ offset += 2; > | >+ } > | >+ break; > | >+ } > | >+ > | > case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: { > | > char *param_val = g_strdup_printf("MaxEntCap=%d," > | > "DesMem=%llu," > | >diff --git a/hw/ppc/spapr_rtas_modinfo.c b/hw/ppc/spapr_rtas_modinfo.c > | >new file mode 100644 > | >index 0000000..068fc2c > | >--- /dev/null > | >+++ b/hw/ppc/spapr_rtas_modinfo.c > | >@@ -0,0 +1,230 @@ > | >+/* > | >+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator > | >+ * > | >+ * Hypercall based emulated RTAS > | > | > | This is a description of hw/ppc/spapr_rtas.c, not of the new file. > > Agree. > > | > | Not sure the new code deserves a separate file, I'd either: > | - add it into hw/ppc/spapr_rtas.c OR > | - move rtas_ibm_get_system_parameter() + > | rtas_ibm_set_system_parameter() to a separate file (let's call it > | hw/ppc/spapr_rtas_syspar.c) and add this new parameter there as > | there will be new parameters in the future anyway. > | > | But I'll leave to the maintainer (David, hi :) ). > > Will discuss this on the other thread from David (maybe move to > target-ppc/kvm.c?) Sounds right. > > | > | > > > | >+ > | >+static int file_read_buf(char *file_name, char *buf, int len) > | >+{ > | >+ int rc; > | >+ FILE *fp; > | >+ > | >+ fp = fopen(file_name, "r"); > | >+ if (!fp) { > | >+ error_report("%s: Error opening %s\n", __func__, file_name); > | > | > | error_report() does not need "\n", remote it here and below. > | > | Another thing - you passed __func__ here but you did not in other > | places, please make this consistent and pass __func__ everywhere OR > | make error messages more descriptive, the latter seems to be > | preferable way in QEMU, either way this should be easily grep'able, > | for example - "Unable to open %s, error %d" is too generic. > > will change to "Error opening device tree file %s" and drop the > __func__. > > | > | > | >+ return -1; > | >+ } > | >+ > | >+ rc = fread(buf, 1, len, fp); > | >+ fclose(fp); > | >+ > | >+ if (rc != len) { > | >+ return -1; > | >+ } > | >+ > | >+ return 0; > | >+} > | >+ > | >+/* > | >+ * Read an identifier from the file @path and add the identifier > | >+ * to the hash table @gt unless its already in the table. > | >+ */ > | >+static int hash_table_add_contents(GHashTable *gt, char *path) > | > | Move this helper before count_modules_chips(). I thought for a sec > | that count_cores() uses it too :) > > Ok. > > | > | > | >+{ > | >+ int idx; > | >+ char buf[16]; > | >+ int *key; > | >+ > | >+ if (file_read_buf(path, buf, sizeof(int))) { > | >+ return -1; > | >+ } > | >+ > | >+ idx = ldl_be_p(buf); > | >+ > | >+ if (g_hash_table_contains(gt, &idx)) { > | >+ return 0; > | >+ } > | >+ > | >+ key = g_malloc(sizeof(int)); > | > | > | I kind of hoped that g_direct_hash() (and probably GINT_TO_POINTER() > | but I do not see kvm-all.c using it) will let you avoid > | g_malloc()'ing the keys. > > I think g_direct_hash() and GINT_TO_POINTER will work. Will try them > out. > > | > | > | > | >+ > | >+ *key = idx; > | >+ if (!g_hash_table_insert(gt, key, NULL)) { > | >+ error_report("Unable to insert key %d into chips hash table\n", idx); > | >+ g_free(key); > | >+ return -1; > | >+ } > | >+ > | >+ return 0; > | >+} > | >+ > | >+/* > | >+ * Each core in the system is represented by a directory with the > | >+ * prefix 'PowerPC,POWER' in the directory /proc/device-tree/cpus/. > | >+ * Process that directory and count the number of cores in the system. > | >+ */ > | >+static int count_cores(int *num_cores) > | >+{ > | >+ int rc; > | >+ DIR *dir; > | >+ struct dirent *ent; > | >+ const char *cpus_dir = "/proc/device-tree/cpus"; > | >+ const char *ppc_prefix = "PowerPC,POWER"; > | >+ > | >+ dir = opendir(cpus_dir); > | >+ if (!dir) { > | >+ error_report("Unable to open %s, error %d\n", cpus_dir, errno); > | >+ return -1; > | >+ } > | >+ > | >+ *num_cores = 0; > | >+ while (1) { > | >+ errno = 0; > | >+ ent = readdir(dir); > | >+ if (!ent) { > | > | rc = -errno; .... > > Yes. Thanks, > | > | > | >+ break; > | >+ } > | >+ > | >+ if (!strncmp(ppc_prefix, ent->d_name, strlen(ppc_prefix))) { > | >+ (*num_cores)++; > | >+ } > | >+ } > | >+ > | >+ rc = 0; > | >+ if (errno) { > | >+ rc = -1; > | >+ } > | > | > | ... and remove these 4 lines above? > > Ok. > > | > | > | > | >+ > | >+ closedir(dir); > | >+ return rc; > | >+} > | >+ > | >+/* > | >+ * Each module's (aka socket's) id is contained in the 'ibm,hw-module-id' > | >+ * file in the "xscom" directory (/proc/device-tree/xscom*). Similarly each > | >+ * chip's id is contained in the 'ibm,chip-id' file in the xscom directory. > | >+ * > | >+ * A module can contain more than one chip and a chip can contain more > | >+ * than one core. So there are likely to be duplicates in the module > | >+ * and chip identifiers (i.e more than one xscom directory can contain > | >+ * the same module/chip id). > | >+ * > | >+ * Search the xscom directories and count the number of _UNIQUE_ module > | >+ * and chip identifiers in the system. > | >+ */ > | >+static int count_modules_chips(int *num_modules, int *num_chips) > | >+{ > | >+ int rc; > | > | int rc = -1; > | > | > | >+ DIR *dir; > | >+ struct dirent *ent; > | >+ char path[PATH_MAX]; > | >+ const char *xscom_dir = "/proc/device-tree"; > | >+ const char *xscom_prefix = "xscom@"; > | >+ GHashTable *chips_tbl, *modules_tbl; > | >+ > | >+ dir = opendir(xscom_dir); > | >+ if (!dir) { > | >+ error_report("Unable to open %s, error %d\n", xscom_dir, errno); > | >+ return -1; > | >+ } > | >+ > | >+ modules_tbl = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, NULL); > | >+ chips_tbl = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, NULL); > | >+ > | >+ rc = -1; > | > | > | Remove this. > | > | >+ while (1) { > | >+ errno = 0; > | >+ ent = readdir(dir); > | >+ if (!ent) { > | > | Add this: > | rc = -errno; > | goto cleanup; > | > | >+ break; > | >+ } > | >+ > | >+ if (strncmp(xscom_prefix, ent->d_name, strlen(xscom_prefix))) { > | >+ continue; > | >+ } > | >+ > | >+ sprintf(path, "%s/%s/ibm,chip-id", xscom_dir, ent->d_name); > | >+ if (hash_table_add_contents(chips_tbl, path)) { > | >+ goto cleanup; > | >+ } > | >+ > | >+ sprintf(path, "%s/%s/ibm,hw-module-id", xscom_dir, ent->d_name); > | >+ if (hash_table_add_contents(modules_tbl, path)) { > | >+ goto cleanup; > | >+ } > | >+ } > | >+ > | >+ if (errno) { > | >+ goto cleanup; > | >+ } > | > | Do not need these 3 lines. > | > | > | >+ > | >+ *num_modules = g_hash_table_size(modules_tbl); > | >+ *num_chips = g_hash_table_size(chips_tbl); > | >+ rc = 0; > | > | > | Remove this. > | > | count_modules_chips() and count_cores() do pretty similar things, it > | would improve the code if error handling was done similar ways. > > Agree. > > | > | > | >+ > | >+cleanup: > | >+ g_hash_table_remove_all(modules_tbl); > | >+ g_hash_table_destroy(modules_tbl); > | >+ > | >+ g_hash_table_remove_all(chips_tbl); > | >+ g_hash_table_destroy(chips_tbl); > | >+ > | >+ closedir(dir); > | >+ > | >+ return rc; > | >+} > | >+ > | >+int rtas_get_module_info(struct rtas_module_info *modinfo) > | >+{ > | >+ int cores; > | >+ int chips; > | >+ int modules; > | > | Nit - could be one line. > | > | > | >+ > | >+ memset(modinfo, 0, sizeof(struct rtas_module_info)); > | >+ > | >+ if (count_modules_chips(&modules, &chips) || count_cores(&cores)) { > | >+ return -1; > | >+ } > | >+ > | >+ /* > | >+ * TODO: Hard code number of module_types for now till > | >+ * we figure out how to determine it dynamically. > | >+ */ > | >+ modinfo->module_types = 1; > | >+ modinfo->si[0].sockets = modules; > | > | > | A "module" here is what modinfo describes so I'd expect @modules to > | be "1" in the existing code and count_modules_chips() to be named > | count_sockets_chips() and return @sockets, not @modules. > | > | When exactly does a socket become a module? The SPAPR spec uses "sockets" here. > > I am trying to get the terminology too :-) Is socket a slot where a > module is attached? Sorry, no idea. > > I will change the variable name 'modules' to 'sockets'. > | > | > | >+ modinfo->si[0].chips = chips; > | >+ modinfo->si[0].cores_per_chip = cores / chips; > | > | > | What if no "ibm,chip-id" was found and chips == 0? > > If we fail to readdir(xscom) or fail to read the 'ibm,chip-id', > we return an error which we check above. You assume that if there is /proc/device-tree, then there is always "xscom@" but this might not be always the case, like PR KVM on embedded PPC64. > > | > | > | >+ > | >+ return 0; > | >+} > | >diff --git a/hw/ppc/spapr_rtas_modinfo.h b/hw/ppc/spapr_rtas_modinfo.h > | >new file mode 100644 > | >index 0000000..356a2b5 > | >--- /dev/null > | >+++ b/hw/ppc/spapr_rtas_modinfo.h > | > | > | This file is missing a license and > | #ifndef SPAPR_RTAS_MODINFO_H > | #define SPAPR_RTAS_MODINFO_H > | #endif > | > | But I'd rather put the content to include/hw/ppc/spapr.h and avoid > | creating new files. > > Ok. > > | > | > | >@@ -0,0 +1,12 @@ > | >+ > | >+struct rtas_socket_info { > | >+ unsigned short sockets; > | >+ unsigned short chips; > | >+ unsigned short cores_per_chip; > | >+}; > | >+struct rtas_module_info { > | >+ unsigned short module_types; > | >+ struct rtas_socket_info si[1]; > | >+}; > | > | > | Will the number of rtas_socket_info be always a constant or (sure, > | in the future) it may take different values? > > TBH, I am not sure. One thing though, array size of rtas_socket_info > depends on number of module _types_. Like Nishanth Aravamudan mentioned > offline, we are not sure if both a DCM and SCM types can be on a system > at once (or even if they can be added dynamically). What are DCM and SCM? What type do we have in Tuleta? > Another thing I have on my list to check is how this works with > hotplug CPU (in the host). If the device-tree is properly updated > then these calls will return the updated info? The values in the > array will change but the number of entries (types) in the array is > still 1. > > | > | Normally when you invent API like rtas_get_module_info(), you tell > | the helper how much memory is allocated in the rtas_module_info > | struct (in our case it is ARRAY_SIZE(rtas_module_info.si) which is > | fine) and then the helper signals somehow how many module types it > | has found (which is missing here). > > Can we assume that the number of types is static for now i.e both > caller and callee know the size (although I should fix the computation > of size in spapr_rtas.c and use MAX_MODULE_TYPES that David pointed out). I do not know what these modules can be so I cannot advise on this, sorry. I honestly do not see much point in implementing this system parameter - what will the guest do with this information? You could get rid of rtas_module_info for now and have a helper for rtas_socket_info only - rtas_get_xxx_module_info() (where xxx is DCM or SCM or whatever it is now and what we support). Then it would be up to rtas_ibm_get_system_parameter() to decide how many modules types we want to expose to the guest (now - one) and when we get a new module type, we will either extend rtas_get_xxx_module_info() or implement another helper. -- Alexey