Re: [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO)

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: stewart@linux.vnet.ibm.com, benh@au1.ibm.com,
	nacc@linux.vnet.ibm.com, agraf@suse.de, qemu-devel@nongnu.org,
	qemu-ppc@nongnu.org, paulus@au1.ibm.com,
	david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO)
Date: Tue, 10 Nov 2015 15:25:38 +1100	[thread overview]
Message-ID: <564171C2.8040700@ozlabs.ru> (raw)
In-Reply-To: <20151110035757.GA20030@us.ibm.com>

On 11/10/2015 02:57 PM, Sukadev Bhattiprolu wrote:
> Alexey Kardashevskiy [aik@ozlabs.ru] wrote:
> | On 11/05/2015 10:06 AM, Sukadev Bhattiprolu wrote:
> | >Implement RTAS_SYSPARM_PROCESSOR_MODULE_INFO parameter to rtas_get_sysparm()
> | >call in qemu. This call returns the processor module (socket), chip and core
> | >information as specified in section 7.3.16.18 of PAPR v2.7.
> | >
> | >We walk the /proc/device-tree to determine the number of chips, cores and
> | >modules in the _host_ system and return this info to the guest application
> | >that makes the rtas_get_sysparm() call.
> | >
> | >We currently hard code the number of module_types to 1, but we should determine
> | >that dynamically somehow later.
> | >
> | >Thanks to input from Nishanth Aravamudan and Alexey Kardashevsk.
> |
> | "iy" is missing :)
>
> Argh. Sorry :-)
>
> Agree with your other comments. Couple of questions/comments below.
>
> |
> | checkpatch.pl does not warn on this but new lines start under
> | opening brace in the previous line.
> |
> | In terms on vim, it would be:
> | set expandtab
> | set tabstop=4
> | set shiftwidth=4
> | set cino=:0,(0
> |
> |
> |
> | >+            offset += 2;
> | >+            stw_be_phys(&address_space_memory, buffer+offset,
> | >+                                    modinfo.si[i].chips);
> | >+            offset += 2;
> | >+            stw_be_phys(&address_space_memory, buffer+offset,
> | >+                                    modinfo.si[i].cores_per_chip);
> | >+            offset += 2;
> | >+        }
> | >+        break;
> | >+    }
> | >+
> | >      case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: {
> | >          char *param_val = g_strdup_printf("MaxEntCap=%d,"
> | >                                            "DesMem=%llu,"
> | >diff --git a/hw/ppc/spapr_rtas_modinfo.c b/hw/ppc/spapr_rtas_modinfo.c
> | >new file mode 100644
> | >index 0000000..068fc2c
> | >--- /dev/null
> | >+++ b/hw/ppc/spapr_rtas_modinfo.c
> | >@@ -0,0 +1,230 @@
> | >+/*
> | >+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator
> | >+ *
> | >+ * Hypercall based emulated RTAS
> |
> |
> | This is a description of hw/ppc/spapr_rtas.c, not of the new file.
>
> Agree.
>
> |
> | Not sure the new code deserves a separate file, I'd either:
> | - add it into hw/ppc/spapr_rtas.c OR
> | - move rtas_ibm_get_system_parameter() +
> | rtas_ibm_set_system_parameter() to a separate file (let's call it
> | hw/ppc/spapr_rtas_syspar.c) and add this new parameter there as
> | there will be new parameters in the future anyway.
> |
> | But I'll leave to the maintainer (David, hi :) ).
>
> Will discuss this on the other thread from David (maybe move to
> target-ppc/kvm.c?)


Sounds right.


>
> |
> |
> <snip>
>
> | >+
> | >+static int file_read_buf(char *file_name, char *buf, int len)
> | >+{
> | >+    int rc;
> | >+    FILE *fp;
> | >+
> | >+    fp = fopen(file_name, "r");
> | >+    if (!fp) {
> | >+        error_report("%s: Error opening %s\n", __func__, file_name);
> |
> |
> | error_report() does not need "\n", remote it here and below.
> |
> | Another thing - you passed __func__ here but you did not in other
> | places, please make this consistent and pass __func__ everywhere OR
> | make error messages more descriptive, the latter seems to be
> | preferable way in QEMU, either way this should be easily grep'able,
> | for example - "Unable to open %s, error %d" is too generic.
>
> will change to "Error opening device tree file %s" and drop the
> __func__.
>
> |
> |
> | >+        return -1;
> | >+    }
> | >+
> | >+    rc = fread(buf, 1, len, fp);
> | >+    fclose(fp);
> | >+
> | >+    if (rc != len) {
> | >+        return -1;
> | >+    }
> | >+
> | >+    return 0;
> | >+}
> | >+
> | >+/*
> | >+ * Read an identifier from the file @path and add the identifier
> | >+ * to the hash table @gt unless its already in the table.
> | >+ */
> | >+static int hash_table_add_contents(GHashTable *gt, char *path)
> |
> | Move this helper before count_modules_chips(). I thought for a sec
> | that count_cores() uses it too :)
>
> Ok.
>
> |
> |
> | >+{
> | >+    int idx;
> | >+    char buf[16];
> | >+    int *key;
> | >+
> | >+    if (file_read_buf(path, buf, sizeof(int))) {
> | >+        return -1;
> | >+    }
> | >+
> | >+    idx = ldl_be_p(buf);
> | >+
> | >+    if (g_hash_table_contains(gt, &idx)) {
> | >+        return 0;
> | >+    }
> | >+
> | >+    key = g_malloc(sizeof(int));
> |
> |
> | I kind of hoped that g_direct_hash() (and probably GINT_TO_POINTER()
> | but I do not see kvm-all.c using it) will let you avoid
> | g_malloc()'ing the keys.
>
> I think g_direct_hash() and GINT_TO_POINTER will work. Will try them
> out.
>
> |
> |
> |
> | >+
> | >+    *key = idx;
> | >+    if (!g_hash_table_insert(gt, key, NULL)) {
> | >+        error_report("Unable to insert key %d into chips hash table\n", idx);
> | >+        g_free(key);
> | >+        return -1;
> | >+    }
> | >+
> | >+    return 0;
> | >+}
> | >+
> | >+/*
> | >+ * Each core in the system is represented by a directory with the
> | >+ * prefix 'PowerPC,POWER' in the directory /proc/device-tree/cpus/.
> | >+ * Process that directory and count the number of cores in the system.
> | >+ */
> | >+static int count_cores(int *num_cores)
> | >+{
> | >+    int rc;
> | >+    DIR *dir;
> | >+    struct dirent *ent;
> | >+    const char *cpus_dir = "/proc/device-tree/cpus";
> | >+    const char *ppc_prefix = "PowerPC,POWER";
> | >+
> | >+    dir = opendir(cpus_dir);
> | >+    if (!dir) {
> | >+        error_report("Unable to open %s, error %d\n", cpus_dir, errno);
> | >+        return -1;
> | >+    }
> | >+
> | >+    *num_cores = 0;
> | >+    while (1) {
> | >+        errno = 0;
> | >+        ent = readdir(dir);
> | >+        if (!ent) {
> |
> | rc = -errno; ....
>
> Yes. Thanks,
> |
> |
> | >+            break;
> | >+        }
> | >+
> | >+        if (!strncmp(ppc_prefix, ent->d_name, strlen(ppc_prefix))) {
> | >+            (*num_cores)++;
> | >+        }
> | >+    }
> | >+
> | >+    rc = 0;
> | >+    if (errno) {
> | >+        rc = -1;
> | >+    }
> |
> |
> | ... and remove these 4 lines above?
>
> Ok.
>
> |
> |
> |
> | >+
> | >+    closedir(dir);
> | >+    return rc;
> | >+}
> | >+
> | >+/*
> | >+ * Each module's (aka socket's) id is contained in the 'ibm,hw-module-id'
> | >+ * file in the "xscom" directory (/proc/device-tree/xscom*). Similarly each
> | >+ * chip's id is contained in the 'ibm,chip-id' file in the xscom directory.
> | >+ *
> | >+ * A module can contain more than one chip and a chip can contain more
> | >+ * than one core. So there are likely to be duplicates in the module
> | >+ * and chip identifiers (i.e more than one xscom directory can contain
> | >+ * the same module/chip id).
> | >+ *
> | >+ * Search the xscom directories and count the number of _UNIQUE_ module
> | >+ * and chip identifiers in the system.
> | >+ */
> | >+static int count_modules_chips(int *num_modules, int *num_chips)
> | >+{
> | >+    int rc;
> |
> | int rc = -1;
> |
> |
> | >+    DIR *dir;
> | >+    struct dirent *ent;
> | >+    char path[PATH_MAX];
> | >+    const char *xscom_dir = "/proc/device-tree";
> | >+    const char *xscom_prefix = "xscom@";
> | >+    GHashTable *chips_tbl, *modules_tbl;
> | >+
> | >+    dir = opendir(xscom_dir);
> | >+    if (!dir) {
> | >+        error_report("Unable to open %s, error %d\n", xscom_dir, errno);
> | >+        return -1;
> | >+    }
> | >+
> | >+    modules_tbl = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, NULL);
> | >+    chips_tbl = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, NULL);
> | >+
> | >+    rc = -1;
> |
> |
> | Remove this.
> |
> | >+    while (1) {
> | >+        errno = 0;
> | >+        ent = readdir(dir);
> | >+        if (!ent) {
> |
> | Add this:
> | rc = -errno;
> | goto cleanup;
> |
> | >+            break;
> | >+        }
> | >+
> | >+        if (strncmp(xscom_prefix, ent->d_name, strlen(xscom_prefix))) {
> | >+            continue;
> | >+        }
> | >+
> | >+        sprintf(path, "%s/%s/ibm,chip-id", xscom_dir, ent->d_name);
> | >+        if (hash_table_add_contents(chips_tbl, path)) {
> | >+            goto cleanup;
> | >+        }
> | >+
> | >+        sprintf(path, "%s/%s/ibm,hw-module-id", xscom_dir, ent->d_name);
> | >+        if (hash_table_add_contents(modules_tbl, path)) {
> | >+            goto cleanup;
> | >+        }
> | >+    }
> | >+
> | >+    if (errno) {
> | >+        goto cleanup;
> | >+    }
> |
> | Do not need these 3 lines.
> |
> |
> | >+
> | >+    *num_modules = g_hash_table_size(modules_tbl);
> | >+    *num_chips = g_hash_table_size(chips_tbl);
> | >+    rc = 0;
> |
> |
> | Remove this.
> |
> | count_modules_chips() and count_cores() do pretty similar things, it
> | would improve the code if error handling was done similar ways.
>
> Agree.
>
> |
> |
> | >+
> | >+cleanup:
> | >+    g_hash_table_remove_all(modules_tbl);
> | >+    g_hash_table_destroy(modules_tbl);
> | >+
> | >+    g_hash_table_remove_all(chips_tbl);
> | >+    g_hash_table_destroy(chips_tbl);
> | >+
> | >+    closedir(dir);
> | >+
> | >+    return rc;
> | >+}
> | >+
> | >+int rtas_get_module_info(struct rtas_module_info *modinfo)
> | >+{
> | >+    int cores;
> | >+    int chips;
> | >+    int modules;
> |
> | Nit - could be one line.
> |
> |
> | >+
> | >+    memset(modinfo, 0, sizeof(struct rtas_module_info));
> | >+
> | >+    if (count_modules_chips(&modules, &chips) || count_cores(&cores)) {
> | >+        return -1;
> | >+    }
> | >+
> | >+    /*
> | >+     * TODO: Hard code number of module_types for now till
> | >+     *       we figure out how to determine it dynamically.
> | >+     */
> | >+    modinfo->module_types = 1;
> | >+    modinfo->si[0].sockets = modules;
> |
> |
> | A "module" here is what modinfo describes so I'd expect @modules to
> | be "1" in the existing code and count_modules_chips() to be named
> | count_sockets_chips() and return @sockets, not @modules.
> |
> | When exactly does a socket become a module? The SPAPR spec uses "sockets" here.
>
> I am trying to get the terminology too :-) Is socket a slot where a
> module is attached?

Sorry, no idea.


>
> I will change the variable name 'modules' to 'sockets'.
> |
> |
> | >+    modinfo->si[0].chips = chips;
> | >+    modinfo->si[0].cores_per_chip = cores / chips;
> |
> |
> | What if no "ibm,chip-id" was found and chips == 0?
>
> If we fail to readdir(xscom) or fail to read the 'ibm,chip-id',
> we return an error which we check above.


You assume that if there is /proc/device-tree, then there is always 
"xscom@" but this might not be always the case, like PR KVM on embedded PPC64.



>
> |
> |
> | >+
> | >+    return 0;
> | >+}
> | >diff --git a/hw/ppc/spapr_rtas_modinfo.h b/hw/ppc/spapr_rtas_modinfo.h
> | >new file mode 100644
> | >index 0000000..356a2b5
> | >--- /dev/null
> | >+++ b/hw/ppc/spapr_rtas_modinfo.h
> |
> |
> | This file is missing a license and
> | #ifndef SPAPR_RTAS_MODINFO_H
> | #define SPAPR_RTAS_MODINFO_H
> | #endif
> |
> | But I'd rather put the content to include/hw/ppc/spapr.h and avoid
> | creating new files.
>
> Ok.
>
> |
> |
> | >@@ -0,0 +1,12 @@
> | >+
> | >+struct rtas_socket_info {
> | >+    unsigned short sockets;
> | >+    unsigned short chips;
> | >+    unsigned short cores_per_chip;
> | >+};
> | >+struct rtas_module_info {
> | >+    unsigned short module_types;
> | >+    struct rtas_socket_info si[1];
> | >+};
> |
> |
> | Will the number of rtas_socket_info be always a constant or (sure,
> | in the future) it may take different values?
>
> TBH, I am not sure. One thing though, array size of rtas_socket_info
> depends on number of module _types_. Like Nishanth Aravamudan mentioned
> offline, we are not sure if both a DCM and SCM types can be on a system
> at once (or even if they can be added dynamically).

What are DCM and SCM? What type do we have in Tuleta?


> Another thing I have on my list to check is how this works with
> hotplug CPU (in the host). If the device-tree is properly updated
> then these calls will return the updated info? The values in the
> array will change but the number of entries (types) in the array is
> still 1.
>
> |
> | Normally when you invent API like rtas_get_module_info(), you tell
> | the helper how much memory is allocated in the rtas_module_info
> | struct (in our case it is ARRAY_SIZE(rtas_module_info.si) which is
> | fine) and then the helper signals somehow how many module types it
> | has found (which is missing here).
>
> Can we assume that the number of types is static for now i.e both
> caller and callee know the size (although I should fix the computation
> of size in spapr_rtas.c and use MAX_MODULE_TYPES that David pointed out).


I do not know what these modules can be so I cannot advise on this, sorry. 
I honestly do not see much point in implementing this system parameter - 
what will the guest do with this information?

You could get rid of rtas_module_info for now and have a helper for 
rtas_socket_info only - rtas_get_xxx_module_info() (where xxx is DCM or SCM 
or whatever it is now and what we support). Then it would be up to 
rtas_ibm_get_system_parameter() to decide how many modules types we want to 
expose to the guest (now - one) and when we get a new module type, we will 
either extend rtas_get_xxx_module_info() or implement another helper.


-- 
Alexey

next prev parent reply	other threads:[~2015-11-10  4:25 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-04 23:06 [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO) Sukadev Bhattiprolu
2015-11-09  1:57 ` Alexey Kardashevskiy
2015-11-09  5:01   ` David Gibson
2015-11-10  3:57   ` Sukadev Bhattiprolu
2015-11-10  4:25     ` Alexey Kardashevskiy [this message]
2015-11-10  4:46       ` Sukadev Bhattiprolu
2015-11-10  6:58         ` Alexey Kardashevskiy
2015-11-10 18:27           ` Sukadev Bhattiprolu
2015-11-09  4:58 ` David Gibson
2015-11-10  4:22   ` Sukadev Bhattiprolu
2015-11-10  9:53     ` Thomas Huth
2015-11-13 20:29       ` Sukadev Bhattiprolu
2015-11-11  0:17     ` David Gibson
2015-11-11  0:56       ` Nishanth Aravamudan
2015-11-11  1:41         ` David Gibson
2015-11-11 22:10           ` Nishanth Aravamudan
2015-11-12  4:47             ` David Gibson
2015-11-12 16:46               ` Nishanth Aravamudan
2015-12-01  3:41                 ` David Gibson
2015-12-05  1:04                   ` Nishanth Aravamudan
2015-12-10  3:55                     ` David Gibson
2015-11-13 20:21       ` Sukadev Bhattiprolu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=564171C2.8040700@ozlabs.ru \
    --to=aik@ozlabs.ru \
    --cc=agraf@suse.de \
    --cc=benh@au1.ibm.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=nacc@linux.vnet.ibm.com \
    --cc=paulus@au1.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=stewart@linux.vnet.ibm.com \
    --cc=sukadev@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).