* [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest
@ 2011-09-30 7:50 David Gibson
2011-09-30 7:50 ` [Qemu-devel] [PATCH 1/3] ppc: Generalize the kvmppc_get_clockfreq() function David Gibson
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: David Gibson @ 2011-09-30 7:50 UTC (permalink / raw)
To: agraf; +Cc: qemu-devel
This series contains some patches which, when using KVM, gather
information about the capabilities of the host CPU and advertise them
to the guest system when using the pseries machine. Specifically it
does this for whether the CPU supports VMX, VSX and/or DFP
instructions, and for the CPU's supported page sizes.
The VSX and DFP portions of this were posted earlier, and I've fixed
the minor comments which people made. This leaves one objection from
Alex Graf, that whether the features are advertised should also depend
on the target CPU selected in qemu. A similar objection may apply to
the pagesizes patch. I guess the idea is to "clamp" the advertised
capabilities to those permitted by the selected target CPU, but I'm
not entirely sure what the logic here should be.
Frankly, particularly in the case of KVM Book3S-HV, I'm not terribly
convinced that attempting to make the guest CPU appear different from
the host CPU is terribly meaningful. These patches as they stand have
the advantage that future, roughly compatible CPUs should Just Work
with these capabilities advertised in the correct cases. Alex, can
you advise what sort of logic you'd like here?
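One possible reading of the "clamp" idea, as a minimal sketch only: advertise the lower of what the host reports and what the selected target CPU model permits. The function name and the notion of a per-model maximum are assumptions made for illustration; nothing like this exists in the patches below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: limit a host capability level (for example the
 * value of the host's ibm,vmx property) to the maximum that the
 * selected target CPU model is allowed to expose. */
static uint32_t clamp_capability(uint32_t host_level, uint32_t model_max)
{
    return host_level < model_max ? host_level : model_max;
}
```

This only makes sense for properties whose values are ordered, such as ibm,vmx, where a higher value implies the lower ones.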
* [Qemu-devel] [PATCH 1/3] ppc: Generalize the kvmppc_get_clockfreq() function
2011-09-30 7:50 [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest David Gibson
@ 2011-09-30 7:50 ` David Gibson
2011-09-30 18:06 ` Alexander Graf
2011-09-30 7:50 ` [Qemu-devel] [PATCH 2/3] pseries: Add device tree properties for VMX/VSX and DFP under kvm David Gibson
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: David Gibson @ 2011-09-30 7:50 UTC (permalink / raw)
To: agraf; +Cc: qemu-devel
Currently the kvmppc_get_clockfreq() function reads the host's clock
frequency from /proc/device-tree, which is useful to pass to the guest
in KVM setups. However, there are some other host properties
advertised in the device tree which can also be relevant to the
guests.
This patch, therefore, replaces kvmppc_get_clockfreq() with
kvmppc_read_int_cpu_dt(), which can retrieve any named, single-integer
property from the host device tree's CPU node.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
hw/ppc440_bamboo.c | 2 +-
hw/ppce500_mpc8544ds.c | 2 +-
hw/spapr.c | 13 ++++++++++++-
target-ppc/kvm.c | 30 +++++++++++++++++++-----------
target-ppc/kvm_ppc.h | 4 ++--
5 files changed, 35 insertions(+), 16 deletions(-)
diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
index 1523764..df85da0 100644
--- a/hw/ppc440_bamboo.c
+++ b/hw/ppc440_bamboo.c
@@ -83,7 +83,7 @@ static int bamboo_load_device_tree(target_phys_addr_t addr,
* the correct frequencies. */
if (kvm_enabled()) {
tb_freq = kvmppc_get_tbfreq();
- clock_freq = kvmppc_get_clockfreq();
+ clock_freq = kvmppc_read_int_cpu_dt("clock-frequency");
}
qemu_devtree_setprop_cell(fdt, "/cpus/cpu@0", "clock-frequency",
diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index f00367e..eb37c3d 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -112,7 +112,7 @@ static int mpc8544_load_device_tree(CPUState *env,
if (kvm_enabled()) {
/* Read out host's frequencies */
- clock_freq = kvmppc_get_clockfreq();
+ clock_freq = kvmppc_read_int_cpu_dt("clock-frequency");
tb_freq = kvmppc_get_tbfreq();
/* indicate KVM hypercall interface */
diff --git a/hw/spapr.c b/hw/spapr.c
index 9a3a1ea..ea5690e 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -185,7 +185,8 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
0xffffffff, 0xffffffff};
uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
- uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
+ uint32_t cpufreq = kvm_enabled() ?
+ kvmppc_read_int_cpu_dt("clock-frequency") : 1000000000;
if ((index % smt) != 0) {
continue;
@@ -233,6 +234,16 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
segs, sizeof(segs))));
}
+ /* Advertise VMX/VSX (vector extensions) if available */
+ if (vmx) {
+ _FDT((fdt_property_cell(fdt, "ibm,vmx", vmx)));
+ }
+
+ /* Advertise DFP (Decimal Floating Point) if available */
+ if (dfp) {
+ _FDT((fdt_property_cell(fdt, "ibm,dfp", dfp)));
+ }
+
_FDT((fdt_end_node(fdt)));
}
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 26165b6..db2326d 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -650,32 +650,40 @@ static int kvmppc_find_cpu_dt(char *buf, int buf_len)
return 0;
}
-uint64_t kvmppc_get_clockfreq(void)
+/* Read a CPU node property from the host device tree that's a single
+ * integer (32-bit or 64-bit). Returns 0 if anything goes wrong
+ * (can't find or open the property, or doesn't understand the
+ * format) */
+uint64_t kvmppc_read_int_cpu_dt(const char *propname)
{
- char buf[512];
- uint32_t tb[2];
+ char buf[PATH_MAX];
+ union {
+ uint32_t v32;
+ uint64_t v64;
+ } u;
FILE *f;
int len;
if (kvmppc_find_cpu_dt(buf, sizeof(buf))) {
- return 0;
+ return -1;
}
- strncat(buf, "/clock-frequency", sizeof(buf) - strlen(buf));
+ strncat(buf, "/", sizeof(buf) - strlen(buf));
+ strncat(buf, propname, sizeof(buf) - strlen(buf));
f = fopen(buf, "rb");
if (!f) {
return -1;
}
- len = fread(tb, sizeof(tb[0]), 2, f);
+ len = fread(&u, 1, sizeof(u), f);
fclose(f);
switch (len) {
- case 1:
- /* freq is only a single cell */
- return tb[0];
- case 2:
- return *(uint64_t*)tb;
+ case 4:
+ /* property is a 32-bit quantity */
+ return be32_to_cpu(u.v32);
+ case 8:
+ return be64_to_cpu(u.v64);
}
return 0;
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 9e8a7b5..0b9a58a 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -14,7 +14,7 @@ void kvmppc_init(void);
#ifdef CONFIG_KVM
uint32_t kvmppc_get_tbfreq(void);
-uint64_t kvmppc_get_clockfreq(void);
+uint64_t kvmppc_read_int_cpu_dt(const char *propname);
int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len);
int kvmppc_set_interrupt(CPUState *env, int irq, int level);
void kvmppc_set_papr(CPUState *env);
@@ -30,7 +30,7 @@ static inline uint32_t kvmppc_get_tbfreq(void)
return 0;
}
-static inline uint64_t kvmppc_get_clockfreq(void)
+static inline uint64_t kvmppc_read_int_cpu_dt(const char *propname)
{
return 0;
}
--
1.7.6.3
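For readers following the decoding step in the patch above, here is a minimal standalone sketch of reading a single-integer device tree property. It assumes the property file holds either one 32-bit or one 64-bit big-endian value; the names decode_dt_int() and read_dt_int() are hypothetical and are not the functions in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Decode a device tree integer property from its raw bytes.
 * Device tree cells are big-endian: 4 bytes is one 32-bit cell,
 * 8 bytes is a 64-bit value.  Returns 0 on success, or -1 if the
 * length is not one this sketch understands. */
static int decode_dt_int(const unsigned char *raw, size_t len, uint64_t *out)
{
    uint64_t v = 0;
    size_t i;

    if (len != 4 && len != 8) {
        return -1;
    }
    for (i = 0; i < len; i++) {
        v = (v << 8) | raw[i];
    }
    *out = v;
    return 0;
}

/* Read the property file at 'path' and decode it.  Returns 0 on
 * success, or -1 if the file cannot be opened or has an unexpected
 * size. */
static int read_dt_int(const char *path, uint64_t *out)
{
    unsigned char raw[9];   /* one spare byte to detect oversized properties */
    size_t len;
    FILE *f = fopen(path, "rb");

    if (!f) {
        return -1;
    }
    len = fread(raw, 1, sizeof(raw), f);
    fclose(f);
    return decode_dt_int(raw, len, out);
}
```

Reporting failure through a separate return code, rather than in the value itself, lets a caller tell a property that is genuinely 0 from one that could not be read.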
* [Qemu-devel] [PATCH 2/3] pseries: Add device tree properties for VMX/VSX and DFP under kvm
2011-09-30 7:50 [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest David Gibson
2011-09-30 7:50 ` [Qemu-devel] [PATCH 1/3] ppc: Generalize the kvmppc_get_clockfreq() function David Gibson
@ 2011-09-30 7:50 ` David Gibson
2011-09-30 7:50 ` [Qemu-devel] [PATCH 3/3] pseries: Correctly create ibm,segment-page-sizes property David Gibson
2011-09-30 8:20 ` [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest Alexander Graf
3 siblings, 0 replies; 9+ messages in thread
From: David Gibson @ 2011-09-30 7:50 UTC (permalink / raw)
To: agraf; +Cc: qemu-devel
Sufficiently recent PAPR specifications define properties "ibm,vmx"
and "ibm,dfp" on the CPU node which advertise whether the VMX vector
extensions (or the later VSX version) and/or the Decimal Floating
Point operations from IBM's recent POWER CPUs are available.
Currently we do not put these in the guest device tree and the guest
kernel will consequently assume they are not available. This is good,
because they are not supported under TCG. VMX is similar enough to
Altivec that it might be trivial to support, but VSX and DFP would
both require significant work to support in TCG.
However, when running under kvm on a host which supports these
instructions, there's no reason not to let the guest use them. This
patch, therefore, checks for the relevant support on the host CPU
and, if present, advertises them to the guest as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
hw/spapr.c | 11 +++++++++--
1 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/hw/spapr.c b/hw/spapr.c
index ea5690e..8089d83 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -187,6 +187,8 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
uint32_t cpufreq = kvm_enabled() ?
kvmppc_read_int_cpu_dt("clock-frequency") : 1000000000;
+ uint32_t vmx = kvm_enabled() ? kvmppc_read_int_cpu_dt("ibm,vmx") : 0;
+ uint32_t dfp = kvm_enabled() ? kvmppc_read_int_cpu_dt("ibm,dfp") : 0;
if ((index % smt) != 0) {
continue;
@@ -234,12 +236,17 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
segs, sizeof(segs))));
}
- /* Advertise VMX/VSX (vector extensions) if available */
+ /* Advertise VMX/VSX (vector extensions) if available
+ * 0 / no property == no vector extensions
+ * 1 == VMX / Altivec available
+ * 2 == VSX available */
if (vmx) {
_FDT((fdt_property_cell(fdt, "ibm,vmx", vmx)));
}
- /* Advertise DFP (Decimal Floating Point) if available */
+ /* Advertise DFP (Decimal Floating Point) if available
+ * 0 / no property == no DFP
+ * 1 == DFP available */
if (dfp) {
_FDT((fdt_property_cell(fdt, "ibm,dfp", dfp)));
}
--
1.7.6.3
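The value encoding documented in the comments of this patch, and the rule that the properties are only passed through under KVM, can be sketched as below. The helper names are hypothetical, and the "unknown" case is an assumption for values the commit message does not describe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map an ibm,vmx value to a human-readable name, following the
 * encoding given in the patch: 0 or absent means no vector
 * extensions, 1 means VMX/Altivec, 2 means VSX. */
static const char *vmx_level_name(uint32_t vmx)
{
    switch (vmx) {
    case 0:
        return "none";
    case 1:
        return "VMX";
    case 2:
        return "VSX";
    default:
        return "unknown";   /* assumption: not covered by the patch */
    }
}

/* Value to advertise to the guest: the host's value under KVM,
 * otherwise 0, in which case the property is simply omitted. */
static uint32_t advertised_cap(int kvm_on, uint32_t host_value)
{
    return kvm_on ? host_value : 0;
}
```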
* [Qemu-devel] [PATCH 3/3] pseries: Correctly create ibm,segment-page-sizes property
2011-09-30 7:50 [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest David Gibson
2011-09-30 7:50 ` [Qemu-devel] [PATCH 1/3] ppc: Generalize the kvmppc_get_clockfreq() function David Gibson
2011-09-30 7:50 ` [Qemu-devel] [PATCH 2/3] pseries: Add device tree properties for VMX/VSX and DFP under kvm David Gibson
@ 2011-09-30 7:50 ` David Gibson
2011-10-07 7:20 ` Alexander Graf
2011-09-30 8:20 ` [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest Alexander Graf
3 siblings, 1 reply; 9+ messages in thread
From: David Gibson @ 2011-09-30 7:50 UTC (permalink / raw)
To: agraf; +Cc: qemu-devel
Current versions of the PowerPC architecture require and fully define
4kB and 16MB page sizes. Other pagesizes (e.g. 64kB, 1MB) are
permitted and are often supported, but the exact encodings used to set
them up can vary from chip to chip.
The supported pagesizes and required encodings are advertised to the
OS via the ibm,segment-page-sizes property in the device tree.
Currently we do not put this property in our device tree, so guests
are restricted to the architected 4kB and 16MB pagesizes.
The base sizes are all that we implement in TCG; however, with KVM the
guest can use anything supported by the host as long as the guest's
base memory is backed by pages at least as large. Furthermore, in
order to use any extended page sizes, the guest needs to know the
correct encodings for the host.
This patch, therefore, reads the host's pagesize information, filters
it based on the pagesize backing RAM, and passes it into the guest.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
hw/spapr.c | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++
target-ppc/kvm.c | 43 +++++++++++++++++
target-ppc/kvm_ppc.h | 6 ++
3 files changed, 176 insertions(+), 0 deletions(-)
diff --git a/hw/spapr.c b/hw/spapr.c
index 8089d83..72b6c6a 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -24,6 +24,8 @@
* THE SOFTWARE.
*
*/
+#include <sys/vfs.h>
+
#include "sysemu.h"
#include "hw.h"
#include "elf.h"
@@ -88,6 +90,122 @@ qemu_irq spapr_allocate_irq(uint32_t hint, uint32_t *irq_num)
return qirq;
}
+#define HUGETLBFS_MAGIC 0x958458f6
+
+static long getrampagesize(void)
+{
+ struct statfs fs;
+ int ret;
+
+ if (!mem_path) {
+ /* guest RAM is backed by normal anonymous pages */
+ return getpagesize();
+ }
+
+ do {
+ ret = statfs(mem_path, &fs);
+ } while (ret != 0 && errno == EINTR);
+
+ if (ret != 0) {
+ fprintf(stderr, "Couldn't statfs() memory path: %s\n",
+ strerror(errno));
+ exit(1);
+ }
+
+ if (fs.f_type != HUGETLBFS_MAGIC) {
+ /* Explicit mempath, but it's ordinary pages */
+ return getpagesize();
+ }
+
+ /* It's hugepage, return the huge page size */
+ return fs.f_bsize;
+}
+
+static size_t create_page_sizes_prop(uint32_t *prop, size_t maxsize)
+{
+ int cells;
+ target_ulong ram_page_size = getrampagesize();
+ int i, j;
+
+ if (!kvm_enabled()) {
+ /* For the supported CPUs in emulation, we support just 4k and
+ * 16MB pages, with the usual encodings. This is the default
+ * set the guest will assume if we don't specify anything */
+ return 0;
+ }
+
+ cells = kvmppc_read_segment_page_sizes(prop, maxsize / sizeof(uint32_t));
+ if (cells < 0) {
+ fprintf(stderr, "Error reading host's "
+ "ibm,segment-page-sizes property\n");
+ exit(1);
+ }
+
+ if (cells == 0) {
+ /* Host specifies no pagesizes, so use the architected ones */
+ uint32_t def_page_sizes[] = {0xc, 0x0, 0x1, 0xc, 0x0, /* 4kB */
+ 0x18, 0x100, 0x1, 0x18, 0x0, }; /* 16MB */
+
+ assert(maxsize >= sizeof(def_page_sizes));
+
+ memcpy(prop, def_page_sizes, sizeof(def_page_sizes));
+ cells = sizeof(def_page_sizes) / sizeof(def_page_sizes[0]);
+ }
+
+ /* Filter based on pagesize backing RAM */
+ i = j = 0;
+ while (i < cells) {
+ uint32_t baseshift, slbenc, numsizes, k, n;
+
+ if ((i + 3) > cells) {
+ fprintf(stderr, "Malformed ibm,segment-page-sizes on host\n");
+ exit(1);
+ }
+
+ baseshift = be32_to_cpu(prop[i++]);
+ slbenc = be32_to_cpu(prop[i++]);
+ numsizes = be32_to_cpu(prop[i++]);
+
+ if ((i + numsizes*2) > cells) {
+ fprintf(stderr, "Malformed ibm,segment-page-sizes on host\n");
+ exit(1);
+ }
+
+ /* Too big, skip */
+ if ((1UL << baseshift) > ram_page_size) {
+ i += numsizes*2;
+ continue;
+ }
+
+ n = 0;
+ for (k = 0; k < numsizes; k++) {
+ uint32_t shift = be32_to_cpu(prop[i + k*2]);
+
+ if ((1UL << shift) <= ram_page_size) {
+ n++;
+ }
+ }
+
+ prop[j++] = cpu_to_be32(baseshift);
+ prop[j++] = cpu_to_be32(slbenc);
+ prop[j++] = cpu_to_be32(n);
+
+ for (k = 0; k < numsizes; k++) {
+ uint32_t shift = be32_to_cpu(prop[i++]);
+ uint32_t hashenc = be32_to_cpu(prop[i++]);
+
+ if ((1UL << shift) <= ram_page_size) {
+ prop[j++] = cpu_to_be32(shift);
+ prop[j++] = cpu_to_be32(hashenc);
+ }
+ }
+ }
+
+ assert(i == cells);
+
+ return j * sizeof(uint32_t);
+}
+
static void *spapr_create_fdt_skel(const char *cpu_model,
target_phys_addr_t rma_size,
target_phys_addr_t initrd_base,
@@ -189,6 +307,8 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
kvmppc_read_int_cpu_dt("clock-frequency") : 1000000000;
uint32_t vmx = kvm_enabled() ? kvmppc_read_int_cpu_dt("ibm,vmx") : 0;
uint32_t dfp = kvm_enabled() ? kvmppc_read_int_cpu_dt("ibm,dfp") : 0;
+ uint32_t page_sizes_prop[15];
+ size_t page_sizes_prop_size;
if ((index % smt) != 0) {
continue;
@@ -251,6 +371,13 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
_FDT((fdt_property_cell(fdt, "ibm,dfp", dfp)));
}
+ page_sizes_prop_size = create_page_sizes_prop(page_sizes_prop,
+ sizeof(page_sizes_prop));
+ if (page_sizes_prop_size) {
+ _FDT((fdt_property(fdt, "ibm,segment-page-sizes",
+ page_sizes_prop, page_sizes_prop_size)));
+ }
+
_FDT((fdt_end_node(fdt)));
}
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index db2326d..b399845 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -689,6 +689,49 @@ uint64_t kvmppc_read_int_cpu_dt(const char *propname)
return 0;
}
+/* Read the ibm,segment-page-sizes property of the CPU node from the host
+ * device tree into prop, which has room for maxcells 32-bit cells.
+ * Returns the number of cells read, 0 if the property is absent, or
+ * -1 on error (including when the buffer is too small) */
+int kvmppc_read_segment_page_sizes(uint32_t *prop, int maxcells)
+{
+ char buf[PATH_MAX];
+ FILE *f;
+ int ncells;
+
+ if (kvmppc_find_cpu_dt(buf, sizeof(buf))) {
+ return -1;
+ }
+
+ strncat(buf, "/ibm,segment-page-sizes", sizeof(buf) - strlen(buf));
+
+ f = fopen(buf, "rb");
+ if (!f) {
+ if (errno == ENOENT) {
+ /* If missing, assume defaults */
+ return 0;
+ }
+ return -1;
+ }
+
+ ncells = fread(prop, sizeof(uint32_t), maxcells, f);
+ if (ncells == maxcells) {
+ uint32_t tmp;
+ int n;
+
+ n = fread(&tmp, sizeof(tmp), 1, f);
+ if ((n != 0) || !feof(f)) {
+ fclose(f);
+ /* Not enough space provided for the result */
+ return -1;
+ }
+ }
+
+ fclose(f);
+
+ return ncells;
+}
+
int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len)
{
uint32_t *hc = (uint32_t*)buf;
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 0b9a58a..14fbaa6 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -15,6 +15,7 @@ void kvmppc_init(void);
uint32_t kvmppc_get_tbfreq(void);
uint64_t kvmppc_read_int_cpu_dt(const char *propname);
+int kvmppc_read_segment_page_sizes(uint32_t *prop, int maxcells);
int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len);
int kvmppc_set_interrupt(CPUState *env, int irq, int level);
void kvmppc_set_papr(CPUState *env);
@@ -35,6 +36,11 @@ static inline uint64_t kvmppc_read_int_cpu_dt(const char *propname)
return 0;
}
+static inline int kvmppc_read_segment_page_sizes(uint32_t *prop, int maxcells)
+{
+ return -1;
+}
+
static inline int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len)
{
return -1;
--
1.7.6.3
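The filtering step in create_page_sizes_prop() can be illustrated in isolation. This is a simplified sketch, not the code from the patch: it assumes the cells are already in host byte order, it returns -1 instead of exiting on malformed input, and the name filter_page_sizes() is hypothetical. The layout it assumes is the one the patch walks: each entry is a base page shift, an SLB encoding and a count N, followed by N (page shift, hash PTE encoding) pairs.

```c
#include <assert.h>
#include <stdint.h>

/* Filter a segment-page-sizes style array in place, keeping only the
 * page sizes no larger than ram_page_size.  Returns the number of
 * cells kept, or -1 if the input is malformed. */
static int filter_page_sizes(uint32_t *prop, int cells, uint64_t ram_page_size)
{
    int i = 0, j = 0;

    while (i < cells) {
        uint32_t baseshift, slbenc, numsizes, k, kept = 0;
        int count_pos;

        if (cells - i < 3) {
            return -1;          /* truncated entry header */
        }
        baseshift = prop[i++];
        slbenc = prop[i++];
        numsizes = prop[i++];
        if (numsizes > (uint32_t)(cells - i) / 2) {
            return -1;          /* not enough cells for the pairs */
        }

        /* Base page size larger than the backing pages: drop the entry */
        if (baseshift >= 64 || (UINT64_C(1) << baseshift) > ram_page_size) {
            i += numsizes * 2;
            continue;
        }

        prop[j++] = baseshift;
        prop[j++] = slbenc;
        count_pos = j++;        /* filled in once the pairs are counted */

        for (k = 0; k < numsizes; k++) {
            uint32_t shift = prop[i++];
            uint32_t hashenc = prop[i++];

            if (shift < 64 && (UINT64_C(1) << shift) <= ram_page_size) {
                prop[j++] = shift;
                prop[j++] = hashenc;
                kept++;
            }
        }
        prop[count_pos] = kept;
    }
    return j;
}
```

Because the write index never overtakes the read index, the array can be compacted in place, as the patch does.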
* Re: [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest
2011-09-30 7:50 [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest David Gibson
` (2 preceding siblings ...)
2011-09-30 7:50 ` [Qemu-devel] [PATCH 3/3] pseries: Correctly create ibm,segment-page-sizes property David Gibson
@ 2011-09-30 8:20 ` Alexander Graf
2011-09-30 9:00 ` David Gibson
3 siblings, 1 reply; 9+ messages in thread
From: Alexander Graf @ 2011-09-30 8:20 UTC (permalink / raw)
To: David Gibson; +Cc: qemu-devel@nongnu.org
On 30.09.2011 at 09:50, David Gibson <david@gibson.dropbear.id.au> wrote:
> This series contains some patches which, when using KVM, gather
> information about the capabilities of the host CPU and advertise them
> to the guest system when using the pseries machine. Specifically it
> does this for whether the CPU supports VMX, VSX and/or DFP
> instructions, and for the CPU's supported page sizes.
>
> The VSX and DFP portions of this were posted earlier, and I've fixed
> the minor comments which people made. This leaves one objection from
> Alex Graf, that whether the features are advertised should also depend
> on the target CPU selected in qemu. A similar objection may apply to
> the pagesizes patch. I guess the idea is to "clamp" the advertised
> capabilities to those permitted by the selected target CPU, but I'm
> not entirely sure what the logic here should be.
>
> Frankly, particularly in the case of KVM Book3S-HV, I'm not terribly
> convinced that attempting to make the guest CPU appear different from
> the host CPU is terribly meaningful. These patches as they stand have
> the advantage that future, roughly compatible CPUs should Just Work
> with these capabilities advertised in the correct cases. Alex, can
> you advise what sort of logic you'd like here?
Yes, very simple. I want you to create a CPU type 'host', similar to how x86 does it. That should be the default CPU type for KVM with the pseries machine.
You can also add a check in-kernel that verifies if guest PVR == host PVR for HV mode. That way you ensure that -cpu host is always used there. If you later add compat modes, you can check them there, but still have -cpu xxx available to tell all pieces of the kvm/qemu chain what to use.
Alex
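A rough sketch of the two pieces suggested here: defaulting to a 'host' CPU type under KVM, and an equality check between guest and host PVR for HV mode. Everything in it is hypothetical: the function names, the "POWER7" fallback for the non-KVM case, and the placement of the PVR check are illustrative assumptions, not existing QEMU or kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pick the CPU model name: an explicit -cpu argument always wins;
 * otherwise default to "host" under KVM.  The non-KVM fallback name
 * is an assumption for this sketch. */
static const char *pick_cpu_model(const char *requested, bool kvm_on)
{
    if (requested) {
        return requested;
    }
    return kvm_on ? "host" : "POWER7";
}

/* HV-mode sanity check: the guest must see the same PVR as the host.
 * A later compatibility-mode scheme could relax this test. */
static bool hv_pvr_ok(uint32_t guest_pvr, uint32_t host_pvr)
{
    return guest_pvr == host_pvr;
}
```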
* Re: [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest
2011-09-30 8:20 ` [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest Alexander Graf
@ 2011-09-30 9:00 ` David Gibson
0 siblings, 0 replies; 9+ messages in thread
From: David Gibson @ 2011-09-30 9:00 UTC (permalink / raw)
To: Alexander Graf; +Cc: qemu-devel@nongnu.org
On Fri, Sep 30, 2011 at 10:20:14AM +0200, Alexander Graf wrote:
>
> On 30.09.2011 at 09:50, David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > This series contains some patches which, when using KVM, gather
> > information about the capabilities of the host CPU and advertise them
> > to the guest system when using the pseries machine. Specifically it
> > does this for whether the CPU supports VMX, VSX and/or DFP
> > instructions, and for the CPU's supported page sizes.
> >
> > The VSX and DFP portions of this were posted earlier, and I've fixed
> > the minor comments which people made. This leaves one objection from
> > Alex Graf, that whether the features are advertised should also depend
> > on the target CPU selected in qemu. A similar objection may apply to
> > the pagesizes patch. I guess the idea is to "clamp" the advertised
> > capabilities to those permitted by the selected target CPU, but I'm
> > not entirely sure what the logic here should be.
> >
> > Frankly, particularly in the case of KVM Book3S-HV, I'm not terribly
> > convinced that attempting to make the guest CPU appear different from
> > the host CPU is terribly meaningful. These patches as they stand have
> > the advantage that future, roughly compatible CPUs should Just Work
> > with these capabilities advertised in the correct cases. Alex, can
> > you advise what sort of logic you'd like here?
>
>
> Yes, very simple. I want you to create a CPU type 'host', similar to
> how x86 does it. That should be the default CPU type for KVM with
> the pseries machine.
Ah, ok. I didn't realize x86 did that, but I'd been thinking
something like that would make more sense. I'm away for the next
week, but I'll look at this when I get the chance.
> You can also add a check in-kernel that verifies if guest PVR ==
> host PVR for HV mode. That way you ensure that -cpu host is always
> used there. If you later add compat modes, you can check them there,
> but still have -cpu xxx available to tell all pieces of the kvm/qemu
> chain what to use.
Ok.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
* Re: [Qemu-devel] [PATCH 1/3] ppc: Generalize the kvmppc_get_clockfreq() function
2011-09-30 7:50 ` [Qemu-devel] [PATCH 1/3] ppc: Generalize the kvmppc_get_clockfreq() function David Gibson
@ 2011-09-30 18:06 ` Alexander Graf
2011-10-11 4:29 ` David Gibson
0 siblings, 1 reply; 9+ messages in thread
From: Alexander Graf @ 2011-09-30 18:06 UTC (permalink / raw)
To: David Gibson; +Cc: qemu-devel@nongnu.org
On 30.09.2011 at 09:50, David Gibson <david@gibson.dropbear.id.au> wrote:
> Currently the kvmppc_get_clockfreq() function reads the host's clock
> frequency from /proc/device-tree, which is useful to pass to the guest
> in KVM setups. However, there are some other host properties
> advertised in the device tree which can also be relevant to the
> guests.
>
> This patch, therefore, replaces kvmppc_get_clockfreq() with
> kvmppc_read_int_cpu_dt(), which can retrieve any named, single-integer
> property from the host device tree's CPU node.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> hw/ppc440_bamboo.c | 2 +-
> hw/ppce500_mpc8544ds.c | 2 +-
> hw/spapr.c | 13 ++++++++++++-
> target-ppc/kvm.c | 30 +++++++++++++++++++-----------
> target-ppc/kvm_ppc.h | 4 ++--
> 5 files changed, 35 insertions(+), 16 deletions(-)
>
> diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
> index 1523764..df85da0 100644
> --- a/hw/ppc440_bamboo.c
> +++ b/hw/ppc440_bamboo.c
> @@ -83,7 +83,7 @@ static int bamboo_load_device_tree(target_phys_addr_t addr,
> * the correct frequencies. */
> if (kvm_enabled()) {
> tb_freq = kvmppc_get_tbfreq();
> - clock_freq = kvmppc_get_clockfreq();
> + clock_freq = kvmppc_read_int_cpu_dt("clock-frequency");
Hrm. I was actually trying to abstract host dt handling away here. The idea was to use the helper inside of kvm.c, but still expose specific functions for specific properties.
> }
>
> qemu_devtree_setprop_cell(fdt, "/cpus/cpu@0", "clock-frequency",
> diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
> index f00367e..eb37c3d 100644
> --- a/hw/ppce500_mpc8544ds.c
> +++ b/hw/ppce500_mpc8544ds.c
> @@ -112,7 +112,7 @@ static int mpc8544_load_device_tree(CPUState *env,
>
> if (kvm_enabled()) {
> /* Read out host's frequencies */
> - clock_freq = kvmppc_get_clockfreq();
> + clock_freq = kvmppc_read_int_cpu_dt("clock-frequency");
> tb_freq = kvmppc_get_tbfreq();
>
> /* indicate KVM hypercall interface */
> diff --git a/hw/spapr.c b/hw/spapr.c
> index 9a3a1ea..ea5690e 100644
> --- a/hw/spapr.c
> +++ b/hw/spapr.c
> @@ -185,7 +185,8 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
> uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> 0xffffffff, 0xffffffff};
> uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
> - uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
> + uint32_t cpufreq = kvm_enabled() ?
> + kvmppc_read_int_cpu_dt("clock-frequency") : 1000000000;
>
> if ((index % smt) != 0) {
> continue;
> @@ -233,6 +234,16 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
> segs, sizeof(segs))));
> }
>
> + /* Advertise VMX/VSX (vector extensions) if available */
> + if (vmx) {
> + _FDT((fdt_property_cell(fdt, "ibm,vmx", vmx)));
> + }
> +
> + /* Advertise DFP (Decimal Floating Point) if available */
> + if (dfp) {
> + _FDT((fdt_property_cell(fdt, "ibm,dfp", dfp)));
> + }
> +
Please make sure that your patch set is bisectable :)
Alex
> _FDT((fdt_end_node(fdt)));
> }
>
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 26165b6..db2326d 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -650,32 +650,40 @@ static int kvmppc_find_cpu_dt(char *buf, int buf_len)
> return 0;
> }
>
> -uint64_t kvmppc_get_clockfreq(void)
> +/* Read a CPU node property from the host device tree that's a single
> + * integer (32-bit or 64-bit). Returns 0 if anything goes wrong
> + * (can't find or open the property, or doesn't understand the
> + * format) */
> +uint64_t kvmppc_read_int_cpu_dt(const char *propname)
> {
> - char buf[512];
> - uint32_t tb[2];
> + char buf[PATH_MAX];
> + union {
> + uint32_t v32;
> + uint64_t v64;
> + } u;
> FILE *f;
> int len;
>
> if (kvmppc_find_cpu_dt(buf, sizeof(buf))) {
> - return 0;
> + return -1;
> }
>
> - strncat(buf, "/clock-frequency", sizeof(buf) - strlen(buf));
> + strncat(buf, "/", sizeof(buf) - strlen(buf));
> + strncat(buf, propname, sizeof(buf) - strlen(buf));
>
> f = fopen(buf, "rb");
> if (!f) {
> return -1;
> }
>
> - len = fread(tb, sizeof(tb[0]), 2, f);
> + len = fread(&u, 1, sizeof(u), f);
> fclose(f);
> switch (len) {
> - case 1:
> - /* freq is only a single cell */
> - return tb[0];
> - case 2:
> - return *(uint64_t*)tb;
> + case 4:
> + /* property is a 32-bit quantity */
> + return be32_to_cpu(u.v32);
> + case 8:
> + return be64_to_cpu(u.v64);
> }
>
> return 0;
> diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
> index 9e8a7b5..0b9a58a 100644
> --- a/target-ppc/kvm_ppc.h
> +++ b/target-ppc/kvm_ppc.h
> @@ -14,7 +14,7 @@ void kvmppc_init(void);
> #ifdef CONFIG_KVM
>
> uint32_t kvmppc_get_tbfreq(void);
> -uint64_t kvmppc_get_clockfreq(void);
> +uint64_t kvmppc_read_int_cpu_dt(const char *propname);
> int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len);
> int kvmppc_set_interrupt(CPUState *env, int irq, int level);
> void kvmppc_set_papr(CPUState *env);
> @@ -30,7 +30,7 @@ static inline uint32_t kvmppc_get_tbfreq(void)
> return 0;
> }
>
> -static inline uint64_t kvmppc_get_clockfreq(void)
> +static inline uint64_t kvmppc_read_int_cpu_dt(const char *propname)
> {
> return 0;
> }
> --
> 1.7.6.3
>
* Re: [Qemu-devel] [PATCH 3/3] pseries: Correctly create ibm,segment-page-sizes property
2011-09-30 7:50 ` [Qemu-devel] [PATCH 3/3] pseries: Correctly create ibm,segment-page-sizes property David Gibson
@ 2011-10-07 7:20 ` Alexander Graf
0 siblings, 0 replies; 9+ messages in thread
From: Alexander Graf @ 2011-10-07 7:20 UTC (permalink / raw)
To: David Gibson; +Cc: qemu-devel
On 30.09.2011, at 09:50, David Gibson wrote:
> Current versions of the PowerPC architecture require and fully define
> 4kB and 16MB page sizes. Other pagesizes (e.g. 64kB, 1MB) are
> permitted and are often supported, but the exact encodings used to set
> them up can vary from chip to chip.
>
> The supported pagesizes and required encodings are advertised to the
> OS via the ibm,segment-page-sizes property in the device tree.
> Currently we do not put this property in our device tree, so guests
> are restricted to the architected 4kB and 16MB pagesizes.
>
> The base sizes are all that we implement in TCG; however, with KVM the
> guest can use anything supported by the host as long as the guest's
> base memory is backed by pages at least as large. Furthermore, in
> order to use any extended page sizes, the guest needs to know the
> correct encodings for the host.
>
> This patch, therefore, reads the host's pagesize information, filters
> it based on the pagesize backing RAM, and passes it into the guest.
>
> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> hw/spapr.c | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++
> target-ppc/kvm.c | 43 +++++++++++++++++
> target-ppc/kvm_ppc.h | 6 ++
> 3 files changed, 176 insertions(+), 0 deletions(-)
>
> diff --git a/hw/spapr.c b/hw/spapr.c
> index 8089d83..72b6c6a 100644
> --- a/hw/spapr.c
> +++ b/hw/spapr.c
> @@ -24,6 +24,8 @@
> * THE SOFTWARE.
> *
> */
> +#include <sys/vfs.h>
> +
> #include "sysemu.h"
> #include "hw.h"
> #include "elf.h"
> @@ -88,6 +90,122 @@ qemu_irq spapr_allocate_irq(uint32_t hint, uint32_t *irq_num)
> return qirq;
> }
>
> +#define HUGETLBFS_MAGIC 0x958458f6
> +
> +static long getrampagesize(void)
> +{
> + struct statfs fs;
> + int ret;
> +
> + if (!mem_path) {
> + /* guest RAM is backed by normal anonymous pages */
> + return getpagesize();
> + }
> +
> + do {
> + ret = statfs(mem_path, &fs);
> + } while (ret != 0 && errno == EINTR);
> +
> + if (ret != 0) {
> + fprintf(stderr, "Couldn't statfs() memory path: %s\n",
> + strerror(errno));
> + exit(1);
> + }
> +
> + if (fs.f_type != HUGETLBFS_MAGIC) {
> + /* Explicit mempath, but it's ordinary pages */
> + return getpagesize();
> + }
> +
> + /* It's hugepage, return the huge page size */
> + return fs.f_bsize;
> +}
Would this function compile and work on win32 hosts? If not, it should probably go to kvm.c.
> +
> +static size_t create_page_sizes_prop(uint32_t *prop, size_t maxsize)
> +{
> + int cells;
> + target_ulong ram_page_size = getrampagesize();
> + int i, j;
> +
> + if (!kvm_enabled()) {
> + /* For the supported CPUs in emulation, we support just 4k and
> + * 16MB pages, with the usual encodings. This is the default
> + * set the guest will assume if we don't specify anything */
> + return 0;
> + }
> +
> + cells = kvmppc_read_segment_page_sizes(prop, maxsize / sizeof(uint32_t));
Shouldn't we rather be asking the kvm kernel module to tell us its supported segment sizes? Just because the host doesn't support 256MB page size doesn't mean we can't expose it to the guest, right? Depending on the KVM mode of course.
For HV we would pass through the hardware ones. For PR we could pretty much support anything since we're shadowing the htab. But there it'd be a win too, since we would get fewer page table entries and could potentially also back things with huge pages.
Also, this depends heavily on the guest CPU architecture. For 970, we can't support anything but 4k and 16MB (and even that one is crap). For p7, things are a lot more flexible. But we need to make sure that what we tell the guest is actually possible to do on the particular CPU we're emulating / virtualizing.
Alex
* Re: [Qemu-devel] [PATCH 1/3] ppc: Generalize the kvmppc_get_clockfreq() function
2011-09-30 18:06 ` Alexander Graf
@ 2011-10-11 4:29 ` David Gibson
0 siblings, 0 replies; 9+ messages in thread
From: David Gibson @ 2011-10-11 4:29 UTC (permalink / raw)
To: Alexander Graf; +Cc: qemu-devel@nongnu.org
On Fri, Sep 30, 2011 at 08:06:59PM +0200, Alexander Graf wrote:
>
> On 30.09.2011 at 09:50, David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > Currently the kvmppc_get_clockfreq() function reads the host's clock
> > frequency from /proc/device-tree, which is useful to pass to the guest
> > in KVM setups. However, there are some other host properties
> > advertised in the device tree which can also be relevant to the
> > guests.
> >
> > This patch, therefore, replaces kvmppc_get_clockfreq() with
> > kvmppc_read_int_cpu_dt(), which can retrieve any named, single
> > integer property from the host device tree's CPU node.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > hw/ppc440_bamboo.c | 2 +-
> > hw/ppce500_mpc8544ds.c | 2 +-
> > hw/spapr.c | 13 ++++++++++++-
> > target-ppc/kvm.c | 30 +++++++++++++++++++-----------
> > target-ppc/kvm_ppc.h | 4 ++--
> > 5 files changed, 35 insertions(+), 16 deletions(-)
> >
> > diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
> > index 1523764..df85da0 100644
> > --- a/hw/ppc440_bamboo.c
> > +++ b/hw/ppc440_bamboo.c
> > @@ -83,7 +83,7 @@ static int bamboo_load_device_tree(target_phys_addr_t addr,
> > * the correct frequencies. */
> > if (kvm_enabled()) {
> > tb_freq = kvmppc_get_tbfreq();
> > - clock_freq = kvmppc_get_clockfreq();
> > + clock_freq = kvmppc_read_int_cpu_dt("clock-frequency");
>
> Hrm. I was actually trying to abstract host dt handling away
> here. The idea was to use the helper inside of kvm.c, but still
> expose specific functions for specific properties.
Ok, fair enough.
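The layering being asked for could look roughly like the sketch below: one generic reader that stays private to the KVM code, plus thin property-specific accessors that board code calls. All names are hypothetical, and the cpu_dir parameter (the host CPU node directory under /proc/device-tree) is an assumption of this sketch; the host property names "ibm,vmx" and "ibm,dfp" are likewise assumed to mirror the guest properties in the quoted hunk.

```c
#include <stdint.h>
#include <stdio.h>

/* Decode a big-endian 32- or 64-bit device tree integer.
 * Returns 0 for any other length. */
static uint64_t dt_decode_int(const unsigned char *buf, size_t len)
{
    uint64_t v = 0;
    size_t i;

    if (len != 4 && len != 8) {
        return 0;
    }
    for (i = 0; i < len; i++) {
        v = (v << 8) | buf[i];
    }
    return v;
}

/* Generic helper, kept static so callers outside this file cannot
 * ask for arbitrary properties. Returns 0 if the property is missing
 * or has an unexpected size. */
static uint64_t read_int_cpu_dt(const char *cpu_dir, const char *propname)
{
    char path[512];
    unsigned char buf[8];
    size_t len;
    FILE *f;
    int n;

    n = snprintf(path, sizeof(path), "%s/%s", cpu_dir, propname);
    if (n < 0 || (size_t)n >= sizeof(path)) {
        return 0;
    }

    f = fopen(path, "rb");
    if (!f) {
        return 0;
    }
    len = fread(buf, 1, sizeof(buf), f);
    fclose(f);

    return dt_decode_int(buf, len);
}

/* Property-specific accessors: the only interface board code sees */
uint64_t host_get_clockfreq(const char *cpu_dir)
{
    return read_int_cpu_dt(cpu_dir, "clock-frequency");
}

uint32_t host_get_vmx(const char *cpu_dir)
{
    return (uint32_t)read_int_cpu_dt(cpu_dir, "ibm,vmx");
}

uint32_t host_get_dfp(const char *cpu_dir)
{
    return (uint32_t)read_int_cpu_dt(cpu_dir, "ibm,dfp");
}
```

With this split, the bamboo and mpc8544ds hunks would keep calling a clock-frequency accessor, and only the helper behind it changes.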
> > }
> >
> > qemu_devtree_setprop_cell(fdt, "/cpus/cpu@0", "clock-frequency",
> > diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
> > index f00367e..eb37c3d 100644
> > --- a/hw/ppce500_mpc8544ds.c
> > +++ b/hw/ppce500_mpc8544ds.c
> > @@ -112,7 +112,7 @@ static int mpc8544_load_device_tree(CPUState *env,
> >
> > if (kvm_enabled()) {
> > /* Read out host's frequencies */
> > - clock_freq = kvmppc_get_clockfreq();
> > + clock_freq = kvmppc_read_int_cpu_dt("clock-frequency");
> > tb_freq = kvmppc_get_tbfreq();
> >
> > /* indicate KVM hypercall interface */
> > diff --git a/hw/spapr.c b/hw/spapr.c
> > index 9a3a1ea..ea5690e 100644
> > --- a/hw/spapr.c
> > +++ b/hw/spapr.c
> > @@ -185,7 +185,8 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
> > uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> > 0xffffffff, 0xffffffff};
> > uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
> > - uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
> > + uint32_t cpufreq = kvm_enabled() ?
> > + kvmppc_read_int_cpu_dt("clock-frequency") : 1000000000;
> >
> > if ((index % smt) != 0) {
> > continue;
> > @@ -233,6 +234,16 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
> > segs, sizeof(segs))));
> > }
> >
> > + /* Advertise VMX/VSX (vector extensions) if available */
> > + if (vmx) {
> > + _FDT((fdt_property_cell(fdt, "ibm,vmx", vmx)));
> > + }
> > +
> > + /* Advertise DFP (Decimal Floating Point) if available */
> > + if (dfp) {
> > + _FDT((fdt_property_cell(fdt, "ibm,dfp", dfp)));
> > + }
> > +
>
> Please make sure that your patch set is bisectable :)
Oops, bad split - that hunk was supposed to be in the other patch of
the series. I'll resend with both the above fixed momentarily.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
Thread overview: 9+ messages
2011-09-30 7:50 [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest David Gibson
2011-09-30 7:50 ` [Qemu-devel] [PATCH 1/3] ppc: Generalize the kvmppc_get_clockfreq() function David Gibson
2011-09-30 18:06 ` Alexander Graf
2011-10-11 4:29 ` David Gibson
2011-09-30 7:50 ` [Qemu-devel] [PATCH 2/3] pseries: Add device tree properties for VMX/VSX and DFP under kvm David Gibson
2011-09-30 7:50 ` [Qemu-devel] [PATCH 3/3] pseries: Correctly create ibm,segment-page-sizes property David Gibson
2011-10-07 7:20 ` Alexander Graf
2011-09-30 8:20 ` [Qemu-devel] [0/3] pseries: RFC: Advertise host CPU capabilties to guest Alexander Graf
2011-09-30 9:00 ` David Gibson