qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3] ppc: introduce CPUPPCState::cpu_dt_id
@ 2013-11-04  1:10 Alexey Kardashevskiy
  2013-11-04  9:41 ` Alexander Graf
  0 siblings, 1 reply; 9+ messages in thread
From: Alexey Kardashevskiy @ 2013-11-04  1:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alexey Kardashevskiy, Alexander Graf, Bharat Bhushan,
	Paolo Bonzini, qemu-ppc, Badari Pulavarty, Scott Wood,
	Paul Mackerras, David Gibson

Normally CPUState::cpu_index is used to pick the right CPU for various
operations. However default consecutive numbering does not always work
for POWERPC.

For example, on POWER7 (which supports 4 threads per core),
"-smp 8,threads=4" should create CPUs with indexes 0,1,2,3,4,5,6,7 and
"-smp 8,threads=1" should create CPUs with indexes 0,4,8,12,16,20,24,28.

These indexes are reflected in /proc/device-tree/cpus/PowerPC,POWER7@XX
and used to call KVM VCPU's ioctls. In order to achieve this,
kvmppc_fixup_cpu() was introduced. Roughly speaking, it multiplies
cpu_index by the number of threads per core.

This approach has disadvantages such as:
1. NUMA configuration stays broken after the fixup;
2. CPU-related commands from QEMU Monitor do not work properly as
the accept fixed CPU indexes and the user does not really know
what they are after fixup as the number of threads per core changes
between CPU versions and via QEMU command line.

This introduces a new @cpu_dt_id field in the CPUPPCState struct which
is set from @cpu_index by default but can be fixed later to the value
which a hypervisor can accept. This also introduces two POWERPC-arch
specific functions:
1. int ppc_get_vcpu_dt_id(CPUState *cs) - returns a device-tree ID
for a CPU;
2. CPUState *ppc_get_vcpu_by_dt_id(int cpu_dt_id) - finds CPUState by
a device-tree CPU ID.

This uses the new functions to:
1. fix emulated XICS hypercall handlers as they receive fixed CPU indexes;
2. fix XICS-KVM to enable in-kernel XICS on right CPU;
3. compose correct device-tree.

This removes @cpu_index fixup as @cpu_dt_id is used instead so QEMU monitor
can accept command-line CPU indexes again.

Cc: Badari Pulavarty  <pbadari@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Bharat Bhushan <bharat.bhushan@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v3:
* althouth e500 does not tweak CPU indexes, e500 and openpic-kvm
are fixed too; hopefully it does not break anything.

v2:
* added PPC-specific ppc_get_vcpu_dt_id() and ppc_get_vcpu_by_dt_id()
* fixed kvm_arch_vcpu_id() to use ppc_get_vcpu_dt_id()
* fixed emulated XICS
* removed kvm_arch_vcpu_id() stub for non-KVM case
---
 hw/intc/openpic_kvm.c       |  2 +-
 hw/intc/xics.c              | 15 +++++++++++++--
 hw/intc/xics_kvm.c          | 15 +++++++++------
 hw/ppc/e500.c               |  5 +++--
 hw/ppc/ppc.c                | 25 +++++++++++++++++++++++++
 hw/ppc/spapr.c              |  9 +++++----
 hw/ppc/spapr_hcall.c        |  2 +-
 hw/ppc/spapr_rtas.c         |  4 ++--
 target-ppc/cpu.h            |  6 ++++++
 target-ppc/kvm.c            |  4 ++--
 target-ppc/translate_init.c |  1 +
 11 files changed, 68 insertions(+), 20 deletions(-)

diff --git a/hw/intc/openpic_kvm.c b/hw/intc/openpic_kvm.c
index c7f7b84..a43d2be 100644
--- a/hw/intc/openpic_kvm.c
+++ b/hw/intc/openpic_kvm.c
@@ -228,7 +228,7 @@ int kvm_openpic_connect_vcpu(DeviceState *d, CPUState *cs)
 
     encap.cap = KVM_CAP_IRQ_MPIC;
     encap.args[0] = opp->fd;
-    encap.args[1] = cs->cpu_index;
+    encap.args[1] = ppc_get_vcpu_dt_id(cs);
 
     return kvm_vcpu_ioctl(cs, KVM_ENABLE_CAP, &encap);
 }
diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index a333305..866ee08 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -33,6 +33,17 @@
 #include "qemu/error-report.h"
 #include "qapi/visitor.h"
 
+static int get_cpu_index_by_dt_id(int cpu_dt_id)
+{
+    CPUState *cs = ppc_get_vcpu_by_dt_id(cpu_dt_id);
+
+    if (cs) {
+        return cs->cpu_index;
+    }
+
+    return -1;
+}
+
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
     CPUState *cs = CPU(cpu);
@@ -659,7 +670,7 @@ static target_ulong h_cppr(PowerPCCPU *cpu, sPAPREnvironment *spapr,
 static target_ulong h_ipi(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                           target_ulong opcode, target_ulong *args)
 {
-    target_ulong server = args[0];
+    target_ulong server = get_cpu_index_by_dt_id(args[0]);
     target_ulong mfrr = args[1];
 
     if (server >= spapr->icp->nr_servers) {
@@ -728,7 +739,7 @@ static void rtas_set_xive(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     }
 
     nr = rtas_ld(args, 0);
-    server = rtas_ld(args, 1);
+    server = get_cpu_index_by_dt_id(rtas_ld(args, 1));
     priority = rtas_ld(args, 2);
 
     if (!ics_valid_irq(ics, nr) || (server >= ics->icp->nr_servers)
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index c203646..091fcca 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -65,7 +65,8 @@ static void icp_get_kvm_state(ICPState *ss)
     ret = kvm_vcpu_ioctl(ss->cs, KVM_GET_ONE_REG, &reg);
     if (ret != 0) {
         error_report("Unable to retrieve KVM interrupt controller state"
-                " for CPU %d: %s", ss->cs->cpu_index, strerror(errno));
+                     " for CPU %ld: %s", kvm_arch_vcpu_id(ss->cs),
+                     strerror(errno));
         exit(1);
     }
 
@@ -97,8 +98,8 @@ static int icp_set_kvm_state(ICPState *ss, int version_id)
     ret = kvm_vcpu_ioctl(ss->cs, KVM_SET_ONE_REG, &reg);
     if (ret != 0) {
         error_report("Unable to restore KVM interrupt controller state (0x%"
-                PRIx64 ") for CPU %d: %s", state, ss->cs->cpu_index,
-                strerror(errno));
+                     PRIx64 ") for CPU %ld: %s", state,
+                     kvm_arch_vcpu_id(ss->cs), strerror(errno));
         return ret;
     }
 
@@ -311,8 +312,10 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     CPUState *cs;
     ICPState *ss;
     KVMXICSState *icpkvm = KVM_XICS(icp);
+    unsigned long cpu_dt_id;
 
     cs = CPU(cpu);
+    cpu_dt_id = kvm_arch_vcpu_id(cs);
     ss = &icp->ss[cs->cpu_index];
 
     assert(cs->cpu_index < icp->nr_servers);
@@ -325,15 +328,15 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
         struct kvm_enable_cap xics_enable_cap = {
             .cap = KVM_CAP_IRQ_XICS,
             .flags = 0,
-            .args = {icpkvm->kernel_xics_fd, cs->cpu_index, 0, 0},
+            .args = {icpkvm->kernel_xics_fd, cpu_dt_id, 0, 0},
         };
 
         ss->cs = cs;
 
         ret = kvm_vcpu_ioctl(ss->cs, KVM_ENABLE_CAP, &xics_enable_cap);
         if (ret < 0) {
-            error_report("Unable to connect CPU%d to kernel XICS: %s",
-                    cs->cpu_index, strerror(errno));
+            error_report("Unable to connect CPU%ld to kernel XICS: %s",
+                         cpu_dt_id, strerror(errno));
             exit(1);
         }
     }
diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index cfdd84b..212d9d7 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -248,12 +248,13 @@ static int ppce500_load_device_tree(QEMUMachineInitArgs *args,
         env = cpu->env_ptr;
 
         snprintf(cpu_name, sizeof(cpu_name), "/cpus/PowerPC,8544@%x",
-                 cpu->cpu_index);
+                 ppc_get_vcpu_dt_id(cpu));
         qemu_devtree_add_subnode(fdt, cpu_name);
         qemu_devtree_setprop_cell(fdt, cpu_name, "clock-frequency", clock_freq);
         qemu_devtree_setprop_cell(fdt, cpu_name, "timebase-frequency", tb_freq);
         qemu_devtree_setprop_string(fdt, cpu_name, "device_type", "cpu");
-        qemu_devtree_setprop_cell(fdt, cpu_name, "reg", cpu->cpu_index);
+        qemu_devtree_setprop_cell(fdt, cpu_name, "reg",
+                                  ppc_get_vcpu_dt_id(cpu));
         qemu_devtree_setprop_cell(fdt, cpu_name, "d-cache-line-size",
                                   env->dcache_line_size);
         qemu_devtree_setprop_cell(fdt, cpu_name, "i-cache-line-size",
diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index bf2d3d4..ebe2c5d 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -1362,3 +1362,28 @@ int PPC_NVRAM_set_params (nvram_t *nvram, uint16_t NVRAM_size,
 
     return 0;
 }
+
+/* CPU device-tree ID helpers */
+int ppc_get_vcpu_dt_id(CPUState *cs)
+{
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    CPUPPCState *env = &cpu->env;
+
+    return env->cpu_dt_id;
+}
+
+CPUState *ppc_get_vcpu_by_dt_id(int cpu_dt_id)
+{
+    CPUState *cs;
+
+    CPU_FOREACH(cs) {
+        PowerPCCPU *cpu = POWERPC_CPU(cs);
+        CPUPPCState *env = &cpu->env;
+
+        if (env->cpu_dt_id == cpu_dt_id) {
+            return cs;
+        }
+    }
+
+    return NULL;
+}
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f76b355..09dc635 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -206,19 +206,20 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
 
     CPU_FOREACH(cpu) {
         DeviceClass *dc = DEVICE_GET_CLASS(cpu);
+        int cpu_dt_id = ppc_get_vcpu_dt_id(cpu);
         uint32_t associativity[] = {cpu_to_be32(0x5),
                                     cpu_to_be32(0x0),
                                     cpu_to_be32(0x0),
                                     cpu_to_be32(0x0),
                                     cpu_to_be32(cpu->numa_node),
-                                    cpu_to_be32(cpu->cpu_index)};
+                                    cpu_to_be32(cpu_dt_id)};
 
-        if ((cpu->cpu_index % smt) != 0) {
+        if ((cpu_dt_id % smt) != 0) {
             continue;
         }
 
         snprintf(cpu_model, 32, "/cpus/%s@%x", dc->fw_name,
-                 cpu->cpu_index);
+                 cpu_dt_id);
 
         offset = fdt_path_offset(fdt, cpu_model);
         if (offset < 0) {
@@ -367,7 +368,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
         CPUPPCState *env = &cpu->env;
         DeviceClass *dc = DEVICE_GET_CLASS(cs);
         PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
-        int index = cs->cpu_index;
+        int index = ppc_get_vcpu_dt_id(cs);
         uint32_t servers_prop[smp_threads];
         uint32_t gservers_prop[smp_threads * 2];
         char *nodename;
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index f755a53..1acb0d0 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -466,7 +466,7 @@ static target_ulong h_register_vpa(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     CPUPPCState *tenv;
     CPUState *tcpu;
 
-    tcpu = qemu_get_cpu(procno);
+    tcpu = ppc_get_vcpu_by_dt_id(procno);
     if (!tcpu) {
         return H_PARAMETER;
     }
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index eb542f2..f56dfca 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -139,7 +139,7 @@ static void rtas_query_cpu_stopped_state(PowerPCCPU *cpu_,
     }
 
     id = rtas_ld(args, 0);
-    cpu = qemu_get_cpu(id);
+    cpu = ppc_get_vcpu_by_dt_id(id);
     if (cpu != NULL) {
         if (cpu->halted) {
             rtas_st(rets, 1, 0);
@@ -172,7 +172,7 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPREnvironment *spapr,
     start = rtas_ld(args, 1);
     r3 = rtas_ld(args, 2);
 
-    cs = qemu_get_cpu(id);
+    cs = ppc_get_vcpu_by_dt_id(id);
     if (cs != NULL) {
         PowerPCCPU *cpu = POWERPC_CPU(cs);
         CPUPPCState *env = &cpu->env;
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 26acdba..6d04b6e 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1068,6 +1068,9 @@ struct CPUPPCState {
      */
     uint8_t fit_period[4];
     uint8_t wdt_period[4];
+
+    /* The CPU index used in the device tree. KVM uses this index too */
+    int cpu_dt_id;
 };
 
 #define SET_FIT_PERIOD(a_, b_, c_, d_)          \
@@ -2147,4 +2150,7 @@ static inline bool cpu_has_work(CPUState *cpu)
 
 void dump_mmu(FILE *f, fprintf_function cpu_fprintf, CPUPPCState *env);
 
+int ppc_get_vcpu_dt_id(CPUState *cs);
+CPUState *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
+
 #endif /* !defined (__CPU_PPC_H__) */
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 10d0cd9..c579d82 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -401,7 +401,7 @@ static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
 
 unsigned long kvm_arch_vcpu_id(CPUState *cpu)
 {
-    return cpu->cpu_index;
+    return ppc_get_vcpu_dt_id(cpu);
 }
 
 int kvm_arch_init_vcpu(CPUState *cs)
@@ -1773,7 +1773,7 @@ int kvmppc_fixup_cpu(PowerPCCPU *cpu)
 
     /* Adjust cpu index for SMT */
     smt = kvmppc_smt_threads();
-    cs->cpu_index = (cs->cpu_index / smp_threads) * smt
+    cpu->env.cpu_dt_id = (cs->cpu_index / smp_threads) * smt
         + (cs->cpu_index % smp_threads);
 
     return 0;
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 3d3952c..c8f28bc 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8557,6 +8557,7 @@ static void ppc_cpu_initfn(Object *obj)
 
     cs->env_ptr = env;
     cpu_exec_init(env);
+    env->cpu_dt_id = cs->cpu_index;
 
     env->msr_mask = pcc->msr_mask;
     env->mmu_model = pcc->mmu_model;
-- 
1.8.4.rc4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v3] ppc: introduce CPUPPCState::cpu_dt_id
  2013-11-04  1:10 [Qemu-devel] [PATCH v3] ppc: introduce CPUPPCState::cpu_dt_id Alexey Kardashevskiy
@ 2013-11-04  9:41 ` Alexander Graf
  2013-11-04  9:58   ` Alexey Kardashevskiy
  2013-11-04 19:42   ` Scott Wood
  0 siblings, 2 replies; 9+ messages in thread
From: Alexander Graf @ 2013-11-04  9:41 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: QEMU Developers, Bharat Bhushan, Paolo Bonzini, Paul Mackerras,
	Badari Pulavarty, Scott Wood, list@suse.de:PReP, David Gibson


On 04.11.2013, at 02:10, Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> Normally CPUState::cpu_index is used to pick the right CPU for various
> operations. However default consecutive numbering does not always work
> for POWERPC.
> 
> For example, on POWER7 (which supports 4 threads per core),
> "-smp 8,threads=4" should create CPUs with indexes 0,1,2,3,4,5,6,7 and
> "-smp 8,threads=1" should create CPUs with indexes 0,4,8,12,16,20,24,28.
> 
> These indexes are reflected in /proc/device-tree/cpus/PowerPC,POWER7@XX
> and used to call KVM VCPU's ioctls. In order to achieve this,
> kvmppc_fixup_cpu() was introduced. Roughly speaking, it multiplies
> cpu_index by the number of threads per core.
> 
> This approach has disadvantages such as:
> 1. NUMA configuration stays broken after the fixup;
> 2. CPU-related commands from QEMU Monitor do not work properly as
> the accept fixed CPU indexes and the user does not really know
> what they are after fixup as the number of threads per core changes
> between CPU versions and via QEMU command line.
> 
> This introduces a new @cpu_dt_id field in the CPUPPCState struct which
> is set from @cpu_index by default but can be fixed later to the value
> which a hypervisor can accept. This also introduces two POWERPC-arch
> specific functions:
> 1. int ppc_get_vcpu_dt_id(CPUState *cs) - returns a device-tree ID
> for a CPU;
> 2. CPUState *ppc_get_vcpu_by_dt_id(int cpu_dt_id) - finds CPUState by
> a device-tree CPU ID.
> 
> This uses the new functions to:
> 1. fix emulated XICS hypercall handlers as they receive fixed CPU indexes;
> 2. fix XICS-KVM to enable in-kernel XICS on right CPU;
> 3. compose correct device-tree.
> 
> This removes @cpu_index fixup as @cpu_dt_id is used instead so QEMU monitor
> can accept command-line CPU indexes again.

So this patch feels awkward. We use the dt fixup at random places in completely dt unrelated code paths.

What we really have are 3 semantically separate entities:

  * QEMU internal cpu id
  * KVM internal cpu id
  * DT exposed cpu id

As you have noted, it's a good idea to keep the QEMU internal cpu id linear, thus completely separate from the others. The DT exposed cpu id should be 100% local to hw/ppc/spapr*.c. I don't think any code outside of the DT generation and anything that accesses the "Virtual Processor Number" in sPAPR needs to care about the DT cpu id. All that code is 100% KVM agnostic.

The KVM internal cpu id should probably be a new field in the generic CPUState that gets used by kvm specific code that needs to know the KVM internal cpu id a vcpu was created with. The flow should be that kvm_arch_vcpu_id() tells kvm_init_vcpu() the kvm internal id to use which then stores it in CPUState. That way you can always get the KVM intenal cpu id from a CPU struct. All the references to this ID should _only_ happen from KVM only code.


Alex

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v3] ppc: introduce CPUPPCState::cpu_dt_id
  2013-11-04  9:41 ` Alexander Graf
@ 2013-11-04  9:58   ` Alexey Kardashevskiy
  2013-11-04 10:13     ` Alexander Graf
  2013-11-04 19:42   ` Scott Wood
  1 sibling, 1 reply; 9+ messages in thread
From: Alexey Kardashevskiy @ 2013-11-04  9:58 UTC (permalink / raw)
  To: Alexander Graf
  Cc: QEMU Developers, Bharat Bhushan, Paolo Bonzini, Paul Mackerras,
	Badari Pulavarty, Scott Wood, list@suse.de:PReP, David Gibson

On 11/04/2013 08:41 PM, Alexander Graf wrote:
> 
> On 04.11.2013, at 02:10, Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
>> Normally CPUState::cpu_index is used to pick the right CPU for various
>> operations. However default consecutive numbering does not always work
>> for POWERPC.
>>
>> For example, on POWER7 (which supports 4 threads per core),
>> "-smp 8,threads=4" should create CPUs with indexes 0,1,2,3,4,5,6,7 and
>> "-smp 8,threads=1" should create CPUs with indexes 0,4,8,12,16,20,24,28.
>>
>> These indexes are reflected in /proc/device-tree/cpus/PowerPC,POWER7@XX
>> and used to call KVM VCPU's ioctls. In order to achieve this,
>> kvmppc_fixup_cpu() was introduced. Roughly speaking, it multiplies
>> cpu_index by the number of threads per core.
>>
>> This approach has disadvantages such as:
>> 1. NUMA configuration stays broken after the fixup;
>> 2. CPU-related commands from QEMU Monitor do not work properly as
>> the accept fixed CPU indexes and the user does not really know
>> what they are after fixup as the number of threads per core changes
>> between CPU versions and via QEMU command line.
>>
>> This introduces a new @cpu_dt_id field in the CPUPPCState struct which
>> is set from @cpu_index by default but can be fixed later to the value
>> which a hypervisor can accept. This also introduces two POWERPC-arch
>> specific functions:
>> 1. int ppc_get_vcpu_dt_id(CPUState *cs) - returns a device-tree ID
>> for a CPU;
>> 2. CPUState *ppc_get_vcpu_by_dt_id(int cpu_dt_id) - finds CPUState by
>> a device-tree CPU ID.
>>
>> This uses the new functions to:
>> 1. fix emulated XICS hypercall handlers as they receive fixed CPU indexes;
>> 2. fix XICS-KVM to enable in-kernel XICS on right CPU;
>> 3. compose correct device-tree.
>>
>> This removes @cpu_index fixup as @cpu_dt_id is used instead so QEMU monitor
>> can accept command-line CPU indexes again.
> 
> So this patch feels awkward. We use the dt fixup at random places in completely dt unrelated code paths.

Yep. I called it smp_cpu_index in the first place :)

> What we really have are 3 semantically separate entities:
> 
>   * QEMU internal cpu id
>   * KVM internal cpu id
>   * DT exposed cpu id
> 

> As you have noted, it's a good idea to keep the QEMU internal cpu id
> linear, thus completely separate from the others. The DT exposed cpu id
> should be 100% local to hw/ppc/spapr*.c. I don't think any code outside
> of the DT generation and anything that accesses the "Virtual Processor
> Number" in sPAPR needs to care about the DT cpu id. All that code is
> 100% KVM agnostic.
> 
> The KVM internal cpu id should probably be a new field in the generic
> CPUState that gets used by kvm specific code that needs to know the KVM
> internal cpu id a vcpu was created with. The flow should be that
> kvm_arch_vcpu_id() tells kvm_init_vcpu() the kvm internal id to use
> which then stores it in CPUState. That way you can always get the KVM
> intenal cpu id from a CPU struct. All the references to this ID should
> _only_ happen from KVM only code.


If DT id is local to spapr*, then how do I implement kvm_arch_vcpu_id()?
Where will it get the fixed value? Do the same calculation in two different
places (for device tree and for kvm)?


-- 
Alexey

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v3] ppc: introduce CPUPPCState::cpu_dt_id
  2013-11-04  9:58   ` Alexey Kardashevskiy
@ 2013-11-04 10:13     ` Alexander Graf
  0 siblings, 0 replies; 9+ messages in thread
From: Alexander Graf @ 2013-11-04 10:13 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: QEMU Developers, Bharat Bhushan, Paolo Bonzini, Paul Mackerras,
	Badari Pulavarty, Scott Wood, list@suse.de:PReP, David Gibson


On 04.11.2013, at 10:58, Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> On 11/04/2013 08:41 PM, Alexander Graf wrote:
>> 
>> On 04.11.2013, at 02:10, Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>> 
>>> Normally CPUState::cpu_index is used to pick the right CPU for various
>>> operations. However default consecutive numbering does not always work
>>> for POWERPC.
>>> 
>>> For example, on POWER7 (which supports 4 threads per core),
>>> "-smp 8,threads=4" should create CPUs with indexes 0,1,2,3,4,5,6,7 and
>>> "-smp 8,threads=1" should create CPUs with indexes 0,4,8,12,16,20,24,28.
>>> 
>>> These indexes are reflected in /proc/device-tree/cpus/PowerPC,POWER7@XX
>>> and used to call KVM VCPU's ioctls. In order to achieve this,
>>> kvmppc_fixup_cpu() was introduced. Roughly speaking, it multiplies
>>> cpu_index by the number of threads per core.
>>> 
>>> This approach has disadvantages such as:
>>> 1. NUMA configuration stays broken after the fixup;
>>> 2. CPU-related commands from QEMU Monitor do not work properly as
>>> the accept fixed CPU indexes and the user does not really know
>>> what they are after fixup as the number of threads per core changes
>>> between CPU versions and via QEMU command line.
>>> 
>>> This introduces a new @cpu_dt_id field in the CPUPPCState struct which
>>> is set from @cpu_index by default but can be fixed later to the value
>>> which a hypervisor can accept. This also introduces two POWERPC-arch
>>> specific functions:
>>> 1. int ppc_get_vcpu_dt_id(CPUState *cs) - returns a device-tree ID
>>> for a CPU;
>>> 2. CPUState *ppc_get_vcpu_by_dt_id(int cpu_dt_id) - finds CPUState by
>>> a device-tree CPU ID.
>>> 
>>> This uses the new functions to:
>>> 1. fix emulated XICS hypercall handlers as they receive fixed CPU indexes;
>>> 2. fix XICS-KVM to enable in-kernel XICS on right CPU;
>>> 3. compose correct device-tree.
>>> 
>>> This removes @cpu_index fixup as @cpu_dt_id is used instead so QEMU monitor
>>> can accept command-line CPU indexes again.
>> 
>> So this patch feels awkward. We use the dt fixup at random places in completely dt unrelated code paths.
> 
> Yep. I called it smp_cpu_index in the first place :)
> 
>> What we really have are 3 semantically separate entities:
>> 
>>  * QEMU internal cpu id
>>  * KVM internal cpu id
>>  * DT exposed cpu id
>> 
> 
>> As you have noted, it's a good idea to keep the QEMU internal cpu id
>> linear, thus completely separate from the others. The DT exposed cpu id
>> should be 100% local to hw/ppc/spapr*.c. I don't think any code outside
>> of the DT generation and anything that accesses the "Virtual Processor
>> Number" in sPAPR needs to care about the DT cpu id. All that code is
>> 100% KVM agnostic.
>> 
>> The KVM internal cpu id should probably be a new field in the generic
>> CPUState that gets used by kvm specific code that needs to know the KVM
>> internal cpu id a vcpu was created with. The flow should be that
>> kvm_arch_vcpu_id() tells kvm_init_vcpu() the kvm internal id to use
>> which then stores it in CPUState. That way you can always get the KVM
>> intenal cpu id from a CPU struct. All the references to this ID should
>> _only_ happen from KVM only code.
> 
> 
> If DT id is local to spapr*, then how do I implement kvm_arch_vcpu_id()?
> Where will it get the fixed value? Do the same calculation in two different
> places (for device tree and for kvm)?

kvm_arch_vcpu_id() won't get called until qemu_init_vcpu() is issued from ppc_cpu_realizefn(). So if instead of calling cpu_ppc_init() you split up the function and set the kvm id property before realize from ppc_spapr_init(), that should work, no?


Alex

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v3] ppc: introduce CPUPPCState::cpu_dt_id
  2013-11-04  9:41 ` Alexander Graf
  2013-11-04  9:58   ` Alexey Kardashevskiy
@ 2013-11-04 19:42   ` Scott Wood
  2013-11-05  1:26     ` Alexey Kardashevskiy
  1 sibling, 1 reply; 9+ messages in thread
From: Scott Wood @ 2013-11-04 19:42 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Alexey Kardashevskiy, QEMU Developers, Bharat Bhushan,
	Paolo Bonzini, Paul Mackerras, Badari Pulavarty,
	list@suse.de:PReP, David Gibson

On Mon, 2013-11-04 at 10:41 +0100, Alexander Graf wrote:
> What we really have are 3 semantically separate entities:
> 
>   * QEMU internal cpu id
>   * KVM internal cpu id
>   * DT exposed cpu id
> 
> As you have noted, it's a good idea to keep the QEMU internal cpu id
> linear, thus completely separate from the others. The DT exposed cpu id
> should be 100% local to hw/ppc/spapr*.c. I don't think any code outside
> of the DT generation and anything that accesses the "Virtual Processor
> Number" in sPAPR needs to care about the DT cpu id. All that code is
> 100% KVM agnostic.

This patch isn't just for sPAPR...  On e500 the DT cpu id is supposed to
match the MPIC cpu id.

-Scott

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v3] ppc: introduce CPUPPCState::cpu_dt_id
  2013-11-04 19:42   ` Scott Wood
@ 2013-11-05  1:26     ` Alexey Kardashevskiy
  2013-11-05  1:48       ` Scott Wood
  0 siblings, 1 reply; 9+ messages in thread
From: Alexey Kardashevskiy @ 2013-11-05  1:26 UTC (permalink / raw)
  To: Scott Wood, Alexander Graf
  Cc: QEMU Developers, Bharat Bhushan, Paolo Bonzini, Paul Mackerras,
	Badari Pulavarty, list@suse.de:PReP, David Gibson

On 11/05/2013 06:42 AM, Scott Wood wrote:
> On Mon, 2013-11-04 at 10:41 +0100, Alexander Graf wrote:
>> What we really have are 3 semantically separate entities:
>>
>>   * QEMU internal cpu id
>>   * KVM internal cpu id
>>   * DT exposed cpu id
>>
>> As you have noted, it's a good idea to keep the QEMU internal cpu id
>> linear, thus completely separate from the others. The DT exposed cpu id
>> should be 100% local to hw/ppc/spapr*.c. I don't think any code outside
>> of the DT generation and anything that accesses the "Virtual Processor
>> Number" in sPAPR needs to care about the DT cpu id. All that code is
>> 100% KVM agnostic.
> 
> This patch isn't just for sPAPR...  On e500 the DT cpu id is supposed to
> match the MPIC cpu id.


At least is my patch correct for e500? I do not really know what is the
difference between e500 and spapr in this part.


-- 
Alexey

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v3] ppc: introduce CPUPPCState::cpu_dt_id
  2013-11-05  1:26     ` Alexey Kardashevskiy
@ 2013-11-05  1:48       ` Scott Wood
  2013-11-05  6:00         ` Alexander Graf
  0 siblings, 1 reply; 9+ messages in thread
From: Scott Wood @ 2013-11-05  1:48 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Alexander Graf, QEMU Developers, Bharat Bhushan, Paolo Bonzini,
	Paul Mackerras, Badari Pulavarty, list@suse.de:PReP, David Gibson

On Tue, 2013-11-05 at 12:26 +1100, Alexey Kardashevskiy wrote:
> On 11/05/2013 06:42 AM, Scott Wood wrote:
> > On Mon, 2013-11-04 at 10:41 +0100, Alexander Graf wrote:
> >> What we really have are 3 semantically separate entities:
> >>
> >>   * QEMU internal cpu id
> >>   * KVM internal cpu id
> >>   * DT exposed cpu id
> >>
> >> As you have noted, it's a good idea to keep the QEMU internal cpu id
> >> linear, thus completely separate from the others. The DT exposed cpu id
> >> should be 100% local to hw/ppc/spapr*.c. I don't think any code outside
> >> of the DT generation and anything that accesses the "Virtual Processor
> >> Number" in sPAPR needs to care about the DT cpu id. All that code is
> >> 100% KVM agnostic.
> > 
> > This patch isn't just for sPAPR...  On e500 the DT cpu id is supposed to
> > match the MPIC cpu id.
> 
> 
> At least is my patch correct for e500?

I think so.

> I do not really know what is the difference between e500 and spapr in this part.

e500 does not have sPAPR, and if the DT ID is "100% local to
hw/ppc/spapr*.c" then the MPIC code will have a problem.

Currently we don't support smt on e500 in QEMU, but e6500 (an
e500-derivative) does have smt, so it could happen in the future.

-Scott

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v3] ppc: introduce CPUPPCState::cpu_dt_id
  2013-11-05  1:48       ` Scott Wood
@ 2013-11-05  6:00         ` Alexander Graf
  2013-11-06  1:04           ` Scott Wood
  0 siblings, 1 reply; 9+ messages in thread
From: Alexander Graf @ 2013-11-05  6:00 UTC (permalink / raw)
  To: Scott Wood
  Cc: Alexey Kardashevskiy, QEMU Developers, Bharat Bhushan,
	Paolo Bonzini, Paul Mackerras, Badari Pulavarty,
	list@suse.de:PReP, David Gibson



Am 05.11.2013 um 02:48 schrieb Scott Wood <scottwood@freescale.com>:

> On Tue, 2013-11-05 at 12:26 +1100, Alexey Kardashevskiy wrote:
>> On 11/05/2013 06:42 AM, Scott Wood wrote:
>>> On Mon, 2013-11-04 at 10:41 +0100, Alexander Graf wrote:
>>>> What we really have are 3 semantically separate entities:
>>>> 
>>>>  * QEMU internal cpu id
>>>>  * KVM internal cpu id
>>>>  * DT exposed cpu id
>>>> 
>>>> As you have noted, it's a good idea to keep the QEMU internal cpu id
>>>> linear, thus completely separate from the others. The DT exposed cpu id
>>>> should be 100% local to hw/ppc/spapr*.c. I don't think any code outside
>>>> of the DT generation and anything that accesses the "Virtual Processor
>>>> Number" in sPAPR needs to care about the DT cpu id. All that code is
>>>> 100% KVM agnostic.
>>> 
>>> This patch isn't just for sPAPR...  On e500 the DT cpu id is supposed to
>>> match the MPIC cpu id.
>> 
>> 
>> At least is my patch correct for e500?
> 
> I think so.
> 
>> I do not really know what is the difference between e500 and spapr in this part.
> 
> e500 does not have sPAPR, and if the DT ID is "100% local to
> hw/ppc/spapr*.c" then the MPIC code will have a problem.
> 
> Currently we don't support smt on e500 in QEMU, but e6500 (an
> e500-derivative) does have smt, so it could happen in the future.

Sure, at that point we add logic that syncs the kvm and dt ids to whatever the kernel expects from the e500 machine file.

The current logic that says "every Nth vcpu id is a core, the ones in between are threads" is fairly implementation specific to the spapr hv code, no?


Alex

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v3] ppc: introduce CPUPPCState::cpu_dt_id
  2013-11-05  6:00         ` Alexander Graf
@ 2013-11-06  1:04           ` Scott Wood
  0 siblings, 0 replies; 9+ messages in thread
From: Scott Wood @ 2013-11-06  1:04 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Alexey Kardashevskiy, QEMU Developers, Bharat Bhushan,
	Paolo Bonzini, Paul Mackerras, Badari Pulavarty,
	list@suse.de:PReP, David Gibson

On Tue, 2013-11-05 at 07:00 +0100, Alexander Graf wrote:
> 
> Am 05.11.2013 um 02:48 schrieb Scott Wood <scottwood@freescale.com>:
> 
> > On Tue, 2013-11-05 at 12:26 +1100, Alexey Kardashevskiy wrote:
> >> On 11/05/2013 06:42 AM, Scott Wood wrote:
> >>> On Mon, 2013-11-04 at 10:41 +0100, Alexander Graf wrote:
> >>>> What we really have are 3 semantically separate entities:
> >>>> 
> >>>>  * QEMU internal cpu id
> >>>>  * KVM internal cpu id
> >>>>  * DT exposed cpu id
> >>>> 
> >>>> As you have noted, it's a good idea to keep the QEMU internal cpu id
> >>>> linear, thus completely separate from the others. The DT exposed cpu id
> >>>> should be 100% local to hw/ppc/spapr*.c. I don't think any code outside
> >>>> of the DT generation and anything that accesses the "Virtual Processor
> >>>> Number" in sPAPR needs to care about the DT cpu id. All that code is
> >>>> 100% KVM agnostic.
> >>> 
> >>> This patch isn't just for sPAPR...  On e500 the DT cpu id is supposed to
> >>> match the MPIC cpu id.
> >> 
> >> 
> >> At least is my patch correct for e500?
> > 
> > I think so.
> > 
> >> I do not really know what is the difference between e500 and spapr in this part.
> > 
> > e500 does not have sPAPR, and if the DT ID is "100% local to
> > hw/ppc/spapr*.c" then the MPIC code will have a problem.
> > 
> > Currently we don't support smt on e500 in QEMU, but e6500 (an
> > e500-derivative) does have smt, so it could happen in the future.
> 
> Sure, at that point we add logic that syncs the kvm and dt ids to whatever the kernel expects from the e500 machine file.
> 
> The current logic that says "every Nth vcpu id is a core, the ones in
> between are threads" is fairly implementation specific to the spapr hv
> code, no?

It's the same on e6500, though I agree that in theory it should not be
generic code specifying the scheme.  My point relates to the consumer of
this information -- "device tree CPU ID" is a concept that makes sense
outside of sPAPR.

Note that the connection between the interrupt controller and the CPU
reg is an ePAPR requirement, not just a coincidence, or something we
only do on MPIC.  PIR is also supposed to match the device tree CPU ID
-- especially under KVM, where PIR is not modifiable by the guest, in
the absence of a device tree binding specifically for PIR which we do
not have.

-Scott

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2013-11-06  1:05 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-04  1:10 [Qemu-devel] [PATCH v3] ppc: introduce CPUPPCState::cpu_dt_id Alexey Kardashevskiy
2013-11-04  9:41 ` Alexander Graf
2013-11-04  9:58   ` Alexey Kardashevskiy
2013-11-04 10:13     ` Alexander Graf
2013-11-04 19:42   ` Scott Wood
2013-11-05  1:26     ` Alexey Kardashevskiy
2013-11-05  1:48       ` Scott Wood
2013-11-05  6:00         ` Alexander Graf
2013-11-06  1:04           ` Scott Wood

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).