qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug
@ 2015-11-20 12:54 Bharata B Rao
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU topologies with partially filled cores Bharata B Rao
                   ` (10 more replies)
  0 siblings, 11 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-11-20 12:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: aik, Bharata B Rao, mdroth, agraf, pbonzini, qemu-ppc, tyreld,
	nfont, imammedo, afaerber, david

This patchset adds CPU hotplug support for sPAPR PowerPC guests using the
device_add and device_del commands:

(qemu) device_add POWER8-powerpc64-cpu,id=cpu0
(qemu) device_del cpu0

The first 5 patches are generic changes. Of these, 4/10 is also required
by x86 and s390 and has been posted as part of their CPU hotplug patchsets.
I believe 2/10 and 3/10 would be useful for other archs too.

Andreas - If and when found appropriate, would you be taking patches 1 to
5 via your tree? Or should I post them as a separate prerequisite patchset?

Patches 6 to 10 are Power specific.

Changes in v5
-------------
- Got rid of the new element (cpu->queued) that the previous version
  introduced and use the same logic to determine whether a cpu has already
  been dequeued in both implementations of cpu_exec_exit(). (2/10)
- Call cpu_remove() from cpu_remove_sync() instead of duplicating code. (5/10)
- s/smp_cores/spapr_smp_cores (8/10)
- Set correct tb offset for hotplugged CPU. (8/10)
- s/spapr_hotplug_req_add_event/spapr_hotplug_req_add_by_index (8/10)
- Removed support for incomplete cores and added a separate patch
  to prevent such topologies. (8/10 and 1/10)

v4: https://lists.gnu.org/archive/html/qemu-devel/2015-08/msg00650.html

This series applies on top of ppc-for-2.6 branch of David Gibson's tree.

Bharata B Rao (9):
  vl: Don't allow CPU topologies with partially filled cores
  exec: Remove cpu from cpus list during cpu_exec_exit()
  exec: Do vmstate unregistration from cpu_exec_exit()
  cpu: Add a sync version of cpu_remove()
  xics_kvm: Add cpu_destroy method to XICS
  spapr: Enable CPU hotplug for pseries-2.5 and add CPU DRC DT entries
  spapr: CPU hotplug support
  spapr: CPU hot unplug support
  target-ppc: Enable CPU hotplug for POWER8 CPU family

Gu Zheng (1):
  cpu: Reclaim vCPU objects

 cpus.c                      |  53 ++++++++++
 exec.c                      |  30 ++++++
 hw/intc/xics.c              |  12 +++
 hw/intc/xics_kvm.c          |  13 ++-
 hw/ppc/spapr.c              | 250 +++++++++++++++++++++++++++++++++++++++++++-
 hw/ppc/spapr_events.c       |   3 +
 hw/ppc/spapr_rtas.c         |  24 +++++
 include/hw/ppc/spapr.h      |   1 +
 include/hw/ppc/xics.h       |   2 +
 include/qom/cpu.h           |  18 ++++
 include/sysemu/kvm.h        |   1 +
 kvm-all.c                   |  57 +++++++++-
 kvm-stub.c                  |   5 +
 target-ppc/translate_init.c |  10 ++
 vl.c                        |   9 ++
 15 files changed, 483 insertions(+), 5 deletions(-)

-- 
2.1.0


* [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU topologies with partially filled cores
  2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
@ 2015-11-20 12:54 ` Bharata B Rao
  2015-12-01  0:37   ` David Gibson
  2015-12-02 13:52   ` Igor Mammedov
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 02/10] exec: Remove cpu from cpus list during cpu_exec_exit() Bharata B Rao
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-11-20 12:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: aik, Bharata B Rao, mdroth, agraf, pbonzini, qemu-ppc, tyreld,
	nfont, imammedo, afaerber, david

Prevent guests from booting with CPU topologies that have partially
filled CPU cores, or that can end up with partially filled CPU cores
after CPU hotplug, such as

-smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
-smp 15,sockets=1,cores=4,threads=4,maxcpus=17.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 vl.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/vl.c b/vl.c
index 7d993a5..23a1a1e 100644
--- a/vl.c
+++ b/vl.c
@@ -1248,6 +1248,15 @@ static void smp_parse(QemuOpts *opts)
             exit(1);
         }
 
+        if (cpus % threads || max_cpus % threads) {
+            error_report("cpu topology: "
+                         "sockets (%u) cores (%u) threads (%u) with "
+                         "smp_cpus (%u) maxcpus (%u) "
+                         "will result in partially filled cores",
+                         sockets, cores, threads, cpus, max_cpus);
+            exit(1);
+        }
+
         smp_cpus = cpus;
         smp_cores = cores > 0 ? cores : 1;
         smp_threads = threads > 0 ? threads : 1;
-- 
2.1.0


* [Qemu-devel] [PATCH v5 02/10] exec: Remove cpu from cpus list during cpu_exec_exit()
  2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU topologies with partially filled cores Bharata B Rao
@ 2015-11-20 12:54 ` Bharata B Rao
  2015-12-01  0:44   ` David Gibson
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 03/10] exec: Do vmstate unregistration from cpu_exec_exit() Bharata B Rao
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Bharata B Rao @ 2015-11-20 12:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: aik, Bharata B Rao, mdroth, agraf, pbonzini, qemu-ppc, tyreld,
	nfont, imammedo, afaerber, david

CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
should be removed from that list in cpu_exec_exit().

cpu_exec_exit() is called from the generic CPU::instance_finalize, and
some archs like PowerPC also call it from the CPU unrealizefn. So ensure
that we dequeue the cpu only once.

A cpu_index value of -1 now indicates that we have already dequeued
the cpu, for the CONFIG_USER_ONLY case as well.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 exec.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/exec.c b/exec.c
index b09f18b..fbac85b 100644
--- a/exec.c
+++ b/exec.c
@@ -593,6 +593,7 @@ void cpu_exec_exit(CPUState *cpu)
         return;
     }
 
+    QTAILQ_REMOVE(&cpus, cpu, node);
     bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
     cpu->cpu_index = -1;
 }
@@ -611,6 +612,15 @@ static int cpu_get_free_index(Error **errp)
 
 void cpu_exec_exit(CPUState *cpu)
 {
+    cpu_list_lock();
+    if (cpu->cpu_index == -1) {
+        cpu_list_unlock();
+        return;
+    }
+
+    QTAILQ_REMOVE(&cpus, cpu, node);
+    cpu->cpu_index = -1;
+    cpu_list_unlock();
 }
 #endif
 
-- 
2.1.0


* [Qemu-devel] [PATCH v5 03/10] exec: Do vmstate unregistration from cpu_exec_exit()
  2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU topologies with partially filled cores Bharata B Rao
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 02/10] exec: Remove cpu from cpus list during cpu_exec_exit() Bharata B Rao
@ 2015-11-20 12:54 ` Bharata B Rao
  2015-12-01  0:46   ` David Gibson
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 04/10] cpu: Reclaim vCPU objects Bharata B Rao
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Bharata B Rao @ 2015-11-20 12:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: aik, Bharata B Rao, mdroth, agraf, pbonzini, qemu-ppc, tyreld,
	nfont, imammedo, afaerber, david

cpu_exec_init() does vmstate_register and register_savevm for the CPU device.
These need to be undone in cpu_exec_exit(). These changes are needed to
support CPU hot removal and also to correctly fail hotplug attempts
beyond max_cpus.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 exec.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/exec.c b/exec.c
index fbac85b..47f03c0 100644
--- a/exec.c
+++ b/exec.c
@@ -588,6 +588,8 @@ static int cpu_get_free_index(Error **errp)
 
 void cpu_exec_exit(CPUState *cpu)
 {
+    CPUClass *cc = CPU_GET_CLASS(cpu);
+
     if (cpu->cpu_index == -1) {
         /* cpu_index was never allocated by this @cpu or was already freed. */
         return;
@@ -596,6 +598,15 @@ void cpu_exec_exit(CPUState *cpu)
     QTAILQ_REMOVE(&cpus, cpu, node);
     bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
     cpu->cpu_index = -1;
+    if (cc->vmsd != NULL) {
+        vmstate_unregister(NULL, cc->vmsd, cpu);
+    }
+#if defined(CPU_SAVE_VERSION)
+    unregister_savevm(NULL, "cpu", cpu->env_ptr);
+#endif
+    if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
+        vmstate_unregister(NULL, &vmstate_cpu_common, cpu);
+    }
 }
 #else
 
@@ -612,6 +623,8 @@ static int cpu_get_free_index(Error **errp)
 
 void cpu_exec_exit(CPUState *cpu)
 {
+    CPUClass *cc = CPU_GET_CLASS(cpu);
+
     cpu_list_lock();
     if (cpu->cpu_index == -1) {
         cpu_list_unlock();
@@ -621,6 +634,13 @@ void cpu_exec_exit(CPUState *cpu)
     QTAILQ_REMOVE(&cpus, cpu, node);
     cpu->cpu_index = -1;
     cpu_list_unlock();
+
+    if (cc->vmsd != NULL) {
+        vmstate_unregister(NULL, cc->vmsd, cpu);
+    }
+    if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
+        vmstate_unregister(NULL, &vmstate_cpu_common, cpu);
+    }
 }
 #endif
 
-- 
2.1.0


* [Qemu-devel] [PATCH v5 04/10] cpu: Reclaim vCPU objects
  2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
                   ` (2 preceding siblings ...)
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 03/10] exec: Do vmstate unregistration from cpu_exec_exit() Bharata B Rao
@ 2015-11-20 12:54 ` Bharata B Rao
  2015-11-30  7:30   ` Alexey Kardashevskiy
  2015-12-01  0:55   ` David Gibson
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 05/10] cpu: Add a sync version of cpu_remove() Bharata B Rao
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-11-20 12:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Zhu Guihua, aik, Bharata B Rao, mdroth, agraf, Chen Fan, pbonzini,
	qemu-ppc, tyreld, nfont, Gu Zheng, imammedo, afaerber, david

From: Gu Zheng <guz.fnst@cn.fujitsu.com>

In order to deal properly with KVM vCPUs (which cannot be removed without
any protection), we do not close the KVM vcpu fd. Instead, we record it and
mark it as stopped on a list, so that it can be reused for a subsequent CPU
hot-add request if possible. This is also the approach that the KVM
developers suggested:
https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
               [- Explicit CPU_REMOVE() from qemu_kvm/tcg_destroy_vcpu()
                  isn't needed as it is done from cpu_exec_exit()]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 cpus.c               | 41 +++++++++++++++++++++++++++++++++++++
 include/qom/cpu.h    | 10 +++++++++
 include/sysemu/kvm.h |  1 +
 kvm-all.c            | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 kvm-stub.c           |  5 +++++
 5 files changed, 113 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 877bd70..af2b274 100644
--- a/cpus.c
+++ b/cpus.c
@@ -953,6 +953,21 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
     qemu_cpu_kick(cpu);
 }
 
+static void qemu_kvm_destroy_vcpu(CPUState *cpu)
+{
+    if (kvm_destroy_vcpu(cpu) < 0) {
+        error_report("kvm_destroy_vcpu failed");
+        exit(EXIT_FAILURE);
+    }
+
+    object_unparent(OBJECT(cpu));
+}
+
+static void qemu_tcg_destroy_vcpu(CPUState *cpu)
+{
+    object_unparent(OBJECT(cpu));
+}
+
 static void flush_queued_work(CPUState *cpu)
 {
     struct qemu_work_item *wi;
@@ -1053,6 +1068,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
             }
         }
         qemu_kvm_wait_io_event(cpu);
+        if (cpu->exit && !cpu_can_run(cpu)) {
+            qemu_kvm_destroy_vcpu(cpu);
+            qemu_mutex_unlock(&qemu_global_mutex);
+            return NULL;
+        }
     }
 
     return NULL;
@@ -1108,6 +1128,7 @@ static void tcg_exec_all(void);
 static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
+    CPUState *remove_cpu = NULL;
 
     rcu_register_thread();
 
@@ -1145,6 +1166,16 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
             }
         }
         qemu_tcg_wait_io_event(QTAILQ_FIRST(&cpus));
+        CPU_FOREACH(cpu) {
+            if (cpu->exit && !cpu_can_run(cpu)) {
+                remove_cpu = cpu;
+                break;
+            }
+        }
+        if (remove_cpu) {
+            qemu_tcg_destroy_vcpu(remove_cpu);
+            remove_cpu = NULL;
+        }
     }
 
     return NULL;
@@ -1301,6 +1332,13 @@ void resume_all_vcpus(void)
     }
 }
 
+void cpu_remove(CPUState *cpu)
+{
+    cpu->stop = true;
+    cpu->exit = true;
+    qemu_cpu_kick(cpu);
+}
+
 /* For temporary buffers for forming a name */
 #define VCPU_THREAD_NAME_SIZE 16
 
@@ -1506,6 +1544,9 @@ static void tcg_exec_all(void)
                 break;
             }
         } else if (cpu->stop || cpu->stopped) {
+            if (cpu->exit) {
+                next_cpu = CPU_NEXT(cpu);
+            }
             break;
         }
     }
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 51a1323..67e05b0 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -223,6 +223,7 @@ struct kvm_run;
  * @halted: Nonzero if the CPU is in suspended state.
  * @stop: Indicates a pending stop request.
  * @stopped: Indicates the CPU has been artificially stopped.
+ * @exit: Indicates the CPU has exited due to an unplug operation.
  * @crash_occurred: Indicates the OS reported a crash (panic) for this CPU
  * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
  *           CPU and return to its top level loop.
@@ -274,6 +275,7 @@ struct CPUState {
     bool created;
     bool stop;
     bool stopped;
+    bool exit;
     bool crash_occurred;
     bool exit_request;
     uint32_t interrupt_request;
@@ -696,6 +698,14 @@ void cpu_exit(CPUState *cpu);
 void cpu_resume(CPUState *cpu);
 
 /**
+ * cpu_remove:
+ * @cpu: The CPU to remove.
+ *
+ * Requests the CPU to be removed.
+ */
+void cpu_remove(CPUState *cpu);
+
+/**
  * qemu_init_vcpu:
  * @cpu: The vCPU to initialize.
  *
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index b31f325..dd1b783 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -206,6 +206,7 @@ int kvm_has_intx_set_mask(void);
 
 int kvm_init_vcpu(CPUState *cpu);
 int kvm_cpu_exec(CPUState *cpu);
+int kvm_destroy_vcpu(CPUState *cpu);
 
 #ifdef NEED_CPU_H
 
diff --git a/kvm-all.c b/kvm-all.c
index c648b81..3befc59 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -60,6 +60,12 @@
 
 #define KVM_MSI_HASHTAB_SIZE    256
 
+struct KVMParkedVcpu {
+    unsigned long vcpu_id;
+    int kvm_fd;
+    QLIST_ENTRY(KVMParkedVcpu) node;
+};
+
 struct KVMState
 {
     AccelState parent_obj;
@@ -93,6 +99,7 @@ struct KVMState
     QTAILQ_HEAD(msi_hashtab, KVMMSIRoute) msi_hashtab[KVM_MSI_HASHTAB_SIZE];
 #endif
     KVMMemoryListener memory_listener;
+    QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
 };
 
 KVMState *kvm_state;
@@ -235,6 +242,53 @@ static int kvm_set_user_memory_region(KVMMemoryListener *kml, KVMSlot *slot)
     return kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
 }
 
+int kvm_destroy_vcpu(CPUState *cpu)
+{
+    KVMState *s = kvm_state;
+    long mmap_size;
+    struct KVMParkedVcpu *vcpu = NULL;
+    int ret = 0;
+
+    DPRINTF("kvm_destroy_vcpu\n");
+
+    mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
+    if (mmap_size < 0) {
+        ret = mmap_size;
+        DPRINTF("KVM_GET_VCPU_MMAP_SIZE failed\n");
+        goto err;
+    }
+
+    ret = munmap(cpu->kvm_run, mmap_size);
+    if (ret < 0) {
+        goto err;
+    }
+
+    vcpu = g_malloc0(sizeof(*vcpu));
+    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
+    vcpu->kvm_fd = cpu->kvm_fd;
+    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+err:
+    return ret;
+}
+
+static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
+{
+    struct KVMParkedVcpu *cpu;
+
+    QLIST_FOREACH(cpu, &s->kvm_parked_vcpus, node) {
+        if (cpu->vcpu_id == vcpu_id) {
+            int kvm_fd;
+
+            QLIST_REMOVE(cpu, node);
+            kvm_fd = cpu->kvm_fd;
+            g_free(cpu);
+            return kvm_fd;
+        }
+    }
+
+    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
+}
+
 int kvm_init_vcpu(CPUState *cpu)
 {
     KVMState *s = kvm_state;
@@ -243,7 +297,7 @@ int kvm_init_vcpu(CPUState *cpu)
 
     DPRINTF("kvm_init_vcpu\n");
 
-    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)kvm_arch_vcpu_id(cpu));
+    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
     if (ret < 0) {
         DPRINTF("kvm_create_vcpu failed\n");
         goto err;
@@ -1468,6 +1522,7 @@ static int kvm_init(MachineState *ms)
 #ifdef KVM_CAP_SET_GUEST_DEBUG
     QTAILQ_INIT(&s->kvm_sw_breakpoints);
 #endif
+    QLIST_INIT(&s->kvm_parked_vcpus);
     s->vmfd = -1;
     s->fd = qemu_open("/dev/kvm", O_RDWR);
     if (s->fd == -1) {
diff --git a/kvm-stub.c b/kvm-stub.c
index dc97a5e..0b39456 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -32,6 +32,11 @@ bool kvm_allowed;
 bool kvm_readonly_mem_allowed;
 bool kvm_ioeventfd_any_length_allowed;
 
+int kvm_destroy_vcpu(CPUState *cpu)
+{
+    return -ENOSYS;
+}
+
 int kvm_init_vcpu(CPUState *cpu)
 {
     return -ENOSYS;
-- 
2.1.0


* [Qemu-devel] [PATCH v5 05/10] cpu: Add a sync version of cpu_remove()
  2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
                   ` (3 preceding siblings ...)
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 04/10] cpu: Reclaim vCPU objects Bharata B Rao
@ 2015-11-20 12:54 ` Bharata B Rao
  2015-12-01  0:57   ` David Gibson
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 06/10] xics_kvm: Add cpu_destroy method to XICS Bharata B Rao
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Bharata B Rao @ 2015-11-20 12:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: aik, Bharata B Rao, mdroth, agraf, pbonzini, qemu-ppc, tyreld,
	nfont, imammedo, afaerber, david

This sync API will be used by the CPU hotplug code to wait for the CPU to
be completely removed before flagging a failure to the device_add
command.

A sync version of this call is needed to correctly recover from CPU
realization failures when the ->plug() handler fails.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 cpus.c            | 12 ++++++++++++
 include/qom/cpu.h |  8 ++++++++
 2 files changed, 20 insertions(+)

diff --git a/cpus.c b/cpus.c
index af2b274..c2444ba 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1070,6 +1070,8 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
         qemu_kvm_wait_io_event(cpu);
         if (cpu->exit && !cpu_can_run(cpu)) {
             qemu_kvm_destroy_vcpu(cpu);
+            cpu->created = false;
+            qemu_cond_signal(&qemu_cpu_cond);
             qemu_mutex_unlock(&qemu_global_mutex);
             return NULL;
         }
@@ -1174,6 +1176,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
         }
         if (remove_cpu) {
             qemu_tcg_destroy_vcpu(remove_cpu);
+            cpu->created = false;
+            qemu_cond_signal(&qemu_cpu_cond);
             remove_cpu = NULL;
         }
     }
@@ -1339,6 +1343,14 @@ void cpu_remove(CPUState *cpu)
     qemu_cpu_kick(cpu);
 }
 
+void cpu_remove_sync(CPUState *cpu)
+{
+    cpu_remove(cpu);
+    while (cpu->created) {
+        qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
+    }
+}
+
 /* For temporary buffers for forming a name */
 #define VCPU_THREAD_NAME_SIZE 16
 
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 67e05b0..7fc6696 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -705,6 +705,14 @@ void cpu_resume(CPUState *cpu);
  */
 void cpu_remove(CPUState *cpu);
 
+ /**
+ * cpu_remove_sync:
+ * @cpu: The CPU to remove.
+ *
+ * Requests the CPU to be removed and waits till it is removed.
+ */
+void cpu_remove_sync(CPUState *cpu);
+
 /**
  * qemu_init_vcpu:
  * @cpu: The vCPU to initialize.
-- 
2.1.0


* [Qemu-devel] [PATCH v5 06/10] xics_kvm: Add cpu_destroy method to XICS
  2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
                   ` (4 preceding siblings ...)
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 05/10] cpu: Add a sync version of cpu_remove() Bharata B Rao
@ 2015-11-20 12:54 ` Bharata B Rao
  2015-12-01  1:01   ` David Gibson
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 07/10] spapr: Enable CPU hotplug for pseries-2.5 and add CPU DRC DT entries Bharata B Rao
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Bharata B Rao @ 2015-11-20 12:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: aik, Bharata B Rao, mdroth, agraf, pbonzini, qemu-ppc, tyreld,
	nfont, imammedo, afaerber, david

XICS is set up for each CPU during initialization. Provide a routine
to undo that setup when a CPU is unplugged.

This allows a VM that has undergone CPU hotplug and unplug to reboot
correctly.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/intc/xics.c        | 12 ++++++++++++
 hw/intc/xics_kvm.c    | 13 +++++++++++--
 include/hw/ppc/xics.h |  2 ++
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 9ff5796..e1161b2 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -44,6 +44,18 @@ static int get_cpu_index_by_dt_id(int cpu_dt_id)
     return -1;
 }
 
+void xics_cpu_destroy(XICSState *icp, PowerPCCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
+
+    assert(cs->cpu_index < icp->nr_servers);
+
+    if (info->cpu_destroy) {
+        info->cpu_destroy(icp, cpu);
+    }
+}
+
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
     CPUState *cs = CPU(cpu);
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index d58729c..cb96f69 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -331,6 +331,8 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
         abort();
     }
 
+    ss->cs = cs;
+
     /*
      * If we are reusing a parked vCPU fd corresponding to the CPU
      * which was hot-removed earlier we don't have to renable
@@ -343,8 +345,6 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     if (icpkvm->kernel_xics_fd != -1) {
         int ret;
 
-        ss->cs = cs;
-
         ret = kvm_vcpu_enable_cap(cs, KVM_CAP_IRQ_XICS, 0,
                                   icpkvm->kernel_xics_fd, kvm_arch_vcpu_id(cs));
         if (ret < 0) {
@@ -356,6 +356,14 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     }
 }
 
+static void xics_kvm_cpu_destroy(XICSState *icp, PowerPCCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    ICPState *ss = &icp->ss[cs->cpu_index];
+
+    ss->cs = NULL;
+}
+
 static void xics_kvm_set_nr_irqs(XICSState *icp, uint32_t nr_irqs, Error **errp)
 {
     icp->nr_irqs = icp->ics->nr_irqs = nr_irqs;
@@ -486,6 +494,7 @@ static void xics_kvm_class_init(ObjectClass *oc, void *data)
 
     dc->realize = xics_kvm_realize;
     xsc->cpu_setup = xics_kvm_cpu_setup;
+    xsc->cpu_destroy = xics_kvm_cpu_destroy;
     xsc->set_nr_irqs = xics_kvm_set_nr_irqs;
     xsc->set_nr_servers = xics_kvm_set_nr_servers;
 }
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 355a966..2faad48 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -68,6 +68,7 @@ struct XICSStateClass {
     DeviceClass parent_class;
 
     void (*cpu_setup)(XICSState *icp, PowerPCCPU *cpu);
+    void (*cpu_destroy)(XICSState *icp, PowerPCCPU *cpu);
     void (*set_nr_irqs)(XICSState *icp, uint32_t nr_irqs, Error **errp);
     void (*set_nr_servers)(XICSState *icp, uint32_t nr_servers, Error **errp);
 };
@@ -166,5 +167,6 @@ int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align);
 void xics_free(XICSState *icp, int irq, int num);
 
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu);
+void xics_cpu_destroy(XICSState *icp, PowerPCCPU *cpu);
 
 #endif /* __XICS_H__ */
-- 
2.1.0


* [Qemu-devel] [PATCH v5 07/10] spapr: Enable CPU hotplug for pseries-2.5 and add CPU DRC DT entries
  2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
                   ` (5 preceding siblings ...)
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 06/10] xics_kvm: Add cpu_destroy method to XICS Bharata B Rao
@ 2015-11-20 12:54 ` Bharata B Rao
  2015-12-01  1:06   ` David Gibson
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 08/10] spapr: CPU hotplug support Bharata B Rao
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Bharata B Rao @ 2015-11-20 12:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: aik, Bharata B Rao, mdroth, agraf, pbonzini, qemu-ppc, tyreld,
	nfont, imammedo, afaerber, david

Start supporting CPU hotplug from pseries-2.5 onwards. Add CPU
DRC (Dynamic Resource Connector) device tree entries.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c         | 23 +++++++++++++++++++++++
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 030ee35..814b0a6 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -973,6 +973,16 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr,
         _FDT(spapr_drc_populate_dt(fdt, 0, NULL, SPAPR_DR_CONNECTOR_TYPE_LMB));
     }
 
+    if (smc->dr_cpu_enabled) {
+        int offset = fdt_path_offset(fdt, "/cpus");
+        ret = spapr_drc_populate_dt(fdt, offset, NULL,
+                                    SPAPR_DR_CONNECTOR_TYPE_CPU);
+        if (ret < 0) {
+            fprintf(stderr, "Couldn't set up CPU DR device tree properties\n");
+            exit(1);
+        }
+    }
+
     _FDT((fdt_pack(fdt)));
 
     if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
@@ -1727,6 +1737,8 @@ static void ppc_spapr_init(MachineState *machine)
     long load_limit, fw_size;
     bool kernel_le = false;
     char *filename;
+    int smt = kvmppc_smt_threads();
+    int smp_max_cores = DIV_ROUND_UP(max_cpus, smp_threads);
 
     msi_supported = true;
 
@@ -1793,6 +1805,15 @@ static void ppc_spapr_init(MachineState *machine)
         spapr_validate_node_memory(machine);
     }
 
+    if (smc->dr_cpu_enabled) {
+        for (i = 0; i < smp_max_cores; i++) {
+            sPAPRDRConnector *drc =
+                spapr_dr_connector_new(OBJECT(spapr),
+                                       SPAPR_DR_CONNECTOR_TYPE_CPU, i * smt);
+            qemu_register_reset(spapr_drc_reset, drc);
+        }
+    }
+
     /* init CPUs */
     if (machine->cpu_model == NULL) {
         machine->cpu_model = kvm_enabled() ? "host" : "POWER7";
@@ -2277,6 +2298,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->cpu_index_to_socket_id = spapr_cpu_index_to_socket_id;
 
     smc->dr_lmb_enabled = false;
+    smc->dr_cpu_enabled = false;
     fwc->get_dev_path = spapr_get_fw_dev_path;
     nc->nmi_monitor_handler = spapr_nmi;
 }
@@ -2443,6 +2465,7 @@ static void spapr_machine_2_5_class_init(ObjectClass *oc, void *data)
     mc->alias = "pseries";
     mc->is_default = 1;
     smc->dr_lmb_enabled = true;
+    smc->dr_cpu_enabled = true;
 }
 
 static const TypeInfo spapr_machine_2_5_info = {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 5baa906..716d7ad 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -36,6 +36,7 @@ struct sPAPRMachineClass {
 
     /*< public >*/
     bool dr_lmb_enabled; /* enable dynamic-reconfig/hotplug of LMBs */
+    bool dr_cpu_enabled; /* enable dynamic-reconfig/hotplug of CPUs */
 };
 
 /**
-- 
2.1.0


* [Qemu-devel] [PATCH v5 08/10] spapr: CPU hotplug support
  2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
                   ` (6 preceding siblings ...)
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 07/10] spapr: Enable CPU hotplug for pseries-2.5 and add CPU DRC DT entries Bharata B Rao
@ 2015-11-20 12:54 ` Bharata B Rao
  2015-12-01  1:30   ` David Gibson
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 09/10] spapr: CPU hot unplug support Bharata B Rao
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Bharata B Rao @ 2015-11-20 12:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: aik, Bharata B Rao, mdroth, agraf, pbonzini, qemu-ppc, tyreld,
	nfont, imammedo, afaerber, david

Support CPU hotplug via the device_add command. Set up device tree
entries for the hotplugged CPU core and use the existing EPOW event
infrastructure to send the CPU hotplug notification to the guest.

Create only cores explicitly, from the boot path as well as the hotplug
path, and let the core's ->plug() handler create the threads of the core.

Also support cold-plugged CPUs specified with the -device option on the
command line.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c              | 156 +++++++++++++++++++++++++++++++++++++++++++-
 hw/ppc/spapr_events.c       |   3 +
 hw/ppc/spapr_rtas.c         |  24 +++++++
 target-ppc/translate_init.c |   8 +++
 4 files changed, 189 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 814b0a6..4434d45 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -596,6 +596,18 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
     size_t page_sizes_prop_size;
     uint32_t vcpus_per_socket = smp_threads * smp_cores;
     uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
+    sPAPRDRConnector *drc;
+    sPAPRDRConnectorClass *drck;
+    int drc_index;
+
+    if (smc->dr_cpu_enabled) {
+        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
+        g_assert(drc);
+        drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+        drc_index = drck->get_index(drc);
+        _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
+    }
 
     /* Note: we keep CI large pages off for now because a 64K capable guest
      * provisioned with large pages might otherwise try to map a qemu
@@ -1739,6 +1751,7 @@ static void ppc_spapr_init(MachineState *machine)
     char *filename;
     int smt = kvmppc_smt_threads();
     int smp_max_cores = DIV_ROUND_UP(max_cpus, smp_threads);
+    int spapr_smp_cores = DIV_ROUND_UP(smp_cpus, smp_threads);
 
     msi_supported = true;
 
@@ -1818,7 +1831,7 @@ static void ppc_spapr_init(MachineState *machine)
     if (machine->cpu_model == NULL) {
         machine->cpu_model = kvm_enabled() ? "host" : "POWER7";
     }
-    for (i = 0; i < smp_cpus; i++) {
+    for (i = 0; i < spapr_smp_cores; i++) {
         cpu = cpu_ppc_init(machine->cpu_model);
         if (cpu == NULL) {
             fprintf(stderr, "Unable to find PowerPC CPU definition\n");
@@ -2207,10 +2220,135 @@ out:
     error_propagate(errp, local_err);
 }
 
+static void *spapr_populate_hotplug_cpu_dt(DeviceState *dev, CPUState *cs,
+                                           int *fdt_offset,
+                                           sPAPRMachineState *spapr)
+{
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    DeviceClass *dc = DEVICE_GET_CLASS(cs);
+    int id = ppc_get_vcpu_dt_id(cpu);
+    void *fdt;
+    int offset, fdt_size;
+    char *nodename;
+
+    fdt = create_device_tree(&fdt_size);
+    nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
+    offset = fdt_add_subnode(fdt, 0, nodename);
+
+    spapr_populate_cpu_dt(cs, fdt, offset, spapr);
+    g_free(nodename);
+
+    *fdt_offset = offset;
+    return fdt;
+}
+
+static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                            Error **errp)
+{
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
+    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
+    CPUState *cs = CPU(dev);
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    int id = ppc_get_vcpu_dt_id(cpu);
+    sPAPRDRConnector *drc =
+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
+    sPAPRDRConnectorClass *drck;
+    int smt = kvmppc_smt_threads();
+    Error *local_err = NULL;
+    void *fdt = NULL;
+    int i, fdt_offset = 0;
+
+    /* Set NUMA node for the added CPUs  */
+    for (i = 0; i < nb_numa_nodes; i++) {
+        if (test_bit(cs->cpu_index, numa_info[i].node_cpu)) {
+            cs->numa_node = i;
+            break;
+        }
+    }
+
+    /*
+     * Currently CPU core and threads of a core aren't really different
+     * from QEMU point of view since all of them are just CPU devices. Hence
+     * there is no separate realize routines for cores and threads.
+     * We use the id check below to do things differently for cores and threads.
+     *
+     * SMT threads return from here, only main thread (core) will
+     * continue, create threads and signal hotplug event to the guest.
+     */
+    if ((id % smt) != 0) {
+        return;
+    }
+
+    /* Create SMT threads of the core. */
+    for (i = 1; i < smp_threads; i++) {
+        cpu = cpu_ppc_init(current_machine->cpu_model);
+        if (!cpu) {
+            error_report("Unable to find PowerPC CPU definition: %s",
+                          current_machine->cpu_model);
+            exit(EXIT_FAILURE);
+        }
+    }
+
+    if (!smc->dr_cpu_enabled) {
+        /*
+         * This is a cold plugged CPU but the machine doesn't support
+         * DR. So skip the hotplug path ensuring that the CPU is brought
+         * up online with out an associated DR connector.
+         */
+        return;
+    }
+
+    g_assert(drc);
+
+    /*
+     * Setup CPU DT entries only for hotplugged CPUs. For boot time or
+     * coldplugged CPUs DT entries are setup in spapr_finalize_fdt().
+     */
+    if (dev->hotplugged) {
+        fdt = spapr_populate_hotplug_cpu_dt(dev, cs, &fdt_offset, ms);
+    }
+
+    drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+    drck->attach(drc, dev, fdt, fdt_offset, !dev->hotplugged, &local_err);
+    if (local_err) {
+        g_free(fdt);
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    /*
+     * We send hotplug notification interrupt to the guest only in case
+     * of hotplugged CPUs.
+     */
+    if (dev->hotplugged) {
+        spapr_hotplug_req_add_by_index(drc);
+    } else {
+        /*
+         * HACK to support removal of hotplugged CPU after VM migration:
+         *
+         * Since we want to be able to hot-remove those coldplugged CPUs
+         * started at boot time using -device option at the target VM, we set
+         * the right allocation_state and isolation_state for them, which for
+         * the hotplugged CPUs would be set via RTAS calls done from the
+         * guest during hotplug.
+         *
+         * This allows the coldplugged CPUs started using -device option to
+         * have the right isolation and allocation states as expected by the
+         * CPU hot removal code.
+         *
+         * This hack will be removed once we have DRC states migrated as part
+         * of VM migration.
+         */
+        drck->set_allocation_state(drc, SPAPR_DR_ALLOCATION_STATE_USABLE);
+        drck->set_isolation_state(drc, SPAPR_DR_ISOLATION_STATE_UNISOLATED);
+    }
+}
+
 static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
     sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
+    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
         int node;
@@ -2247,6 +2385,19 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
         }
 
         spapr_memory_plug(hotplug_dev, dev, node, errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        CPUState *cs = CPU(dev);
+        PowerPCCPU *cpu = POWERPC_CPU(cs);
+
+        if (!smc->dr_cpu_enabled && dev->hotplugged) {
+            error_setg(errp, "CPU hotplug not supported for this machine");
+            cpu_remove_sync(cs);
+            return;
+        }
+
+        spapr_cpu_init(ms, cpu);
+        spapr_cpu_reset(cpu);
+        spapr_cpu_plug(hotplug_dev, dev, errp);
     }
 }
 
@@ -2261,7 +2412,8 @@ static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
 static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
                                              DeviceState *dev)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
+        object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
         return HOTPLUG_HANDLER(machine);
     }
     return NULL;
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 744ea62..1063036 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -436,6 +436,9 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
     case SPAPR_DR_CONNECTOR_TYPE_LMB:
         hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_MEMORY;
         break;
+    case SPAPR_DR_CONNECTOR_TYPE_CPU:
+        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
+        break;
     default:
         /* we shouldn't be signaling hotplug events for resources
          * that don't support them
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 34b12a3..7baa862 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -33,6 +33,7 @@
 
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_vio.h"
+#include "hw/ppc/ppc.h"
 #include "qapi-event.h"
 #include "hw/boards.h"
 
@@ -159,6 +160,27 @@ static void rtas_query_cpu_stopped_state(PowerPCCPU *cpu_,
     rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
 }
 
+/*
+ * Set the timebase offset of the CPU to that of first CPU.
+ * This helps hotplugged CPU to have the correct timebase offset.
+ */
+static void spapr_cpu_update_tb_offset(PowerPCCPU *cpu)
+{
+    PowerPCCPU *fcpu = POWERPC_CPU(first_cpu);
+
+    cpu->env.tb_env->tb_offset = fcpu->env.tb_env->tb_offset;
+}
+
+static void spapr_cpu_set_endianness(PowerPCCPU *cpu)
+{
+    PowerPCCPU *fcpu = POWERPC_CPU(first_cpu);
+    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(fcpu);
+
+    if (!pcc->interrupts_big_endian(fcpu)) {
+        cpu->env.spr[SPR_LPCR] |= LPCR_ILE;
+    }
+}
+
 static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPRMachineState *spapr,
                            uint32_t token, uint32_t nargs,
                            target_ulong args,
@@ -195,6 +217,8 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPRMachineState *spapr,
         env->nip = start;
         env->gpr[3] = r3;
         cs->halted = 0;
+        spapr_cpu_set_endianness(cpu);
+        spapr_cpu_update_tb_offset(cpu);
 
         qemu_cpu_kick(cs);
 
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index e88dc7f..245d73a 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -30,6 +30,9 @@
 #include "qemu/error-report.h"
 #include "qapi/visitor.h"
 #include "hw/qdev-properties.h"
+#if !defined(CONFIG_USER_ONLY)
+#include "sysemu/sysemu.h"
+#endif
 
 //#define PPC_DUMP_CPU
 //#define PPC_DEBUG_SPR
@@ -8933,6 +8936,11 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
     }
 
 #if !defined(CONFIG_USER_ONLY)
+    if (cs->cpu_index >= max_cpus) {
+        error_setg(errp, "Cannot have more than %d CPUs", max_cpus);
+        return;
+    }
+
     cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
         + (cs->cpu_index % smp_threads);
 #endif
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v5 09/10] spapr: CPU hot unplug support
  2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
                   ` (7 preceding siblings ...)
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 08/10] spapr: CPU hotplug support Bharata B Rao
@ 2015-11-20 12:54 ` Bharata B Rao
  2015-12-01  1:34   ` David Gibson
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 10/10] target-ppc: Enable CPU hotplug for POWER8 CPU family Bharata B Rao
  2015-11-23 11:54 ` [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Peter Krempa
  10 siblings, 1 reply; 36+ messages in thread
From: Bharata B Rao @ 2015-11-20 12:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: aik, Bharata B Rao, mdroth, agraf, pbonzini, qemu-ppc, tyreld,
	nfont, imammedo, afaerber, david

Support hot removal of CPUs for sPAPR guests by sending the hot-unplug
notification to the guest via an EPOW interrupt. Release the vCPU object
after CPU hot unplug so that the vCPU fd can be parked and reused.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4434d45..6dca553 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2401,11 +2401,82 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
     }
 }
 
+static void spapr_cpu_destroy(PowerPCCPU *cpu)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
+
+    xics_cpu_destroy(spapr->icp, cpu);
+    qemu_unregister_reset(spapr_cpu_reset, cpu);
+}
+
+static void spapr_cpu_release(DeviceState *dev, void *opaque)
+{
+    CPUState *cs;
+    int i;
+    int id = ppc_get_vcpu_dt_id(POWERPC_CPU(CPU(dev)));
+
+    for (i = id; i < id + smp_threads; i++) {
+        CPU_FOREACH(cs) {
+            PowerPCCPU *cpu = POWERPC_CPU(cs);
+
+            if (i == ppc_get_vcpu_dt_id(cpu)) {
+                spapr_cpu_destroy(cpu);
+                cpu_remove(cs);
+            }
+        }
+    }
+}
+
+static int spapr_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                             Error **errp)
+{
+    CPUState *cs = CPU(dev);
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    int id = ppc_get_vcpu_dt_id(cpu);
+    int smt = kvmppc_smt_threads();
+    sPAPRDRConnector *drc =
+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
+    sPAPRDRConnectorClass *drck;
+    Error *local_err = NULL;
+
+    /*
+     * SMT threads return from here, only main thread (core) will
+     * continue and signal hot unplug event to the guest.
+     */
+    if ((id % smt) != 0) {
+        return 0;
+    }
+    g_assert(drc);
+
+    drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+    drck->detach(drc, dev, spapr_cpu_release, NULL, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return -1;
+    }
+
+    /*
+     * In addition to hotplugged CPUs, send the hot-unplug notification
+     * interrupt to the guest for coldplugged CPUs started via -device
+     * option too.
+     */
+    spapr_hotplug_req_remove_by_index(drc);
+    return 0;
+}
+
 static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
+
     if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
         error_setg(errp, "Memory hot unplug not supported by sPAPR");
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        if (!smc->dr_cpu_enabled) {
+            error_setg(errp, "CPU hot unplug not supported on this machine");
+            return;
+        }
+        spapr_cpu_unplug(hotplug_dev, dev, errp);
     }
 }
 
-- 
2.1.0


* [Qemu-devel] [PATCH v5 10/10] target-ppc: Enable CPU hotplug for POWER8 CPU family
  2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
                   ` (8 preceding siblings ...)
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 09/10] spapr: CPU hot unplug support Bharata B Rao
@ 2015-11-20 12:54 ` Bharata B Rao
  2015-11-23 11:54 ` [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Peter Krempa
  10 siblings, 0 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-11-20 12:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: aik, Bharata B Rao, mdroth, agraf, pbonzini, qemu-ppc, tyreld,
	nfont, imammedo, afaerber, david

Support CPU hotplug on POWER8 by enabling device_add semantics.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 target-ppc/translate_init.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 245d73a..45d1b7c 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8207,6 +8207,8 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
     dc->fw_name = "PowerPC,POWER8";
     dc->desc = "POWER8";
     dc->props = powerpc_servercpu_properties;
+    dc->cannot_instantiate_with_device_add_yet = false;
+
     pcc->pvr_match = ppc_pvr_match_power8;
     pcc->pcr_mask = PCR_COMPAT_2_05 | PCR_COMPAT_2_06;
     pcc->init_proc = init_proc_POWER8;
-- 
2.1.0


* Re: [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug
  2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
                   ` (9 preceding siblings ...)
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 10/10] target-ppc: Enable CPU hotplug for POWER8 CPU family Bharata B Rao
@ 2015-11-23 11:54 ` Peter Krempa
  2015-11-23 13:04   ` Christian Borntraeger
  10 siblings, 1 reply; 36+ messages in thread
From: Peter Krempa @ 2015-11-23 11:54 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, imammedo, aik, agraf, qemu-devel, qemu-ppc, tyreld, nfont,
	pbonzini, afaerber, david


On Fri, Nov 20, 2015 at 18:24:29 +0530, Bharata B Rao wrote:
> This patchset adds CPU hotplug support for sPAPR PowerPC guests using
> device_add and device_del commands
> 
> (qemu) device_add POWER8-powerpc64-cpu,id=cpu0

Is there a reason why this uses 'device_add' rather than the 'cpu_add'
command? Libvirt uses two separate approaches already. Due to legacy
reasons we support the HMP 'cpu_set' command, and lately we added
support for QMP 'cpu-add'. Using device_add here will introduce a
different approach and will require yet another compatibility layer in
libvirt to support this.

Peter




* Re: [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug
  2015-11-23 11:54 ` [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Peter Krempa
@ 2015-11-23 13:04   ` Christian Borntraeger
  2015-12-01  1:43     ` David Gibson
  0 siblings, 1 reply; 36+ messages in thread
From: Christian Borntraeger @ 2015-11-23 13:04 UTC (permalink / raw)
  To: Peter Krempa, Bharata B Rao
  Cc: agraf, aik, qemu-devel, mdroth, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber, david

On 11/23/2015 12:54 PM, Peter Krempa wrote:
> On Fri, Nov 20, 2015 at 18:24:29 +0530, Bharata B Rao wrote:
>> This patchset adds CPU hotplug support for sPAPR PowerPC guests using
>> device_add and device_del commands
>>
>> (qemu) device_add POWER8-powerpc64-cpu,id=cpu0
> 
> Is there a reason why this uses 'device_add' rather than the 'cpu_add'
> command? Libvirt uses two separate approaches already. Due to legacy
> reasons we support the HMP 'cpu_set' command, and lately we added
> support for QMP 'cpu-add'. Using device_add here will introduce a
> different approach and will require yet another compatibility layer in
> libvirt to support this.

s390 and powerpc both started with cpu_add patches. Andreas Faerber then
suggested implementing only device_add. This was apparently discussed
at the last KVM forum.

Christian


* Re: [Qemu-devel] [PATCH v5 04/10] cpu: Reclaim vCPU objects
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 04/10] cpu: Reclaim vCPU objects Bharata B Rao
@ 2015-11-30  7:30   ` Alexey Kardashevskiy
  2015-12-02  3:50     ` Bharata B Rao
  2015-12-01  0:55   ` David Gibson
  1 sibling, 1 reply; 36+ messages in thread
From: Alexey Kardashevskiy @ 2015-11-30  7:30 UTC (permalink / raw)
  To: Bharata B Rao, qemu-devel
  Cc: Zhu Guihua, mdroth, agraf, Chen Fan, pbonzini, qemu-ppc, tyreld,
	nfont, Gu Zheng, imammedo, afaerber, david

On 11/20/2015 11:54 PM, Bharata B Rao wrote:
> From: Gu Zheng <guz.fnst@cn.fujitsu.com>
>
> In order to deal well with the kvm vcpus (which can not be removed without any
> protection), we do not close KVM vcpu fd, just record and mark it as stopped
> into a list, so that we can reuse it for the appending cpu hot-add request if
> possible. It is also the approach that kvm guys suggested:
> https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
>
> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
>                 [- Explicit CPU_REMOVE() from qemu_kvm/tcg_destroy_vcpu()
>                    isn't needed as it is done from cpu_exec_exit()]
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>   cpus.c               | 41 +++++++++++++++++++++++++++++++++++++
>   include/qom/cpu.h    | 10 +++++++++
>   include/sysemu/kvm.h |  1 +
>   kvm-all.c            | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++-
>   kvm-stub.c           |  5 +++++
>   5 files changed, 113 insertions(+), 1 deletion(-)
>
> diff --git a/cpus.c b/cpus.c
> index 877bd70..af2b274 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -953,6 +953,21 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
>       qemu_cpu_kick(cpu);
>   }
>
> +static void qemu_kvm_destroy_vcpu(CPUState *cpu)
> +{
> +    if (kvm_destroy_vcpu(cpu) < 0) {
> +        error_report("kvm_destroy_vcpu failed.\n");
> +        exit(EXIT_FAILURE);
> +    }
> +
> +    object_unparent(OBJECT(cpu));
> +}
> +
> +static void qemu_tcg_destroy_vcpu(CPUState *cpu)
> +{
> +    object_unparent(OBJECT(cpu));
> +}
> +
>   static void flush_queued_work(CPUState *cpu)
>   {
>       struct qemu_work_item *wi;
> @@ -1053,6 +1068,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
>               }
>           }
>           qemu_kvm_wait_io_event(cpu);
> +        if (cpu->exit && !cpu_can_run(cpu)) {
> +            qemu_kvm_destroy_vcpu(cpu);
> +            qemu_mutex_unlock(&qemu_global_mutex);


Nit: should this be qemu_mutex_unlock_iothread()? Or is it important for
iothread_locked to remain "true"? It does not seem to be used much, though.



-- 
Alexey


* Re: [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU toplogies with partially filled cores
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU toplogies with partially filled cores Bharata B Rao
@ 2015-12-01  0:37   ` David Gibson
  2015-12-02  3:04     ` Bharata B Rao
  2015-12-02 13:52   ` Igor Mammedov
  1 sibling, 1 reply; 36+ messages in thread
From: David Gibson @ 2015-12-01  0:37 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber


On Fri, Nov 20, 2015 at 06:24:30PM +0530, Bharata B Rao wrote:
> Prevent guests from booting with CPU topologies that have partially
> filled CPU cores or can result in partially filled CPU cores after
> CPU hotplug like
> 
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

I may have missed a bit of the discussion leading up to this.  What
was the rationale for still allowing partially filled sockets (and
otherwise doing things at core rather than socket level)?

> ---
>  vl.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/vl.c b/vl.c
> index 7d993a5..23a1a1e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1248,6 +1248,15 @@ static void smp_parse(QemuOpts *opts)
>              exit(1);
>          }
>  
> +        if (cpus % threads || max_cpus % threads) {
> +            error_report("cpu topology: "
> +                         "sockets (%u) cores (%u) threads (%u) with "
> +                         "smp_cpus (%u) maxcpus (%u) "
> +                         "will result in partially filled cores",
> +                         sockets, cores, threads, cpus, max_cpus);
> +            exit(1);
> +        }
> +
>          smp_cpus = cpus;
>          smp_cores = cores > 0 ? cores : 1;
>          smp_threads = threads > 0 ? threads : 1;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [Qemu-devel] [PATCH v5 02/10] exec: Remove cpu from cpus list during cpu_exec_exit()
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 02/10] exec: Remove cpu from cpus list during cpu_exec_exit() Bharata B Rao
@ 2015-12-01  0:44   ` David Gibson
  2015-12-02  3:47     ` Bharata B Rao
  0 siblings, 1 reply; 36+ messages in thread
From: David Gibson @ 2015-12-01  0:44 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber


On Fri, Nov 20, 2015 at 06:24:31PM +0530, Bharata B Rao wrote:
> CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
> should be removed from cpu_exec_exit().
> 
> cpu_exec_init() is called from generic CPU::instance_finalize and some
> archs like PowerPC call it from CPU unrealizefn. So ensure that we
> dequeue the cpu only once.
> 
> Now -1 value for cpu->cpu_index indicates that we have already dequeued
> the cpu for CONFIG_USER_ONLY case also.

It's not clear to me if you're intending this just as an interim step
or not.  Surely we should fix the existing code to be consistent about
where the QTAILQ_REMOVE is done, rather than using a special -1 flag?

> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  exec.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/exec.c b/exec.c
> index b09f18b..fbac85b 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -593,6 +593,7 @@ void cpu_exec_exit(CPUState *cpu)
>          return;
>      }
>  
> +    QTAILQ_REMOVE(&cpus, cpu, node);
>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
>      cpu->cpu_index = -1;
>  }
> @@ -611,6 +612,15 @@ static int cpu_get_free_index(Error **errp)
>  
>  void cpu_exec_exit(CPUState *cpu)
>  {
> +    cpu_list_lock();
> +    if (cpu->cpu_index == -1) {

.. especially since as far as I can tell you're only testing for the
-1 value here, which is in the USER_ONLY version of this function.


> +        cpu_list_unlock();
> +        return;
> +    }
> +
> +    QTAILQ_REMOVE(&cpus, cpu, node);

Again, since this is the user-only version of the function, is this
redundant with the removals in linux-user/{main.c,syscall.c}?

> +    cpu->cpu_index = -1;
> +    cpu_list_unlock();
>  }
>  #endif
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [Qemu-devel] [PATCH v5 03/10] exec: Do vmstate unregistration from cpu_exec_exit()
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 03/10] exec: Do vmstate unregistration from cpu_exec_exit() Bharata B Rao
@ 2015-12-01  0:46   ` David Gibson
  0 siblings, 0 replies; 36+ messages in thread
From: David Gibson @ 2015-12-01  0:46 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber


On Fri, Nov 20, 2015 at 06:24:32PM +0530, Bharata B Rao wrote:
> cpu_exec_init() does vmstate_register and register_savevm for the CPU device.
> These need to be undone from cpu_exec_exit(). These changes are needed to
> support CPU hot removal and also to correctly fail hotplug attempts
> beyond max_cpus.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  exec.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/exec.c b/exec.c
> index fbac85b..47f03c0 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -588,6 +588,8 @@ static int cpu_get_free_index(Error **errp)
>  
>  void cpu_exec_exit(CPUState *cpu)
>  {
> +    CPUClass *cc = CPU_GET_CLASS(cpu);
> +
>      if (cpu->cpu_index == -1) {
>          /* cpu_index was never allocated by this @cpu or was already freed. */
>          return;
> @@ -596,6 +598,15 @@ void cpu_exec_exit(CPUState *cpu)
>      QTAILQ_REMOVE(&cpus, cpu, node);
>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
>      cpu->cpu_index = -1;
> +    if (cc->vmsd != NULL) {
> +        vmstate_unregister(NULL, cc->vmsd, cpu);
> +    }
> +#if defined(CPU_SAVE_VERSION)
> +    unregister_savevm(NULL, "cpu", cpu->env_ptr);
> +#endif
> +    if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
> +        vmstate_unregister(NULL, &vmstate_cpu_common, cpu);
> +    }
>  }
>  #else
>  
> @@ -612,6 +623,8 @@ static int cpu_get_free_index(Error **errp)
>  
>  void cpu_exec_exit(CPUState *cpu)
>  {
> +    CPUClass *cc = CPU_GET_CLASS(cpu);
> +
>      cpu_list_lock();
>      if (cpu->cpu_index == -1) {
>          cpu_list_unlock();
> @@ -621,6 +634,13 @@ void cpu_exec_exit(CPUState *cpu)
>      QTAILQ_REMOVE(&cpus, cpu, node);
>      cpu->cpu_index = -1;
>      cpu_list_unlock();
> +
> +    if (cc->vmsd != NULL) {
> +        vmstate_unregister(NULL, cc->vmsd, cpu);
> +    }
> +    if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
> +        vmstate_unregister(NULL, &vmstate_cpu_common, cpu);
> +    }
>  }
>  #endif
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [Qemu-devel] [PATCH v5 04/10] cpu: Reclaim vCPU objects
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 04/10] cpu: Reclaim vCPU objects Bharata B Rao
  2015-11-30  7:30   ` Alexey Kardashevskiy
@ 2015-12-01  0:55   ` David Gibson
  2015-12-02  5:28     ` Bharata B Rao
  1 sibling, 1 reply; 36+ messages in thread
From: David Gibson @ 2015-12-01  0:55 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: Zhu Guihua, mdroth, aik, agraf, qemu-devel, Chen Fan, pbonzini,
	qemu-ppc, tyreld, nfont, Gu Zheng, imammedo, afaerber


On Fri, Nov 20, 2015 at 06:24:33PM +0530, Bharata B Rao wrote:
> From: Gu Zheng <guz.fnst@cn.fujitsu.com>
> 
> In order to deal well with the kvm vcpus (which can not be removed without any
> protection), we do not close KVM vcpu fd, just record and mark it as stopped
> into a list, so that we can reuse it for the appending cpu hot-add request if
> possible. It is also the approach that kvm guys suggested:
> https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
> 
> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
>                [- Explicit CPU_REMOVE() from qemu_kvm/tcg_destroy_vcpu()
>                   isn't needed as it is done from cpu_exec_exit()]
> ---
>  cpus.c               | 41 +++++++++++++++++++++++++++++++++++++
>  include/qom/cpu.h    | 10 +++++++++
>  include/sysemu/kvm.h |  1 +
>  kvm-all.c            | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++-
>  kvm-stub.c           |  5 +++++
>  5 files changed, 113 insertions(+), 1 deletion(-)
> 
> diff --git a/cpus.c b/cpus.c
> index 877bd70..af2b274 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -953,6 +953,21 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
>      qemu_cpu_kick(cpu);
>  }
>  
> +static void qemu_kvm_destroy_vcpu(CPUState *cpu)
> +{
> +    if (kvm_destroy_vcpu(cpu) < 0) {
> +        error_report("kvm_destroy_vcpu failed.\n");
> +        exit(EXIT_FAILURE);
> +    }
> +
> +    object_unparent(OBJECT(cpu));
> +}
> +
> +static void qemu_tcg_destroy_vcpu(CPUState *cpu)
> +{
> +    object_unparent(OBJECT(cpu));
> +}
> +
>  static void flush_queued_work(CPUState *cpu)
>  {
>      struct qemu_work_item *wi;
> @@ -1053,6 +1068,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
>              }
>          }
>          qemu_kvm_wait_io_event(cpu);
> +        if (cpu->exit && !cpu_can_run(cpu)) {
> +            qemu_kvm_destroy_vcpu(cpu);
> +            qemu_mutex_unlock(&qemu_global_mutex);

This looks like a change to locking semantics, and I can't see the
connection to the described purpose of the patch.

> +            return NULL;
> +        }
>      }
>  
>      return NULL;
> @@ -1108,6 +1128,7 @@ static void tcg_exec_all(void);
>  static void *qemu_tcg_cpu_thread_fn(void *arg)
>  {
>      CPUState *cpu = arg;
> +    CPUState *remove_cpu = NULL;
>  
>      rcu_register_thread();
>  
> @@ -1145,6 +1166,16 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
>              }
>          }
>          qemu_tcg_wait_io_event(QTAILQ_FIRST(&cpus));
> +        CPU_FOREACH(cpu) {
> +            if (cpu->exit && !cpu_can_run(cpu)) {
> +                remove_cpu = cpu;
> +                break;
> +            }
> +        }
> +        if (remove_cpu) {
> +            qemu_tcg_destroy_vcpu(remove_cpu);
> +            remove_cpu = NULL;
> +        }

Any particular reason to only clean up one cpu per iteration?

Also, any particular reason this isn't folded into tcg_exec_all with
the other cpu->exit logic?

>      }
>  
>      return NULL;
> @@ -1301,6 +1332,13 @@ void resume_all_vcpus(void)
>      }
>  }
>  
> +void cpu_remove(CPUState *cpu)
> +{
> +    cpu->stop = true;
> +    cpu->exit = true;
> +    qemu_cpu_kick(cpu);
> +}
> +
>  /* For temporary buffers for forming a name */
>  #define VCPU_THREAD_NAME_SIZE 16
>  
> @@ -1506,6 +1544,9 @@ static void tcg_exec_all(void)
>                  break;
>              }
>          } else if (cpu->stop || cpu->stopped) {
> +            if (cpu->exit) {
> +                next_cpu = CPU_NEXT(cpu);
> +            }
>              break;
>          }
>      }
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 51a1323..67e05b0 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -223,6 +223,7 @@ struct kvm_run;
>   * @halted: Nonzero if the CPU is in suspended state.
>   * @stop: Indicates a pending stop request.
>   * @stopped: Indicates the CPU has been artificially stopped.
> + * @exit: Indicates the CPU has exited due to an unplug operation.
>   * @crash_occurred: Indicates the OS reported a crash (panic) for this CPU
>   * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
>   *           CPU and return to its top level loop.
> @@ -274,6 +275,7 @@ struct CPUState {
>      bool created;
>      bool stop;
>      bool stopped;
> +    bool exit;
>      bool crash_occurred;
>      bool exit_request;
>      uint32_t interrupt_request;
> @@ -696,6 +698,14 @@ void cpu_exit(CPUState *cpu);
>  void cpu_resume(CPUState *cpu);
>  
>  /**
> + * cpu_remove:
> + * @cpu: The CPU to remove.
> + *
> + * Requests the CPU to be removed.
> + */
> +void cpu_remove(CPUState *cpu);
> +
> +/**
>   * qemu_init_vcpu:
>   * @cpu: The vCPU to initialize.
>   *
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index b31f325..dd1b783 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -206,6 +206,7 @@ int kvm_has_intx_set_mask(void);
>  
>  int kvm_init_vcpu(CPUState *cpu);
>  int kvm_cpu_exec(CPUState *cpu);
> +int kvm_destroy_vcpu(CPUState *cpu);
>  
>  #ifdef NEED_CPU_H
>  
> diff --git a/kvm-all.c b/kvm-all.c
> index c648b81..3befc59 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -60,6 +60,12 @@
>  
>  #define KVM_MSI_HASHTAB_SIZE    256
>  
> +struct KVMParkedVcpu {
> +    unsigned long vcpu_id;
> +    int kvm_fd;
> +    QLIST_ENTRY(KVMParkedVcpu) node;
> +};
> +
>  struct KVMState
>  {
>      AccelState parent_obj;
> @@ -93,6 +99,7 @@ struct KVMState
>      QTAILQ_HEAD(msi_hashtab, KVMMSIRoute) msi_hashtab[KVM_MSI_HASHTAB_SIZE];
>  #endif
>      KVMMemoryListener memory_listener;
> +    QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
>  };
>  
>  KVMState *kvm_state;
> @@ -235,6 +242,53 @@ static int kvm_set_user_memory_region(KVMMemoryListener *kml, KVMSlot *slot)
>      return kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
>  }
>  
> +int kvm_destroy_vcpu(CPUState *cpu)
> +{
> +    KVMState *s = kvm_state;
> +    long mmap_size;
> +    struct KVMParkedVcpu *vcpu = NULL;
> +    int ret = 0;
> +
> +    DPRINTF("kvm_destroy_vcpu\n");
> +
> +    mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
> +    if (mmap_size < 0) {
> +        ret = mmap_size;
> +        DPRINTF("KVM_GET_VCPU_MMAP_SIZE failed\n");
> +        goto err;
> +    }
> +
> +    ret = munmap(cpu->kvm_run, mmap_size);
> +    if (ret < 0) {
> +        goto err;
> +    }
> +
> +    vcpu = g_malloc0(sizeof(*vcpu));
> +    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
> +    vcpu->kvm_fd = cpu->kvm_fd;
> +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
> +err:
> +    return ret;
> +}
> +
> +static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
> +{
> +    struct KVMParkedVcpu *cpu;
> +
> +    QLIST_FOREACH(cpu, &s->kvm_parked_vcpus, node) {
> +        if (cpu->vcpu_id == vcpu_id) {
> +            int kvm_fd;
> +
> +            QLIST_REMOVE(cpu, node);
> +            kvm_fd = cpu->kvm_fd;
> +            g_free(cpu);
> +            return kvm_fd;
> +        }
> +    }

Hmm... using a simple list here does mean that unplugging, then
replugging, all vCPUs (except one) would be an O(n^2) operation.
That's probably still all right, I guess.

> +
> +    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
> +}
> +
>  int kvm_init_vcpu(CPUState *cpu)
>  {
>      KVMState *s = kvm_state;
> @@ -243,7 +297,7 @@ int kvm_init_vcpu(CPUState *cpu)
>  
>      DPRINTF("kvm_init_vcpu\n");
>  
> -    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)kvm_arch_vcpu_id(cpu));
> +    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
>      if (ret < 0) {
>          DPRINTF("kvm_create_vcpu failed\n");
>          goto err;
> @@ -1468,6 +1522,7 @@ static int kvm_init(MachineState *ms)
>  #ifdef KVM_CAP_SET_GUEST_DEBUG
>      QTAILQ_INIT(&s->kvm_sw_breakpoints);
>  #endif
> +    QLIST_INIT(&s->kvm_parked_vcpus);
>      s->vmfd = -1;
>      s->fd = qemu_open("/dev/kvm", O_RDWR);
>      if (s->fd == -1) {
> diff --git a/kvm-stub.c b/kvm-stub.c
> index dc97a5e..0b39456 100644
> --- a/kvm-stub.c
> +++ b/kvm-stub.c
> @@ -32,6 +32,11 @@ bool kvm_allowed;
>  bool kvm_readonly_mem_allowed;
>  bool kvm_ioeventfd_any_length_allowed;
>  
> +int kvm_destroy_vcpu(CPUState *cpu)
> +{
> +    return -ENOSYS;
> +}
> +
>  int kvm_init_vcpu(CPUState *cpu)
>  {
>      return -ENOSYS;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v5 05/10] cpu: Add a sync version of cpu_remove()
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 05/10] cpu: Add a sync version of cpu_remove() Bharata B Rao
@ 2015-12-01  0:57   ` David Gibson
  0 siblings, 0 replies; 36+ messages in thread
From: David Gibson @ 2015-12-01  0:57 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Fri, Nov 20, 2015 at 06:24:34PM +0530, Bharata B Rao wrote:
> This sync API will be used by the CPU hotplug code to wait for the CPU to
> completely get removed before flagging the failure to the device_add
> command.
> 
> Sync version of this call is needed to correctly recover from CPU
> realization failures when ->plug() handler fails.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  cpus.c            | 12 ++++++++++++
>  include/qom/cpu.h |  8 ++++++++
>  2 files changed, 20 insertions(+)
> 
> diff --git a/cpus.c b/cpus.c
> index af2b274..c2444ba 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1070,6 +1070,8 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
>          qemu_kvm_wait_io_event(cpu);
>          if (cpu->exit && !cpu_can_run(cpu)) {
>              qemu_kvm_destroy_vcpu(cpu);
> +            cpu->created = false;
> +            qemu_cond_signal(&qemu_cpu_cond);
>              qemu_mutex_unlock(&qemu_global_mutex);
>              return NULL;
>          }
> @@ -1174,6 +1176,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
>          }
>          if (remove_cpu) {
>              qemu_tcg_destroy_vcpu(remove_cpu);
> +            cpu->created = false;
> +            qemu_cond_signal(&qemu_cpu_cond);
>              remove_cpu = NULL;
>          }
>      }
> @@ -1339,6 +1343,14 @@ void cpu_remove(CPUState *cpu)
>      qemu_cpu_kick(cpu);
>  }
>  
> +void cpu_remove_sync(CPUState *cpu)
> +{
> +    cpu_remove(cpu);
> +    while (cpu->created) {
> +        qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
> +    }
> +}
> +
>  /* For temporary buffers for forming a name */
>  #define VCPU_THREAD_NAME_SIZE 16
>  
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 67e05b0..7fc6696 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -705,6 +705,14 @@ void cpu_resume(CPUState *cpu);
>   */
>  void cpu_remove(CPUState *cpu);
>  
> + /**
> + * cpu_remove_sync:
> + * @cpu: The CPU to remove.
> + *
> + * Requests the CPU to be removed and waits till it is removed.
> + */
> +void cpu_remove_sync(CPUState *cpu);
> +
>  /**
>   * qemu_init_vcpu:
>   * @cpu: The vCPU to initialize.


* Re: [Qemu-devel] [PATCH v5 06/10] xics_kvm: Add cpu_destroy method to XICS
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 06/10] xics_kvm: Add cpu_destroy method to XICS Bharata B Rao
@ 2015-12-01  1:01   ` David Gibson
  2015-12-02  5:45     ` Bharata B Rao
  0 siblings, 1 reply; 36+ messages in thread
From: David Gibson @ 2015-12-01  1:01 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Fri, Nov 20, 2015 at 06:24:35PM +0530, Bharata B Rao wrote:
> XICS is setup for each CPU during initialization. Provide a routine
> to undo the same when CPU is unplugged.
> 
> This allows reboot of a VM that has undergone CPU hotplug and unplug
> to work correctly.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/intc/xics.c        | 12 ++++++++++++
>  hw/intc/xics_kvm.c    | 13 +++++++++++--
>  include/hw/ppc/xics.h |  2 ++
>  3 files changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index 9ff5796..e1161b2 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -44,6 +44,18 @@ static int get_cpu_index_by_dt_id(int cpu_dt_id)
>      return -1;
>  }
>  
> +void xics_cpu_destroy(XICSState *icp, PowerPCCPU *cpu)
> +{
> +    CPUState *cs = CPU(cpu);
> +    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
> +
> +    assert(cs->cpu_index < icp->nr_servers);
> +
> +    if (info->cpu_destroy) {
> +        info->cpu_destroy(icp, cpu);
> +    }
> +}
> +
>  void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
>  {
>      CPUState *cs = CPU(cpu);
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index d58729c..cb96f69 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -331,6 +331,8 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
>          abort();
>      }
>  
> +    ss->cs = cs;
> +

Actually, it probably makes more sense to move the management of
ss->cs to the base xics code.  There's nothing really KVM-specific
about it.

>      /*
>       * If we are reusing a parked vCPU fd corresponding to the CPU
>       * which was hot-removed earlier we don't have to renable
> @@ -343,8 +345,6 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
>      if (icpkvm->kernel_xics_fd != -1) {
>          int ret;
>  
> -        ss->cs = cs;
> -
>          ret = kvm_vcpu_enable_cap(cs, KVM_CAP_IRQ_XICS, 0,
>                                    icpkvm->kernel_xics_fd, kvm_arch_vcpu_id(cs));
>          if (ret < 0) {
> @@ -356,6 +356,14 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
>      }
>  }
>  
> +static void xics_kvm_cpu_destroy(XICSState *icp, PowerPCCPU *cpu)
> +{
> +    CPUState *cs = CPU(cpu);
> +    ICPState *ss = &icp->ss[cs->cpu_index];
> +
> +    ss->cs = NULL;
> +}
> +
>  static void xics_kvm_set_nr_irqs(XICSState *icp, uint32_t nr_irqs, Error **errp)
>  {
>      icp->nr_irqs = icp->ics->nr_irqs = nr_irqs;
> @@ -486,6 +494,7 @@ static void xics_kvm_class_init(ObjectClass *oc, void *data)
>  
>      dc->realize = xics_kvm_realize;
>      xsc->cpu_setup = xics_kvm_cpu_setup;
> +    xsc->cpu_destroy = xics_kvm_cpu_destroy;
>      xsc->set_nr_irqs = xics_kvm_set_nr_irqs;
>      xsc->set_nr_servers = xics_kvm_set_nr_servers;
>  }
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 355a966..2faad48 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -68,6 +68,7 @@ struct XICSStateClass {
>      DeviceClass parent_class;
>  
>      void (*cpu_setup)(XICSState *icp, PowerPCCPU *cpu);
> +    void (*cpu_destroy)(XICSState *icp, PowerPCCPU *cpu);
>      void (*set_nr_irqs)(XICSState *icp, uint32_t nr_irqs, Error **errp);
>      void (*set_nr_servers)(XICSState *icp, uint32_t nr_servers, Error **errp);
>  };
> @@ -166,5 +167,6 @@ int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align);
>  void xics_free(XICSState *icp, int irq, int num);
>  
>  void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu);
> +void xics_cpu_destroy(XICSState *icp, PowerPCCPU *cpu);
>  
>  #endif /* __XICS_H__ */


* Re: [Qemu-devel] [PATCH v5 07/10] spapr: Enable CPU hotplug for pseries-2.5 and add CPU DRC DT entries
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 07/10] spapr: Enable CPU hotplug for pseries-2.5 and add CPU DRC DT entries Bharata B Rao
@ 2015-12-01  1:06   ` David Gibson
  2015-12-02  5:47     ` Bharata B Rao
  0 siblings, 1 reply; 36+ messages in thread
From: David Gibson @ 2015-12-01  1:06 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Fri, Nov 20, 2015 at 06:24:36PM +0530, Bharata B Rao wrote:
> Start supporting CPU hotplug from pseries-2.5 onwards. Add CPU
> DRC (Dynamic Resource Connector) device tree entries.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         | 23 +++++++++++++++++++++++
>  include/hw/ppc/spapr.h |  1 +
>  2 files changed, 24 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 030ee35..814b0a6 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -973,6 +973,16 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr,
>          _FDT(spapr_drc_populate_dt(fdt, 0, NULL, SPAPR_DR_CONNECTOR_TYPE_LMB));
>      }
>  
> +    if (smc->dr_cpu_enabled) {
> +        int offset = fdt_path_offset(fdt, "/cpus");
> +        ret = spapr_drc_populate_dt(fdt, offset, NULL,
> +                                    SPAPR_DR_CONNECTOR_TYPE_CPU);
> +        if (ret < 0) {
> +            fprintf(stderr, "Couldn't set up CPU DR device tree properties\n");
> +            exit(1);
> +        }
> +    }
> +
>      _FDT((fdt_pack(fdt)));
>  
>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> @@ -1727,6 +1737,8 @@ static void ppc_spapr_init(MachineState *machine)
>      long load_limit, fw_size;
>      bool kernel_le = false;
>      char *filename;
> +    int smt = kvmppc_smt_threads();
> +    int smp_max_cores = DIV_ROUND_UP(max_cpus, smp_threads);


Given your earlier patch to disallow partially filled cores, a
DIV_ROUND_UP should be the same as a straight divide, yes?

>  
>      msi_supported = true;
>  
> @@ -1793,6 +1805,15 @@ static void ppc_spapr_init(MachineState *machine)
>          spapr_validate_node_memory(machine);
>      }
>  
> +    if (smc->dr_cpu_enabled) {
> +        for (i = 0; i < smp_max_cores; i++) {
> +            sPAPRDRConnector *drc =
> +                spapr_dr_connector_new(OBJECT(spapr),
> +                                       SPAPR_DR_CONNECTOR_TYPE_CPU, i * smt);
> +            qemu_register_reset(spapr_drc_reset, drc);
> +        }
> +    }
> +
>      /* init CPUs */
>      if (machine->cpu_model == NULL) {
>          machine->cpu_model = kvm_enabled() ? "host" : "POWER7";
> @@ -2277,6 +2298,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->cpu_index_to_socket_id = spapr_cpu_index_to_socket_id;
>  
>      smc->dr_lmb_enabled = false;
> +    smc->dr_cpu_enabled = false;
>      fwc->get_dev_path = spapr_get_fw_dev_path;
>      nc->nmi_monitor_handler = spapr_nmi;
>  }
> @@ -2443,6 +2465,7 @@ static void spapr_machine_2_5_class_init(ObjectClass *oc, void *data)
>      mc->alias = "pseries";
>      mc->is_default = 1;
>      smc->dr_lmb_enabled = true;
> +    smc->dr_cpu_enabled = true;
>  }

Given where we are in the release cycle, this will need to move to 2.6, obviously.

>  static const TypeInfo spapr_machine_2_5_info = {
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 5baa906..716d7ad 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -36,6 +36,7 @@ struct sPAPRMachineClass {
>  
>      /*< public >*/
>      bool dr_lmb_enabled; /* enable dynamic-reconfig/hotplug of LMBs */
> +    bool dr_cpu_enabled; /* enable dynamic-reconfig/hotplug of CPUs */
>  };
>  
>  /**


* Re: [Qemu-devel] [PATCH v5 08/10] spapr: CPU hotplug support
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 08/10] spapr: CPU hotplug support Bharata B Rao
@ 2015-12-01  1:30   ` David Gibson
  2015-12-01  4:48     ` Bharata B Rao
  0 siblings, 1 reply; 36+ messages in thread
From: David Gibson @ 2015-12-01  1:30 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Fri, Nov 20, 2015 at 06:24:37PM +0530, Bharata B Rao wrote:
> Support CPU hotplug via device-add command. Set up device tree
> entries for the hotplugged CPU core and use the existing EPOW event
> infrastructure to send CPU hotplug notification to the guest.
> 
> Create only cores explicitly from boot path as well as hotplug path
> and let the ->plug() handler of the core create the threads of the core.
> 
> Also support cold plugged CPUs that are specified by -device option
> on cmdline.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c              | 156 +++++++++++++++++++++++++++++++++++++++++++-
>  hw/ppc/spapr_events.c       |   3 +
>  hw/ppc/spapr_rtas.c         |  24 +++++++
>  target-ppc/translate_init.c |   8 +++
>  4 files changed, 189 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 814b0a6..4434d45 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -596,6 +596,18 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
>      size_t page_sizes_prop_size;
>      uint32_t vcpus_per_socket = smp_threads * smp_cores;
>      uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
> +    sPAPRDRConnector *drc;
> +    sPAPRDRConnectorClass *drck;
> +    int drc_index;
> +
> +    if (smc->dr_cpu_enabled) {
> +        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
> +        g_assert(drc);
> +        drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +        drc_index = drck->get_index(drc);
> +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
> +    }
>  
>      /* Note: we keep CI large pages off for now because a 64K capable guest
>       * provisioned with large pages might otherwise try to map a qemu
> @@ -1739,6 +1751,7 @@ static void ppc_spapr_init(MachineState *machine)
>      char *filename;
>      int smt = kvmppc_smt_threads();
>      int smp_max_cores = DIV_ROUND_UP(max_cpus, smp_threads);
> +    int spapr_smp_cores = DIV_ROUND_UP(smp_cpus, smp_threads);
>  
>      msi_supported = true;
>  
> @@ -1818,7 +1831,7 @@ static void ppc_spapr_init(MachineState *machine)
>      if (machine->cpu_model == NULL) {
>          machine->cpu_model = kvm_enabled() ? "host" : "POWER7";
>      }
> -    for (i = 0; i < smp_cpus; i++) {
> +    for (i = 0; i < spapr_smp_cores; i++) {
>          cpu = cpu_ppc_init(machine->cpu_model);
>          if (cpu == NULL) {
>              fprintf(stderr, "Unable to find PowerPC CPU definition\n");
> @@ -2207,10 +2220,135 @@ out:
>      error_propagate(errp, local_err);
>  }
>  
> +static void *spapr_populate_hotplug_cpu_dt(DeviceState *dev, CPUState *cs,
> +                                           int *fdt_offset,
> +                                           sPAPRMachineState *spapr)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    DeviceClass *dc = DEVICE_GET_CLASS(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    void *fdt;
> +    int offset, fdt_size;
> +    char *nodename;
> +
> +    fdt = create_device_tree(&fdt_size);
> +    nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
> +    offset = fdt_add_subnode(fdt, 0, nodename);
> +
> +    spapr_populate_cpu_dt(cs, fdt, offset, spapr);
> +    g_free(nodename);
> +
> +    *fdt_offset = offset;
> +    return fdt;
> +}
> +
> +static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                            Error **errp)
> +{
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
> +    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
> +    CPUState *cs = CPU(dev);
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +    sPAPRDRConnectorClass *drck;
> +    int smt = kvmppc_smt_threads();
> +    Error *local_err = NULL;
> +    void *fdt = NULL;
> +    int i, fdt_offset = 0;
> +
> +    /* Set NUMA node for the added CPUs  */
> +    for (i = 0; i < nb_numa_nodes; i++) {
> +        if (test_bit(cs->cpu_index, numa_info[i].node_cpu)) {
> +            cs->numa_node = i;
> +            break;
> +        }
> +    }
> +
> +    /*
> +     * Currently CPU core and threads of a core aren't really different
> +     * from QEMU point of view since all of them are just CPU devices. Hence
> +     * there is no separate realize routines for cores and threads.
> +     * We use the id check below to do things differently for cores and threads.
> +     *
> +     * SMT threads return from here, only main thread (core) will
> +     * continue, create threads and signal hotplug event to the guest.
> +     */
> +    if ((id % smt) != 0) {
> +        return;
> +    }

Hmm.  It seems odd to me to have thread 0 of a core handle the
initialization of the other threads, rather than creating an explicit
Core QOM object and having that construct the CPU objects for all
threads under it.

It also causes a rather weird recursion of cpu_ppc_init() into
cpu_ppc_init(), if I'm following the logic correctly.

> +
> +    /* Create SMT threads of the core. */
> +    for (i = 1; i < smp_threads; i++) {
> +        cpu = cpu_ppc_init(current_machine->cpu_model);
> +        if (!cpu) {
> +            error_report("Unable to find PowerPC CPU definition: %s",
> +                          current_machine->cpu_model);
> +            exit(EXIT_FAILURE);
> +        }
> +    }
> +
> +    if (!smc->dr_cpu_enabled) {
> +        /*
> +         * This is a cold plugged CPU but the machine doesn't support
> +         * DR. So skip the hotplug path ensuring that the CPU is brought
> +         * up online with out an associated DR connector.
> +         */
> +        return;
> +    }
> +
> +    g_assert(drc);
> +
> +    /*
> +     * Setup CPU DT entries only for hotplugged CPUs. For boot time or
> +     * coldplugged CPUs DT entries are setup in spapr_finalize_fdt().
> +     */
> +    if (dev->hotplugged) {
> +        fdt = spapr_populate_hotplug_cpu_dt(dev, cs, &fdt_offset, ms);
> +    }
> +
> +    drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +    drck->attach(drc, dev, fdt, fdt_offset, !dev->hotplugged, &local_err);
> +    if (local_err) {
> +        g_free(fdt);
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +
> +    /*
> +     * We send hotplug notification interrupt to the guest only in case
> +     * of hotplugged CPUs.
> +     */
> +    if (dev->hotplugged) {
> +        spapr_hotplug_req_add_by_index(drc);
> +    } else {
> +        /*
> +         * HACK to support removal of hotplugged CPU after VM migration:
> +         *
> +         * Since we want to be able to hot-remove those coldplugged CPUs
> +         * started at boot time using -device option at the target VM, we set
> +         * the right allocation_state and isolation_state for them, which for
> +         * the hotplugged CPUs would be set via RTAS calls done from the
> +         * guest during hotplug.
> +         *
> +         * This allows the coldplugged CPUs started using -device option to
> +         * have the right isolation and allocation states as expected by the
> +         * CPU hot removal code.
> +         *
> +         * This hack will be removed once we have DRC states migrated as part
> +         * of VM migration.
> +         */
> +        drck->set_allocation_state(drc, SPAPR_DR_ALLOCATION_STATE_USABLE);
> +        drck->set_isolation_state(drc, SPAPR_DR_ISOLATION_STATE_UNISOLATED);
> +    }
> +}
> +
>  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>                                        DeviceState *dev, Error **errp)
>  {
>      sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
> +    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
>  
>      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>          int node;
> @@ -2247,6 +2385,19 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>          }
>  
>          spapr_memory_plug(hotplug_dev, dev, node, errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        CPUState *cs = CPU(dev);
> +        PowerPCCPU *cpu = POWERPC_CPU(cs);
> +
> +        if (!smc->dr_cpu_enabled && dev->hotplugged) {
> +            error_setg(errp, "CPU hotplug not supported for this machine");
> +            cpu_remove_sync(cs);
> +            return;
> +        }
> +
> +        spapr_cpu_init(ms, cpu);
> +        spapr_cpu_reset(cpu);
> +        spapr_cpu_plug(hotplug_dev, dev, errp);
>      }
>  }
>  
> @@ -2261,7 +2412,8 @@ static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
>  static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
>                                               DeviceState *dev)
>  {
> -    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
> +        object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
>          return HOTPLUG_HANDLER(machine);
>      }
>      return NULL;
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 744ea62..1063036 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -436,6 +436,9 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
>      case SPAPR_DR_CONNECTOR_TYPE_LMB:
>          hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_MEMORY;
>          break;
> +    case SPAPR_DR_CONNECTOR_TYPE_CPU:
> +        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
> +        break;
>      default:
>          /* we shouldn't be signaling hotplug events for resources
>           * that don't support them
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 34b12a3..7baa862 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -33,6 +33,7 @@
>  
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/spapr_vio.h"
> +#include "hw/ppc/ppc.h"
>  #include "qapi-event.h"
>  #include "hw/boards.h"
>  
> @@ -159,6 +160,27 @@ static void rtas_query_cpu_stopped_state(PowerPCCPU *cpu_,
>      rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>  }
>  
> +/*
> + * Set the timebase offset of the CPU to that of first CPU.
> + * This helps hotplugged CPU to have the correct timebase offset.
> + */
> +static void spapr_cpu_update_tb_offset(PowerPCCPU *cpu)
> +{
> +    PowerPCCPU *fcpu = POWERPC_CPU(first_cpu);
> +
> +    cpu->env.tb_env->tb_offset = fcpu->env.tb_env->tb_offset;
> +}
> +
> +static void spapr_cpu_set_endianness(PowerPCCPU *cpu)
> +{
> +    PowerPCCPU *fcpu = POWERPC_CPU(first_cpu);
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(fcpu);
> +
> +    if (!pcc->interrupts_big_endian(fcpu)) {
> +        cpu->env.spr[SPR_LPCR] |= LPCR_ILE;
> +    }
> +}
> +
>  static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPRMachineState *spapr,
>                             uint32_t token, uint32_t nargs,
>                             target_ulong args,
> @@ -195,6 +217,8 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPRMachineState *spapr,
>          env->nip = start;
>          env->gpr[3] = r3;
>          cs->halted = 0;
> +        spapr_cpu_set_endianness(cpu);
> +        spapr_cpu_update_tb_offset(cpu);
>  
>          qemu_cpu_kick(cs);
>  
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index e88dc7f..245d73a 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -30,6 +30,9 @@
>  #include "qemu/error-report.h"
>  #include "qapi/visitor.h"
>  #include "hw/qdev-properties.h"
> +#if !defined(CONFIG_USER_ONLY)
> +#include "sysemu/sysemu.h"
> +#endif
>  
>  //#define PPC_DUMP_CPU
>  //#define PPC_DEBUG_SPR
> @@ -8933,6 +8936,11 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
>      }
>  
>  #if !defined(CONFIG_USER_ONLY)
> +    if (cs->cpu_index >= max_cpus) {
> +        error_setg(errp, "Cannot have more than %d CPUs", max_cpus);
> +        return;
> +    }
> +
>      cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
>          + (cs->cpu_index % smp_threads);
>  #endif


* Re: [Qemu-devel] [PATCH v5 09/10] spapr: CPU hot unplug support
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 09/10] spapr: CPU hot unplug support Bharata B Rao
@ 2015-12-01  1:34   ` David Gibson
  2015-12-02  5:49     ` Bharata B Rao
  0 siblings, 1 reply; 36+ messages in thread
From: David Gibson @ 2015-12-01  1:34 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Fri, Nov 20, 2015 at 06:24:38PM +0530, Bharata B Rao wrote:
> Support hot removal of CPU for sPAPR guests by sending the hot unplug
> notification to the guest via EPOW interrupt. Release the vCPU object
> after CPU hot unplug so that vCPU fd can be parked and reused.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 71 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 4434d45..6dca553 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2401,11 +2401,82 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>      }
>  }
>  
> +static void spapr_cpu_destroy(PowerPCCPU *cpu)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> +
> +    xics_cpu_destroy(spapr->icp, cpu);
> +    qemu_unregister_reset(spapr_cpu_reset, cpu);
> +}
> +
> +static void spapr_cpu_release(DeviceState *dev, void *opaque)

I'd prefer to see this called "core_release" since "cpu" doesn't make
it clear that it's acting on a whole core.

> +{
> +    CPUState *cs;
> +    int i;
> +    int id = ppc_get_vcpu_dt_id(POWERPC_CPU(CPU(dev)));
> +
> +    for (i = id; i < id + smp_threads; i++) {
> +        CPU_FOREACH(cs) {
> +            PowerPCCPU *cpu = POWERPC_CPU(cs);
> +
> +            if (i == ppc_get_vcpu_dt_id(cpu)) {
> +                spapr_cpu_destroy(cpu);
> +                cpu_remove(cs);
> +            }
> +        }
> +    }
> +}
> +
> +static int spapr_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                             Error **errp)
> +{
> +    CPUState *cs = CPU(dev);
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    int smt = kvmppc_smt_threads();
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +    sPAPRDRConnectorClass *drck;
> +    Error *local_err = NULL;
> +
> +    /*
> +     * SMT threads return from here, only main thread (core) will
> +     * continue and signal hot unplug event to the guest.
> +     */
> +    if ((id % smt) != 0) {
> +        return 0;
> +    }
> +    g_assert(drc);
> +
> +    drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +    drck->detach(drc, dev, spapr_cpu_release, NULL, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return -1;
> +    }
> +
> +    /*
> +     * In addition to hotplugged CPUs, send the hot-unplug notification
> +     * interrupt to the guest for coldplugged CPUs started via -device
> +     * option too.
> +     */
> +    spapr_hotplug_req_remove_by_index(drc);
> +    return 0;
> +}
> +
>  static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
>                                        DeviceState *dev, Error **errp)
>  {
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
> +
>      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>          error_setg(errp, "Memory hot unplug not supported by sPAPR");
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        if (!smc->dr_cpu_enabled) {
> +            error_setg(errp, "CPU hot unplug not supported on this machine");
> +            return;
> +        }
> +        spapr_cpu_unplug(hotplug_dev, dev, errp);
>      }
>  }
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug
  2015-11-23 13:04   ` Christian Borntraeger
@ 2015-12-01  1:43     ` David Gibson
  0 siblings, 0 replies; 36+ messages in thread
From: David Gibson @ 2015-12-01  1:43 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: qemu-devel, Peter Krempa, aik, agraf, mdroth, pbonzini, qemu-ppc,
	tyreld, Bharata B Rao, imammedo, nfont, afaerber


On Mon, Nov 23, 2015 at 02:04:46PM +0100, Christian Borntraeger wrote:
> On 11/23/2015 12:54 PM, Peter Krempa wrote:
> > On Fri, Nov 20, 2015 at 18:24:29 +0530, Bharata B Rao wrote:
> >> This patchset adds CPU hotplug support for sPAPR PowerPC guests using
> >> device_add and device_del commands
> >>
> >> (qemu) device_add POWER8-powerpc64-cpu,id=cpu0
> > 
> > Is there a reason why this uses 'device_add' rather than the 'cpu_add'
> > command? Libvirt uses two separate approaches already. Due to legacy
> > reasons we support the HMP 'cpu_set' command, and lately we added
> > support for QMP 'cpu-add'. Using device_add here will introduce a
> > different approach and will require yet another compatibility layer in
> > libvirt to support this.
> 
> s390 and powerpc both started with cpu_add patches. Andreas Faerber
> suggested then to only implement device_add. This was apparently discussed
> at the last KVM forum.

This may not be the only reason, but one problem with cpu_add is that
the interface is strictly per-vcpu, i.e. per-thread.  That doesn't
work for Power guests, because the PAPR interface used to communicate
to the guest can only work at a core granularity.
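The granularity constraint David describes can be sketched in isolation (toy code, not QEMU's actual helpers): with smt hardware threads per core, only a vcpu id that starts a core can name a PAPR plug/unplug operation, which is why a strictly per-thread command like cpu_add cannot express it.

```c
#include <stdbool.h>

/* Toy sketch of the PAPR granularity constraint: with smt threads per
 * core, a vcpu id names a core exactly when it is the id of the core's
 * first thread, i.e. a multiple of smt. */
static bool is_core_boundary(int vcpu_id, int smt)
{
    return vcpu_id % smt == 0;
}

/* First-thread id of the core a given vcpu belongs to. */
static int core_first_thread(int vcpu_id, int smt)
{
    return vcpu_id - vcpu_id % smt;
}
```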

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [Qemu-devel] [PATCH v5 08/10] spapr: CPU hotplug support
  2015-12-01  1:30   ` David Gibson
@ 2015-12-01  4:48     ` Bharata B Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-12-01  4:48 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Tue, Dec 01, 2015 at 12:30:12PM +1100, David Gibson wrote:
> On Fri, Nov 20, 2015 at 06:24:37PM +0530, Bharata B Rao wrote:
> > Support CPU hotplug via device-add command. Set up device tree
> > entries for the hotplugged CPU core and use the existing EPOW event
> > infrastructure to send CPU hotplug notification to the guest.
> > 
> > Create only cores explicitly from boot path as well as hotplug path
> > and let the ->plug() handler of the core create the threads of the core.
> > 
> > Also support cold plugged CPUs that are specified by -device option
> > on cmdline.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c              | 156 +++++++++++++++++++++++++++++++++++++++++++-
> >  hw/ppc/spapr_events.c       |   3 +
> >  hw/ppc/spapr_rtas.c         |  24 +++++++
> >  target-ppc/translate_init.c |   8 +++
> >  4 files changed, 189 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 814b0a6..4434d45 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -596,6 +596,18 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
> >      size_t page_sizes_prop_size;
> >      uint32_t vcpus_per_socket = smp_threads * smp_cores;
> >      uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> > +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
> > +    sPAPRDRConnector *drc;
> > +    sPAPRDRConnectorClass *drck;
> > +    int drc_index;
> > +
> > +    if (smc->dr_cpu_enabled) {
> > +        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
> > +        g_assert(drc);
> > +        drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > +        drc_index = drck->get_index(drc);
> > +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
> > +    }
> >  
> >      /* Note: we keep CI large pages off for now because a 64K capable guest
> >       * provisioned with large pages might otherwise try to map a qemu
> > @@ -1739,6 +1751,7 @@ static void ppc_spapr_init(MachineState *machine)
> >      char *filename;
> >      int smt = kvmppc_smt_threads();
> >      int smp_max_cores = DIV_ROUND_UP(max_cpus, smp_threads);
> > +    int spapr_smp_cores = DIV_ROUND_UP(smp_cpus, smp_threads);
> >  
> >      msi_supported = true;
> >  
> > @@ -1818,7 +1831,7 @@ static void ppc_spapr_init(MachineState *machine)
> >      if (machine->cpu_model == NULL) {
> >          machine->cpu_model = kvm_enabled() ? "host" : "POWER7";
> >      }
> > -    for (i = 0; i < smp_cpus; i++) {
> > +    for (i = 0; i < spapr_smp_cores; i++) {
> >          cpu = cpu_ppc_init(machine->cpu_model);
> >          if (cpu == NULL) {
> >              fprintf(stderr, "Unable to find PowerPC CPU definition\n");
> > @@ -2207,10 +2220,135 @@ out:
> >      error_propagate(errp, local_err);
> >  }
> >  
> > +static void *spapr_populate_hotplug_cpu_dt(DeviceState *dev, CPUState *cs,
> > +                                           int *fdt_offset,
> > +                                           sPAPRMachineState *spapr)
> > +{
> > +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +    DeviceClass *dc = DEVICE_GET_CLASS(cs);
> > +    int id = ppc_get_vcpu_dt_id(cpu);
> > +    void *fdt;
> > +    int offset, fdt_size;
> > +    char *nodename;
> > +
> > +    fdt = create_device_tree(&fdt_size);
> > +    nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
> > +    offset = fdt_add_subnode(fdt, 0, nodename);
> > +
> > +    spapr_populate_cpu_dt(cs, fdt, offset, spapr);
> > +    g_free(nodename);
> > +
> > +    *fdt_offset = offset;
> > +    return fdt;
> > +}
> > +
> > +static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                            Error **errp)
> > +{
> > +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
> > +    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
> > +    CPUState *cs = CPU(dev);
> > +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +    int id = ppc_get_vcpu_dt_id(cpu);
> > +    sPAPRDRConnector *drc =
> > +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> > +    sPAPRDRConnectorClass *drck;
> > +    int smt = kvmppc_smt_threads();
> > +    Error *local_err = NULL;
> > +    void *fdt = NULL;
> > +    int i, fdt_offset = 0;
> > +
> > +    /* Set NUMA node for the added CPUs  */
> > +    for (i = 0; i < nb_numa_nodes; i++) {
> > +        if (test_bit(cs->cpu_index, numa_info[i].node_cpu)) {
> > +            cs->numa_node = i;
> > +            break;
> > +        }
> > +    }
> > +
> > +    /*
> > +     * Currently CPU core and threads of a core aren't really different
> > +     * from QEMU point of view since all of them are just CPU devices. Hence
> > +     * there is no separate realize routines for cores and threads.
> > +     * We use the id check below to do things differently for cores and threads.
> > +     *
> > +     * SMT threads return from here, only main thread (core) will
> > +     * continue, create threads and signal hotplug event to the guest.
> > +     */
> > +    if ((id % smt) != 0) {
> > +        return;
> > +    }
> 
> Hmm.  It seems odd to me to have thread 0 of a core handle the
> initialization of the other threads, rather than creating an explicit
> Core QOM object and have that construct the cpu objects for all
> threads under it.

All the threads go through the realize call and hence the ->plug() call.
So they all come here, and we need the above check to ensure that we
raise the EPOW interrupt to the guest only once.

Having said that, I agree that it is not good to have the main thread of
the core create the other threads, as is done here.

I am experimenting with an approach based on bits of what was discussed
here earlier

http://lists.nongnu.org/archive/html/qemu-devel/2015-04/msg00620.html

In my current WIP patchset, I do have a core device that results in
CPU thread QOM objects being created as its children. I am experimenting
to see whether it is possible to specify a CPU as a generic device
(for both boot time and hotplug) that works for all archs; I will post
something by early next week.

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU toplogies with partially filled cores
  2015-12-01  0:37   ` David Gibson
@ 2015-12-02  3:04     ` Bharata B Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-12-02  3:04 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Tue, Dec 01, 2015 at 11:37:03AM +1100, David Gibson wrote:
> On Fri, Nov 20, 2015 at 06:24:30PM +0530, Bharata B Rao wrote:
> > Prevent guests from booting with CPU topologies that have partially
> > filled CPU cores or can result in partially filled CPU cores after
> > CPU hotplug like
> > 
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> 
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> 
> I may have missed a bit of the discussion leading up to this.  What
> was the rationale for still allowing partially filled sockets (and
> otherwise doing things at core rather than socket level?)

This is generic parsing code. I don't remember an explicit discussion
about allowing or disallowing partially filled sockets. I thought
disallowing partially filled cores in the generic smp parsing code would
be an acceptable first step for all archs.

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH v5 02/10] exec: Remove cpu from cpus list during cpu_exec_exit()
  2015-12-01  0:44   ` David Gibson
@ 2015-12-02  3:47     ` Bharata B Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-12-02  3:47 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Tue, Dec 01, 2015 at 11:44:03AM +1100, David Gibson wrote:
> On Fri, Nov 20, 2015 at 06:24:31PM +0530, Bharata B Rao wrote:
> > CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
> > should be removed from cpu_exec_exit().
> > 
> > cpu_exec_exit() is called from the generic CPU::instance_finalize and some
> > archs like PowerPC call it from the CPU unrealizefn. So ensure that we
> > dequeue the cpu only once.
> > 
> > Now -1 value for cpu->cpu_index indicates that we have already dequeued
> > the cpu for CONFIG_USER_ONLY case also.
> 
> It's not clear to me if you're intending this just as an interim step
> or not.  Surely we should fix the existing code to be consistent about
> where the QTAILQ_REMOVE is done, rather than using a special -1 flag?

cpu_index for a CPU starts at -1 and comes back to -1 when the CPU is
destroyed. So the -1 value is already used to prevent double-freeing this
particular CPU's bit in the cpu_index_map. In this patch, I am relying on
the same -1 value to guard against double removal of the CPU from the
list. Setting cpu_index to -1 wasn't being done for CONFIG_USER_ONLY, and
this patch adds that bit too.

This check against double freeing from the bitmap and double removal from
the list is needed because we call cpu_exec_exit() from
CPU::instance_finalize for all archs, and archs like PowerPC also call it
from the unrealizefn. If and when all archs switch to calling
cpu_exec_init()/_exit() only from realizefn/unrealizefn, we won't have to
worry about this double freeing.
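The sentinel idea described above can be sketched standalone with a toy singly linked list (illustrative only; QEMU's real cpus list is a QTAILQ and the index lives in a bitmap):

```c
#include <stddef.h>

/* Toy stand-in for CPUState; only the fields the guard needs. */
struct toy_cpu {
    int cpu_index;              /* -1 means "not on the list" */
    struct toy_cpu *next;
};

static struct toy_cpu *cpus;    /* head of the global cpu list */

/* Mirrors the idea in the patch: cpu_index == -1 doubles as the
 * "already dequeued" marker, so a second call is a harmless no-op. */
static void toy_cpu_exec_exit(struct toy_cpu *cpu)
{
    struct toy_cpu **p;

    if (cpu->cpu_index == -1) {
        return;                 /* already dequeued */
    }
    for (p = &cpus; *p; p = &(*p)->next) {
        if (*p == cpu) {
            *p = cpu->next;     /* unlink from the list */
            break;
        }
    }
    cpu->cpu_index = -1;        /* mark dequeued (index freed) */
}
```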

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH v5 04/10] cpu: Reclaim vCPU objects
  2015-11-30  7:30   ` Alexey Kardashevskiy
@ 2015-12-02  3:50     ` Bharata B Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-12-02  3:50 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Zhu Guihua, mdroth, agraf, qemu-devel, Chen Fan, pbonzini,
	qemu-ppc, tyreld, nfont, Gu Zheng, imammedo, afaerber, david

On Mon, Nov 30, 2015 at 06:30:52PM +1100, Alexey Kardashevskiy wrote:
> On 11/20/2015 11:54 PM, Bharata B Rao wrote:
> >From: Gu Zheng <guz.fnst@cn.fujitsu.com>
> >
> >In order to deal well with the kvm vcpus (which cannot be removed without
> >any protection), we do not close the KVM vcpu fd; we just record it in a
> >list and mark it as stopped, so that we can reuse it for a subsequent cpu
> >hot-add request if possible. This is also the approach the kvm folks
> >suggested:
> >https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
> >
> >Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> >Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
> >Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
> >Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> >                [- Explicit CPU_REMOVE() from qemu_kvm/tcg_destroy_vcpu()
> >                   isn't needed as it is done from cpu_exec_exit()]
> >Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> >---
> >  cpus.c               | 41 +++++++++++++++++++++++++++++++++++++
> >  include/qom/cpu.h    | 10 +++++++++
> >  include/sysemu/kvm.h |  1 +
> >  kvm-all.c            | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++-
> >  kvm-stub.c           |  5 +++++
> >  5 files changed, 113 insertions(+), 1 deletion(-)
> >
> >diff --git a/cpus.c b/cpus.c
> >index 877bd70..af2b274 100644
> >--- a/cpus.c
> >+++ b/cpus.c
> >@@ -953,6 +953,21 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
> >      qemu_cpu_kick(cpu);
> >  }
> >
> >+static void qemu_kvm_destroy_vcpu(CPUState *cpu)
> >+{
> >+    if (kvm_destroy_vcpu(cpu) < 0) {
> >+        error_report("kvm_destroy_vcpu failed.\n");
> >+        exit(EXIT_FAILURE);
> >+    }
> >+
> >+    object_unparent(OBJECT(cpu));
> >+}
> >+
> >+static void qemu_tcg_destroy_vcpu(CPUState *cpu)
> >+{
> >+    object_unparent(OBJECT(cpu));
> >+}
> >+
> >  static void flush_queued_work(CPUState *cpu)
> >  {
> >      struct qemu_work_item *wi;
> >@@ -1053,6 +1068,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
> >              }
> >          }
> >          qemu_kvm_wait_io_event(cpu);
> >+        if (cpu->exit && !cpu_can_run(cpu)) {
> >+            qemu_kvm_destroy_vcpu(cpu);
> >+            qemu_mutex_unlock(&qemu_global_mutex);
> 
> 
> Nit: qemu_mutex_unlock_iothread() may be? Or it is important for
> iothread_locked to remain "true"? It does not seem to be used much though.

This patch is quite old; this part of the code switched from unlocking
qemu_global_mutex directly to calling qemu_mutex_unlock_iothread() some
time ago.

Thanks for noticing this; I will fix it in the next version.

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH v5 04/10] cpu: Reclaim vCPU objects
  2015-12-01  0:55   ` David Gibson
@ 2015-12-02  5:28     ` Bharata B Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-12-02  5:28 UTC (permalink / raw)
  To: David Gibson
  Cc: Zhu Guihua, mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc,
	tyreld, nfont, Chen Fan, imammedo, afaerber

On Tue, Dec 01, 2015 at 11:55:58AM +1100, David Gibson wrote:
> On Fri, Nov 20, 2015 at 06:24:33PM +0530, Bharata B Rao wrote:
> > From: Gu Zheng <guz.fnst@cn.fujitsu.com>
> > 
> > In order to deal well with the kvm vcpus (which cannot be removed without
> > any protection), we do not close the KVM vcpu fd; we just record it in a
> > list and mark it as stopped, so that we can reuse it for a subsequent cpu
> > hot-add request if possible. This is also the approach the kvm folks
> > suggested:
> > https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
> > 
> > Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> > Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
> > Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> >                [- Explicit CPU_REMOVE() from qemu_kvm/tcg_destroy_vcpu()
> >                   isn't needed as it is done from cpu_exec_exit()]
> > ---
> >  cpus.c               | 41 +++++++++++++++++++++++++++++++++++++
> >  include/qom/cpu.h    | 10 +++++++++
> >  include/sysemu/kvm.h |  1 +
> >  kvm-all.c            | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++-
> >  kvm-stub.c           |  5 +++++
> >  5 files changed, 113 insertions(+), 1 deletion(-)
> > 
> > diff --git a/cpus.c b/cpus.c
> > index 877bd70..af2b274 100644
> > --- a/cpus.c
> > +++ b/cpus.c
> > @@ -953,6 +953,21 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
> >      qemu_cpu_kick(cpu);
> >  }
> >  
> > +static void qemu_kvm_destroy_vcpu(CPUState *cpu)
> > +{
> > +    if (kvm_destroy_vcpu(cpu) < 0) {
> > +        error_report("kvm_destroy_vcpu failed.\n");
> > +        exit(EXIT_FAILURE);
> > +    }
> > +
> > +    object_unparent(OBJECT(cpu));
> > +}
> > +
> > +static void qemu_tcg_destroy_vcpu(CPUState *cpu)
> > +{
> > +    object_unparent(OBJECT(cpu));
> > +}
> > +
> >  static void flush_queued_work(CPUState *cpu)
> >  {
> >      struct qemu_work_item *wi;
> > @@ -1053,6 +1068,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
> >              }
> >          }
> >          qemu_kvm_wait_io_event(cpu);
> > +        if (cpu->exit && !cpu_can_run(cpu)) {
> > +            qemu_kvm_destroy_vcpu(cpu);
> > +            qemu_mutex_unlock(&qemu_global_mutex);
> 
> This looks like a change to locking semantics, and I can't see the
> connection to the described purpose of the patch.

As I replied in another thread to Alexey, this needs fixing.

> 
> > +            return NULL;
> > +        }
> >      }
> >  
> >      return NULL;
> > @@ -1108,6 +1128,7 @@ static void tcg_exec_all(void);
> >  static void *qemu_tcg_cpu_thread_fn(void *arg)
> >  {
> >      CPUState *cpu = arg;
> > +    CPUState *remove_cpu = NULL;
> >  
> >      rcu_register_thread();
> >  
> > @@ -1145,6 +1166,16 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
> >              }
> >          }
> >          qemu_tcg_wait_io_event(QTAILQ_FIRST(&cpus));
> > +        CPU_FOREACH(cpu) {
> > +            if (cpu->exit && !cpu_can_run(cpu)) {
> > +                remove_cpu = cpu;
> > +                break;
> > +            }
> > +        }
> > +        if (remove_cpu) {
> > +            qemu_tcg_destroy_vcpu(remove_cpu);
> > +            remove_cpu = NULL;
> > +        }
> 
> Any particular reason to only cleanup one cpu per iteration?

Not sure; this is borrowed from the x86 CPU hotplug patchset.
Zhu, do you know why?

> 
> Also, any particular reason this isn't folded into tcg_exec_all with
> the other cpu->exit logic?

Looks like it can be done. Will give it a try in the next iteration.
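On the one-cpu-per-iteration question: unlinking every exited CPU in a single sweep only needs a traversal form that stays valid across removals. A standalone sketch with a toy list (QEMU proper would use QTAILQ_FOREACH_SAFE over the real cpus list; names here are illustrative):

```c
#include <stddef.h>

struct toy_cpu {
    int exit_req;               /* flagged for removal */
    struct toy_cpu *next;
};

/* Destroy every cpu flagged for exit in one pass, instead of at most
 * one per loop iteration.  The pointer-to-pointer walk stays valid
 * across removals.  Returns how many were unlinked. */
static int reap_exited_cpus(struct toy_cpu **head)
{
    int reaped = 0;
    struct toy_cpu **p = head;

    while (*p) {
        if ((*p)->exit_req) {
            *p = (*p)->next;    /* unlink; destruction would go here */
            reaped++;
        } else {
            p = &(*p)->next;
        }
    }
    return reaped;
}
```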

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH v5 06/10] xics_kvm: Add cpu_destroy method to XICS
  2015-12-01  1:01   ` David Gibson
@ 2015-12-02  5:45     ` Bharata B Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-12-02  5:45 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Tue, Dec 01, 2015 at 12:01:36PM +1100, David Gibson wrote:
> On Fri, Nov 20, 2015 at 06:24:35PM +0530, Bharata B Rao wrote:
> > XICS is setup for each CPU during initialization. Provide a routine
> > to undo the same when CPU is unplugged.
> > 
> > This allows reboot of a VM that has undergone CPU hotplug and unplug
> > to work correctly.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/intc/xics.c        | 12 ++++++++++++
> >  hw/intc/xics_kvm.c    | 13 +++++++++++--
> >  include/hw/ppc/xics.h |  2 ++
> >  3 files changed, 25 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> > index 9ff5796..e1161b2 100644
> > --- a/hw/intc/xics.c
> > +++ b/hw/intc/xics.c
> > @@ -44,6 +44,18 @@ static int get_cpu_index_by_dt_id(int cpu_dt_id)
> >      return -1;
> >  }
> >  
> > +void xics_cpu_destroy(XICSState *icp, PowerPCCPU *cpu)
> > +{
> > +    CPUState *cs = CPU(cpu);
> > +    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
> > +
> > +    assert(cs->cpu_index < icp->nr_servers);
> > +
> > +    if (info->cpu_destroy) {
> > +        info->cpu_destroy(icp, cpu);
> > +    }
> > +}
> > +
> >  void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
> >  {
> >      CPUState *cs = CPU(cpu);
> > diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> > index d58729c..cb96f69 100644
> > --- a/hw/intc/xics_kvm.c
> > +++ b/hw/intc/xics_kvm.c
> > @@ -331,6 +331,8 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
> >          abort();
> >      }
> >  
> > +    ss->cs = cs;
> > +
> 
> Actually, it probably makes more sense to move the management of
> ss->cs to the base xics code.  There's nothing really kvm specific
> about it.

Right, let me try changing this in next iteration.

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH v5 07/10] spapr: Enable CPU hotplug for pseries-2.5 and add CPU DRC DT entries
  2015-12-01  1:06   ` David Gibson
@ 2015-12-02  5:47     ` Bharata B Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-12-02  5:47 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Tue, Dec 01, 2015 at 12:06:43PM +1100, David Gibson wrote:
> On Fri, Nov 20, 2015 at 06:24:36PM +0530, Bharata B Rao wrote:
> > Start supporting CPU hotplug from pseries-2.5 onwards. Add CPU
> > DRC (Dynamic Resource Connector) device tree entries.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c         | 23 +++++++++++++++++++++++
> >  include/hw/ppc/spapr.h |  1 +
> >  2 files changed, 24 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 030ee35..814b0a6 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -973,6 +973,16 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr,
> >          _FDT(spapr_drc_populate_dt(fdt, 0, NULL, SPAPR_DR_CONNECTOR_TYPE_LMB));
> >      }
> >  
> > +    if (smc->dr_cpu_enabled) {
> > +        int offset = fdt_path_offset(fdt, "/cpus");
> > +        ret = spapr_drc_populate_dt(fdt, offset, NULL,
> > +                                    SPAPR_DR_CONNECTOR_TYPE_CPU);
> > +        if (ret < 0) {
> > +            fprintf(stderr, "Couldn't set up CPU DR device tree properties\n");
> > +            exit(1);
> > +        }
> > +    }
> > +
> >      _FDT((fdt_pack(fdt)));
> >  
> >      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> > @@ -1727,6 +1737,8 @@ static void ppc_spapr_init(MachineState *machine)
> >      long load_limit, fw_size;
> >      bool kernel_le = false;
> >      char *filename;
> > +    int smt = kvmppc_smt_threads();
> > +    int smp_max_cores = DIV_ROUND_UP(max_cpus, smp_threads);
> 
> 
> Given your earlier patch to disallow partially filled cores, a
> DIV_ROUND_UP should be the same as a straight divide, yes?

Yes, will change.
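David's observation is just the arithmetic fact that rounding division up is a no-op on exact multiples, which patch 1/10 now guarantees for max_cpus and smp_threads (macro shown for illustration; QEMU defines an equivalent one):

```c
/* Rounding division as used for smp_max_cores.  When n is already a
 * multiple of d -- what patch 1/10 enforces -- it equals n / d. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
```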

> 
> >  
> >      msi_supported = true;
> >  
> > @@ -1793,6 +1805,15 @@ static void ppc_spapr_init(MachineState *machine)
> >          spapr_validate_node_memory(machine);
> >      }
> >  
> > +    if (smc->dr_cpu_enabled) {
> > +        for (i = 0; i < smp_max_cores; i++) {
> > +            sPAPRDRConnector *drc =
> > +                spapr_dr_connector_new(OBJECT(spapr),
> > +                                       SPAPR_DR_CONNECTOR_TYPE_CPU, i * smt);
> > +            qemu_register_reset(spapr_drc_reset, drc);
> > +        }
> > +    }
> > +
> >      /* init CPUs */
> >      if (machine->cpu_model == NULL) {
> >          machine->cpu_model = kvm_enabled() ? "host" : "POWER7";
> > @@ -2277,6 +2298,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >      mc->cpu_index_to_socket_id = spapr_cpu_index_to_socket_id;
> >  
> >      smc->dr_lmb_enabled = false;
> > +    smc->dr_cpu_enabled = false;
> >      fwc->get_dev_path = spapr_get_fw_dev_path;
> >      nc->nmi_monitor_handler = spapr_nmi;
> >  }
> > @@ -2443,6 +2465,7 @@ static void spapr_machine_2_5_class_init(ObjectClass *oc, void *data)
> >      mc->alias = "pseries";
> >      mc->is_default = 1;
> >      smc->dr_lmb_enabled = true;
> > +    smc->dr_cpu_enabled = true;
> >  }
> 
> Given when we are, this will need to move to 2.6, obviously.

Yes, will move to 2.6 machine type in the next iteration.

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH v5 09/10] spapr: CPU hot unplug support
  2015-12-01  1:34   ` David Gibson
@ 2015-12-02  5:49     ` Bharata B Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Bharata B Rao @ 2015-12-02  5:49 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, aik, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Tue, Dec 01, 2015 at 12:34:06PM +1100, David Gibson wrote:
> On Fri, Nov 20, 2015 at 06:24:38PM +0530, Bharata B Rao wrote:
> > Support hot removal of CPU for sPAPR guests by sending the hot unplug
> > notification to the guest via EPOW interrupt. Release the vCPU object
> > after CPU hot unplug so that vCPU fd can be parked and reused.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 71 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 4434d45..6dca553 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2401,11 +2401,82 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> >      }
> >  }
> >  
> > +static void spapr_cpu_destroy(PowerPCCPU *cpu)
> > +{
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> > +
> > +    xics_cpu_destroy(spapr->icp, cpu);
> > +    qemu_unregister_reset(spapr_cpu_reset, cpu);
> > +}
> > +
> > +static void spapr_cpu_release(DeviceState *dev, void *opaque)
> 
> I'd prefer to see this called "core_release" since "cpu" doesn't make
> it clear that it's acting on a whole core.

Sure. Thanks for your review, will address your comments in the next
iteration.

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU toplogies with partially filled cores
  2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU toplogies with partially filled cores Bharata B Rao
  2015-12-01  0:37   ` David Gibson
@ 2015-12-02 13:52   ` Igor Mammedov
  2015-12-02 14:14     ` Eduardo Habkost
  1 sibling, 1 reply; 36+ messages in thread
From: Igor Mammedov @ 2015-12-02 13:52 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, aik, agraf, Eduardo Habkost, qemu-devel, qemu-ppc, tyreld,
	nfont, pbonzini, afaerber, david

On Fri, 20 Nov 2015 18:24:30 +0530
Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:

> Prevent guests from booting with CPU topologies that have partially
> filled CPU cores or can result in partially filled CPU cores after
> CPU hotplug like
> 
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.

CCing Eduardo who might tell if it's ok wrt x86

> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  vl.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/vl.c b/vl.c
> index 7d993a5..23a1a1e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1248,6 +1248,15 @@ static void smp_parse(QemuOpts *opts)
>              exit(1);
>          }
>  
> +        if (cpus % threads || max_cpus % threads) {
> +            error_report("cpu topology: "
> +                         "sockets (%u) cores (%u) threads (%u) with "
> +                         "smp_cpus (%u) maxcpus (%u) "
> +                         "will result in partially filled cores",
> +                         sockets, cores, threads, cpus, max_cpus);
> +            exit(1);
> +        }
> +
>          smp_cpus = cpus;
>          smp_cores = cores > 0 ? cores : 1;
>          smp_threads = threads > 0 ? threads : 1;


* Re: [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU toplogies with partially filled cores
  2015-12-02 13:52   ` Igor Mammedov
@ 2015-12-02 14:14     ` Eduardo Habkost
  2015-12-02 14:38       ` Bharata B Rao
  0 siblings, 1 reply; 36+ messages in thread
From: Eduardo Habkost @ 2015-12-02 14:14 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, aik, mdroth, agraf, qemu-ppc, tyreld, Bharata B Rao,
	pbonzini, nfont, afaerber, david

On Wed, Dec 02, 2015 at 02:52:39PM +0100, Igor Mammedov wrote:
> On Fri, 20 Nov 2015 18:24:30 +0530
> Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> 
> > Prevent guests from booting with CPU topologies that have partially
> > filled CPU cores or can result in partially filled CPU cores after
> > CPU hotplug like
> > 
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> 
> CCing Eduardo who might tell if it's ok wrt x86

I am always afraid of introducing fatal errors for configurations
that are obviously wrong but may already exist in production (so
it would prevent users from migrating existing VMs). But I am
becoming more inclined to be a bit more aggressive and stop
allowing these broken configurations.

But I would rewrite the error message, and just say that "cpus"
and "maxcpus" must be multiples of "threads". It's easier to
understand and explain than what "partially filled cores" mean.
You don't need to mention "sockets" and "cores" in the error
message, either.

Also, please print different error messages to indicate if
"maxcpus" or "cpus" is the culprit.

What about (cpus % (cores * threads) != 0)? It would result in
sockets with different numbers of cores. Should we allow that?

The patch makes QEMU crash with the following command-line:

  $ qemu-system-x86_64 -smp 2,cores=2,sockets=2
  Floating point exception (core dumped)
  $

It can probably be solved by moving the check after
smp_{cpus,cores,threads} are already set.

> 
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  vl.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/vl.c b/vl.c
> > index 7d993a5..23a1a1e 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -1248,6 +1248,15 @@ static void smp_parse(QemuOpts *opts)
> >              exit(1);
> >          }
> >  
> > +        if (cpus % threads || max_cpus % threads) {
> > +            error_report("cpu topology: "
> > +                         "sockets (%u) cores (%u) threads (%u) with "
> > +                         "smp_cpus (%u) maxcpus (%u) "
> > +                         "will result in partially filled cores",
> > +                         sockets, cores, threads, cpus, max_cpus);
> > +            exit(1);
> > +        }
> > +
> >          smp_cpus = cpus;
> >          smp_cores = cores > 0 ? cores : 1;
> >          smp_threads = threads > 0 ? threads : 1;
> 



-- 
Eduardo


* Re: [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU toplogies with partially filled cores
  2015-12-02 14:14     ` Eduardo Habkost
@ 2015-12-02 14:38       ` Bharata B Rao
  2015-12-02 15:19         ` Eduardo Habkost
  0 siblings, 1 reply; 36+ messages in thread
From: Bharata B Rao @ 2015-12-02 14:38 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mdroth, aik, agraf, qemu-devel, qemu-ppc, tyreld, pbonzini,
	Igor Mammedov, nfont, afaerber, david

On Wed, Dec 02, 2015 at 12:14:59PM -0200, Eduardo Habkost wrote:
> On Wed, Dec 02, 2015 at 02:52:39PM +0100, Igor Mammedov wrote:
> > On Fri, 20 Nov 2015 18:24:30 +0530
> > Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> > 
> > > Prevent guests from booting with CPU topologies that have partially
> > > filled CPU cores or can result in partially filled CPU cores after
> > > CPU hotplug like
> > > 
> > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> > 
> > CCing Eduardo who might tell if it's ok wrt x86
> 
> I am always afraid of introducing fatal errors for configurations
> that are obviously wrong but may already exist in production (so
> it would prevent users from migrating existing VMs). But I am
> becoming more inclined to be a bit more aggressive and stop
> allowing these broken configurations.
> 
> But I would rewrite the error message, and just say that "cpus"
> and "maxcpus" must be multiples of "threads". It's easier to
> understand and explain than what "partially filled cores" mean.
> You don't need to mention "sockets" and "cores" in the error
> message, either.
> 
> Also, please print different error messages to indicate if
> "maxcpus" or "cpus" is the culprit.

Sure, makes it easier for the user.

> 
> What about (cpus % (cores * threads) != 0)? It would result in
> sockets with different numbers of cores. Should we allow that?

I had planned to prevent that, and also
(max_cpus % (cores * threads) != 0), in the next iteration. It would
be good to know if enforcing such a condition would be acceptable.

> 
> The patch makes QEMU crash with the following command-line:
> 
>   $ qemu-system-x86_64 -smp 2,cores=2,sockets=2
>   Floating point exception (core dumped)
>   $
> 
> It can probably be solved by moving the check after
> smp_{cpus,cores,threads} are already set.

Oh yes, will fix in the next iteration.

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU toplogies with partially filled cores
  2015-12-02 14:38       ` Bharata B Rao
@ 2015-12-02 15:19         ` Eduardo Habkost
  0 siblings, 0 replies; 36+ messages in thread
From: Eduardo Habkost @ 2015-12-02 15:19 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, aik, agraf, qemu-devel, qemu-ppc, tyreld, pbonzini,
	Igor Mammedov, nfont, afaerber, david

On Wed, Dec 02, 2015 at 08:08:23PM +0530, Bharata B Rao wrote:
> On Wed, Dec 02, 2015 at 12:14:59PM -0200, Eduardo Habkost wrote:
> > On Wed, Dec 02, 2015 at 02:52:39PM +0100, Igor Mammedov wrote:
> > > On Fri, 20 Nov 2015 18:24:30 +0530
> > > Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> > > 
> > > > Prevent guests from booting with CPU topologies that have partially
> > > > filled CPU cores or can result in partially filled CPU cores after
> > > > CPU hotplug like
> > > > 
> > > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> > > 
> > > CCing Eduardo who might tell if it's ok wrt x86
> > 
> > I am always afraid of introducing fatal errors for configurations
> > that are obviously wrong but may already exist in production (so
> > it would prevent users from migrating existing VMs). But I am
> > becoming more inclined to be a bit more aggressive and stop
> > allowing these broken configurations.
> > 
> > But I would rewrite the error message, and just say that "cpus"
> > and "maxcpus" must be multiples of "threads". It's easier to
> > understand and explain than what "partially filled cores" mean.
> > You don't need to mention "sockets" and "cores" in the error
> > message, either.
> > 
> > Also, please print different error messages to indicate if
> > "maxcpus" or "cpus" is the culprit.
> 
> Sure, makes it easier for the user.
> 
> > 
> > What about (cpus % (cores * threads) != 0)? It would result in
> > sockets with different numbers of cores. Should we allow that?
> 
> I had planned to prevent that, and also
> (max_cpus % (cores * threads) != 0), in the next iteration. It would
> be good to know if enforcing such a condition would be acceptable.

I'm all for being strict about the core count too, and for only
relaxing the requirements if that turns out to be really necessary.

After all, if we really wanted to support sockets with different
numbers of cores each, or cores with different numbers of threads
each, the functionality shouldn't be restricted to the last
socket/core, so it would require a completely different
command-line interface.

-- 
Eduardo


end of thread, other threads:[~2015-12-02 15:19 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-20 12:54 [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Bharata B Rao
2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 01/10] vl: Don't allow CPU toplogies with partially filled cores Bharata B Rao
2015-12-01  0:37   ` David Gibson
2015-12-02  3:04     ` Bharata B Rao
2015-12-02 13:52   ` Igor Mammedov
2015-12-02 14:14     ` Eduardo Habkost
2015-12-02 14:38       ` Bharata B Rao
2015-12-02 15:19         ` Eduardo Habkost
2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 02/10] exec: Remove cpu from cpus list during cpu_exec_exit() Bharata B Rao
2015-12-01  0:44   ` David Gibson
2015-12-02  3:47     ` Bharata B Rao
2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 03/10] exec: Do vmstate unregistration from cpu_exec_exit() Bharata B Rao
2015-12-01  0:46   ` David Gibson
2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 04/10] cpu: Reclaim vCPU objects Bharata B Rao
2015-11-30  7:30   ` Alexey Kardashevskiy
2015-12-02  3:50     ` Bharata B Rao
2015-12-01  0:55   ` David Gibson
2015-12-02  5:28     ` Bharata B Rao
2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 05/10] cpu: Add a sync version of cpu_remove() Bharata B Rao
2015-12-01  0:57   ` David Gibson
2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 06/10] xics_kvm: Add cpu_destroy method to XICS Bharata B Rao
2015-12-01  1:01   ` David Gibson
2015-12-02  5:45     ` Bharata B Rao
2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 07/10] spapr: Enable CPU hotplug for pseries-2.5 and add CPU DRC DT entries Bharata B Rao
2015-12-01  1:06   ` David Gibson
2015-12-02  5:47     ` Bharata B Rao
2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 08/10] spapr: CPU hotplug support Bharata B Rao
2015-12-01  1:30   ` David Gibson
2015-12-01  4:48     ` Bharata B Rao
2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 09/10] spapr: CPU hot unplug support Bharata B Rao
2015-12-01  1:34   ` David Gibson
2015-12-02  5:49     ` Bharata B Rao
2015-11-20 12:54 ` [Qemu-devel] [PATCH v5 10/10] target-ppc: Enable CPU hotplug for POWER8 CPU family Bharata B Rao
2015-11-23 11:54 ` [Qemu-devel] [PATCH v5 00/10] sPAPR CPU hotplug Peter Krempa
2015-11-23 13:04   ` Christian Borntraeger
2015-12-01  1:43     ` David Gibson
