From: Salil Mehta via <qemu-devel@nongnu.org>
To: <qemu-devel@nongnu.org>, <qemu-arm@nongnu.org>, <mst@redhat.com>
Cc: <salil.mehta@huawei.com>, <maz@kernel.org>,
	<jean-philippe@linaro.org>, <jonathan.cameron@huawei.com>,
	<lpieralisi@kernel.org>, <peter.maydell@linaro.org>,
	<richard.henderson@linaro.org>, <imammedo@redhat.com>,
	<andrew.jones@linux.dev>, <david@redhat.com>, <philmd@linaro.org>,
	<peterx@redhat.com>, <eric.auger@redhat.com>, <will@kernel.org>,
	<ardb@kernel.org>, <oliver.upton@linux.dev>,
	<pbonzini@redhat.com>, <gshan@redhat.com>, <rafael@kernel.org>,
	<borntraeger@linux.ibm.com>, <alex.bennee@linaro.org>,
	<npiggin@gmail.com>, <harshpb@linux.ibm.com>,
	<linux@armlinux.org.uk>, <darren@os.amperecomputing.com>,
	<ilkka@os.amperecomputing.com>, <vishnu@os.amperecomputing.com>,
	<karl.heubaum@oracle.com>, <miguel.luis@oracle.com>,
	<salil.mehta@opnsrc.net>, <zhukeqian1@huawei.com>,
	<wangxiongfeng2@huawei.com>, <wangyanan55@huawei.com>,
	<jiakernel2@gmail.com>, <maobibo@loongson.cn>,
	<lixianglai@loongson.cn>, <shahuang@redhat.com>,
	<zhao1.liu@intel.com>, <linuxarm@huawei.com>,
	<gustavo.romero@linaro.org>
Subject: [PATCH RFC V5 05/30] arm/virt, kvm: Pre-create KVM vCPUs for all unplugged QOM vCPUs @machine init
Date: Tue, 15 Oct 2024 10:59:47 +0100
Message-ID: <20241015100012.254223-6-salil.mehta@huawei.com>
In-Reply-To: <20241015100012.254223-1-salil.mehta@huawei.com>

The ARM CPU architecture does not allow CPUs to be plugged after the system
has initialized. This is an architectural constraint. Hence, the kernel must
know all the CPUs being booted during its initialization. This applies to the
guest kernel as well, and therefore the number of KVM vCPUs needs to be fixed
at VM initialization time.

Also, the GIC must know all the CPUs it is connected to during its
initialization, and this cannot change afterward. This must also be ensured
during the initialization of the VGIC in KVM. This is necessary because:

1. The association between GICR and MPIDR must be fixed at VM initialization
   time. This is represented by the register `GICR_TYPER(mp_affinity, proc_num)`.
2. Memory regions associated with GICR, etc., cannot be changed (added, deleted,
   or modified) after the VM has been initialized. This is not an ARM
   architectural constraint; rather, lifting it would require difficult and
   messy changes to the VGIC data structures.

This patch adds support to pre-create all possible vCPUs within the KVM host
as part of the virtual machine initialization. These vCPUs can later be unparked
and attached to the hotplugged QOM vCPU.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
[VP: Identified CPU stall issue & suggested probable fix]
---
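Note on point 1 of the commit message: the sketch below illustrates how a
`GICR_TYPER` value ties a redistributor to one MPIDR, which is why the
association has to be known when the GIC is initialized. It is a minimal
illustration based on the architectural GICv3 field layout (Affinity_Value in
bits [63:32], Processor_Number in bits [23:8], Last in bit 4). The helper name
`sketch_gicr_typer` is hypothetical and is not part of this patch; other
`GICR_TYPER` fields are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not QEMU code).
 * Builds the affinity, processor-number and "last" fields of GICR_TYPER
 * from an MPIDR-style affinity value and a redistributor index.
 */
static uint64_t sketch_gicr_typer(uint64_t mpidr, uint32_t proc_num, bool last)
{
    uint64_t affid;

    /* Processor_Number is a 16-bit field. */
    assert(proc_num <= 0xFFFF);

    /*
     * MPIDR keeps Aff3 in bits [39:32] and Aff2..Aff0 in bits [23:0].
     * Pack them into one contiguous 32-bit Aff3.Aff2.Aff1.Aff0 value.
     */
    affid = ((mpidr & 0xFF00000000ULL) >> 8) | (mpidr & 0xFFFFFFULL);

    return (affid << 32) |              /* Affinity_Value,   bits [63:32] */
           ((uint64_t)proc_num << 8) |  /* Processor_Number, bits [23:8]  */
           ((uint64_t)last << 4);       /* Last,             bit 4        */
}
```

Since both inputs end up in a register the guest reads at boot, neither the
affinity nor the index of a redistributor can change once the VM is running.
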
 hw/arm/virt.c         | 74 ++++++++++++++++++++++++++++++++++++++-----
 include/hw/core/cpu.h |  1 +
 target/arm/cpu64.c    |  9 ++++++
 target/arm/kvm.c      | 41 +++++++++++++++++++++++-
 target/arm/kvm_arm.h  | 10 ++++++
 5 files changed, 126 insertions(+), 9 deletions(-)
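
For readers new to the park/unpark idea described in the commit message, the
toy model below may help. It is only a sketch under stated assumptions: it
makes no KVM calls, the `Sketch*` types and `sketch_*` functions are
hypothetical names invented for this note, and the "fd" is a placeholder
integer standing in for whatever vCPU creation would return. It shows the
intended lifecycle: every possible-but-not-booted vCPU is created and parked
at machine init, and a later hotplug picks up the parked instance instead of
creating a new one.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_VCPUS 8

/* Toy stand-in for one parked vCPU (hypothetical, not a QEMU type). */
typedef struct {
    bool parked;
    int fd;      /* placeholder for the descriptor of the created vCPU */
} SketchParkedVcpu;

typedef struct {
    SketchParkedVcpu slots[SKETCH_MAX_VCPUS];
} SketchVm;

/* Park an already-created vCPU so that it can be reused later. */
static void sketch_park_vcpu(SketchVm *vm, unsigned vcpu_id, int fd)
{
    assert(vcpu_id < SKETCH_MAX_VCPUS);
    assert(!vm->slots[vcpu_id].parked);
    vm->slots[vcpu_id].parked = true;
    vm->slots[vcpu_id].fd = fd;
}

/* Unpark on hotplug: return the parked fd, or -1 if nothing is parked. */
static int sketch_unpark_vcpu(SketchVm *vm, unsigned vcpu_id)
{
    if (vcpu_id >= SKETCH_MAX_VCPUS || !vm->slots[vcpu_id].parked) {
        return -1;
    }
    vm->slots[vcpu_id].parked = false;
    return vm->slots[vcpu_id].fd;
}

/*
 * Machine init: vCPUs [0, booted) are realized normally, while vCPUs
 * [booted, possible) are pre-created and parked. Returns how many
 * vCPUs were parked.
 */
static unsigned sketch_machine_init(SketchVm *vm, unsigned booted,
                                    unsigned possible)
{
    unsigned n, count = 0;

    assert(booted <= possible && possible <= SKETCH_MAX_VCPUS);
    for (n = booted; n < possible; n++) {
        int fake_fd = 100 + (int)n; /* placeholder for vCPU creation */

        sketch_park_vcpu(vm, n, fake_fd);
        count++;
    }
    return count;
}
```

With 2 booted and 4 possible vCPUs, IDs 2 and 3 are parked at init; a later
hotplug of vCPU 3 gets the parked instance back, and a second request for the
same ID finds nothing left to unpark.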

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 48102d5a4c..858a19bc8b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2366,17 +2366,12 @@ static void machvirt_init(MachineState *machine)
 
     assert(possible_cpus->len == max_cpus);
     for (n = 0; n < possible_cpus->len; n++) {
+        CPUArchId *cpu_slot;
         Object *cpuobj;
         CPUState *cs;
 
-        if (n >= smp_cpus) {
-            break;
-        }
-
         cpuobj = object_new(possible_cpus->cpus[n].type);
-
         cs = CPU(cpuobj);
-        cs->cpu_index = n;
 
         aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
         object_property_set_int(cpuobj, "socket-id",
@@ -2388,8 +2383,61 @@ static void machvirt_init(MachineState *machine)
         object_property_set_int(cpuobj, "thread-id",
                                 virt_get_thread_id(machine, n), NULL);
 
-        qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
-        object_unref(cpuobj);
+        if (n < smp_cpus) {
+            qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
+            object_unref(cpuobj);
+        } else {
+            /*
+             * Handling vCPUs that are yet to be hot-plugged requires the
+             * unrealized `ARMCPU` object for the following purposes:
+             *
+             * 1. To create the corresponding host KVM vCPU.
+             * 2. During the GICv3 realization phase, the `GICR_TYPER` value is
+             *    derived using the fetched MPIDR/mp-affinity. It's worth
+             *    considering modifying the GICv3 realization code to directly
+             *    fetch the `arch-id`/mp-affinity from the possible vCPUs.
+             * 3. Additionally, the `ARMCPU` object must be retained until
+             *    `virt_cpu_post_init`, as there may be late per-vCPU
+             *    initializations.
+             *
+             * Once these tasks are completed, the `ARMCPU` object can be
+             * safely released, as it is no longer required and will be
+             * recreated when the vCPU is {hot,cold}-plugged later.
+             */
+            cs->cpu_index = n;
+            cpu_slot = virt_find_cpu_slot(cs);
+
+            /*
+             * We will pre-create the KVM vCPUs corresponding to the currently
+             * unplugged but possible QOM vCPUs and park them until they are
+             * actually hot-plugged. The ARM architecture does not allow new
+             * CPUs to be plugged after the system has been initialized, and
+             * this constraint is also reflected in KVM.
+             */
+            if (kvm_enabled()) {
+                kvm_arm_create_host_vcpu(ARM_CPU(cs));
+                /*
+                 * Override the default architecture ID with the one fetched
+                 * from KVM. After initialization, we will destroy the CPUState
+                 * for disabled vCPUs; however, the CPU slot and its association
+                 * with the architecture ID (and consequently the vCPU ID) will
+                 * remain fixed for the entire lifetime of QEMU.
+                 */
+                cpu_slot->arch_id = arm_cpu_mp_affinity(ARM_CPU(cs));
+            }
+
+            /*
+             * GICv3 realization will need `mp-affinity` to derive `gicr_typer`
+             */
+            object_property_set_int(cpuobj, "mp-affinity", cpu_slot->arch_id,
+                                    NULL);
+
+            /*
+             * Add the unplugged vCPU to the vCPU slot temporarily. It will be
+             * released later in the virt_cpu_post_init() function.
+             */
+            cpu_slot->cpu = cs;
+        }
     }
 
     /* Now we've created the CPUs we can see if they have the hypvirt timer */
@@ -2992,6 +3040,16 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     /* insert the cold/hot-plugged vcpu in the slot */
     cpu_slot = virt_find_cpu_slot(cs);
     cpu_slot->cpu = CPU(dev);
+
+    if (kvm_enabled()) {
+        /*
+         * Override the default architecture ID with the one fetched from KVM.
+         * Currently, KVM derives the architecture ID from the vCPU ID specified
+         * by QEMU. In the future, we might implement a change where the entire
+         * architecture ID can be configured directly by QEMU.
+         */
+        cpu_slot->arch_id = arm_cpu_mp_affinity(ARM_CPU(cs));
+    }
 }
 
 static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 299e96c45b..4e0cb325a0 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -534,6 +534,7 @@ struct CPUState {
     uint64_t dirty_pages;
     int kvm_vcpu_stats_fd;
     bool vcpu_dirty;
+    VMChangeStateEntry *vmcse;
 
     /* Use by accel-block: CPU is executing an ioctl() */
     QemuLockCnt in_ioctl_lock;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 0e217f827e..d2f4624d61 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -791,6 +791,14 @@ static void aarch64_cpu_set_aarch64(Object *obj, bool value, Error **errp)
     }
 }
 
+static void aarch64_cpu_initfn(Object *obj)
+{
+    CPUState *cs = CPU(obj);
+
+    /* TODO: re-check if this is still necessary */
+    cs->thread_id = 0;
+}
+
 static void aarch64_cpu_finalizefn(Object *obj)
 {
 }
@@ -850,6 +858,7 @@ void aarch64_cpu_register(const ARMCPUInfo *info)
 static const TypeInfo aarch64_cpu_type_info = {
     .name = TYPE_AARCH64_CPU,
     .parent = TYPE_ARM_CPU,
+    .instance_init = aarch64_cpu_initfn,
     .instance_finalize = aarch64_cpu_finalizefn,
     .abstract = true,
     .class_init = aarch64_cpu_class_init,
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index f1f1b5b375..e82cb2aa8b 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1003,6 +1003,38 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
     write_list_to_cpustate(cpu);
 }
 
+void kvm_arm_create_host_vcpu(ARMCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    unsigned long vcpu_id = cs->cpu_index;
+    int ret;
+
+    ret = kvm_create_vcpu(cs);
+    if (ret < 0) {
+        error_report("Failed to create host vcpu %lu", vcpu_id);
+        abort();
+    }
+
+    /*
+     * Initialize the vCPU in the host. This resets the system registers
+     * for this vCPU; related registers such as MPIDR_EL1 are also
+     * programmed during this call to the host. These are referred to
+     * later while setting the device attributes of the GICR during
+     * GICv3 reset.
+     */
+    ret = kvm_arch_init_vcpu(cs);
+    if (ret < 0) {
+        error_report("Failed to initialize host vcpu %lu", vcpu_id);
+        abort();
+    }
+
+    /*
+     * Park the created vCPU. It will be picked up in kvm_get_vcpu() when
+     * the vCPU thread is created during realization of the ARM vCPU.
+     */
+    kvm_park_vcpu(cs);
+}
+
 /*
  * Update KVM's MP_STATE based on what QEMU thinks it is
  */
@@ -1874,7 +1906,14 @@ int kvm_arch_init_vcpu(CPUState *cs)
         return -EINVAL;
     }
 
-    qemu_add_vm_change_state_handler(kvm_arm_vm_state_change, cpu);
+    /*
+     * Install the VM change state handler only when the vCPU thread has
+     * been spawned, i.e. when the vCPU is being realized.
+     */
+    if (cs->thread_id) {
+        cs->vmcse = qemu_add_vm_change_state_handler(kvm_arm_vm_state_change,
+                                                     cpu);
+    }
 
     /* Determine init features for this CPU */
     memset(cpu->kvm_init_features, 0, sizeof(cpu->kvm_init_features));
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index cfaa0d9bc7..93d12eeb74 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -96,6 +96,16 @@ void kvm_arm_cpu_post_load(ARMCPU *cpu);
  */
 void kvm_arm_reset_vcpu(ARMCPU *cpu);
 
+/**
+ * kvm_arm_create_host_vcpu:
+ * @cpu: ARMCPU
+ *
+ * Called to pre-create a possible KVM vCPU within the host during the
+ * `virt_machine` initialization phase. The pre-created vCPU is parked and
+ * reused when the ARM QOM vCPU is actually hotplugged.
+ */
+void kvm_arm_create_host_vcpu(ARMCPU *cpu);
+
 #ifdef CONFIG_KVM
 /**
  * kvm_arm_create_scratch_host_vcpu:
-- 
2.34.1


