From: Salil Mehta <salil.mehta@opnsrc.net>
To: Gavin Shan <gshan@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com,
	 salil.mehta@huawei.com, maz@kernel.org,
	jean-philippe@linaro.org,  jonathan.cameron@huawei.com,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	 richard.henderson@linaro.org, imammedo@redhat.com,
	armbru@redhat.com,  andrew.jones@linux.dev, david@redhat.com,
	philmd@linaro.org,  eric.auger@redhat.com, will@kernel.org,
	ardb@kernel.org,  oliver.upton@linux.dev, pbonzini@redhat.com,
	rafael@kernel.org,  borntraeger@linux.ibm.com,
	alex.bennee@linaro.org, gustavo.romero@linaro.org,
	 npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	 darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	 vishnu@os.amperecomputing.com,
	gankulkarni@os.amperecomputing.com,  karl.heubaum@oracle.com,
	miguel.luis@oracle.com, zhukeqian1@huawei.com,
	 wangxiongfeng2@huawei.com, wangyanan55@huawei.com,
	wangzhou1@hisilicon.com,  linuxarm@huawei.com,
	jiakernel2@gmail.com, maobibo@loongson.cn,
	 lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com,  Keqian Zhu <zhuqian1@huawei.com>
Subject: Re: [PATCH RFC V6 05/24] arm/virt,kvm: Pre-create KVM vCPUs for 'disabled' QOM vCPUs at machine init
Date: Wed, 22 Oct 2025 18:50:02 +0000
Message-ID: <CAJ7pxeaUfUeXwtTVheCTxej-aCTCx0n8-XyAKaFneVUjcWL_7w@mail.gmail.com>
In-Reply-To: <CAJ7pxeYurHLqj8GnLrfznmofMpsaw91GeZ3KMyucL0B_gn9gPg@mail.gmail.com>

Hi Gavin,

On Wed, Oct 22, 2025 at 6:18 PM Salil Mehta <salil.mehta@opnsrc.net> wrote:
>
> Hi Gavin,
>
> On Wed, Oct 22, 2025 at 10:37 AM Gavin Shan <gshan@redhat.com> wrote:
> >
> > Hi Salil,
> >
> > On 10/1/25 11:01 AM, salil.mehta@opnsrc.net wrote:
> > > From: Salil Mehta <salil.mehta@huawei.com>
> > >
> > > ARM CPU architecture does not allow CPUs to be plugged after system has
> > > initialized. This is a constraint. Hence, the Kernel must know all the CPUs
> > > being booted during its initialization. This applies to the Guest Kernel as
> > > well and therefore, the number of KVM vCPU descriptors in the host must be
> > > fixed at VM initialization time.
> > >
> > > Also, the GIC must know all the CPUs it is connected to during its
> > > initialization, and this cannot change afterward. This must also be ensured
> > > during the initialization of the VGIC in KVM. This is necessary because:
> > >
> > > 1. The association between GICR and MPIDR must be fixed at VM initialization
> > >     time. This is represented by the register
> > >     `GICR_TYPER(mp_affinity, proc_num)`.
> > > 2. Memory regions associated with GICR, etc., cannot be changed (added,
> > >     deleted, or modified) after the VM has been initialized. This is not an
> > >     ARM architectural constraint but rather invites a difficult and messy
> > >     change in VGIC data structures.
> > >
> > > To enable a hot-add–like model while preserving these constraints, the virt
> > > machine may enumerate more CPUs than are enabled at boot using
> > > `-smp disabledcpus=N`. Such CPUs are present but start offline (i.e.,
> > > administratively disabled at init). The topology remains fixed at VM
> > > creation time; only the online/offline status may change later.
> > >
> > > Administratively disabled vCPUs are not realized in QOM until first enabled,
> > > avoiding creation of unnecessary vCPU threads at boot. On large systems, this
> > > reduces startup time proportionally to the number of disabled vCPUs. Once a
> > > QOM vCPU is realized and its thread created, subsequent enable/disable actions
> > > do not unrealize it. This behaviour was adopted following review feedback and
> > > differs from earlier RFC versions.
> > >
> > > Co-developed-by: Keqian Zhu <zhuqian1@huawei.com>
> > > Signed-off-by: Keqian Zhu <zhuqian1@huawei.com>
> > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > > ---
> > >   accel/kvm/kvm-all.c    |  2 +-
> > >   hw/arm/virt.c          | 77 ++++++++++++++++++++++++++++++++++++++----
> > >   hw/core/qdev.c         | 17 ++++++++++
> > >   include/hw/qdev-core.h | 19 +++++++++++
> > >   include/system/kvm.h   |  8 +++++
> > >   target/arm/cpu.c       |  2 ++
> > >   target/arm/kvm.c       | 40 +++++++++++++++++++++-
> > >   target/arm/kvm_arm.h   | 11 ++++++
> > >   8 files changed, 168 insertions(+), 8 deletions(-)
> > >

[...]

> > > +void kvm_arm_create_host_vcpu(ARMCPU *cpu)
> > > +{
> > > +    CPUState *cs = CPU(cpu);
> > > +    unsigned long vcpu_id = cs->cpu_index;
> > > +    int ret;
> > > +
> > > +    ret = kvm_create_vcpu(cs);
> > > +    if (ret < 0) {
> > > +        error_report("Failed to create host vcpu %ld", vcpu_id);
> > > +        abort();
> > > +    }
> > > +
> > > +    /*
> > > +     * Initialize the vCPU in the host. This will reset the sys regs
> > > +     * for this vCPU and related registers like MPIDR_EL1 etc. also
> > > +     * get programmed during this call to host. These are referenced
> > > +     * later while setting device attributes of the GICR during GICv3
> > > +     * reset.
> > > +     */
> > > +    ret = kvm_arch_init_vcpu(cs);
> > > +    if (ret < 0) {
> > > +        error_report("Failed to initialize host vcpu %ld", vcpu_id);
> > > +        abort();
> > > +    }
> > > +
> > > +    /*
> > > +     * park the created vCPU. shall be used during kvm_get_vcpu() when
> > > +     * threads are created during realization of ARM vCPUs.
> > > +     */
> > > +    kvm_park_vcpu(cs);
> > > +}
> > > +
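
To picture the "create now, park, reuse at realize time" flow in the
function above, here is a minimal sketch. It is a toy model, not QEMU's
code: park_vcpu(), unpark_vcpu() and struct parked_vcpu are hypothetical
names, and the "fd" is just an integer standing in for the KVM vCPU file
descriptor.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record for a parked vCPU: its id and its (fake) KVM fd. */
struct parked_vcpu {
    unsigned long vcpu_id;
    int kvm_fd;
    struct parked_vcpu *next;
};

/* Head of the list of parked vCPUs (toy global state). */
static struct parked_vcpu *parked_list;

/* Park: remember the fd of an already created vCPU for later reuse. */
static void park_vcpu(unsigned long vcpu_id, int kvm_fd)
{
    struct parked_vcpu *p;

    assert(kvm_fd >= 0);
    p = malloc(sizeof(*p));
    if (!p) {
        abort();
    }
    p->vcpu_id = vcpu_id;
    p->kvm_fd = kvm_fd;
    p->next = parked_list;
    parked_list = p;
}

/* Unpark: hand back the parked fd for vcpu_id, or -1 if none is parked. */
static int unpark_vcpu(unsigned long vcpu_id)
{
    struct parked_vcpu **pp;

    for (pp = &parked_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct parked_vcpu *p = *pp;
            int fd = p->kvm_fd;

            *pp = p->next;   /* unlink the entry before freeing it */
            free(p);
            return fd;
        }
    }
    return -1;
}
```

In this model a vCPU parked at machine init is found again by id when its
thread is created later, so no second host vCPU has to be created for it.
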
> >
> > I don't think we're able to simply call kvm_arch_init_vcpu() in the lazily realized
> > path. Otherwise, it can trigger a crash dump on my NVIDIA Grace Hopper machine, where
> > SVE is supported by default.
>
> Thanks for reporting this. That is not true. As long as we initialize KVM
> correctly and finalize features like SVE, we should be fine. In fact, this
> is precisely what we are doing right now.
>
> To understand the crash, I need a bit more info.
>
> 1. Is it happening because KVM_ARM_VCPU_INIT is failing? If yes, can you
>    check within KVM whether
>    a. the features specified by QEMU do not match the defaults within KVM
>       (hint: check kvm_vcpu_init_check_features()), or
>    b. KVM is complaining about an init feature change
>       (kvm_vcpu_init_changed())?
> 2. Or is it happening while setting the vector length or finalizing the
>    features?
>
> int kvm_arch_init_vcpu(CPUState *cs)
> {
>     [...]
>     /* Do KVM_ARM_VCPU_INIT ioctl */
>     ret = kvm_arm_vcpu_init(cpu);   ---->[1]
>     if (ret) {
>         return ret;
>     }
>     if (cpu_isar_feature(aa64_sve, cpu)) {
>         ret = kvm_arm_sve_set_vls(cpu); ---->[2]
>         if (ret) {
>             return ret;
>         }
>         ret = kvm_arm_vcpu_finalize(cpu, KVM_ARM_VCPU_SVE); ---->[3]
>         if (ret) {
>             return ret;
>         }
>     }
>     [...]
> }
>
> I think it is happening because the vector length is left uninitialized.
> That initialization happens in the context of arm_cpu_finalize_features(),
> which I forgot to call before calling KVM finalize.
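
The ordering dependency described above can be sketched with a toy model.
Everything here is hypothetical: struct toy_vcpu and the toy_*() helpers
are stand-ins, and the -EINVAL return is an assumption made for
illustration, not the error observed on the reported machine.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy vCPU holding only the state relevant to ordering. */
struct toy_vcpu {
    bool has_sve;
    unsigned int sve_max_vq;   /* 0 means "vector length not finalized" */
    bool kvm_initialized;
};

/* Stand-in for feature finalization: settles the SVE vector length. */
static void toy_finalize_features(struct toy_vcpu *cpu, unsigned int max_vq)
{
    assert(cpu != NULL);
    if (cpu->has_sve) {
        cpu->sve_max_vq = max_vq;
    }
}

/*
 * Stand-in for the host-side init. With SVE present it refuses to proceed
 * when no vector length was chosen (assumed failure mode, see lead-in).
 */
static int toy_kvm_init_vcpu(struct toy_vcpu *cpu)
{
    assert(cpu != NULL);
    if (cpu->has_sve && cpu->sve_max_vq == 0) {
        return -EINVAL;
    }
    cpu->kvm_initialized = true;
    return 0;
}

/* Create path: optionally finalize features before the host-side init. */
static int toy_create_host_vcpu(struct toy_vcpu *cpu, bool finalize_first)
{
    if (finalize_first) {
        toy_finalize_features(cpu, 4);
    }
    return toy_kvm_init_vcpu(cpu);
}
```

In this model the SVE-capable vCPU only initializes when feature
finalization runs first, while a vCPU without SVE is unaffected by the
order. That would be consistent with the failure showing up only on hosts
where SVE is enabled by default.
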
>
> >
> > kvm_arch_init_vcpu() is supposed to be called in the realization path in the current
> > implementation (without this series) because the parameters (features) to KVM_ARM_VCPU_INIT
> > are populated at vCPU realization time.
>
> Not necessarily. It is just meant to initialize KVM. If we take care of the
> KVM requirements in the same way the realize path does, we should be fine.
> Can you try adding the patch below to your code and test whether it works?
>
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index c4b68a0b17..1091593478 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -1068,6 +1068,9 @@ void kvm_arm_create_host_vcpu(ARMCPU *cpu)
>          abort();
>      }
>
> +    /* finalize the features like SVE, SME etc */
> +    arm_cpu_finalize_features(cpu, &error_abort);
> +
>      /*
>       * Initialize the vCPU in the host. This will reset the sys regs
>       * for this vCPU and related registers like MPIDR_EL1 etc. also
>
> >
> > $ /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64          \
> >    --enable-kvm -machine virt,gic-version=3 -cpu host               \
> >    -smp cpus=4,disabledcpus=2 -m 1024M                              \
> >    -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image    \
> >    -initrd /home/gavin/sandbox/images/rootfs.cpio.xz -nographic
> > qemu-system-aarch64: Failed to initialize host vcpu 4
> > Aborted (core dumped)
> >
> > Backtrace
> > =========
> > (gdb) bt
> > #0  0x0000ffff9106bc80 in __pthread_kill_implementation () at /lib64/libc.so.6
> > #1  0x0000ffff9101aa40 [PAC] in raise () at /lib64/libc.so.6
> > #2  0x0000ffff91005988 [PAC] in abort () at /lib64/libc.so.6
> > #3  0x0000aaaab1cc26b8 [PAC] in kvm_arm_create_host_vcpu (cpu=0xaaaab9ab1bc0)
> >      at ../target/arm/kvm.c:1081
> > #4  0x0000aaaab1cd0c94 in virt_setup_lazy_vcpu_realization (cpuobj=0xaaaab9ab1bc0, vms=0xaaaab98870a0)
> >      at ../hw/arm/virt.c:2483
> > #5  0x0000aaaab1cd180c in machvirt_init (machine=0xaaaab98870a0) at ../hw/arm/virt.c:2777
> > #6  0x0000aaaab160f220 in machine_run_board_init
> >      (machine=0xaaaab98870a0, mem_path=0x0, errp=0xfffffa86bdc8) at ../hw/core/machine.c:1722
> > #7  0x0000aaaab1a25ef4 in qemu_init_board () at ../system/vl.c:2723
> > #8  0x0000aaaab1a2635c in qmp_x_exit_preconfig (errp=0xaaaab38a50f0 <error_fatal>)
> >      at ../system/vl.c:2821
> > #9  0x0000aaaab1a28b08 in qemu_init (argc=15, argv=0xfffffa86c1f8) at ../system/vl.c:3882
> > #10 0x0000aaaab221d9e4 in main (argc=15, argv=0xfffffa86c1f8) at ../system/main.c:71
>
>
> Thank you for this. Please let me know whether the above fix works and, in
> case you still encounter errors, the return values.
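
For collecting those return values, a small helper along the following
lines could report which init step failed and with what error. This is
only a sketch: run_init_steps(), struct init_step and the two sample step
functions are hypothetical and not part of QEMU.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical init step: returns 0 on success or a negative errno. */
typedef int (*init_step_fn)(void *opaque);

struct init_step {
    const char *name;
    init_step_fn fn;
};

/* Sample steps used for illustration only. */
static int step_ok(void *opaque)
{
    (void)opaque;
    return 0;
}

static int step_einval(void *opaque)
{
    (void)opaque;
    return -EINVAL;
}

/*
 * Run the steps in order. On the first failure, print the step name and
 * its return value, store the value in *err and return the step index.
 * Return -1 (with *err == 0) when every step succeeds.
 */
static int run_init_steps(const struct init_step *steps, int nsteps,
                          void *opaque, int *err)
{
    assert(steps != NULL && err != NULL);
    for (int i = 0; i < nsteps; i++) {
        int ret = steps[i].fn(opaque);

        if (ret < 0) {
            fprintf(stderr, "init step [%d] %s failed: ret=%d (%s)\n",
                    i + 1, steps[i].name, ret, strerror(-ret));
            *err = ret;
            return i;
        }
    }
    *err = 0;
    return -1;
}
```

With the three calls marked [1], [2] and [3] above wrapped as steps, the
printed line would say directly whether the INIT ioctl, the vector length
setup or the finalize call returned the error.
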

I've pushed the fix to the branch below for your convenience:

Branch: https://github.com/salil-mehta/qemu/commits/virt-cpuhp-armv8/rfc-v6.2
Fix: https://github.com/salil-mehta/qemu/commit/1f1fbc0998ffb1fe26140df3c336bf2be2aa8669

Thanks
Salil.

