* [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
@ 2024-06-13 23:36 Salil Mehta via
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

PROLOGUE
========

To assist in review and set the right expectations from this RFC, please first
read the sections *APPENDED AT THE END* of this cover letter:

1. Important *DISCLAIMER* [Section (X)]
2. Work presented at KVMForum Conference (slides available) [Section (V)F]
3. Organization of patches [Section (XI)]
4. References [Section (XII)]
5. Detailed TODO list of leftover work or work-in-progress [Section (IX)]

There has been interest shown by other organizations in adapting this series
for their architecture. Hence, RFC V2 [21] has been split into architecture
*agnostic* [22] and *specific* patch sets.

This is an ARM architecture-specific patch set carved out of RFC V2. Please
check section (XI)B for details of architecture agnostic patches.

SECTIONS [I - XIII] are as follows:

(I) Key Changes [details in last section (XIV)]
==============================================

RFC V2 -> RFC V3

1. Split into Architecture *agnostic* (V13) [22] and *specific* (RFC V3) patch sets.
2. Addressed comments by Gavin Shan (RedHat), Shaoqin Huang (RedHat), Philippe Mathieu-Daudé (Linaro),
   Jonathan Cameron (Huawei), Zhao Liu (Intel).

RFC V1 -> RFC V2

RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/

1. ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
   *online-capable* or *enabled* to the Guest OS at boot time. This means
   associated CPUs can have ACPI _STA as *enabled* or *disabled* even after boot.
   See UEFI ACPI 6.5 Spec, Section 05, Table 5.37 GICC CPU Interface Flags[20].
2. SMCCC/HVC hypercall exit handling in userspace/Qemu for PSCI CPU_{ON,OFF}
   requests. This is required to allow or disallow onlining of a vCPU.
3. Unplugged vCPUs are always presented in the CPU ACPI AML code as ACPI
   _STA.PRESENT to the Guest OS. ACPI _STA.Enabled is toggled to give the
   effect of hot(un)plug.
4. Live Migration works (some issues are still there).
5. TCG/HVF/qtest do not support hotplug and fall back to the default behavior.
6. Code for TCG support exists in this release (it is a work-in-progress).
7. ACPI _OSC method can now be used by OSPM to negotiate Qemu VM platform
   hotplug capability (_OSC Query support still pending).
8. Misc. Bug fixes.

(II) Summary
============

This patch set introduces virtual CPU hotplug support for the ARMv8 architecture
in QEMU. The idea is to be able to hotplug and hot-unplug vCPUs while the guest VM
is running, without requiring a reboot. This does *not* make any assumptions about
the physical CPU hotplug availability within the host system but rather tries to
solve the problem at the virtualizer/QEMU layer. It introduces ACPI CPU hotplug hooks
and event handling to interface with the guest kernel, and code to initialize, plug,
and unplug CPUs. No changes are required within the host kernel/KVM except the
support of hypercall exit handling in the user-space/Qemu, which has recently
been added to the kernel. Corresponding guest kernel changes have been
posted on the mailing list [3] [4] by James Morse.

(III) Motivation
================

This allows scaling the guest VM compute capacity on-demand, which would be
useful for the following example scenarios:

1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the orchestration
   framework that could adjust resource requests (CPU and Mem requests) for
   the containers in a pod, based on usage.
2. Pay-as-you-grow Business Model: Infrastructure providers could allocate and
   restrict the total number of compute resources available to the guest VM
   according to the SLA (Service Level Agreement). VM owners could request more
   compute to be hot-plugged for some cost.

For example, a Kata Container VM starts with a minimum amount of resources
(i.e., the hotplug-everything approach). Why?

1. Allowing faster *boot time* and
2. Reduction in *memory footprint*

Kata Container VM can boot with just 1 vCPU, and then later more vCPUs can be
hot-plugged as needed.

(IV) Terminology
================

(*) Possible CPUs: Total vCPUs that could ever exist in the VM. This includes
                   any cold-booted CPUs plus any CPUs that could be later
                   hot-plugged.
                   - Qemu parameter (-smp maxcpus=N)
(*) Present CPUs:  Possible CPUs that are ACPI 'present'. These might or might
                   not be ACPI 'enabled'. 
                   - Present vCPUs = Possible vCPUs (Always on ARM Arch)
(*) Enabled CPUs:  Possible CPUs that are ACPI 'present' and 'enabled' and can
                   now be ‘onlined’ (PSCI) for use by the Guest Kernel. All cold-
                   booted vCPUs are ACPI 'enabled' at boot. Later, using
                   device_add, more vCPUs can be hotplugged and made ACPI
                   'enabled'.
                   - Qemu parameter (-smp cpus=N). Can be used to specify some
	           cold-booted vCPUs during VM init. Some can be added using the
	           '-device' option.
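As an illustrative model (this is not QEMU code; names and structure here are
purely for exposition), the three vCPU sets above for a guest started with
'-smp cpus=4,maxcpus=6' relate as follows:

```python
# Toy model of the vCPU state sets described above, for the ARM virt
# machine. All names are illustrative, not QEMU symbols.

def vcpu_states(cpus, maxcpus):
    possible = set(range(maxcpus))  # -smp maxcpus=N: all vCPUs that can ever exist
    present = set(possible)         # on ARM, present == possible (always)
    enabled = set(range(cpus))      # -smp cpus=N: cold-booted, ACPI 'enabled' vCPUs
    return possible, present, enabled

possible, present, enabled = vcpu_states(cpus=4, maxcpus=6)
print(sorted(enabled))    # vCPUs 0-3 are cold-booted; 4-5 can be hot-plugged later
```

Note that on ARM the present set always equals the possible set; only the
enabled set changes at hot-(un)plug time.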

(V) Constraints Due to ARMv8 CPU Architecture [+] Other Impediments
===================================================================

A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
   1. The ARMv8 CPU architecture does not support the concept of physical CPU
      hotplug.
      a. There are many per-CPU components like PMU, SVE, MTE, Arch timers, etc.,
         whose behavior needs to be clearly defined when the CPU is hot(un)plugged.
         There is no specification for this.

   2. Other ARM components like GIC, etc., have not been designed to realize
      physical CPU hotplug capability as of now. For example,
      a. Every physical CPU has a unique GICC (GIC CPU Interface) by
         construction. The architecture does not specify what CPU hot(un)plug
         would mean in the context of any of these.
      b. CPUs/GICC are physically connected to unique GICR (GIC Redistributor).
         GIC Redistributors are always part of the always-on power domain. Hence,
         they cannot be powered off as per specification.

B. Impediments in Firmware/ACPI (Architectural Constraint)

   1. Firmware has to expose GICC, GICR, and other per-CPU features like PMU,
      SVE, MTE, Arch Timers, etc., to the OS. Due to the architectural constraint
      stated in section A1(a), all interrupt controller structures of
      MADT describing GIC CPU Interfaces and the GIC Redistributors MUST be
      presented by firmware to the OSPM during boot time.
   2. Architectures that support CPU hotplug can evaluate the ACPI _MAT method
      to get this kind of information from the firmware even after boot, and
      the OSPM has the capability to process it. The ARM kernel uses the
      information in the MADT interrupt controller structures to identify the
      number of present CPUs during boot and hence does not allow these to
      change afterwards; the number of present CPUs cannot be changed. It is
      an architectural constraint!

C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural Constraint)

   1. KVM VGIC:
      a. Sizing of various VGIC resources like memory regions, etc., related to
         the redistributor happens only once and is fixed at the VM init time
         and cannot be changed later after initialization has happened.
         KVM statically configures these resources based on the number of vCPUs
         and the number/size of redistributor ranges.
      b. The association between a vCPU and its VGIC redistributor is fixed at
         VM init time within KVM, i.e., when the redistributor iodevs get
         registered. VGIC does not allow this association to be set up or
         changed after VM initialization has happened. Physically, every
         CPU/GICC is uniquely connected with its redistributor, and there is
         no architectural way to set this up.
   2. KVM vCPUs:
      a. Lack of specification means destruction of KVM vCPUs does not exist as
         there is no reference to tell what to do with other per-vCPU
         components like redistributors, arch timer, etc.
      b. In fact, KVM does not implement the destruction of vCPUs for any
         architecture. This is independent of whether the architecture
         actually supports CPU Hotplug feature. For example, even for x86 KVM
         does not implement the destruction of vCPUs.

D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)

   1. Qemu CPU objects MUST be created to initialize all the Host KVM vCPUs to
      overcome the KVM constraint. KVM vCPUs are created and initialized when
      Qemu CPU objects are realized. But keeping the QOM CPU objects realized
      for 'yet-to-be-plugged' vCPUs creates problems when these new vCPUs are
      later plugged using device_add and a new QOM CPU object is created.
   2. GICV3State and GICV3CPUState objects MUST be sized over *possible vCPUs*
      at VM init time, when the QOM GICV3 object is realized, because the KVM
      VGIC can only be initialized once. But every GICV3CPUState has an
      associated QOM CPU object, and the latter might correspond to a vCPU
      which is 'yet-to-be-plugged' (unplugged at init).
   3. How should new QOM CPU objects be connected back to the GICV3CPUState
      objects and disconnected from it in case the CPU is being hot(un)plugged?
   4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented in the
      QOM for which KVM vCPU already exists? For example, whether to keep,
       a. No QOM CPU objects Or
       b. Unrealized CPU Objects
   5. How should vCPU state be exposed via ACPI to the Guest? Especially for
      the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not exist
      within the QOM but the Guest always expects all possible vCPUs to be
      identified as ACPI *present* during boot.
   6. How should Qemu expose GIC CPU interfaces for the unplugged or
      yet-to-be-plugged vCPUs using ACPI MADT Table to the Guest?

E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)

   1. At VM Init, pre-create all the possible vCPUs in the Host KVM i.e., even
      for the vCPUs which are yet-to-be-plugged in Qemu but keep them in the
      powered-off state.
   2. After the KVM vCPUs have been initialized in the Host, the KVM vCPU
      objects corresponding to the unplugged/yet-to-be-plugged vCPUs are parked
      at the existing per-VM "kvm_parked_vcpus" list in Qemu. (similar to x86)
   3. GICV3State and GICV3CPUState objects are sized over possible vCPUs during
      VM init time i.e., when Qemu GIC is realized. This, in turn, sizes KVM VGIC
      resources like memory regions, etc., related to the redistributors with the
      number of possible KVM vCPUs. This never changes after VM has initialized.
   4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs are
      released post Host KVM CPU and GIC/VGIC initialization.
   5. Build ACPI MADT Table with the following updates:
      a. Number of GIC CPU interface entries (=possible vCPUs)
      b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable) 
      c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1  
         - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
	 - vCPU can be ACPI enabled+onlined after Guest boots (Firmware Policy) 
	 - Some issues with above (details in later sections)
   6. Expose below ACPI Status to Guest kernel:
      a. Always _STA.Present=1 (all possible vCPUs)
      b. _STA.Enabled=1 (plugged vCPUs)
      c. _STA.Enabled=0 (unplugged vCPUs)
   7. vCPU hotplug *realizes* new QOM CPU object. The following happens:
      a. Realizes, initializes QOM CPU Object & spawns Qemu vCPU thread.
      b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list).
         - Attaches to QOM CPU object.
      c. Reinitializes KVM vCPU in the Host.
         - Resets the core and sys regs, sets defaults, etc.
      d. Runs KVM vCPU (created with "start-powered-off").
	 - vCPU thread sleeps (waits for vCPU reset via PSCI). 
      e. Updates Qemu GIC.
         - Wires back IRQs related to this vCPU.
         - GICV3CPUState association with QOM CPU Object.
      f. Updates [6] ACPI _STA.Enabled=1.
      g. Notifies Guest about the new vCPU (via ACPI GED interface).
	 - Guest checks _STA.Enabled=1.
	 - Guest adds processor (registers CPU with LDM) [3].
      h. Plugs the QOM CPU object in the slot.
         - slot-number = cpu-index {socket, cluster, core, thread}.
      i. Guest online's vCPU (CPU_ON PSCI call over HVC/SMC).
         - KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
         - Qemu powers-on KVM vCPU in the Host.
   8. vCPU hot-unplug *unrealizes* QOM CPU Object. The following happens:
      a. Notifies Guest (via ACPI GED interface) vCPU hot-unplug event.
         - Guest offline's vCPU (CPU_OFF PSCI call over HVC/SMC).
      b. KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
         - Qemu powers-off the KVM vCPU in the Host.
      c. Guest signals *Eject* vCPU to Qemu.
      d. Qemu updates [6] ACPI _STA.Enabled=0.
      e. Updates GIC.
         - Un-wires IRQs related to this vCPU.
         - GICV3CPUState association with new QOM CPU Object is updated.
      f. Unplugs the vCPU.
	 - Removes from slot.
         - Parks KVM vCPU ("kvm_parked_vcpus" list).
         - Unrealizes QOM CPU Object & joins back Qemu vCPU thread.
	 - Destroys QOM CPU object.
      g. Guest checks ACPI _STA.Enabled=0.
         - Removes processor (unregisters CPU with LDM) [3].
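The pre-create/park/unpark lifecycle in steps 1-2, 7, and 8 above can be
sketched as a toy model. All names below are hypothetical and only mirror the
flow described in this section; they are not actual QEMU source symbols:

```python
# Toy model of the park/unpark vCPU lifecycle from section E.
kvm_parked_vcpus = {}   # per-VM list of parked (powered-off) KVM vCPUs (E.2)
enabled_vcpus = set()   # vCPUs currently ACPI 'enabled'

def hotplug(vcpu_id):
    # E.7b: unpark the pre-created KVM vCPU and attach it to a QOM CPU object
    vcpu = kvm_parked_vcpus.pop(vcpu_id)
    # E.7c-e: reinit KVM vCPU state, wire GIC IRQs (elided in this sketch)
    enabled_vcpus.add(vcpu_id)            # E.7f: ACPI _STA.Enabled=1
    return vcpu

def hotunplug(vcpu_id):
    enabled_vcpus.discard(vcpu_id)        # E.8d: ACPI _STA.Enabled=0
    # E.8f: the KVM vCPU is parked again rather than destroyed
    kvm_parked_vcpus[vcpu_id] = f"kvm-vcpu-{vcpu_id}"

def vm_init(cold_booted, possible):
    # E.1: pre-create *all* possible KVM vCPUs at init, powered off
    for vcpu_id in range(possible):
        kvm_parked_vcpus[vcpu_id] = f"kvm-vcpu-{vcpu_id}"
    # Cold-booted vCPUs are realized and enabled straight away
    for vcpu_id in range(cold_booted):
        hotplug(vcpu_id)

vm_init(cold_booted=4, possible=6)
hotplug(4)      # device_add
hotunplug(4)    # device_del: vCPU 4 is parked again, never destroyed
```

The key point this models is that KVM vCPUs are never destroyed: hot-unplug
only moves them back to the parked list, working around the KVM constraints
in section C.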

F. Work Presented at KVM Forum Conferences:
==========================================

Details of the above work have been presented at KVMForum2020 and KVMForum2023
conferences. Slides & video are available at the links below:
a. KVMForum 2023
   - Challenges Revisited in Supporting Virt CPU Hotplug on architectures that don't Support CPU Hotplug (like ARM64).
     https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
     https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
     https://www.youtube.com/watch?v=hyrw4j2D6I0&t=23970s
     https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
b. KVMForum 2020
   - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems (like ARM64) - Salil Mehta, Huawei.
     https://sched.co/eE4m

(VI) Commands Used
==================

A. Qemu launch commands to init the machine:

    $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
      -cpu host -smp cpus=4,maxcpus=6 \
      -m 300M \
      -kernel Image \
      -initrd rootfs.cpio.gz \
      -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
      -nographic \
      -bios QEMU_EFI.fd \

B. Hot-(un)plug related commands:

  # Hotplug a host vCPU (accel=kvm):
    $ device_add host-arm-cpu,id=core4,core-id=4

  # Hotplug a vCPU (accel=tcg):
    $ device_add cortex-a57-arm-cpu,id=core4,core-id=4

  # Delete the vCPU:
    $ device_del core4

Sample output on guest after boot:

    $ cat /sys/devices/system/cpu/possible
    0-5
    $ cat /sys/devices/system/cpu/present
    0-5
    $ cat /sys/devices/system/cpu/enabled
    0-3
    $ cat /sys/devices/system/cpu/online
    0-1
    $ cat /sys/devices/system/cpu/offline
    2-5

Sample output on guest after hotplug of vCPU=4:

    $ cat /sys/devices/system/cpu/possible
    0-5
    $ cat /sys/devices/system/cpu/present
    0-5
    $ cat /sys/devices/system/cpu/enabled
    0-4
    $ cat /sys/devices/system/cpu/online
    0-1,4
    $ cat /sys/devices/system/cpu/offline
    2-3,5

    Note: vCPU=4 was explicitly 'onlined' after hot-plug
    $ echo 1 > /sys/devices/system/cpu/cpu4/online
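The ranges printed above use the standard Linux 'cpulist' sysfs format. A
small illustrative helper (standard library only) for decoding such strings:

```python
def parse_cpulist(s):
    """Parse a Linux cpulist string such as '0-1,4' into a set of CPU ids."""
    cpus = set()
    for part in s.split(','):
        if '-' in part:
            lo, hi = part.split('-')
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return cpus

# The post-hotplug sample above: vCPU 4 was enabled and onlined,
# while vCPUs 2-3 and 5 remain offline.
assert parse_cpulist("0-1,4") == {0, 1, 4}
assert parse_cpulist("2-3,5") == {2, 3, 5}
```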

(VII) Latest Repository
=======================

(*) Latest Qemu RFC V3 (Architecture Specific) patch set:
    https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v3
(*) Latest Qemu V13 (Architecture Agnostic) patch set:
    https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v3.arch.agnostic.v13
(*) QEMU changes for vCPU hotplug can be cloned from below site:
    https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
(*) Guest Kernel changes (by James Morse, ARM) are available here:
    https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git virtual_cpu_hotplug/rfc/v2
(*) Leftover patches of the kernel are available here:
    https://lore.kernel.org/lkml/20240529133446.28446-1-Jonathan.Cameron@huawei.com/
    https://github.com/salil-mehta/linux/commits/virtual_cpu_hotplug/rfc/v6.jic/ (not latest)

(VIII) KNOWN ISSUES
===================

1. Migration has been lightly tested but appears to work.
2. TCG is broken.
3. HVF and qtest are not supported yet.
4. ACPI MADT Table flags [7] MADT.GICC.Enabled and MADT.GICC.online-capable are
   mutually exclusive, i.e., as per the change [6], a vCPU cannot be both
   GICC.Enabled and GICC.online-capable. This means:
      [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
   a. If we have to support hot-unplug of the cold-booted vCPUs, then these MUST
      be specified as GICC.online-capable in the MADT Table during boot by the
      firmware/Qemu. But this requirement conflicts with supporting the new
      Qemu changes on legacy OSes that don't understand the
      MADT.GICC.online-capable bit. A legacy OS will ignore this bit during
      boot, and hence these vCPUs will not appear on such an OS. This is
      unexpected behavior.
   b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to unplug
      these cold-booted vCPUs from OS (which in actuality should be blocked by
      returning error at Qemu), then features like 'kexec' will break.
   c. As I understand, removal of the cold-booted vCPUs is a required feature
      and x86 world allows it.
   d. Hence, either we need a specification change to make the MADT.GICC.Enabled
      and MADT.GICC.online-capable Bits NOT mutually exclusive or NOT support
      the removal of cold-booted vCPUs. In the latter case, a check can be introduced
      to bar the users from unplugging vCPUs, which were cold-booted, using QMP
      commands. (Needs discussion!)
      Please check the patch part of this patch set:
      [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled].
   
      NOTE: This is definitely not a blocker!
5. Code related to the notification to GICV3 about the hot(un)plug of a vCPU event
   might need further discussion.
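The mutual exclusion of the two MADT.GICC flag bits discussed in issue 4 can
be sketched as follows (bit positions taken from the GICC flags table in the
ACPI 6.5 spec [7]; treat this as an illustrative check, not firmware code):

```python
# GICC Flags bit positions per ACPI 6.5, Table 5.37 (GICC CPU Interface Flags)
GICC_ENABLED        = 1 << 0   # 'Enabled'
GICC_ONLINE_CAPABLE = 1 << 3   # 'Online Capable'

def gicc_flags(cold_booted):
    # As per [6]/[7], the bits are mutually exclusive: a vCPU is presented
    # either as Enabled (cold-booted) or as online-capable (hotpluggable).
    return GICC_ENABLED if cold_booted else GICC_ONLINE_CAPABLE

# The two presentations never share a set bit.
assert gicc_flags(True) & gicc_flags(False) == 0
```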


(IX) THINGS TO DO
=================

1. Fix issues related to TCG/Emulation support. (Not a blocker)
2. Comprehensive Testing is in progress. (Positive feedback from Oracle & Ampere)
3. Qemu Documentation (.rst) needs to be updated.
4. Fix qtest, HVF Support (Future).
5. Update the design issue related to ACPI MADT.GICC flags discussed in known
   issues. This might require UEFI ACPI specification change (Not a blocker).
6. Add ACPI _OSC 'Query' support. Only part of _OSC support exists now. (Not a blocker).

The above is *not* a complete list. Will update later!

Best regards,  
Salil.

(X) DISCLAIMER
==============

This work is an attempt to present a proof-of-concept of the ARM64 vCPU hotplug
implementation to the community. This is *not* production-level code and might
have bugs. Comprehensive testing is being done on HiSilicon Kunpeng920 SoC,
Oracle, and Ampere servers. We are nearing stable code and a non-RFC
version shall be floated soon.

This work is *mostly* in line with the discussions that have happened over the
previous years [see refs below] across different channels like the mailing list,
the Linaro Open Discussions platform, and conferences like KVMForum. This
RFC is being used as a way to verify the idea mentioned in this cover letter and
to get community views. Once this has been agreed upon, a formal patch shall be
posted to the mailing list for review.

[The concept being presented has been found to work!]

(XI) ORGANIZATION OF PATCHES
============================
 
A. Architecture *specific* patches:

   [Patch 1-8, 17, 27, 29] logic required during machine init.
    (*) Some validation checks.
    (*) Introduces core-id property and some util functions required later.
    (*) Logic to pre-create vCPUs.
    (*) GIC initialization pre-sized with possible vCPUs.
    (*) Some refactoring to have common hot and cold plug logic together.
    (*) Release of disabled QOM CPU objects in post_cpu_init().
    (*) Support of ACPI _OSC method to negotiate platform hotplug capabilities.
   [Patch 9-16] logic related to ACPI at machine init time.
    (*) Changes required to Enable ACPI for CPU hotplug.
    (*) Initialization of ACPI GED framework to cater to CPU Hotplug Events.
    (*) ACPI MADT/MAT changes.
   [Patch 18-26] logic required during vCPU hot-(un)plug.
    (*) Basic framework changes to support vCPU hot-(un)plug.
    (*) ACPI GED changes for hot-(un)plug hooks.
    (*) Wire-unwire the IRQs.
    (*) GIC notification logic.
    (*) ARMCPU unrealize logic.
    (*) Handling of SMCCC Hypercall Exits by KVM to Qemu.
   
B. Architecture *agnostic* patches:

   [PATCH V13 0/8] Add architecture agnostic code to support vCPU Hotplug.
   https://lore.kernel.org/qemu-devel/20240607115649.214622-1-salil.mehta@huawei.com/T/#md0887eb07976bc76606a8204614ccc7d9a01c1f7
    (*) Refactors vCPU creation and the parking/unparking logic of vCPUs, and adds traces.
    (*) Build ACPI AML related to CPU control dev.
    (*) Changes related to the destruction of CPU Address Space.
    (*) Changes related to the uninitialization of GDB Stub.
    (*) Updating of Docs.

(XII) REFERENCES
================

[1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
[2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
[3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
[4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
[5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
[6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
[7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
[8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
[9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
[10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
[11] https://lkml.org/lkml/2019/7/10/235
[12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
[13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
[14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
[15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
[16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
[17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
[18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
[19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/ 
[20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
[21] https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.mehta@huawei.com/
[22] https://lore.kernel.org/qemu-devel/20240607115649.214622-1-salil.mehta@huawei.com/T/#md0887eb07976bc76606a8204614ccc7d9a01c1f7

(XIII) ACKNOWLEDGEMENTS
=======================

I would like to take this opportunity to thank the people below for various
discussions with me over different channels during the development:

Marc Zyngier (Google)               Catalin Marinas (ARM),         
James Morse(ARM),                   Will Deacon (Google), 
Jean-Phillipe Brucker (Linaro),     Sudeep Holla (ARM),
Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat), 
Jonathan Cameron (Huawei),          Darren Hart (Ampere),
Igor Mamedov (Redhat),              Ilkka Koskinen (Ampere),
Andrew Jones (Redhat),              Karl Heubaum (Oracle),
Keqian Zhu (Huawei),                Miguel Luis (Oracle),
Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
Shameerali Kolothum (Huawei),       Russell King (Oracle),
Xuwei/Joy (Huawei),                 Peter Maydell (Linaro),
Zengtao/Prime (Huawei),             and all those whom I have missed!

Many thanks to the following people for their current or past contributions:

1. James Morse (ARM)
   (Current Kernel part of vCPU Hotplug Support on AARCH64)
2. Jean-Philippe Brucker (Linaro)
   (Prototyped one of the earlier PSCI-based POC [17][18] based on RFC V1)
3. Keqian Zhu (Huawei)
   (Co-developed Qemu prototype)
4. Xiongfeng Wang (Huawei)
   (Co-developed an earlier kernel prototype with me)
5. Vishnu Pajjuri (Ampere)
   (Verification on Ampere ARM64 Platforms + fixes)
6. Miguel Luis (Oracle)
   (Verification on Oracle ARM64 Platforms + fixes)
7. Russell King (Oracle) & Jonathan Cameron (Huawei)
   (Helping in upstreaming James Morse's Kernel patches).

(XIV) Change Log:
=================

RFC V2 -> RFC V3:
-----------------
1. Miscellaneous:
   - Split the RFC V2 into arch-agnostic and arch-specific patch sets.
2. Addressed Gavin Shan's (RedHat) comments:
   - Made CPU property accessors inline.
     https://lore.kernel.org/qemu-devel/6cd28639-2cfa-f233-c6d9-d5d2ec5b1c58@redhat.com/
   - Collected Reviewed-bys [PATCH RFC V2 4/37, 14/37, 22/37].
   - Dropped the patch as it was not required after init logic was refactored.
     https://lore.kernel.org/qemu-devel/4fb2eef9-6742-1eeb-721a-b3db04b1be97@redhat.com/
   - Fixed the range check for the core during vCPU Plug.
     https://lore.kernel.org/qemu-devel/1c5fa24c-6bf3-750f-4f22-087e4a9311af@redhat.com/
   - Added has_hotpluggable_vcpus check to make build_cpus_aml() conditional.
     https://lore.kernel.org/qemu-devel/832342cb-74bc-58dd-c5d7-6f995baeb0f2@redhat.com/
   - Fixed the states initialization in cpu_hotplug_hw_init() to accommodate previous refactoring.
     https://lore.kernel.org/qemu-devel/da5e5609-1883-8650-c7d8-6868c7b74f1c@redhat.com/
   - Fixed typos.
     https://lore.kernel.org/qemu-devel/eb1ac571-7844-55e6-15e7-3dd7df21366b@redhat.com/
   - Removed the unnecessary 'goto fail'.
     https://lore.kernel.org/qemu-devel/4d8980ac-f402-60d4-fe52-787815af8a7d@redhat.com/#t
   - Added check for hotpluggable vCPUs in the _OSC method.
     https://lore.kernel.org/qemu-devel/20231017001326.FUBqQ1PTowF2GxQpnL3kIW0AhmSqbspazwixAHVSi6c@z/
3. Addressed Shaoqin Huang's (RedHat) comments:
   - Fixed the compilation break caused by the missing call to virt_cpu_properties()
     and its missing definition.
     https://lore.kernel.org/qemu-devel/3632ee24-47f7-ae68-8790-26eb2cf9950b@redhat.com/
4. Addressed Jonathan Cameron's (Huawei) comments:
   - Gated the 'disabled vcpu message' for GIC version < 3.
     https://lore.kernel.org/qemu-devel/20240116155911.00004fe1@Huawei.com/

RFC V1 -> RFC V2:
-----------------
1. Addressed James Morse's (ARM) requirement as per Linaro Open Discussion:
   - Exposed all possible vCPUs as always ACPI _STA.present and available during boot time.
   - Added the _OSC handling as required by James's patches.
   - Introduction of 'online-capable' bit handling in the flag of MADT GICC.
   - SMCCC Hypercall Exit handling in Qemu.
2. Addressed Marc Zyngier's comment:
   - Fixed the note about GIC CPU Interface in the cover letter.
3. Addressed issues raised by Vishnu Pajjuri (Ampere) & Miguel Luis (Oracle) during testing:
   - Live/Pseudo Migration crashes.
4. Others:
   - Introduced the concept of persistent vCPU at QOM.
   - Introduced wrapper APIs of present, possible, and persistent.
   - Changed the ACPI hotplug H/W init logic to accommodate initialization of the is_present and is_enabled states.
   - Check to avoid unplugging cold-booted vCPUs.
   - Disabled hotplugging with TCG/HVF/QTEST.
   - Introduced CPU Topology, {socket, cluster, core, thread}-id property.
   - Extract virt CPU properties as a common virt_vcpu_properties() function.

Author Salil Mehta (1):
  target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu

Jean-Philippe Brucker (2):
  hw/acpi: Make _MAT method optional
  target/arm/kvm: Write CPU state back to KVM on reset

Miguel Luis (1):
  tcg/mttcg: enable threads to unregister in tcg_ctxs[]

Salil Mehta (25):
  arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id
    property
  cpu-common: Add common CPU utility for possible vCPUs
  hw/arm/virt: Limit number of possible vCPUs for unsupported Accel or
    GIC Type
  hw/arm/virt: Move setting of common CPU properties in a function
  arm/virt,target/arm: Machine init time change common to vCPU
    {cold|hot}-plug
  arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
  arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine
    init
  arm/virt: Init PMU at host for all possible vcpus
  arm/acpi: Enable ACPI support for vcpu hotplug
  arm/virt: Add cpu hotplug events to GED during creation
  arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  arm/virt/acpi: Build CPUs AML with CPU Hotplug support
  arm/virt: Make ARM vCPU *present* status ACPI *persistent*
  hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits
    to Guest
  hw/arm: MADT Tbl change to size the guest with possible vCPUs
  arm/virt: Release objects for *disabled* possible vCPUs after init
  arm/virt: Add/update basic hot-(un)plug framework
  arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
  hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
  hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register
    info
  arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
  hw/arm: Changes required for reset and to support next boot
  target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  hw/arm: Support hotplug capability check using _OSC method
  hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled

 accel/tcg/tcg-accel-ops-mttcg.c    |   1 +
 cpu-common.c                       |  37 ++
 hw/acpi/cpu.c                      |  62 +-
 hw/acpi/generic_event_device.c     |  11 +
 hw/arm/Kconfig                     |   1 +
 hw/arm/boot.c                      |   2 +-
 hw/arm/virt-acpi-build.c           | 113 +++-
 hw/arm/virt.c                      | 877 +++++++++++++++++++++++------
 hw/core/gpio.c                     |   2 +-
 hw/intc/arm_gicv3.c                |   1 +
 hw/intc/arm_gicv3_common.c         |  66 ++-
 hw/intc/arm_gicv3_cpuif.c          | 269 +++++----
 hw/intc/arm_gicv3_cpuif_common.c   |   5 +
 hw/intc/arm_gicv3_kvm.c            |  39 +-
 hw/intc/gicv3_internal.h           |   2 +
 include/hw/acpi/cpu.h              |   2 +
 include/hw/arm/boot.h              |   2 +
 include/hw/arm/virt.h              |  38 +-
 include/hw/core/cpu.h              |  78 +++
 include/hw/intc/arm_gicv3_common.h |  23 +
 include/hw/qdev-core.h             |   2 +
 include/tcg/startup.h              |   7 +
 target/arm/arm-powerctl.c          |  51 +-
 target/arm/cpu-qom.h               |  18 +-
 target/arm/cpu.c                   | 112 ++++
 target/arm/cpu.h                   |  18 +
 target/arm/cpu64.c                 |  15 +
 target/arm/gdbstub.c               |   6 +
 target/arm/helper.c                |  27 +-
 target/arm/internals.h             |  14 +-
 target/arm/kvm.c                   | 146 ++++-
 target/arm/kvm_arm.h               |  25 +
 target/arm/meson.build             |   1 +
 target/arm/{tcg => }/psci.c        |   8 +
 target/arm/tcg/meson.build         |   4 -
 tcg/tcg.c                          |  24 +
 36 files changed, 1749 insertions(+), 360 deletions(-)
 rename target/arm/{tcg => }/psci.c (97%)

-- 
2.34.1




* [PATCH RFC V3 01/29] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-08-12  4:35   ` [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property Gavin Shan
  2024-06-13 23:36 ` [PATCH RFC V3 02/29] cpu-common: Add common CPU utility for possible vCPUs Salil Mehta via
                   ` (31 subsequent siblings)
  32 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Add new ARMCPU properties to store the user-specified topology
{socket,cluster,core,thread}. These are converted into a unique 'vcpu-id',
which is used as the slot index during hot(un)plug of a vCPU.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c         | 10 ++++++++++
 include/hw/arm/virt.h | 28 ++++++++++++++++++++++++++++
 target/arm/cpu.c      |  4 ++++
 target/arm/cpu.h      |  4 ++++
 4 files changed, 46 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3c93c0c0a6..11fc7fc318 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2215,6 +2215,14 @@ static void machvirt_init(MachineState *machine)
                           &error_fatal);
 
         aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
+        object_property_set_int(cpuobj, "socket-id",
+                                virt_get_socket_id(machine, n), NULL);
+        object_property_set_int(cpuobj, "cluster-id",
+                                virt_get_cluster_id(machine, n), NULL);
+        object_property_set_int(cpuobj, "core-id",
+                                virt_get_core_id(machine, n), NULL);
+        object_property_set_int(cpuobj, "thread-id",
+                                virt_get_thread_id(machine, n), NULL);
 
         if (!vms->secure) {
             object_property_set_bool(cpuobj, "has_el3", false, NULL);
@@ -2708,6 +2716,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
 {
     int n;
     unsigned int max_cpus = ms->smp.max_cpus;
+    unsigned int smp_threads = ms->smp.threads;
     VirtMachineState *vms = VIRT_MACHINE(ms);
     MachineClass *mc = MACHINE_GET_CLASS(vms);
 
@@ -2721,6 +2730,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
     ms->possible_cpus->len = max_cpus;
     for (n = 0; n < ms->possible_cpus->len; n++) {
         ms->possible_cpus->cpus[n].type = ms->cpu_type;
+        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
         ms->possible_cpus->cpus[n].arch_id =
             virt_cpu_mp_affinity(vms, n);
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index bb486d36b1..6f9a7bb60b 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -209,4 +209,32 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
             vms->highmem_redists) ? 2 : 1;
 }
 
+static inline int virt_get_socket_id(const MachineState *ms, int cpu_index)
+{
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.socket_id;
+}
+
+static inline int virt_get_cluster_id(const MachineState *ms, int cpu_index)
+{
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.cluster_id;
+}
+
+static inline int virt_get_core_id(const MachineState *ms, int cpu_index)
+{
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.core_id;
+}
+
+static inline int virt_get_thread_id(const MachineState *ms, int cpu_index)
+{
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.thread_id;
+}
+
 #endif /* QEMU_ARM_VIRT_H */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 77f8c9c748..abc4ed0842 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2582,6 +2582,10 @@ static Property arm_cpu_properties[] = {
     DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
                         mp_affinity, ARM64_AFFINITY_INVALID),
     DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
+    DEFINE_PROP_INT32("socket-id", ARMCPU, socket_id, 0),
+    DEFINE_PROP_INT32("cluster-id", ARMCPU, cluster_id, 0),
+    DEFINE_PROP_INT32("core-id", ARMCPU, core_id, 0),
+    DEFINE_PROP_INT32("thread-id", ARMCPU, thread_id, 0),
     DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1),
     /* True to default to the backward-compat old CNTFRQ rather than 1Ghz */
     DEFINE_PROP_BOOL("backcompat-cntfrq", ARMCPU, backcompat_cntfrq, false),
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index c17264c239..208c719db3 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1076,6 +1076,10 @@ struct ArchCPU {
     QLIST_HEAD(, ARMELChangeHook) el_change_hooks;
 
     int32_t node_id; /* NUMA node this CPU belongs to */
+    int32_t socket_id;
+    int32_t cluster_id;
+    int32_t core_id;
+    int32_t thread_id;
 
     /* Used to synchronize KVM and QEMU in-kernel device levels */
     uint8_t device_irq_level;
-- 
2.34.1




* [PATCH RFC V3 02/29] cpu-common: Add common CPU utility for possible vCPUs
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 01/29] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-07-04  3:12   ` Nicholas Piggin
  2024-08-12  4:59   ` Gavin Shan
  2024-06-13 23:36 ` [PATCH RFC V3 03/29] hw/arm/virt: Limit number of possible vCPUs for unsupported Accel or GIC Type Salil Mehta via
                   ` (30 subsequent siblings)
  32 siblings, 2 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Add utility functions to fetch or check the state of possible vCPUs. Also
introduce the concept of *disabled* vCPUs: vCPUs that are part of the
*possible* set but are not enabled. This state is used during machine
initialization and later while plugging or unplugging vCPUs. The QOM CPU
objects of all disabled vCPUs are released.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 cpu-common.c          | 31 +++++++++++++++++++++++++
 include/hw/core/cpu.h | 54 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/cpu-common.c b/cpu-common.c
index ce78273af5..49d2a50835 100644
--- a/cpu-common.c
+++ b/cpu-common.c
@@ -24,6 +24,7 @@
 #include "sysemu/cpus.h"
 #include "qemu/lockable.h"
 #include "trace/trace-root.h"
+#include "hw/boards.h"
 
 QemuMutex qemu_cpu_list_lock;
 static QemuCond exclusive_cond;
@@ -107,6 +108,36 @@ void cpu_list_remove(CPUState *cpu)
     cpu_list_generation_id++;
 }
 
+CPUState *qemu_get_possible_cpu(int index)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    const CPUArchIdList *possible_cpus = ms->possible_cpus;
+
+    assert((index >= 0) && (index < possible_cpus->len));
+
+    return CPU(possible_cpus->cpus[index].cpu);
+}
+
+bool qemu_present_cpu(CPUState *cpu)
+{
+    return cpu;
+}
+
+bool qemu_enabled_cpu(CPUState *cpu)
+{
+    return cpu && !cpu->disabled;
+}
+
+uint64_t qemu_get_cpu_archid(int cpu_index)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    const CPUArchIdList *possible_cpus = ms->possible_cpus;
+
+    assert((cpu_index >= 0) && (cpu_index < possible_cpus->len));
+
+    return possible_cpus->cpus[cpu_index].arch_id;
+}
+
 CPUState *qemu_get_cpu(int index)
 {
     CPUState *cpu;
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 60b160d0b4..60b4778da9 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -528,6 +528,18 @@ struct CPUState {
     CPUPluginState *plugin_state;
 #endif
 
+    /*
+     * Some architectures do not allow the *presence* of vCPUs to be changed
+     * after the guest has booted, based on information specified by the
+     * VMM/firmware via ACPI MADT at boot time. Thus, to enable vCPU hotplug on
+     * these architectures, possible vCPUs can have a CPUState object in a
+     * 'disabled' state or may not have a CPUState object at all. This is
+     * possible when vCPU hotplug is supported, and vCPUs are
+     * 'yet-to-be-plugged' in the QOM or have been hot-unplugged. By default,
+     * every CPUState is enabled across all architectures.
+     */
+    bool disabled;
+
     /* TODO Move common fields from CPUArchState here. */
     int cpu_index;
     int cluster_index;
@@ -914,6 +926,48 @@ static inline bool cpu_in_exclusive_context(const CPUState *cpu)
  */
 CPUState *qemu_get_cpu(int index);
 
+/**
+ * qemu_get_possible_cpu:
+ * @index: The CPUState@cpu_index value of the CPU to obtain.
+ *         Input index MUST be in range [0, Max Possible CPUs)
+ *
+ * If a CPUState object exists, then return the CPU matching
+ * @index in the possible CPU array.
+ *
+ * Returns: The possible CPU or %NULL if CPU does not exist.
+ */
+CPUState *qemu_get_possible_cpu(int index);
+
+/**
+ * qemu_present_cpu:
+ * @cpu: The vCPU to check
+ *
+ * Checks if the vCPU is amongst the present possible vCPUs.
+ *
+ * Returns: True if it is a present possible vCPU, else false
+ */
+bool qemu_present_cpu(CPUState *cpu);
+
+/**
+ * qemu_enabled_cpu:
+ * @cpu: The vCPU to check
+ *
+ * Checks if the vCPU is enabled.
+ *
+ * Returns: True if it is 'enabled' else false
+ */
+bool qemu_enabled_cpu(CPUState *cpu);
+
+/**
+ * qemu_get_cpu_archid:
+ * @cpu_index: possible vCPU for which the arch-id needs to be retrieved
+ *
+ * Fetches the vCPU arch-id from the present possible vCPUs.
+ *
+ * Returns: arch-id of the possible vCPU
+ */
+uint64_t qemu_get_cpu_archid(int cpu_index);
+
 /**
  * cpu_exists:
  * @id: Guest-exposed CPU ID to lookup.
-- 
2.34.1




* [PATCH RFC V3 03/29] hw/arm/virt: Limit number of possible vCPUs for unsupported Accel or GIC Type
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 01/29] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 02/29] cpu-common: Add common CPU utility for possible vCPUs Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-08-12  5:09   ` Gavin Shan
  2024-06-13 23:36 ` [PATCH RFC V3 04/29] hw/arm/virt: Move setting of common CPU properties in a function Salil Mehta via
                   ` (29 subsequent siblings)
  32 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

If virtual CPU hotplug is not supported by a particular accelerator or ARM GIC
version, limit the possible vCPUs to those available at boot time (i.e., the
SMP CPUs) and explicitly disable the virtual CPU hotplug capability.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c | 66 +++++++++++++++++++++++++++++----------------------
 1 file changed, 38 insertions(+), 28 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 11fc7fc318..3e1c4d2d2f 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2082,8 +2082,6 @@ static void machvirt_init(MachineState *machine)
     unsigned int smp_cpus = machine->smp.cpus;
     unsigned int max_cpus = machine->smp.max_cpus;
 
-    possible_cpus = mc->possible_cpu_arch_ids(machine);
-
     /*
      * In accelerated mode, the memory map is computed earlier in kvm_type()
      * to create a VM with the right number of IPA bits.
@@ -2098,7 +2096,7 @@ static void machvirt_init(MachineState *machine)
          * we are about to deal with. Once this is done, get rid of
          * the object.
          */
-        cpuobj = object_new(possible_cpus->cpus[0].type);
+        cpuobj = object_new(machine->cpu_type);
         armcpu = ARM_CPU(cpuobj);
 
         pa_bits = arm_pamax(armcpu);
@@ -2113,6 +2111,43 @@ static void machvirt_init(MachineState *machine)
      */
     finalize_gic_version(vms);
 
+    /*
+     * The maximum number of CPUs depends on the GIC version, or on how
+     * many redistributors we can fit into the memory map (which in turn
+     * depends on whether this is a GICv3 or v4).
+     */
+    if (vms->gic_version == VIRT_GIC_VERSION_2) {
+        virt_max_cpus = GIC_NCPU;
+    } else {
+        virt_max_cpus = virt_redist_capacity(vms, VIRT_GIC_REDIST);
+        if (vms->highmem_redists) {
+            virt_max_cpus += virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
+        }
+    }
+
+    if (tcg_enabled() || hvf_enabled() || qtest_enabled() ||
+        (vms->gic_version < VIRT_GIC_VERSION_3)) {
+        max_cpus = machine->smp.max_cpus = smp_cpus;
+        mc->has_hotpluggable_cpus = false;
+        if (vms->gic_version >= VIRT_GIC_VERSION_3) {
+            warn_report("cpu hotplug feature has been disabled");
+        }
+    }
+
+    if (max_cpus > virt_max_cpus) {
+        error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
+                     "supported by machine 'mach-virt' (%d)",
+                     max_cpus, virt_max_cpus);
+        if (vms->gic_version != VIRT_GIC_VERSION_2 && !vms->highmem_redists) {
+            error_printf("Try 'highmem-redists=on' for more CPUs\n");
+        }
+
+        exit(1);
+    }
+
+    /* uses smp.max_cpus to initialize all possible vCPUs */
+    possible_cpus = mc->possible_cpu_arch_ids(machine);
+
     if (vms->secure) {
         /*
          * The Secure view of the world is the same as the NonSecure,
@@ -2147,31 +2182,6 @@ static void machvirt_init(MachineState *machine)
         vms->psci_conduit = QEMU_PSCI_CONDUIT_HVC;
     }
 
-    /*
-     * The maximum number of CPUs depends on the GIC version, or on how
-     * many redistributors we can fit into the memory map (which in turn
-     * depends on whether this is a GICv3 or v4).
-     */
-    if (vms->gic_version == VIRT_GIC_VERSION_2) {
-        virt_max_cpus = GIC_NCPU;
-    } else {
-        virt_max_cpus = virt_redist_capacity(vms, VIRT_GIC_REDIST);
-        if (vms->highmem_redists) {
-            virt_max_cpus += virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
-        }
-    }
-
-    if (max_cpus > virt_max_cpus) {
-        error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
-                     "supported by machine 'mach-virt' (%d)",
-                     max_cpus, virt_max_cpus);
-        if (vms->gic_version != VIRT_GIC_VERSION_2 && !vms->highmem_redists) {
-            error_printf("Try 'highmem-redists=on' for more CPUs\n");
-        }
-
-        exit(1);
-    }
-
     if (vms->secure && (kvm_enabled() || hvf_enabled())) {
         error_report("mach-virt: %s does not support providing "
                      "Security extensions (TrustZone) to the guest CPU",
-- 
2.34.1




* [PATCH RFC V3 04/29] hw/arm/virt: Move setting of common CPU properties in a function
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (2 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 03/29] hw/arm/virt: Limit number of possible vCPUs for unsupported Accel or GIC Type Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-08-12  5:19   ` Gavin Shan
  2024-06-13 23:36 ` [PATCH RFC V3 05/29] arm/virt, target/arm: Machine init time change common to vCPU {cold|hot}-plug Salil Mehta via
                   ` (28 subsequent siblings)
  32 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Factor out the CPU-property setup code common to {hot,cold}-plugged CPUs into
a helper function, so it can be reused when vCPUs are hot-plugged later.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c         | 261 ++++++++++++++++++++++++++++--------------
 include/hw/arm/virt.h |   4 +
 2 files changed, 182 insertions(+), 83 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3e1c4d2d2f..2e0ec7d869 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1753,6 +1753,46 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
     return arm_build_mp_affinity(idx, clustersz);
 }
 
+static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid)
+{
+    VirtMachineState *vms = VIRT_MACHINE(ms);
+    CPUArchId *found_cpu;
+    uint64_t mp_affinity;
+
+    assert(vcpuid >= 0 && vcpuid < ms->possible_cpus->len);
+
+    mp_affinity = virt_cpu_mp_affinity(vms, vcpuid);
+    found_cpu = &ms->possible_cpus->cpus[vcpuid];
+
+    assert(found_cpu->arch_id == mp_affinity);
+
+    /*
+     * RFC: Question:
+     * Slot-id is the index where vCPU with certain arch-id(=mpidr/ap-affinity)
+     * is plugged. For Host KVM, MPIDR for vCPU is derived using vcpu-id.
+     * As I understand, MPIDR and vcpu-id are property of vCPU but slot-id is
+     * more related to machine? Current code assumes slot-id and vcpu-id are
+     * same i.e. meaning of slot is bit vague.
+     *
+     * Q1: Is there any requirement to clearly represent slot and dissociate it
+     *     from vcpu-id?
+     * Q2: Should we make MPIDR within host KVM user configurable?
+     *
+     *          +----+----+----+----+----+----+----+----+
+     * MPIDR    |||  Res  |   Aff2  |   Aff1  |  Aff0   |
+     *          +----+----+----+----+----+----+----+----+
+     *                     \         \         \   |    |
+     *                      \   8bit  \   8bit  \  |4bit|
+     *                       \<------->\<------->\ |<-->|
+     *                        \         \         \|    |
+     *          +----+----+----+----+----+----+----+----+
+     * VCPU-ID  |  Byte3  |  Byte2  |  Byte1  |  Byte0  |
+     *          +----+----+----+----+----+----+----+----+
+     */
+
+    return found_cpu;
+}
+
 static inline bool *virt_get_high_memmap_enabled(VirtMachineState *vms,
                                                  int index)
 {
@@ -2065,16 +2105,129 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
     }
 }
 
+static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId *cpu_slot,
+                                    Error **errp)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    VirtMachineState *vms = VIRT_MACHINE(ms);
+    Error *local_err = NULL;
+    VirtMachineClass *vmc;
+
+    vmc = VIRT_MACHINE_GET_CLASS(ms);
+
+    /* now, set the cpu object property values */
+    numa_cpu_pre_plug(cpu_slot, DEVICE(cpuobj), &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    object_property_set_int(cpuobj, "mp-affinity", cpu_slot->arch_id, NULL);
+
+    if (!vms->secure) {
+        object_property_set_bool(cpuobj, "has_el3", false, NULL);
+    }
+
+    if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
+        object_property_set_bool(cpuobj, "has_el2", false, NULL);
+    }
+
+    if (vmc->kvm_no_adjvtime &&
+        object_property_find(cpuobj, "kvm-no-adjvtime")) {
+        object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
+    }
+
+    if (vmc->no_kvm_steal_time &&
+        object_property_find(cpuobj, "kvm-steal-time")) {
+        object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
+    }
+
+    if (vmc->no_pmu && object_property_find(cpuobj, "pmu")) {
+        object_property_set_bool(cpuobj, "pmu", false, NULL);
+    }
+
+    if (vmc->no_tcg_lpa2 && object_property_find(cpuobj, "lpa2")) {
+        object_property_set_bool(cpuobj, "lpa2", false, NULL);
+    }
+
+    if (object_property_find(cpuobj, "reset-cbar")) {
+        object_property_set_int(cpuobj, "reset-cbar",
+                                vms->memmap[VIRT_CPUPERIPHS].base,
+                                &local_err);
+        if (local_err) {
+            goto out;
+        }
+    }
+
+    /* link already initialized {secure,tag}-memory regions to this cpu */
+    object_property_set_link(cpuobj, "memory", OBJECT(vms->sysmem), &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    if (vms->secure) {
+        object_property_set_link(cpuobj, "secure-memory",
+                                 OBJECT(vms->secure_sysmem), &local_err);
+        if (local_err) {
+            goto out;
+        }
+    }
+
+    if (vms->mte) {
+        if (!object_property_find(cpuobj, "tag-memory")) {
+            error_setg(&local_err, "MTE requested, but not supported "
+                       "by the guest CPU");
+            if (local_err) {
+                goto out;
+            }
+        }
+
+        object_property_set_link(cpuobj, "tag-memory", OBJECT(vms->tag_sysmem),
+                                 &local_err);
+        if (local_err) {
+            goto out;
+        }
+
+        if (vms->secure) {
+            object_property_set_link(cpuobj, "secure-tag-memory",
+                                     OBJECT(vms->secure_tag_sysmem),
+                                     &local_err);
+            if (local_err) {
+                goto out;
+            }
+        }
+    }
+
+    /*
+     * RFC: Question: this must only be called for the hotplugged cpus. For the
+     * cold booted secondary cpus this is being taken care in arm_load_kernel()
+     * in boot.c. Perhaps we should remove that code now?
+     */
+    if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED) {
+        object_property_set_int(cpuobj, "psci-conduit", vms->psci_conduit,
+                                NULL);
+
+        /* Secondary CPUs start in PSCI powered-down state */
+        if (CPU(cpuobj)->cpu_index > 0) {
+            object_property_set_bool(cpuobj, "start-powered-off", true, NULL);
+        }
+    }
+
+out:
+    if (local_err) {
+        error_propagate(errp, local_err);
+    }
+}
+
 static void machvirt_init(MachineState *machine)
 {
     VirtMachineState *vms = VIRT_MACHINE(machine);
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(machine);
     MachineClass *mc = MACHINE_GET_CLASS(machine);
     const CPUArchIdList *possible_cpus;
-    MemoryRegion *sysmem = get_system_memory();
+    MemoryRegion *secure_tag_sysmem = NULL;
     MemoryRegion *secure_sysmem = NULL;
     MemoryRegion *tag_sysmem = NULL;
-    MemoryRegion *secure_tag_sysmem = NULL;
+    MemoryRegion *sysmem;
     int n, virt_max_cpus;
     bool firmware_loaded;
     bool aarch64 = true;
@@ -2148,6 +2301,8 @@ static void machvirt_init(MachineState *machine)
     /* uses smp.max_cpus to initialize all possible vCPUs */
     possible_cpus = mc->possible_cpu_arch_ids(machine);
 
+    sysmem = vms->sysmem = get_system_memory();
+
     if (vms->secure) {
         /*
          * The Secure view of the world is the same as the NonSecure,
@@ -2155,7 +2310,7 @@ static void machvirt_init(MachineState *machine)
          * containing the system memory at low priority; any secure-only
          * devices go in at higher priority and take precedence.
          */
-        secure_sysmem = g_new(MemoryRegion, 1);
+        secure_sysmem = vms->secure_sysmem = g_new(MemoryRegion, 1);
         memory_region_init(secure_sysmem, OBJECT(machine), "secure-memory",
                            UINT64_MAX);
         memory_region_add_subregion_overlap(secure_sysmem, 0, sysmem, -1);
@@ -2203,10 +2358,28 @@ static void machvirt_init(MachineState *machine)
         exit(1);
     }
 
+    if (vms->mte) {
+        /* Create the memory region only once, but link to all cpus later */
+        tag_sysmem = vms->tag_sysmem = g_new(MemoryRegion, 1);
+        memory_region_init(tag_sysmem, OBJECT(machine),
+                           "tag-memory", UINT64_MAX / 32);
+
+        if (vms->secure) {
+            secure_tag_sysmem = vms->secure_tag_sysmem = g_new(MemoryRegion, 1);
+            memory_region_init(secure_tag_sysmem, OBJECT(machine),
+                               "secure-tag-memory", UINT64_MAX / 32);
+
+            /* As with ram, secure-tag takes precedence over tag.  */
+            memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
+                                                tag_sysmem, -1);
+        }
+    }
+
     create_fdt(vms);
 
     assert(possible_cpus->len == max_cpus);
     for (n = 0; n < possible_cpus->len; n++) {
+        CPUArchId *cpu_slot;
         Object *cpuobj;
         CPUState *cs;
 
@@ -2215,15 +2388,10 @@ static void machvirt_init(MachineState *machine)
         }
 
         cpuobj = object_new(possible_cpus->cpus[n].type);
-        object_property_set_int(cpuobj, "mp-affinity",
-                                possible_cpus->cpus[n].arch_id, NULL);
 
         cs = CPU(cpuobj);
         cs->cpu_index = n;
 
-        numa_cpu_pre_plug(&possible_cpus->cpus[cs->cpu_index], DEVICE(cpuobj),
-                          &error_fatal);
-
         aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
         object_property_set_int(cpuobj, "socket-id",
                                 virt_get_socket_id(machine, n), NULL);
@@ -2234,81 +2402,8 @@ static void machvirt_init(MachineState *machine)
         object_property_set_int(cpuobj, "thread-id",
                                 virt_get_thread_id(machine, n), NULL);
 
-        if (!vms->secure) {
-            object_property_set_bool(cpuobj, "has_el3", false, NULL);
-        }
-
-        if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
-            object_property_set_bool(cpuobj, "has_el2", false, NULL);
-        }
-
-        if (vmc->kvm_no_adjvtime &&
-            object_property_find(cpuobj, "kvm-no-adjvtime")) {
-            object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
-        }
-
-        if (vmc->no_kvm_steal_time &&
-            object_property_find(cpuobj, "kvm-steal-time")) {
-            object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
-        }
-
-        if (vmc->no_pmu && object_property_find(cpuobj, "pmu")) {
-            object_property_set_bool(cpuobj, "pmu", false, NULL);
-        }
-
-        if (vmc->no_tcg_lpa2 && object_property_find(cpuobj, "lpa2")) {
-            object_property_set_bool(cpuobj, "lpa2", false, NULL);
-        }
-
-        if (object_property_find(cpuobj, "reset-cbar")) {
-            object_property_set_int(cpuobj, "reset-cbar",
-                                    vms->memmap[VIRT_CPUPERIPHS].base,
-                                    &error_abort);
-        }
-
-        object_property_set_link(cpuobj, "memory", OBJECT(sysmem),
-                                 &error_abort);
-        if (vms->secure) {
-            object_property_set_link(cpuobj, "secure-memory",
-                                     OBJECT(secure_sysmem), &error_abort);
-        }
-
-        if (vms->mte) {
-            /* Create the memory region only once, but link to all cpus. */
-            if (!tag_sysmem) {
-                /*
-                 * The property exists only if MemTag is supported.
-                 * If it is, we must allocate the ram to back that up.
-                 */
-                if (!object_property_find(cpuobj, "tag-memory")) {
-                    error_report("MTE requested, but not supported "
-                                 "by the guest CPU");
-                    exit(1);
-                }
-
-                tag_sysmem = g_new(MemoryRegion, 1);
-                memory_region_init(tag_sysmem, OBJECT(machine),
-                                   "tag-memory", UINT64_MAX / 32);
-
-                if (vms->secure) {
-                    secure_tag_sysmem = g_new(MemoryRegion, 1);
-                    memory_region_init(secure_tag_sysmem, OBJECT(machine),
-                                       "secure-tag-memory", UINT64_MAX / 32);
-
-                    /* As with ram, secure-tag takes precedence over tag.  */
-                    memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
-                                                        tag_sysmem, -1);
-                }
-            }
-
-            object_property_set_link(cpuobj, "tag-memory", OBJECT(tag_sysmem),
-                                     &error_abort);
-            if (vms->secure) {
-                object_property_set_link(cpuobj, "secure-tag-memory",
-                                         OBJECT(secure_tag_sysmem),
-                                         &error_abort);
-            }
-        }
+        cpu_slot = virt_find_cpu_slot(machine, cs->cpu_index);
+        virt_cpu_set_properties(cpuobj, cpu_slot, &error_fatal);
 
         qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
         object_unref(cpuobj);
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 6f9a7bb60b..780bd53ceb 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -139,6 +139,10 @@ struct VirtMachineState {
     DeviceState *platform_bus_dev;
     FWCfgState *fw_cfg;
     PFlashCFI01 *flash[2];
+    MemoryRegion *sysmem;
+    MemoryRegion *secure_sysmem;
+    MemoryRegion *tag_sysmem;
+    MemoryRegion *secure_tag_sysmem;
     bool secure;
     bool highmem;
     bool highmem_compact;
-- 
2.34.1




* [PATCH RFC V3 05/29] arm/virt, target/arm: Machine init time change common to vCPU {cold|hot}-plug
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (3 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 04/29] hw/arm/virt: Move setting of common CPU properties in a function Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 06/29] arm/virt, kvm: Pre-create disabled possible vCPUs @machine init Salil Mehta via
                   ` (27 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Introduce the common logic required during the initialization of both cold and
hot-plugged vCPUs. Additionally, initialize the *disabled* state of the vCPUs,
which will be used further during the initialization phases of various other
components like GIC, PMU, ACPI, etc., as part of the virtual machine
initialization.

KVM vCPUs corresponding to unplugged or yet-to-be-plugged QOM CPUs are kept in a
powered-off state in the KVM Host and do not run the guest code. Plugged vCPUs
are also kept in a powered-off state, but vCPU threads exist and remain in a
sleeping state.

TBD:
For cold-booted vCPUs, this change also exists in the `arm_load_kernel()`
function in `boot.c`. However, for hot-plugged CPUs, this change should remain
part of the pre-plug phase. We are duplicating the powering-off of the
cold-booted CPUs. Should we remove the duplicate change from `boot.c`?
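
For reference, the topology-to-index arithmetic performed by `virt_get_cpu_id_from_cpu_topo()` in the hunk below can be sketched as follows (Python used purely for illustration; the function name and parameter layout here are ours, not part of the patch):

```python
def vcpu_index(socket_id, cluster_id, core_id, thread_id,
               clusters, cores, threads):
    """Logical vCPU index from topology ids, mirroring the
    arithmetic of virt_get_cpu_id_from_cpu_topo()."""
    sock_vcpus = socket_id * (threads * cores * clusters)
    clus_vcpus = cluster_id * (threads * cores)
    core_vcpus = core_id * threads
    return sock_vcpus + clus_vcpus + core_vcpus + thread_id

# With 2 sockets x 2 clusters x 2 cores x 1 thread, indices run 0..7;
# e.g. socket 1, cluster 0, core 1, thread 0 -> index 5.
```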

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reported-by: Gavin Shan <gshan@redhat.com>
[GS: pointed out the assertion failure caused by a wrong range check]
---
 hw/arm/virt.c      | 94 +++++++++++++++++++++++++++++++++++++++++++++-
 target/arm/cpu.c   |  7 ++++
 target/arm/cpu64.c | 14 +++++++
 3 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 2e0ec7d869..a285139165 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2817,6 +2817,26 @@ static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
     return socket_id % ms->numa_state->num_nodes;
 }
 
+static int
+virt_get_cpu_id_from_cpu_topo(const MachineState *ms, DeviceState *dev)
+{
+    int cpu_id, sock_vcpu_num, clus_vcpu_num, core_vcpu_num;
+    ARMCPU *cpu = ARM_CPU(dev);
+
+    /* calculate total logical cpus across socket/cluster/core */
+    sock_vcpu_num = cpu->socket_id * (ms->smp.threads * ms->smp.cores *
+                    ms->smp.clusters);
+    clus_vcpu_num = cpu->cluster_id * (ms->smp.threads * ms->smp.cores);
+    core_vcpu_num = cpu->core_id * ms->smp.threads;
+
+    /* get the vcpu-id (logical CPU index) for this vcpu from its topology */
+    cpu_id = (sock_vcpu_num + clus_vcpu_num + core_vcpu_num) + cpu->thread_id;
+
+    assert(cpu_id >= 0 && cpu_id < ms->possible_cpus->len);
+
+    return cpu_id;
+}
+
 static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
 {
     int n;
@@ -2899,6 +2919,72 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
                          dev, &error_abort);
 }
 
+static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                              Error **errp)
+{
+    MachineState *ms = MACHINE(hotplug_dev);
+    ARMCPU *cpu = ARM_CPU(dev);
+    CPUState *cs = CPU(dev);
+    CPUArchId *cpu_slot;
+
+    /* sanity check the cpu */
+    if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
+        error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
+                   ms->cpu_type);
+        return;
+    }
+
+    if ((cpu->thread_id < 0) || (cpu->thread_id >= ms->smp.threads)) {
+        error_setg(errp, "Invalid thread-id %d specified, correct range 0:%u",
+                   cpu->thread_id, ms->smp.threads - 1);
+        return;
+    }
+
+    if ((cpu->core_id < 0) || (cpu->core_id >= ms->smp.cores)) {
+        error_setg(errp, "Invalid core-id %d specified, correct range 0:%u",
+                   cpu->core_id, ms->smp.cores - 1);
+        return;
+    }
+
+    if ((cpu->cluster_id < 0) || (cpu->cluster_id >= ms->smp.clusters)) {
+        error_setg(errp, "Invalid cluster-id %d specified, correct range 0:%u",
+                   cpu->cluster_id, ms->smp.clusters - 1);
+        return;
+    }
+
+    if ((cpu->socket_id < 0) || (cpu->socket_id >= ms->smp.sockets)) {
+        error_setg(errp, "Invalid socket-id %d specified, correct range 0:%u",
+                   cpu->socket_id, ms->smp.sockets - 1);
+        return;
+    }
+
+    cs->cpu_index = virt_get_cpu_id_from_cpu_topo(ms, dev);
+
+    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
+    if (qemu_present_cpu(CPU(cpu_slot->cpu))) {
+        error_setg(errp, "cpu(id%d=%d:%d:%d:%d) with arch-id %" PRIu64 " exists",
+                   cs->cpu_index, cpu->socket_id, cpu->cluster_id, cpu->core_id,
+                   cpu->thread_id, cpu_slot->arch_id);
+        return;
+    }
+    virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
+}
+
+static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                          Error **errp)
+{
+    MachineState *ms = MACHINE(hotplug_dev);
+    CPUState *cs = CPU(dev);
+    CPUArchId *cpu_slot;
+
+    /* insert the cold/hot-plugged vcpu in the slot */
+    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
+    cpu_slot->cpu = CPU(dev);
+
+    cs->disabled = false;
+    return;
+}
+
 static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
                                             DeviceState *dev, Error **errp)
 {
@@ -2906,6 +2992,8 @@ static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
         virt_memory_pre_plug(hotplug_dev, dev, errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        virt_cpu_pre_plug(hotplug_dev, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
         virtio_md_pci_pre_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
@@ -2962,6 +3050,8 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
         virt_memory_plug(hotplug_dev, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
         virtio_md_pci_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        virt_cpu_plug(hotplug_dev, dev, errp);
     }
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
@@ -3046,7 +3136,8 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
     if (device_is_dynamic_sysbus(mc, dev) ||
         object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
         object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI) ||
-        object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
+        object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
         return HOTPLUG_HANDLER(machine);
     }
     return NULL;
@@ -3150,6 +3241,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->valid_cpu_types = valid_cpu_types;
     mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
     mc->kvm_type = virt_kvm_type;
+    mc->has_hotpluggable_cpus = true;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
     hc->pre_plug = virt_machine_device_pre_plug_cb;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index abc4ed0842..c92162fa97 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2639,6 +2639,12 @@ static const TCGCPUOps arm_tcg_ops = {
 };
 #endif /* CONFIG_TCG */
 
+static int64_t arm_cpu_get_arch_id(CPUState *cs)
+{
+    ARMCPU *cpu = ARM_CPU(cs);
+    return cpu->mp_affinity;
+}
+
 static void arm_cpu_class_init(ObjectClass *oc, void *data)
 {
     ARMCPUClass *acc = ARM_CPU_CLASS(oc);
@@ -2658,6 +2664,7 @@ static void arm_cpu_class_init(ObjectClass *oc, void *data)
     cc->has_work = arm_cpu_has_work;
     cc->mmu_index = arm_cpu_mmu_index;
     cc->dump_state = arm_cpu_dump_state;
+    cc->get_arch_id = arm_cpu_get_arch_id;
     cc->set_pc = arm_cpu_set_pc;
     cc->get_pc = arm_cpu_get_pc;
     cc->gdb_read_register = arm_cpu_gdb_read_register;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index c15d086049..d6b48b3424 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -780,6 +780,17 @@ static void aarch64_cpu_set_aarch64(Object *obj, bool value, Error **errp)
     }
 }
 
+static void aarch64_cpu_initfn(Object *obj)
+{
+    CPUState *cs = CPU(obj);
+
+    /*
+     * We start every ARM64 vCPU as a disabled possible vCPU. It needs
+     * to be enabled explicitly.
+     */
+    cs->disabled = true;
+}
+
 static void aarch64_cpu_finalizefn(Object *obj)
 {
 }
@@ -792,7 +803,9 @@ static const gchar *aarch64_gdb_arch_name(CPUState *cs)
 static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
 {
     CPUClass *cc = CPU_CLASS(oc);
+    DeviceClass *dc = DEVICE_CLASS(oc);
 
+    dc->user_creatable = true;
     cc->gdb_read_register = aarch64_cpu_gdb_read_register;
     cc->gdb_write_register = aarch64_cpu_gdb_write_register;
     cc->gdb_core_xml_file = "aarch64-core.xml";
@@ -837,6 +850,7 @@ void aarch64_cpu_register(const ARMCPUInfo *info)
 static const TypeInfo aarch64_cpu_type_info = {
     .name = TYPE_AARCH64_CPU,
     .parent = TYPE_ARM_CPU,
+    .instance_init = aarch64_cpu_initfn,
     .instance_finalize = aarch64_cpu_finalizefn,
     .abstract = true,
     .class_init = aarch64_cpu_class_init,
-- 
2.34.1




* [PATCH RFC V3 06/29] arm/virt, kvm: Pre-create disabled possible vCPUs @machine init
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (4 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 05/29] arm/virt, target/arm: Machine init time change common to vCPU {cold|hot}-plug Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-08-13  0:58   ` [PATCH RFC V3 06/29] arm/virt,kvm: " Gavin Shan
  2024-08-19  5:31   ` Gavin Shan
  2024-06-13 23:36 ` [PATCH RFC V3 07/29] arm/virt, gicv3: Changes to pre-size GIC with possible vcpus " Salil Mehta via
                   ` (26 subsequent siblings)
  32 siblings, 2 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)

In the ARMv8 architecture, the GIC must know all the CPUs it is connected to
during its initialization, and this cannot change afterward. The same must be
ensured during the initialization of the VGIC in KVM, which requires all
vCPUs to be created and present at that time. This is necessary because:

1. The association between GICC and MPIDR must be fixed at VM initialization
   time. This is represented by the register `GIC_TYPER(mp_affinity, proc_num)`.
2. GICC (CPU interfaces), GICR (redistributors), etc., must all be initialized
   at boot time.
3. Memory regions associated with GICR, etc., cannot be changed (added, deleted,
   or modified) after the VM has been initialized.

This patch adds support to pre-create all possible vCPUs within the host using
the KVM interface as part of the virtual machine initialization. These vCPUs can
later be attached to QOM/ACPI when they are actually hot-plugged and made
present.
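
As a usage sketch (hedged: the option and device names below follow the interface proposed by this series and may differ in a final version), a guest could be booted with only some of its possible vCPUs present, the rest being pre-created in KVM and parked at machine init:

```sh
# Boot with 2 of 4 possible vCPUs present; the remaining 2 are pre-created
# in the KVM host at machine init and parked (illustrative command line).
qemu-system-aarch64 -machine virt,gic-version=3 -accel kvm -cpu host \
    -smp cpus=2,maxcpus=4 -m 1G -kernel Image -append "console=ttyAMA0"

# Later, from the QEMU monitor, attach one of the parked vCPUs:
(qemu) device_add host-arm-cpu,id=cpu2,socket-id=0,cluster-id=0,core-id=2,thread-id=0
```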

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
[VP: Identified CPU stall issue & suggested probable fix]
---
 hw/arm/virt.c         | 56 +++++++++++++++++++++++++++++++++++--------
 include/hw/core/cpu.h |  1 +
 target/arm/cpu64.c    |  1 +
 target/arm/kvm.c      | 41 ++++++++++++++++++++++++++++++-
 target/arm/kvm_arm.h  | 11 +++++++++
 5 files changed, 99 insertions(+), 11 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a285139165..81e7a27786 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2383,14 +2383,8 @@ static void machvirt_init(MachineState *machine)
         Object *cpuobj;
         CPUState *cs;
 
-        if (n >= smp_cpus) {
-            break;
-        }
-
         cpuobj = object_new(possible_cpus->cpus[n].type);
-
         cs = CPU(cpuobj);
-        cs->cpu_index = n;
 
         aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
         object_property_set_int(cpuobj, "socket-id",
@@ -2402,11 +2396,53 @@ static void machvirt_init(MachineState *machine)
         object_property_set_int(cpuobj, "thread-id",
                                 virt_get_thread_id(machine, n), NULL);
 
-        cpu_slot = virt_find_cpu_slot(machine, cs->cpu_index);
-        virt_cpu_set_properties(cpuobj, cpu_slot, &error_fatal);
+        if (n < smp_cpus) {
+            qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
+            object_unref(cpuobj);
+        } else {
+            /* handling for vcpus which are yet to be hot-plugged */
+            cs->cpu_index = n;
+            cpu_slot = virt_find_cpu_slot(machine, cs->cpu_index);
 
-        qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
-        object_unref(cpuobj);
+            /*
+             * ARM host vCPU features need to be fixed at boot time. But as
+             * per the current approach this CPU object will be destroyed
+             * during cpu_post_init(). During hotplug of vCPUs these
+             * properties are initialized again.
+             */
+            virt_cpu_set_properties(cpuobj, cpu_slot, &error_fatal);
+
+            /*
+             * For KVM, we shall be pre-creating the now disabled/un-plugged
+             * possible host vCPUs and parking them until they are actually
+             * hot-plugged. This is required to pre-size the host GICC and
+             * GICR with all possible vCPUs for this VM.
+             */
+            if (kvm_enabled()) {
+                kvm_arm_create_host_vcpu(ARM_CPU(cs));
+            }
+            /*
+             * Add disabled vCPU to CPU slot during the init phase of the virt
+             * machine
+             * 1. We need this ARMCPU object during the GIC init. This object
+             *    will facilitate in pre-realizing the GIC. Any info like
+             *    mp-affinity(required to derive gicr_type) etc. could still be
+             *    fetched while preserving QOM abstraction akin to realized
+             *    vCPUs.
+             * 2. Now, after initialization of the virt machine is complete we
+             *    could use two approaches to deal with this ARMCPU object:
+             *    (i) re-use this ARMCPU object during hotplug of this vCPU.
+             *                             OR
+             *    (ii) defer releasing this ARMCPU object until after the GIC
+             *         has been initialized, or during the pre-plug phase when
+             *         a vCPU is hot-plugged.
+             *
+             *    We will use approach (ii) and release the ARMCPU objects
+             *    after the GIC and machine have been fully initialized,
+             *    during the machine_init_done() phase.
+             */
+            cpu_slot->cpu = cs;
+        }
     }
 
     /* Now we've created the CPUs we can see if they have the hypvirt timer */
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 60b4778da9..62e68611c0 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -520,6 +520,7 @@ struct CPUState {
     uint64_t dirty_pages;
     int kvm_vcpu_stats_fd;
     bool vcpu_dirty;
+    VMChangeStateEntry *vmcse;
 
     /* Use by accel-block: CPU is executing an ioctl() */
     QemuLockCnt in_ioctl_lock;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index d6b48b3424..9b7e8b032c 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -789,6 +789,7 @@ static void aarch64_cpu_initfn(Object *obj)
      * enabled explicitly
      */
     cs->disabled = true;
+    cs->thread_id = 0;
 }
 
 static void aarch64_cpu_finalizefn(Object *obj)
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 7cf5cf31de..01c83c1994 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1003,6 +1003,38 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
     write_list_to_cpustate(cpu);
 }
 
+void kvm_arm_create_host_vcpu(ARMCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    unsigned long vcpu_id = cs->cpu_index;
+    int ret;
+
+    ret = kvm_create_vcpu(cs);
+    if (ret < 0) {
+        error_report("Failed to create host vcpu %ld", vcpu_id);
+        abort();
+    }
+
+    /*
+     * Initialize the vCPU in the host. This will reset the sys regs
+     * for this vCPU; related registers like MPIDR_EL1 also get
+     * programmed in the host during this call. These are referred to
+     * later while setting device attributes of the GICR during GICv3
+     * reset.
+     */
+    ret = kvm_arch_init_vcpu(cs);
+    if (ret < 0) {
+        error_report("Failed to initialize host vcpu %ld", vcpu_id);
+        abort();
+    }
+
+    /*
+     * Park the created vCPU; it shall be retrieved via kvm_get_vcpu()
+     * when vCPU threads are created during realization of ARM vCPUs.
+     */
+    kvm_park_vcpu(cs);
+}
+
 /*
  * Update KVM's MP_STATE based on what QEMU thinks it is
  */
@@ -1874,7 +1906,14 @@ int kvm_arch_init_vcpu(CPUState *cs)
         return -EINVAL;
     }
 
-    qemu_add_vm_change_state_handler(kvm_arm_vm_state_change, cpu);
+    /*
+     * Install the VM change state handler only when the vCPU thread has
+     * been spawned, i.e. when the vCPU is being realized.
+     */
+    if (cs->thread_id) {
+        cs->vmcse = qemu_add_vm_change_state_handler(kvm_arm_vm_state_change,
+                                                     cpu);
+    }
 
     /* Determine init features for this CPU */
     memset(cpu->kvm_init_features, 0, sizeof(cpu->kvm_init_features));
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index cfaa0d9bc7..0be7e896d2 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -96,6 +96,17 @@ void kvm_arm_cpu_post_load(ARMCPU *cpu);
  */
 void kvm_arm_reset_vcpu(ARMCPU *cpu);
 
+/**
+ * kvm_arm_create_host_vcpu:
+ * @cpu: ARMCPU
+ *
+ * Called to pre-create all possible KVM vCPUs within the host at virt
+ * machine init time. This will also initialize the pre-created vCPUs and
+ * hence results in a vCPU reset in the host. These pre-created and
+ * initialized vCPUs shall be parked for use when ARM vCPUs are realized.
+ */
+void kvm_arm_create_host_vcpu(ARMCPU *cpu);
+
 #ifdef CONFIG_KVM
 /**
  * kvm_arm_create_scratch_host_vcpu:
-- 
2.34.1




* [PATCH RFC V3 07/29] arm/virt, gicv3: Changes to pre-size GIC with possible vcpus @machine init
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (5 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 06/29] arm/virt, kvm: Pre-create disabled possible vCPUs @machine init Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 08/29] arm/virt: Init PMU at host for all possible vcpus Salil Mehta via
                   ` (25 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)

The GIC needs to be pre-sized with possible vCPUs at initialization time. This
is necessary because memory regions and resources associated with GICC/GICR,
etc., cannot be changed (added, deleted, or modified) after the VM has been
initialized. Additionally, `GIC_TYPER` needs to be initialized with
`mp_affinity` and CPU interface number association, which cannot be changed
after the GIC has been initialized.

Once all the CPU interfaces of the GIC have been initialized, it must be ensured
that any updates to the GICC during reset only take place for the *enabled*
vCPUs and not the disabled ones. Therefore, proper checks are required at
various places.
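
The pre-sizing decision for the redistributor regions can be sketched as follows (Python for illustration only; the capacities are hypothetical inputs that QEMU normally derives from the region sizes):

```python
def redist_region_counts(max_cpus, redist0_capacity, redist1_capacity,
                         highmem_redists=True):
    """Size GICv3 redistributor regions by max_cpus (all possible vCPUs)
    rather than the boot-time vCPU count, mirroring this patch."""
    redist0_count = min(max_cpus, redist0_capacity)
    if max_cpus > redist0_capacity and highmem_redists:
        # Two regions: the overflow vCPUs land in the high redist region.
        return [redist0_count,
                min(max_cpus - redist0_count, redist1_capacity)]
    return [redist0_count]
```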

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
[changed the comment in arm_gicv3_icc_reset]
---
 hw/arm/virt.c              | 15 ++++++++-------
 hw/intc/arm_gicv3_common.c |  7 +++++--
 hw/intc/arm_gicv3_cpuif.c  |  8 ++++++++
 hw/intc/arm_gicv3_kvm.c    | 34 +++++++++++++++++++++++++++++++---
 include/hw/arm/virt.h      |  2 +-
 5 files changed, 53 insertions(+), 13 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 81e7a27786..ac53bfadca 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -751,6 +751,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
     const char *gictype;
     int i;
     unsigned int smp_cpus = ms->smp.cpus;
+    unsigned int max_cpus = ms->smp.max_cpus;
     uint32_t nb_redist_regions = 0;
     int revision;
 
@@ -775,7 +776,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
     }
     vms->gic = qdev_new(gictype);
     qdev_prop_set_uint32(vms->gic, "revision", revision);
-    qdev_prop_set_uint32(vms->gic, "num-cpu", smp_cpus);
+    qdev_prop_set_uint32(vms->gic, "num-cpu", max_cpus);
     /* Note that the num-irq property counts both internal and external
      * interrupts; there are always 32 of the former (mandated by GIC spec).
      */
@@ -787,7 +788,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
     if (vms->gic_version != VIRT_GIC_VERSION_2) {
         QList *redist_region_count;
         uint32_t redist0_capacity = virt_redist_capacity(vms, VIRT_GIC_REDIST);
-        uint32_t redist0_count = MIN(smp_cpus, redist0_capacity);
+        uint32_t redist0_count = MIN(max_cpus, redist0_capacity);
 
         nb_redist_regions = virt_gicv3_redist_region_count(vms);
 
@@ -798,7 +799,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
                 virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
 
             qlist_append_int(redist_region_count,
-                MIN(smp_cpus - redist0_count, redist1_capacity));
+                MIN(max_cpus - redist0_count, redist1_capacity));
         }
         qdev_prop_set_array(vms->gic, "redist-region-count",
                             redist_region_count);
@@ -871,7 +872,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
         } else if (vms->virt) {
             qemu_irq irq = qdev_get_gpio_in(vms->gic,
                                             intidbase + ARCH_GIC_MAINT_IRQ);
-            sysbus_connect_irq(gicbusdev, i + 4 * smp_cpus, irq);
+            sysbus_connect_irq(gicbusdev, i + 4 * max_cpus, irq);
         }
 
         qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
@@ -879,11 +880,11 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
                                                      + VIRTUAL_PMU_IRQ));
 
         sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
-        sysbus_connect_irq(gicbusdev, i + smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + max_cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
-        sysbus_connect_irq(gicbusdev, i + 2 * smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + 2 * max_cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
-        sysbus_connect_irq(gicbusdev, i + 3 * smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + 3 * max_cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
 
         if (vms->gic_version != VIRT_GIC_VERSION_2) {
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index bd50a1b079..183d2de7eb 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -436,10 +436,13 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
     s->cpu = g_new0(GICv3CPUState, s->num_cpu);
 
     for (i = 0; i < s->num_cpu; i++) {
-        CPUState *cpu = qemu_get_cpu(i);
+        CPUState *cpu = qemu_get_possible_cpu(i);
         uint64_t cpu_affid;
 
-        s->cpu[i].cpu = cpu;
+        if (qemu_enabled_cpu(cpu)) {
+            s->cpu[i].cpu = cpu;
+        }
+
         s->cpu[i].gic = s;
         /* Store GICv3CPUState in CPUARMState gicv3state pointer */
         gicv3_set_gicv3state(cpu, &s->cpu[i]);
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index bdb13b00e9..2a8aff0b99 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -1052,6 +1052,10 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
     ARMCPU *cpu = ARM_CPU(cs->cpu);
     CPUARMState *env = &cpu->env;
 
+    if (!qemu_enabled_cpu(cs->cpu)) {
+        return;
+    }
+
     g_assert(bql_locked());
 
     trace_gicv3_cpuif_update(gicv3_redist_affid(cs), cs->hppi.irq,
@@ -2036,6 +2040,10 @@ static void icc_generate_sgi(CPUARMState *env, GICv3CPUState *cs,
     for (i = 0; i < s->num_cpu; i++) {
         GICv3CPUState *ocs = &s->cpu[i];
 
+        if (!qemu_enabled_cpu(ocs->cpu)) {
+            continue;
+        }
+
         if (irm) {
             /* IRM == 1 : route to all CPUs except self */
             if (cs == ocs) {
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 9ea6b8e218..8dbbd79e1b 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -24,6 +24,7 @@
 #include "hw/intc/arm_gicv3_common.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
+#include "sysemu/cpus.h"
 #include "sysemu/kvm.h"
 #include "sysemu/runstate.h"
 #include "kvm_arm.h"
@@ -458,6 +459,18 @@ static void kvm_arm_gicv3_put(GICv3State *s)
         GICv3CPUState *c = &s->cpu[ncpu];
         int num_pri_bits;
 
+        /*
+         * To support hotplug of vCPUs we need to make sure all GIC cpuif/GICC
+         * are initialized at machvirt init time. Once init is done we release
+         * the ARMCPU objects for the disabled vCPUs, but this path could also
+         * be hit during a later reset of the GICC, i.e. after init has
+         * happened. In all of these cases we want to make sure we don't
+         * access the GICC for the disabled vCPUs.
+         */
+        if (!qemu_enabled_cpu(c->cpu)) {
+            continue;
+        }
+
         kvm_gicc_access(s, ICC_SRE_EL1, ncpu, &c->icc_sre_el1, true);
         kvm_gicc_access(s, ICC_CTLR_EL1, ncpu,
                         &c->icc_ctlr_el1[GICV3_NS], true);
@@ -616,6 +629,11 @@ static void kvm_arm_gicv3_get(GICv3State *s)
         GICv3CPUState *c = &s->cpu[ncpu];
         int num_pri_bits;
 
+        /* don't access GICC for the disabled vCPUs. */
+        if (!qemu_enabled_cpu(c->cpu)) {
+            continue;
+        }
+
         kvm_gicc_access(s, ICC_SRE_EL1, ncpu, &c->icc_sre_el1, false);
         kvm_gicc_access(s, ICC_CTLR_EL1, ncpu,
                         &c->icc_ctlr_el1[GICV3_NS], false);
@@ -695,10 +713,19 @@ static void arm_gicv3_icc_reset(CPUARMState *env, const ARMCPRegInfo *ri)
         return;
     }
 
+    /*
+     * This shall be called even when a vCPU is being hot-plugged or onlined
+     * while other vCPUs might be running. The host kernel KVM code handling
+     * the KVM_{GET|SET}_DEVICE_ATTR device-access ioctls might fail due to an
+     * inability to grab the locks for all vCPUs. Hence, we need to pause all
+     * vCPUs to facilitate locking within the host.
+     */
+    pause_all_vcpus();
     /* Initialize to actual HW supported configuration */
     kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS,
                       KVM_VGIC_ATTR(ICC_CTLR_EL1, c->gicr_typer),
                       &c->icc_ctlr_el1[GICV3_NS], false, &error_abort);
+    resume_all_vcpus();
 
     c->icc_ctlr_el1[GICV3_S] = c->icc_ctlr_el1[GICV3_NS];
 }
@@ -813,9 +840,10 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
     gicv3_init_irqs_and_mmio(s, kvm_arm_gicv3_set_irq, NULL);
 
     for (i = 0; i < s->num_cpu; i++) {
-        ARMCPU *cpu = ARM_CPU(qemu_get_cpu(i));
-
-        define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);
+        CPUState *cs = qemu_get_cpu(i);
+        if (qemu_enabled_cpu(cs)) {
+            define_arm_cp_regs(ARM_CPU(cs), gicv3_cpuif_reginfo);
+        }
     }
 
     /* Try to create the device via the device control API */
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 780bd53ceb..36ac5ff4a2 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -209,7 +209,7 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
 
     assert(vms->gic_version != VIRT_GIC_VERSION_2);
 
-    return (MACHINE(vms)->smp.cpus > redist0_capacity &&
+    return (MACHINE(vms)->smp.max_cpus > redist0_capacity &&
             vms->highmem_redists) ? 2 : 1;
 }
 
-- 
2.34.1




* [PATCH RFC V3 08/29] arm/virt: Init PMU at host for all possible vcpus
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (6 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 07/29] arm/virt, gicv3: Changes to pre-size GIC with possible vcpus " Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-07-04  3:07   ` Nicholas Piggin
  2024-06-13 23:36 ` [PATCH RFC V3 09/29] arm/acpi: Enable ACPI support for vcpu hotplug Salil Mehta via
                   ` (24 subsequent siblings)
  32 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)

The PMU for all possible vCPUs must be initialized at VM initialization time.
Refactor the existing code to accommodate possible vCPUs. This also assumes
that all processors being used are identical.

Past discussion for reference:
Link: https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c         | 12 ++++++++----
 include/hw/arm/virt.h |  1 +
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ac53bfadca..57ec429022 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2045,12 +2045,14 @@ static void finalize_gic_version(VirtMachineState *vms)
  */
 static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
 {
+    CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
     int max_cpus = MACHINE(vms)->smp.max_cpus;
-    bool aarch64, pmu, steal_time;
+    bool aarch64, steal_time;
     CPUState *cpu;
+    int n;
 
     aarch64 = object_property_get_bool(OBJECT(first_cpu), "aarch64", NULL);
-    pmu = object_property_get_bool(OBJECT(first_cpu), "pmu", NULL);
+    vms->pmu = object_property_get_bool(OBJECT(first_cpu), "pmu", NULL);
     steal_time = object_property_get_bool(OBJECT(first_cpu),
                                           "kvm-steal-time", NULL);
 
@@ -2077,8 +2079,10 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
             memory_region_add_subregion(sysmem, pvtime_reg_base, pvtime);
         }
 
-        CPU_FOREACH(cpu) {
-            if (pmu) {
+        for (n = 0; n < possible_cpus->len; n++) {
+            cpu = qemu_get_possible_cpu(n);
+
+            if (vms->pmu) {
                 assert(arm_feature(&ARM_CPU(cpu)->env, ARM_FEATURE_PMU));
                 if (kvm_irqchip_in_kernel()) {
                     kvm_arm_pmu_set_irq(ARM_CPU(cpu), VIRTUAL_PMU_IRQ);
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 36ac5ff4a2..d8dcc89a0d 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -155,6 +155,7 @@ struct VirtMachineState {
     bool ras;
     bool mte;
     bool dtb_randomness;
+    bool pmu;
     OnOffAuto acpi;
     VirtGICType gic_version;
     VirtIOMMUType iommu;
-- 
2.34.1




* [PATCH RFC V3 09/29] arm/acpi: Enable ACPI support for vcpu hotplug
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (7 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 08/29] arm/virt: Init PMU at host for all possible vcpus Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 10/29] arm/virt: Add cpu hotplug events to GED during creation Salil Mehta via
                   ` (23 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

ACPI is required to interface QEMU with the guest. Broadly, this falls into the
cases below:

1. Convey the possible vCPU configuration at machine init time to the guest
   using various ACPI tables like the MADT etc.
2. Convey vCPU hotplug events to the guest (using GED).
3. Assist in the evaluation of various ACPI methods (like _EVT, _STA, _OST,
   _EJ0, _MAT etc.).
4. Provide the ACPI CPU hotplug state and a 12-byte memory-mapped CPU hotplug
   control register interface to the OSPM/guest, corresponding to each possible
   vCPU. The register interface consists of various R/W fields and their
   handling operations. These are invoked whenever the register fields or
   memory regions are accessed (i.e. read or written) by the OSPM while it
   evaluates various ACPI methods.

Note: a lot of this framework code is inherited from the changes already done
      for x86, but some minor changes are still required to make it compatible
      with ARM64.

This patch enables the ACPI support for virtual cpu hotplug. ACPI changes
required will follow in subsequent patches.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 8b97683a45..e94f19a26a 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -33,6 +33,7 @@ config ARM_VIRT
     select ACPI_HW_REDUCED
     select ACPI_APEI
     select ACPI_VIOT
+    select ACPI_CPU_HOTPLUG
     select VIRTIO_MEM_SUPPORTED
     select ACPI_CXL
     select ACPI_HMAT
-- 
2.34.1




* [PATCH RFC V3 10/29] arm/virt: Add cpu hotplug events to GED during creation
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (8 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 09/29] arm/acpi: Enable ACPI support for vcpu hotplug Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 11/29] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed Salil Mehta via
                   ` (22 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Add the CPU Hotplug event to the set of supported GED events during the creation
of the GED device at VM initialization. Additionally, initialize the memory
map for the CPU Hotplug control device, which is used for event exchanges
between QEMU/VMM and the guest.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
---
 hw/arm/virt.c         | 5 ++++-
 include/hw/arm/virt.h | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 57ec429022..918bcb9a1b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -80,6 +80,7 @@
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/acpi/generic_event_device.h"
+#include "hw/acpi/cpu_hotplug.h"
 #include "hw/virtio/virtio-md-pci.h"
 #include "hw/virtio/virtio-iommu.h"
 #include "hw/char/pl011.h"
@@ -176,6 +177,7 @@ static const MemMapEntry base_memmap[] = {
     [VIRT_NVDIMM_ACPI] =        { 0x09090000, NVDIMM_ACPI_IO_LEN},
     [VIRT_PVTIME] =             { 0x090a0000, 0x00010000 },
     [VIRT_SECURE_GPIO] =        { 0x090b0000, 0x00001000 },
+    [VIRT_CPUHP_ACPI] =         { 0x090c0000, ACPI_CPU_HOTPLUG_REG_LEN},
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
@@ -660,7 +662,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
     DeviceState *dev;
     MachineState *ms = MACHINE(vms);
     int irq = vms->irqmap[VIRT_ACPI_GED];
-    uint32_t event = ACPI_GED_PWR_DOWN_EVT;
+    uint32_t event = ACPI_GED_PWR_DOWN_EVT | ACPI_GED_CPU_HOTPLUG_EVT;
 
     if (ms->ram_slots) {
         event |= ACPI_GED_MEM_HOTPLUG_EVT;
@@ -676,6 +678,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
 
     sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, vms->memmap[VIRT_ACPI_GED].base);
     sysbus_mmio_map(SYS_BUS_DEVICE(dev), 1, vms->memmap[VIRT_PCDIMM_ACPI].base);
+    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 3, vms->memmap[VIRT_CPUHP_ACPI].base);
     sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, qdev_get_gpio_in(vms->gic, irq));
 
     return dev;
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index d8dcc89a0d..d711cab46d 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -75,6 +75,7 @@ enum {
     VIRT_PCDIMM_ACPI,
     VIRT_ACPI_GED,
     VIRT_NVDIMM_ACPI,
+    VIRT_CPUHP_ACPI,
     VIRT_PVTIME,
     VIRT_LOWMEMMAP_LAST,
 };
-- 
2.34.1




* [PATCH RFC V3 11/29] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (9 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 10/29] arm/virt: Add cpu hotplug events to GED during creation Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-08-13  1:04   ` Gavin Shan
  2024-06-13 23:36 ` [PATCH RFC V3 12/29] arm/virt/acpi: Build CPUs AML with CPU Hotplug support Salil Mehta via
                   ` (21 subsequent siblings)
  32 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

The ACPI CPU hotplug state (is_present=_STA.PRESENT, is_enabled=_STA.ENABLED)
for all possible vCPUs MUST be initialized during machine init. This is done
during the creation of the GED device. The VMM/QEMU MUST always expose/fake the
ACPI state of the disabled vCPUs to the guest kernel as 'present'
(_STA.PRESENT), i.e. ACPI-persistent. If the 'disabled' vCPU objects are
destroyed before the GED device has been created, their ACPI hotplug state
might not get initialized correctly, as the acpi_persistent flag is part of
CPUState. This would expose the wrong status of the unplugged vCPUs to the
guest kernel.

Hence, move the GED device creation to before the disabled vCPU objects are
destroyed as part of the post-CPU-init routine.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 918bcb9a1b..5f98162587 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2467,6 +2467,12 @@ static void machvirt_init(MachineState *machine)
 
     create_gic(vms, sysmem);
 
+    has_ged = has_ged && aarch64 && firmware_loaded &&
+              virt_is_acpi_enabled(vms);
+    if (has_ged) {
+        vms->acpi_dev = create_acpi_ged(vms);
+    }
+
     virt_cpu_post_init(vms, sysmem);
 
     fdt_add_pmu_nodes(vms);
@@ -2489,9 +2495,7 @@ static void machvirt_init(MachineState *machine)
 
     create_pcie(vms);
 
-    if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
-        vms->acpi_dev = create_acpi_ged(vms);
-    } else {
+    if (!has_ged) {
         create_gpio_devices(vms, VIRT_GPIO, sysmem);
     }
 
-- 
2.34.1




* [PATCH RFC V3 12/29] arm/virt/acpi: Build CPUs AML with CPU Hotplug support
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (10 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 11/29] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 13/29] arm/virt: Make ARM vCPU *present* status ACPI *persistent* Salil Mehta via
                   ` (20 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Support for Virtual CPU Hotplug requires a sequence of ACPI handshakes between
QEMU and the guest kernel when a vCPU is plugged or unplugged. Most of the AML
code to support these handshakes already exists. This AML needs to be built
during VM initialization for the ARM architecture as well, if GED support
exists.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt-acpi-build.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index c3ccfef026..2d44567df5 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -799,6 +799,7 @@ static void
 build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
+    MachineClass *mc = MACHINE_GET_CLASS(vms);
     Aml *scope, *dsdt;
     MachineState *ms = MACHINE(vms);
     const MemMapEntry *memmap = vms->memmap;
@@ -815,7 +816,18 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
      * the RTC ACPI device at all when using UEFI.
      */
     scope = aml_scope("\\_SB");
-    acpi_dsdt_add_cpus(scope, vms);
+    /* if GED is enabled then cpus AML shall be added as part build_cpus_aml */
+    if (vms->acpi_dev && mc->has_hotpluggable_cpus) {
+        CPUHotplugFeatures opts = {
+             .acpi_1_compatible = false,
+             .has_legacy_cphp = false
+        };
+
+        build_cpus_aml(scope, ms, opts, NULL, memmap[VIRT_CPUHP_ACPI].base,
+                       "\\_SB", NULL, AML_SYSTEM_MEMORY);
+    } else {
+        acpi_dsdt_add_cpus(scope, vms);
+    }
     acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
                        (irqmap[VIRT_UART] + ARM_SPI_BASE));
     if (vmc->acpi_expose_flash) {
-- 
2.34.1




* [PATCH RFC V3 13/29] arm/virt: Make ARM vCPU *present* status ACPI *persistent*
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (11 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 12/29] arm/virt/acpi: Build CPUs AML with CPU Hotplug support Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-07-04  2:49   ` Nicholas Piggin
  2024-06-13 23:36 ` [PATCH RFC V3 14/29] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} Bits to Guest Salil Mehta via
                   ` (19 subsequent siblings)
  32 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

The ARM architecture does not allow CPU presence to be changed [1] after the
kernel has booted. Hence, firmware/ACPI/QEMU must ensure a persistent view of
the vCPUs to the guest kernel, even when they are not present in QOM, i.e. are
unplugged or are yet to be plugged.

References:
[1] Check comment 5 in the bugzilla entry
   Link: https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 cpu-common.c          |  6 ++++++
 hw/arm/virt.c         |  7 +++++++
 include/hw/core/cpu.h | 21 +++++++++++++++++++++
 3 files changed, 34 insertions(+)

diff --git a/cpu-common.c b/cpu-common.c
index 49d2a50835..e4b4dee99a 100644
--- a/cpu-common.c
+++ b/cpu-common.c
@@ -128,6 +128,12 @@ bool qemu_enabled_cpu(CPUState *cpu)
     return cpu && !cpu->disabled;
 }
 
+bool qemu_persistent_cpu(CPUState *cpu)
+{
+    /* cpu state can be faked to the guest via acpi */
+    return cpu && cpu->acpi_persistent;
+}
+
 uint64_t qemu_get_cpu_archid(int cpu_index)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 5f98162587..9d33f30a6a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3016,6 +3016,13 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         return;
     }
     virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
+
+    /*
+     * To give persistent presence view of vCPUs to the guest, ACPI might need
+     * to fake the presence of the vCPUs to the guest but keep them disabled.
+     * This shall be used during the init of ACPI Hotplug state and hot-unplug
+     */
+     cs->acpi_persistent = true;
 }
 
 static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 62e68611c0..e13e542177 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -540,6 +540,14 @@ struct CPUState {
      * every CPUState is enabled across all architectures.
      */
     bool disabled;
+    /*
+     * On certain architectures, to provide a persistent view of the 'presence'
+     * of vCPUs to the guest, ACPI might need to fake the 'presence' of the
+     * vCPUs but keep them ACPI-disabled for the guest. This is achieved by
+     * returning `_STA.PRES=True` and `_STA.Ena=False` for the unplugged vCPUs
+     * in QEMU QoM.
+     */
+    bool acpi_persistent;
 
     /* TODO Move common fields from CPUArchState here. */
     int cpu_index;
@@ -959,6 +967,19 @@ bool qemu_present_cpu(CPUState *cpu);
  */
 bool qemu_enabled_cpu(CPUState *cpu);
 
+/**
+ * qemu_persistent_cpu:
+ * @cpu: The vCPU to check
+ *
+ * Checks if the vCPU state should always be reflected as *present* via ACPI
+ * to the Guest. By default, this is False on all architectures and has to be
+ * explicity set during initialization.
+ *
+ * Returns: True if it is ACPI 'persistent' CPU
+ *
+ */
+bool qemu_persistent_cpu(CPUState *cpu);
+
 /**
  * qemu_get_cpu_archid:
  * @cpu_index: possible vCPU for which arch-id needs to be retreived
-- 
2.34.1




* [PATCH RFC V3 14/29] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} Bits to Guest
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (12 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 13/29] arm/virt: Make ARM vCPU *present* status ACPI *persistent* Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 15/29] hw/arm: MADT Tbl change to size the guest with possible vCPUs Salil Mehta via
                   ` (18 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

ACPI AML changes are required to properly reflect the `_STA.PRES` and `_STA.ENA`
bits to the guest during initialization, when CPUs are hot-plugged, and after
CPUs are hot-unplugged.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/acpi/cpu.c                  | 53 ++++++++++++++++++++++++++++++----
 hw/acpi/generic_event_device.c | 11 +++++++
 include/hw/acpi/cpu.h          |  2 ++
 3 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 4c63514b16..40b8899125 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -64,10 +64,11 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
     cdev = &cpu_st->devs[cpu_st->selector];
     switch (addr) {
     case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
-        val |= cdev->cpu ? 1 : 0;
+        val |= cdev->is_enabled ? 1 : 0;
         val |= cdev->is_inserting ? 2 : 0;
         val |= cdev->is_removing  ? 4 : 0;
         val |= cdev->fw_remove  ? 16 : 0;
+        val |= cdev->is_present ? 32 : 0;
         trace_cpuhp_acpi_read_flags(cpu_st->selector, val);
         break;
     case ACPI_CPU_CMD_DATA_OFFSET_RW:
@@ -230,7 +231,23 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
     state->dev_count = id_list->len;
     state->devs = g_new0(typeof(*state->devs), state->dev_count);
     for (i = 0; i < id_list->len; i++) {
-        state->devs[i].cpu =  CPU(id_list->cpus[i].cpu);
+        struct CPUState *cpu = CPU(id_list->cpus[i].cpu);
+        /*
+         * In most archs, CPUs which are ACPI 'present' are also ACPI 'enabled'
+         * by default. And these states are consistent at QOM and ACPI level.
+         */
+        if (qemu_enabled_cpu(cpu)) {
+            state->devs[i].cpu = cpu;
+            state->devs[i].is_present = true;
+            state->devs[i].is_enabled = true;
+        } else {
+            state->devs[i].is_enabled = false;
+            /*
+             * Some archs might expose 'disabled' QOM CPUs as ACPI 'present'.
+             * Hence, these states at QOM and ACPI level might be inconsistent.
+             */
+            state->devs[i].is_present = qemu_present_cpu(cpu);
+        }
         state->devs[i].arch_id = id_list->cpus[i].arch_id;
     }
     memory_region_init_io(&state->ctrl_reg, owner, &cpu_hotplug_ops, state,
@@ -263,6 +280,8 @@ void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
     }
 
     cdev->cpu = CPU(dev);
+    cdev->is_present = true;
+    cdev->is_enabled = true;
     if (dev->hotplugged) {
         cdev->is_inserting = true;
         acpi_send_event(DEVICE(hotplug_dev), ACPI_CPU_HOTPLUG_STATUS);
@@ -294,6 +313,11 @@ void acpi_cpu_unplug_cb(CPUHotplugState *cpu_st,
         return;
     }
 
+    cdev->is_enabled = false;
+    if (!qemu_persistent_cpu(CPU(dev))) {
+        cdev->is_present = false;
+    }
+
     cdev->cpu = NULL;
 }
 
@@ -304,6 +328,8 @@ static const VMStateDescription vmstate_cpuhp_sts = {
     .fields = (const VMStateField[]) {
         VMSTATE_BOOL(is_inserting, AcpiCpuStatus),
         VMSTATE_BOOL(is_removing, AcpiCpuStatus),
+        VMSTATE_BOOL(is_present, AcpiCpuStatus),
+        VMSTATE_BOOL(is_enabled, AcpiCpuStatus),
         VMSTATE_UINT32(ost_event, AcpiCpuStatus),
         VMSTATE_UINT32(ost_status, AcpiCpuStatus),
         VMSTATE_END_OF_LIST()
@@ -341,6 +367,7 @@ const VMStateDescription vmstate_cpu_hotplug = {
 #define CPU_REMOVE_EVENT  "CRMV"
 #define CPU_EJECT_EVENT   "CEJ0"
 #define CPU_FW_EJECT_EVENT "CEJF"
+#define CPU_PRESENT       "CPRS"
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
                     build_madt_cpu_fn build_madt_cpu, hwaddr base_addr,
@@ -399,7 +426,9 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
         aml_append(field, aml_named_field(CPU_EJECT_EVENT, 1));
         /* tell firmware to do device eject, write only */
         aml_append(field, aml_named_field(CPU_FW_EJECT_EVENT, 1));
-        aml_append(field, aml_reserved_field(3));
+        /* 1 if present, read only */
+        aml_append(field, aml_named_field(CPU_PRESENT, 1));
+        aml_append(field, aml_reserved_field(2));
         aml_append(field, aml_named_field(CPU_COMMAND, 8));
         aml_append(cpu_ctrl_dev, field);
 
@@ -429,6 +458,7 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
         Aml *ctrl_lock = aml_name("%s.%s", cphp_res_path, CPU_LOCK);
         Aml *cpu_selector = aml_name("%s.%s", cphp_res_path, CPU_SELECTOR);
         Aml *is_enabled = aml_name("%s.%s", cphp_res_path, CPU_ENABLED);
+        Aml *is_present = aml_name("%s.%s", cphp_res_path, CPU_PRESENT);
         Aml *cpu_cmd = aml_name("%s.%s", cphp_res_path, CPU_COMMAND);
         Aml *cpu_data = aml_name("%s.%s", cphp_res_path, CPU_DATA);
         Aml *ins_evt = aml_name("%s.%s", cphp_res_path, CPU_INSERT_EVENT);
@@ -457,13 +487,26 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
         {
             Aml *idx = aml_arg(0);
             Aml *sta = aml_local(0);
+            Aml *ifctx2;
+            Aml *else_ctx;
 
             aml_append(method, aml_acquire(ctrl_lock, 0xFFFF));
             aml_append(method, aml_store(idx, cpu_selector));
             aml_append(method, aml_store(zero, sta));
-            ifctx = aml_if(aml_equal(is_enabled, one));
+            ifctx = aml_if(aml_equal(is_present, one));
             {
-                aml_append(ifctx, aml_store(aml_int(0xF), sta));
+                ifctx2 = aml_if(aml_equal(is_enabled, one));
+                {
+                    /* cpu is present and enabled */
+                    aml_append(ifctx2, aml_store(aml_int(0xF), sta));
+                }
+                aml_append(ifctx, ifctx2);
+                else_ctx = aml_else();
+                {
+                    /* cpu is present but disabled */
+                    aml_append(else_ctx, aml_store(aml_int(0xD), sta));
+                }
+                aml_append(ifctx, else_ctx);
             }
             aml_append(method, ifctx);
             aml_append(method, aml_release(ctrl_lock));
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 63226b0040..e92ce07955 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -333,6 +333,16 @@ static const VMStateDescription vmstate_memhp_state = {
     }
 };
 
+static const VMStateDescription vmstate_cpuhp_state = {
+    .name = "acpi-ged/cpuhp",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields      = (VMStateField[]) {
+        VMSTATE_CPU_HOTPLUG(cpuhp_state, AcpiGedState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_ged_state = {
     .name = "acpi-ged-state",
     .version_id = 1,
@@ -381,6 +391,7 @@ static const VMStateDescription vmstate_acpi_ged = {
     },
     .subsections = (const VMStateDescription * const []) {
         &vmstate_memhp_state,
+        &vmstate_cpuhp_state,
         &vmstate_ghes_state,
         NULL
     }
diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
index 48cded697c..07b524b713 100644
--- a/include/hw/acpi/cpu.h
+++ b/include/hw/acpi/cpu.h
@@ -24,6 +24,8 @@ typedef struct AcpiCpuStatus {
     uint64_t arch_id;
     bool is_inserting;
     bool is_removing;
+    bool is_present;
+    bool is_enabled;
     bool fw_remove;
     uint32_t ost_event;
     uint32_t ost_status;
-- 
2.34.1




* [PATCH RFC V3 15/29] hw/arm: MADT Tbl change to size the guest with possible vCPUs
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (13 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 14/29] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} Bits to Guest Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 16/29] hw/acpi: Make _MAT method optional Salil Mehta via
                   ` (17 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Changes are required during the building of the MADT table by QEMU to
accommodate disabled possible vCPUs. This information will be used by the guest
kernel to size up its resources during boot time. The pre-sizing of the guest
kernel based on possible vCPUs will facilitate the hotplug of the disabled
vCPUs.

This change also addresses the ACPI MADT GIC CPU interface flag-related changes
recently introduced in the ACPI 6.5 specification, which allow deferred
onlining of virtual CPUs in the guest kernel.

Link: https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt-acpi-build.c | 36 ++++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 2d44567df5..4b4906f407 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -657,6 +657,29 @@ static void build_append_gicr(GArray *table_data, uint64_t base, uint32_t size)
     build_append_int_noprefix(table_data, size, 4); /* Discovery Range Length */
 }
 
+static uint32_t virt_acpi_get_gicc_flags(CPUState *cpu)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+
+    /* can only exist in 'enabled' state */
+    if (!mc->has_hotpluggable_cpus) {
+        return 1;
+    }
+
+    /*
+     * ARM GIC CPU Interface can be 'online-capable' or 'enabled' at boot
+     * We MUST set 'online-capable' bit for all hotpluggable CPUs except the
+     * first/boot CPU. Cold-booted CPUs without 'Id' can also be unplugged.
+     * Though as-of-now this is only used as a debugging feature.
+     *
+     *   UEFI ACPI Specification 6.5
+     *   Section: 5.2.12.14. GIC CPU Interface (GICC) Structure
+     *   Table:   5.37 GICC CPU Interface Flags
+     *   Link: https://uefi.org/specs/ACPI/6.5
+     */
+    return cpu && !cpu->cpu_index ? 1 : (1 << 3);
+}
+
 static void
 build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
@@ -683,12 +706,13 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     build_append_int_noprefix(table_data, vms->gic_version, 1);
     build_append_int_noprefix(table_data, 0, 3);   /* Reserved */
 
-    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
-        ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
+    for (i = 0; i < MACHINE(vms)->smp.max_cpus; i++) {
+        CPUState *cpu = qemu_get_possible_cpu(i);
         uint64_t physical_base_address = 0, gich = 0, gicv = 0;
         uint32_t vgic_interrupt = vms->virt ? ARCH_GIC_MAINT_IRQ : 0;
-        uint32_t pmu_interrupt = arm_feature(&armcpu->env, ARM_FEATURE_PMU) ?
-                                             VIRTUAL_PMU_IRQ : 0;
+        uint32_t pmu_interrupt = vms->pmu ? VIRTUAL_PMU_IRQ : 0;
+        uint32_t flags = virt_acpi_get_gicc_flags(cpu);
+        uint64_t mpidr = qemu_get_cpu_archid(i);
 
         if (vms->gic_version == VIRT_GIC_VERSION_2) {
             physical_base_address = memmap[VIRT_GIC_CPU].base;
@@ -703,7 +727,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         build_append_int_noprefix(table_data, i, 4);    /* GIC ID */
         build_append_int_noprefix(table_data, i, 4);    /* ACPI Processor UID */
         /* Flags */
-        build_append_int_noprefix(table_data, 1, 4);    /* Enabled */
+        build_append_int_noprefix(table_data, flags, 4);
         /* Parking Protocol Version */
         build_append_int_noprefix(table_data, 0, 4);
         /* Performance Interrupt GSIV */
@@ -717,7 +741,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         build_append_int_noprefix(table_data, vgic_interrupt, 4);
         build_append_int_noprefix(table_data, 0, 8);    /* GICR Base Address*/
         /* MPIDR */
-        build_append_int_noprefix(table_data, arm_cpu_mp_affinity(armcpu), 8);
+        build_append_int_noprefix(table_data, mpidr, 8);
         /* Processor Power Efficiency Class */
         build_append_int_noprefix(table_data, 0, 1);
         /* Reserved */
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PATCH RFC V3 16/29] hw/acpi: Make _MAT method optional
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (14 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 15/29] hw/arm: MADT Tbl change to size the guest with possible vCPUs Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init Salil Mehta via
                   ` (16 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

From: Jean-Philippe Brucker <jean-philippe@linaro.org>

The GICC interface on arm64 vCPUs is statically defined in the MADT, and
doesn't require a _MAT entry. Although the GICC is indicated as present
by the MADT entry, it can only be used from vCPU sysregs, which aren't
accessible until hot-add.

Co-developed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Co-developed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/acpi/cpu.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 40b8899125..2240c6e108 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -717,10 +717,13 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
             aml_append(dev, method);
 
             /* build _MAT object */
-            build_madt_cpu(i, arch_ids, madt_buf, true); /* set enabled flag */
-            aml_append(dev, aml_name_decl("_MAT",
+            if (build_madt_cpu) {
+                build_madt_cpu(i, arch_ids, madt_buf,
+                                true); /* set enabled flag */
+                aml_append(dev, aml_name_decl("_MAT",
                 aml_buffer(madt_buf->len, (uint8_t *)madt_buf->data)));
-            g_array_free(madt_buf, true);
+                g_array_free(madt_buf, true);
+            }
 
             if (CPU(arch_ids->cpus[i].cpu) != first_cpu) {
                 method = aml_method("_EJ0", 1, AML_NOTSERIALIZED);
-- 
2.34.1




* [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (15 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 16/29] hw/acpi: Make _MAT method optional Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-08-13  1:17   ` Gavin Shan
  2024-06-13 23:36 ` [PATCH RFC V3 18/29] arm/virt: Add/update basic hot-(un)plug framework Salil Mehta via
                   ` (15 subsequent siblings)
  32 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

During `machvirt_init()`, QOM ARMCPU objects are pre-created, along with the
corresponding KVM vCPUs in the host, for all possible vCPUs. This is necessary
because of an architectural constraint: KVM does not allow KVM vCPU creation or
VGIC initialization/sizing to be deferred until after VM initialization. Hence,
the VGIC is pre-sized with the possible vCPUs.

After the initialization of the machine is complete, the disabled possible KVM
vCPUs are parked in the per-virt-machine list "kvm_parked_vcpus," and we release
the QOM ARMCPU objects for the disabled vCPUs. These will be re-created when the
vCPU is hotplugged again. The QOM ARMCPU object is then re-attached to the
corresponding parked KVM vCPU.

Alternatively, we could have chosen not to release the QOM CPU objects and kept
reusing them. This approach might require some modifications to the
`qdevice_add()` interface to retrieve the old ARMCPU object instead of creating
a new one for the hotplug request.

Each of these approaches has its own pros and cons. This prototype uses the
first approach (suggestions are welcome!).

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9d33f30a6a..a72cd3b20d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2050,6 +2050,7 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
 {
     CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
     int max_cpus = MACHINE(vms)->smp.max_cpus;
+    MachineState *ms = MACHINE(vms);
     bool aarch64, steal_time;
     CPUState *cpu;
     int n;
@@ -2111,6 +2112,37 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
             }
         }
     }
+
+    if (kvm_enabled() || tcg_enabled()) {
+        for (n = 0; n < possible_cpus->len; n++) {
+            cpu = qemu_get_possible_cpu(n);
+
+            /*
+             * Now that the GIC has been sized with the possible vCPUs, we no
+             * longer require the disabled vCPU objects to be represented in
+             * the QOM. Release the disabled ARMCPU objects that were used
+             * during init for pre-sizing.
+             *
+             * Through ACPI, we fake the presence (_STA.PRES=1) of these
+             * non-existent vCPUs to the guest, but present them as disabled
+             * vCPUs (_STA.ENA=0) so that they can't be used. These vCPUs can
+             * later be added to the guest through hotplug exchanges, when
+             * their ARMCPU objects are created again using the 'device_add'
+             * QMP command.
+             */
+            /*
+             * RFC: Question: Another approach could have been to keep the
+             * objects forever and release them only once, when QEMU exits as
+             * part of finalize, or when a new vCPU is hotplugged. In the
+             * latter case, the old object could be released in favour of the
+             * newly created object for the same vCPU.
+             */
+            if (!qemu_enabled_cpu(cpu)) {
+                CPUArchId *cpu_slot;
+                cpu_slot = virt_find_cpu_slot(ms, cpu->cpu_index);
+                cpu_slot->cpu = NULL;
+                object_unref(OBJECT(cpu));
+            }
+        }
+    }
 }
 
 static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId *cpu_slot,
-- 
2.34.1




* [PATCH RFC V3 18/29] arm/virt: Add/update basic hot-(un)plug framework
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (16 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-08-13  1:21   ` Gavin Shan
  2024-06-13 23:36 ` [PATCH RFC V3 19/29] arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug Salil Mehta via
                   ` (14 subsequent siblings)
  32 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Add CPU hot-unplug hooks and update hotplug hooks with additional sanity checks
for use in hotplug paths.

Note: The functional contents of the hooks (currently left with TODO comments)
will be filled in gradually by subsequent patches, building up the logic
incrementally. This will roughly include the following:

1. (Un)wiring of interrupts between vCPU<->GIC.
2. Sending events to the guest for hot-(un)plug so that the guest can take
   appropriate actions.
3. Notifying the GIC about the hot-(un)plug action so that the vCPU can be
   (un)stitched to the GIC CPU interface.
4. Updating the guest with next boot information for this vCPU in the firmware.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 105 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a72cd3b20d..f6b8c21f26 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -85,6 +85,7 @@
 #include "hw/virtio/virtio-iommu.h"
 #include "hw/char/pl011.h"
 #include "qemu/guest-random.h"
+#include "qapi/qmp/qdict.h"
 
 static GlobalProperty arm_virt_compat[] = {
     { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "48" },
@@ -3002,11 +3003,23 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
 static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                               Error **errp)
 {
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
     MachineState *ms = MACHINE(hotplug_dev);
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
     ARMCPU *cpu = ARM_CPU(dev);
     CPUState *cs = CPU(dev);
     CPUArchId *cpu_slot;
 
+    if (dev->hotplugged && !vms->acpi_dev) {
+        error_setg(errp, "GED ACPI device does not exist");
+        return;
+    }
+
+    if (dev->hotplugged && !mc->has_hotpluggable_cpus) {
+        error_setg(errp, "CPU hotplug not supported on this machine");
+        return;
+    }
+
     /* sanity check the cpu */
     if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
         error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
@@ -3049,6 +3062,22 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     }
     virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
 
+    /*
+     * Fix up the GIC for this new vCPU being plugged. The QOM CPU object for
+     * the new vCPU needs to be updated in the corresponding QOM GICv3CPUState
+     * object. We also need to re-wire the IRQs for this new CPU object. This
+     * update is limited to QOM only and does not affect KVM; the latter has
+     * already been pre-sized with the possible CPUs at VM init time. This is
+     * a workaround for the constraints posed by the ARM architecture w.r.t.
+     * supporting CPU hotplug, for which no specification exists.
+     * This patch-up is required for both {cold,hot}-plugged vCPUs. Cold-inited
+     * vCPUs have their GIC state initialized during machvirt_init().
+     */
+    if (vms->acpi_dev) {
+        /* TODO: update GIC about this hotplug change here */
+        /* TODO: wire the GIC<->CPU irqs */
+    }
+
     /*
      * To give persistent presence view of vCPUs to the guest, ACPI might need
      * to fake the presence of the vCPUs to the guest but keep them disabled.
@@ -3060,6 +3089,7 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
 static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                           Error **errp)
 {
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
     MachineState *ms = MACHINE(hotplug_dev);
     CPUState *cs = CPU(dev);
     CPUArchId *cpu_slot;
@@ -3068,10 +3098,81 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
     cpu_slot->cpu = CPU(dev);
 
+    /*
+     * Update the ACPI hotplug state for vCPUs being both {hot,cold}-plugged.
+     * vCPUs can be cold-plugged using the '-device' option. For vCPUs being
+     * hot-plugged, the guest is also notified.
+     */
+    if (vms->acpi_dev) {
+        /* TODO: update acpi hotplug state. Send cpu hotplug event to guest */
+        /* TODO: register cpu for reset & update F/W info for the next boot */
+    }
+
     cs->disabled = false;
     return;
 }
 
+static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
+                                    DeviceState *dev, Error **errp)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    ARMCPU *cpu = ARM_CPU(dev);
+    CPUState *cs = CPU(dev);
+
+    if (!vms->acpi_dev || !dev->realized) {
+        error_setg(errp, "GED does not exist or device is not realized");
+        return;
+    }
+
+    if (!mc->has_hotpluggable_cpus) {
+        error_setg(errp, "CPU hot(un)plug not supported on this machine");
+        return;
+    }
+
+    if (cs->cpu_index == first_cpu->cpu_index) {
+        error_setg(errp, "Boot CPU(id%d=%d:%d:%d:%d) hot-unplug not supported",
+                   first_cpu->cpu_index, cpu->socket_id, cpu->cluster_id,
+                   cpu->core_id, cpu->thread_id);
+        return;
+    }
+
+    /* TODO: request cpu hotplug from guest */
+
+    return;
+}
+
+static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                            Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    MachineState *ms = MACHINE(hotplug_dev);
+    CPUState *cs = CPU(dev);
+    CPUArchId *cpu_slot;
+
+    if (!vms->acpi_dev || !dev->realized) {
+        error_setg(errp, "GED does not exist or device is not realized");
+        return;
+    }
+
+    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
+
+    /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
+
+    /* TODO: unwire the gic-cpu irqs here */
+    /* TODO: update the GIC about this hot unplug change */
+
+    /* TODO: unregister cpu for reset & update F/W info for the next boot */
+
+    qobject_unref(dev->opts);
+    dev->opts = NULL;
+
+    cpu_slot->cpu = NULL;
+    cs->disabled = true;
+
+    return;
+}
+
 static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
                                             DeviceState *dev, Error **errp)
 {
@@ -3196,6 +3297,8 @@ static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
         virtio_md_pci_unplug_request(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev),
                                      errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        virt_cpu_unplug_request(hotplug_dev, dev, errp);
     } else {
         error_setg(errp, "device unplug request for unsupported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
@@ -3209,6 +3312,8 @@ static void virt_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
         virt_dimm_unplug(hotplug_dev, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
         virtio_md_pci_unplug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        virt_cpu_unplug(hotplug_dev, dev, errp);
     } else {
         error_setg(errp, "virt: device unplug for unsupported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
-- 
2.34.1




* [PATCH RFC V3 19/29] arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (17 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 18/29] arm/virt: Add/update basic hot-(un)plug framework Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 20/29] hw/arm, gicv3: Changes to update GIC with vCPU hot-plug notification Salil Mehta via
                   ` (13 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Refactor the existing GIC creation code to extract into a helper the common
code that wires the vCPU<->GIC interrupts. This helper is used in the
cold-plug case and also when a vCPU is hot-plugged. Also introduce a new
function to unwire the vCPU<->GIC interrupts for the vCPU hot-unplug case.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
[4/05/2024: Issue with total number of PPI available during create GIC]
Suggested-by: Miguel Luis <miguel.luis@oracle.com>
[5/05/2024: Fix the total number of PPIs available as per ARM BSA to avoid overflow]
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c          | 154 ++++++++++++++++++++++++++++-------------
 hw/core/gpio.c         |   2 +-
 include/hw/qdev-core.h |   2 +
 target/arm/cpu-qom.h   |  18 +++--
 4 files changed, 118 insertions(+), 58 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f6b8c21f26..1556c362f7 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -747,6 +747,107 @@ static bool gicv3_nmi_present(VirtMachineState *vms)
            (vms->gic_version != VIRT_GIC_VERSION_2);
 }
 
+/*
+ * Mapping from the output timer irq lines from the CPU to the GIC PPI inputs
+ * we use for the virt board.
+ */
+const int timer_irq[] = {
+    [GTIMER_PHYS] = ARCH_TIMER_NS_EL1_IRQ,
+    [GTIMER_VIRT] = ARCH_TIMER_VIRT_IRQ,
+    [GTIMER_HYP]  = ARCH_TIMER_NS_EL2_IRQ,
+    [GTIMER_SEC]  = ARCH_TIMER_S_EL1_IRQ,
+    [GTIMER_HYPVIRT] = ARCH_TIMER_NS_EL2_VIRT_IRQ,
+};
+
+static void unwire_gic_cpu_irqs(VirtMachineState *vms, CPUState *cs)
+{
+    MachineState *ms = MACHINE(vms);
+    unsigned int max_cpus = ms->smp.max_cpus;
+    DeviceState *cpudev = DEVICE(cs);
+    DeviceState *gicdev = vms->gic;
+    int cpu = CPU(cs)->cpu_index;
+    int type = vms->gic_version;
+    int irq, num_gpio_in;
+
+    for (irq = 0; irq < ARRAY_SIZE(timer_irq); irq++) {
+        qdev_disconnect_gpio_out_named(cpudev, NULL, irq);
+    }
+
+    if (type != VIRT_GIC_VERSION_2) {
+        qdev_disconnect_gpio_out_named(cpudev, "gicv3-maintenance-interrupt",
+                                       0);
+    } else if (vms->virt) {
+        qdev_disconnect_gpio_out_named(gicdev, SYSBUS_DEVICE_GPIO_IRQ,
+                                       cpu + 4 * max_cpus);
+    }
+
+    /*
+     * RFC: Question: This currently does not take care of notifying the
+     * devices that might be sitting on the system bus. Do we need a
+     * sysbus_disconnect_irq() which also does the job of notification
+     * besides disconnection?
+     */
+    qdev_disconnect_gpio_out_named(cpudev, "pmu-interrupt", 0);
+
+    /* Unwire GIC's IRQ/FIQ/VIRQ/VFIQ/NMI/VINMI interrupt outputs to CPU */
+    num_gpio_in = (vms->gic_version != VIRT_GIC_VERSION_2) ?
+                                                NUM_GPIO_IN : NUM_GICV2_GPIO_IN;
+    for (irq = 0; irq < num_gpio_in; irq++) {
+        qdev_disconnect_gpio_out_named(gicdev, SYSBUS_DEVICE_GPIO_IRQ,
+                                        cpu + irq * max_cpus);
+    }
+}
+
+static void wire_gic_cpu_irqs(VirtMachineState *vms, CPUState *cs)
+{
+    MachineState *ms = MACHINE(vms);
+    unsigned int max_cpus = ms->smp.max_cpus;
+    DeviceState *cpudev = DEVICE(cs);
+    DeviceState *gicdev = vms->gic;
+    int cpu = CPU(cs)->cpu_index;
+    int type = vms->gic_version;
+    SysBusDevice *gicbusdev;
+    int intidbase;
+    int irqn;
+
+    intidbase = NUM_IRQS + cpu * GIC_INTERNAL;
+
+    for (irqn = 0; irqn < ARRAY_SIZE(timer_irq); irqn++) {
+        qdev_connect_gpio_out(cpudev, irqn,
+                              qdev_get_gpio_in(gicdev,
+                                               intidbase + timer_irq[irqn]));
+    }
+
+    gicbusdev = SYS_BUS_DEVICE(gicdev);
+    if (type != VIRT_GIC_VERSION_2) {
+        qemu_irq irq = qdev_get_gpio_in(gicdev,
+                                        intidbase + ARCH_GIC_MAINT_IRQ);
+        qdev_connect_gpio_out_named(cpudev, "gicv3-maintenance-interrupt",
+                                    0, irq);
+    } else if (vms->virt) {
+        qemu_irq irq = qdev_get_gpio_in(gicdev,
+                                        intidbase + ARCH_GIC_MAINT_IRQ);
+        sysbus_connect_irq(gicbusdev, cpu + 4 * max_cpus, irq);
+    }
+
+    qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
+                                qdev_get_gpio_in(gicdev,
+                                                 intidbase + VIRTUAL_PMU_IRQ));
+
+    sysbus_connect_irq(gicbusdev, cpu, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
+    sysbus_connect_irq(gicbusdev, cpu + max_cpus,
+                       qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
+    sysbus_connect_irq(gicbusdev, cpu + 2 * max_cpus,
+                       qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
+    sysbus_connect_irq(gicbusdev, cpu + 3 * max_cpus,
+                       qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
+    if (vms->gic_version != VIRT_GIC_VERSION_2) {
+        sysbus_connect_irq(gicbusdev, cpu + 4 * max_cpus,
+                           qdev_get_gpio_in(cpudev, ARM_CPU_NMI));
+        sysbus_connect_irq(gicbusdev, cpu + 5 * max_cpus,
+                           qdev_get_gpio_in(cpudev, ARM_CPU_VINMI));
+    }
+}
+
 static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
 {
     MachineState *ms = MACHINE(vms);
@@ -849,54 +950,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
      * CPU's inputs.
      */
     for (i = 0; i < smp_cpus; i++) {
-        DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
-        int intidbase = NUM_IRQS + i * GIC_INTERNAL;
-        /* Mapping from the output timer irq lines from the CPU to the
-         * GIC PPI inputs we use for the virt board.
-         */
-        const int timer_irq[] = {
-            [GTIMER_PHYS] = ARCH_TIMER_NS_EL1_IRQ,
-            [GTIMER_VIRT] = ARCH_TIMER_VIRT_IRQ,
-            [GTIMER_HYP]  = ARCH_TIMER_NS_EL2_IRQ,
-            [GTIMER_SEC]  = ARCH_TIMER_S_EL1_IRQ,
-            [GTIMER_HYPVIRT] = ARCH_TIMER_NS_EL2_VIRT_IRQ,
-        };
-
-        for (unsigned irq = 0; irq < ARRAY_SIZE(timer_irq); irq++) {
-            qdev_connect_gpio_out(cpudev, irq,
-                                  qdev_get_gpio_in(vms->gic,
-                                                   intidbase + timer_irq[irq]));
-        }
-
-        if (vms->gic_version != VIRT_GIC_VERSION_2) {
-            qemu_irq irq = qdev_get_gpio_in(vms->gic,
-                                            intidbase + ARCH_GIC_MAINT_IRQ);
-            qdev_connect_gpio_out_named(cpudev, "gicv3-maintenance-interrupt",
-                                        0, irq);
-        } else if (vms->virt) {
-            qemu_irq irq = qdev_get_gpio_in(vms->gic,
-                                            intidbase + ARCH_GIC_MAINT_IRQ);
-            sysbus_connect_irq(gicbusdev, i + 4 * max_cpus, irq);
-        }
-
-        qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
-                                    qdev_get_gpio_in(vms->gic, intidbase
-                                                     + VIRTUAL_PMU_IRQ));
-
-        sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
-        sysbus_connect_irq(gicbusdev, i + max_cpus,
-                           qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
-        sysbus_connect_irq(gicbusdev, i + 2 * max_cpus,
-                           qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
-        sysbus_connect_irq(gicbusdev, i + 3 * max_cpus,
-                           qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
-
-        if (vms->gic_version != VIRT_GIC_VERSION_2) {
-            sysbus_connect_irq(gicbusdev, i + 4 * smp_cpus,
-                               qdev_get_gpio_in(cpudev, ARM_CPU_NMI));
-            sysbus_connect_irq(gicbusdev, i + 5 * smp_cpus,
-                               qdev_get_gpio_in(cpudev, ARM_CPU_VINMI));
-        }
+        wire_gic_cpu_irqs(vms, qemu_get_cpu(i));
     }
 
     fdt_add_gic_node(vms);
@@ -3075,7 +3129,7 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
      */
     if (vms->acpi_dev) {
         /* TODO: update GIC about this hotplug change here */
-        /* TODO: wire the GIC<->CPU irqs */
+        wire_gic_cpu_irqs(vms, cs);
     }
 
     /*
@@ -3159,7 +3213,7 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
 
     /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
 
-    /* TODO: unwire the gic-cpu irqs here */
+    unwire_gic_cpu_irqs(vms, cs);
     /* TODO: update the GIC about this hot unplug change */
 
     /* TODO: unregister cpu for reset & update F/W info for the next boot */
diff --git a/hw/core/gpio.c b/hw/core/gpio.c
index 80d07a6ec9..abb164d5c0 100644
--- a/hw/core/gpio.c
+++ b/hw/core/gpio.c
@@ -143,7 +143,7 @@ qemu_irq qdev_get_gpio_out_connector(DeviceState *dev, const char *name, int n)
 
 /* disconnect a GPIO output, returning the disconnected input (if any) */
 
-static qemu_irq qdev_disconnect_gpio_out_named(DeviceState *dev,
+qemu_irq qdev_disconnect_gpio_out_named(DeviceState *dev,
                                                const char *name, int n)
 {
     char *propname = g_strdup_printf("%s[%d]",
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 5336728a23..742e62e400 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -738,6 +738,8 @@ qemu_irq qdev_get_gpio_out_connector(DeviceState *dev, const char *name, int n);
  */
 qemu_irq qdev_intercept_gpio_out(DeviceState *dev, qemu_irq icpt,
                                  const char *name, int n);
+qemu_irq qdev_disconnect_gpio_out_named(DeviceState *dev,
+                                               const char *name, int n);
 
 BusState *qdev_get_child_bus(DeviceState *dev, const char *name);
 
diff --git a/target/arm/cpu-qom.h b/target/arm/cpu-qom.h
index b497667d61..e49fb096de 100644
--- a/target/arm/cpu-qom.h
+++ b/target/arm/cpu-qom.h
@@ -37,13 +37,17 @@ DECLARE_CLASS_CHECKERS(AArch64CPUClass, AARCH64_CPU,
 #define ARM_CPU_TYPE_NAME(name) (name ARM_CPU_TYPE_SUFFIX)
 
 /* Meanings of the ARMCPU object's seven inbound GPIO lines */
-#define ARM_CPU_IRQ 0
-#define ARM_CPU_FIQ 1
-#define ARM_CPU_VIRQ 2
-#define ARM_CPU_VFIQ 3
-#define ARM_CPU_NMI 4
-#define ARM_CPU_VINMI 5
-#define ARM_CPU_VFNMI 6
+enum {
+    ARM_CPU_IRQ = 0,
+    ARM_CPU_FIQ = 1,
+    ARM_CPU_VIRQ = 2,
+    ARM_CPU_VFIQ = 3,
+    NUM_GICV2_GPIO_IN = (ARM_CPU_VFIQ+1),
+    ARM_CPU_NMI = 4,
+    ARM_CPU_VINMI = 5,
+    /* ARM_CPU_VFNMI = 6, */ /* not used? */
+    NUM_GPIO_IN = (ARM_CPU_VINMI+1),
+};
 
 /* For M profile, some registers are banked secure vs non-secure;
  * these are represented as a 2-element array where the first element
-- 
2.34.1




* [PATCH RFC V3 20/29] hw/arm, gicv3: Changes to update GIC with vCPU hot-plug notification
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (18 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 19/29] arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 21/29] hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info Salil Mehta via
                   ` (12 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Virtual CPU hot-(un)plug events MUST be notified to the GIC. Introduce a
notification mechanism to convey any such events to the GIC so that it can
update its vCPU to GIC CPU interface association.

This is required to implement a workaround for the limitations posed by the ARM
architecture. For details about the constraints and workarounds, please check
the slides below:

Link: https://kvm-forum.qemu.org/2023/talk/9SMPDQ/

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c                      | 27 +++++++++++++--
 hw/intc/arm_gicv3_common.c         | 54 +++++++++++++++++++++++++++++-
 hw/intc/arm_gicv3_cpuif_common.c   |  5 +++
 hw/intc/gicv3_internal.h           |  1 +
 include/hw/arm/virt.h              |  1 +
 include/hw/intc/arm_gicv3_common.h | 22 ++++++++++++
 6 files changed, 107 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 1556c362f7..9f7e07bd8e 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -685,6 +685,16 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
     return dev;
 }
 
+static void virt_add_gic_cpuhp_notifier(VirtMachineState *vms)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(vms);
+
+    if (mc->has_hotpluggable_cpus) {
+        Notifier *cpuhp_notifier = gicv3_cpuhp_notifier(vms->gic);
+        notifier_list_add(&vms->cpuhp_notifiers, cpuhp_notifier);
+    }
+}
+
 static void create_its(VirtMachineState *vms)
 {
     const char *itsclass = its_class_name();
@@ -960,6 +970,9 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
     } else if (vms->gic_version == VIRT_GIC_VERSION_2) {
         create_v2m(vms);
     }
+
+    /* add GIC CPU hot(un)plug update notifier */
+    virt_add_gic_cpuhp_notifier(vms);
 }
 
 static void create_uart(const VirtMachineState *vms, int uart,
@@ -2472,6 +2485,8 @@ static void machvirt_init(MachineState *machine)
 
     create_fdt(vms);
 
+    notifier_list_init(&vms->cpuhp_notifiers);
+
     assert(possible_cpus->len == max_cpus);
     for (n = 0; n < possible_cpus->len; n++) {
         CPUArchId *cpu_slot;
@@ -3054,6 +3069,14 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
                          dev, &error_abort);
 }
 
+static void virt_update_gic(VirtMachineState *vms, CPUState *cs)
+{
+    GICv3CPUHotplugInfo gic_info = { .gic = vms->gic, .cpu = cs };
+
+    /* notify gic to stitch GICC to this new cpu */
+    notifier_list_notify(&vms->cpuhp_notifiers, &gic_info);
+}
+
 static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                               Error **errp)
 {
@@ -3128,7 +3151,7 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
      * vCPUs have their GIC state initialized during machvit_init().
      */
     if (vms->acpi_dev) {
-        /* TODO: update GIC about this hotplug change here */
+        virt_update_gic(vms, cs);
         wire_gic_cpu_irqs(vms, cs);
     }
 
@@ -3214,7 +3237,7 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
     /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
 
     unwire_gic_cpu_irqs(vms, cs);
-    /* TODO: update the GIC about this hot unplug change */
+    virt_update_gic(vms, cs);
 
     /* TODO: unregister cpu for reset & update F/W info for the next boot */
 
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index 183d2de7eb..155342055b 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -33,7 +33,6 @@
 #include "hw/arm/linux-boot-if.h"
 #include "sysemu/kvm.h"
 
-
 static void gicv3_gicd_no_migration_shift_bug_post_load(GICv3State *cs)
 {
     if (cs->gicd_no_migration_shift_bug) {
@@ -366,6 +365,56 @@ void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
     }
 }
 
+static int arm_gicv3_get_proc_num(GICv3State *s, CPUState *cpu)
+{
+    uint64_t mp_affinity;
+    uint64_t gicr_typer;
+    uint64_t cpu_affid;
+    int i;
+
+    mp_affinity = object_property_get_uint(OBJECT(cpu), "mp-affinity", NULL);
+    /* match the cpu mp-affinity to get the gic cpuif number */
+    for (i = 0; i < s->num_cpu; i++) {
+        gicr_typer = s->cpu[i].gicr_typer;
+        cpu_affid = (gicr_typer >> 32) & 0xFFFFFF;
+        if (cpu_affid == mp_affinity) {
+            return i;
+        }
+    }
+
+    return -1;
+}
+
+static void arm_gicv3_cpu_update_notifier(Notifier *notifier, void * data)
+{
+    GICv3CPUHotplugInfo *gic_info = (GICv3CPUHotplugInfo *)data;
+    CPUState *cpu = gic_info->cpu;
+    int gic_cpuif_num;
+    GICv3State *s;
+
+    s = ARM_GICV3_COMMON(gic_info->gic);
+
+    /* this shall get us mapped gicv3 cpuif corresponding to mpidr */
+    gic_cpuif_num = arm_gicv3_get_proc_num(s, cpu);
+    if (gic_cpuif_num < 0) {
+        error_report("Failed to associate cpu %d with any GIC cpuif",
+                     cpu->cpu_index);
+        abort();
+    }
+
+    /* check if update is for vcpu hot-unplug */
+    if (qemu_enabled_cpu(cpu)) {
+        s->cpu[gic_cpuif_num].cpu = NULL;
+        return;
+    }
+
+    /* re-stitch the gic cpuif to this new cpu */
+    gicv3_set_gicv3state(cpu, &s->cpu[gic_cpuif_num]);
+    gicv3_set_cpustate(&s->cpu[gic_cpuif_num], cpu);
+
+    /* TODO: initialize the registers info for this newly added cpu */
+}
+
 static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
 {
     GICv3State *s = ARM_GICV3_COMMON(dev);
@@ -488,6 +537,8 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
         s->cpu[cpuidx - 1].gicr_typer |= GICR_TYPER_LAST;
     }
 
+    s->cpu_update_notifier.notify = arm_gicv3_cpu_update_notifier;
+
     s->itslist = g_ptr_array_new();
 }
 
@@ -495,6 +546,7 @@ static void arm_gicv3_finalize(Object *obj)
 {
     GICv3State *s = ARM_GICV3_COMMON(obj);
 
+    notifier_remove(&s->cpu_update_notifier);
     g_free(s->redist_region_count);
 }
 
diff --git a/hw/intc/arm_gicv3_cpuif_common.c b/hw/intc/arm_gicv3_cpuif_common.c
index ff1239f65d..381cf2754b 100644
--- a/hw/intc/arm_gicv3_cpuif_common.c
+++ b/hw/intc/arm_gicv3_cpuif_common.c
@@ -20,3 +20,8 @@ void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s)
 
     env->gicv3state = (void *)s;
 };
+
+void gicv3_set_cpustate(GICv3CPUState *s, CPUState *cpu)
+{
+    s->cpu = cpu;
+}
diff --git a/hw/intc/gicv3_internal.h b/hw/intc/gicv3_internal.h
index bc9f518fe8..42441c19c6 100644
--- a/hw/intc/gicv3_internal.h
+++ b/hw/intc/gicv3_internal.h
@@ -861,5 +861,6 @@ static inline void gicv3_cache_all_target_cpustates(GICv3State *s)
 }
 
 void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s);
+void gicv3_set_cpustate(GICv3CPUState *s, CPUState *cpu);
 
 #endif /* QEMU_ARM_GICV3_INTERNAL_H */
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index d711cab46d..9c728ba042 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -181,6 +181,7 @@ struct VirtMachineState {
     char *oem_id;
     char *oem_table_id;
     bool ns_el2_virt_timer_irq;
+    NotifierList cpuhp_notifiers;
 };
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3_common.h
index cd09bee3bc..496b198016 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -293,6 +293,7 @@ struct GICv3State {
     GICv3CPUState *gicd_irouter_target[GICV3_MAXIRQ];
     uint32_t gicd_nsacr[DIV_ROUND_UP(GICV3_MAXIRQ, 16)];
 
+    Notifier cpu_update_notifier;
     GICv3CPUState *cpu;
     /* List of all ITSes connected to this GIC */
     GPtrArray *itslist;
@@ -342,6 +343,27 @@ struct ARMGICv3CommonClass {
 
 void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
                               const MemoryRegionOps *ops);
+/**
+ * Structure used by GICv3 CPU hotplug notifier
+ */
+typedef struct GICv3CPUHotplugInfo {
+    DeviceState *gic; /* GICv3State */
+    CPUState *cpu;
+} GICv3CPUHotplugInfo;
+
+/**
+ * gicv3_cpuhp_notifier
+ *
+ * Returns CPU hotplug notifier which could be used to update GIC about any
+ * CPU hot(un)plug events.
+ *
+ * Returns: Notifier initialized with CPU Hot(un)plug update function
+ */
+static inline Notifier *gicv3_cpuhp_notifier(DeviceState *dev)
+{
+    GICv3State *s = ARM_GICV3_COMMON(dev);
+    return &s->cpu_update_notifier;
+}
 
 /**
  * gicv3_class_name
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PATCH RFC V3 21/29] hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (19 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 20/29] hw/arm, gicv3: Changes to update GIC with vCPU hot-plug notification Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 22/29] arm/virt: Update the guest(via GED) about CPU hot-(un)plug events Salil Mehta via
                   ` (11 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

The vCPU register info needs to be re-initialized each time a vCPU is
hot-plugged, for both the emulation/TCG and the KVM cases. This is done in the
context of the GIC update notification for vCPU hot-(un)plug events. This
change adds that support and refactors the existing code to maximize re-use.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/intc/arm_gicv3.c                |   1 +
 hw/intc/arm_gicv3_common.c         |   7 +-
 hw/intc/arm_gicv3_cpuif.c          | 261 +++++++++++++++--------------
 hw/intc/arm_gicv3_kvm.c            |   7 +-
 hw/intc/gicv3_internal.h           |   1 +
 include/hw/intc/arm_gicv3_common.h |   1 +
 6 files changed, 151 insertions(+), 127 deletions(-)

diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c
index 58e18fff54..2a30625916 100644
--- a/hw/intc/arm_gicv3.c
+++ b/hw/intc/arm_gicv3.c
@@ -459,6 +459,7 @@ static void arm_gicv3_class_init(ObjectClass *klass, void *data)
     ARMGICv3Class *agc = ARM_GICV3_CLASS(klass);
 
     agcc->post_load = arm_gicv3_post_load;
+    agcc->init_cpu_reginfo = gicv3_init_cpu_reginfo;
     device_class_set_parent_realize(dc, arm_gic_realize, &agc->parent_realize);
 }
 
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index 155342055b..02e68a437e 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -389,10 +389,12 @@ static void arm_gicv3_cpu_update_notifier(Notifier *notifier, void * data)
 {
     GICv3CPUHotplugInfo *gic_info = (GICv3CPUHotplugInfo *)data;
     CPUState *cpu = gic_info->cpu;
+    ARMGICv3CommonClass *c;
     int gic_cpuif_num;
     GICv3State *s;
 
     s = ARM_GICV3_COMMON(gic_info->gic);
+    c = ARM_GICV3_COMMON_GET_CLASS(s);
 
     /* this shall get us mapped gicv3 cpuif corresponding to mpidr */
     gic_cpuif_num = arm_gicv3_get_proc_num(s, cpu);
@@ -412,7 +414,10 @@ static void arm_gicv3_cpu_update_notifier(Notifier *notifier, void * data)
     gicv3_set_gicv3state(cpu, &s->cpu[gic_cpuif_num]);
     gicv3_set_cpustate(&s->cpu[gic_cpuif_num], cpu);
 
-    /* TODO: initialize the registers info for this newly added cpu */
+    /* initialize the registers info for this newly added cpu */
+    if (c->init_cpu_reginfo) {
+        c->init_cpu_reginfo(cpu);
+    }
 }
 
 static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index 2a8aff0b99..1ace0f54d8 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -3033,143 +3033,138 @@ static void gicv3_cpuif_el_change_hook(ARMCPU *cpu, void *opaque)
     gicv3_cpuif_virt_irq_fiq_update(cs);
 }
 
-void gicv3_init_cpuif(GICv3State *s)
+void gicv3_init_cpu_reginfo(CPUState *cs)
 {
-    /* Called from the GICv3 realize function; register our system
-     * registers with the CPU
-     */
-    int i;
+    ARMCPU *cpu = ARM_CPU(cs);
+    GICv3CPUState *gcs = icc_cs_from_env(&cpu->env);
 
-    for (i = 0; i < s->num_cpu; i++) {
-        ARMCPU *cpu = ARM_CPU(qemu_get_cpu(i));
-        GICv3CPUState *cs = &s->cpu[i];
+    /*
+     * If the CPU doesn't define a GICv3 configuration, probably because
+     * in real hardware it doesn't have one, then we use default values
+     * matching the one used by most Arm CPUs. This applies to:
+     *  cpu->gic_num_lrs
+     *  cpu->gic_vpribits
+     *  cpu->gic_vprebits
+     *  cpu->gic_pribits
+     */
 
-        /*
-         * If the CPU doesn't define a GICv3 configuration, probably because
-         * in real hardware it doesn't have one, then we use default values
-         * matching the one used by most Arm CPUs. This applies to:
-         *  cpu->gic_num_lrs
-         *  cpu->gic_vpribits
-         *  cpu->gic_vprebits
-         *  cpu->gic_pribits
-         */
+    /*
+     * Note that we can't just use the GICv3CPUState as an opaque pointer
+     * in define_arm_cp_regs_with_opaque(), because when we're called back
+     * it might be with code translated by CPU 0 but run by CPU 1, in
+     * which case we'd get the wrong value.
+     * So instead we define the regs with no ri->opaque info, and
+     * get back to the GICv3CPUState from the CPUARMState.
+     *
+     * These CP regs callbacks can be called from either TCG or HVF code.
+     */
+    define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);
 
-        /* Note that we can't just use the GICv3CPUState as an opaque pointer
-         * in define_arm_cp_regs_with_opaque(), because when we're called back
-         * it might be with code translated by CPU 0 but run by CPU 1, in
-         * which case we'd get the wrong value.
-         * So instead we define the regs with no ri->opaque info, and
-         * get back to the GICv3CPUState from the CPUARMState.
-         *
-         * These CP regs callbacks can be called from either TCG or HVF code.
-         */
-        define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);
+    /*
+     * If the CPU implements FEAT_NMI and FEAT_GICv3 it must also
+     * implement FEAT_GICv3_NMI, which is the CPU interface part
+     * of NMI support. This is distinct from whether the GIC proper
+     * (redistributors and distributor) have NMI support. In QEMU
+     * that is a property of the GIC device in s->nmi_support;
+     * cs->nmi_support indicates the CPU interface's support.
+     */
+    if (cpu_isar_feature(aa64_nmi, cpu)) {
+        gcs->nmi_support = true;
+        define_arm_cp_regs(cpu, gicv3_cpuif_gicv3_nmi_reginfo);
+    }
 
-        /*
-         * If the CPU implements FEAT_NMI and FEAT_GICv3 it must also
-         * implement FEAT_GICv3_NMI, which is the CPU interface part
-         * of NMI support. This is distinct from whether the GIC proper
-         * (redistributors and distributor) have NMI support. In QEMU
-         * that is a property of the GIC device in s->nmi_support;
-         * cs->nmi_support indicates the CPU interface's support.
-         */
-        if (cpu_isar_feature(aa64_nmi, cpu)) {
-            cs->nmi_support = true;
-            define_arm_cp_regs(cpu, gicv3_cpuif_gicv3_nmi_reginfo);
-        }
+    /*
+     * The CPU implementation specifies the number of supported
+     * bits of physical priority. For backwards compatibility
+     * of migration, we have a compat property that forces use
+     * of 8 priority bits regardless of what the CPU really has.
+     */
+    if (gcs->gic->force_8bit_prio) {
+        gcs->pribits = 8;
+    } else {
+        gcs->pribits = cpu->gic_pribits ?: 5;
+    }
 
-        /*
-         * The CPU implementation specifies the number of supported
-         * bits of physical priority. For backwards compatibility
-         * of migration, we have a compat property that forces use
-         * of 8 priority bits regardless of what the CPU really has.
-         */
-        if (s->force_8bit_prio) {
-            cs->pribits = 8;
-        } else {
-            cs->pribits = cpu->gic_pribits ?: 5;
-        }
+    /*
+     * The GICv3 has separate ID register fields for virtual priority
+     * and preemption bit values, but only a single ID register field
+     * for the physical priority bits. The preemption bit count is
+     * always the same as the priority bit count, except that 8 bits
+     * of priority means 7 preemption bits. We precalculate the
+     * preemption bits because it simplifies the code and makes the
+     * parallels between the virtual and physical bits of the GIC
+     * a bit clearer.
+     */
+    gcs->prebits = gcs->pribits;
+    if (gcs->prebits == 8) {
+        gcs->prebits--;
+    }
+    /*
+     * Check that CPU code defining pribits didn't violate
+     * architectural constraints our implementation relies on.
+     */
+    g_assert(gcs->pribits >= 4 && gcs->pribits <= 8);
 
-        /*
-         * The GICv3 has separate ID register fields for virtual priority
-         * and preemption bit values, but only a single ID register field
-         * for the physical priority bits. The preemption bit count is
-         * always the same as the priority bit count, except that 8 bits
-         * of priority means 7 preemption bits. We precalculate the
-         * preemption bits because it simplifies the code and makes the
-         * parallels between the virtual and physical bits of the GIC
-         * a bit clearer.
-         */
-        cs->prebits = cs->pribits;
-        if (cs->prebits == 8) {
-            cs->prebits--;
-        }
-        /*
-         * Check that CPU code defining pribits didn't violate
-         * architectural constraints our implementation relies on.
-         */
-        g_assert(cs->pribits >= 4 && cs->pribits <= 8);
+    /*
+     * gicv3_cpuif_reginfo[] defines ICC_AP*R0_EL1; add definitions
+     * for ICC_AP*R{1,2,3}_EL1 if the prebits value requires them.
+     */
+    if (gcs->prebits >= 6) {
+        define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr1_reginfo);
+    }
+    if (gcs->prebits == 7) {
+        define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr23_reginfo);
+    }
 
-        /*
-         * gicv3_cpuif_reginfo[] defines ICC_AP*R0_EL1; add definitions
-         * for ICC_AP*R{1,2,3}_EL1 if the prebits value requires them.
-         */
-        if (cs->prebits >= 6) {
-            define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr1_reginfo);
-        }
-        if (cs->prebits == 7) {
-            define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr23_reginfo);
-        }
+    if (arm_feature(&cpu->env, ARM_FEATURE_EL2)) {
+        int j;
 
-        if (arm_feature(&cpu->env, ARM_FEATURE_EL2)) {
-            int j;
+        gcs->num_list_regs = cpu->gic_num_lrs ?: 4;
+        gcs->vpribits = cpu->gic_vpribits ?: 5;
+        gcs->vprebits = cpu->gic_vprebits ?: 5;
 
-            cs->num_list_regs = cpu->gic_num_lrs ?: 4;
-            cs->vpribits = cpu->gic_vpribits ?: 5;
-            cs->vprebits = cpu->gic_vprebits ?: 5;
 
-            /* Check against architectural constraints: getting these
-             * wrong would be a bug in the CPU code defining these,
-             * and the implementation relies on them holding.
-             */
-            g_assert(cs->vprebits <= cs->vpribits);
-            g_assert(cs->vprebits >= 5 && cs->vprebits <= 7);
-            g_assert(cs->vpribits >= 5 && cs->vpribits <= 8);
+        /* Check against architectural constraints: getting these
+         * wrong would be a bug in the CPU code defining these,
+         * and the implementation relies on them holding.
+         */
+        g_assert(gcs->vprebits <= gcs->vpribits);
+        g_assert(gcs->vprebits >= 5 && gcs->vprebits <= 7);
+        g_assert(gcs->vpribits >= 5 && gcs->vpribits <= 8);
 
-            define_arm_cp_regs(cpu, gicv3_cpuif_hcr_reginfo);
+        define_arm_cp_regs(cpu, gicv3_cpuif_hcr_reginfo);
 
-            for (j = 0; j < cs->num_list_regs; j++) {
-                /* Note that the AArch64 LRs are 64-bit; the AArch32 LRs
-                 * are split into two cp15 regs, LR (the low part, with the
-                 * same encoding as the AArch64 LR) and LRC (the high part).
-                 */
-                ARMCPRegInfo lr_regset[] = {
-                    { .name = "ICH_LRn_EL2", .state = ARM_CP_STATE_BOTH,
-                      .opc0 = 3, .opc1 = 4, .crn = 12,
-                      .crm = 12 + (j >> 3), .opc2 = j & 7,
-                      .type = ARM_CP_IO | ARM_CP_NO_RAW,
-                      .nv2_redirect_offset = 0x400 + 8 * j,
-                      .access = PL2_RW,
-                      .readfn = ich_lr_read,
-                      .writefn = ich_lr_write,
-                    },
-                    { .name = "ICH_LRCn_EL2", .state = ARM_CP_STATE_AA32,
-                      .cp = 15, .opc1 = 4, .crn = 12,
-                      .crm = 14 + (j >> 3), .opc2 = j & 7,
-                      .type = ARM_CP_IO | ARM_CP_NO_RAW,
-                      .access = PL2_RW,
-                      .readfn = ich_lr_read,
-                      .writefn = ich_lr_write,
-                    },
-                };
-                define_arm_cp_regs(cpu, lr_regset);
-            }
-            if (cs->vprebits >= 6) {
-                define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr1_reginfo);
-            }
-            if (cs->vprebits == 7) {
-                define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr23_reginfo);
-            }
+        for (j = 0; j < gcs->num_list_regs; j++) {
+            /* Note that the AArch64 LRs are 64-bit; the AArch32 LRs
+             * are split into two cp15 regs, LR (the low part, with the
+             * same encoding as the AArch64 LR) and LRC (the high part).
+             */
+            ARMCPRegInfo lr_regset[] = {
+                { .name = "ICH_LRn_EL2", .state = ARM_CP_STATE_BOTH,
+                  .opc0 = 3, .opc1 = 4, .crn = 12,
+                  .crm = 12 + (j >> 3), .opc2 = j & 7,
+                  .type = ARM_CP_IO | ARM_CP_NO_RAW,
+                  .nv2_redirect_offset = 0x400 + 8 * j,
+                  .access = PL2_RW,
+                  .readfn = ich_lr_read,
+                  .writefn = ich_lr_write,
+                },
+                { .name = "ICH_LRCn_EL2", .state = ARM_CP_STATE_AA32,
+                  .cp = 15, .opc1 = 4, .crn = 12,
+                  .crm = 14 + (j >> 3), .opc2 = j & 7,
+                  .type = ARM_CP_IO | ARM_CP_NO_RAW,
+                  .access = PL2_RW,
+                  .readfn = ich_lr_read,
+                  .writefn = ich_lr_write,
+                },
+            };
+            define_arm_cp_regs(cpu, lr_regset);
+        }
+        if (gcs->vprebits >= 6) {
+            define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr1_reginfo);
+        }
+        if (gcs->vprebits == 7) {
+            define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr23_reginfo);
         }
         if (tcg_enabled() || qtest_enabled()) {
             /*
@@ -3177,10 +3172,26 @@ void gicv3_init_cpuif(GICv3State *s)
              * state only changes on EL changes involving EL2 or EL3, so for
              * the non-TCG case this is OK, as EL2 and EL3 can't exist.
              */
-            arm_register_el_change_hook(cpu, gicv3_cpuif_el_change_hook, cs);
+            arm_register_el_change_hook(cpu, gicv3_cpuif_el_change_hook, gcs);
         } else {
             assert(!arm_feature(&cpu->env, ARM_FEATURE_EL2));
             assert(!arm_feature(&cpu->env, ARM_FEATURE_EL3));
         }
     }
 }
+
+void gicv3_init_cpuif(GICv3State *s)
+{
+    /* Called from the GICv3 realize function; register our system
+     * registers with the CPU
+     */
+    int i;
+
+    for (i = 0; i < s->num_cpu; i++) {
+        ARMCPU *cpu = ARM_CPU(qemu_get_cpu(i));
+
+        if (qemu_enabled_cpu(CPU(cpu))) {
+            gicv3_init_cpu_reginfo(CPU(cpu));
+        }
+    }
+}
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 8dbbd79e1b..1c8e880117 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -804,6 +804,10 @@ static void vm_change_state_handler(void *opaque, bool running,
     }
 }
 
+static void kvm_gicv3_init_cpu_reginfo(CPUState *cs)
+{
+    define_arm_cp_regs(ARM_CPU(cs), gicv3_cpuif_reginfo);
+}
 
 static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
 {
@@ -842,7 +846,7 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
     for (i = 0; i < s->num_cpu; i++) {
         CPUState *cs = qemu_get_cpu(i);
         if (qemu_enabled_cpu(cs)) {
-            define_arm_cp_regs(ARM_CPU(cs), gicv3_cpuif_reginfo);
+            kvm_gicv3_init_cpu_reginfo(cs);
         }
     }
 
@@ -930,6 +934,7 @@ static void kvm_arm_gicv3_class_init(ObjectClass *klass, void *data)
 
     agcc->pre_save = kvm_arm_gicv3_get;
     agcc->post_load = kvm_arm_gicv3_put;
+    agcc->init_cpu_reginfo = kvm_gicv3_init_cpu_reginfo;
     device_class_set_parent_realize(dc, kvm_arm_gicv3_realize,
                                     &kgc->parent_realize);
     resettable_class_set_parent_phases(rc, NULL, kvm_arm_gicv3_reset_hold, NULL,
diff --git a/hw/intc/gicv3_internal.h b/hw/intc/gicv3_internal.h
index 42441c19c6..e568eef57f 100644
--- a/hw/intc/gicv3_internal.h
+++ b/hw/intc/gicv3_internal.h
@@ -722,6 +722,7 @@ void gicv3_redist_vinvall(GICv3CPUState *cs, uint64_t vptaddr);
 
 void gicv3_redist_send_sgi(GICv3CPUState *cs, int grp, int irq, bool ns);
 void gicv3_init_cpuif(GICv3State *s);
+void gicv3_init_cpu_reginfo(CPUState *cs);
 
 /**
  * gicv3_cpuif_update:
diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3_common.h
index 496b198016..b63abeae82 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -339,6 +339,7 @@ struct ARMGICv3CommonClass {
 
     void (*pre_save)(GICv3State *s);
     void (*post_load)(GICv3State *s);
+    void (*init_cpu_reginfo)(CPUState *cs);
 };
 
 void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PATCH RFC V3 22/29] arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (20 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 21/29] hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 23/29] hw/arm: Changes required for reset and to support next boot Salil Mehta via
                   ` (10 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

During any vCPU hot-(un)plug, the running guest VM needs to be notified about
the new vCPU being added, or asked to release a vCPU that is already part of
the guest VM. This is done using an ACPI GED event, which is eventually
demultiplexed into a CPU hotplug event and further into the specific
hot-(un)plug event of a particular vCPU.

This change adds the ACPI calls to the existing hot-(un)plug hooks to trigger
ACPI GED events from QEMU to the guest VM.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c | 34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9f7e07bd8e..4fa2b7d9e7 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3169,6 +3169,7 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
     MachineState *ms = MACHINE(hotplug_dev);
     CPUState *cs = CPU(dev);
+    Error *local_err = NULL;
     CPUArchId *cpu_slot;
 
     /* insert the cold/hot-plugged vcpu in the slot */
@@ -3181,12 +3182,18 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
      * plugged, guest is also notified.
      */
     if (vms->acpi_dev) {
-        /* TODO: update acpi hotplug state. Send cpu hotplug event to guest */
+        HotplugHandlerClass *hhc;
+        /* update acpi hotplug state and send cpu hotplug event to guest */
+        hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
+        hhc->plug(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
         /* TODO: register cpu for reset & update F/W info for the next boot */
     }
 
     cs->disabled = false;
-    return;
 }
 
 static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
@@ -3194,8 +3201,10 @@ static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
 {
     MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
     VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    HotplugHandlerClass *hhc;
     ARMCPU *cpu = ARM_CPU(dev);
     CPUState *cs = CPU(dev);
+    Error *local_err = NULL;
 
     if (!vms->acpi_dev || !dev->realized) {
         error_setg(errp, "GED does not exists or device is not realized!");
@@ -3214,9 +3223,12 @@ static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
         return;
     }
 
-    /* TODO: request cpu hotplug from guest */
-
-    return;
+    /* request cpu hotplug from guest */
+    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
+    hhc->unplug_request(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+    }
 }
 
 static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
@@ -3224,7 +3236,9 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
 {
     VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
     MachineState *ms = MACHINE(hotplug_dev);
+    HotplugHandlerClass *hhc;
     CPUState *cs = CPU(dev);
+    Error *local_err = NULL;
     CPUArchId *cpu_slot;
 
     if (!vms->acpi_dev || !dev->realized) {
@@ -3234,7 +3248,13 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
 
     cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
 
-    /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
+    /* update the acpi cpu hotplug state for cpu hot-unplug */
+    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
+    hhc->unplug(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
 
     unwire_gic_cpu_irqs(vms, cs);
     virt_update_gic(vms, cs);
@@ -3246,8 +3266,6 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
 
     cpu_slot->cpu = NULL;
     cs->disabled = true;
-
-    return;
 }
 
 static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PATCH RFC V3 23/29] hw/arm: Changes required for reset and to support next boot
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (21 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 22/29] arm/virt: Update the guest(via GED) about CPU hot-(un)plug events Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-06-13 23:36 ` [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
                   ` (9 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Update the firmware config with the number of CPUs for the next boot, and
register the reset callback that resets the vCPU when the guest reboots.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/boot.c         |  2 +-
 hw/arm/virt.c         | 17 ++++++++++++++---
 include/hw/arm/boot.h |  2 ++
 include/hw/arm/virt.h |  1 +
 4 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index d480a7da02..cb5c1e4848 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -682,7 +682,7 @@ fail:
     return -1;
 }
 
-static void do_cpu_reset(void *opaque)
+void do_cpu_reset(void *opaque)
 {
     ARMCPU *cpu = opaque;
     CPUState *cs = CPU(cpu);
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 4fa2b7d9e7..a2200099a1 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -45,6 +45,8 @@
 #include "sysemu/device_tree.h"
 #include "sysemu/numa.h"
 #include "sysemu/runstate.h"
+#include "sysemu/reset.h"
+#include "sysemu/sysemu.h"
 #include "sysemu/tpm.h"
 #include "sysemu/tcg.h"
 #include "sysemu/kvm.h"
@@ -1405,7 +1407,7 @@ static FWCfgState *create_fw_cfg(const VirtMachineState *vms, AddressSpace *as)
     char *nodename;
 
     fw_cfg = fw_cfg_init_mem_wide(base + 8, base, 8, base + 16, as);
-    fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, (uint16_t)ms->smp.cpus);
+    fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, vms->boot_cpus);
 
     nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base);
     qemu_fdt_add_subnode(ms->fdt, nodename);
@@ -3190,9 +3192,14 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             error_propagate(errp, local_err);
             return;
         }
-        /* TODO: register cpu for reset & update F/W info for the next boot */
+        /* register this cpu for reset & update F/W info for the next boot */
+        qemu_register_reset(do_cpu_reset, ARM_CPU(cs));
     }
 
+    vms->boot_cpus++;
+    if (vms->fw_cfg) {
+        fw_cfg_modify_i16(vms->fw_cfg, FW_CFG_NB_CPUS, vms->boot_cpus);
+    }
     cs->disabled = false;
 }
 
@@ -3259,7 +3266,11 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
     unwire_gic_cpu_irqs(vms, cs);
     virt_update_gic(vms, cs);
 
-    /* TODO: unregister cpu for reset & update F/W info for the next boot */
+    qemu_unregister_reset(do_cpu_reset, ARM_CPU(cs));
+    vms->boot_cpus--;
+    if (vms->fw_cfg) {
+        fw_cfg_modify_i16(vms->fw_cfg, FW_CFG_NB_CPUS, vms->boot_cpus);
+    }
 
     qobject_unref(dev->opts);
     dev->opts = NULL;
diff --git a/include/hw/arm/boot.h b/include/hw/arm/boot.h
index 80c492d742..f81326a1dc 100644
--- a/include/hw/arm/boot.h
+++ b/include/hw/arm/boot.h
@@ -178,6 +178,8 @@ AddressSpace *arm_boot_address_space(ARMCPU *cpu,
 int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
                  hwaddr addr_limit, AddressSpace *as, MachineState *ms);
 
+void do_cpu_reset(void *opaque);
+
 /* Write a secure board setup routine with a dummy handler for SMCs */
 void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu,
                                             const struct arm_boot_info *info,
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 9c728ba042..8dce426cc0 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -167,6 +167,7 @@ struct VirtMachineState {
     MemMapEntry *memmap;
     char *pciehb_nodename;
     const int *irqmap;
+    uint16_t boot_cpus;
     int fdt_size;
     uint32_t clock_phandle;
     uint32_t gic_phandle;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (22 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 23/29] hw/arm: Changes required for reset and to support next boot Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-08-16 15:37   ` Alex Bennée
  2024-06-13 23:36 ` [PATCH RFC V3 25/29] target/arm/kvm: Write CPU state back to KVM on reset Salil Mehta via
                   ` (8 subsequent siblings)
  32 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

vCPU hot-unplug results in unrealization of the QOM CPU object, which must undo
all the vCPU thread creation, allocations and registrations that happened as
part of the realization process. This change introduces the ARM CPU unrealize
function taking care of exactly that.

Note: initialized KVM vCPUs are not destroyed in the host KVM, but their QEMU
context is parked at the QEMU KVM layer.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
[VP: Identified CPU stall issue & suggested probable fix]
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 target/arm/cpu.c       | 101 +++++++++++++++++++++++++++++++++++++++++
 target/arm/cpu.h       |  14 ++++++
 target/arm/gdbstub.c   |   6 +++
 target/arm/helper.c    |  25 ++++++++++
 target/arm/internals.h |   3 ++
 target/arm/kvm.c       |   5 ++
 6 files changed, 154 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index c92162fa97..a3dc669309 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -157,6 +157,16 @@ void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
     QLIST_INSERT_HEAD(&cpu->pre_el_change_hooks, entry, node);
 }
 
+void arm_unregister_pre_el_change_hooks(ARMCPU *cpu)
+{
+    ARMELChangeHook *entry, *next;
+
+    QLIST_FOREACH_SAFE(entry, &cpu->pre_el_change_hooks, node, next) {
+        QLIST_REMOVE(entry, node);
+        g_free(entry);
+    }
+}
+
 void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
                                  void *opaque)
 {
@@ -168,6 +178,16 @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
     QLIST_INSERT_HEAD(&cpu->el_change_hooks, entry, node);
 }
 
+void arm_unregister_el_change_hooks(ARMCPU *cpu)
+{
+    ARMELChangeHook *entry, *next;
+
+    QLIST_FOREACH_SAFE(entry, &cpu->el_change_hooks, node, next) {
+        QLIST_REMOVE(entry, node);
+        g_free(entry);
+    }
+}
+
 static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
 {
     /* Reset a single ARMCPRegInfo register */
@@ -2552,6 +2572,85 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     acc->parent_realize(dev, errp);
 }
 
+static void arm_cpu_unrealizefn(DeviceState *dev)
+{
+    ARMCPUClass *acc = ARM_CPU_GET_CLASS(dev);
+    ARMCPU *cpu = ARM_CPU(dev);
+    CPUARMState *env = &cpu->env;
+    CPUState *cs = CPU(dev);
+    bool has_secure;
+
+    has_secure = cpu->has_el3 || arm_feature(env, ARM_FEATURE_M_SECURITY);
+
+    /* rock 'n' un-roll: cleanly undo whatever arm_cpu_realizefn() set up */
+    cpu_address_space_destroy(cs, ARMASIdx_NS);
+
+    if (cpu->tag_memory != NULL) {
+        cpu_address_space_destroy(cs, ARMASIdx_TagNS);
+        if (has_secure) {
+            cpu_address_space_destroy(cs, ARMASIdx_TagS);
+        }
+    }
+
+    if (has_secure) {
+        cpu_address_space_destroy(cs, ARMASIdx_S);
+    }
+
+    destroy_cpreg_list(cpu);
+    arm_cpu_unregister_gdb_regs(cpu);
+    unregister_cp_regs_for_features(cpu);
+
+    if (cpu->sau_sregion && arm_feature(env, ARM_FEATURE_M_SECURITY)) {
+        g_free(env->sau.rbar);
+        g_free(env->sau.rlar);
+    }
+
+    if (arm_feature(env, ARM_FEATURE_PMSA) &&
+        arm_feature(env, ARM_FEATURE_V7) &&
+        cpu->pmsav7_dregion) {
+        if (arm_feature(env, ARM_FEATURE_V8)) {
+            g_free(env->pmsav8.rbar[M_REG_NS]);
+            g_free(env->pmsav8.rlar[M_REG_NS]);
+            if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
+                g_free(env->pmsav8.rbar[M_REG_S]);
+                g_free(env->pmsav8.rlar[M_REG_S]);
+            }
+        } else {
+            g_free(env->pmsav7.drbar);
+            g_free(env->pmsav7.drsr);
+            g_free(env->pmsav7.dracr);
+        }
+        if (cpu->pmsav8r_hdregion) {
+            g_free(env->pmsav8.hprbar);
+            g_free(env->pmsav8.hprlar);
+        }
+    }
+
+    if (arm_feature(env, ARM_FEATURE_PMU)) {
+        if (!kvm_enabled()) {
+            arm_unregister_pre_el_change_hooks(cpu);
+            arm_unregister_el_change_hooks(cpu);
+        }
+
+#ifndef CONFIG_USER_ONLY
+        if (cpu->pmu_timer) {
+            timer_del(cpu->pmu_timer);
+        }
+#endif
+    }
+
+    cpu_remove_sync(CPU(dev));
+    acc->parent_unrealize(dev);
+
+#ifndef CONFIG_USER_ONLY
+    timer_del(cpu->gt_timer[GTIMER_PHYS]);
+    timer_del(cpu->gt_timer[GTIMER_VIRT]);
+    timer_del(cpu->gt_timer[GTIMER_HYP]);
+    timer_del(cpu->gt_timer[GTIMER_SEC]);
+    timer_del(cpu->gt_timer[GTIMER_HYPVIRT]);
+#endif
+}
+
 static ObjectClass *arm_cpu_class_by_name(const char *cpu_model)
 {
     ObjectClass *oc;
@@ -2654,6 +2753,8 @@ static void arm_cpu_class_init(ObjectClass *oc, void *data)
 
     device_class_set_parent_realize(dc, arm_cpu_realizefn,
                                     &acc->parent_realize);
+    device_class_set_parent_unrealize(dc, arm_cpu_unrealizefn,
+                                      &acc->parent_unrealize);
 
     device_class_set_props(dc, arm_cpu_properties);
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 208c719db3..a4a7555f7e 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1118,6 +1118,7 @@ struct ARMCPUClass {
 
     const ARMCPUInfo *info;
     DeviceRealize parent_realize;
+    DeviceUnrealize parent_unrealize;
     ResettablePhases parent_phases;
 };
 
@@ -3228,6 +3229,13 @@ static inline AddressSpace *arm_addressspace(CPUState *cs, MemTxAttrs attrs)
  */
 void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
                                  void *opaque);
+/**
+ * arm_unregister_pre_el_change_hooks:
+ * Unregister all pre-EL change hook functions. Generally called during
+ * the unrealize phase.
+ */
+void arm_unregister_pre_el_change_hooks(ARMCPU *cpu);
+
 /**
  * arm_register_el_change_hook:
  * Register a hook function which will be called immediately after this
@@ -3240,6 +3248,12 @@ void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
  */
 void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook, void
         *opaque);
+/**
+ * arm_unregister_el_change_hooks:
+ * Unregister all EL change hook functions. Generally called during
+ * the unrealize phase.
+ */
+void arm_unregister_el_change_hooks(ARMCPU *cpu);
 
 /**
  * arm_rebuild_hflags:
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index a3bb73cfa7..948e40b981 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -555,3 +555,9 @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
     }
 #endif /* CONFIG_TCG */
 }
+
+void arm_cpu_unregister_gdb_regs(ARMCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    gdb_unregister_coprocessor_all(cs);
+}
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 7587635960..9a2468347a 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -264,6 +264,19 @@ void init_cpreg_list(ARMCPU *cpu)
     g_list_free(keys);
 }
 
+void destroy_cpreg_list(ARMCPU *cpu)
+{
+    assert(cpu->cpreg_indexes);
+    assert(cpu->cpreg_values);
+    assert(cpu->cpreg_vmstate_indexes);
+    assert(cpu->cpreg_vmstate_values);
+
+    g_free(cpu->cpreg_indexes);
+    g_free(cpu->cpreg_values);
+    g_free(cpu->cpreg_vmstate_indexes);
+    g_free(cpu->cpreg_vmstate_values);
+}
+
 static bool arm_pan_enabled(CPUARMState *env)
 {
     if (is_a64(env)) {
@@ -9987,6 +10000,18 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 #endif
 }
 
+void unregister_cp_regs_for_features(ARMCPU *cpu)
+{
+    CPUARMState *env = &cpu->env;
+    if (arm_feature(env, ARM_FEATURE_M)) {
+        /* M profile has no coprocessor registers */
+        return;
+    }
+
+    /* Empty it all: unregister all the coprocessor registers. */
+    g_hash_table_remove_all(cpu->cp_regs);
+}
+
 /*
  * Private utility function for define_one_arm_cp_reg_with_opaque():
  * add a single reginfo struct to the hash table.
diff --git a/target/arm/internals.h b/target/arm/internals.h
index ee3ebd383e..34dab0bb02 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -353,9 +353,12 @@ void arm_cpu_register(const ARMCPUInfo *info);
 void aarch64_cpu_register(const ARMCPUInfo *info);
 
 void register_cp_regs_for_features(ARMCPU *cpu);
+void unregister_cp_regs_for_features(ARMCPU *cpu);
 void init_cpreg_list(ARMCPU *cpu);
+void destroy_cpreg_list(ARMCPU *cpu);
 
 void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu);
+void arm_cpu_unregister_gdb_regs(ARMCPU *cpu);
 void arm_translate_init(void);
 
 void arm_restore_state_to_opc(CPUState *cs,
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 01c83c1994..1121771c4a 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1988,6 +1988,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
 int kvm_arch_destroy_vcpu(CPUState *cs)
 {
+    /* vCPUs which are yet to be realized will not have a handler */
+    if (cs->thread_id) {
+        qemu_del_vm_change_state_handler(cs->vmcse);
+    }
+
     return 0;
 }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PATCH RFC V3 25/29] target/arm/kvm: Write CPU state back to KVM on reset
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (23 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
@ 2024-06-13 23:36 ` Salil Mehta via
  2024-07-04  3:27   ` Nicholas Piggin
  2024-06-14  0:15 ` [PATCH RFC V3 26/29] target/arm/kvm, tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu Salil Mehta via
                   ` (7 subsequent siblings)
  32 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-06-13 23:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

From: Jean-Philippe Brucker <jean-philippe@linaro.org>

When a KVM vCPU is reset following a PSCI CPU_ON call, its power state
is not synchronized with KVM at the moment. Because the vCPU is not
marked dirty, we miss the call to kvm_arch_put_registers() that writes
to KVM's MP_STATE. Force mp_state synchronization.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 target/arm/kvm.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 1121771c4a..7acd83ce64 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -980,6 +980,7 @@ void kvm_arm_cpu_post_load(ARMCPU *cpu)
 void kvm_arm_reset_vcpu(ARMCPU *cpu)
 {
     int ret;
+    CPUState *cs = CPU(cpu);
 
     /* Re-init VCPU so that all registers are set to
      * their respective reset values.
@@ -1001,6 +1002,12 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
      * for the same reason we do so in kvm_arch_get_registers().
      */
     write_list_to_cpustate(cpu);
+
+    /*
+     * Ensure we call kvm_arch_put_registers(). The vCPU isn't marked dirty if
+     * it was parked in KVM and is now booting from a PSCI CPU_ON call.
+     */
+    cs->vcpu_dirty = true;
 }
 
 void kvm_arm_create_host_vcpu(ARMCPU *cpu)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PATCH RFC V3 26/29] target/arm/kvm, tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (24 preceding siblings ...)
  2024-06-13 23:36 ` [PATCH RFC V3 25/29] target/arm/kvm: Write CPU state back to KVM on reset Salil Mehta via
@ 2024-06-14  0:15 ` Salil Mehta via
  2024-06-14  0:18 ` [PATCH RFC V3 27/29] hw/arm: Support hotplug capability check using _OSC method Salil Mehta via
                   ` (6 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-14  0:15 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

From: Salil Mehta <salil.mehta@huawei.com>

Add registration and handling of HVC/SMC hypercall exits to the VMM (QEMU).

Co-developed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 target/arm/arm-powerctl.c   | 51 ++++++++++++++++----
 target/arm/helper.c         |  2 +-
 target/arm/internals.h      | 11 -----
 target/arm/kvm.c            | 93 +++++++++++++++++++++++++++++++++++++
 target/arm/kvm_arm.h        | 14 ++++++
 target/arm/meson.build      |  1 +
 target/arm/{tcg => }/psci.c |  8 ++++
 target/arm/tcg/meson.build  |  4 --
 8 files changed, 159 insertions(+), 25 deletions(-)
 rename target/arm/{tcg => }/psci.c (97%)

diff --git a/target/arm/arm-powerctl.c b/target/arm/arm-powerctl.c
index 2b2055c6ac..f567478e89 100644
--- a/target/arm/arm-powerctl.c
+++ b/target/arm/arm-powerctl.c
@@ -17,6 +17,7 @@
 #include "qemu/main-loop.h"
 #include "sysemu/tcg.h"
 #include "target/arm/multiprocessing.h"
+#include "hw/boards.h"
 
 #ifndef DEBUG_ARM_POWERCTL
 #define DEBUG_ARM_POWERCTL 0
@@ -29,18 +30,37 @@
         } \
     } while (0)
 
+static CPUArchId *arm_get_archid_by_id(uint64_t id)
+{
+    int n;
+    CPUArchId *arch_id;
+    MachineState *ms = MACHINE(qdev_get_machine());
+
+    /*
+     * At this point disabled CPUs don't have a CPUState, but their CPUArchId
+     * exists.
+     *
+     * TODO: Is arch_id == mp_affinity? This needs work.
+     */
+    for (n = 0; n < ms->possible_cpus->len; n++) {
+        arch_id = &ms->possible_cpus->cpus[n];
+
+        if (arch_id->arch_id == id) {
+            return arch_id;
+        }
+    }
+    return NULL;
+}
+
 CPUState *arm_get_cpu_by_id(uint64_t id)
 {
-    CPUState *cpu;
+    CPUArchId *arch_id;
 
     DPRINTF("cpu %" PRId64 "\n", id);
 
-    CPU_FOREACH(cpu) {
-        ARMCPU *armcpu = ARM_CPU(cpu);
-
-        if (arm_cpu_mp_affinity(armcpu) == id) {
-            return cpu;
-        }
+    arch_id = arm_get_archid_by_id(id);
+    if (arch_id && arch_id->cpu) {
+        return CPU(arch_id->cpu);
     }
 
     qemu_log_mask(LOG_GUEST_ERROR,
@@ -98,6 +118,7 @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
 {
     CPUState *target_cpu_state;
     ARMCPU *target_cpu;
+    CPUArchId *arch_id;
     struct CpuOnInfo *info;
 
     assert(bql_locked());
@@ -118,12 +139,24 @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
     }
 
     /* Retrieve the cpu we are powering up */
-    target_cpu_state = arm_get_cpu_by_id(cpuid);
-    if (!target_cpu_state) {
+    arch_id = arm_get_archid_by_id(cpuid);
+    if (!arch_id) {
         /* The cpu was not found */
         return QEMU_ARM_POWERCTL_INVALID_PARAM;
     }
 
+    target_cpu_state = CPU(arch_id->cpu);
+    if (!qemu_enabled_cpu(target_cpu_state)) {
+        /*
+         * The CPU is not plugged in or is disabled. Return the appropriate
+         * value as introduced in PSCI 1.2 (DEN0022E, issue E).
+         */
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "[ARM]%s: Denying attempt to online removed/disabled "
+                      "CPU%" PRId64"\n", __func__, cpuid);
+        return QEMU_ARM_POWERCTL_IS_OFF;
+    }
+
     target_cpu = ARM_CPU(target_cpu_state);
     if (target_cpu->power_state == PSCI_ON) {
         qemu_log_mask(LOG_GUEST_ERROR,
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 9a2468347a..4ea0a42f52 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11842,7 +11842,7 @@ void arm_cpu_do_interrupt(CPUState *cs)
                       env->exception.syndrome);
     }
 
-    if (tcg_enabled() && arm_is_psci_call(cpu, cs->exception_index)) {
+    if (arm_is_psci_call(cpu, cs->exception_index)) {
         arm_handle_psci_call(cpu);
         qemu_log_mask(CPU_LOG_INT, "...handled as PSCI call\n");
         return;
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 34dab0bb02..f7b7f7966a 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -480,21 +480,10 @@ vaddr arm_adjust_watchpoint_address(CPUState *cs, vaddr addr, int len);
 /* Callback function for when a watchpoint or breakpoint triggers. */
 void arm_debug_excp_handler(CPUState *cs);
 
-#if defined(CONFIG_USER_ONLY) || !defined(CONFIG_TCG)
-static inline bool arm_is_psci_call(ARMCPU *cpu, int excp_type)
-{
-    return false;
-}
-static inline void arm_handle_psci_call(ARMCPU *cpu)
-{
-    g_assert_not_reached();
-}
-#else
 /* Return true if the r0/x0 value indicates that this SMC/HVC is a PSCI call. */
 bool arm_is_psci_call(ARMCPU *cpu, int excp_type);
 /* Actually handle a PSCI call */
 void arm_handle_psci_call(ARMCPU *cpu);
-#endif
 
 /**
  * arm_clear_exclusive: clear the exclusive monitor
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 7acd83ce64..eb1c623828 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -544,9 +544,51 @@ int kvm_arch_get_default_type(MachineState *ms)
     return fixed_ipa ? 0 : size;
 }
 
+static bool kvm_arm_set_vm_attr(struct kvm_device_attr *attr, const char *name)
+{
+    int err;
+
+    err = kvm_vm_ioctl(kvm_state, KVM_HAS_DEVICE_ATTR, attr);
+    if (err != 0) {
+        error_report("%s: KVM_HAS_DEVICE_ATTR: %s", name, strerror(-err));
+        return false;
+    }
+
+    err = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, attr);
+    if (err != 0) {
+        error_report("%s: KVM_SET_DEVICE_ATTR: %s", name, strerror(-err));
+        return false;
+    }
+
+    return true;
+}
+
+int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction)
+{
+    struct kvm_smccc_filter filter = {
+        .base = func,
+        .nr_functions = 1,
+        .action = faction,
+    };
+    struct kvm_device_attr attr = {
+        .group = KVM_ARM_VM_SMCCC_CTRL,
+        .attr = KVM_ARM_VM_SMCCC_FILTER,
+        .flags = 0,
+        .addr = (uintptr_t)&filter,
+    };
+
+    if (!kvm_arm_set_vm_attr(&attr, "SMCCC Filter")) {
+        error_report("failed to set SMCCC filter in KVM Host");
+        return -1;
+    }
+
+    return 0;
+}
+
 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
     int ret = 0;
+
     /* For ARM interrupt delivery is always asynchronous,
      * whether we are using an in-kernel VGIC or not.
      */
@@ -609,6 +651,22 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     hw_breakpoints = g_array_sized_new(true, true,
                                        sizeof(HWBreakpoint), max_hw_bps);
 
+    /*
+     * To be able to handle PSCI CPU_ON calls in QEMU, we need to install an
+     * SMCCC filter in the host KVM. This is required to support features like
+     * virtual CPU Hotplug on ARM platforms.
+     */
+    if (kvm_arm_set_smccc_filter(PSCI_0_2_FN64_CPU_ON,
+                                 KVM_SMCCC_FILTER_FWD_TO_USER)) {
+        error_report("CPU On PSCI-to-user-space fwd filter install failed");
+        abort();
+    }
+    if (kvm_arm_set_smccc_filter(PSCI_0_2_FN_CPU_OFF,
+                                 KVM_SMCCC_FILTER_FWD_TO_USER)) {
+        error_report("CPU Off PSCI-to-user-space fwd filter install failed");
+        abort();
+    }
+
     return ret;
 }
 
@@ -1459,6 +1517,38 @@ static bool kvm_arm_handle_debug(ARMCPU *cpu,
     return false;
 }
 
+static int kvm_arm_handle_hypercall(CPUState *cs, struct kvm_run *run)
+{
+    ARMCPU *cpu = ARM_CPU(cs);
+    CPUARMState *env = &cpu->env;
+
+    kvm_cpu_synchronize_state(cs);
+
+    /*
+     * Hard-code the immediate to 0, as we don't expect a non-zero value for
+     * now. This might change in future versions; KVM_GET_ONE_REG could then
+     * be used in such cases, but it must first be enhanced so that the state
+     * synchronization also fetches the ESR_EL2 value.
+     */
+    if (run->hypercall.flags == KVM_HYPERCALL_EXIT_SMC) {
+        cs->exception_index = EXCP_SMC;
+        env->exception.syndrome = syn_aa64_smc(0);
+    } else {
+        cs->exception_index = EXCP_HVC;
+        env->exception.syndrome = syn_aa64_hvc(0);
+    }
+    env->exception.target_el = 1;
+    bql_lock();
+    arm_cpu_do_interrupt(cs);
+    bql_unlock();
+
+    /*
+     * For PSCI, exit the kvm_run loop and process the work. Especially
+     * important if this was a CPU_OFF command and we can't return to the guest.
+     */
+    return EXCP_INTERRUPT;
+}
+
 int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
 {
     ARMCPU *cpu = ARM_CPU(cs);
@@ -1475,6 +1565,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         ret = kvm_arm_handle_dabt_nisv(cpu, run->arm_nisv.esr_iss,
                                        run->arm_nisv.fault_ipa);
         break;
+    case KVM_EXIT_HYPERCALL:
+        ret = kvm_arm_handle_hypercall(cs, run);
+        break;
     default:
         qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
                       __func__, run->exit_reason);
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 0be7e896d2..b9c2b0f501 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -225,6 +225,15 @@ void kvm_arm_pvtime_init(ARMCPU *cpu, uint64_t ipa);
 
 int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level);
 
+/**
+ * kvm_arm_set_smccc_filter:
+ * @func: SMCCC function ID to be filtered
+ * @faction: SMCCC filter action (handle, deny, fwd-to-user) to be deployed
+ *
+ * Sets the ARM SMCCC filter in the KVM host for selective hypercall exits
+ */
+int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction);
+
 #else
 
 /*
@@ -294,6 +303,11 @@ static inline uint32_t kvm_arm_sve_get_vls(ARMCPU *cpu)
     g_assert_not_reached();
 }
 
+static inline int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction)
+{
+    g_assert_not_reached();
+}
+
 #endif
 
 #endif
diff --git a/target/arm/meson.build b/target/arm/meson.build
index 2e10464dbb..3e9f704f35 100644
--- a/target/arm/meson.build
+++ b/target/arm/meson.build
@@ -23,6 +23,7 @@ arm_system_ss.add(files(
   'arm-qmp-cmds.c',
   'cortex-regs.c',
   'machine.c',
+  'psci.c',
   'ptw.c',
 ))
 
diff --git a/target/arm/tcg/psci.c b/target/arm/psci.c
similarity index 97%
rename from target/arm/tcg/psci.c
rename to target/arm/psci.c
index 51d2ca3d30..b3fcb85079 100644
--- a/target/arm/tcg/psci.c
+++ b/target/arm/psci.c
@@ -21,7 +21,9 @@
 #include "exec/helper-proto.h"
 #include "kvm-consts.h"
 #include "qemu/main-loop.h"
+#include "qemu/error-report.h"
 #include "sysemu/runstate.h"
+#include "sysemu/tcg.h"
 #include "internals.h"
 #include "arm-powerctl.h"
 #include "target/arm/multiprocessing.h"
@@ -158,6 +160,11 @@ void arm_handle_psci_call(ARMCPU *cpu)
     case QEMU_PSCI_0_1_FN_CPU_SUSPEND:
     case QEMU_PSCI_0_2_FN_CPU_SUSPEND:
     case QEMU_PSCI_0_2_FN64_CPU_SUSPEND:
+        if (!tcg_enabled()) {
+            warn_report("CPU suspend not supported in non-tcg mode");
+            break;
+        }
+#ifdef CONFIG_TCG
         /* Affinity levels are not supported in QEMU */
         if (param[1] & 0xfffe0000) {
             ret = QEMU_PSCI_RET_INVALID_PARAMS;
@@ -170,6 +177,7 @@ void arm_handle_psci_call(ARMCPU *cpu)
             env->regs[0] = 0;
         }
         helper_wfi(env, 4);
+#endif
         break;
     case QEMU_PSCI_1_0_FN_PSCI_FEATURES:
         switch (param[1]) {
diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
index 508932a249..5b43c84c40 100644
--- a/target/arm/tcg/meson.build
+++ b/target/arm/tcg/meson.build
@@ -54,9 +54,5 @@ arm_ss.add(when: 'TARGET_AARCH64', if_true: files(
   'sve_helper.c',
 ))
 
-arm_system_ss.add(files(
-  'psci.c',
-))
-
 arm_system_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('cpu-v7m.c'))
 arm_user_ss.add(when: 'TARGET_AARCH64', if_false: files('cpu-v7m.c'))
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PATCH RFC V3 27/29] hw/arm: Support hotplug capability check using _OSC method
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (25 preceding siblings ...)
  2024-06-14  0:15 ` [PATCH RFC V3 26/29] target/arm/kvm, tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu Salil Mehta via
@ 2024-06-14  0:18 ` Salil Mehta via
  2024-06-14  0:19 ` [PATCH RFC V3 28/29] tcg/mttcg: enable threads to unregister in tcg_ctxs[] Salil Mehta via
                   ` (5 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-14  0:18 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Physical CPU hotplug results in (un)setting of the ACPI _STA.Present bit.
AARCH64 platforms do not support physical CPU hotplug. The virtual CPU hotplug
support being implemented toggles the ACPI _STA.Enabled bit to achieve hotplug
functionality. This is not the same as physical CPU hotplug support.

In future, if the ARM architecture supports physical CPU hotplug, the current
design of virtual CPU hotplug can be used unchanged. Hence, there is a need
for firmware/VMM/QEMU to support evaluation of the platform-wide capability
related to the *type* of CPU hotplug support present on the platform. OSPM
might need this during boot time to correctly initialize the CPUs and other
related components in the kernel.

NOTE: This implementation will be improved to add support for *query* in
subsequent versions. This is very minimal support to assist the kernel.

ASL for the implemented _OSC method:

Method (_OSC, 4, NotSerialized)  // _OSC: Operating System Capabilities
{
    CreateDWordField (Arg3, Zero, CDW1)
    If ((Arg0 == ToUUID ("0811b06e-4a27-44f9-8d60-3cbbc22e7b48") /* Platform-wide Capabilities */))
    {
        CreateDWordField (Arg3, 0x04, CDW2)
        Local0 = CDW2 /* \_SB_._OSC.CDW2 */
        If ((Arg1 != One))
        {
            CDW1 |= 0x08
        }

        Local0 &= 0x00800000
        If ((CDW2 != Local0))
        {
            CDW1 |= 0x10
        }

        CDW2 = Local0
    }
    Else
    {
        CDW1 |= 0x04
    }

    Return (Arg3)
}

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt-acpi-build.c | 52 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 4b4906f407..6cb613103f 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -818,6 +818,55 @@ static void build_fadt_rev6(GArray *table_data, BIOSLinker *linker,
     build_fadt(table_data, linker, &fadt, vms->oem_id, vms->oem_table_id);
 }
 
+static void build_virt_osc_method(Aml *scope, VirtMachineState *vms)
+{
+    Aml *if_uuid, *else_uuid, *if_rev, *if_caps_masked, *method;
+    Aml *a_cdw1 = aml_name("CDW1");
+    Aml *a_cdw2 = aml_local(0);
+
+    method = aml_method("_OSC", 4, AML_NOTSERIALIZED);
+    aml_append(method, aml_create_dword_field(aml_arg(3), aml_int(0), "CDW1"));
+
+    /* match UUID */
+    if_uuid = aml_if(aml_equal(
+        aml_arg(0), aml_touuid("0811B06E-4A27-44F9-8D60-3CBBC22E7B48")));
+
+    aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(4), "CDW2"));
+    aml_append(if_uuid, aml_store(aml_name("CDW2"), a_cdw2));
+
+    /* check unknown revision in arg(1) */
+    if_rev = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(1))));
+    /* set revision error bits,  DWORD1 Bit[3] */
+    aml_append(if_rev, aml_or(a_cdw1, aml_int(0x08), a_cdw1));
+    aml_append(if_uuid, if_rev);
+
+    /*
+     * check support for vCPU hotplug type(=enabled) platform-wide capability
+     * in DWORD2 as specified in the below ACPI Specification ECR,
+     *  # https://bugzilla.tianocore.org/show_bug.cgi?id=4481
+     */
+    if (vms->acpi_dev) {
+        aml_append(if_uuid, aml_and(a_cdw2, aml_int(0x800000), a_cdw2));
+        /* check if OSPM specified hotplug capability bits were masked */
+        if_caps_masked = aml_if(aml_lnot(aml_equal(aml_name("CDW2"), a_cdw2)));
+        aml_append(if_caps_masked, aml_or(a_cdw1, aml_int(0x10), a_cdw1));
+        aml_append(if_uuid, if_caps_masked);
+    }
+    aml_append(if_uuid, aml_store(a_cdw2, aml_name("CDW2")));
+
+    aml_append(method, if_uuid);
+    else_uuid = aml_else();
+
+    /* set unrecognized UUID error bits, DWORD1 Bit[2] */
+    aml_append(else_uuid, aml_or(a_cdw1, aml_int(4), a_cdw1));
+    aml_append(method, else_uuid);
+
+    aml_append(method, aml_return(aml_arg(3)));
+    aml_append(scope, method);
+
+    return;
+}
+
 /* DSDT */
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
@@ -852,6 +901,9 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     } else {
         acpi_dsdt_add_cpus(scope, vms);
     }
+
+    build_virt_osc_method(scope, vms);
+
     acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
                        (irqmap[VIRT_UART] + ARM_SPI_BASE));
     if (vmc->acpi_expose_flash) {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PATCH RFC V3 28/29] tcg/mttcg: enable threads to unregister in tcg_ctxs[]
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (26 preceding siblings ...)
  2024-06-14  0:18 ` [PATCH RFC V3 27/29] hw/arm: Support hotplug capability check using _OSC method Salil Mehta via
@ 2024-06-14  0:19 ` Salil Mehta via
  2024-06-14  0:20 ` [PATCH RFC V3 29/29] hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled Salil Mehta via
                   ` (4 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-14  0:19 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

From: Miguel Luis <miguel.luis@oracle.com>

[BROKEN: This patch is just for reference. It has problems as it does not take
care of the TranslationBlocks and their assigned regions during CPU unrealize]

When using TCG acceleration in a multi-threaded context, each vCPU has its own
thread registered in tcg_ctxs[] upon creation, and tcg_cur_ctxs stores the
current number of threads that have been created. However, the lack of a
mechanism to unregister these threads is a problem when exercising vCPU
hotplug/unplug: tcg_cur_ctxs gets incremented every time a vCPU is hotplugged
but never decremented when a vCPU is unplugged, eventually breaking the assert
stating tcg_cur_ctxs < tcg_max_ctxs after a certain number of vCPU hotplugs.

Suggested-by: Salil Mehta <salil.mehta@huawei.com>
[SM: Check Things To Do Section, https://lore.kernel.org/all/20200613213629.21984-1-salil.mehta@huawei.com/]
Signed-off-by: Miguel Luis <miguel.luis@oracle.com>
---
 accel/tcg/tcg-accel-ops-mttcg.c |  1 +
 include/tcg/startup.h           |  7 +++++++
 tcg/tcg.c                       | 24 ++++++++++++++++++++++++
 3 files changed, 32 insertions(+)

diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
index c552b45b8e..b6d7911a87 100644
--- a/accel/tcg/tcg-accel-ops-mttcg.c
+++ b/accel/tcg/tcg-accel-ops-mttcg.c
@@ -122,6 +122,7 @@ static void *mttcg_cpu_thread_fn(void *arg)
     bql_unlock();
     rcu_remove_force_rcu_notifier(&force_rcu.notifier);
     rcu_unregister_thread();
+    tcg_unregister_thread();
     return NULL;
 }
 
diff --git a/include/tcg/startup.h b/include/tcg/startup.h
index f71305765c..dc35b24de5 100644
--- a/include/tcg/startup.h
+++ b/include/tcg/startup.h
@@ -45,6 +45,13 @@ void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus);
  */
 void tcg_register_thread(void);
 
+/**
+ * tcg_unregister_thread: Unregister this thread with the TCG runtime
+ *
+ * TBD
+ */
+void tcg_unregister_thread(void);
+
 /**
  * tcg_prologue_init(): Generate the code for the TCG prologue
  *
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 34e3056380..e5bbe8dc07 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -761,6 +761,15 @@ QEMU_BUILD_BUG_ON((int)(offsetof(CPUNegativeOffsetState, tlb.f[0]) -
                   < MIN_TLB_MASK_TABLE_OFS);
 #endif
 
+/* TODO: vCPU Hotplug: Need to come back and fix the TCG   */
+static void free_tcg_plugin_context(TCGContext *s)
+{
+#ifdef CONFIG_PLUGIN
+    g_ptr_array_unref(s->plugin_tb->insns);
+    g_free(s->plugin_tb);
+#endif
+}
+
 /*
  * All TCG threads except the parent (i.e. the one that called tcg_context_init
  * and registered the target's TCG globals) must register with this function
@@ -810,6 +819,21 @@ void tcg_register_thread(void)
 
     tcg_ctx = s;
 }
+
+void tcg_unregister_thread(void)
+{
+    TCGContext *s = tcg_ctx;
+    unsigned int n;
+
+    /* Unclaim an entry in tcg_ctxs */
+    n = qatomic_fetch_dec(&tcg_cur_ctxs);
+    g_assert(n > 1);
+    qatomic_store_release(&tcg_ctxs[n - 1], 0);
+
+    free_tcg_plugin_context(s);
+
+    g_free(s);
+}
 #endif /* !CONFIG_USER_ONLY */
 
 /* pool based memory allocation */
-- 
2.34.1




* [PATCH RFC V3 29/29] hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (27 preceding siblings ...)
  2024-06-14  0:19 ` [PATCH RFC V3 28/29] tcg/mttcg: enable threads to unregister in tcg_ctxs[] Salil Mehta via
@ 2024-06-14  0:20 ` Salil Mehta via
  2024-06-26  9:53 ` [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Vishnu Pajjuri
                   ` (3 subsequent siblings)
  32 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-14  0:20 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, mst
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, gshan,
	rafael, borntraeger, alex.bennee, npiggin, harshpb, linux, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Hotpluggable CPUs MUST be exposed as 'online-capable' according to the new
change. However, cold-booted CPUs, if marked as 'online-capable' during boot
time, might not be detected by legacy operating systems. This could cause
compatibility problems.

Original Change Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706

Since updating the specification might take time, it is necessary to disable the
support for unplugging any cold-booted CPUs to preserve compatibility with
legacy operating systems.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt-acpi-build.c | 29 ++++++++++++++++++++---------
 hw/arm/virt.c            | 16 ++++++++++++++++
 include/hw/core/cpu.h    |  2 ++
 3 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 6cb613103f..322ed8e35b 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -667,17 +667,28 @@ static uint32_t virt_acpi_get_gicc_flags(CPUState *cpu)
     }
 
     /*
-     * ARM GIC CPU Interface can be 'online-capable' or 'enabled' at boot
-     * We MUST set 'online-capable' bit for all hotpluggable CPUs except the
-     * first/boot CPU. Cold-booted CPUs without 'Id' can also be unplugged.
-     * Though as-of-now this is only used as a debugging feature.
+     * The ARM GIC CPU Interface can be either 'online-capable' or 'enabled' at
+     * boot. We MUST set the 'online-capable' bit for all hotpluggable CPUs.
      *
-     *   UEFI ACPI Specification 6.5
-     *   Section: 5.2.12.14. GIC CPU Interface (GICC) Structure
-     *   Table:   5.37 GICC CPU Interface Flags
-     *   Link: https://uefi.org/specs/ACPI/6.5
+     * Change Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706
+     *
+     * Refer to the UEFI ACPI Specification 6.5:
+     * Section: 5.2.12.14. GIC CPU Interface (GICC) Structure
+     * Table: 5.37 GICC CPU Interface Flags
+     * Link: https://uefi.org/specs/ACPI/6.5
+     *
+     * Cold-booted CPUs, except for the first/boot CPU, SHOULD be allowed to be
+     * hot(un)plugged as well. However, for this to happen, these CPUs MUST have
+     * the 'online-capable' bit set. This creates a compatibility problem with
+     * legacy OS, as it might ignore 'online-capable' bits during boot time, and
+     * hence some CPUs might not get detected.
+     *
+     * To fix this, the MADT GIC CPU interface flag should allow both
+     * 'online-capable' and 'enabled' bits to be set together. This change will
+     * require an update to the UEFI ACPI standard. Until this update occurs,
+     * all cold-booted CPUs should be exposed as 'enabled' only.
      */
-    return cpu && !cpu->cpu_index ? 1 : (1 << 3);
+    return cpu && cpu->cold_booted ? 1 : (1 << 3);
 }
 
 static void
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a2200099a1..770b599acf 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3163,6 +3163,10 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
      * This shall be used during the init of ACPI Hotplug state and hot-unplug
      */
      cs->acpi_persistent = true;
+
+    if (!dev->hotplugged) {
+        cs->cold_booted = true;
+    }
 }
 
 static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
@@ -3223,6 +3227,18 @@ static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
         return;
     }
 
+    /*
+     * UEFI ACPI standard change is required to make both 'enabled' and the
+     * 'online-capable' bit co-exist instead of being mutually exclusive.
+     * check virt_acpi_get_gicc_flags() for more details.
+     *
+     * Disable the unplugging of cold-booted vCPUs as a temporary mitigation.
+     */
+    if (cs->cold_booted) {
+        error_setg(errp, "Hot-unplug of cold-booted CPU not supported!");
+        return;
+    }
+
     if (cs->cpu_index == first_cpu->cpu_index) {
         error_setg(errp, "Boot CPU(id%d=%d:%d:%d:%d) hot-unplug not supported",
                    first_cpu->cpu_index, cpu->socket_id, cpu->cluster_id,
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index e13e542177..99b699b47f 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -556,6 +556,8 @@ struct CPUState {
     uint32_t halted;
     int32_t exception_index;
 
+    bool cold_booted;
+
     AccelCPUState *accel;
 
     /* Used to keep track of an outstanding cpu throttle thread for migration
-- 
2.34.1




* Re: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (28 preceding siblings ...)
  2024-06-14  0:20 ` [PATCH RFC V3 29/29] hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled Salil Mehta via
@ 2024-06-26  9:53 ` Vishnu Pajjuri
  2024-06-26 18:01   ` Salil Mehta via
  2024-07-01 11:38 ` Miguel Luis
                   ` (2 subsequent siblings)
  32 siblings, 1 reply; 105+ messages in thread
From: Vishnu Pajjuri @ 2024-06-26  9:53 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, gshan, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm


Hi Salil,

On 14-06-2024 05:06, Salil Mehta wrote:
> PROLOGUE
> ========
>
> To assist in review and set the right expectations from this RFC, please first
> read the sections *APPENDED AT THE END* of this cover letter:
>
> 1. Important *DISCLAIMER* [Section (X)]
> 2. Work presented at KVMForum Conference (slides available) [Section (V)F]
> 3. Organization of patches [Section (XI)]
> 4. References [Section (XII)]
> 5. Detailed TODO list of leftover work or work-in-progress [Section (IX)]
>
> There has been interest shown by other organizations in adapting this series
> for their architecture. Hence, RFC V2 [21] has been split into architecture
> *agnostic* [22] and *specific* patch sets.
>
> This is an ARM architecture-specific patch set carved out of RFC V2. Please
> check section (XI)B for details of architecture agnostic patches.
>
> SECTIONS [I - XIII] are as follows:
>
> (I) Key Changes [details in last section (XIV)]
> ==============================================
>
> RFC V2 -> RFC V3
>
> 1. Split into Architecture *agnostic* (V13) [22] and *specific* (RFC V3) patch sets.
> 2. Addressed comments by Gavin Shan (RedHat), Shaoqin Huang (RedHat), Philippe Mathieu-Daudé (Linaro),
>     Jonathan Cameron (Huawei), Zhao Liu (Intel).

I tried following test cases with rfc-v3 and kernel patches v10, and 
it's looking good on Ampere platforms.

  * Regular hotplug and hot unplug tests
  * Live migration with and with out hot-plugging vcpus tests

Please feel free to add,
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>

_Regards_,

-Vishnu.

> RFC V1 -> RFC V2
>
> RFC V1:https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
>
> 1. ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
>     *online-capable* or *enabled* to the Guest OS at boot time. This means
>     associated CPUs can have ACPI _STA as *enabled* or *disabled* even after boot.
>     See UEFI ACPI 6.5 Spec, Section 05, Table 5.37 GICC CPU Interface Flags[20].
> 2. SMCC/HVC Hypercall exit handling in userspace/Qemu for PSCI CPU_{ON,OFF}
>     request. This is required to {dis}allow online'ing a vCPU.
> 3. Always presenting unplugged vCPUs in CPUs ACPI AML code as ACPI _STA.PRESENT
>     to the Guest OS. Toggling ACPI _STA.Enabled to give an effect of the
>     hot{un}plug.
> 4. Live Migration works (some issues are still there).
> 5. TCG/HVF/qtest does not support Hotplug and falls back to default.
> 6. Code for TCG support exists in this release (it is a work-in-progress).
> 7. ACPI _OSC method can now be used by OSPM to negotiate Qemu VM platform
>     hotplug capability (_OSC Query support still pending).
> 8. Misc. Bug fixes.
>
> (II) Summary
> ============
>
> This patch set introduces virtual CPU hotplug support for the ARMv8 architecture
> in QEMU. The idea is to be able to hotplug and hot-unplug vCPUs while the guest VM
> is running, without requiring a reboot. This does *not* make any assumptions about
> the physical CPU hotplug availability within the host system but rather tries to
> solve the problem at the virtualizer/QEMU layer. It introduces ACPI CPU hotplug hooks
> and event handling to interface with the guest kernel, and code to initialize, plug,
> and unplug CPUs. No changes are required within the host kernel/KVM except the
> support of hypercall exit handling in the user-space/Qemu, which has recently
> been added to the kernel. Corresponding guest kernel changes have been
> posted on the mailing list [3] [4] by James Morse.
>
> (III) Motivation
> ================
>
> This allows scaling the guest VM compute capacity on-demand, which would be
> useful for the following example scenarios:
>
> 1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the orchestration
>     framework that could adjust resource requests (CPU and Mem requests) for
>     the containers in a pod, based on usage.
> 2. Pay-as-you-grow Business Model: Infrastructure providers could allocate and
>     restrict the total number of compute resources available to the guest VM
>     according to the SLA (Service Level Agreement). VM owners could request more
>     compute to be hot-plugged for some cost.
>
> For example, Kata Container VM starts with a minimum amount of resources (i.e.,
> hotplug everything approach). Why?
>
> 1. Allowing faster *boot time* and
> 2. Reduction in *memory footprint*
>
> Kata Container VM can boot with just 1 vCPU, and then later more vCPUs can be
> hot-plugged as needed.
>
> (IV) Terminology
> ================
>
> (*) Possible CPUs: Total vCPUs that could ever exist in the VM. This includes
>                     any cold-booted CPUs plus any CPUs that could be later
>                     hot-plugged.
>                     - Qemu parameter (-smp maxcpus=N)
> (*) Present CPUs:  Possible CPUs that are ACPI 'present'. These might or might
>                     not be ACPI 'enabled'.
>                     - Present vCPUs = Possible vCPUs (Always on ARM Arch)
> (*) Enabled CPUs:  Possible CPUs that are ACPI 'present' and 'enabled' and can
>                     now be ‘onlined’ (PSCI) for use by the Guest Kernel. All cold-
>                     booted vCPUs are ACPI 'enabled' at boot. Later, using
>                     device_add, more vCPUs can be hotplugged and made ACPI
>                     'enabled'.
>                     - Qemu parameter (-smp cpus=N). Can be used to specify some
> 	           cold-booted vCPUs during VM init. Some can be added using the
> 	           '-device' option.
>
> (V) Constraints Due to ARMv8 CPU Architecture [+] Other Impediments
> ===================================================================
>
> A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
>     1. ARMv8 CPU architecture does not support the concept of the physical CPU
>        hotplug.
>        a. There are many per-CPU components like PMU, SVE, MTE, Arch timers, etc.,
>           whose behavior needs to be clearly defined when the CPU is hot(un)plugged.
>           There is no specification for this.
>
>     2. Other ARM components like GIC, etc., have not been designed to realize
>        physical CPU hotplug capability as of now. For example,
>        a. Every physical CPU has a unique GICC (GIC CPU Interface) by construct.
>           Architecture does not specify what CPU hot(un)plug would mean in
>           context to any of these.
>        b. CPUs/GICC are physically connected to unique GICR (GIC Redistributor).
>           GIC Redistributors are always part of the always-on power domain. Hence,
>           they cannot be powered off as per specification.
>
> B. Impediments in Firmware/ACPI (Architectural Constraint)
>
>     1. Firmware has to expose GICC, GICR, and other per-CPU features like PMU,
>        SVE, MTE, Arch Timers, etc., to the OS. Due to the architectural constraint
>        stated in section A1(a), all interrupt controller structures of
>        MADT describing GIC CPU Interfaces and the GIC Redistributors MUST be
>        presented by firmware to the OSPM during boot time.
>     2. Architectures that support CPU hotplug can evaluate the ACPI _MAT method to
>        get this kind of information from the firmware even after boot, and the
>        OSPM has the capability to process these. ARM kernel uses information in MADT
>        interrupt controller structures to identify the number of present CPUs during
>        boot and hence does not allow to change these after boot. The number of
>        present CPUs cannot be changed. It is an architectural constraint!
>
> C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural Constraint)
>
>     1. KVM VGIC:
>        a. Sizing of various VGIC resources like memory regions, etc., related to
>           the redistributor happens only once and is fixed at the VM init time
>           and cannot be changed later after initialization has happened.
>           KVM statically configures these resources based on the number of vCPUs
>           and the number/size of redistributor ranges.
>        b. Association between vCPU and its VGIC redistributor is fixed at the
>           VM init time within the KVM, i.e., when redistributor iodevs gets
>           registered. VGIC does not allow to setup/change this association
>           after VM initialization has happened. Physically, every CPU/GICC is
>           uniquely connected with its redistributor, and there is no
>           architectural way to set this up.
>     2. KVM vCPUs:
>        a. Lack of specification means destruction of KVM vCPUs does not exist as
>           there is no reference to tell what to do with other per-vCPU
>           components like redistributors, arch timer, etc.
>        b. In fact, KVM does not implement the destruction of vCPUs for any
>           architecture. This is independent of whether the architecture
>           actually supports CPU Hotplug feature. For example, even for x86 KVM
>           does not implement the destruction of vCPUs.
>
> D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)
>
>     1. Qemu CPU Objects MUST be created to initialize all the Host KVM vCPUs to
>        overcome the KVM constraint. KVM vCPUs are created and initialized when Qemu
>        CPU Objects are realized. But keeping the QOM CPU objects realized for
>        'yet-to-be-plugged' vCPUs can create problems when these new vCPUs shall
>        be plugged using device_add and a new QOM CPU object shall be created.
>     2. GICV3State and GICV3CPUState objects MUST be sized over *possible vCPUs*
>        during VM init time while QOM GICV3 Object is realized. This is because
>        KVM VGIC can only be initialized once during init time. But every
>        GICV3CPUState has an associated QOM CPU Object. The latter might
>        correspond to vCPUs which are 'yet-to-be-plugged' (unplugged at init).
>     3. How should new QOM CPU objects be connected back to the GICV3CPUState
>        objects and disconnected from it in case the CPU is being hot(un)plugged?
>     4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented in the
>        QOM for which KVM vCPU already exists? For example, whether to keep,
>         a. No QOM CPU objects Or
>         b. Unrealized CPU Objects
>     5. How should vCPU state be exposed via ACPI to the Guest? Especially for
>        the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not exist
>        within the QOM but the Guest always expects all possible vCPUs to be
>        identified as ACPI *present* during boot.
>     6. How should Qemu expose GIC CPU interfaces for the unplugged or
>        yet-to-be-plugged vCPUs using ACPI MADT Table to the Guest?
>
> E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)
>
>     1. At VM Init, pre-create all the possible vCPUs in the Host KVM i.e., even
>        for the vCPUs which are yet-to-be-plugged in Qemu but keep them in the
>        powered-off state.
>     2. After the KVM vCPUs have been initialized in the Host, the KVM vCPU
>        objects corresponding to the unplugged/yet-to-be-plugged vCPUs are parked
>        at the existing per-VM "kvm_parked_vcpus" list in Qemu. (similar to x86)
>     3. GICV3State and GICV3CPUState objects are sized over possible vCPUs during
>        VM init time i.e., when Qemu GIC is realized. This, in turn, sizes KVM VGIC
>        resources like memory regions, etc., related to the redistributors with the
>        number of possible KVM vCPUs. This never changes after VM has initialized.
>     4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs are
>        released post Host KVM CPU and GIC/VGIC initialization.
>     5. Build ACPI MADT Table with the following updates:
>        a. Number of GIC CPU interface entries (=possible vCPUs)
>        b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable)
>        c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
>           - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
> 	 - vCPU can be ACPI enabled+onlined after Guest boots (Firmware Policy)
> 	 - Some issues with above (details in later sections)
>     6. Expose below ACPI Status to Guest kernel:
>        a. Always _STA.Present=1 (all possible vCPUs)
>        b. _STA.Enabled=1 (plugged vCPUs)
>        c. _STA.Enabled=0 (unplugged vCPUs)
>     7. vCPU hotplug *realizes* new QOM CPU object. The following happens:
>        a. Realizes, initializes QOM CPU Object & spawns Qemu vCPU thread.
>        b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list).
>           - Attaches to QOM CPU object.
>        c. Reinitializes KVM vCPU in the Host.
>           - Resets the core and sys regs, sets defaults, etc.
>        d. Runs KVM vCPU (created with "start-powered-off").
> 	 - vCPU thread sleeps (waits for vCPU reset via PSCI).
>        e. Updates Qemu GIC.
>           - Wires back IRQs related to this vCPU.
>           - GICV3CPUState association with QOM CPU Object.
>        f. Updates [6] ACPI _STA.Enabled=1.
>        g. Notifies Guest about the new vCPU (via ACPI GED interface).
> 	 - Guest checks _STA.Enabled=1.
> 	 - Guest adds processor (registers CPU with LDM) [3].
>        h. Plugs the QOM CPU object in the slot.
>           - slot-number = cpu-index {socket, cluster, core, thread}.
>        i. Guest online's vCPU (CPU_ON PSCI call over HVC/SMC).
>           - KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
>           - Qemu powers-on KVM vCPU in the Host.
>     8. vCPU hot-unplug *unrealizes* QOM CPU Object. The following happens:
>        a. Notifies Guest (via ACPI GED interface) vCPU hot-unplug event.
>           - Guest offline's vCPU (CPU_OFF PSCI call over HVC/SMC).
>        b. KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
>           - Qemu powers-off the KVM vCPU in the Host.
>        c. Guest signals *Eject* vCPU to Qemu.
>        d. Qemu updates [6] ACPI _STA.Enabled=0.
>        e. Updates GIC.
>           - Un-wires IRQs related to this vCPU.
>           - GICV3CPUState association with new QOM CPU Object is updated.
>        f. Unplugs the vCPU.
> 	 - Removes from slot.
>           - Parks KVM vCPU ("kvm_parked_vcpus" list).
>           - Unrealizes QOM CPU Object & joins back Qemu vCPU thread.
> 	 - Destroys QOM CPU object.
>        g. Guest checks ACPI _STA.Enabled=0.
>           - Removes processor (unregisters CPU with LDM) [3].
>
> F. Work Presented at KVM Forum Conferences:
> ==========================================
>
> Details of the above work have been presented at KVMForum2020 and KVMForum2023
> conferences. Slides & video are available at the links below:
> a. KVMForum 2023
>     - Challenges Revisited in Supporting Virt CPU Hotplug on architectures that don't Support CPU Hotplug (like ARM64).
>       https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
>       https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
>       https://www.youtube.com/watch?v=hyrw4j2D6I0&t=23970s
>       https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
> b. KVMForum 2020
>     - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems (like ARM64) - Salil Mehta, Huawei.
>       https://sched.co/eE4m
>
> (VI) Commands Used
> ==================
>
> A. Qemu launch commands to init the machine:
>
>      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>        -cpu host -smp cpus=4,maxcpus=6 \
>        -m 300M \
>        -kernel Image \
>        -initrd rootfs.cpio.gz \
>        -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
>        -nographic \
>        -bios QEMU_EFI.fd \
>
> B. Hot-(un)plug related commands:
>
>    # Hotplug a host vCPU (accel=kvm):
>      $ device_add host-arm-cpu,id=core4,core-id=4
>
>    # Hotplug a vCPU (accel=tcg):
>      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
>
>    # Delete the vCPU:
>      $ device_del core4
>
> Sample output on guest after boot:
>
>      $ cat /sys/devices/system/cpu/possible
>      0-5
>      $ cat /sys/devices/system/cpu/present
>      0-5
>      $ cat /sys/devices/system/cpu/enabled
>      0-3
>      $ cat /sys/devices/system/cpu/online
>      0-1
>      $ cat /sys/devices/system/cpu/offline
>      2-5
>
> Sample output on guest after hotplug of vCPU=4:
>
>      $ cat /sys/devices/system/cpu/possible
>      0-5
>      $ cat /sys/devices/system/cpu/present
>      0-5
>      $ cat /sys/devices/system/cpu/enabled
>      0-4
>      $ cat /sys/devices/system/cpu/online
>      0-1,4
>      $ cat /sys/devices/system/cpu/offline
>      2-3,5
>
>      Note: vCPU=4 was explicitly 'onlined' after hot-plug
>      $ echo 1 > /sys/devices/system/cpu/cpu4/online
>
> (VII) Latest Repository
> =======================
>
> (*) Latest Qemu RFC V3 (Architecture Specific) patch set:
>      https://github.com/salil-mehta/qemu.git  virt-cpuhp-armv8/rfc-v3
> (*) Latest Qemu V13 (Architecture Agnostic) patch set:
>      https://github.com/salil-mehta/qemu.git  virt-cpuhp-armv8/rfc-v3.arch.agnostic.v13
> (*) QEMU changes for vCPU hotplug can be cloned from below site:
>      https://github.com/salil-mehta/qemu.git  virt-cpuhp-armv8/rfc-v2
> (*) Guest Kernel changes (by James Morse, ARM) are available here:
>      https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git  virtual_cpu_hotplug/rfc/v2
> (*) Leftover patches of the kernel are available here:
>      https://lore.kernel.org/lkml/20240529133446.28446-1-Jonathan.Cameron@huawei.com/
>      https://github.com/salil-mehta/linux/commits/virtual_cpu_hotplug/rfc/v6.jic/  (not latest)
>
> (VIII) KNOWN ISSUES
> ===================
>
> 1. Migration has been lightly tested but has been found to be working.
> 2. TCG is broken.
> 3. HVF and qtest are not supported yet.
> 4. ACPI MADT Table flags [7] MADT.GICC.Enabled and MADT.GICC.online-capable are
>     mutually exclusive, i.e., as per the change [6], a vCPU cannot be both
>     GICC.Enabled and GICC.online-capable. This means:
>        [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
>     a. If we have to support hot-unplug of cold-booted vCPUs, then these MUST
>        be specified as GICC.online-capable in the MADT Table during boot by the
>        firmware/Qemu. But this requirement conflicts with the requirement to
>        support the new Qemu changes on legacy OSes that do not understand the
>        MADT.GICC.online-capable bit. A legacy OS will ignore this bit during
>        boot, and hence these vCPUs will not appear on such an OS. This is
>        unexpected behavior.
>     b. If we instead specify vCPUs as MADT.GICC.Enabled and try to unplug these
>        cold-booted vCPUs from the OS (which in actuality should be blocked by
>        Qemu returning an error), then features like 'kexec' will break.
>     c. As I understand, removal of cold-booted vCPUs is a required feature,
>        and the x86 world allows it.
>     d. Hence, either we need a specification change to make the MADT.GICC.Enabled
>        and MADT.GICC.online-capable bits NOT mutually exclusive, or we must NOT
>        support the removal of cold-booted vCPUs. In the latter case, a check can
>        be introduced to bar users from unplugging cold-booted vCPUs using QMP
>        commands. (Needs discussion!)
>        Please check the patch that is part of this patch set:
>        [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled].
> 
>        NOTE: This is definitely not a blocker!
> 5. Code related to the notification to GICV3 about the hot(un)plug of a vCPU event
>     might need further discussion.
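To make the mutual exclusivity in issue 4 concrete, here is a minimal Python sketch of the flag policy (illustrative only, not the actual QEMU code; bit positions taken from the ACPI 6.5 GICC CPU Interface Flags [7]):

```python
# Illustrative sketch (NOT the actual QEMU implementation) of the MADT.GICC
# flag policy described in issue 4: per [6][7], Enabled and online-capable
# are mutually exclusive, so each vCPU gets exactly one of them.
GICC_ENABLED = 1 << 0          # MADT.GICC.Flags bit 0: Enabled
GICC_ONLINE_CAPABLE = 1 << 3   # MADT.GICC.Flags bit 3: Online Capable

def gicc_flags(cold_booted):
    """Cold-booted vCPUs are presented Enabled; hotpluggable ones online-capable."""
    return GICC_ENABLED if cold_booted else GICC_ONLINE_CAPABLE

for cold in (True, False):
    f = gicc_flags(cold)
    # The two bits must never be set together (mutual exclusivity).
    assert not ((f & GICC_ENABLED) and (f & GICC_ONLINE_CAPABLE))
```

The conflict in issue 4 is exactly that this either/or choice cannot satisfy both legacy OSes (which only understand Enabled) and hot-unplug of cold-booted vCPUs (which needs online-capable).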
>
>
> (IX) THINGS TO DO
> =================
>
> 1. Fix issues related to TCG/Emulation support. (Not a blocker)
> 2. Comprehensive testing is in progress. (Positive feedback from Oracle & Ampere)
> 3. Qemu documentation (.rst) needs to be updated.
> 4. Fix qtest and HVF support. (Future)
> 5. Resolve the design issue related to the ACPI MADT.GICC flags discussed in the
>     known issues. This might require a UEFI ACPI specification change. (Not a blocker)
> 6. Add ACPI _OSC 'Query' support. Only part of _OSC support exists now. (Not a blocker)
>
> The above is *not* a complete list. Will update later!
>
> Best regards,
> Salil.
>
> (X) DISCLAIMER
> ==============
>
> This work is an attempt to present a proof-of-concept of the ARM64 vCPU hotplug
> implementation to the community. This is *not* production-level code and might
> have bugs. Comprehensive testing is being done on HiSilicon Kunpeng920 SoC,
> Oracle, and Ampere servers. We are nearing stable code and a non-RFC
> version shall be floated soon.
>
> This work is *mostly* along the lines of the discussions that have happened in
> previous years [see refs below] across different channels like the mailing list,
> the Linaro Open Discussions platform, and various conferences like KVMForum. This
> RFC is being used as a way to verify the idea mentioned in this cover letter and
> to get community views. Once this has been agreed upon, a formal patch shall be
> posted to the mailing list for review.
>
> [The concept being presented has been found to work!]
>
> (XI) ORGANIZATION OF PATCHES
> ============================
>   
> A. Architecture *specific* patches:
>
>     [Patch 1-8, 17, 27, 29] logic required during machine init.
>      (*) Some validation checks.
>      (*) Introduces core-id property and some util functions required later.
>      (*) Logic to pre-create vCPUs.
>      (*) GIC initialization pre-sized with possible vCPUs.
>      (*) Some refactoring to have common hot and cold plug logic together.
>      (*) Release of disabled QOM CPU objects in post_cpu_init().
>      (*) Support of ACPI _OSC method to negotiate platform hotplug capabilities.
>     [Patch 9-16] logic related to ACPI at machine init time.
>      (*) Changes required to Enable ACPI for CPU hotplug.
>      (*) Initialization of ACPI GED framework to cater to CPU Hotplug Events.
>      (*) ACPI MADT/MAT changes.
>     [Patch 18-26] logic required during vCPU hot-(un)plug.
>      (*) Basic framework changes to support vCPU hot-(un)plug.
>      (*) ACPI GED changes for hot-(un)plug hooks.
>      (*) Wire-unwire the IRQs.
>      (*) GIC notification logic.
>      (*) ARMCPU unrealize logic.
>      (*) Handling of SMCCC hypercall exits from KVM to Qemu.
>     
> B. Architecture *agnostic* patches:
>
>     [PATCH V13 0/8] Add architecture agnostic code to support vCPU Hotplug.
>     https://lore.kernel.org/qemu-devel/20240607115649.214622-1-salil.mehta@huawei.com/T/#md0887eb07976bc76606a8204614ccc7d9a01c1f7
>      (*) Refactors vCPU creation, parking, and unparking logic, and adds traces.
>      (*) Build ACPI AML related to CPU control dev.
>      (*) Changes related to the destruction of CPU Address Space.
>      (*) Changes related to the uninitialization of GDB Stub.
>      (*) Updating of Docs.
>
> (XII) REFERENCES
> ================
>
> [1]https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> [2]https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
> [3]https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
> [4]https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> [5]https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
> [6]https://bugzilla.tianocore.org/show_bug.cgi?id=3706
> [7]https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> [8]https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> [9]https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
> [10]https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
> [11]https://lkml.org/lkml/2019/7/10/235
> [12]https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
> [13]https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
> [14]https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
> [15]http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
> [16]https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
> [17]https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
> [18]https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
> [19]https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/  
> [20]https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
> [21]https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.mehta@huawei.com/
> [22]https://lore.kernel.org/qemu-devel/20240607115649.214622-1-salil.mehta@huawei.com/T/#md0887eb07976bc76606a8204614ccc7d9a01c1f7
>
> (XIII) ACKNOWLEDGEMENTS
> =======================
>
> I would like to take this opportunity to thank below people for various
> discussions with me over different channels during the development:
>
> Marc Zyngier (Google),              Catalin Marinas (ARM),
> James Morse (ARM),                  Will Deacon (Google),
> Jean-Philippe Brucker (Linaro),     Sudeep Holla (ARM),
> Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat),
> Jonathan Cameron (Huawei),          Darren Hart (Ampere),
> Igor Mammedov (Redhat),             Ilkka Koskinen (Ampere),
> Andrew Jones (Redhat),              Karl Heubaum (Oracle),
> Keqian Zhu (Huawei),                Miguel Luis (Oracle),
> Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
> Shameerali Kolothum (Huawei),       Russell King (Oracle),
> Xuwei/Joy (Huawei),                 Peter Maydell (Linaro),
> Zengtao/Prime (Huawei),             And all those whom I have missed!
>
> Many thanks to the following people for their current or past contributions:
>
> 1. James Morse (ARM)
>     (Current Kernel part of vCPU Hotplug Support on AARCH64)
> 2. Jean-Philippe Brucker (Linaro)
>     (Prototyped one of the earlier PSCI-based POC [17][18] based on RFC V1)
> 3. Keqian Zhu (Huawei)
>     (Co-developed Qemu prototype)
> 4. Xiongfeng Wang (Huawei)
>     (Co-developed an earlier kernel prototype with me)
> 5. Vishnu Pajjuri (Ampere)
>     (Verification on Ampere ARM64 Platforms + fixes)
> 6. Miguel Luis (Oracle)
>     (Verification on Oracle ARM64 Platforms + fixes)
> 7. Russell King (Oracle) & Jonathan Cameron (Huawei)
>     (Helping in upstreaming James Morse's Kernel patches).
>
> (XIV) Change Log:
> =================
>
> RFC V2 -> RFC V3:
> -----------------
> 1. Miscellaneous:
>     - Split the RFC V2 into arch-agnostic and arch-specific patch sets.
> 2. Addressed Gavin Shan's (RedHat) comments:
>     - Made CPU property accessors inline.
>       https://lore.kernel.org/qemu-devel/6cd28639-2cfa-f233-c6d9-d5d2ec5b1c58@redhat.com/
>     - Collected Reviewed-bys [PATCH RFC V2 4/37, 14/37, 22/37].
>     - Dropped the patch as it was not required after init logic was refactored.
>       https://lore.kernel.org/qemu-devel/4fb2eef9-6742-1eeb-721a-b3db04b1be97@redhat.com/
>     - Fixed the range check for the core during vCPU Plug.
>       https://lore.kernel.org/qemu-devel/1c5fa24c-6bf3-750f-4f22-087e4a9311af@redhat.com/
>     - Added has_hotpluggable_vcpus check to make build_cpus_aml() conditional.
>       https://lore.kernel.org/qemu-devel/832342cb-74bc-58dd-c5d7-6f995baeb0f2@redhat.com/
>     - Fixed the states initialization in cpu_hotplug_hw_init() to accommodate previous refactoring.
>       https://lore.kernel.org/qemu-devel/da5e5609-1883-8650-c7d8-6868c7b74f1c@redhat.com/
>     - Fixed typos.
>       https://lore.kernel.org/qemu-devel/eb1ac571-7844-55e6-15e7-3dd7df21366b@redhat.com/
>     - Removed the unnecessary 'goto fail'.
>       https://lore.kernel.org/qemu-devel/4d8980ac-f402-60d4-fe52-787815af8a7d@redhat.com/#t
>     - Added check for hotpluggable vCPUs in the _OSC method.
>       https://lore.kernel.org/qemu-devel/20231017001326.FUBqQ1PTowF2GxQpnL3kIW0AhmSqbspazwixAHVSi6c@z/
> 3. Addressed Shaoqin Huang's (RedHat) comments:
>     - Fixed the compilation break due to a missing call to virt_cpu_properties()
>       along with its definition.
>       https://lore.kernel.org/qemu-devel/3632ee24-47f7-ae68-8790-26eb2cf9950b@redhat.com/
> 4. Addressed Jonathan Cameron's (Huawei) comments:
>     - Gated the 'disabled vcpu message' for GIC version < 3.
>       https://lore.kernel.org/qemu-devel/20240116155911.00004fe1@Huawei.com/
>
> RFC V1 -> RFC V2:
> -----------------
> 1. Addressed James Morse's (ARM) requirement as per Linaro Open Discussion:
>     - Exposed all possible vCPUs as always ACPI _STA.present and available during boot time.
>     - Added the _OSC handling as required by James's patches.
>     - Introduction of 'online-capable' bit handling in the flag of MADT GICC.
>     - SMCCC Hypercall Exit handling in Qemu.
> 2. Addressed Marc Zyngier's comment:
>     - Fixed the note about GIC CPU Interface in the cover letter.
> 3. Addressed issues raised by Vishnu Pajjuru (Ampere) & Miguel Luis (Oracle) during testing:
>     - Live/Pseudo Migration crashes.
> 4. Others:
>     - Introduced the concept of persistent vCPU at QOM.
>     - Introduced wrapper APIs of present, possible, and persistent.
>     - Changes at ACPI hotplug H/W init logic to accommodate initializing the
>       is_present and is_enabled states.
>     - Check to avoid unplugging cold-booted vCPUs.
>     - Disabled hotplugging with TCG/HVF/QTEST.
>     - Introduced CPU Topology, {socket, cluster, core, thread}-id property.
>     - Extract virt CPU properties as a common virt_vcpu_properties() function.
>
> Author Salil Mehta (1):
>    target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
>
> Jean-Philippe Brucker (2):
>    hw/acpi: Make _MAT method optional
>    target/arm/kvm: Write CPU state back to KVM on reset
>
> Miguel Luis (1):
>    tcg/mttcg: enable threads to unregister in tcg_ctxs[]
>
> Salil Mehta (25):
>    arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id
>      property
>    cpu-common: Add common CPU utility for possible vCPUs
>    hw/arm/virt: Limit number of possible vCPUs for unsupported Accel or
>      GIC Type
>    hw/arm/virt: Move setting of common CPU properties in a function
>    arm/virt,target/arm: Machine init time change common to vCPU
>      {cold|hot}-plug
>    arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
>    arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine
>      init
>    arm/virt: Init PMU at host for all possible vcpus
>    arm/acpi: Enable ACPI support for vcpu hotplug
>    arm/virt: Add cpu hotplug events to GED during creation
>    arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
>    arm/virt/acpi: Build CPUs AML with CPU Hotplug support
>    arm/virt: Make ARM vCPU *present* status ACPI *persistent*
>    hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits
>      to Guest
>    hw/arm: MADT Tbl change to size the guest with possible vCPUs
>    arm/virt: Release objects for *disabled* possible vCPUs after init
>    arm/virt: Add/update basic hot-(un)plug framework
>    arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
>    hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
>    hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register
>      info
>    arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
>    hw/arm: Changes required for reset and to support next boot
>    target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
>    hw/arm: Support hotplug capability check using _OSC method
>    hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
>
>   accel/tcg/tcg-accel-ops-mttcg.c    |   1 +
>   cpu-common.c                       |  37 ++
>   hw/acpi/cpu.c                      |  62 +-
>   hw/acpi/generic_event_device.c     |  11 +
>   hw/arm/Kconfig                     |   1 +
>   hw/arm/boot.c                      |   2 +-
>   hw/arm/virt-acpi-build.c           | 113 +++-
>   hw/arm/virt.c                      | 877 +++++++++++++++++++++++------
>   hw/core/gpio.c                     |   2 +-
>   hw/intc/arm_gicv3.c                |   1 +
>   hw/intc/arm_gicv3_common.c         |  66 ++-
>   hw/intc/arm_gicv3_cpuif.c          | 269 +++++----
>   hw/intc/arm_gicv3_cpuif_common.c   |   5 +
>   hw/intc/arm_gicv3_kvm.c            |  39 +-
>   hw/intc/gicv3_internal.h           |   2 +
>   include/hw/acpi/cpu.h              |   2 +
>   include/hw/arm/boot.h              |   2 +
>   include/hw/arm/virt.h              |  38 +-
>   include/hw/core/cpu.h              |  78 +++
>   include/hw/intc/arm_gicv3_common.h |  23 +
>   include/hw/qdev-core.h             |   2 +
>   include/tcg/startup.h              |   7 +
>   target/arm/arm-powerctl.c          |  51 +-
>   target/arm/cpu-qom.h               |  18 +-
>   target/arm/cpu.c                   | 112 ++++
>   target/arm/cpu.h                   |  18 +
>   target/arm/cpu64.c                 |  15 +
>   target/arm/gdbstub.c               |   6 +
>   target/arm/helper.c                |  27 +-
>   target/arm/internals.h             |  14 +-
>   target/arm/kvm.c                   | 146 ++++-
>   target/arm/kvm_arm.h               |  25 +
>   target/arm/meson.build             |   1 +
>   target/arm/{tcg => }/psci.c        |   8 +
>   target/arm/tcg/meson.build         |   4 -
>   tcg/tcg.c                          |  24 +
>   36 files changed, 1749 insertions(+), 360 deletions(-)
>   rename target/arm/{tcg => }/psci.c (97%)
>


^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-06-26  9:53 ` [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Vishnu Pajjuri
@ 2024-06-26 18:01   ` Salil Mehta via
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-06-26 18:01 UTC (permalink / raw)
  To: Vishnu Pajjuri, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com,
	alex.bennee@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
	linux@armlinux.org.uk, darren@os.amperecomputing.com,
	ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
	karl.heubaum@oracle.com, miguel.luis@oracle.com,
	salil.mehta@opnsrc.net, zhukeqian, wangxiongfeng (C),
	wangyanan (Y), jiakernel2@gmail.com, maobibo@loongson.cn,
	lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com,
	Linuxarm

Hi Vishnu,
 
> From: Vishnu Pajjuri <vishnu@amperemail.onmicrosoft.com> 
> Sent: Wednesday, June 26, 2024 10:53 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org; mst@redhat.com
> 
> Hi Salil,
> On 14-06-2024 05:06, Salil Mehta wrote:
> PROLOGUE
> ========
> 
> To assist in review and set the right expectations from this RFC, please first
> read the sections *APPENDED AT THE END* of this cover letter:
> 
> 1. Important *DISCLAIMER* [Section (X)]
> 2. Work presented at KVMForum Conference (slides available) [Section (V)F]
> 3. Organization of patches [Section (XI)]
> 4. References [Section (XII)]
> 5. Detailed TODO list of leftover work or work-in-progress [Section (IX)]
> 
> There has been interest shown by other organizations in adapting this series
> for their architecture. Hence, RFC V2 [21] has been split into architecture
> *agnostic* [22] and *specific* patch sets.
> 
> This is an ARM architecture-specific patch set carved out of RFC V2. Please
> check section (XI)B for details of architecture agnostic patches.
> 
> SECTIONS [I - XIII] are as follows:
> 
> (I) Key Changes [details in last section (XIV)]
> ==============================================
> 
> RFC V2 -> RFC V3
> 
> 1. Split into Architecture *agnostic* (V13) [22] and *specific* (RFC V3) patch sets.
> 2. Addressed comments by Gavin Shan (RedHat), Shaoqin Huang (RedHat), Philippe Mathieu-Daudé (Linaro),
> > Jonathan Cameron (Huawei), Zhao Liu (Intel).
> I tried following test cases with rfc-v3 and kernel patches v10, and it's looking good on Ampere platforms.
> • Regular hotplug and hot unplug tests
> • Live migration with and with out hot-plugging vcpus tests
> Please feel free to add,
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>


Many thanks for testing and confirming the functionality. Really appreciate this!

Best
Salil.


> 
> Regards,
> -Vishnu.
> 
> RFC V1 -> RFC V2
> 


* Re: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (29 preceding siblings ...)
  2024-06-26  9:53 ` [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Vishnu Pajjuri
@ 2024-07-01 11:38 ` Miguel Luis
  2024-07-01 16:30   ` Salil Mehta via
  2024-08-07  9:53 ` Gavin Shan
  2024-08-28 20:35 ` Gustavo Romero
  32 siblings, 1 reply; 105+ messages in thread
From: Miguel Luis @ 2024-07-01 11:38 UTC (permalink / raw)
  To: Salil Mehta
  Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, Michael S . Tsirkin,
	Marc Zyngier, Jean-Philippe Brucker, Jonathan Cameron,
	Lorenzo Pieralisi, Peter Maydell, Richard Henderson,
	Igor Mammedov, andrew.jones@linux.dev, david@redhat.com,
	Philippe Mathieu-Daudé, eric.auger@redhat.com,
	will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev,
	pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, Karl Heubaum,
	salil.mehta@opnsrc.net, zhukeqian1@huawei.com,
	wangxiongfeng2@huawei.com, wangyanan55@huawei.com,
	jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn,
	shahuang@redhat.com, zhao1.liu@intel.com, linuxarm@huawei.com

Hi Salil,

> On 13 Jun 2024, at 23:36, Salil Mehta <salil.mehta@huawei.com> wrote:
> 
> PROLOGUE
> ========
> 
> To assist in review and set the right expectations from this RFC, please first
> read the sections *APPENDED AT THE END* of this cover letter:
> 
> 1. Important *DISCLAIMER* [Section (X)]
> 2. Work presented at KVMForum Conference (slides available) [Section (V)F]
> 3. Organization of patches [Section (XI)]
> 4. References [Section (XII)]
> 5. Detailed TODO list of leftover work or work-in-progress [Section (IX)]
> 
> There has been interest shown by other organizations in adapting this series
> for their architecture. Hence, RFC V2 [21] has been split into architecture
> *agnostic* [22] and *specific* patch sets.
> 
> This is an ARM architecture-specific patch set carved out of RFC V2. Please
> check section (XI)B for details of architecture agnostic patches.
> 
> SECTIONS [I - XIII] are as follows:
> 
> (I) Key Changes [details in last section (XIV)]
> ==============================================
> 
> RFC V2 -> RFC V3
> 
> 1. Split into Architecture *agnostic* (V13) [22] and *specific* (RFC V3) patch sets.
> 2. Addressed comments by Gavin Shan (RedHat), Shaoqin Huang (RedHat), Philippe Mathieu-Daudé (Linaro),
>   Jonathan Cameron (Huawei), Zhao Liu (Intel).
> 

I’ve tested this series along with v10 kernel patches from [1] on the following items:

Boot.
Hotplug up to maxcpus.
Hot unplug down to the number of boot cpus.
Hotplug vcpus then migrate to a new VM.
Hot unplug down to the number of boot cpus then migrate to a new VM.
Up to 6 successive live migrations.

All of which LGTM.

Please feel free to add,
Tested-by: Miguel Luis <miguel.luis@oracle.com>

Regards,
Miguel

[1] https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/vcpu-hotplug

> RFC V1 -> RFC V2
> 
> RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> 
> 1. ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
>   *online-capable* or *enabled* to the Guest OS at boot time. This means
>   associated CPUs can have ACPI _STA as *enabled* or *disabled* even after boot.
>   See UEFI ACPI 6.5 Spec, Section 05, Table 5.37 GICC CPU Interface Flags[20].
> 2. SMCCC/HVC Hypercall exit handling in userspace/Qemu for PSCI CPU_{ON,OFF}
>   requests. This is required to {dis}allow online'ing a vCPU.
> 3. Always presenting unplugged vCPUs in CPUs ACPI AML code as ACPI _STA.PRESENT 
>   to the Guest OS. Toggling ACPI _STA.Enabled to give an effect of the
>   hot{un}plug.
> 4. Live Migration works (some issues are still there).
> 5. TCG/HVF/qtest does not support Hotplug and falls back to default.
> 6. Code for TCG support exists in this release (it is a work-in-progress).
> 7. ACPI _OSC method can now be used by OSPM to negotiate Qemu VM platform
>   hotplug capability (_OSC Query support still pending).
> 8. Misc. Bug fixes.
> 
> (II) Summary
> ============
> 
> This patch set introduces virtual CPU hotplug support for the ARMv8 architecture
> in QEMU. The idea is to be able to hotplug and hot-unplug vCPUs while the guest VM
> is running, without requiring a reboot. This does *not* make any assumptions about
> the physical CPU hotplug availability within the host system but rather tries to
> solve the problem at the virtualizer/QEMU layer. It introduces ACPI CPU hotplug hooks
> and event handling to interface with the guest kernel, and code to initialize, plug,
> and unplug CPUs. No changes are required within the host kernel/KVM except the
> support of hypercall exit handling in the user-space/Qemu, which has recently
> been added to the kernel. Corresponding guest kernel changes have been
> posted on the mailing list [3] [4] by James Morse.
> 
> (III) Motivation
> ================
> 
> This allows scaling the guest VM compute capacity on-demand, which would be
> useful for the following example scenarios:
> 
> 1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the orchestration
>   framework that could adjust resource requests (CPU and Mem requests) for
>   the containers in a pod, based on usage.
> 2. Pay-as-you-grow Business Model: Infrastructure providers could allocate and
>   restrict the total number of compute resources available to the guest VM
>   according to the SLA (Service Level Agreement). VM owners could request more
>   compute to be hot-plugged for some cost.
> 
> For example, a Kata Container VM starts with a minimum amount of resources
> (i.e., the hotplug-everything approach). Why?
> 
> 1. It allows a faster *boot time*, and
> 2. a reduction in *memory footprint*.
> 
> A Kata Container VM can boot with just 1 vCPU, and more vCPUs can be
> hot-plugged later as needed.
> 
> (IV) Terminology
> ================
> 
> (*) Possible CPUs: Total vCPUs that could ever exist in the VM. This includes
>                   any cold-booted CPUs plus any CPUs that could be later
>                   hot-plugged.
>                   - Qemu parameter (-smp maxcpus=N)
> (*) Present CPUs:  Possible CPUs that are ACPI 'present'. These might or might
>                   not be ACPI 'enabled'. 
>                   - Present vCPUs = Possible vCPUs (Always on ARM Arch)
> (*) Enabled CPUs:  Possible CPUs that are ACPI 'present' and 'enabled' and can
>                   now be 'onlined' (PSCI) for use by the Guest Kernel. All cold-
>                   booted vCPUs are ACPI 'enabled' at boot. Later, using
>                   device_add, more vCPUs can be hotplugged and made ACPI
>                   'enabled'.
>                   - Qemu parameter (-smp cpus=N). Can be used to specify some
>                     cold-booted vCPUs during VM init. Some can be added using
>                     the '-device' option.
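The relationship between these vCPU sets can be sketched as follows (an illustrative Python model, not QEMU code), for a VM started with `-smp cpus=N,maxcpus=M`:

```python
def vcpu_sets(cpus, maxcpus):
    """Model the ARM vCPU states described above for -smp cpus=N,maxcpus=M."""
    assert cpus <= maxcpus
    possible = set(range(maxcpus))  # all vCPUs that could ever exist in the VM
    present = set(possible)         # on ARM, Present vCPUs = Possible vCPUs (always)
    enabled = set(range(cpus))      # cold-booted vCPUs are ACPI 'enabled' at boot
    return possible, present, enabled

possible, present, enabled = vcpu_sets(cpus=2, maxcpus=6)
assert present == possible            # ARM architectural constraint
assert enabled <= present             # enabled is always a subset of present
assert len(possible - enabled) == 4   # 4 vCPUs remain available for later hotplug
```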
> 
> (V) Constraints Due to ARMv8 CPU Architecture [+] Other Impediments
> ===================================================================
> 
> A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
>   1. The ARMv8 CPU architecture does not support the concept of physical CPU
>      hotplug.
>      a. There are many per-CPU components like PMU, SVE, MTE, Arch timers, etc.,
>         whose behavior needs to be clearly defined when the CPU is hot(un)plugged.
>         There is no specification for this.
> 
>   2. Other ARM components like the GIC have not been designed to realize
>      physical CPU hotplug capability as of now. For example,
>      a. Every physical CPU has a unique GICC (GIC CPU Interface) by construct.
>         The architecture does not specify what CPU hot(un)plug would mean in
>         the context of any of these.
>      b. CPUs/GICCs are physically connected to unique GICRs (GIC Redistributors).
>         GIC Redistributors are always part of the always-on power domain. Hence,
>         they cannot be powered off as per the specification.
> 
> B. Impediments in Firmware/ACPI (Architectural Constraint)
> 
>   1. Firmware has to expose GICC, GICR, and other per-CPU features like PMU,
>      SVE, MTE, Arch Timers, etc., to the OS. Due to the architectural constraint
>      stated in section A1(a), all interrupt controller structures of
>      MADT describing GIC CPU Interfaces and the GIC Redistributors MUST be
>      presented by firmware to the OSPM during boot time.
>   2. Architectures that support CPU hotplug can evaluate the ACPI _MAT method to
>      get this kind of information from the firmware even after boot, and the
>      OSPM has the capability to process these. The ARM kernel uses the information
>      in the MADT interrupt controller structures to identify the number of present
>      CPUs during boot and hence does not allow these to be changed after boot:
>      the number of present CPUs is fixed. This is an architectural constraint!
> 
> C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural Constraint)
> 
>   1. KVM VGIC:
>      a. Sizing of various VGIC resources like memory regions, etc., related to
>         the redistributor happens only once and is fixed at the VM init time
>         and cannot be changed later after initialization has happened.
>         KVM statically configures these resources based on the number of vCPUs
>         and the number/size of redistributor ranges.
>      b. The association between a vCPU and its VGIC redistributor is fixed at
>         VM init time within KVM, i.e., when the redistributor iodevs get
>         registered. VGIC does not allow setting up or changing this association
>         after VM initialization has happened. Physically, every CPU/GICC is
>         uniquely connected with its redistributor, and there is no
>         architectural way to set this up.
>   2. KVM vCPUs:
>      a. Lack of specification means destruction of KVM vCPUs does not exist as
>         there is no reference to tell what to do with other per-vCPU
>         components like redistributors, arch timer, etc.
>      b. In fact, KVM does not implement the destruction of vCPUs for any
>         architecture. This is independent of whether the architecture
>         actually supports CPU Hotplug feature. For example, even for x86 KVM
>         does not implement the destruction of vCPUs.
> 
> D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)
> 
>   1. Qemu CPU Objects MUST be created to initialize all the Host KVM vCPUs to
>      overcome the KVM constraint. KVM vCPUs are created and initialized when Qemu
>      CPU Objects are realized. But keeping the QOM CPU objects realized for
>      'yet-to-be-plugged' vCPUs can create problems when these new vCPUs shall
>      be plugged using device_add and a new QOM CPU object shall be created.
>   2. GICV3State and GICV3CPUState objects MUST be sized over *possible vCPUs*
>      during VM init time while the QOM GICV3 Object is realized. This is because
>      KVM VGIC can only be initialized once during init time. But every
>      GICV3CPUState has an associated QOM CPU Object. The latter might correspond
>      to vCPUs which are 'yet-to-be-plugged' (unplugged at init).
>   3. How should new QOM CPU objects be connected back to the GICV3CPUState
>      objects and disconnected from it in case the CPU is being hot(un)plugged?
>   4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented in the
>      QOM for which KVM vCPU already exists? For example, whether to keep,
>       a. No QOM CPU objects Or
>       b. Unrealized CPU Objects
>   5. How should vCPU state be exposed via ACPI to the Guest? Especially for
>      the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not exist
>      within the QOM but the Guest always expects all possible vCPUs to be
>      identified as ACPI *present* during boot.
>   6. How should Qemu expose GIC CPU interfaces for the unplugged or
>      yet-to-be-plugged vCPUs using ACPI MADT Table to the Guest?
> 
> E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)
> 
>   1. At VM Init, pre-create all the possible vCPUs in the Host KVM i.e., even
>      for the vCPUs which are yet-to-be-plugged in Qemu but keep them in the
>      powered-off state.
>   2. After the KVM vCPUs have been initialized in the Host, the KVM vCPU
>      objects corresponding to the unplugged/yet-to-be-plugged vCPUs are parked
>      at the existing per-VM "kvm_parked_vcpus" list in Qemu. (similar to x86)
>   3. GICV3State and GICV3CPUState objects are sized over possible vCPUs during
>      VM init time i.e., when Qemu GIC is realized. This, in turn, sizes KVM VGIC
>      resources like memory regions, etc., related to the redistributors with the
>      number of possible KVM vCPUs. This never changes after VM has initialized.
>   4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs are
>      released post Host KVM CPU and GIC/VGIC initialization.
>   5. Build ACPI MADT Table with the following updates:
>      a. Number of GIC CPU interface entries (=possible vCPUs)
>      b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable)
>      c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
>         - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
>         - vCPU can be ACPI enabled+onlined after Guest boots (Firmware Policy)
>         - Some issues with above (details in later sections)
>   6. Expose below ACPI Status to Guest kernel:
>      a. Always _STA.Present=1 (all possible vCPUs)
>      b. _STA.Enabled=1 (plugged vCPUs)
>      c. _STA.Enabled=0 (unplugged vCPUs)
>   7. vCPU hotplug *realizes* new QOM CPU object. The following happens:
>      a. Realizes, initializes QOM CPU Object & spawns Qemu vCPU thread.
>      b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list).
>         - Attaches to QOM CPU object.
>      c. Reinitializes KVM vCPU in the Host.
>         - Resets the core and sys regs, sets defaults, etc.
>      d. Runs KVM vCPU (created with "start-powered-off").
>         - vCPU thread sleeps (waits for vCPU reset via PSCI).
>      e. Updates Qemu GIC.
>         - Wires back IRQs related to this vCPU.
>         - GICV3CPUState association with QOM CPU Object.
>      f. Updates [6] ACPI _STA.Enabled=1.
>      g. Notifies Guest about the new vCPU (via ACPI GED interface).
>         - Guest checks _STA.Enabled=1.
>         - Guest adds processor (registers CPU with LDM) [3].
>      h. Plugs the QOM CPU object in the slot.
>         - slot-number = cpu-index {socket, cluster, core, thread}.
>      i. Guest online's vCPU (CPU_ON PSCI call over HVC/SMC).
>         - KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
>         - Qemu powers-on KVM vCPU in the Host.
>   8. vCPU hot-unplug *unrealizes* QOM CPU Object. The following happens:
>      a. Notifies Guest (via ACPI GED interface) vCPU hot-unplug event.
>         - Guest offline's vCPU (CPU_OFF PSCI call over HVC/SMC).
>      b. KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
>         - Qemu powers-off the KVM vCPU in the Host.
>      c. Guest signals *Eject* vCPU to Qemu.
>      d. Qemu updates [6] ACPI _STA.Enabled=0.
>      e. Updates GIC.
>         - Un-wires IRQs related to this vCPU.
>         - GICV3CPUState association with new QOM CPU Object is updated.
>      f. Unplugs the vCPU.
>         - Removes from slot.
>         - Parks KVM vCPU ("kvm_parked_vcpus" list).
>         - Unrealizes QOM CPU Object & joins back Qemu vCPU thread.
>         - Destroys QOM CPU object.
>      g. Guest checks ACPI _STA.Enabled=0.
>         - Removes processor (unregisters CPU with LDM) [3].
> 
> F. Work Presented at KVM Forum Conferences:
> ==========================================
> 
> Details of the above work have been presented at KVMForum2020 and KVMForum2023
> conferences. Slides & video are available at the links below:
> a. KVMForum 2023
>   - Challenges Revisited in Supporting Virt CPU Hotplug on architectures that don't Support CPU Hotplug (like ARM64).
>     https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
>     https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
>     https://www.youtube.com/watch?v=hyrw4j2D6I0&t=23970s
>     https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
> b. KVMForum 2020
>   - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems (like ARM64) - Salil Mehta, Huawei.
>     https://sched.co/eE4m
> 
> (VI) Commands Used
> ==================
> 
> A. Qemu launch commands to init the machine:
> 
>    $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>      -cpu host -smp cpus=4,maxcpus=6 \
>      -m 300M \
>      -kernel Image \
>      -initrd rootfs.cpio.gz \
>      -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
>      -nographic \
>      -bios QEMU_EFI.fd
> 
> B. Hot-(un)plug related commands:
> 
>  # Hotplug a host vCPU (accel=kvm):
>    $ device_add host-arm-cpu,id=core4,core-id=4
> 
>  # Hotplug a vCPU (accel=tcg):
>    $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
> 
>  # Delete the vCPU:
>    $ device_del core4
> 
> Sample output on guest after boot:
> 
>    $ cat /sys/devices/system/cpu/possible
>    0-5
>    $ cat /sys/devices/system/cpu/present
>    0-5
>    $ cat /sys/devices/system/cpu/enabled
>    0-3
>    $ cat /sys/devices/system/cpu/online
>    0-1
>    $ cat /sys/devices/system/cpu/offline
>    2-5
> 
> Sample output on guest after hotplug of vCPU=4:
> 
>    $ cat /sys/devices/system/cpu/possible
>    0-5
>    $ cat /sys/devices/system/cpu/present
>    0-5
>    $ cat /sys/devices/system/cpu/enabled
>    0-4
>    $ cat /sys/devices/system/cpu/online
>    0-1,4
>    $ cat /sys/devices/system/cpu/offline
>    2-3,5
> 
>    Note: vCPU=4 was explicitly 'onlined' after hot-plug
>    $ echo 1 > /sys/devices/system/cpu/cpu4/online
> 
> (VII) Latest Repository
> =======================
> 
> (*) Latest Qemu RFC V3 (Architecture Specific) patch set:
>    https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v3
> (*) Latest Qemu V13 (Architecture Agnostic) patch set:
>    https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v3.arch.agnostic.v13
> (*) QEMU changes for vCPU hotplug can be cloned from below site:
>    https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
> (*) Guest Kernel changes (by James Morse, ARM) are available here:
>    https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git virtual_cpu_hotplug/rfc/v2
> (*) Leftover patches of the kernel are available here:
>    https://lore.kernel.org/lkml/20240529133446.28446-1-Jonathan.Cameron@huawei.com/
>    https://github.com/salil-mehta/linux/commits/virtual_cpu_hotplug/rfc/v6.jic/ (not latest)
> 
> (VIII) KNOWN ISSUES
> ===================
> 
> 1. Migration has been lightly tested and found to be working.
> 2. TCG is broken.
> 3. HVF and qtest are not supported yet.
> 4. ACPI MADT Table flags [7] MADT.GICC.Enabled and MADT.GICC.online-capable are
>   mutually exclusive, i.e., as per the change [6], a vCPU cannot be both
>   GICC.Enabled and GICC.online-capable. This means:
>      [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
>   a. If we have to support hot-unplug of the cold-booted vCPUs, then these MUST
>      be specified as GICC.online-capable in the MADT Table during boot by the
>      firmware/Qemu. But this requirement conflicts with the requirement to
>      support the new Qemu changes with legacy OSes that don't understand the
>      MADT.GICC.online-capable bit. A legacy OS will ignore this bit during
>      boot, and hence these vCPUs will not appear on such an OS. This is
>      unexpected behavior.
>   b. If we decide to specify vCPUs as MADT.GICC.Enabled and try to unplug
>      these cold-booted vCPUs from the OS (which in actuality should be blocked
>      by Qemu returning an error), then features like 'kexec' will break.
>   c. As I understand, removal of the cold-booted vCPUs is a required feature,
>      and the x86 world allows it.
>   d. Hence, either we need a specification change to make the MADT.GICC.Enabled
>      and MADT.GICC.online-capable Bits NOT mutually exclusive or NOT support
>      the removal of cold-booted vCPUs. In the latter case, a check can be introduced
>      to bar the users from unplugging vCPUs, which were cold-booted, using QMP
>      commands. (Needs discussion!)
>      Please check the patch part of this patch set:
>      [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled].
> 
>      NOTE: This is definitely not a blocker!
> 5. Code related to the notification to GICV3 about the hot(un)plug of a vCPU event
>   might need further discussion.
> 
> 
> (IX) THINGS TO DO
> =================
> 
> 1. Fix issues related to TCG/Emulation support. (Not a blocker)
> 2. Comprehensive Testing is in progress. (Positive feedback from Oracle & Ampere)
> 3. Qemu Documentation (.rst) needs to be updated.
> 4. Fix qtest, HVF Support (Future).
> 5. Update the design issue related to ACPI MADT.GICC flags discussed in known
>   issues. This might require UEFI ACPI specification change (Not a blocker).
> 6. Add ACPI _OSC 'Query' support. Only part of _OSC support exists now. (Not a blocker).
> 
> The above is *not* a complete list. Will update later!
> 
> Best regards,  
> Salil.
> 
> (X) DISCLAIMER
> ==============
> 
> This work is an attempt to present a proof-of-concept of the ARM64 vCPU hotplug
> implementation to the community. This is *not* production-level code and might
> have bugs. Comprehensive testing is being done on HiSilicon Kunpeng920 SoC,
> Oracle, and Ampere servers. We are nearing stable code and a non-RFC
> version shall be floated soon.
> 
> This work is *mostly* along the lines of the discussions that have happened in
> previous years [see refs below] across different channels like the mailing list,
> the Linaro Open Discussions platform, and various conferences like KVMForum. This
> RFC is being used as a way to verify the idea mentioned in this cover letter and
> to get community views. Once this has been agreed upon, a formal patch shall be
> posted to the mailing list for review.
> 
> [The concept being presented has been found to work!]
> 
> (XI) ORGANIZATION OF PATCHES
> ============================
> 
> A. Architecture *specific* patches:
> 
>   [Patch 1-8, 17, 27, 29] logic required during machine init.
>    (*) Some validation checks.
>    (*) Introduces core-id property and some util functions required later.
>    (*) Logic to pre-create vCPUs.
>    (*) GIC initialization pre-sized with possible vCPUs.
>    (*) Some refactoring to have common hot and cold plug logic together.
>    (*) Release of disabled QOM CPU objects in post_cpu_init().
>    (*) Support of ACPI _OSC method to negotiate platform hotplug capabilities.
>   [Patch 9-16] logic related to ACPI at machine init time.
>    (*) Changes required to Enable ACPI for CPU hotplug.
>    (*) Initialization of ACPI GED framework to cater to CPU Hotplug Events.
>    (*) ACPI MADT/MAT changes.
>   [Patch 18-26] logic required during vCPU hot-(un)plug.
>    (*) Basic framework changes to support vCPU hot-(un)plug.
>    (*) ACPI GED changes for hot-(un)plug hooks.
>    (*) Wire-unwire the IRQs.
>    (*) GIC notification logic.
>    (*) ARMCPU unrealize logic.
>    (*) Handling of SMCCC Hypercall Exits by KVM to Qemu.
> 
> B. Architecture *agnostic* patches:
> 
>   [PATCH V13 0/8] Add architecture agnostic code to support vCPU Hotplug.
>   https://lore.kernel.org/qemu-devel/20240607115649.214622-1-salil.mehta@huawei.com/T/#md0887eb07976bc76606a8204614ccc7d9a01c1f7
>    (*) Refactors vCPU creation, parking, and unparking logic, and adds traces.
>    (*) Build ACPI AML related to CPU control dev.
>    (*) Changes related to the destruction of CPU Address Space.
>    (*) Changes related to the uninitialization of GDB Stub.
>    (*) Updating of Docs.
> 
> (XII) REFERENCES
> ================
> 
> [1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
> [3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
> [4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> [5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
> [6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
> [7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> [8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> [9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
> [10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
> [11] https://lkml.org/lkml/2019/7/10/235
> [12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
> [13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
> [14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
> [15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
> [16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
> [17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
> [18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
> [19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/ 
> [20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
> [21] https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.mehta@huawei.com/
> [22] https://lore.kernel.org/qemu-devel/20240607115649.214622-1-salil.mehta@huawei.com/T/#md0887eb07976bc76606a8204614ccc7d9a01c1f7
> 
> (XIII) ACKNOWLEDGEMENTS
> =======================
> 
> I would like to take this opportunity to thank the people below for various
> discussions with me over different channels during the development:
> 
> Marc Zyngier (Google),              Catalin Marinas (ARM),
> James Morse (ARM),                  Will Deacon (Google),
> Jean-Philippe Brucker (Linaro),     Sudeep Holla (ARM),
> Lorenzo Pieralisi (Linaro),         Gavin Shan (RedHat),
> Jonathan Cameron (Huawei),          Darren Hart (Ampere),
> Igor Mammedov (RedHat),             Ilkka Koskinen (Ampere),
> Andrew Jones (RedHat),              Karl Heubaum (Oracle),
> Keqian Zhu (Huawei),                Miguel Luis (Oracle),
> Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
> Shameerali Kolothum (Huawei),       Russell King (Oracle),
> Xuwei/Joy (Huawei),                 Peter Maydell (Linaro),
> Zengtao/Prime (Huawei),             and all those whom I have missed!
> 
> Many thanks to the following people for their current or past contributions:
> 
> 1. James Morse (ARM)
>   (Current Kernel part of vCPU Hotplug Support on AARCH64)
> 2. Jean-Philippe Brucker (Linaro)
>   (Prototyped one of the earlier PSCI-based POC [17][18] based on RFC V1)
> 3. Keqian Zhu (Huawei)
>   (Co-developed Qemu prototype)
> 4. Xiongfeng Wang (Huawei)
>   (Co-developed an earlier kernel prototype with me)
> 5. Vishnu Pajjuri (Ampere)
>   (Verification on Ampere ARM64 Platforms + fixes)
> 6. Miguel Luis (Oracle)
>   (Verification on Oracle ARM64 Platforms + fixes)
> 7. Russell King (Oracle) & Jonathan Cameron (Huawei)
>   (Helping in upstreaming James Morse's Kernel patches).
> 
> (XIV) Change Log:
> =================
> 
> RFC V2 -> RFC V3:
> -----------------
> 1. Miscellaneous:
>   - Split the RFC V2 into arch-agnostic and arch-specific patch sets.
> 2. Addressed Gavin Shan's (RedHat) comments:
>   - Made CPU property accessors inline.
>     https://lore.kernel.org/qemu-devel/6cd28639-2cfa-f233-c6d9-d5d2ec5b1c58@redhat.com/
>   - Collected Reviewed-bys [PATCH RFC V2 4/37, 14/37, 22/37].
>   - Dropped the patch as it was not required after init logic was refactored.
>     https://lore.kernel.org/qemu-devel/4fb2eef9-6742-1eeb-721a-b3db04b1be97@redhat.com/
>   - Fixed the range check for the core during vCPU Plug.
>     https://lore.kernel.org/qemu-devel/1c5fa24c-6bf3-750f-4f22-087e4a9311af@redhat.com/
>   - Added has_hotpluggable_vcpus check to make build_cpus_aml() conditional.
>     https://lore.kernel.org/qemu-devel/832342cb-74bc-58dd-c5d7-6f995baeb0f2@redhat.com/
>   - Fixed the states initialization in cpu_hotplug_hw_init() to accommodate previous refactoring.
>     https://lore.kernel.org/qemu-devel/da5e5609-1883-8650-c7d8-6868c7b74f1c@redhat.com/
>   - Fixed typos.
>     https://lore.kernel.org/qemu-devel/eb1ac571-7844-55e6-15e7-3dd7df21366b@redhat.com/
>   - Removed the unnecessary 'goto fail'.
>     https://lore.kernel.org/qemu-devel/4d8980ac-f402-60d4-fe52-787815af8a7d@redhat.com/#t
>   - Added check for hotpluggable vCPUs in the _OSC method.
>     https://lore.kernel.org/qemu-devel/20231017001326.FUBqQ1PTowF2GxQpnL3kIW0AhmSqbspazwixAHVSi6c@z/
> 3. Addressed Shaoqin Huang's (RedHat) comments:
>   - Fixed the compilation break due to a missing call to virt_cpu_properties()
>     along with its definition.
>     https://lore.kernel.org/qemu-devel/3632ee24-47f7-ae68-8790-26eb2cf9950b@redhat.com/
> 4. Addressed Jonathan Cameron's (Huawei) comments:
>   - Gated the 'disabled vcpu message' for GIC version < 3.
>     https://lore.kernel.org/qemu-devel/20240116155911.00004fe1@Huawei.com/
> 
> RFC V1 -> RFC V2:
> -----------------
> 1. Addressed James Morse's (ARM) requirement as per Linaro Open Discussion:
>   - Exposed all possible vCPUs as always ACPI _STA.present and available during boot time.
>   - Added the _OSC handling as required by James's patches.
>   - Introduction of 'online-capable' bit handling in the flag of MADT GICC.
>   - SMCCC Hypercall Exit handling in Qemu.
> 2. Addressed Marc Zyngier's comment:
>   - Fixed the note about GIC CPU Interface in the cover letter.
> 3. Addressed issues raised by Vishnu Pajjuri (Ampere) & Miguel Luis (Oracle) during testing:
>   - Live/Pseudo Migration crashes.
> 4. Others:
>   - Introduced the concept of persistent vCPU at QOM.
>   - Introduced wrapper APIs of present, possible, and persistent.
>   - Changes at ACPI hotplug H/W init logic to accommodate initializing the is_present and is_enabled states.
>   - Check to avoid unplugging cold-booted vCPUs.
>   - Disabled hotplugging with TCG/HVF/QTEST.
>   - Introduced CPU Topology, {socket, cluster, core, thread}-id property.
>   - Extract virt CPU properties as a common virt_vcpu_properties() function.
> 
> Author Salil Mehta (1):
>  target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
> 
> Jean-Philippe Brucker (2):
>  hw/acpi: Make _MAT method optional
>  target/arm/kvm: Write CPU state back to KVM on reset
> 
> Miguel Luis (1):
>  tcg/mttcg: enable threads to unregister in tcg_ctxs[]
> 
> Salil Mehta (25):
>  arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id
>    property
>  cpu-common: Add common CPU utility for possible vCPUs
>  hw/arm/virt: Limit number of possible vCPUs for unsupported Accel or
>    GIC Type
>  hw/arm/virt: Move setting of common CPU properties in a function
>  arm/virt,target/arm: Machine init time change common to vCPU
>    {cold|hot}-plug
>  arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
>  arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine
>    init
>  arm/virt: Init PMU at host for all possible vcpus
>  arm/acpi: Enable ACPI support for vcpu hotplug
>  arm/virt: Add cpu hotplug events to GED during creation
>  arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
>  arm/virt/acpi: Build CPUs AML with CPU Hotplug support
>  arm/virt: Make ARM vCPU *present* status ACPI *persistent*
>  hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits
>    to Guest
>  hw/arm: MADT Tbl change to size the guest with possible vCPUs
>  arm/virt: Release objects for *disabled* possible vCPUs after init
>  arm/virt: Add/update basic hot-(un)plug framework
>  arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
>  hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
>  hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register
>    info
>  arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
>  hw/arm: Changes required for reset and to support next boot
>  target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
>  hw/arm: Support hotplug capability check using _OSC method
>  hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
> 
> accel/tcg/tcg-accel-ops-mttcg.c    |   1 +
> cpu-common.c                       |  37 ++
> hw/acpi/cpu.c                      |  62 +-
> hw/acpi/generic_event_device.c     |  11 +
> hw/arm/Kconfig                     |   1 +
> hw/arm/boot.c                      |   2 +-
> hw/arm/virt-acpi-build.c           | 113 +++-
> hw/arm/virt.c                      | 877 +++++++++++++++++++++++------
> hw/core/gpio.c                     |   2 +-
> hw/intc/arm_gicv3.c                |   1 +
> hw/intc/arm_gicv3_common.c         |  66 ++-
> hw/intc/arm_gicv3_cpuif.c          | 269 +++++----
> hw/intc/arm_gicv3_cpuif_common.c   |   5 +
> hw/intc/arm_gicv3_kvm.c            |  39 +-
> hw/intc/gicv3_internal.h           |   2 +
> include/hw/acpi/cpu.h              |   2 +
> include/hw/arm/boot.h              |   2 +
> include/hw/arm/virt.h              |  38 +-
> include/hw/core/cpu.h              |  78 +++
> include/hw/intc/arm_gicv3_common.h |  23 +
> include/hw/qdev-core.h             |   2 +
> include/tcg/startup.h              |   7 +
> target/arm/arm-powerctl.c          |  51 +-
> target/arm/cpu-qom.h               |  18 +-
> target/arm/cpu.c                   | 112 ++++
> target/arm/cpu.h                   |  18 +
> target/arm/cpu64.c                 |  15 +
> target/arm/gdbstub.c               |   6 +
> target/arm/helper.c                |  27 +-
> target/arm/internals.h             |  14 +-
> target/arm/kvm.c                   | 146 ++++-
> target/arm/kvm_arm.h               |  25 +
> target/arm/meson.build             |   1 +
> target/arm/{tcg => }/psci.c        |   8 +
> target/arm/tcg/meson.build         |   4 -
> tcg/tcg.c                          |  24 +
> 36 files changed, 1749 insertions(+), 360 deletions(-)
> rename target/arm/{tcg => }/psci.c (97%)
> 
> -- 
> 2.34.1
> 


^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-07-01 11:38 ` Miguel Luis
@ 2024-07-01 16:30   ` Salil Mehta via
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-07-01 16:30 UTC (permalink / raw)
  To: Miguel Luis
  Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, Michael S . Tsirkin,
	Marc Zyngier, Jean-Philippe Brucker, Jonathan Cameron,
	Lorenzo Pieralisi, Peter Maydell, Richard Henderson,
	Igor Mammedov, andrew.jones@linux.dev, david@redhat.com,
	Philippe Mathieu-Daudé, eric.auger@redhat.com,
	will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev,
	pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, Karl Heubaum,
	salil.mehta@opnsrc.net, zhukeqian, wangxiongfeng (C),
	wangyanan (Y), jiakernel2@gmail.com, maobibo@loongson.cn,
	lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com,
	Linuxarm

Hi Miguel,

>  From: Miguel Luis <miguel.luis@oracle.com>
>  Sent: Monday, July 1, 2024 12:39 PM
>  To: Salil Mehta <salil.mehta@huawei.com>
>  
>  Hi Salil,
>  
>  > On 13 Jun 2024, at 23:36, Salil Mehta <salil.mehta@huawei.com> wrote:
>  >
>  > PROLOGUE
>  > ========
>  >
>  > To assist in review and set the right expectations from this RFC,
>  > please first read the sections *APPENDED AT THE END* of this cover
>  letter:
>  >
>  > 1. Important *DISCLAIMER* [Section (X)] 2. Work presented at
>  KVMForum
>  > Conference (slides available) [Section (V)F] 3. Organization of
>  > patches [Section (XI)] 4. References [Section (XII)] 5. Detailed TODO
>  > list of leftover work or work-in-progress [Section (IX)]
>  >
>  > There has been interest shown by other organizations in adapting this
>  > series for their architecture. Hence, RFC V2 [21] has been split into
>  > architecture
>  > *agnostic* [22] and *specific* patch sets.
>  >
>  > This is an ARM architecture-specific patch set carved out of RFC V2.
>  > Please check section (XI)B for details of architecture agnostic patches.
>  >
>  > SECTIONS [I - XIII] are as follows:
>  >
>  > (I) Key Changes [details in last section (XIV)]
>  > ==============================================
>  >
>  > RFC V2 -> RFC V3
>  >
>  > 1. Split into Architecture *agnostic* (V13) [22] and *specific* (RFC V3)
>  patch sets.
>  > 2. Addressed comments by Gavin Shan (RedHat), Shaoqin Huang
>  (RedHat), Philippe Mathieu-Daudé (Linaro),
>  >   Jonathan Cameron (Huawei), Zhao Liu (Intel).
>  >
>  
>  I’ve tested this series along with v10 kernel patches from [1] on the
>  following items:
>  
>  Boot.
>  Hotplug up to maxcpus.
>  Hot unplug down to the number of boot cpus.
>  Hotplug vcpus then migrate to a new VM.
>  Hot unplug down to the number of boot cpus then migrate to a new VM.
>  Up to 6 successive live migrations.
>  
>  And in which LGTM.
>  
>  Please feel free to add,
>  Tested-by: Miguel Luis <miguel.luis@oracle.com>

Many thanks for your efforts. Appreciate this.


Best
Salil.


>  
>  Regards,
>  Miguel
>  
>  [1]
>  https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for
>  -next/vcpu-hotplug
>  
>  > RFC V1 -> RFC V2


* Re: [PATCH RFC V3 13/29] arm/virt: Make ARM vCPU *present* status ACPI *persistent*
  2024-06-13 23:36 ` [PATCH RFC V3 13/29] arm/virt: Make ARM vCPU *present* status ACPI *persistent* Salil Mehta via
@ 2024-07-04  2:49   ` Nicholas Piggin
  2024-07-04 11:23     ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Nicholas Piggin @ 2024-07-04  2:49 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, gshan, rafael,
	borntraeger, alex.bennee, harshpb, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On Fri Jun 14, 2024 at 9:36 AM AEST, Salil Mehta wrote:
> The ARM architecture does not allow CPU presence to be changed [1] after the
> kernel has booted. Hence, firmware/ACPI/Qemu must ensure a persistent view of
> the vCPUs to the Guest kernel even when they are not present in the QOM, i.e.,
> are unplugged or are yet-to-be-plugged.

Do you need arch-independent state for this? If ARM always requires
it then can it be implemented between arm and acpi interface?

If not, then perhaps could it be done in the patch that introduces
all the other state?

> References:
> [1] Check comment 5 in the bugzilla entry
>    Link: https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5

> If I understand correctly (and I don't know ACPI, so it's likely
> I don't), that is an update to the ACPI spec to say some bit in an ACPI
> table must remain set regardless of CPU hotplug state.

Reference links are good, I think it would be nice to add a small
summary in the changelog too.

Thanks,
Nick

>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>  cpu-common.c          |  6 ++++++
>  hw/arm/virt.c         |  7 +++++++
>  include/hw/core/cpu.h | 21 +++++++++++++++++++++
>  3 files changed, 34 insertions(+)
>
> diff --git a/cpu-common.c b/cpu-common.c
> index 49d2a50835..e4b4dee99a 100644
> --- a/cpu-common.c
> +++ b/cpu-common.c
> @@ -128,6 +128,12 @@ bool qemu_enabled_cpu(CPUState *cpu)
>      return cpu && !cpu->disabled;
>  }
>  
> +bool qemu_persistent_cpu(CPUState *cpu)
> +{
> +    /* cpu state can be faked to the guest via acpi */
> +    return cpu && cpu->acpi_persistent;
> +}
> +
>  uint64_t qemu_get_cpu_archid(int cpu_index)
>  {
>      MachineState *ms = MACHINE(qdev_get_machine());
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 5f98162587..9d33f30a6a 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -3016,6 +3016,13 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>          return;
>      }
>      virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
> +
> +    /*
> +     * To give persistent presence view of vCPUs to the guest, ACPI might need
> +     * to fake the presence of the vCPUs to the guest but keep them disabled.
> +     * This shall be used during the init of ACPI Hotplug state and hot-unplug
> +     */
> +     cs->acpi_persistent = true;
>  }
>  
>  static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 62e68611c0..e13e542177 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -540,6 +540,14 @@ struct CPUState {
>       * every CPUState is enabled across all architectures.
>       */
>      bool disabled;
> +    /*
> +     * On certain architectures, to provide a persistent view of the 'presence'
> +     * of vCPUs to the guest, ACPI might need to fake the 'presence' of the
> +     * vCPUs but keep them ACPI-disabled for the guest. This is achieved by
> +     * returning `_STA.PRES=True` and `_STA.Ena=False` for the unplugged vCPUs
> +     * in QEMU QoM.
> +     */
> +    bool acpi_persistent;
>  
>      /* TODO Move common fields from CPUArchState here. */
>      int cpu_index;
> @@ -959,6 +967,19 @@ bool qemu_present_cpu(CPUState *cpu);
>   */
>  bool qemu_enabled_cpu(CPUState *cpu);
>  
> +/**
> + * qemu_persistent_cpu:
> + * @cpu: The vCPU to check
> + *
> + * Checks if the vCPU state should always be reflected as *present* via ACPI
> + * to the Guest. By default, this is False on all architectures and has to be
> + * explicitly set during initialization.
> + *
> + * Returns: True if it is ACPI 'persistent' CPU
> + *
> + */
> +bool qemu_persistent_cpu(CPUState *cpu);
> +
>  /**
>   * qemu_get_cpu_archid:
>   * @cpu_index: possible vCPU for which arch-id needs to be retrieved




* Re: [PATCH RFC V3 08/29] arm/virt: Init PMU at host for all possible vcpus
  2024-06-13 23:36 ` [PATCH RFC V3 08/29] arm/virt: Init PMU at host for all possible vcpus Salil Mehta via
@ 2024-07-04  3:07   ` Nicholas Piggin
  2024-07-04 12:03     ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Nicholas Piggin @ 2024-07-04  3:07 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, gshan, rafael,
	borntraeger, alex.bennee, harshpb, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On Fri Jun 14, 2024 at 9:36 AM AEST, Salil Mehta wrote:
> PMU for all possible vCPUs must be initialized at VM initialization time.
> Refactor the existing code to accommodate possible vCPUs. This also assumes
> that all processors being used are identical.
>
> Past discussion for reference:
> Link: https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html

I guess it's something for the ARM people, but there's a lot of
information in there, could it be useful to summarise important
parts here, e.g., from Andrew:

 KVM requires all VCPUs to have a PMU if one does. If the ARM ARM
 says it's possible to have PMUs for only some CPUs, then, for TCG,
 the restriction could be relaxed.

(I assume he meant ARM arch)

>
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>  hw/arm/virt.c         | 12 ++++++++----
>  include/hw/arm/virt.h |  1 +
>  2 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index ac53bfadca..57ec429022 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2045,12 +2045,14 @@ static void finalize_gic_version(VirtMachineState *vms)
>   */
>  static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
>  {
> +    CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
>      int max_cpus = MACHINE(vms)->smp.max_cpus;
> -    bool aarch64, pmu, steal_time;
> +    bool aarch64, steal_time;
>      CPUState *cpu;
> +    int n;
>  
>      aarch64 = object_property_get_bool(OBJECT(first_cpu), "aarch64", NULL);
> -    pmu = object_property_get_bool(OBJECT(first_cpu), "pmu", NULL);
> +    vms->pmu = object_property_get_bool(OBJECT(first_cpu), "pmu", NULL);
>      steal_time = object_property_get_bool(OBJECT(first_cpu),
>                                            "kvm-steal-time", NULL);
>  
> @@ -2077,8 +2079,10 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
>              memory_region_add_subregion(sysmem, pvtime_reg_base, pvtime);
>          }
>  
> -        CPU_FOREACH(cpu) {
> -            if (pmu) {
> +        for (n = 0; n < possible_cpus->len; n++) {
> +            cpu = qemu_get_possible_cpu(n);
> +

Maybe a CPU_FOREACH_POSSIBLE()?

Thanks,
Nick

> +            if (vms->pmu) {
>                  assert(arm_feature(&ARM_CPU(cpu)->env, ARM_FEATURE_PMU));
>                  if (kvm_irqchip_in_kernel()) {
>                      kvm_arm_pmu_set_irq(ARM_CPU(cpu), VIRTUAL_PMU_IRQ);
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 36ac5ff4a2..d8dcc89a0d 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -155,6 +155,7 @@ struct VirtMachineState {
>      bool ras;
>      bool mte;
>      bool dtb_randomness;
> +    bool pmu;
>      OnOffAuto acpi;
>      VirtGICType gic_version;
>      VirtIOMMUType iommu;



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 02/29] cpu-common: Add common CPU utility for possible vCPUs
  2024-06-13 23:36 ` [PATCH RFC V3 02/29] cpu-common: Add common CPU utility for possible vCPUs Salil Mehta via
@ 2024-07-04  3:12   ` Nicholas Piggin
  2024-08-12  4:59   ` Gavin Shan
  1 sibling, 0 replies; 105+ messages in thread
From: Nicholas Piggin @ 2024-07-04  3:12 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, gshan, rafael,
	borntraeger, alex.bennee, harshpb, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On Fri Jun 14, 2024 at 9:36 AM AEST, Salil Mehta wrote:

[...]

> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 60b160d0b4..60b4778da9 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h

[...]

> +/**
> + * qemu_get_cpu_archid:
> + * @cpu_index: possible vCPU for which arch-id needs to be retrieved
> + *
> + * Fetches the vCPU arch-id from the present possible vCPUs.
> + *
> + * Returns: arch-id of the possible vCPU
> + */
> +uint64_t qemu_get_cpu_archid(int cpu_index);

Not sure if blind... I can't see where this is used.

I'd be interested to see why it needs to be in non-arch code,
presumably it's only relevant to arch specific code. I'm
guessing ACPI needs it, but then could it be put into some
ACPI state or helper?

Thanks,
Nick


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 25/29] target/arm/kvm: Write CPU state back to KVM on reset
  2024-06-13 23:36 ` [PATCH RFC V3 25/29] target/arm/kvm: Write CPU state back to KVM on reset Salil Mehta via
@ 2024-07-04  3:27   ` Nicholas Piggin
  2024-07-04 12:27     ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Nicholas Piggin @ 2024-07-04  3:27 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, gshan, rafael,
	borntraeger, alex.bennee, harshpb, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On Fri Jun 14, 2024 at 9:36 AM AEST, Salil Mehta wrote:
> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
>
> When a KVM vCPU is reset following a PSCI CPU_ON call, its power state
> is not synchronized with KVM at the moment. Because the vCPU is not
> marked dirty, we miss the call to kvm_arch_put_registers() that writes
> to KVM's MP_STATE. Force mp_state synchronization.

Hmm. Is this a bug fix for upstream? arm does respond to CPU_ON calls
by the look, but maybe it's not doing KVM parking until your series?
Maybe just a slight change to say "When KVM parking is implemented for
ARM..." if so.

>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>  target/arm/kvm.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 1121771c4a..7acd83ce64 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -980,6 +980,7 @@ void kvm_arm_cpu_post_load(ARMCPU *cpu)
>  void kvm_arm_reset_vcpu(ARMCPU *cpu)
>  {
>      int ret;
> +    CPUState *cs = CPU(cpu);
>  
>      /* Re-init VCPU so that all registers are set to
>       * their respective reset values.
> @@ -1001,6 +1002,12 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
>       * for the same reason we do so in kvm_arch_get_registers().
>       */
>      write_list_to_cpustate(cpu);
> +
> +    /*
> +     * Ensure we call kvm_arch_put_registers(). The vCPU isn't marked dirty if
> +     * it was parked in KVM and is now booting from a PSCI CPU_ON call.
> +     */
> +    cs->vcpu_dirty = true;
>  }
>  
>  void kvm_arm_create_host_vcpu(ARMCPU *cpu)

Also above my pay grade, but arm_set_cpu_on_async_work(), which seems
to be what calls the CPU reset you refer to, does a bunch of CPU register
and state setting, including the power state setting that you mention.
Would the vcpu_dirty be better placed there?

Thanks,
Nick


^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 13/29] arm/virt: Make ARM vCPU *present* status ACPI *persistent*
  2024-07-04  2:49   ` Nicholas Piggin
@ 2024-07-04 11:23     ` Salil Mehta via
  2024-07-05  0:08       ` Nicholas Piggin
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-07-04 11:23 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com,
	alex.bennee@linaro.org, harshpb@linux.ibm.com,
	linux@armlinux.org.uk, darren@os.amperecomputing.com,
	ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
	karl.heubaum@oracle.com, miguel.luis@oracle.com,
	salil.mehta@opnsrc.net, zhukeqian, wangxiongfeng (C),
	wangyanan (Y), jiakernel2@gmail.com, maobibo@loongson.cn,
	lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com,
	Linuxarm

Hi Nick,

Thanks for taking the time to review. Please find my replies inline.

>  From: Nicholas Piggin <npiggin@gmail.com>
>  Sent: Thursday, July 4, 2024 3:49 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  On Fri Jun 14, 2024 at 9:36 AM AEST, Salil Mehta wrote:
>  > ARM arch does not allow CPU presence to be changed [1] after the kernel
>  > has booted. Hence, firmware/ACPI/QEMU must ensure a persistent view of
>  > the vCPUs to the Guest kernel even when they are not present in the QoM,
>  > i.e. are unplugged or are yet-to-be-plugged
>  
>  Do you need arch-independent state for this? If ARM always requires it
>  then can it be implemented between arm and acpi interface?


Yes, we do need it, as we cannot say whether the same constraint applies to
other architectures as well. The above-stated constraint affects how the
architecture-common ACPI CPU code is initialized.


>  
>  If not, then perhaps could it be done in the patch that introduces all the
>  other state?
>  
>  > References:
>  > [1] Check comment 5 in the bugzilla entry
>  >    Link: https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
>  
>  If I understand correctly (and I don't know ACPI, so it's likely I don't), that is
>  an update to the ACPI spec to say some bit in the ACPI table must remain set
>  regardless of CPU hotplug state.


ARM does not claim anything related to CPU hotplug right now. It simply
does not exist. The ACPI update is simply reinforcing the existing fact that
the _STA.Present bit in the ACPI spec cannot be changed after the system has
booted.

This is because, for the ARM arch, there are many other initializations which
depend upon the exact CPU count being available during boot, and they do not
expect it to change after boot. For example, there are many per-CPU features
and the GIC CPU interface etc. which all expect this to be fixed at boot time.
This is an immutable requirement from ARM.


>  
>  Reference links are good, I think it would be nice to add a small summary in
>  the changelog too.

Sure, I will do that.

Thanks
Salil.

>  
>  Thanks,
>  Nick
>  
>  >
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > ---
>  >  cpu-common.c          |  6 ++++++
>  >  hw/arm/virt.c         |  7 +++++++
>  >  include/hw/core/cpu.h | 21 +++++++++++++++++++++
>  >  3 files changed, 34 insertions(+)
>  >
>  > diff --git a/cpu-common.c b/cpu-common.c index
>  49d2a50835..e4b4dee99a
>  > 100644
>  > --- a/cpu-common.c
>  > +++ b/cpu-common.c
>  > @@ -128,6 +128,12 @@ bool qemu_enabled_cpu(CPUState *cpu)
>  >      return cpu && !cpu->disabled;
>  >  }
>  >
>  > +bool qemu_persistent_cpu(CPUState *cpu) {
>  > +    /* cpu state can be faked to the guest via acpi */
>  > +    return cpu && cpu->acpi_persistent; }
>  > +
>  >  uint64_t qemu_get_cpu_archid(int cpu_index)  {
>  >      MachineState *ms = MACHINE(qdev_get_machine()); diff --git
>  > a/hw/arm/virt.c b/hw/arm/virt.c index 5f98162587..9d33f30a6a 100644
>  > --- a/hw/arm/virt.c
>  > +++ b/hw/arm/virt.c
>  > @@ -3016,6 +3016,13 @@ static void virt_cpu_pre_plug(HotplugHandler
>  *hotplug_dev, DeviceState *dev,
>  >          return;
>  >      }
>  >      virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
>  > +
>  > +    /*
>  > +     * To give persistent presence view of vCPUs to the guest, ACPI might
>  need
>  > +     * to fake the presence of the vCPUs to the guest but keep them
>  disabled.
>  > +     * This shall be used during the init of ACPI Hotplug state and hot-
>  unplug
>  > +     */
>  > +     cs->acpi_persistent = true;
>  >  }
>  >
>  >  static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState
>  > *dev, diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h index
>  > 62e68611c0..e13e542177 100644
>  > --- a/include/hw/core/cpu.h
>  > +++ b/include/hw/core/cpu.h
>  > @@ -540,6 +540,14 @@ struct CPUState {
>  >       * every CPUState is enabled across all architectures.
>  >       */
>  >      bool disabled;
>  > +    /*
>  > +     * On certain architectures, to provide a persistent view of the
>  'presence'
>  > +     * of vCPUs to the guest, ACPI might need to fake the 'presence' of the
>  > +     * vCPUs but keep them ACPI-disabled for the guest. This is achieved
>  by
>  > +     * returning `_STA.PRES=True` and `_STA.Ena=False` for the unplugged
>  vCPUs
>  > +     * in QEMU QoM.
>  > +     */
>  > +    bool acpi_persistent;
>  >
>  >      /* TODO Move common fields from CPUArchState here. */
>  >      int cpu_index;
>  > @@ -959,6 +967,19 @@ bool qemu_present_cpu(CPUState *cpu);
>  >   */
>  >  bool qemu_enabled_cpu(CPUState *cpu);
>  >
>  > +/**
>  > + * qemu_persistent_cpu:
>  > + * @cpu: The vCPU to check
>  > + *
>  > + * Checks if the vCPU state should always be reflected as *present*
>  > +via ACPI
>  > + * to the Guest. By default, this is False on all architectures and
>  > +has to be
>  > + * explicitly set during initialization.
>  > + *
>  > + * Returns: True if it is ACPI 'persistent' CPU
>  > + *
>  > + */
>  > +bool qemu_persistent_cpu(CPUState *cpu);
>  > +
>  >  /**
>  >   * qemu_get_cpu_archid:
>  >   * @cpu_index: possible vCPU for which arch-id needs to be retrieved


^ permalink raw reply	[flat|nested] 105+ messages in thread
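[For readers following the thread: the "_STA.PRES=True, _STA.Ena=False" behaviour discussed above can be illustrated with a hypothetical ASL fragment. This is not taken from the series; the device name, _UID, and the CPEN operation-region field standing in for a QEMU-backed "CPU enabled" bit are all assumptions. In the _STA return value, bit 0 is Present, bit 1 is Enabled, bit 2 is Shown-in-UI, and bit 3 is Functional.]

```asl
// Hypothetical per-CPU _STA that always reports the CPU as *present*
// (bit 0) and *functional* (bit 3), and only toggles the *enabled*
// bit (bit 1) on hot(un)plug, so the guest's view of presence persists.
Device (C001) {
    Name (_HID, "ACPI0007")   // processor device
    Name (_UID, 1)
    Method (_STA, 0, Serialized) {
        If (CPEN) {           // hypothetical "CPU enabled" field from QEMU
            Return (0x0F)     // present | enabled | shown in UI | functional
        } Else {
            Return (0x09)     // present | functional, but ACPI-disabled
        }
    }
}
```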

* RE: [PATCH RFC V3 08/29] arm/virt: Init PMU at host for all possible vcpus
  2024-07-04  3:07   ` Nicholas Piggin
@ 2024-07-04 12:03     ` Salil Mehta via
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-07-04 12:03 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com,
	alex.bennee@linaro.org, harshpb@linux.ibm.com,
	linux@armlinux.org.uk, darren@os.amperecomputing.com,
	ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
	karl.heubaum@oracle.com, miguel.luis@oracle.com,
	salil.mehta@opnsrc.net, zhukeqian, wangxiongfeng (C),
	wangyanan (Y), jiakernel2@gmail.com, maobibo@loongson.cn,
	lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com,
	Linuxarm

Hi Nick,

>  From: Nicholas Piggin <npiggin@gmail.com>
>  Sent: Thursday, July 4, 2024 4:08 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  On Fri Jun 14, 2024 at 9:36 AM AEST, Salil Mehta wrote:
>  > PMU for all possible vCPUs must be initialized at the VM initialization time.
>  > Refactor existing code to accommodate possible vCPUs. This also assumes
>  > that all processors being used are identical.
>  >
>  > Past discussion for reference:
>  > Link:
>  > https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
>  
>  I guess it's something for the ARM people, but there's a lot of information in
>  there, could it be useful to summarise important parts here, e.g., from
>  Andrew:
>  
>   KVM requires all VCPUs to have a PMU if one does. If the ARM ARM  says
>  it's possible to have PMUs for only some CPUs, then, for TCG,  the
>  restriction could be relaxed.
>  
>  (I assume he meant ARM arch)


I retained the link just for reference. Right now it is an assumption that
all vCPUs have similar features. This is reflected in KVM as well.
(Maybe this will not be the case in the future, with the advent of
heterogeneous computing. Linaro had something going in that direction?
For now, it looks to be a far-fetched idea.)

I can definitely summarize what was discussed earlier.


>  > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > ---
>  >  hw/arm/virt.c         | 12 ++++++++----
>  >  include/hw/arm/virt.h |  1 +
>  >  2 files changed, 9 insertions(+), 4 deletions(-)
>  >
>  > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>  > ac53bfadca..57ec429022 100644
>  > --- a/hw/arm/virt.c
>  > +++ b/hw/arm/virt.c
>  > @@ -2045,12 +2045,14 @@ static void
>  finalize_gic_version(VirtMachineState *vms)
>  >   */
>  >  static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion
>  > *sysmem)  {
>  > +    CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
>  >      int max_cpus = MACHINE(vms)->smp.max_cpus;
>  > -    bool aarch64, pmu, steal_time;
>  > +    bool aarch64, steal_time;
>  >      CPUState *cpu;
>  > +    int n;
>  >
>  >      aarch64 = object_property_get_bool(OBJECT(first_cpu), "aarch64", NULL);
>  > -    pmu = object_property_get_bool(OBJECT(first_cpu), "pmu", NULL);
>  > +    vms->pmu = object_property_get_bool(OBJECT(first_cpu), "pmu",
>  > + NULL);
>  >      steal_time = object_property_get_bool(OBJECT(first_cpu),
>  >                                            "kvm-steal-time", NULL);
>  >
>  > @@ -2077,8 +2079,10 @@ static void virt_cpu_post_init(VirtMachineState
>  *vms, MemoryRegion *sysmem)
>  >              memory_region_add_subregion(sysmem, pvtime_reg_base,
>  pvtime);
>  >          }
>  >
>  > -        CPU_FOREACH(cpu) {
>  > -            if (pmu) {
>  > +        for (n = 0; n < possible_cpus->len; n++) {
>  > +            cpu = qemu_get_possible_cpu(n);
>  > +
>  
>  Maybe a CPU_FOREACH_POSSIBLE()?


sure.


Thank you
Salil.

>  
>  Thanks,
>  Nick
>  
>  > +            if (vms->pmu) {
>  >                  assert(arm_feature(&ARM_CPU(cpu)->env,
>  ARM_FEATURE_PMU));
>  >                  if (kvm_irqchip_in_kernel()) {
>  >                      kvm_arm_pmu_set_irq(ARM_CPU(cpu),
>  > VIRTUAL_PMU_IRQ); diff --git a/include/hw/arm/virt.h
>  > b/include/hw/arm/virt.h index 36ac5ff4a2..d8dcc89a0d 100644
>  > --- a/include/hw/arm/virt.h
>  > +++ b/include/hw/arm/virt.h
>  > @@ -155,6 +155,7 @@ struct VirtMachineState {
>  >      bool ras;
>  >      bool mte;
>  >      bool dtb_randomness;
>  > +    bool pmu;
>  >      OnOffAuto acpi;
>  >      VirtGICType gic_version;
>  >      VirtIOMMUType iommu;


^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 25/29] target/arm/kvm: Write CPU state back to KVM on reset
  2024-07-04  3:27   ` Nicholas Piggin
@ 2024-07-04 12:27     ` Salil Mehta via
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-07-04 12:27 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com,
	alex.bennee@linaro.org, harshpb@linux.ibm.com,
	linux@armlinux.org.uk, darren@os.amperecomputing.com,
	ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
	karl.heubaum@oracle.com, miguel.luis@oracle.com,
	salil.mehta@opnsrc.net, zhukeqian, wangxiongfeng (C),
	wangyanan (Y), jiakernel2@gmail.com, maobibo@loongson.cn,
	lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com,
	Linuxarm

Hi Nick,

>  From: Nicholas Piggin <npiggin@gmail.com>
>  Sent: Thursday, July 4, 2024 4:28 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  On Fri Jun 14, 2024 at 9:36 AM AEST, Salil Mehta wrote:
>  > From: Jean-Philippe Brucker <jean-philippe@linaro.org>
>  >
>  > When a KVM vCPU is reset following a PSCI CPU_ON call, its power state
>  > is not synchronized with KVM at the moment. Because the vCPU is not
>  > marked dirty, we miss the call to kvm_arch_put_registers() that writes
>  > to KVM's MP_STATE. Force mp_state synchronization.
>  
>  Hmm. Is this a bug fix for upstream? arm does respond to CPU_ON calls by
>  the look, but maybe it's not doing KVM parking until your series?


Yes, this is required now that we park and un-park the vCPUs. We must ensure
that KVM resets the vCPU state as well. Hence, it is not a fix but a change
which is required in the context of this patch-set.


>  Maybe just a slight change to say "When KVM parking is implemented for
>  ARM..." if so.

Sure.

>  
>  >
>  > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > ---
>  >  target/arm/kvm.c | 7 +++++++
>  >  1 file changed, 7 insertions(+)
>  >
>  > diff --git a/target/arm/kvm.c b/target/arm/kvm.c index
>  > 1121771c4a..7acd83ce64 100644
>  > --- a/target/arm/kvm.c
>  > +++ b/target/arm/kvm.c
>  > @@ -980,6 +980,7 @@ void kvm_arm_cpu_post_load(ARMCPU *cpu)
>  void
>  > kvm_arm_reset_vcpu(ARMCPU *cpu)  {
>  >      int ret;
>  > +    CPUState *cs = CPU(cpu);
>  >
>  >      /* Re-init VCPU so that all registers are set to
>  >       * their respective reset values.
>  > @@ -1001,6 +1002,12 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
>  >       * for the same reason we do so in kvm_arch_get_registers().
>  >       */
>  >      write_list_to_cpustate(cpu);
>  > +
>  > +    /*
>  > +     * Ensure we call kvm_arch_put_registers(). The vCPU isn't marked dirty if
>  > +     * it was parked in KVM and is now booting from a PSCI CPU_ON call.
>  > +     */
>  > +    cs->vcpu_dirty = true;
>  >  }
>  >
>  >  void kvm_arm_create_host_vcpu(ARMCPU *cpu)
>  
>  Also above my pay grade, but arm_set_cpu_on_async_work() which seems
>  to be what calls the CPU reset you refer to does a bunch of CPU register and
>  state setting including the power state setting that you mention.
>  Would the vcpu_dirty be better to go there?


Maybe we can. Let me cross verify this.


Thanks
Salil.

>  
>  Thanks,
>  Nick

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 13/29] arm/virt: Make ARM vCPU *present* status ACPI *persistent*
  2024-07-04 11:23     ` Salil Mehta via
@ 2024-07-05  0:08       ` Nicholas Piggin
  0 siblings, 0 replies; 105+ messages in thread
From: Nicholas Piggin @ 2024-07-05  0:08 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com,
	alex.bennee@linaro.org, harshpb@linux.ibm.com,
	linux@armlinux.org.uk, darren@os.amperecomputing.com,
	ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
	karl.heubaum@oracle.com, miguel.luis@oracle.com,
	salil.mehta@opnsrc.net, zhukeqian, wangxiongfeng (C),
	wangyanan (Y), jiakernel2@gmail.com, maobibo@loongson.cn,
	lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com,
	Linuxarm

On Thu Jul 4, 2024 at 9:23 PM AEST, Salil Mehta wrote:
> HI Nick,
>
> Thanks for taking time to review. Please find my replies inline.
>
> >  From: Nicholas Piggin <npiggin@gmail.com>
> >  Sent: Thursday, July 4, 2024 3:49 AM
> >  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
> >  qemu-arm@nongnu.org; mst@redhat.com
> >  
> >  On Fri Jun 14, 2024 at 9:36 AM AEST, Salil Mehta wrote:
> >  > ARM arch does not allow CPU presence to be changed [1] after the kernel
> >  > has booted. Hence, firmware/ACPI/QEMU must ensure a persistent view of
> >  > the vCPUs to the Guest kernel even when they are not present in the QoM,
> >  > i.e. are unplugged or are yet-to-be-plugged
> >  
> >  Do you need arch-independent state for this? If ARM always requires it
> >  then can it be implemented between arm and acpi interface?
>
>
> Yes, we do need it, as we cannot say whether the same constraint applies to
> other architectures as well. The above-stated constraint affects how the
> architecture-common ACPI CPU code is initialized.

Right, but could it be done with an ACPI property that the arch can
change, or an argument from arch code to an ACPI init routine? Or
even a machine property that ACPI could query.

> >  If not, then perhaps could it be done in the patch that introduces all the
> >  other state?
> >  
> >  > References:
> >  > [1] Check comment 5 in the bugzilla entry
> >  >    Link: https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> >  
> >  If I understand correctly (and I don't know ACPI, so it's likely I don't), that is
> >  an update to the ACPI spec to say some bit in the ACPI table must remain set
> >  regardless of CPU hotplug state.
>
>
> ARM does not claim anything related to CPU hotplug right now. It simply
> does not exist. The ACPI update is simply reinforcing the existing fact that
> the _STA.Present bit in the ACPI spec cannot be changed after the system has
> booted.
>
> This is because, for the ARM arch, there are many other initializations which
> depend upon the exact CPU count being available during boot, and they do not
> expect it to change after boot. For example, there are many per-CPU features
> and the GIC CPU interface etc. which all expect this to be fixed at boot time.
> This is an immutable requirement from ARM.
>
>
> >  
> >  Reference links are good, I think it would be nice to add a small summary in
> >  the changelog too.
>
> sure. I will do.

Thanks. Something like what you wrote above would work.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (30 preceding siblings ...)
  2024-07-01 11:38 ` Miguel Luis
@ 2024-08-07  9:53 ` Gavin Shan
  2024-08-07 13:27   ` Salil Mehta via
  2024-08-28 20:35 ` Gustavo Romero
  32 siblings, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-07  9:53 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Hi Salil,

With this series and latest upstream Linux kernel (host), I ran into core dump as below.
I'm not sure if it's a known issue or not.

# uname -r
6.11.0-rc2-gavin+
# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 -accel kvm \
   -machine virt,gic-version=host,nvdimm=on -cpu host                 \
   -smp maxcpus=2,cpus=1,sockets=2,clusters=1,cores=1,threads=1       \
   -m 4096M,slots=16,maxmem=128G                                      \
   -object memory-backend-ram,id=mem0,size=2048M                      \
   -object memory-backend-ram,id=mem1,size=2048M                      \
   -numa node,nodeid=0,memdev=mem0,cpus=0-0                           \
   -numa node,nodeid=1,memdev=mem1,cpus=1-1                           \
     :
qemu-system-aarch64: Failed to initialize host vcpu 1
Aborted (core dumped)

# gdb /var/lib/systemd/coredump/core.0 /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64
(gdb) bt
#0  0x0000ffff9eec42e8 in __pthread_kill_implementation () at /lib64/libc.so.6
#1  0x0000ffff9ee7c73c in raise () at /lib64/libc.so.6
#2  0x0000ffff9ee69034 in abort () at /lib64/libc.so.6
#3  0x0000aaaac71152c0 in kvm_arm_create_host_vcpu (cpu=0xaaaae4c0cb80)
     at ../target/arm/kvm.c:1093
#4  0x0000aaaac7057520 in machvirt_init (machine=0xaaaae48198c0) at ../hw/arm/virt.c:2534
#5  0x0000aaaac6b0d31c in machine_run_board_init
     (machine=0xaaaae48198c0, mem_path=0x0, errp=0xfffff754ee38) at ../hw/core/machine.c:1576
#6  0x0000aaaac6f58d70 in qemu_init_board () at ../system/vl.c:2620
#7  0x0000aaaac6f590dc in qmp_x_exit_preconfig (errp=0xaaaac8911120 <error_fatal>)
     at ../system/vl.c:2712
#8  0x0000aaaac6f5b728 in qemu_init (argc=82, argv=0xfffff754f1d8) at ../system/vl.c:3758
#9  0x0000aaaac6a5315c in main (argc=82, argv=0xfffff754f1d8) at ../system/main.c:47

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-07  9:53 ` Gavin Shan
@ 2024-08-07 13:27   ` Salil Mehta via
  2024-08-07 16:07     ` Salil Mehta via
  2024-08-07 23:41     ` Gavin Shan
  0 siblings, 2 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-08-07 13:27 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

Let me figure this out. Have you also included the below patch along with the
architecture-agnostic patch-set accepted in this QEMU cycle?

https://lore.kernel.org/all/20240801142322.3948866-3-peter.maydell@linaro.org/


Thanks
Salil.

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Wednesday, August 7, 2024 10:54 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  Hi Salil,
>  
>  With this series and latest upstream Linux kernel (host), I ran into core
>  dump as below.
>  I'm not sure if it's a known issue or not.
>  
>  # uname -r
>  6.11.0-rc2-gavin+
>  # /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 -accel
>  kvm \
>     -machine virt,gic-version=host,nvdimm=on -cpu host                 \
>     -smp maxcpus=2,cpus=1,sockets=2,clusters=1,cores=1,threads=1       \
>     -m 4096M,slots=16,maxmem=128G                                      \
>     -object memory-backend-ram,id=mem0,size=2048M                      \
>     -object memory-backend-ram,id=mem1,size=2048M                      \
>     -numa node,nodeid=0,memdev=mem0,cpus=0-0                           \
>     -numa node,nodeid=1,memdev=mem1,cpus=1-1                           \
>       :
>  qemu-system-aarch64: Failed to initialize host vcpu 1 Aborted (core
>  dumped)
>  
>  # gdb /var/lib/systemd/coredump/core.0
>  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64
>  (gdb) bt
>  #0  0x0000ffff9eec42e8 in __pthread_kill_implementation () at
>  /lib64/libc.so.6
>  #1  0x0000ffff9ee7c73c in raise () at /lib64/libc.so.6
>  #2  0x0000ffff9ee69034 in abort () at /lib64/libc.so.6
>  #3  0x0000aaaac71152c0 in kvm_arm_create_host_vcpu
>  (cpu=0xaaaae4c0cb80)
>       at ../target/arm/kvm.c:1093
>  #4  0x0000aaaac7057520 in machvirt_init (machine=0xaaaae48198c0) at
>  ../hw/arm/virt.c:2534
>  #5  0x0000aaaac6b0d31c in machine_run_board_init
>       (machine=0xaaaae48198c0, mem_path=0x0, errp=0xfffff754ee38) at
>  ../hw/core/machine.c:1576
>  #6  0x0000aaaac6f58d70 in qemu_init_board () at ../system/vl.c:2620
>  #7  0x0000aaaac6f590dc in qmp_x_exit_preconfig (errp=0xaaaac8911120
>  <error_fatal>)
>       at ../system/vl.c:2712
>  #8  0x0000aaaac6f5b728 in qemu_init (argc=82, argv=0xfffff754f1d8) at
>  ../system/vl.c:3758
>  #9  0x0000aaaac6a5315c in main (argc=82, argv=0xfffff754f1d8) at
>  ../system/main.c:47
>  
>  Thanks,
>  Gavin
>  


^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-07 13:27   ` Salil Mehta via
@ 2024-08-07 16:07     ` Salil Mehta via
  2024-08-08  5:00       ` Gavin Shan
  2024-08-07 23:41     ` Gavin Shan
  1 sibling, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-07 16:07 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

I tested the ARM architecture-specific patches with the latest QEMU, which contains the
below-mentioned fix, and I cannot reproduce the crash. I used kernel linux-6.11-rc2 and
it booted successfully. However, I did see a kernel crash when attempting to hot-plug the
first vCPU.

(qemu) device_add host-arm-cpu,id=core4,core-id=4
(qemu) [  365.125477] Unable to handle kernel write to read-only memory at virtual address ffff800081ba4190
[  365.126366] Mem abort info:
[  365.126640]   ESR = 0x000000009600004e
[  365.127010]   EC = 0x25: DABT (current EL), IL = 32 bits
[  365.127524]   SET = 0, FnV = 0
[  365.127822]   EA = 0, S1PTW = 0
[  365.128130]   FSC = 0x0e: level 2 permission fault
[  365.128598] Data abort info:
[  365.128881]   ISV = 0, ISS = 0x0000004e, ISS2 = 0x00000000
[  365.129447]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[  365.129943]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  365.130442] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000045830000
[  365.131068] [ffff800081ba4190] pgd=0000000000000000, p4d=10000000467df003, pud=10000000467e0003, pmd=0060000045600781
[  365.132069] Internal error: Oops: 000000009600004e [#1] PREEMPT SMP
[  365.132661] Modules linked in:
[  365.132952] CPU: 0 UID: 0 PID: 11 Comm: kworker/u24:0 Not tainted 6.11.0-rc2 #228
[  365.133699] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  365.134415] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  365.134969] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  365.135679] pc : register_cpu+0x138/0x250
[  365.136093] lr : register_cpu+0x120/0x250
[  365.136506] sp : ffff800082cbba10
[  365.136847] x29: ffff800082cbba10 x28: ffff8000826479c0 x27: ffff000000a7e098
[  365.137575] x26: ffff8000827c2838 x25: 0000000000000004 x24: ffff80008264d9b0
[  365.138311] x23: 0000000000000004 x22: ffff000012a482d0 x21: ffff800081e30a00
[  365.139037] x20: 0000000000000000 x19: ffff800081ba4190 x18: ffffffffffffffff
[  365.139764] x17: 0000000000000000 x16: 0000000000000000 x15: ffff000001adaa1c
[  365.140490] x14: ffffffffffffffff x13: ffff000001ada2e0 x12: 0000000000000000
[  365.141216] x11: ffff800081e32780 x10: 0000000000000000 x9 : 0000000000000001
[  365.141945] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : 6f7274726e737460
[  365.142668] x5 : ffff0000027b1920 x4 : ffff0000027b1b40 x3 : ffff0000027b1880
[  365.143400] x2 : ffff0000001933c0 x1 : ffff800081ba4190 x0 : 0000000000000010
[  365.144129] Call trace:
[  365.144382]  register_cpu+0x138/0x250
[  365.144759]  arch_register_cpu+0x7c/0xc4
[  365.145166]  acpi_processor_add+0x468/0x590
[  365.145594]  acpi_bus_attach+0x1ac/0x2dc
[  365.146002]  acpi_dev_for_one_check+0x34/0x40
[  365.146449]  device_for_each_child+0x5c/0xb0
[  365.146887]  acpi_dev_for_each_child+0x3c/0x64
[  365.147341]  acpi_bus_attach+0x78/0x2dc
[  365.147734]  acpi_bus_scan+0x68/0x208
[  365.148110]  acpi_scan_rescan_bus+0x4c/0x78
[  365.148537]  acpi_device_hotplug+0x1f8/0x460
[  365.148975]  acpi_hotplug_work_fn+0x24/0x3c
[  365.149402]  process_one_work+0x150/0x294
[  365.149817]  worker_thread+0x2e4/0x3ec
[  365.150199]  kthread+0x118/0x11c
[  365.150536]  ret_from_fork+0x10/0x20
[  365.150903] Code: 91064021 9ad72000 8b130c33 d503201f (f820327f)
[  365.151527] ---[ end trace 0000000000000000 ]---


Do let me know how QEMU with the arch-specific patches goes for you.

Thanks
Salil.

>  From: Salil Mehta
>  Sent: Wednesday, August 7, 2024 2:27 PM
>  To: 'Gavin Shan' <gshan@redhat.com>; qemu-devel@nongnu.org; qemu-
>  arm@nongnu.org; mst@redhat.com
>  
>  Hi Gavin,
>  
>  Let me figure out this. Have you also included the below patch along with
>  the architecture agnostic patch-set accepted in this Qemu cycle?
>  
>  https://lore.kernel.org/all/20240801142322.3948866-3-
>  peter.maydell@linaro.org/
>  
>  
>  Thanks
>  Salil.
>  
>  >  From: Gavin Shan <gshan@redhat.com>
>  >  Sent: Wednesday, August 7, 2024 10:54 AM
>  >  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  > qemu-arm@nongnu.org; mst@redhat.com
>  >
>  >  Hi Salil,
>  >
>  >  With this series and latest upstream Linux kernel (host), I ran into
>  > core  dump as below.
>  >  I'm not sure if it's a known issue or not.
>  >
>  >  # uname -r
>  >  6.11.0-rc2-gavin+
>  >  # /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 -accel
>  kvm
>  > \
>  >     -machine virt,gic-version=host,nvdimm=on -cpu host                 \
>  >     -smp maxcpus=2,cpus=1,sockets=2,clusters=1,cores=1,threads=1       \
>  >     -m 4096M,slots=16,maxmem=128G                                      \
>  >     -object memory-backend-ram,id=mem0,size=2048M                      \
>  >     -object memory-backend-ram,id=mem1,size=2048M                      \
>  >     -numa node,nodeid=0,memdev=mem0,cpus=0-0                           \
>  >     -numa node,nodeid=1,memdev=mem1,cpus=1-1                           \
>  >       :
>  >  qemu-system-aarch64: Failed to initialize host vcpu 1 Aborted (core
>  >  dumped)
>  >
>  >  # gdb /var/lib/systemd/coredump/core.0
>  >  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64
>  >  (gdb) bt
>  >  #0  0x0000ffff9eec42e8 in __pthread_kill_implementation () at
>  >  /lib64/libc.so.6
>  >  #1  0x0000ffff9ee7c73c in raise () at /lib64/libc.so.6
>  >  #2  0x0000ffff9ee69034 in abort () at /lib64/libc.so.6
>  >  #3  0x0000aaaac71152c0 in kvm_arm_create_host_vcpu
>  >  (cpu=0xaaaae4c0cb80)
>  >       at ../target/arm/kvm.c:1093
>  >  #4  0x0000aaaac7057520 in machvirt_init (machine=0xaaaae48198c0) at
>  >  ../hw/arm/virt.c:2534
>  >  #5  0x0000aaaac6b0d31c in machine_run_board_init
>  >       (machine=0xaaaae48198c0, mem_path=0x0, errp=0xfffff754ee38) at
>  >  ../hw/core/machine.c:1576
>  >  #6  0x0000aaaac6f58d70 in qemu_init_board () at ../system/vl.c:2620
>  >  #7  0x0000aaaac6f590dc in qmp_x_exit_preconfig (errp=0xaaaac8911120
>  >  <error_fatal>)
>  >       at ../system/vl.c:2712
>  >  #8  0x0000aaaac6f5b728 in qemu_init (argc=82, argv=0xfffff754f1d8) at
>  >  ../system/vl.c:3758
>  >  #9  0x0000aaaac6a5315c in main (argc=82, argv=0xfffff754f1d8) at
>  >  ../system/main.c:47
>  >
>  >  Thanks,
>  >  Gavin
>  >



* Re: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-07 13:27   ` Salil Mehta via
  2024-08-07 16:07     ` Salil Mehta via
@ 2024-08-07 23:41     ` Gavin Shan
  2024-08-07 23:48       ` Salil Mehta via
  1 sibling, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-07 23:41 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Salil,

On 8/7/24 11:27 PM, Salil Mehta wrote:
> 
> Let me figure out this. Have you also included the below patch along with the
> architecture agnostic patch-set accepted in this Qemu cycle?
> 
> https://lore.kernel.org/all/20240801142322.3948866-3-peter.maydell@linaro.org/
>  

There is no vCPU fd to be parked or unparked when the core dump happens. I
tried the patch, but it didn't help. I added more debugging messages, and the
core dump is triggered in the following path. It seems 'cpu->sve_vq.map' isn't
correct, since it's populated in the CPU realization path and the
non-cold-booted CPUs aren't realized at the booting stage.

# dmesg | grep "Scalable Vector Extension"
[    0.117121] CPU features: detected: Scalable Vector Extension

# start_vm
===> machvirt_init: create CPU object (idx=0, type=[host-arm-cpu])
cpu_common_initfn
arm_cpu_initfn
aarch64_cpu_initfn
aarch64_cpu_instance_init
aarch64_host_initfn
arm_cpu_post_init
===> machvirt_init: realize CPU object (idx=0)
virt_cpu_pre_plug
arm_cpu_realizefn
cpu_common_realizefn
virt_cpu_plug
===> machvirt_init: create CPU object (idx=1, type=[host-arm-cpu])
cpu_common_initfn
arm_cpu_initfn
aarch64_cpu_initfn
aarch64_cpu_instance_init
aarch64_host_initfn
arm_cpu_post_init
kvm_arch_init_vcpu: Error -22 from kvm_arm_sve_set_vls()
qemu-system-aarch64: Failed to initialize host vcpu 1
Aborted (core dumped)

Thanks,
Gavin

>>   
>>   With this series and latest upstream Linux kernel (host), I ran into core
>>   dump as below.
>>   I'm not sure if it's a known issue or not.
>>   
>>   # uname -r
>>   6.11.0-rc2-gavin+
>>   # /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 -accel
>>   kvm \
>>      -machine virt,gic-version=host,nvdimm=on -cpu host                 \
>>      -smp maxcpus=2,cpus=1,sockets=2,clusters=1,cores=1,threads=1       \
>>      -m 4096M,slots=16,maxmem=128G                                      \
>>      -object memory-backend-ram,id=mem0,size=2048M                      \
>>      -object memory-backend-ram,id=mem1,size=2048M                      \
>>      -numa node,nodeid=0,memdev=mem0,cpus=0-0                           \
>>      -numa node,nodeid=1,memdev=mem1,cpus=1-1                           \
>>        :
>>   qemu-system-aarch64: Failed to initialize host vcpu 1 Aborted (core
>>   dumped)
>>   
>>   # gdb /var/lib/systemd/coredump/core.0
>>   /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64
>>   (gdb) bt
>>   #0  0x0000ffff9eec42e8 in __pthread_kill_implementation () at
>>   /lib64/libc.so.6
>>   #1  0x0000ffff9ee7c73c in raise () at /lib64/libc.so.6
>>   #2  0x0000ffff9ee69034 in abort () at /lib64/libc.so.6
>>   #3  0x0000aaaac71152c0 in kvm_arm_create_host_vcpu
>>   (cpu=0xaaaae4c0cb80)
>>        at ../target/arm/kvm.c:1093
>>   #4  0x0000aaaac7057520 in machvirt_init (machine=0xaaaae48198c0) at
>>   ../hw/arm/virt.c:2534
>>   #5  0x0000aaaac6b0d31c in machine_run_board_init
>>        (machine=0xaaaae48198c0, mem_path=0x0, errp=0xfffff754ee38) at
>>   ../hw/core/machine.c:1576
>>   #6  0x0000aaaac6f58d70 in qemu_init_board () at ../system/vl.c:2620
>>   #7  0x0000aaaac6f590dc in qmp_x_exit_preconfig (errp=0xaaaac8911120
>>   <error_fatal>)
>>        at ../system/vl.c:2712
>>   #8  0x0000aaaac6f5b728 in qemu_init (argc=82, argv=0xfffff754f1d8) at
>>   ../system/vl.c:3758
>>   #9  0x0000aaaac6a5315c in main (argc=82, argv=0xfffff754f1d8) at
>>   ../system/main.c:47
>>   
>>   Thanks,
>>   Gavin
>>   
> 




* RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-07 23:41     ` Gavin Shan
@ 2024-08-07 23:48       ` Salil Mehta via
  2024-08-08  0:29         ` Gavin Shan
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-07 23:48 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

Thanks for further information.

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Thursday, August 8, 2024 12:41 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  Hi Salil,
>  
>  On 8/7/24 11:27 PM, Salil Mehta wrote:
>  >
>  > Let me figure out this. Have you also included the below patch along
>  > with the architecture agnostic patch-set accepted in this Qemu cycle?
>  >
>  > https://lore.kernel.org/all/20240801142322.3948866-3-peter.maydell@lin
>  > aro.org/
>  >
>  
>  There are no vCPU fd to be parked and unparked when the core dump
>  happenes. I tried it, but didn't help. I added more debugging messages and
>  the core dump is triggered in the following path. It seems 'cpu-
>  >sve_vq.map' isn't correct since it's populated in CPU realization path, and
>  those non-cold-booted CPUs aren't realized in the booting stage.


Ah, I have to fix the SVE support. I'm already working on it, and the fix will
be part of RFC V4.

Have you tried booting the VM with SVE support disabled?


>  
>  # dmesg | grep "Scalable Vector Extension"
>  [    0.117121] CPU features: detected: Scalable Vector Extension
>  
>  # start_vm
>  ===> machvirt_init: create CPU object (idx=0, type=[host-arm-cpu])
>  cpu_common_initfn arm_cpu_initfn aarch64_cpu_initfn
>  aarch64_cpu_instance_init aarch64_host_initfn arm_cpu_post_init ===>
>  machvirt_init: realize CPU object (idx=0) virt_cpu_pre_plug
>  arm_cpu_realizefn cpu_common_realizefn virt_cpu_plug ===>
>  machvirt_init: create CPU object (idx=1, type=[host-arm-cpu])
>  cpu_common_initfn arm_cpu_initfn aarch64_cpu_initfn
>  aarch64_cpu_instance_init aarch64_host_initfn arm_cpu_post_init
>  kvm_arch_init_vcpu: Error -22 from kvm_arm_sve_set_vls()
>  qemu-system-aarch64: Failed to initialize host vcpu 1 Aborted (core
>  dumped)

Yes, sure. 

Thanks
Salil.


>  
>  Thanks,
>  Gavin
>  
>  >>
>  >>   With this series and latest upstream Linux kernel (host), I ran into core
>  >>   dump as below.
>  >>   I'm not sure if it's a known issue or not.
>  >>
>  >>   # uname -r
>  >>   6.11.0-rc2-gavin+
>  >>   # /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 -
>  accel
>  >>   kvm \
>  >>      -machine virt,gic-version=host,nvdimm=on -cpu host                 \
>  >>      -smp maxcpus=2,cpus=1,sockets=2,clusters=1,cores=1,threads=1       \
>  >>      -m 4096M,slots=16,maxmem=128G                                      \
>  >>      -object memory-backend-ram,id=mem0,size=2048M                      \
>  >>      -object memory-backend-ram,id=mem1,size=2048M                      \
>  >>      -numa node,nodeid=0,memdev=mem0,cpus=0-0                           \
>  >>      -numa node,nodeid=1,memdev=mem1,cpus=1-1                           \
>  >>        :
>  >>   qemu-system-aarch64: Failed to initialize host vcpu 1 Aborted (core
>  >>   dumped)
>  >>
>  >>   # gdb /var/lib/systemd/coredump/core.0
>  >>   /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64
>  >>   (gdb) bt
>  >>   #0  0x0000ffff9eec42e8 in __pthread_kill_implementation () at
>  >>   /lib64/libc.so.6
>  >>   #1  0x0000ffff9ee7c73c in raise () at /lib64/libc.so.6
>  >>   #2  0x0000ffff9ee69034 in abort () at /lib64/libc.so.6
>  >>   #3  0x0000aaaac71152c0 in kvm_arm_create_host_vcpu
>  >>   (cpu=0xaaaae4c0cb80)
>  >>        at ../target/arm/kvm.c:1093
>  >>   #4  0x0000aaaac7057520 in machvirt_init (machine=0xaaaae48198c0) at
>  >>   ../hw/arm/virt.c:2534
>  >>   #5  0x0000aaaac6b0d31c in machine_run_board_init
>  >>        (machine=0xaaaae48198c0, mem_path=0x0, errp=0xfffff754ee38) at
>  >>   ../hw/core/machine.c:1576
>  >>   #6  0x0000aaaac6f58d70 in qemu_init_board () at ../system/vl.c:2620
>  >>   #7  0x0000aaaac6f590dc in qmp_x_exit_preconfig
>  (errp=0xaaaac8911120
>  >>   <error_fatal>)
>  >>        at ../system/vl.c:2712
>  >>   #8  0x0000aaaac6f5b728 in qemu_init (argc=82, argv=0xfffff754f1d8) at
>  >>   ../system/vl.c:3758
>  >>   #9  0x0000aaaac6a5315c in main (argc=82, argv=0xfffff754f1d8) at
>  >>   ../system/main.c:47
>  >>
>  >>   Thanks,
>  >>   Gavin
>  >>
>  >
>  



* Re: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-07 23:48       ` Salil Mehta via
@ 2024-08-08  0:29         ` Gavin Shan
  2024-08-08  4:15           ` Gavin Shan
  2024-08-08  8:36           ` Salil Mehta via
  0 siblings, 2 replies; 105+ messages in thread
From: Gavin Shan @ 2024-08-08  0:29 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Salil,

On 8/8/24 9:48 AM, Salil Mehta wrote:
>>   On 8/7/24 11:27 PM, Salil Mehta wrote:
>>   >
>>   > Let me figure out this. Have you also included the below patch along
>>   > with the architecture agnostic patch-set accepted in this Qemu cycle?
>>   >
>>   > https://lore.kernel.org/all/20240801142322.3948866-3-peter.maydell@lin
>>   > aro.org/
>>   >
>>   
>>   There are no vCPU fd to be parked and unparked when the core dump
>>   happenes. I tried it, but didn't help. I added more debugging messages and
>>   the core dump is triggered in the following path. It seems 'cpu-
>>   >sve_vq.map' isn't correct since it's populated in CPU realization path, and
>>   those non-cold-booted CPUs aren't realized in the booting stage.
> 
> 
> Ah, I've to fix the SVE support. I'm already working on it and will be part of
> the RFC V4.
> 
> Have you tried booting VM by disabling the SVE support?
> 

I'm able to boot the guest after SVE is disabled by clearing the corresponding
bits in ID_AA64PFR0, as below.

static bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
{
     :

     /*
      * SVE is explicitly disabled. Otherwise, the non-cold-booted
      * CPUs can't be initialized in the vCPU hotplug scenario.
      */
     err = read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64pfr0,
                          ARM64_SYS_REG(3, 0, 0, 4, 0));
     ahcf->isar.id_aa64pfr0 &= ~R_ID_AA64PFR0_SVE_MASK;
}

However, I'm unable to hot-add a vCPU and haven't had a chance to look
at it closely.

(qemu) device_add host-arm-cpu,id=cpu,socket-id=1
(qemu) [  258.901027] Unable to handle kernel write to read-only memory at virtual address ffff800080fa7190
[  258.901686] Mem abort info:
[  258.901889]   ESR = 0x000000009600004e
[  258.902160]   EC = 0x25: DABT (current EL), IL = 32 bits
[  258.902543]   SET = 0, FnV = 0
[  258.902763]   EA = 0, S1PTW = 0
[  258.902991]   FSC = 0x0e: level 2 permission fault
[  258.903338] Data abort info:
[  258.903547]   ISV = 0, ISS = 0x0000004e, ISS2 = 0x00000000
[  258.903943]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[  258.904304]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  258.904687] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000b8e24000
[  258.905258] [ffff800080fa7190] pgd=10000000b95b0003, p4d=10000000b95b0003, pud=10000000b95b1003, pmd=00600000b8c00781
[  258.906026] Internal error: Oops: 000000009600004e [#1] PREEMPT SMP
[  258.906474] Modules linked in:
[  258.906705] CPU: 0 UID: 0 PID: 29 Comm: kworker/u8:1 Not tainted 6.11.0-rc2-gavin-gb446a2dae984 #7
[  258.907338] Hardware name: QEMU KVM Virtual Machine, BIOS edk2-stable202402-prebuilt.qemu.org 02/14/2024
[  258.908009] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  258.908401] pstate: 63400005 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[  258.908899] pc : register_cpu+0x140/0x290
[  258.909195] lr : register_cpu+0x128/0x290
[  258.909487] sp : ffff8000817fba10
[  258.909727] x29: ffff8000817fba10 x28: 0000000000000000 x27: ffff0000011f9098
[  258.910246] x26: ffff80008167b1b0 x25: 0000000000000001 x24: ffff80008153dad0
[  258.910762] x23: 0000000000000001 x22: ffff0000ff7de210 x21: ffff8000811b9a00
[  258.911279] x20: 0000000000000000 x19: ffff800080fa7190 x18: ffffffffffffffff
[  258.911798] x17: 0000000000000000 x16: 0000000000000000 x15: ffff000005a46a1c
[  258.912326] x14: ffffffffffffffff x13: ffff000005a4632b x12: 0000000000000000
[  258.912854] x11: 0000000000000040 x10: 0000000000000000 x9 : ffff8000808a6cd4
[  258.913382] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefefefefefeff
[  258.913906] x5 : ffff0000053fab40 x4 : ffff0000053fa920 x3 : ffff0000053fabb0
[  258.914439] x2 : ffff000000de1100 x1 : ffff800080fa7190 x0 : 0000000000000002
[  258.914968] Call trace:
[  258.915154]  register_cpu+0x140/0x290
[  258.915429]  arch_register_cpu+0x84/0xd8
[  258.915726]  acpi_processor_add+0x480/0x5b0
[  258.916042]  acpi_bus_attach+0x1c4/0x300
[  258.916334]  acpi_dev_for_one_check+0x3c/0x50
[  258.916689]  device_for_each_child+0x68/0xc8
[  258.917012]  acpi_dev_for_each_child+0x48/0x80
[  258.917344]  acpi_bus_attach+0x84/0x300
[  258.917629]  acpi_bus_scan+0x74/0x220
[  258.917902]  acpi_scan_rescan_bus+0x54/0x88
[  258.918211]  acpi_device_hotplug+0x208/0x478
[  258.918529]  acpi_hotplug_work_fn+0x2c/0x50
[  258.918839]  process_one_work+0x15c/0x3c0
[  258.919139]  worker_thread+0x2ec/0x400
[  258.919417]  kthread+0x120/0x130
[  258.919658]  ret_from_fork+0x10/0x20
[  258.919924] Code: 91064021 9ad72000 8b130c33 d503201f (f820327f)
[  258.920373] ---[ end trace 0000000000000000 ]---

Thanks,
Gavin





* Re: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-08  0:29         ` Gavin Shan
@ 2024-08-08  4:15           ` Gavin Shan
  2024-08-08  8:39             ` Salil Mehta via
  2024-08-08  8:36           ` Salil Mehta via
  1 sibling, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-08  4:15 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Salil,

On 8/8/24 10:29 AM, Gavin Shan wrote:
> On 8/8/24 9:48 AM, Salil Mehta wrote:
> 
> However, I'm unable to hot-add a vCPU and haven't get a chance to look
> at it closely.
> 
> (qemu) device_add host-arm-cpu,id=cpu,socket-id=1
> (qemu) [  258.901027] Unable to handle kernel write to read-only memory at virtual address ffff800080fa7190
> [  258.901686] Mem abort info:
> [  258.901889]   ESR = 0x000000009600004e
> [  258.902160]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  258.902543]   SET = 0, FnV = 0
> [  258.902763]   EA = 0, S1PTW = 0
> [  258.902991]   FSC = 0x0e: level 2 permission fault
> [  258.903338] Data abort info:
> [  258.903547]   ISV = 0, ISS = 0x0000004e, ISS2 = 0x00000000
> [  258.903943]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> [  258.904304]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [  258.904687] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000b8e24000
> [  258.905258] [ffff800080fa7190] pgd=10000000b95b0003, p4d=10000000b95b0003, pud=10000000b95b1003, pmd=00600000b8c00781
> [  258.906026] Internal error: Oops: 000000009600004e [#1] PREEMPT SMP
> [  258.906474] Modules linked in:
> [  258.906705] CPU: 0 UID: 0 PID: 29 Comm: kworker/u8:1 Not tainted 6.11.0-rc2-gavin-gb446a2dae984 #7
> [  258.907338] Hardware name: QEMU KVM Virtual Machine, BIOS edk2-stable202402-prebuilt.qemu.org 02/14/2024
> [  258.908009] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> [  258.908401] pstate: 63400005 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
> [  258.908899] pc : register_cpu+0x140/0x290
> [  258.909195] lr : register_cpu+0x128/0x290
> [  258.909487] sp : ffff8000817fba10
> [  258.909727] x29: ffff8000817fba10 x28: 0000000000000000 x27: ffff0000011f9098
> [  258.910246] x26: ffff80008167b1b0 x25: 0000000000000001 x24: ffff80008153dad0
> [  258.910762] x23: 0000000000000001 x22: ffff0000ff7de210 x21: ffff8000811b9a00
> [  258.911279] x20: 0000000000000000 x19: ffff800080fa7190 x18: ffffffffffffffff
> [  258.911798] x17: 0000000000000000 x16: 0000000000000000 x15: ffff000005a46a1c
> [  258.912326] x14: ffffffffffffffff x13: ffff000005a4632b x12: 0000000000000000
> [  258.912854] x11: 0000000000000040 x10: 0000000000000000 x9 : ffff8000808a6cd4
> [  258.913382] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefefefefefeff
> [  258.913906] x5 : ffff0000053fab40 x4 : ffff0000053fa920 x3 : ffff0000053fabb0
> [  258.914439] x2 : ffff000000de1100 x1 : ffff800080fa7190 x0 : 0000000000000002
> [  258.914968] Call trace:
> [  258.915154]  register_cpu+0x140/0x290
> [  258.915429]  arch_register_cpu+0x84/0xd8
> [  258.915726]  acpi_processor_add+0x480/0x5b0
> [  258.916042]  acpi_bus_attach+0x1c4/0x300
> [  258.916334]  acpi_dev_for_one_check+0x3c/0x50
> [  258.916689]  device_for_each_child+0x68/0xc8
> [  258.917012]  acpi_dev_for_each_child+0x48/0x80
> [  258.917344]  acpi_bus_attach+0x84/0x300
> [  258.917629]  acpi_bus_scan+0x74/0x220
> [  258.917902]  acpi_scan_rescan_bus+0x54/0x88
> [  258.918211]  acpi_device_hotplug+0x208/0x478
> [  258.918529]  acpi_hotplug_work_fn+0x2c/0x50
> [  258.918839]  process_one_work+0x15c/0x3c0
> [  258.919139]  worker_thread+0x2ec/0x400
> [  258.919417]  kthread+0x120/0x130
> [  258.919658]  ret_from_fork+0x10/0x20
> [  258.919924] Code: 91064021 9ad72000 8b130c33 d503201f (f820327f)
> [  258.920373] ---[ end trace 0000000000000000 ]---
> 

The fix [1] is needed by the guest kernel. With it, I'm able to hot-add and
hot-remove a vCPU successfully.

[1] https://lkml.org/lkml/2024/8/8/155
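[Editor's note] The hot-add / hot-remove pair exercised in this thread, for
reference. The HMP commands are the ones shown in the logs above; the QMP JSON
equivalents below are a hedged sketch (the id and socket-id values are
illustrative, not taken from this thread):

```shell
#!/bin/sh
# HMP, typed at the "(qemu)" monitor prompt:
#   device_add host-arm-cpu,id=cpu1,socket-id=1
#   device_del cpu1
#
# The same pair expressed as QMP commands:
add_cmd='{"execute":"device_add","arguments":{"driver":"host-arm-cpu","id":"cpu1","socket-id":1}}'
del_cmd='{"execute":"device_del","arguments":{"id":"cpu1"}}'
printf '%s\n%s\n' "$add_cmd" "$del_cmd"
```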

Thanks,
Gavin




* Re: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-07 16:07     ` Salil Mehta via
@ 2024-08-08  5:00       ` Gavin Shan
  0 siblings, 0 replies; 105+ messages in thread
From: Gavin Shan @ 2024-08-08  5:00 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Salil,

On 8/8/24 2:07 AM, Salil Mehta wrote:
> I tested ARM arch specific patches with the latest Qemu which contains below mentioned
> fix and I cannot reproduce the crash. I used kernel linux-6.11-rc2 and it booted successfully.
> Though I did see a kernel crash on attempting to hotplug first vCPU.
> 
> (qemu) device_add host-arm-cpu,id=core4,core-id=4
> (qemu) [  365.125477] Unable to handle kernel write to read-only memory at virtual address ffff800081ba4190
> [  365.126366] Mem abort info:
> [  365.126640]   ESR = 0x000000009600004e
> [  365.127010]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  365.127524]   SET = 0, FnV = 0
> [  365.127822]   EA = 0, S1PTW = 0
> [  365.128130]   FSC = 0x0e: level 2 permission fault
> [  365.128598] Data abort info:
> [  365.128881]   ISV = 0, ISS = 0x0000004e, ISS2 = 0x00000000
> [  365.129447]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> [  365.129943]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [  365.130442] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000045830000
> [  365.131068] [ffff800081ba4190] pgd=0000000000000000, p4d=10000000467df003, pud=10000000467e0003, pmd=0060000045600781
> [  365.132069] Internal error: Oops: 000000009600004e [#1] PREEMPT SMP
> [  365.132661] Modules linked in:
> [  365.132952] CPU: 0 UID: 0 PID: 11 Comm: kworker/u24:0 Not tainted 6.11.0-rc2 #228
> [  365.133699] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> [  365.134415] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> [  365.134969] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  365.135679] pc : register_cpu+0x138/0x250
> [  365.136093] lr : register_cpu+0x120/0x250
> [  365.136506] sp : ffff800082cbba10
> [  365.136847] x29: ffff800082cbba10 x28: ffff8000826479c0 x27: ffff000000a7e098
> [  365.137575] x26: ffff8000827c2838 x25: 0000000000000004 x24: ffff80008264d9b0
> [  365.138311] x23: 0000000000000004 x22: ffff000012a482d0 x21: ffff800081e30a00
> [  365.139037] x20: 0000000000000000 x19: ffff800081ba4190 x18: ffffffffffffffff
> [  365.139764] x17: 0000000000000000 x16: 0000000000000000 x15: ffff000001adaa1c
> [  365.140490] x14: ffffffffffffffff x13: ffff000001ada2e0 x12: 0000000000000000
> [  365.141216] x11: ffff800081e32780 x10: 0000000000000000 x9 : 0000000000000001
> [  365.141945] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : 6f7274726e737460
> [  365.142668] x5 : ffff0000027b1920 x4 : ffff0000027b1b40 x3 : ffff0000027b1880
> [  365.143400] x2 : ffff0000001933c0 x1 : ffff800081ba4190 x0 : 0000000000000010
> [  365.144129] Call trace:
> [  365.144382]  register_cpu+0x138/0x250
> [  365.144759]  arch_register_cpu+0x7c/0xc4
> [  365.145166]  acpi_processor_add+0x468/0x590
> [  365.145594]  acpi_bus_attach+0x1ac/0x2dc
> [  365.146002]  acpi_dev_for_one_check+0x34/0x40
> [  365.146449]  device_for_each_child+0x5c/0xb0
> [  365.146887]  acpi_dev_for_each_child+0x3c/0x64
> [  365.147341]  acpi_bus_attach+0x78/0x2dc
> [  365.147734]  acpi_bus_scan+0x68/0x208
> [  365.148110]  acpi_scan_rescan_bus+0x4c/0x78
> [  365.148537]  acpi_device_hotplug+0x1f8/0x460
> [  365.148975]  acpi_hotplug_work_fn+0x24/0x3c
> [  365.149402]  process_one_work+0x150/0x294
> [  365.149817]  worker_thread+0x2e4/0x3ec
> [  365.150199]  kthread+0x118/0x11c
> [  365.150536]  ret_from_fork+0x10/0x20
> [  365.150903] Code: 91064021 9ad72000 8b130c33 d503201f (f820327f)
> [  365.151527] ---[ end trace 0000000000000000 ]---
> 

Should be fixed by: https://lkml.org/lkml/2024/8/8/155

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-08  0:29         ` Gavin Shan
  2024-08-08  4:15           ` Gavin Shan
@ 2024-08-08  8:36           ` Salil Mehta via
  1 sibling, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-08-08  8:36 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Thursday, August 8, 2024 1:29 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  Hi Salil,
>  
>  On 8/8/24 9:48 AM, Salil Mehta wrote:
>  >>   On 8/7/24 11:27 PM, Salil Mehta wrote:
>  >>   >
>  >>   > Let me figure out this. Have you also included the below patch along
>  >>   > with the architecture agnostic patch-set accepted in this Qemu cycle?
>  >>   >
>  >>   > https://lore.kernel.org/all/20240801142322.3948866-3-peter.maydell@linaro.org/
>  >>   >
>  >>
>  >>   There are no vCPU fds to be parked and unparked when the core dump
>  >>   happens. I tried it, but it didn't help. I added more debugging messages
>  >>   and the core dump is triggered in the following path. It seems
>  >>   'cpu->sve_vq.map' isn't correct since it's populated in the CPU
>  >>   realization path, and those non-cold-booted CPUs aren't realized in the
>  >>   booting stage.
>  >
>  >
>  > Ah, I have to fix the SVE support. I'm already working on it, and the
>  > fix will be part of RFC V4.
>  >
>  > Have you tried booting the VM with SVE support disabled?
>  >
>  
>  I'm able to boot the guest after SVE is disabled by clearing the
>  corresponding bits in ID_AA64PFR0, as below.
>  
>  static bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
>  {
>       :
>  
>       /*
>        * SVE is explicitly disabled. Otherwise, the non-cold-booted
>        * CPUs can't be initialized in the vCPU hotplug scenario.
>        */
>       err = read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64pfr0,
>                            ARM64_SYS_REG(3, 0, 0, 4, 0));
>       ahcf->isar.id_aa64pfr0 &= ~R_ID_AA64PFR0_SVE_MASK;
>  }
>  
>  However, I'm unable to hot-add a vCPU and haven't got a chance to look at
>  it closely.
>  
>  (qemu) device_add host-arm-cpu,id=cpu,socket-id=1
>  (qemu) [  258.901027] Unable to handle kernel write to read-only memory at virtual address ffff800080fa7190
>  [  258.901686] Mem abort info:
>  [  258.901889]   ESR = 0x000000009600004e
>  [  258.902160]   EC = 0x25: DABT (current EL), IL = 32 bits
>  [  258.902543]   SET = 0, FnV = 0
>  [  258.902763]   EA = 0, S1PTW = 0
>  [  258.902991]   FSC = 0x0e: level 2 permission fault
>  [  258.903338] Data abort info:
>  [  258.903547]   ISV = 0, ISS = 0x0000004e, ISS2 = 0x00000000
>  [  258.903943]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
>  [  258.904304]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>  [  258.904687] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000b8e24000
>  [  258.905258] [ffff800080fa7190] pgd=10000000b95b0003, p4d=10000000b95b0003, pud=10000000b95b1003, pmd=00600000b8c00781
>  [  258.906026] Internal error: Oops: 000000009600004e [#1] PREEMPT SMP
>  [  258.906474] Modules linked in:
>  [  258.906705] CPU: 0 UID: 0 PID: 29 Comm: kworker/u8:1 Not tainted 6.11.0-rc2-gavin-gb446a2dae984 #7
>  [  258.907338] Hardware name: QEMU KVM Virtual Machine, BIOS edk2-stable202402-prebuilt.qemu.org 02/14/2024
>  [  258.908009] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
>  [  258.908401] pstate: 63400005 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
>  [  258.908899] pc : register_cpu+0x140/0x290
>  [  258.909195] lr : register_cpu+0x128/0x290
>  [  258.909487] sp : ffff8000817fba10
>  [  258.909727] x29: ffff8000817fba10 x28: 0000000000000000 x27: ffff0000011f9098
>  [  258.910246] x26: ffff80008167b1b0 x25: 0000000000000001 x24: ffff80008153dad0
>  [  258.910762] x23: 0000000000000001 x22: ffff0000ff7de210 x21: ffff8000811b9a00
>  [  258.911279] x20: 0000000000000000 x19: ffff800080fa7190 x18: ffffffffffffffff
>  [  258.911798] x17: 0000000000000000 x16: 0000000000000000 x15: ffff000005a46a1c
>  [  258.912326] x14: ffffffffffffffff x13: ffff000005a4632b x12: 0000000000000000
>  [  258.912854] x11: 0000000000000040 x10: 0000000000000000 x9 : ffff8000808a6cd4
>  [  258.913382] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefefefefefeff
>  [  258.913906] x5 : ffff0000053fab40 x4 : ffff0000053fa920 x3 : ffff0000053fabb0
>  [  258.914439] x2 : ffff000000de1100 x1 : ffff800080fa7190 x0 : 0000000000000002
>  [  258.914968] Call trace:
>  [  258.915154]  register_cpu+0x140/0x290
>  [  258.915429]  arch_register_cpu+0x84/0xd8
>  [  258.915726]  acpi_processor_add+0x480/0x5b0
>  [  258.916042]  acpi_bus_attach+0x1c4/0x300
>  [  258.916334]  acpi_dev_for_one_check+0x3c/0x50
>  [  258.916689]  device_for_each_child+0x68/0xc8
>  [  258.917012]  acpi_dev_for_each_child+0x48/0x80
>  [  258.917344]  acpi_bus_attach+0x84/0x300
>  [  258.917629]  acpi_bus_scan+0x74/0x220
>  [  258.917902]  acpi_scan_rescan_bus+0x54/0x88
>  [  258.918211]  acpi_device_hotplug+0x208/0x478
>  [  258.918529]  acpi_hotplug_work_fn+0x2c/0x50
>  [  258.918839]  process_one_work+0x15c/0x3c0
>  [  258.919139]  worker_thread+0x2ec/0x400
>  [  258.919417]  kthread+0x120/0x130
>  [  258.919658]  ret_from_fork+0x10/0x20
>  [  258.919924] Code: 91064021 9ad72000 8b130c33 d503201f (f820327f)
>  [  258.920373] ---[ end trace 0000000000000000 ]---


Yes, this crash. Thanks for confirming!


>  
>  Thanks,
>  Gavin
>  
>  


^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-08  4:15           ` Gavin Shan
@ 2024-08-08  8:39             ` Salil Mehta via
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-08-08  8:39 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Thursday, August 8, 2024 5:15 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  Hi Salil,
>  
>  On 8/8/24 10:29 AM, Gavin Shan wrote:
>  > On 8/8/24 9:48 AM, Salil Mehta wrote:
>  >
>  > However, I'm unable to hot-add a vCPU and haven't got a chance to look
>  > at it closely.
>  >
>  > (qemu) device_add host-arm-cpu,id=cpu,socket-id=1
>  > (qemu) [  258.901027] Unable to handle kernel write to read-only memory at virtual address ffff800080fa7190
>  > [  258.901686] Mem abort info:
>  > [  258.901889]   ESR = 0x000000009600004e
>  > [  258.902160]   EC = 0x25: DABT (current EL), IL = 32 bits
>  > [  258.902543]   SET = 0, FnV = 0
>  > [  258.902763]   EA = 0, S1PTW = 0
>  > [  258.902991]   FSC = 0x0e: level 2 permission fault
>  > [  258.903338] Data abort info:
>  > [  258.903547]   ISV = 0, ISS = 0x0000004e, ISS2 = 0x00000000
>  > [  258.903943]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
>  > [  258.904304]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>  > [  258.904687] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000b8e24000
>  > [  258.905258] [ffff800080fa7190] pgd=10000000b95b0003, p4d=10000000b95b0003, pud=10000000b95b1003, pmd=00600000b8c00781
>  > [  258.906026] Internal error: Oops: 000000009600004e [#1] PREEMPT SMP
>  > [  258.906474] Modules linked in:
>  > [  258.906705] CPU: 0 UID: 0 PID: 29 Comm: kworker/u8:1 Not tainted 6.11.0-rc2-gavin-gb446a2dae984 #7
>  > [  258.907338] Hardware name: QEMU KVM Virtual Machine, BIOS edk2-stable202402-prebuilt.qemu.org 02/14/2024
>  > [  258.908009] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
>  > [  258.908401] pstate: 63400005 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
>  > [  258.908899] pc : register_cpu+0x140/0x290
>  > [  258.909195] lr : register_cpu+0x128/0x290
>  > [  258.909487] sp : ffff8000817fba10
>  > [  258.909727] x29: ffff8000817fba10 x28: 0000000000000000 x27: ffff0000011f9098
>  > [  258.910246] x26: ffff80008167b1b0 x25: 0000000000000001 x24: ffff80008153dad0
>  > [  258.910762] x23: 0000000000000001 x22: ffff0000ff7de210 x21: ffff8000811b9a00
>  > [  258.911279] x20: 0000000000000000 x19: ffff800080fa7190 x18: ffffffffffffffff
>  > [  258.911798] x17: 0000000000000000 x16: 0000000000000000 x15: ffff000005a46a1c
>  > [  258.912326] x14: ffffffffffffffff x13: ffff000005a4632b x12: 0000000000000000
>  > [  258.912854] x11: 0000000000000040 x10: 0000000000000000 x9 : ffff8000808a6cd4
>  > [  258.913382] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefefefefefeff
>  > [  258.913906] x5 : ffff0000053fab40 x4 : ffff0000053fa920 x3 : ffff0000053fabb0
>  > [  258.914439] x2 : ffff000000de1100 x1 : ffff800080fa7190 x0 : 0000000000000002
>  > [  258.914968] Call trace:
>  > [  258.915154]  register_cpu+0x140/0x290
>  > [  258.915429]  arch_register_cpu+0x84/0xd8
>  > [  258.915726]  acpi_processor_add+0x480/0x5b0
>  > [  258.916042]  acpi_bus_attach+0x1c4/0x300
>  > [  258.916334]  acpi_dev_for_one_check+0x3c/0x50
>  > [  258.916689]  device_for_each_child+0x68/0xc8
>  > [  258.917012]  acpi_dev_for_each_child+0x48/0x80
>  > [  258.917344]  acpi_bus_attach+0x84/0x300
>  > [  258.917629]  acpi_bus_scan+0x74/0x220
>  > [  258.917902]  acpi_scan_rescan_bus+0x54/0x88
>  > [  258.918211]  acpi_device_hotplug+0x208/0x478
>  > [  258.918529]  acpi_hotplug_work_fn+0x2c/0x50
>  > [  258.918839]  process_one_work+0x15c/0x3c0
>  > [  258.919139]  worker_thread+0x2ec/0x400
>  > [  258.919417]  kthread+0x120/0x130
>  > [  258.919658]  ret_from_fork+0x10/0x20
>  > [  258.919924] Code: 91064021 9ad72000 8b130c33 d503201f (f820327f)
>  > [  258.920373] ---[ end trace 0000000000000000 ]---
>  >
>  
>  The fix [1] is needed by the guest kernel. With this, I'm able to hot add vCPU
>  and hot remove vCPU successfully.
>  
>  [1] https://lkml.org/lkml/2024/8/8/155


Good catch in the kernel. And many thanks for fixing as well.


>  
>  Thanks,
>  Gavin
>  


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2024-06-13 23:36 ` [PATCH RFC V3 01/29] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property Salil Mehta via
@ 2024-08-12  4:35   ` Gavin Shan
  2024-08-12  8:15     ` Igor Mammedov
  2024-08-19 11:53     ` Salil Mehta via
  0 siblings, 2 replies; 105+ messages in thread
From: Gavin Shan @ 2024-08-12  4:35 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On 6/14/24 9:36 AM, Salil Mehta wrote:
> This shall be used to store user specified topology{socket,cluster,core,thread}
> and shall be converted to a unique 'vcpu-id' which is used as slot-index during
> hot(un)plug of vCPU.
> 
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c         | 10 ++++++++++
>   include/hw/arm/virt.h | 28 ++++++++++++++++++++++++++++
>   target/arm/cpu.c      |  4 ++++
>   target/arm/cpu.h      |  4 ++++
>   4 files changed, 46 insertions(+)
> 

Those 4 properties are introduced to determine the vCPU's slot, which is the index
into MachineState::possible_cpus::cpus[]. From there, the CPU object or instance is
referenced and then the CPU's state can be further determined. It sounds reasonable
to use the CPU's topology to determine the index. However, I'm wondering if this can
be simplified to a single 'cpu-index' or 'index' property, for a couple of reasons:
(1) 'cpu-index' or 'index' is simpler. Users otherwise have to provide 4 parameters
to determine the index in the extreme case, for example "device_add host-arm-cpu,
id=cpu7,socket-id=1,cluster-id=1,core-id=1,thread-id=1". With 'cpu-index' or
'index', this can be simplified to 'index=7'. (2) The cold-booted and hotpluggable
CPUs are determined by their index instead of their topology. For example, CPU0/1/2/3
are cold-booted CPUs while CPU4/5/6/7 are hotpluggable CPUs with the command line
'-smp maxcpus=8,cpus=4'. So 'index' makes more sense to identify a vCPU's slot.
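
For illustration only, the relationship between a topology tuple and a single
linear index can be sketched as below. The struct, helper names, and row-major
layout here are hypothetical stand-ins, not QEMU's actual implementation; the
real mapping depends on how -smp distributes CPUs.

```c
#include <assert.h>

/*
 * Hypothetical sketch: convert a (socket, cluster, core, thread) tuple to a
 * linear slot index in row-major order, and back again. Illustrative only.
 */
struct topo {
    int clusters;   /* clusters per socket */
    int cores;      /* cores per cluster */
    int threads;    /* threads per core */
};

static int topo_to_index(const struct topo *t,
                         int socket, int cluster, int core, int thread)
{
    return ((socket * t->clusters + cluster) * t->cores + core)
           * t->threads + thread;
}

static void index_to_topo(const struct topo *t, int index,
                          int *socket, int *cluster, int *core, int *thread)
{
    *thread  = index % t->threads; index /= t->threads;
    *core    = index % t->cores;   index /= t->cores;
    *cluster = index % t->clusters;
    *socket  = index / t->clusters;
}
```

With a 2-socket, 2-cluster, 2-core, 1-thread layout, the tuple (1, 1, 1, 0)
collapses to index 7, which is what a single 'index=7' property would express.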

> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 3c93c0c0a6..11fc7fc318 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2215,6 +2215,14 @@ static void machvirt_init(MachineState *machine)
>                             &error_fatal);
>   
>           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
> +        object_property_set_int(cpuobj, "socket-id",
> +                                virt_get_socket_id(machine, n), NULL);
> +        object_property_set_int(cpuobj, "cluster-id",
> +                                virt_get_cluster_id(machine, n), NULL);
> +        object_property_set_int(cpuobj, "core-id",
> +                                virt_get_core_id(machine, n), NULL);
> +        object_property_set_int(cpuobj, "thread-id",
> +                                virt_get_thread_id(machine, n), NULL);
>   
>           if (!vms->secure) {
>               object_property_set_bool(cpuobj, "has_el3", false, NULL);
> @@ -2708,6 +2716,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>   {
>       int n;
>       unsigned int max_cpus = ms->smp.max_cpus;
> +    unsigned int smp_threads = ms->smp.threads;
>       VirtMachineState *vms = VIRT_MACHINE(ms);
>       MachineClass *mc = MACHINE_GET_CLASS(vms);
>   
> @@ -2721,6 +2730,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>       ms->possible_cpus->len = max_cpus;
>       for (n = 0; n < ms->possible_cpus->len; n++) {
>           ms->possible_cpus->cpus[n].type = ms->cpu_type;
> +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
>           ms->possible_cpus->cpus[n].arch_id =
>               virt_cpu_mp_affinity(vms, n);
>   

Why is @vcpus_count initialized to @smp_threads? It needs to be documented in
the commit log.

> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index bb486d36b1..6f9a7bb60b 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -209,4 +209,32 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
>               vms->highmem_redists) ? 2 : 1;
>   }
>   
> +static inline int virt_get_socket_id(const MachineState *ms, int cpu_index)
> +{
> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> +
> +    return ms->possible_cpus->cpus[cpu_index].props.socket_id;
> +}
> +
> +static inline int virt_get_cluster_id(const MachineState *ms, int cpu_index)
> +{
> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> +
> +    return ms->possible_cpus->cpus[cpu_index].props.cluster_id;
> +}
> +
> +static inline int virt_get_core_id(const MachineState *ms, int cpu_index)
> +{
> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> +
> +    return ms->possible_cpus->cpus[cpu_index].props.core_id;
> +}
> +
> +static inline int virt_get_thread_id(const MachineState *ms, int cpu_index)
> +{
> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> +
> +    return ms->possible_cpus->cpus[cpu_index].props.thread_id;
> +}
> +
>   #endif /* QEMU_ARM_VIRT_H */
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 77f8c9c748..abc4ed0842 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -2582,6 +2582,10 @@ static Property arm_cpu_properties[] = {
>       DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
>                           mp_affinity, ARM64_AFFINITY_INVALID),
>       DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
> +    DEFINE_PROP_INT32("socket-id", ARMCPU, socket_id, 0),
> +    DEFINE_PROP_INT32("cluster-id", ARMCPU, cluster_id, 0),
> +    DEFINE_PROP_INT32("core-id", ARMCPU, core_id, 0),
> +    DEFINE_PROP_INT32("thread-id", ARMCPU, thread_id, 0),
>       DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1),
>       /* True to default to the backward-compat old CNTFRQ rather than 1Ghz */
>       DEFINE_PROP_BOOL("backcompat-cntfrq", ARMCPU, backcompat_cntfrq, false),
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index c17264c239..208c719db3 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -1076,6 +1076,10 @@ struct ArchCPU {
>       QLIST_HEAD(, ARMELChangeHook) el_change_hooks;
>   
>       int32_t node_id; /* NUMA node this CPU belongs to */
> +    int32_t socket_id;
> +    int32_t cluster_id;
> +    int32_t core_id;
> +    int32_t thread_id;
>   
>       /* Used to synchronize KVM and QEMU in-kernel device levels */
>       uint8_t device_irq_level;

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 02/29] cpu-common: Add common CPU utility for possible vCPUs
  2024-06-13 23:36 ` [PATCH RFC V3 02/29] cpu-common: Add common CPU utility for possible vCPUs Salil Mehta via
  2024-07-04  3:12   ` Nicholas Piggin
@ 2024-08-12  4:59   ` Gavin Shan
  2024-08-12  5:41     ` Re: " liu ping
  1 sibling, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-12  4:59 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On 6/14/24 9:36 AM, Salil Mehta wrote:
> This patch adds various utility functions that may be required to fetch or check
> the state of possible vCPUs. It also introduces the concept of *disabled* vCPUs,
> which are part of the *possible* vCPUs but are not enabled. This state will be
> used during machine initialization and later during the plugging or unplugging
> of vCPUs. We release the QOM CPU objects for all disabled vCPUs.
> 
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   cpu-common.c          | 31 +++++++++++++++++++++++++
>   include/hw/core/cpu.h | 54 +++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 85 insertions(+)
> 
> diff --git a/cpu-common.c b/cpu-common.c
> index ce78273af5..49d2a50835 100644
> --- a/cpu-common.c
> +++ b/cpu-common.c
> @@ -24,6 +24,7 @@
>   #include "sysemu/cpus.h"
>   #include "qemu/lockable.h"
>   #include "trace/trace-root.h"
> +#include "hw/boards.h"
>   
>   QemuMutex qemu_cpu_list_lock;
>   static QemuCond exclusive_cond;
> @@ -107,6 +108,36 @@ void cpu_list_remove(CPUState *cpu)
>       cpu_list_generation_id++;
>   }
>   
> +CPUState *qemu_get_possible_cpu(int index)
> +{
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
> +
> +    assert((index >= 0) && (index < possible_cpus->len));
> +
> +    return CPU(possible_cpus->cpus[index].cpu);
> +}
> +
> +bool qemu_present_cpu(CPUState *cpu)
> +{
> +    return cpu;
> +}
> +
> +bool qemu_enabled_cpu(CPUState *cpu)
> +{
> +    return cpu && !cpu->disabled;
> +}
> +
> +uint64_t qemu_get_cpu_archid(int cpu_index)
> +{
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
> +
> +    assert((cpu_index >= 0) && (cpu_index < possible_cpus->len));
> +
> +    return possible_cpus->cpus[cpu_index].arch_id;
> +}
> +
>   CPUState *qemu_get_cpu(int index)
>   {
>       CPUState *cpu;
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 60b160d0b4..60b4778da9 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -528,6 +528,18 @@ struct CPUState {
>       CPUPluginState *plugin_state;
>   #endif
>   
> +    /*
> +     * Some architectures do not allow the *presence* of vCPUs to be changed
> +     * after the guest has booted, based on information specified by the
> +     * VMM/firmware via ACPI MADT at boot time. Thus, to enable vCPU hotplug on
> +     * these architectures, possible vCPUs can have a CPUState object in a
> +     * 'disabled' state or may not have a CPUState object at all. This is
> +     * possible when vCPU hotplug is supported, and vCPUs are
> +     * 'yet-to-be-plugged' in the QOM or have been hot-unplugged. By default,
> +     * every CPUState is enabled across all architectures.
> +     */
> +    bool disabled;
> +

The information needed to determine a vCPU's state is distributed across two data
structs: MachineState::possible_cpus::cpus[] and CPUState. Why not just maintain
the vCPU's state in MachineState::possible_cpus::cpus[]? For example, by adding a
new field, or something similar, to 'struct CPUArchId' as below.

typedef struct CPUArchId {
#define CPU_ARCH_ID_FLAG_PRESENT	(1 << 0)
#define CPU_ARCH_ID_FLAG_ENABLED        (1 << 1)
     uint32_t flags;
        :
} CPUArchId;

To determine whether a CPUState instance is enabled, CPUState::cpu_index can be
used as the index into MachineState::possible_cpus::cpus[], and the flags parsed
to determine the vCPU's state.
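
A minimal sketch of the suggestion above follows; the struct, flag names, and
helpers are illustrative placeholders (QEMU's CPUArchId carries more fields),
showing only how a per-slot flags word could answer the present/enabled queries
from cpu_index alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the proposed flags in 'struct CPUArchId'. */
#define CPU_ARCH_ID_FLAG_PRESENT (1u << 0)
#define CPU_ARCH_ID_FLAG_ENABLED (1u << 1)

typedef struct CPUArchIdSketch {
    uint32_t flags;
} CPUArchIdSketch;

/* A vCPU slot is present if its PRESENT flag is set. */
static bool slot_cpu_present(const CPUArchIdSketch *cpus, int cpu_index)
{
    return cpus[cpu_index].flags & CPU_ARCH_ID_FLAG_PRESENT;
}

/* A vCPU slot is enabled only if it is also present. */
static bool slot_cpu_enabled(const CPUArchIdSketch *cpus, int cpu_index)
{
    uint32_t mask = CPU_ARCH_ID_FLAG_PRESENT | CPU_ARCH_ID_FLAG_ENABLED;
    return (cpus[cpu_index].flags & mask) == mask;
}
```

With this shape, hot-unplug would clear ENABLED (and possibly PRESENT) in one
place, instead of spreading state between CPUState and the possible_cpus array.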

>       /* TODO Move common fields from CPUArchState here. */
>       int cpu_index;
>       int cluster_index;
> @@ -914,6 +926,48 @@ static inline bool cpu_in_exclusive_context(const CPUState *cpu)
>    */
>   CPUState *qemu_get_cpu(int index);
>   
> +/**
> + * qemu_get_possible_cpu:
> + * @index: The CPUState@cpu_index value of the CPU to obtain.
> + *         Input index MUST be in range [0, Max Possible CPUs)
> + *
> + * If CPUState object exists,then it gets a CPU matching
> + * @index in the possible CPU array.
> + *
> + * Returns: The possible CPU or %NULL if CPU does not exist.
> + */
> +CPUState *qemu_get_possible_cpu(int index);
> +
> +/**
> + * qemu_present_cpu:
> + * @cpu: The vCPU to check
> + *
> + * Checks if the vCPU is amongst the present possible vcpus.
> + *
> + * Returns: True if it is present possible vCPU else false
> + */
> +bool qemu_present_cpu(CPUState *cpu);
> +
> +/**
> + * qemu_enabled_cpu:
> + * @cpu: The vCPU to check
> + *
> + * Checks if the vCPU is enabled.
> + *
> + * Returns: True if it is 'enabled' else false
> + */
> +bool qemu_enabled_cpu(CPUState *cpu);
> +
> +/**
> + * qemu_get_cpu_archid:
> + * @cpu_index: possible vCPU for which arch-id needs to be retreived
> + *
> + * Fetches the vCPU arch-id from the present possible vCPUs.
> + *
> + * Returns: arch-id of the possible vCPU
> + */
> +uint64_t qemu_get_cpu_archid(int cpu_index);
> +
>   /**
>    * cpu_exists:
>    * @id: Guest-exposed CPU ID to lookup.

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 03/29] hw/arm/virt: Limit number of possible vCPUs for unsupported Accel or GIC Type
  2024-06-13 23:36 ` [PATCH RFC V3 03/29] hw/arm/virt: Limit number of possible vCPUs for unsupported Accel or GIC Type Salil Mehta via
@ 2024-08-12  5:09   ` Gavin Shan
  0 siblings, 0 replies; 105+ messages in thread
From: Gavin Shan @ 2024-08-12  5:09 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On 6/14/24 9:36 AM, Salil Mehta wrote:
> If Virtual CPU Hotplug support does not exist on a particular Accel platform or
> ARM GIC version, we should limit the possible vCPUs to those available during
> boot time (i.e SMP CPUs) and explicitly disable Virtual CPU Hotplug support.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c | 66 +++++++++++++++++++++++++++++----------------------
>   1 file changed, 38 insertions(+), 28 deletions(-)
> 

Most of the code changes just move the check of @max_cpus against
@virt_max_cpus. It would make the review easier if the code movement were put
into a preparatory patch, if you agree.

Thanks,
Gavin

> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 11fc7fc318..3e1c4d2d2f 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2082,8 +2082,6 @@ static void machvirt_init(MachineState *machine)
>       unsigned int smp_cpus = machine->smp.cpus;
>       unsigned int max_cpus = machine->smp.max_cpus;
>   
> -    possible_cpus = mc->possible_cpu_arch_ids(machine);
> -
>       /*
>        * In accelerated mode, the memory map is computed earlier in kvm_type()
>        * to create a VM with the right number of IPA bits.
> @@ -2098,7 +2096,7 @@ static void machvirt_init(MachineState *machine)
>            * we are about to deal with. Once this is done, get rid of
>            * the object.
>            */
> -        cpuobj = object_new(possible_cpus->cpus[0].type);
> +        cpuobj = object_new(machine->cpu_type);
>           armcpu = ARM_CPU(cpuobj);
>   
>           pa_bits = arm_pamax(armcpu);
> @@ -2113,6 +2111,43 @@ static void machvirt_init(MachineState *machine)
>        */
>       finalize_gic_version(vms);
>   
> +    /*
> +     * The maximum number of CPUs depends on the GIC version, or on how
> +     * many redistributors we can fit into the memory map (which in turn
> +     * depends on whether this is a GICv3 or v4).
> +     */
> +    if (vms->gic_version == VIRT_GIC_VERSION_2) {
> +        virt_max_cpus = GIC_NCPU;
> +    } else {
> +        virt_max_cpus = virt_redist_capacity(vms, VIRT_GIC_REDIST);
> +        if (vms->highmem_redists) {
> +            virt_max_cpus += virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
> +        }
> +    }
> +
> +    if (tcg_enabled() || hvf_enabled() || qtest_enabled() ||
> +        (vms->gic_version < VIRT_GIC_VERSION_3)) {
> +        max_cpus = machine->smp.max_cpus = smp_cpus;
> +        mc->has_hotpluggable_cpus = false;
> +        if (vms->gic_version >= VIRT_GIC_VERSION_3) {
> +            warn_report("cpu hotplug feature has been disabled");
> +        }
> +    }
> +
> +    if (max_cpus > virt_max_cpus) {
> +        error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
> +                     "supported by machine 'mach-virt' (%d)",
> +                     max_cpus, virt_max_cpus);
> +        if (vms->gic_version != VIRT_GIC_VERSION_2 && !vms->highmem_redists) {
> +            error_printf("Try 'highmem-redists=on' for more CPUs\n");
> +        }
> +
> +        exit(1);
> +    }
> +
> +    /* uses smp.max_cpus to initialize all possible vCPUs */
> +    possible_cpus = mc->possible_cpu_arch_ids(machine);
> +
>       if (vms->secure) {
>           /*
>            * The Secure view of the world is the same as the NonSecure,
> @@ -2147,31 +2182,6 @@ static void machvirt_init(MachineState *machine)
>           vms->psci_conduit = QEMU_PSCI_CONDUIT_HVC;
>       }
>   
> -    /*
> -     * The maximum number of CPUs depends on the GIC version, or on how
> -     * many redistributors we can fit into the memory map (which in turn
> -     * depends on whether this is a GICv3 or v4).
> -     */
> -    if (vms->gic_version == VIRT_GIC_VERSION_2) {
> -        virt_max_cpus = GIC_NCPU;
> -    } else {
> -        virt_max_cpus = virt_redist_capacity(vms, VIRT_GIC_REDIST);
> -        if (vms->highmem_redists) {
> -            virt_max_cpus += virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
> -        }
> -    }
> -
> -    if (max_cpus > virt_max_cpus) {
> -        error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
> -                     "supported by machine 'mach-virt' (%d)",
> -                     max_cpus, virt_max_cpus);
> -        if (vms->gic_version != VIRT_GIC_VERSION_2 && !vms->highmem_redists) {
> -            error_printf("Try 'highmem-redists=on' for more CPUs\n");
> -        }
> -
> -        exit(1);
> -    }
> -
>       if (vms->secure && (kvm_enabled() || hvf_enabled())) {
>           error_report("mach-virt: %s does not support providing "
>                        "Security extensions (TrustZone) to the guest CPU",



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 04/29] hw/arm/virt: Move setting of common CPU properties in a function
  2024-06-13 23:36 ` [PATCH RFC V3 04/29] hw/arm/virt: Move setting of common CPU properties in a function Salil Mehta via
@ 2024-08-12  5:19   ` Gavin Shan
  0 siblings, 0 replies; 105+ messages in thread
From: Gavin Shan @ 2024-08-12  5:19 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On 6/14/24 9:36 AM, Salil Mehta wrote:
> Factor out CPU properties code common for {hot,cold}-plugged CPUs. This allows
> code reuse.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c         | 261 ++++++++++++++++++++++++++++--------------
>   include/hw/arm/virt.h |   4 +
>   2 files changed, 182 insertions(+), 83 deletions(-)
> 

If this series is going to be split for easier review, this is a candidate
for a preparatory patch.

> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 3e1c4d2d2f..2e0ec7d869 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1753,6 +1753,46 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
>       return arm_build_mp_affinity(idx, clustersz);
>   }
>   
> +static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(ms);
> +    CPUArchId *found_cpu;
> +    uint64_t mp_affinity;
> +
> +    assert(vcpuid >= 0 && vcpuid < ms->possible_cpus->len);
> +
> +    mp_affinity = virt_cpu_mp_affinity(vms, vcpuid);
> +    found_cpu = &ms->possible_cpus->cpus[vcpuid];
> +
> +    assert(found_cpu->arch_id == mp_affinity);
> +
> +    /*
> +     * RFC: Question:
> +     * Slot-id is the index where vCPU with certain arch-id(=mpidr/ap-affinity)
> +     * is plugged. For Host KVM, MPIDR for vCPU is derived using vcpu-id.
> +     * As I understand, MPIDR and vcpu-id are property of vCPU but slot-id is
> +     * more related to machine? Current code assumes slot-id and vcpu-id are
> +     * same i.e. meaning of slot is bit vague.
> +     *
> +     * Q1: Is there any requirement to clearly represent slot and dissociate it
> +     *     from vcpu-id?
> +     * Q2: Should we make MPIDR within host KVM user configurable?
> +     *
> +     *          +----+----+----+----+----+----+----+----+
> +     * MPIDR    |||  Res  |   Aff2  |   Aff1  |  Aff0   |
> +     *          +----+----+----+----+----+----+----+----+
> +     *                     \         \         \   |    |
> +     *                      \   8bit  \   8bit  \  |4bit|
> +     *                       \<------->\<------->\ |<-->|
> +     *                        \         \         \|    |
> +     *          +----+----+----+----+----+----+----+----+
> +     * VCPU-ID  |  Byte4  |  Byte2  |  Byte1  |  Byte0  |
> +     *          +----+----+----+----+----+----+----+----+
> +     */
> +
> +    return found_cpu;
> +}
> +

I don't see why virt_find_cpu_slot() is needed. Apart from the sanity check, it
basically returns ms->possible_cpus->cpus[]. The caller can dereference
ms->possible_cpus->cpus[] directly. As for the sanity check, the mp_affinity
has already been properly populated in virt_possible_cpu_arch_ids(), so I don't
see why it needs to be checked again.

>   static inline bool *virt_get_high_memmap_enabled(VirtMachineState *vms,
>                                                    int index)
>   {
> @@ -2065,16 +2105,129 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
>       }
>   }
>   
> +static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId *cpu_slot,
> +                                    Error **errp)
> +{
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    VirtMachineState *vms = VIRT_MACHINE(ms);
> +    Error *local_err = NULL;
> +    VirtMachineClass *vmc;
> +
> +    vmc = VIRT_MACHINE_GET_CLASS(ms);
> +
> +    /* now, set the cpu object property values */
> +    numa_cpu_pre_plug(cpu_slot, DEVICE(cpuobj), &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    object_property_set_int(cpuobj, "mp-affinity", cpu_slot->arch_id, NULL);
> +
> +    if (!vms->secure) {
> +        object_property_set_bool(cpuobj, "has_el3", false, NULL);
> +    }
> +
> +    if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
> +        object_property_set_bool(cpuobj, "has_el2", false, NULL);
> +    }
> +
> +    if (vmc->kvm_no_adjvtime &&
> +        object_property_find(cpuobj, "kvm-no-adjvtime")) {
> +        object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
> +    }
> +
> +    if (vmc->no_kvm_steal_time &&
> +        object_property_find(cpuobj, "kvm-steal-time")) {
> +        object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
> +    }
> +
> +    if (vmc->no_pmu && object_property_find(cpuobj, "pmu")) {
> +        object_property_set_bool(cpuobj, "pmu", false, NULL);
> +    }
> +
> +    if (vmc->no_tcg_lpa2 && object_property_find(cpuobj, "lpa2")) {
> +        object_property_set_bool(cpuobj, "lpa2", false, NULL);
> +    }
> +
> +    if (object_property_find(cpuobj, "reset-cbar")) {
> +        object_property_set_int(cpuobj, "reset-cbar",
> +                                vms->memmap[VIRT_CPUPERIPHS].base,
> +                                &local_err);
> +        if (local_err) {
> +            goto out;
> +        }
> +    }
> +
> +    /* link already initialized {secure,tag}-memory regions to this cpu */
> +    object_property_set_link(cpuobj, "memory", OBJECT(vms->sysmem), &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    if (vms->secure) {
> +        object_property_set_link(cpuobj, "secure-memory",
> +                                 OBJECT(vms->secure_sysmem), &local_err);
> +        if (local_err) {
> +            goto out;
> +        }
> +    }
> +
> +    if (vms->mte) {
> +        if (!object_property_find(cpuobj, "tag-memory")) {
> +            error_setg(&local_err, "MTE requested, but not supported "
> +                       "by the guest CPU");
> +            if (local_err) {
> +                goto out;
> +            }
> +        }
> +
> +        object_property_set_link(cpuobj, "tag-memory", OBJECT(vms->tag_sysmem),
> +                                 &local_err);
> +        if (local_err) {
> +            goto out;
> +        }
> +
> +        if (vms->secure) {
> +            object_property_set_link(cpuobj, "secure-tag-memory",
> +                                     OBJECT(vms->secure_tag_sysmem),
> +                                     &local_err);
> +            if (local_err) {
> +                goto out;
> +            }
> +        }
> +    }
> +
> +    /*
> +     * RFC: Question: this must only be called for the hotplugged cpus. For the
> +     * cold booted secondary cpus this is being taken care in arm_load_kernel()
> +     * in boot.c. Perhaps we should remove that code now?
> +     */
> +    if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED) {
> +        object_property_set_int(cpuobj, "psci-conduit", vms->psci_conduit,
> +                                NULL);
> +
> +        /* Secondary CPUs start in PSCI powered-down state */
> +        if (CPU(cpuobj)->cpu_index > 0) {
> +            object_property_set_bool(cpuobj, "start-powered-off", true, NULL);
> +        }
> +    }
> +
> +out:
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +    }
> +}
> +
>   static void machvirt_init(MachineState *machine)
>   {
>       VirtMachineState *vms = VIRT_MACHINE(machine);
>       VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(machine);
>       MachineClass *mc = MACHINE_GET_CLASS(machine);
>       const CPUArchIdList *possible_cpus;
> -    MemoryRegion *sysmem = get_system_memory();
> +    MemoryRegion *secure_tag_sysmem = NULL;
>       MemoryRegion *secure_sysmem = NULL;
>       MemoryRegion *tag_sysmem = NULL;
> -    MemoryRegion *secure_tag_sysmem = NULL;
> +    MemoryRegion *sysmem;
>       int n, virt_max_cpus;
>       bool firmware_loaded;
>       bool aarch64 = true;
> @@ -2148,6 +2301,8 @@ static void machvirt_init(MachineState *machine)
>       /* uses smp.max_cpus to initialize all possible vCPUs */
>       possible_cpus = mc->possible_cpu_arch_ids(machine);
>   
> +    sysmem = vms->sysmem = get_system_memory();
> +
>       if (vms->secure) {
>           /*
>            * The Secure view of the world is the same as the NonSecure,
> @@ -2155,7 +2310,7 @@ static void machvirt_init(MachineState *machine)
>            * containing the system memory at low priority; any secure-only
>            * devices go in at higher priority and take precedence.
>            */
> -        secure_sysmem = g_new(MemoryRegion, 1);
> +        secure_sysmem = vms->secure_sysmem = g_new(MemoryRegion, 1);
>           memory_region_init(secure_sysmem, OBJECT(machine), "secure-memory",
>                              UINT64_MAX);
>           memory_region_add_subregion_overlap(secure_sysmem, 0, sysmem, -1);
> @@ -2203,10 +2358,28 @@ static void machvirt_init(MachineState *machine)
>           exit(1);
>       }
>   
> +    if (vms->mte) {
> +        /* Create the memory region only once, but link to all cpus later */
> +        tag_sysmem = vms->tag_sysmem = g_new(MemoryRegion, 1);
> +        memory_region_init(tag_sysmem, OBJECT(machine),
> +                           "tag-memory", UINT64_MAX / 32);
> +
> +        if (vms->secure) {
> +            secure_tag_sysmem = vms->secure_tag_sysmem = g_new(MemoryRegion, 1);
> +            memory_region_init(secure_tag_sysmem, OBJECT(machine),
> +                               "secure-tag-memory", UINT64_MAX / 32);
> +
> +            /* As with ram, secure-tag takes precedence over tag.  */
> +            memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
> +                                                tag_sysmem, -1);
> +        }
> +    }
> +
>       create_fdt(vms);
>   
>       assert(possible_cpus->len == max_cpus);
>       for (n = 0; n < possible_cpus->len; n++) {
> +        CPUArchId *cpu_slot;
>           Object *cpuobj;
>           CPUState *cs;
>   
> @@ -2215,15 +2388,10 @@ static void machvirt_init(MachineState *machine)
>           }
>   
>           cpuobj = object_new(possible_cpus->cpus[n].type);
> -        object_property_set_int(cpuobj, "mp-affinity",
> -                                possible_cpus->cpus[n].arch_id, NULL);
>   
>           cs = CPU(cpuobj);
>           cs->cpu_index = n;
>   
> -        numa_cpu_pre_plug(&possible_cpus->cpus[cs->cpu_index], DEVICE(cpuobj),
> -                          &error_fatal);
> -
>           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
>           object_property_set_int(cpuobj, "socket-id",
>                                   virt_get_socket_id(machine, n), NULL);
> @@ -2234,81 +2402,8 @@ static void machvirt_init(MachineState *machine)
>           object_property_set_int(cpuobj, "thread-id",
>                                   virt_get_thread_id(machine, n), NULL);
>   
> -        if (!vms->secure) {
> -            object_property_set_bool(cpuobj, "has_el3", false, NULL);
> -        }
> -
> -        if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
> -            object_property_set_bool(cpuobj, "has_el2", false, NULL);
> -        }
> -
> -        if (vmc->kvm_no_adjvtime &&
> -            object_property_find(cpuobj, "kvm-no-adjvtime")) {
> -            object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
> -        }
> -
> -        if (vmc->no_kvm_steal_time &&
> -            object_property_find(cpuobj, "kvm-steal-time")) {
> -            object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
> -        }
> -
> -        if (vmc->no_pmu && object_property_find(cpuobj, "pmu")) {
> -            object_property_set_bool(cpuobj, "pmu", false, NULL);
> -        }
> -
> -        if (vmc->no_tcg_lpa2 && object_property_find(cpuobj, "lpa2")) {
> -            object_property_set_bool(cpuobj, "lpa2", false, NULL);
> -        }
> -
> -        if (object_property_find(cpuobj, "reset-cbar")) {
> -            object_property_set_int(cpuobj, "reset-cbar",
> -                                    vms->memmap[VIRT_CPUPERIPHS].base,
> -                                    &error_abort);
> -        }
> -
> -        object_property_set_link(cpuobj, "memory", OBJECT(sysmem),
> -                                 &error_abort);
> -        if (vms->secure) {
> -            object_property_set_link(cpuobj, "secure-memory",
> -                                     OBJECT(secure_sysmem), &error_abort);
> -        }
> -
> -        if (vms->mte) {
> -            /* Create the memory region only once, but link to all cpus. */
> -            if (!tag_sysmem) {
> -                /*
> -                 * The property exists only if MemTag is supported.
> -                 * If it is, we must allocate the ram to back that up.
> -                 */
> -                if (!object_property_find(cpuobj, "tag-memory")) {
> -                    error_report("MTE requested, but not supported "
> -                                 "by the guest CPU");
> -                    exit(1);
> -                }
> -
> -                tag_sysmem = g_new(MemoryRegion, 1);
> -                memory_region_init(tag_sysmem, OBJECT(machine),
> -                                   "tag-memory", UINT64_MAX / 32);
> -
> -                if (vms->secure) {
> -                    secure_tag_sysmem = g_new(MemoryRegion, 1);
> -                    memory_region_init(secure_tag_sysmem, OBJECT(machine),
> -                                       "secure-tag-memory", UINT64_MAX / 32);
> -
> -                    /* As with ram, secure-tag takes precedence over tag.  */
> -                    memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
> -                                                        tag_sysmem, -1);
> -                }
> -            }
> -
> -            object_property_set_link(cpuobj, "tag-memory", OBJECT(tag_sysmem),
> -                                     &error_abort);
> -            if (vms->secure) {
> -                object_property_set_link(cpuobj, "secure-tag-memory",
> -                                         OBJECT(secure_tag_sysmem),
> -                                         &error_abort);
> -            }
> -        }
> +        cpu_slot = virt_find_cpu_slot(machine, cs->cpu_index);
> +        virt_cpu_set_properties(cpuobj, cpu_slot, &error_fatal);
>   
>           qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
>           object_unref(cpuobj);
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 6f9a7bb60b..780bd53ceb 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -139,6 +139,10 @@ struct VirtMachineState {
>       DeviceState *platform_bus_dev;
>       FWCfgState *fw_cfg;
>       PFlashCFI01 *flash[2];
> +    MemoryRegion *sysmem;
> +    MemoryRegion *secure_sysmem;
> +    MemoryRegion *tag_sysmem;
> +    MemoryRegion *secure_tag_sysmem;
>       bool secure;
>       bool highmem;
>       bool highmem_compact;

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 02/29] cpu-common: Add common CPU utility for possible vCPUs
  2024-08-12  4:59   ` Gavin Shan
@ 2024-08-12  5:41     ` liu ping
  0 siblings, 0 replies; 105+ messages in thread
From: liu ping @ 2024-08-12  5:41 UTC (permalink / raw)
  To: Gavin Shan, Salil Mehta, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org,
	jonathan.cameron@huawei.com, lpieralisi@kernel.org,
	peter.maydell@linaro.org, richard.henderson@linaro.org,
	imammedo@redhat.com, andrew.jones@linux.dev, david@redhat.com,
	philmd@linaro.org, eric.auger@redhat.com, will@kernel.org,
	ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com,
	alex.bennee@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
	linux@armlinux.org.uk, darren@os.amperecomputing.com,
	ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
	karl.heubaum@oracle.com, miguel.luis@oracle.com,
	salil.mehta@opnsrc.net, zhukeqian1@huawei.com,
	wangxiongfeng2@huawei.com, wangyanan55@huawei.com,
	jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn,
	shahuang@redhat.com, zhao1.liu@intel.com, linuxarm@huawei.com

[-- Attachment #1: Type: text/plain, Size: 7475 bytes --]

unsubscribe
________________________________
From: qemu-devel-bounces+liuping24=outlook.com@nongnu.org <qemu-devel-bounces+liuping24=outlook.com@nongnu.org> on behalf of Gavin Shan <gshan@redhat.com>
Sent: August 11, 2024 21:59
To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org <qemu-devel@nongnu.org>; qemu-arm@nongnu.org <qemu-arm@nongnu.org>; mst@redhat.com <mst@redhat.com>
Cc: maz@kernel.org <maz@kernel.org>; jean-philippe@linaro.org <jean-philippe@linaro.org>; jonathan.cameron@huawei.com <jonathan.cameron@huawei.com>; lpieralisi@kernel.org <lpieralisi@kernel.org>; peter.maydell@linaro.org <peter.maydell@linaro.org>; richard.henderson@linaro.org <richard.henderson@linaro.org>; imammedo@redhat.com <imammedo@redhat.com>; andrew.jones@linux.dev <andrew.jones@linux.dev>; david@redhat.com <david@redhat.com>; philmd@linaro.org <philmd@linaro.org>; eric.auger@redhat.com <eric.auger@redhat.com>; will@kernel.org <will@kernel.org>; ardb@kernel.org <ardb@kernel.org>; oliver.upton@linux.dev <oliver.upton@linux.dev>; pbonzini@redhat.com <pbonzini@redhat.com>; rafael@kernel.org <rafael@kernel.org>; borntraeger@linux.ibm.com <borntraeger@linux.ibm.com>; alex.bennee@linaro.org <alex.bennee@linaro.org>; npiggin@gmail.com <npiggin@gmail.com>; harshpb@linux.ibm.com <harshpb@linux.ibm.com>; linux@armlinux.org.uk <linux@armlinux.org.uk>; darren@os.amperecomputing.com <darren@os.amperecomputing.com>; ilkka@os.amperecomputing.com <ilkka@os.amperecomputing.com>; vishnu@os.amperecomputing.com <vishnu@os.amperecomputing.com>; karl.heubaum@oracle.com <karl.heubaum@oracle.com>; miguel.luis@oracle.com <miguel.luis@oracle.com>; salil.mehta@opnsrc.net <salil.mehta@opnsrc.net>; zhukeqian1@huawei.com <zhukeqian1@huawei.com>; wangxiongfeng2@huawei.com <wangxiongfeng2@huawei.com>; wangyanan55@huawei.com <wangyanan55@huawei.com>; jiakernel2@gmail.com <jiakernel2@gmail.com>; maobibo@loongson.cn <maobibo@loongson.cn>; lixianglai@loongson.cn <lixianglai@loongson.cn>; shahuang@redhat.com <shahuang@redhat.com>; zhao1.liu@intel.com <zhao1.liu@intel.com>; linuxarm@huawei.com <linuxarm@huawei.com>
Subject: Re: [PATCH RFC V3 02/29] cpu-common: Add common CPU utility for possible vCPUs

On 6/14/24 9:36 AM, Salil Mehta wrote:
> This patch adds various utility functions that may be required to fetch or check
> the state of possible vCPUs. It also introduces the concept of *disabled* vCPUs,
> which are part of the *possible* vCPUs but are not enabled. This state will be
> used during machine initialization and later during the plugging or unplugging
> of vCPUs. We release the QOM CPU objects for all disabled vCPUs.
>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   cpu-common.c          | 31 +++++++++++++++++++++++++
>   include/hw/core/cpu.h | 54 +++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 85 insertions(+)
>
> diff --git a/cpu-common.c b/cpu-common.c
> index ce78273af5..49d2a50835 100644
> --- a/cpu-common.c
> +++ b/cpu-common.c
> @@ -24,6 +24,7 @@
>   #include "sysemu/cpus.h"
>   #include "qemu/lockable.h"
>   #include "trace/trace-root.h"
> +#include "hw/boards.h"
>
>   QemuMutex qemu_cpu_list_lock;
>   static QemuCond exclusive_cond;
> @@ -107,6 +108,36 @@ void cpu_list_remove(CPUState *cpu)
>       cpu_list_generation_id++;
>   }
>
> +CPUState *qemu_get_possible_cpu(int index)
> +{
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
> +
> +    assert((index >= 0) && (index < possible_cpus->len));
> +
> +    return CPU(possible_cpus->cpus[index].cpu);
> +}
> +
> +bool qemu_present_cpu(CPUState *cpu)
> +{
> +    return cpu;
> +}
> +
> +bool qemu_enabled_cpu(CPUState *cpu)
> +{
> +    return cpu && !cpu->disabled;
> +}
> +
> +uint64_t qemu_get_cpu_archid(int cpu_index)
> +{
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
> +
> +    assert((cpu_index >= 0) && (cpu_index < possible_cpus->len));
> +
> +    return possible_cpus->cpus[cpu_index].arch_id;
> +}
> +
>   CPUState *qemu_get_cpu(int index)
>   {
>       CPUState *cpu;
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 60b160d0b4..60b4778da9 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -528,6 +528,18 @@ struct CPUState {
>       CPUPluginState *plugin_state;
>   #endif
>
> +    /*
> +     * Some architectures do not allow the *presence* of vCPUs to be changed
> +     * after the guest has booted, based on information specified by the
> +     * VMM/firmware via ACPI MADT at boot time. Thus, to enable vCPU hotplug on
> +     * these architectures, possible vCPUs can have a CPUState object in a
> +     * 'disabled' state or may not have a CPUState object at all. This is
> +     * possible when vCPU hotplug is supported, and vCPUs are
> +     * 'yet-to-be-plugged' in the QOM or have been hot-unplugged. By default,
> +     * every CPUState is enabled across all architectures.
> +     */
> +    bool disabled;
> +

The information to determine a vCPU's state has been distributed across two data
structs: MachineState::possible_cpus::cpus[] and CPUState. Why not just maintain
the vCPU's state in MachineState::possible_cpus::cpus[]? For example, by adding
a new field, or something similar, to 'struct CPUArchId' as below.

typedef struct CPUArchId {
#define CPU_ARCH_ID_FLAG_PRESENT        (1 << 0)
#define CPU_ARCH_ID_FLAG_ENABLED        (1 << 1)
     uint32_t flags;
        :
} CPUArchId;

In order to determine whether a CPUState instance is enabled, CPUState::cpu_index
is used as the index into MachineState::possible_cpus::cpus[], and the flags can
be parsed to determine the vCPU's state.

>       /* TODO Move common fields from CPUArchState here. */
>       int cpu_index;
>       int cluster_index;
> @@ -914,6 +926,48 @@ static inline bool cpu_in_exclusive_context(const CPUState *cpu)
>    */
>   CPUState *qemu_get_cpu(int index);
>
> +/**
> + * qemu_get_possible_cpu:
> + * @index: The CPUState@cpu_index value of the CPU to obtain.
> + *         Input index MUST be in range [0, Max Possible CPUs)
> + *
> + * If CPUState object exists, then it gets a CPU matching
> + * @index in the possible CPU array.
> + *
> + * Returns: The possible CPU or %NULL if CPU does not exist.
> + */
> +CPUState *qemu_get_possible_cpu(int index);
> +
> +/**
> + * qemu_present_cpu:
> + * @cpu: The vCPU to check
> + *
> + * Checks if the vCPU is amongst the present possible vcpus.
> + *
> + * Returns: True if it is present possible vCPU else false
> + */
> +bool qemu_present_cpu(CPUState *cpu);
> +
> +/**
> + * qemu_enabled_cpu:
> + * @cpu: The vCPU to check
> + *
> + * Checks if the vCPU is enabled.
> + *
> + * Returns: True if it is 'enabled' else false
> + */
> +bool qemu_enabled_cpu(CPUState *cpu);
> +
> +/**
> + * qemu_get_cpu_archid:
> + * @cpu_index: possible vCPU for which arch-id needs to be retrieved
> + *
> + * Fetches the vCPU arch-id from the present possible vCPUs.
> + *
> + * Returns: arch-id of the possible vCPU
> + */
> +uint64_t qemu_get_cpu_archid(int cpu_index);
> +
>   /**
>    * cpu_exists:
>    * @id: Guest-exposed CPU ID to lookup.

Thanks,
Gavin



[-- Attachment #2: Type: text/html, Size: 10659 bytes --]

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2024-08-12  4:35   ` [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property Gavin Shan
@ 2024-08-12  8:15     ` Igor Mammedov
  2024-08-13  0:31       ` Gavin Shan
  2024-08-19 12:07       ` Salil Mehta via
  2024-08-19 11:53     ` Salil Mehta via
  1 sibling, 2 replies; 105+ messages in thread
From: Igor Mammedov @ 2024-08-12  8:15 UTC (permalink / raw)
  To: Gavin Shan
  Cc: Salil Mehta, qemu-devel, qemu-arm, mst, maz, jean-philippe,
	jonathan.cameron, lpieralisi, peter.maydell, richard.henderson,
	andrew.jones, david, philmd, eric.auger, will, ardb, oliver.upton,
	pbonzini, rafael, borntraeger, alex.bennee, npiggin, harshpb,
	linux, darren, ilkka, vishnu, karl.heubaum, miguel.luis,
	salil.mehta, zhukeqian1, wangxiongfeng2, wangyanan55, jiakernel2,
	maobibo, lixianglai, shahuang, zhao1.liu, linuxarm

On Mon, 12 Aug 2024 14:35:56 +1000
Gavin Shan <gshan@redhat.com> wrote:

> On 6/14/24 9:36 AM, Salil Mehta wrote:
> > This shall be used to store user specified topology{socket,cluster,core,thread}
> > and shall be converted to a unique 'vcpu-id' which is used as slot-index during
> > hot(un)plug of vCPU.
> > 
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c         | 10 ++++++++++
> >   include/hw/arm/virt.h | 28 ++++++++++++++++++++++++++++
> >   target/arm/cpu.c      |  4 ++++
> >   target/arm/cpu.h      |  4 ++++
> >   4 files changed, 46 insertions(+)
> >   
> 
> Those 4 properties are introduced to determine the vCPU's slot, which is the index
> to MachineState::possible_cpus::cpus[]. From there, the CPU object or instance is
> referenced and then the CPU's state can be further determined. It sounds reasonable
> to use the CPU's topology to determine the index. However, I'm wondering if this can
> be simplified to use 'cpu-index' or 'index', for a couple of reasons: (1) 'cpu-index'

Please, don't. We've spent a bunch of time to get rid of cpu-index in user
visible interface (well, old NUMA CLI is still there along with 'new' topology
based one, but that's the last one).

> or 'index' is simpler. Users have to provide 4 parameters in order to determine
> its index in the extreme case, for example "device_add host-arm-cpu,id=cpu7,socket-id=1,
> cluster-id=1,core-id=1,thread-id=1". With 'cpu-index' or 'index', this can be simplified
> to 'index=7'. (2) The cold-booted and hotpluggable CPUs are determined by their index
> instead of their topology. For example, CPU0/1/2/3 are cold-booted CPUs while CPU4/5/6/7
> are hotpluggable CPUs with the command-line option '-smp maxcpus=8,cpus=4'. So 'index' makes
> more sense for identifying a vCPU's slot.
cpu-index was used for hotplug on x86 machines as a starting point for
implementing hotplug, as it was easy to hack and already existed in QEMU.

But that didn't scale as desired and had its own issues.
Hence the current interface that the majority agreed upon.
I don't remember the exact arguments anymore (they can be found on qemu-devel if needed).
Here is a link to the talk that tried to explain why the topology-based interface was introduced:
  http://events17.linuxfoundation.org/sites/events/files/slides/CPU%20Hot-plug%20support%20in%20QEMU.pdf
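
For illustration, the topology-based interface referenced above is driven from
the QEMU monitor roughly as follows (the id and topology values here are
hypothetical, chosen only to show the shape of the command):

```
(qemu) device_add host-arm-cpu,id=cpu7,socket-id=1,cluster-id=1,core-id=1,thread-id=1
(qemu) device_del cpu7
```

The topology tuple, rather than a raw cpu-index, is what identifies the slot.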


> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 3c93c0c0a6..11fc7fc318 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2215,6 +2215,14 @@ static void machvirt_init(MachineState *machine)
> >                             &error_fatal);
> >   
> >           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
> > +        object_property_set_int(cpuobj, "socket-id",
> > +                                virt_get_socket_id(machine, n), NULL);
> > +        object_property_set_int(cpuobj, "cluster-id",
> > +                                virt_get_cluster_id(machine, n), NULL);
> > +        object_property_set_int(cpuobj, "core-id",
> > +                                virt_get_core_id(machine, n), NULL);
> > +        object_property_set_int(cpuobj, "thread-id",
> > +                                virt_get_thread_id(machine, n), NULL);
> >   
> >           if (!vms->secure) {
> >               object_property_set_bool(cpuobj, "has_el3", false, NULL);
> > @@ -2708,6 +2716,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >   {
> >       int n;
> >       unsigned int max_cpus = ms->smp.max_cpus;
> > +    unsigned int smp_threads = ms->smp.threads;
> >       VirtMachineState *vms = VIRT_MACHINE(ms);
> >       MachineClass *mc = MACHINE_GET_CLASS(vms);
> >   
> > @@ -2721,6 +2730,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >       ms->possible_cpus->len = max_cpus;
> >       for (n = 0; n < ms->possible_cpus->len; n++) {
> >           ms->possible_cpus->cpus[n].type = ms->cpu_type;
> > +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
> >           ms->possible_cpus->cpus[n].arch_id =
> >               virt_cpu_mp_affinity(vms, n);
> >     
> 
> Why is @vcpus_count initialized to @smp_threads? It needs to be documented in
> the commit log.
> 
> > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> > index bb486d36b1..6f9a7bb60b 100644
> > --- a/include/hw/arm/virt.h
> > +++ b/include/hw/arm/virt.h
> > @@ -209,4 +209,32 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
> >               vms->highmem_redists) ? 2 : 1;
> >   }
> >   
> > +static inline int virt_get_socket_id(const MachineState *ms, int cpu_index)
> > +{
> > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> > +
> > +    return ms->possible_cpus->cpus[cpu_index].props.socket_id;
> > +}
> > +
> > +static inline int virt_get_cluster_id(const MachineState *ms, int cpu_index)
> > +{
> > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> > +
> > +    return ms->possible_cpus->cpus[cpu_index].props.cluster_id;
> > +}
> > +
> > +static inline int virt_get_core_id(const MachineState *ms, int cpu_index)
> > +{
> > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> > +
> > +    return ms->possible_cpus->cpus[cpu_index].props.core_id;
> > +}
> > +
> > +static inline int virt_get_thread_id(const MachineState *ms, int cpu_index)
> > +{
> > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> > +
> > +    return ms->possible_cpus->cpus[cpu_index].props.thread_id;
> > +}
> > +
> >   #endif /* QEMU_ARM_VIRT_H */
> > diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> > index 77f8c9c748..abc4ed0842 100644
> > --- a/target/arm/cpu.c
> > +++ b/target/arm/cpu.c
> > @@ -2582,6 +2582,10 @@ static Property arm_cpu_properties[] = {
> >       DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
> >                           mp_affinity, ARM64_AFFINITY_INVALID),
> >       DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
> > +    DEFINE_PROP_INT32("socket-id", ARMCPU, socket_id, 0),
> > +    DEFINE_PROP_INT32("cluster-id", ARMCPU, cluster_id, 0),
> > +    DEFINE_PROP_INT32("core-id", ARMCPU, core_id, 0),
> > +    DEFINE_PROP_INT32("thread-id", ARMCPU, thread_id, 0),
> >       DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1),
> >       /* True to default to the backward-compat old CNTFRQ rather than 1Ghz */
> >       DEFINE_PROP_BOOL("backcompat-cntfrq", ARMCPU, backcompat_cntfrq, false),
> > diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> > index c17264c239..208c719db3 100644
> > --- a/target/arm/cpu.h
> > +++ b/target/arm/cpu.h
> > @@ -1076,6 +1076,10 @@ struct ArchCPU {
> >       QLIST_HEAD(, ARMELChangeHook) el_change_hooks;
> >   
> >       int32_t node_id; /* NUMA node this CPU belongs to */
> > +    int32_t socket_id;
> > +    int32_t cluster_id;
> > +    int32_t core_id;
> > +    int32_t thread_id;
> >   
> >       /* Used to synchronize KVM and QEMU in-kernel device levels */
> >       uint8_t device_irq_level;  
> 
> Thanks,
> Gavin
> 



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2024-08-12  8:15     ` Igor Mammedov
@ 2024-08-13  0:31       ` Gavin Shan
  2024-08-19 12:07       ` Salil Mehta via
  1 sibling, 0 replies; 105+ messages in thread
From: Gavin Shan @ 2024-08-13  0:31 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Salil Mehta, qemu-devel, qemu-arm, mst, maz, jean-philippe,
	jonathan.cameron, lpieralisi, peter.maydell, richard.henderson,
	andrew.jones, david, philmd, eric.auger, will, ardb, oliver.upton,
	pbonzini, rafael, borntraeger, alex.bennee, npiggin, harshpb,
	linux, darren, ilkka, vishnu, karl.heubaum, miguel.luis,
	salil.mehta, zhukeqian1, wangxiongfeng2, wangyanan55, jiakernel2,
	maobibo, lixianglai, shahuang, zhao1.liu, linuxarm

On 8/12/24 6:15 PM, Igor Mammedov wrote:
> On Mon, 12 Aug 2024 14:35:56 +1000
> Gavin Shan <gshan@redhat.com> wrote:
> 
>> On 6/14/24 9:36 AM, Salil Mehta wrote:
>>> This shall be used to store user specified topology{socket,cluster,core,thread}
>>> and shall be converted to a unique 'vcpu-id' which is used as slot-index during
>>> hot(un)plug of vCPU.
>>>
>>> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>>> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>> ---
>>>    hw/arm/virt.c         | 10 ++++++++++
>>>    include/hw/arm/virt.h | 28 ++++++++++++++++++++++++++++
>>>    target/arm/cpu.c      |  4 ++++
>>>    target/arm/cpu.h      |  4 ++++
>>>    4 files changed, 46 insertions(+)
>>>    
>>
>> Those 4 properties are introduced to determine the vCPU's slot, which is the index
>> to MachineState::possible_cpus::cpus[]. From there, the CPU object or instance is
>> referenced and then the CPU's state can be further determined. It sounds reasonable
>> to use the CPU's topology to determine the index. However, I'm wandering if this can
>> be simplified to use 'cpu-index' or 'index' for a couple of facts: (1) 'cpu-index'
> 
> Please, don't. We've spent a bunch of time to get rid of cpu-index in user
> visible interface (well, old NUMA CLI is still there along with 'new' topology
> based one, but that's the last one).
> 

Ok, thanks for the hints. It's a question I had from the beginning. I didn't dig into
the historic background. From the vCPU hotplug document (cpu-hotplug.rst), the CPU
topology is used to identify hot-added vCPU on x86 and it's reasonable for ARM to
follow this mechanism.

>> or 'index' is simpler. Users have to provide 4 parameters in order to determine
>> its index in the extreme case, for example "device_add host-arm-cpu, id=cpu7,socket-id=1,
>> cluster-id=1,core-id=1,thread-id=1". With 'cpu-index' or 'index', it can be simplified
>> to 'index=7'. (2) The cold-booted and hotpluggable CPUs are determined by their index
>> instead of their topology. For example, CPU0/1/2/3 are cold-booted CPUs while CPU4/5/6/7
>> are hotpluggable CPUs with command lines '-smp maxcpus=8,cpus=4'. So 'index' makes
>> more sense to identify a vCPU's slot.
> cpu-index have been used for hotplug with x86 machines as a starting point
> to implement hotplug as it was easy to hack and it has already existed in QEMU.
> 
> But that didn't scale as was desired and had its own issues.
> Hence the current interface that majority agreed upon.
> I don't remember exact arguments anymore (they could be found qemu-devel if needed)
> Here is a link to the talk that tried to explain why topo based was introduced.
>    http://events17.linuxfoundation.org/sites/events/files/slides/CPU%20Hot-plug%20support%20in%20QEMU.pdf
> 

Right, I overlooked the migration case where the source and destination vCPUs have
to be strictly correlated. This strict correlation can become broken with 'index'
or 'cpu-index', but it's ensured with the CPU topology as stated on page-19.

Thanks,
Gavin




* Re: [PATCH RFC V3 06/29] arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
  2024-06-13 23:36 ` [PATCH RFC V3 06/29] arm/virt, kvm: Pre-create disabled possible vCPUs @machine init Salil Mehta via
@ 2024-08-13  0:58   ` Gavin Shan
  2024-08-19  5:31   ` Gavin Shan
  1 sibling, 0 replies; 105+ messages in thread
From: Gavin Shan @ 2024-08-13  0:58 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On 6/14/24 9:36 AM, Salil Mehta wrote:
> In the ARMv8 architecture, the GIC must know all the CPUs it is connected to
> during its initialization, and this cannot change afterward. This must be
> ensured during the initialization of the VGIC as well in KVM, which requires all
> vCPUs to be created and present during its initialization. This is necessary
> because:
> 
> 1. The association between GICC and MPIDR must be fixed at VM initialization
>     time. This is represented by the register `GIC_TYPER(mp_affinity, proc_num)`.
> 2. GICC (CPU interfaces), GICR (redistributors), etc., must all be initialized
>     at boot time.
> 3. Memory regions associated with GICR, etc., cannot be changed (added, deleted,
>     or modified) after the VM has been initialized.
> 
> This patch adds support to pre-create all possible vCPUs within the host using
> the KVM interface as part of the virtual machine initialization. These vCPUs can
> later be attached to QOM/ACPI when they are actually hot-plugged and made
> present.
> 
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> [VP: Identified CPU stall issue & suggested probable fix]
> ---
>   hw/arm/virt.c         | 56 +++++++++++++++++++++++++++++++++++--------
>   include/hw/core/cpu.h |  1 +
>   target/arm/cpu64.c    |  1 +
>   target/arm/kvm.c      | 41 ++++++++++++++++++++++++++++++-
>   target/arm/kvm_arm.h  | 11 +++++++++
>   5 files changed, 99 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index a285139165..81e7a27786 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2383,14 +2383,8 @@ static void machvirt_init(MachineState *machine)
>           Object *cpuobj;
>           CPUState *cs;
>   
> -        if (n >= smp_cpus) {
> -            break;
> -        }
> -
>           cpuobj = object_new(possible_cpus->cpus[n].type);
> -
>           cs = CPU(cpuobj);
> -        cs->cpu_index = n;
>   

Fixed @cpu_index assignment is removed here...

>           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
>           object_property_set_int(cpuobj, "socket-id",
> @@ -2402,11 +2396,53 @@ static void machvirt_init(MachineState *machine)
>           object_property_set_int(cpuobj, "thread-id",
>                                   virt_get_thread_id(machine, n), NULL);
>   
> -        cpu_slot = virt_find_cpu_slot(machine, cs->cpu_index);
> -        virt_cpu_set_properties(cpuobj, cpu_slot, &error_fatal);
> +        if (n < smp_cpus) {
> +            qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
> +            object_unref(cpuobj);
> +        } else {
> +            /* handling for vcpus which are yet to be hot-plugged */
> +            cs->cpu_index = n;
> +            cpu_slot = virt_find_cpu_slot(machine, cs->cpu_index);
>   

For hotpluggable vCPUs, we have the fixed @cpu_index assignment here and virt_cpu_pre_plug().
However, @cpu_index for non-hotpluggable vCPUs will be automatically assigned in the following
path. It causes inconsistent behaviour between hotpluggable and non-hotpluggable vCPUs. We need to
fix @cpu_index for non-hotpluggable vCPUs as well.

   qdev_realize
     arm_cpu_realizefn
       cpu_exec_realizefn
         cpu_list_add

> -        qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
> -        object_unref(cpuobj);
> +            /*
> +             * ARM host vCPU features need to be fixed at the boot time. But as
> +             * per current approach this CPU object will be destroyed during
> +             * cpu_post_init(). During hotplug of vCPUs these properties are
> +             * initialized again.
> +             */
> +            virt_cpu_set_properties(cpuobj, cpu_slot, &error_fatal);
> +
> +            /*
> +             * For KVM, we shall be pre-creating the now disabled/un-plugged
> +             * possible host vcpus and park them till the time they are
> +             * actually hot plugged. This is required to pre-size the host
> +             * GICC and GICR with all the possible vcpus for this VM.
> +             */
> +            if (kvm_enabled()) {
> +                kvm_arm_create_host_vcpu(ARM_CPU(cs));
> +            }
> +            /*
> +             * Add disabled vCPU to CPU slot during the init phase of the virt
> +             * machine
> +             * 1. We need this ARMCPU object during the GIC init. This object
> +             *    will facilitate in pre-realizing the GIC. Any info like
> +             *    mp-affinity(required to derive gicr_type) etc. could still be
> +             *    fetched while preserving QOM abstraction akin to realized
> +             *    vCPUs.
> +             * 2. Now, after initialization of the virt machine is complete we
> +             *    could use two approaches to deal with this ARMCPU object:
> +             *    (i) re-use this ARMCPU object during hotplug of this vCPU.
> +             *                             OR
> +             *    (ii) defer release this ARMCPU object after gic has been
> +             *         initialized or during pre-plug phase when a vCPU is
> +             *         hotplugged.
> +             *
> +             *    We will use the (ii) approach and release the ARMCPU objects
> +             *    after GIC and machine has been fully initialized during
> +             *    machine_init_done() phase.
> +             */

For those hotpluggable vCPUs, ARMCPU objects are instantiated to provide information
for GIC's initialization and then destroyed. ARMCPU objects are a bit heavy for that.
The question is whether ms->possible_cpus->cpus[] can be reused to provide information
for GIC's initialization. If it can, the remaining question is how to avoid
instantiating ARMCPU objects when vCPU fds are created and parked.


> +             cpu_slot->cpu = cs;
> +        }
>       }
>   
>       /* Now we've created the CPUs we can see if they have the hypvirt timer */
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 60b4778da9..62e68611c0 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -520,6 +520,7 @@ struct CPUState {
>       uint64_t dirty_pages;
>       int kvm_vcpu_stats_fd;
>       bool vcpu_dirty;
> +    VMChangeStateEntry *vmcse;
>   
>       /* Use by accel-block: CPU is executing an ioctl() */
>       QemuLockCnt in_ioctl_lock;
> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> index d6b48b3424..9b7e8b032c 100644
> --- a/target/arm/cpu64.c
> +++ b/target/arm/cpu64.c
> @@ -789,6 +789,7 @@ static void aarch64_cpu_initfn(Object *obj)
>        * enabled explicitly
>        */
>       cs->disabled = true;
> +    cs->thread_id = 0;
>   }
>   
>   static void aarch64_cpu_finalizefn(Object *obj)
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 7cf5cf31de..01c83c1994 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -1003,6 +1003,38 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
>       write_list_to_cpustate(cpu);
>   }
>   
> +void kvm_arm_create_host_vcpu(ARMCPU *cpu)
> +{
> +    CPUState *cs = CPU(cpu);
> +    unsigned long vcpu_id = cs->cpu_index;
> +    int ret;
> +
> +    ret = kvm_create_vcpu(cs);
> +    if (ret < 0) {
> +        error_report("Failed to create host vcpu %lu", vcpu_id);
> +        abort();
> +    }
> +
> +    /*
> +     * Initialize the vCPU in the host. This will reset the sys regs
> +     * for this vCPU and related registers like MPIDR_EL1 etc. also
> +     * gets programmed during this call to host. These are referred
> +     * later while setting device attributes of the GICR during GICv3
> +     * reset
> +     */
> +    ret = kvm_arch_init_vcpu(cs);
> +    if (ret < 0) {
> +        error_report("Failed to initialize host vcpu %lu", vcpu_id);
> +        abort();
> +    }
> +
> +    /*
> +     * Park the created vCPU; it shall be retrieved via kvm_get_vcpu() when
> +     * vCPU threads are created during realization of ARM vCPUs.
> +     */
> +    kvm_park_vcpu(cs);
> +}
> +
>   /*
>    * Update KVM's MP_STATE based on what QEMU thinks it is
>    */
> @@ -1874,7 +1906,14 @@ int kvm_arch_init_vcpu(CPUState *cs)
>           return -EINVAL;
>       }
>   
> -    qemu_add_vm_change_state_handler(kvm_arm_vm_state_change, cpu);
> +    /*
> +     * Install VM change handler only when vCPU thread has been spawned
> +     * i.e. vCPU is being realized
> +     */
> +    if (cs->thread_id) {
> +        cs->vmcse = qemu_add_vm_change_state_handler(kvm_arm_vm_state_change,
> +                                                     cpu);
> +    }
>   
>       /* Determine init features for this CPU */
>       memset(cpu->kvm_init_features, 0, sizeof(cpu->kvm_init_features));
> diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
> index cfaa0d9bc7..0be7e896d2 100644
> --- a/target/arm/kvm_arm.h
> +++ b/target/arm/kvm_arm.h
> @@ -96,6 +96,17 @@ void kvm_arm_cpu_post_load(ARMCPU *cpu);
>    */
>   void kvm_arm_reset_vcpu(ARMCPU *cpu);
>   
> +/**
> + * kvm_arm_create_host_vcpu:
> + * @cpu: ARMCPU
> + *
> + * Called to pre-create all possible kvm vCPUs within the host at
> + * virt machine init time. This will also init the pre-created vCPUs and
> + * hence result in a vCPU reset at the host. These pre-created and inited vCPUs
> + * shall be parked for use when ARM vCPUs are actually realized.
> + */
> +void kvm_arm_create_host_vcpu(ARMCPU *cpu);
> +
>   #ifdef CONFIG_KVM
>   /**
>    * kvm_arm_create_scratch_host_vcpu:

Thanks,
Gavin




* Re: [PATCH RFC V3 11/29] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  2024-06-13 23:36 ` [PATCH RFC V3 11/29] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed Salil Mehta via
@ 2024-08-13  1:04   ` Gavin Shan
  2024-08-19 12:10     ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-13  1:04 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On 6/14/24 9:36 AM, Salil Mehta wrote:
> ACPI CPU hotplug state (is_present=_STA.PRESENT, is_enabled=_STA.ENABLED) for
> all the possible vCPUs MUST be initialized during machine init. This is done
> during the creation of the GED device. VMM/Qemu MUST expose/fake the ACPI state
> of the disabled vCPUs to the Guest kernel as 'present' (_STA.PRESENT) always
> i.e. ACPI persistent. If the 'disabled' vCPU objects are destroyed before the
> GED device has been created, then their ACPI hotplug state might not get
> initialized correctly, as the acpi_persistent flag is part of the CPUState. This will
> expose the wrong status of the unplugged vCPUs to the Guest kernel.
> 
> Hence, moving the GED device creation before disabled vCPU objects get destroyed
> as part of the post CPU init routine.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c | 10 +++++++---
>   1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 918bcb9a1b..5f98162587 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2467,6 +2467,12 @@ static void machvirt_init(MachineState *machine)
>   
>       create_gic(vms, sysmem);
>   
> +    has_ged = has_ged && aarch64 && firmware_loaded &&
> +              virt_is_acpi_enabled(vms);
> +    if (has_ged) {
> +        vms->acpi_dev = create_acpi_ged(vms);
> +    }
> +
>       virt_cpu_post_init(vms, sysmem);
>   
>       fdt_add_pmu_nodes(vms);
> @@ -2489,9 +2495,7 @@ static void machvirt_init(MachineState *machine)
>   
>       create_pcie(vms);
>   
> -    if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
> -        vms->acpi_dev = create_acpi_ged(vms);
> -    } else {
> +    if (!has_ged) {
>           create_gpio_devices(vms, VIRT_GPIO, sysmem);
>       }
>   

It's likely the GPIO device can be created before those disabled CPU objects
are destroyed. It means the whole chunk of code can be moved together, I think.

Thanks,
Gavin




* Re: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-06-13 23:36 ` [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init Salil Mehta via
@ 2024-08-13  1:17   ` Gavin Shan
  2024-08-19 12:21     ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-13  1:17 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On 6/14/24 9:36 AM, Salil Mehta wrote:
> During `machvirt_init()`, QOM ARMCPU objects are pre-created along with the
> corresponding KVM vCPUs in the host for all possible vCPUs. This is necessary
> due to the architectural constraint that KVM restricts the deferred creation of
> KVM vCPUs and VGIC initialization/sizing after VM initialization. Hence, VGIC is
> pre-sized with possible vCPUs.
> 
> After the initialization of the machine is complete, the disabled possible KVM
> vCPUs are parked in the per-virt-machine list "kvm_parked_vcpus," and we release
> the QOM ARMCPU objects for the disabled vCPUs. These will be re-created when the
> vCPU is hotplugged again. The QOM ARMCPU object is then re-attached to the
> corresponding parked KVM vCPU.
> 
> Alternatively, we could have chosen not to release the QOM CPU objects and kept
> reusing them. This approach might require some modifications to the
> `qdevice_add()` interface to retrieve the old ARMCPU object instead of creating
> a new one for the hotplug request.
> 
> Each of these approaches has its own pros and cons. This prototype uses the
> first approach (suggestions are welcome!).
> 
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c | 32 ++++++++++++++++++++++++++++++++
>   1 file changed, 32 insertions(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 9d33f30a6a..a72cd3b20d 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2050,6 +2050,7 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
>   {
>       CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
>       int max_cpus = MACHINE(vms)->smp.max_cpus;
> +    MachineState *ms = MACHINE(vms);
>       bool aarch64, steal_time;
>       CPUState *cpu;
>       int n;
> @@ -2111,6 +2112,37 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
>               }
>           }
>       }
> +
> +    if (kvm_enabled() || tcg_enabled()) {
> +        for (n = 0; n < possible_cpus->len; n++) {
> +            cpu = qemu_get_possible_cpu(n);
> +
> +            /*
> +             * Now, GIC has been sized with possible CPUs and we don't require
> +             * disabled vCPU objects to be represented in the QOM. Release the
> +             * disabled ARMCPU objects earlier used during init for pre-sizing.
> +             *
> +             * We fake to the guest through ACPI about the presence(_STA.PRES=1)
> +             * of these non-existent vCPUs at VMM/qemu and present these as
> +             * disabled vCPUs(_STA.ENA=0) so that they can't be used. These vCPUs
> +             * can be later added to the guest through hotplug exchanges when
> +             * ARMCPU objects are created back again using 'device_add' QMP
> +             * command.
> +             */
> +            /*
> +             * RFC: Question: Another approach could have been to keep them
> +             * forever and release them only when qemu exits as part of
> +             * finalize, or when a new vCPU is hotplugged. In the latter case,
> +             * the old object could be released in favour of the newly created
> +             * object for the same vCPU?
> +             */
> +            if (!qemu_enabled_cpu(cpu)) {
> +                CPUArchId *cpu_slot;
> +                cpu_slot = virt_find_cpu_slot(ms, cpu->cpu_index);
> +                cpu_slot->cpu = NULL;
> +                object_unref(OBJECT(cpu));
> +            }
> +        }
> +    }
>   }

It's probably hard to keep those ARMCPU objects forever. First of all, one vCPU
can be hot-added first and then hot-removed afterwards. With those ARMCPU objects
kept forever, the semantics of 'device_add' and 'device_del' become broken at least.
The ideal mechanism would be to avoid instantiating those ARMCPU objects and
destroying them soon afterwards. I don't know if ms->possible_cpus->cpus[] can fit
and how much effort is needed.

Thanks,
Gavin

>   
>   static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId *cpu_slot,




* Re: [PATCH RFC V3 18/29] arm/virt: Add/update basic hot-(un)plug framework
  2024-06-13 23:36 ` [PATCH RFC V3 18/29] arm/virt: Add/update basic hot-(un)plug framework Salil Mehta via
@ 2024-08-13  1:21   ` Gavin Shan
  2024-08-19 12:30     ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-13  1:21 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On 6/14/24 9:36 AM, Salil Mehta wrote:
> Add CPU hot-unplug hooks and update hotplug hooks with additional sanity checks
> for use in hotplug paths.
> 
> Note: The functional contents of the hooks (currently left with TODO comments)
> will be gradually filled in subsequent patches in an incremental approach to
> patch and logic building, which would roughly include the following:
> 
> 1. (Un)wiring of interrupts between vCPU<->GIC.
> 2. Sending events to the guest for hot-(un)plug so that the guest can take
>     appropriate actions.
> 3. Notifying the GIC about the hot-(un)plug action so that the vCPU can be
>     (un)stitched to the GIC CPU interface.
> 4. Updating the guest with next boot information for this vCPU in the firmware.
> 
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 105 insertions(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index a72cd3b20d..f6b8c21f26 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -85,6 +85,7 @@
>   #include "hw/virtio/virtio-iommu.h"
>   #include "hw/char/pl011.h"
>   #include "qemu/guest-random.h"
> +#include "qapi/qmp/qdict.h"
>   
>   static GlobalProperty arm_virt_compat[] = {
>       { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "48" },
> @@ -3002,11 +3003,23 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
>   static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                 Error **errp)
>   {
> +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>       MachineState *ms = MACHINE(hotplug_dev);
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
>       ARMCPU *cpu = ARM_CPU(dev);
>       CPUState *cs = CPU(dev);
>       CPUArchId *cpu_slot;
>   
> +    if (dev->hotplugged && !vms->acpi_dev) {
> +        error_setg(errp, "GED acpi device does not exist");
> +        return;
> +    }
> +
> +    if (dev->hotplugged && !mc->has_hotpluggable_cpus) {
> +        error_setg(errp, "CPU hotplug not supported on this machine");
> +        return;
> +    }
> +
>       /* sanity check the cpu */
>       if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
>           error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
> @@ -3049,6 +3062,22 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>       }
>       virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
>   
> +    /*
> +     * Fix the GIC for this new vCPU being plugged. The QOM CPU object for the
> +     * new vCPU needs to be updated in the corresponding QOM GICv3CPUState object.
> +     * We also need to re-wire the IRQs for this new CPU object. This update
> +     * is limited to the QOM only and does not affect KVM. The latter has
> +     * already been pre-sized with all possible CPUs at VM init time. This is a
> +     * workaround to the constraints posed by the ARM architecture w.r.t. supporting
> +     * CPU Hotplug. A specification does not exist for the latter.
> +     * This patch-up is required both for {cold,hot}-plugged vCPUs. Cold-inited
> +     * vCPUs have their GIC state initialized during machvirt_init().
> +     */
> +    if (vms->acpi_dev) {
> +        /* TODO: update GIC about this hotplug change here */
> +        /* TODO: wire the GIC<->CPU irqs */
> +    }
> +
>       /*
>        * To give persistent presence view of vCPUs to the guest, ACPI might need
>        * to fake the presence of the vCPUs to the guest but keep them disabled.
> @@ -3060,6 +3089,7 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>   static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                             Error **errp)
>   {
> +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>       MachineState *ms = MACHINE(hotplug_dev);
>       CPUState *cs = CPU(dev);
>       CPUArchId *cpu_slot;
> @@ -3068,10 +3098,81 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>       cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
>       cpu_slot->cpu = CPU(dev);
>   
> +    /*
> +     * Update the ACPI Hotplug state both for vCPUs being {hot,cold}-plugged.
> +     * vCPUs can be cold-plugged using '-device' option. For vCPUs being hot
> +     * plugged, guest is also notified.
> +     */
> +    if (vms->acpi_dev) {
> +        /* TODO: update acpi hotplug state. Send cpu hotplug event to guest */
> +        /* TODO: register cpu for reset & update F/W info for the next boot */
> +    }
> +
>       cs->disabled = false;
>       return;
>   }
>   
> +static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
> +                                    DeviceState *dev, Error **errp)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
> +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> +    ARMCPU *cpu = ARM_CPU(dev);
> +    CPUState *cs = CPU(dev);
> +
> +    if (!vms->acpi_dev || !dev->realized) {
> +        error_setg(errp, "GED does not exist or device is not realized!");
> +        return;
> +    }
> +

How can a vCPU be unplugged even when it hasn't been realized? :)

> +    if (!mc->has_hotpluggable_cpus) {
> +        error_setg(errp, "CPU hot(un)plug not supported on this machine");
> +        return;
> +    }
> +
> +    if (cs->cpu_index == first_cpu->cpu_index) {
> +        error_setg(errp, "Boot CPU(id%d=%d:%d:%d:%d) hot-unplug not supported",
> +                   first_cpu->cpu_index, cpu->socket_id, cpu->cluster_id,
> +                   cpu->core_id, cpu->thread_id);
> +        return;
> +    }
> +
> +    /* TODO: request cpu hotplug from guest */
> +
> +    return;
> +}
> +
> +static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                            Error **errp)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> +    MachineState *ms = MACHINE(hotplug_dev);
> +    CPUState *cs = CPU(dev);
> +    CPUArchId *cpu_slot;
> +
> +    if (!vms->acpi_dev || !dev->realized) {
> +        error_setg(errp, "GED does not exist or device is not realized!");
> +        return;
> +    }
> +

Same question as above.

> +    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
> +
> +    /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
> +
> +    /* TODO: unwire the gic-cpu irqs here */
> +    /* TODO: update the GIC about this hot unplug change */
> +
> +    /* TODO: unregister cpu for reset & update F/W info for the next boot */
> +
> +    qobject_unref(dev->opts);
> +    dev->opts = NULL;
> +
> +    cpu_slot->cpu = NULL;
> +    cs->disabled = true;
> +
> +    return;
> +}
> +

The 'return' isn't needed.

>   static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
>                                               DeviceState *dev, Error **errp)
>   {
> @@ -3196,6 +3297,8 @@ static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
>       } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
>           virtio_md_pci_unplug_request(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev),
>                                        errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        virt_cpu_unplug_request(hotplug_dev, dev, errp);
>       } else {
>           error_setg(errp, "device unplug request for unsupported device"
>                      " type: %s", object_get_typename(OBJECT(dev)));
> @@ -3209,6 +3312,8 @@ static void virt_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
>           virt_dimm_unplug(hotplug_dev, dev, errp);
>       } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
>           virtio_md_pci_unplug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        virt_cpu_unplug(hotplug_dev, dev, errp);
>       } else {
>           error_setg(errp, "virt: device unplug for unsupported device"
>                      " type: %s", object_get_typename(OBJECT(dev)));

Thanks,
Gavin




* Re: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-06-13 23:36 ` [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
@ 2024-08-16 15:37   ` Alex Bennée
  2024-08-16 15:50     ` Peter Maydell
  2024-08-19 12:35     ` Salil Mehta via
  0 siblings, 2 replies; 105+ messages in thread
From: Alex Bennée @ 2024-08-16 15:37 UTC (permalink / raw)
  To: Salil Mehta
  Cc: qemu-devel, qemu-arm, mst, maz, jean-philippe, jonathan.cameron,
	lpieralisi, peter.maydell, richard.henderson, imammedo,
	andrew.jones, david, philmd, eric.auger, will, ardb, oliver.upton,
	pbonzini, gshan, rafael, borntraeger, npiggin, harshpb, linux,
	darren, ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta,
	zhukeqian1, wangxiongfeng2, wangyanan55, jiakernel2, maobibo,
	lixianglai, shahuang, zhao1.liu, linuxarm

Salil Mehta <salil.mehta@huawei.com> writes:

> vCPU Hot-unplug will result in QOM CPU object unrealization which will do away
> with all the vCPU thread creations, allocations, registrations that happened
> as part of the realization process. This change introduces the ARM CPU unrealize
> function taking care of exactly that.
>
> Note: initialized KVM vCPUs are not destroyed in the host KVM, but their QEMU context
> is parked at the QEMU KVM layer.
>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> [VP: Identified CPU stall issue & suggested probable fix]
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>  target/arm/cpu.c       | 101 +++++++++++++++++++++++++++++++++++++++++
>  target/arm/cpu.h       |  14 ++++++
>  target/arm/gdbstub.c   |   6 +++
>  target/arm/helper.c    |  25 ++++++++++
>  target/arm/internals.h |   3 ++
>  target/arm/kvm.c       |   5 ++
>  6 files changed, 154 insertions(+)
>
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index c92162fa97..a3dc669309 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -157,6 +157,16 @@ void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
>      QLIST_INSERT_HEAD(&cpu->pre_el_change_hooks, entry, node);
>  }
>  
> +void arm_unregister_pre_el_change_hooks(ARMCPU *cpu)
> +{
> +    ARMELChangeHook *entry, *next;
> +
> +    QLIST_FOREACH_SAFE(entry, &cpu->pre_el_change_hooks, node, next) {
> +        QLIST_REMOVE(entry, node);
> +        g_free(entry);
> +    }
> +}
> +
>  void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
>                                   void *opaque)
>  {
> @@ -168,6 +178,16 @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
>      QLIST_INSERT_HEAD(&cpu->el_change_hooks, entry, node);
>  }
>  
> +void arm_unregister_el_change_hooks(ARMCPU *cpu)
> +{
> +    ARMELChangeHook *entry, *next;
> +
> +    QLIST_FOREACH_SAFE(entry, &cpu->el_change_hooks, node, next) {
> +        QLIST_REMOVE(entry, node);
> +        g_free(entry);
> +    }
> +}
> +
>  static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
>  {
>      /* Reset a single ARMCPRegInfo register */
> @@ -2552,6 +2572,85 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>      acc->parent_realize(dev, errp);
>  }
>  
> +static void arm_cpu_unrealizefn(DeviceState *dev)
> +{
> +    ARMCPUClass *acc = ARM_CPU_GET_CLASS(dev);
> +    ARMCPU *cpu = ARM_CPU(dev);
> +    CPUARMState *env = &cpu->env;
> +    CPUState *cs = CPU(dev);
> +    bool has_secure;
> +
> +    has_secure = cpu->has_el3 || arm_feature(env, ARM_FEATURE_M_SECURITY);
> +
> +    /* rock 'n' un-roll, whatever happened in the arm_cpu_realizefn cleanly */
> +    cpu_address_space_destroy(cs, ARMASIdx_NS);

On current master this will fail:

../../target/arm/cpu.c: In function ‘arm_cpu_unrealizefn’:
../../target/arm/cpu.c:2626:5: error: implicit declaration of function ‘cpu_address_space_destroy’ [-Werror=implicit-function-declaration]
 2626 |     cpu_address_space_destroy(cs, ARMASIdx_NS);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~
../../target/arm/cpu.c:2626:5: error: nested extern declaration of ‘cpu_address_space_destroy’ [-Werror=nested-externs]
cc1: all warnings being treated as errors

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



* Re: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-08-16 15:37   ` Alex Bennée
@ 2024-08-16 15:50     ` Peter Maydell
  2024-08-16 17:00       ` Peter Maydell
  2024-08-19 12:58       ` Salil Mehta via
  2024-08-19 12:35     ` Salil Mehta via
  1 sibling, 2 replies; 105+ messages in thread
From: Peter Maydell @ 2024-08-16 15:50 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Salil Mehta, qemu-devel, qemu-arm, mst, maz, jean-philippe,
	jonathan.cameron, lpieralisi, richard.henderson, imammedo,
	andrew.jones, david, philmd, eric.auger, will, ardb, oliver.upton,
	pbonzini, gshan, rafael, borntraeger, npiggin, harshpb, linux,
	darren, ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta,
	zhukeqian1, wangxiongfeng2, wangyanan55, jiakernel2, maobibo,
	lixianglai, shahuang, zhao1.liu, linuxarm

On Fri, 16 Aug 2024 at 16:37, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Salil Mehta <salil.mehta@huawei.com> writes:
>
> > vCPU Hot-unplug will result in QOM CPU object unrealization which will do away
> > with all the vCPU thread creations, allocations, registrations that happened
> > as part of the realization process. This change introduces the ARM CPU unrealize
> > function taking care of exactly that.
> >
> > Note, initialized KVM vCPUs are not destroyed in host KVM but their Qemu context
> > is parked at the QEMU KVM layer.
> >
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > [VP: Identified CPU stall issue & suggested probable fix]
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >  target/arm/cpu.c       | 101 +++++++++++++++++++++++++++++++++++++++++
> >  target/arm/cpu.h       |  14 ++++++
> >  target/arm/gdbstub.c   |   6 +++
> >  target/arm/helper.c    |  25 ++++++++++
> >  target/arm/internals.h |   3 ++
> >  target/arm/kvm.c       |   5 ++
> >  6 files changed, 154 insertions(+)
> >
> > diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> > index c92162fa97..a3dc669309 100644
> > --- a/target/arm/cpu.c
> > +++ b/target/arm/cpu.c
> > @@ -157,6 +157,16 @@ void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
> >      QLIST_INSERT_HEAD(&cpu->pre_el_change_hooks, entry, node);
> >  }
> >
> > +void arm_unregister_pre_el_change_hooks(ARMCPU *cpu)
> > +{
> > +    ARMELChangeHook *entry, *next;
> > +
> > +    QLIST_FOREACH_SAFE(entry, &cpu->pre_el_change_hooks, node, next) {
> > +        QLIST_REMOVE(entry, node);
> > +        g_free(entry);
> > +    }
> > +}
> > +
> >  void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
> >                                   void *opaque)
> >  {
> > @@ -168,6 +178,16 @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
> >      QLIST_INSERT_HEAD(&cpu->el_change_hooks, entry, node);
> >  }
> >
> > +void arm_unregister_el_change_hooks(ARMCPU *cpu)
> > +{
> > +    ARMELChangeHook *entry, *next;
> > +
> > +    QLIST_FOREACH_SAFE(entry, &cpu->el_change_hooks, node, next) {
> > +        QLIST_REMOVE(entry, node);
> > +        g_free(entry);
> > +    }
> > +}
> > +
> >  static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
> >  {
> >      /* Reset a single ARMCPRegInfo register */
> > @@ -2552,6 +2572,85 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
> >      acc->parent_realize(dev, errp);
> >  }
> >
> > +static void arm_cpu_unrealizefn(DeviceState *dev)
> > +{
> > +    ARMCPUClass *acc = ARM_CPU_GET_CLASS(dev);
> > +    ARMCPU *cpu = ARM_CPU(dev);
> > +    CPUARMState *env = &cpu->env;
> > +    CPUState *cs = CPU(dev);
> > +    bool has_secure;
> > +
> > +    has_secure = cpu->has_el3 || arm_feature(env, ARM_FEATURE_M_SECURITY);
> > +
> > +    /* rock 'n' un-roll, whatever happened in the arm_cpu_realizefn cleanly */
> > +    cpu_address_space_destroy(cs, ARMASIdx_NS);
>
> On current master this will fail:
>
> ../../target/arm/cpu.c: In function ‘arm_cpu_unrealizefn’:
> ../../target/arm/cpu.c:2626:5: error: implicit declaration of function ‘cpu_address_space_destroy’ [-Werror=implicit-function-declaration]
>  2626 |     cpu_address_space_destroy(cs, ARMASIdx_NS);
>       |     ^~~~~~~~~~~~~~~~~~~~~~~~~
> ../../target/arm/cpu.c:2626:5: error: nested extern declaration of ‘cpu_address_space_destroy’ [-Werror=nested-externs]
> cc1: all warnings being treated as errors

We shouldn't need to explicitly call cpu_address_space_destroy()
from a target-specific unrealize anyway: we can do it all
from the base class (and I think this would fix some
leaks in current code for targets that hot-unplug, though
I should check that). Otherwise you need to duplicate all
the logic for figuring out which address spaces we created
in realize, which is fragile and not necessary when all we
want to do is "delete every address space the CPU object has"
and we want to do that for every target architecture always.

-- PMM
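
[Editor's note: the base-class teardown Peter describes can be sketched in a few lines. The structs and the helper below are illustrative stand-ins, not QEMU's actual CPUState layout or API; the point is that the base class can walk whatever address spaces realize created and delete them all, with no target-specific knowledge.]

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for QEMU's CPUState/AddressSpace; names are illustrative. */
typedef struct AddressSpace {
    int dummy;
} AddressSpace;

typedef struct CPUState {
    int num_ases;          /* how many address spaces realize created */
    AddressSpace **ases;   /* the address spaces themselves */
} CPUState;

/*
 * Base-class teardown: delete every address space the CPU object has.
 * A target-specific unrealize then never has to re-derive which spaces
 * (secure, non-secure, tag memory, ...) were created during realize.
 */
static void cpu_destroy_all_address_spaces(CPUState *cs)
{
    for (int i = 0; i < cs->num_ases; i++) {
        free(cs->ases[i]);
        cs->ases[i] = NULL;
    }
    cs->num_ases = 0;
}
```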



* Re: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-08-16 15:50     ` Peter Maydell
@ 2024-08-16 17:00       ` Peter Maydell
  2024-08-19 12:59         ` Salil Mehta via
  2024-08-19 12:58       ` Salil Mehta via
  1 sibling, 1 reply; 105+ messages in thread
From: Peter Maydell @ 2024-08-16 17:00 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Salil Mehta, qemu-devel, qemu-arm, mst, maz, jean-philippe,
	jonathan.cameron, lpieralisi, richard.henderson, imammedo,
	andrew.jones, david, philmd, eric.auger, will, ardb, oliver.upton,
	pbonzini, gshan, rafael, borntraeger, npiggin, harshpb, linux,
	darren, ilkka, vishnu, karl.heubaum, miguel.luis, salil.mehta,
	zhukeqian1, wangxiongfeng2, wangyanan55, jiakernel2, maobibo,
	lixianglai, shahuang, zhao1.liu, linuxarm

On Fri, 16 Aug 2024 at 16:50, Peter Maydell <peter.maydell@linaro.org> wrote:
> We shouldn't need to explicitly call cpu_address_space_destroy()
> from a target-specific unrealize anyway: we can do it all
> from the base class (and I think this would fix some
> leaks in current code for targets that hot-unplug, though
> I should check that). Otherwise you need to duplicate all
> the logic for figuring out which address spaces we created
> in realize, which is fragile and not necessary when all we
> want to do is "delete every address space the CPU object has"
> and we want to do that for every target architecture always.

I have a patch to do this now, but I need to test it a bit more
and confirm (or disprove) my hypothesis that we're currently
leaking memory on existing architectures with vCPU
hot-unplug before I send it out.

-- PMM



* Re: [PATCH RFC V3 06/29] arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
  2024-06-13 23:36 ` [PATCH RFC V3 06/29] arm/virt, kvm: Pre-create disabled possible vCPUs @machine init Salil Mehta via
  2024-08-13  0:58   ` [PATCH RFC V3 06/29] arm/virt,kvm: " Gavin Shan
@ 2024-08-19  5:31   ` Gavin Shan
  2024-08-19 13:06     ` Salil Mehta via
  1 sibling, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-19  5:31 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

On 6/14/24 9:36 AM, Salil Mehta wrote:
> In the ARMv8 architecture, the GIC must know all the CPUs it is connected to
> during its initialization, and this cannot change afterward. This must be
> ensured during the initialization of the VGIC as well in KVM, which requires all
> vCPUs to be created and present during its initialization. This is necessary
> because:
> 
> 1. The association between GICC and MPIDR must be fixed at VM initialization
>     time. This is represented by the register `GIC_TYPER(mp_affinity, proc_num)`.
> 2. GICC (CPU interfaces), GICR (redistributors), etc., must all be initialized
>     at boot time.
> 3. Memory regions associated with GICR, etc., cannot be changed (added, deleted,
>     or modified) after the VM has been initialized.
> 
> This patch adds support to pre-create all possible vCPUs within the host using
> the KVM interface as part of the virtual machine initialization. These vCPUs can
> later be attached to QOM/ACPI when they are actually hot-plugged and made
> present.
> 
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> [VP: Identified CPU stall issue & suggested probable fix]
> ---
>   hw/arm/virt.c         | 56 +++++++++++++++++++++++++++++++++++--------
>   include/hw/core/cpu.h |  1 +
>   target/arm/cpu64.c    |  1 +
>   target/arm/kvm.c      | 41 ++++++++++++++++++++++++++++++-
>   target/arm/kvm_arm.h  | 11 +++++++++
>   5 files changed, 99 insertions(+), 11 deletions(-)
> 

The vCPU file descriptor is associated with a feature bitmap when the file descriptor
is initialized by ioctl(vm_fd, KVM_ARM_VCPU_INIT, &init). The feature bitmap is derived
from the vCPU properties, and those properties can differ between the first
initialization of the file descriptor, when the vCPU is instantiated, and the
re-initialization when the vCPU is hot added.

This can lead to a system crash as shown below. We probably need a mechanism to disallow
passing extra properties when a vCPU is hot added, to avoid conflicts with the global
properties from the command line "-cpu host,pmu=on". Some of the properties, like "id"
and "socket-id", are still needed.

/home/gavin/sandbox/qemu.main/build/qemu-system-aarch64                  \
-accel kvm -machine virt,gic-version=host,nvdimm=on                      \
-cpu host -smp maxcpus=2,cpus=1,sockets=2,clusters=1,cores=1,threads=1   \
-m 4096M,slots=16,maxmem=128G                                            \
-object memory-backend-ram,id=mem0,size=2048M                            \
-object memory-backend-ram,id=mem1,size=2048M                            \
-numa node,nodeid=0,memdev=mem0,cpus=0-0                                 \
-numa node,nodeid=1,memdev=mem1,cpus=1-1                                 \
-L /home/gavin/sandbox/qemu.main/build/pc-bios                           \
-monitor none -serial mon:stdio -nographic                               \
-gdb tcp::6666 -qmp tcp:localhost:5555,server,wait=off                   \
-bios /home/gavin/sandbox/qemu.main/build/pc-bios/edk2-aarch64-code.fd   \
-kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image            \
-initrd /home/gavin/sandbox/images/rootfs.cpio.xz                        \
-append memhp_default_state=online_movable                               \
     :
(qemu) device_add host-arm-cpu,id=cpu1,socket-id=1,pmu=off
kvm_arch_init_vcpu: Error -22 from kvm_arm_vcpu_init()
qemu-system-aarch64: kvm_init_vcpu: kvm_arch_init_vcpu failed (1): Invalid argument

Thanks,
Gavin
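
[Editor's note: the failure mode Gavin reports can be modeled in a few lines. This is only a sketch of the constraint with made-up types, not KVM's real implementation: the feature bitmap is fixed at the first KVM_ARM_VCPU_INIT, so a hot-add that re-initializes the parked vCPU with different properties (here, pmu=off vs. the original pmu=on) gets -EINVAL back.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model of a parked KVM vCPU; not KVM's real data structures. */
typedef struct {
    bool initialized;
    uint64_t features;          /* e.g. bit 0 = PMU enabled */
} ParkedVcpu;

#define MODEL_EINVAL (-22)      /* mirrors the -22 in the crash log above */

/* First init records the feature bitmap; any re-init must match it. */
static int vcpu_init(ParkedVcpu *v, uint64_t features)
{
    if (!v->initialized) {
        v->initialized = true;
        v->features = features;
        return 0;
    }
    return (features == v->features) ? 0 : MODEL_EINVAL;
}
```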




* RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2024-08-12  4:35   ` [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property Gavin Shan
  2024-08-12  8:15     ` Igor Mammedov
@ 2024-08-19 11:53     ` Salil Mehta via
  2024-09-04 14:42       ` zhao1.liu
  1 sibling, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-19 11:53 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

Sorry, I was away for almost the entire last week and rejoined today.
Thanks for taking the time to review.

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Monday, August 12, 2024 5:36 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > This shall be used to store user specified
>  > topology{socket,cluster,core,thread}
>  > and shall be converted to a unique 'vcpu-id' which is used as
>  > slot-index during hot(un)plug of vCPU.
>  >
>  > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > ---
>  >   hw/arm/virt.c         | 10 ++++++++++
>  >   include/hw/arm/virt.h | 28 ++++++++++++++++++++++++++++
>  >   target/arm/cpu.c      |  4 ++++
>  >   target/arm/cpu.h      |  4 ++++
>  >   4 files changed, 46 insertions(+)
>  >
>  
>  Those 4 properties are introduced to determine the vCPU's slot, which is the
>  index to MachineState::possible_cpus::cpus[]. 

Correct.

>  From there, the CPU object
>  or instance is referenced and then the CPU's state can be further
>  determined. It sounds reasonable to use the CPU's topology to determine
>  the index. However, I'm wandering if this can be simplified to use 'cpu-
>  index' or 'index' 

Are you suggesting using a CPU index when specifying vCPUs on the
command line? I'm not even sure how that would simplify CPU naming.

The CPU index is an index internal to QOM. The closest thing you could
have is a 'slot-id', which could later be mapped to the CPU index
internally, but I'm not sure how useful it is to introduce this
slot abstraction. I did raise this in the original RFC I posted in 2020.


>  for a couple of facts: (1) 'cpu-index'
>  or 'index' is simplified. Users have to provide 4 parameters in order to
>  determine its index in the extreme case, for example "device_add host-
>  arm-cpu, id=cpu7,socket-id=1, cluster-id=1,core-id=1,thread-id=1". With
>  'cpu-index' or 'index', it can be simplified to 'index=7'. (2) The cold-booted
>  and hotpluggable CPUs are determined by their index instead of their
>  topology. For example, CPU0/1/2/3 are cold-booted CPUs while CPU4/5/6/7
>  are hotpluggable CPUs with command lines '-smp maxcpus=8,cpus=4'. So
>  'index' makes more sense to identify a vCPU's slot.


I'm not sure anybody wants to use it this way. People want to specify the
topology, i.e. where the vCPU fits. Internally, it is up to QOM to translate
that topology into some index.
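
[Editor's note: for illustration, the topology-to-index translation mentioned here can be a plain mixed-radix computation over the -smp geometry. The struct and function below are hypothetical, not QEMU code; QEMU actually matches the IDs against possible_cpus->cpus[], but the arithmetic shows why four topology IDs determine a unique slot.]

```c
#include <assert.h>

/* Hypothetical -smp geometry; field names are illustrative. */
typedef struct {
    int clusters_per_socket;
    int cores_per_cluster;
    int threads_per_core;
} SmpGeometry;

/* Mixed-radix mapping: (socket, cluster, core, thread) -> slot index. */
static int topo_to_slot_index(const SmpGeometry *g,
                              int socket, int cluster, int core, int thread)
{
    return ((socket * g->clusters_per_socket + cluster)
                * g->cores_per_cluster + core)
                * g->threads_per_core + thread;
}
```

With a geometry like '-smp maxcpus=8,sockets=2,clusters=1,cores=2,threads=2', the last possible vCPU (socket-id=1, core-id=1, thread-id=1) maps to index 7, matching the 'index=7' shorthand in the example above.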


>  
>  > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>  > 3c93c0c0a6..11fc7fc318 100644
>  > --- a/hw/arm/virt.c
>  > +++ b/hw/arm/virt.c
>  > @@ -2215,6 +2215,14 @@ static void machvirt_init(MachineState
>  *machine)
>  >                             &error_fatal);
>  >
>  >           aarch64 &= object_property_get_bool(cpuobj, "aarch64",
>  > NULL);
>  > +        object_property_set_int(cpuobj, "socket-id",
>  > +                                virt_get_socket_id(machine, n), NULL);
>  > +        object_property_set_int(cpuobj, "cluster-id",
>  > +                                virt_get_cluster_id(machine, n), NULL);
>  > +        object_property_set_int(cpuobj, "core-id",
>  > +                                virt_get_core_id(machine, n), NULL);
>  > +        object_property_set_int(cpuobj, "thread-id",
>  > +                                virt_get_thread_id(machine, n),
>  > + NULL);
>  >
>  >           if (!vms->secure) {
>  >               object_property_set_bool(cpuobj, "has_el3", false,
>  > NULL); @@ -2708,6 +2716,7 @@ static const CPUArchIdList
>  *virt_possible_cpu_arch_ids(MachineState *ms)
>  >   {
>  >       int n;
>  >       unsigned int max_cpus = ms->smp.max_cpus;
>  > +    unsigned int smp_threads = ms->smp.threads;
>  >       VirtMachineState *vms = VIRT_MACHINE(ms);
>  >       MachineClass *mc = MACHINE_GET_CLASS(vms);
>  >
>  > @@ -2721,6 +2730,7 @@ static const CPUArchIdList
>  *virt_possible_cpu_arch_ids(MachineState *ms)
>  >       ms->possible_cpus->len = max_cpus;
>  >       for (n = 0; n < ms->possible_cpus->len; n++) {
>  >           ms->possible_cpus->cpus[n].type = ms->cpu_type;
>  > +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
>  >           ms->possible_cpus->cpus[n].arch_id =
>  >               virt_cpu_mp_affinity(vms, n);
>  >
>  
>  Why @vcpus_count is initialized to @smp_threads? it needs to be
>  documented in the commit log.


Because every thread internally amounts to a vCPU in QOM, which is in a 1:1
relationship with a KVM vCPU. AFAIK, QOM does not strictly follow any
particular architecture. Once you get into the details of threads, there are
many aspects of shared resources to consider, and these can vary across
different implementations of an architecture.

It is a bigger problem than you might think, one which I touched on at a very
nascent stage while doing the PoC of vCPU hotplug but have tried to avoid
until now.


But I would like to hear other community members views on this.

Hi Igor/Peter,

What is your take on this?

Thanks
Salil.



>  > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h index
>  > bb486d36b1..6f9a7bb60b 100644
>  > --- a/include/hw/arm/virt.h
>  > +++ b/include/hw/arm/virt.h
>  > @@ -209,4 +209,32 @@ static inline int
>  virt_gicv3_redist_region_count(VirtMachineState *vms)
>  >               vms->highmem_redists) ? 2 : 1;
>  >   }
>  >
>  > +static inline int virt_get_socket_id(const MachineState *ms, int
>  > +cpu_index) {
>  > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
>  > +
>  > +    return ms->possible_cpus->cpus[cpu_index].props.socket_id;
>  > +}
>  > +
>  > +static inline int virt_get_cluster_id(const MachineState *ms, int
>  > +cpu_index) {
>  > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
>  > +
>  > +    return ms->possible_cpus->cpus[cpu_index].props.cluster_id;
>  > +}
>  > +
>  > +static inline int virt_get_core_id(const MachineState *ms, int
>  > +cpu_index) {
>  > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
>  > +
>  > +    return ms->possible_cpus->cpus[cpu_index].props.core_id;
>  > +}
>  > +
>  > +static inline int virt_get_thread_id(const MachineState *ms, int
>  > +cpu_index) {
>  > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
>  > +
>  > +    return ms->possible_cpus->cpus[cpu_index].props.thread_id;
>  > +}
>  > +
>  >   #endif /* QEMU_ARM_VIRT_H */
>  > diff --git a/target/arm/cpu.c b/target/arm/cpu.c index
>  > 77f8c9c748..abc4ed0842 100644
>  > --- a/target/arm/cpu.c
>  > +++ b/target/arm/cpu.c
>  > @@ -2582,6 +2582,10 @@ static Property arm_cpu_properties[] = {
>  >       DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
>  >                           mp_affinity, ARM64_AFFINITY_INVALID),
>  >       DEFINE_PROP_INT32("node-id", ARMCPU, node_id,
>  > CPU_UNSET_NUMA_NODE_ID),
>  > +    DEFINE_PROP_INT32("socket-id", ARMCPU, socket_id, 0),
>  > +    DEFINE_PROP_INT32("cluster-id", ARMCPU, cluster_id, 0),
>  > +    DEFINE_PROP_INT32("core-id", ARMCPU, core_id, 0),
>  > +    DEFINE_PROP_INT32("thread-id", ARMCPU, thread_id, 0),
>  >       DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1),
>  >       /* True to default to the backward-compat old CNTFRQ rather than
>  1Ghz */
>  >       DEFINE_PROP_BOOL("backcompat-cntfrq", ARMCPU,
>  backcompat_cntfrq,
>  > false), diff --git a/target/arm/cpu.h b/target/arm/cpu.h index
>  > c17264c239..208c719db3 100644
>  > --- a/target/arm/cpu.h
>  > +++ b/target/arm/cpu.h
>  > @@ -1076,6 +1076,10 @@ struct ArchCPU {
>  >       QLIST_HEAD(, ARMELChangeHook) el_change_hooks;
>  >
>  >       int32_t node_id; /* NUMA node this CPU belongs to */
>  > +    int32_t socket_id;
>  > +    int32_t cluster_id;
>  > +    int32_t core_id;
>  > +    int32_t thread_id;
>  >
>  >       /* Used to synchronize KVM and QEMU in-kernel device levels */
>  >       uint8_t device_irq_level;
>  
>  Thanks,
>  Gavin
>  



* RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2024-08-12  8:15     ` Igor Mammedov
  2024-08-13  0:31       ` Gavin Shan
@ 2024-08-19 12:07       ` Salil Mehta via
  1 sibling, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-08-19 12:07 UTC (permalink / raw)
  To: Igor Mammedov, Gavin Shan
  Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com,
	maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, andrew.jones@linux.dev,
	david@redhat.com, philmd@linaro.org, eric.auger@redhat.com,
	will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev,
	pbonzini@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com,
	alex.bennee@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
	linux@armlinux.org.uk, darren@os.amperecomputing.com,
	ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
	karl.heubaum@oracle.com, miguel.luis@oracle.com,
	salil.mehta@opnsrc.net, zhukeqian, wangxiongfeng (C),
	wangyanan (Y), jiakernel2@gmail.com, maobibo@loongson.cn,
	lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com,
	Linuxarm

>  From: Igor Mammedov <imammedo@redhat.com>
>  Sent: Monday, August 12, 2024 9:16 AM
>  To: Gavin Shan <gshan@redhat.com>
>  
>  On Mon, 12 Aug 2024 14:35:56 +1000
>  Gavin Shan <gshan@redhat.com> wrote:
>  
>  > On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > > This shall be used to store user specified
>  > > topology{socket,cluster,core,thread}
>  > > and shall be converted to a unique 'vcpu-id' which is used as
>  > > slot-index during hot(un)plug of vCPU.
>  > >
>  > > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > > ---
>  > >   hw/arm/virt.c         | 10 ++++++++++
>  > >   include/hw/arm/virt.h | 28 ++++++++++++++++++++++++++++
>  > >   target/arm/cpu.c      |  4 ++++
>  > >   target/arm/cpu.h      |  4 ++++
>  > >   4 files changed, 46 insertions(+)
>  > >
>  >
>  > Those 4 properties are introduced to determine the vCPU's slot, which
>  > is the index to MachineState::possible_cpus::cpus[]. From there, the
>  > CPU object or instance is referenced and then the CPU's state can be
>  > further determined. It sounds reasonable to use the CPU's topology to
>  > determine the index. However, I'm wandering if this can be simplified to
>  use 'cpu-index' or 'index' for a couple of facts: (1) 'cpu-index'
>  
>  Please, don't. We've spent a bunch of time to get rid of cpu-index in user
>  visible interface (well, old NUMA CLI is still there along with 'new' topology
>  based one, but that's the last one).


Agreed. We shouldn't expose the CPU index to the user.

>  
>  > or 'index' is simplified. Users have to provide 4 parameters in order
>  > to determine its index in the extreme case, for example "device_add
>  > host-arm-cpu, id=cpu7,socket-id=1,
>  > cluster-id=1,core-id=1,thread-id=1". With 'cpu-index' or 'index', it
>  > can be simplified to 'index=7'. (2) The cold-booted and hotpluggable
>  > CPUs are determined by their index instead of their topology. For
>  > example, CPU0/1/2/3 are cold-booted CPUs while CPU4/5/6/7 are
>  hotpluggable CPUs with command lines '-smp maxcpus=8,cpus=4'. So 'index'
>  makes more sense to identify a vCPU's slot.
>  cpu-index have been used for hotplug with x86 machines as a starting point
>  to implement hotplug as it was easy to hack and it has already existed in
>  QEMU.
>  
>  But that didn't scale as was desired and had its own issues.
>  Hence the current interface that majority agreed upon.
>  I don't remember exact arguments anymore (they could be found qemu-
>  devel if needed) Here is a link to the talk that tried to explain why topo
>  based was introduced.
>  
>  http://events17.linuxfoundation.org/sites/events/files/slides/CPU%20Hot-
>  plug%20support%20in%20QEMU.pdf


I think you are referring to slide 19 of the above presentation?

Thanks
Salil.



* RE: [PATCH RFC V3 11/29] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  2024-08-13  1:04   ` Gavin Shan
@ 2024-08-19 12:10     ` Salil Mehta via
  2024-08-20  0:22       ` Gavin Shan
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-19 12:10 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Tuesday, August 13, 2024 2:05 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > ACPI CPU hotplug state (is_present=_STA.PRESENT,
>  > is_enabled=_STA.ENABLED) for all the possible vCPUs MUST be
>  > initialized during machine init. This is done during the creation of
>  > the GED device. VMM/Qemu MUST expose/fake the ACPI state of the
>  > disabled vCPUs to the Guest kernel as 'present' (_STA.PRESENT) always
>  > i.e. ACPI persistent. if the 'disabled' vCPU objectes are destroyed
>  > before the GED device has been created then their ACPI hotplug state
>  > might not get initialized correctly as acpi_persistent flag is part of the
>  CPUState. This will expose wrong status of the unplugged vCPUs to the
>  Guest kernel.
>  >
>  > Hence, moving the GED device creation before disabled vCPU objects get
>  > destroyed as part of the post CPU init routine.
>  >
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > ---
>  >   hw/arm/virt.c | 10 +++++++---
>  >   1 file changed, 7 insertions(+), 3 deletions(-)
>  >
>  > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>  > 918bcb9a1b..5f98162587 100644
>  > --- a/hw/arm/virt.c
>  > +++ b/hw/arm/virt.c
>  > @@ -2467,6 +2467,12 @@ static void machvirt_init(MachineState
>  > *machine)
>  >
>  >       create_gic(vms, sysmem);
>  >
>  > +    has_ged = has_ged && aarch64 && firmware_loaded &&
>  > +              virt_is_acpi_enabled(vms);
>  > +    if (has_ged) {
>  > +        vms->acpi_dev = create_acpi_ged(vms);
>  > +    }
>  > +
>  >       virt_cpu_post_init(vms, sysmem);
>  >
>  >       fdt_add_pmu_nodes(vms);
>  > @@ -2489,9 +2495,7 @@ static void machvirt_init(MachineState
>  *machine)
>  >
>  >       create_pcie(vms);
>  >
>  > -    if (has_ged && aarch64 && firmware_loaded &&
>  virt_is_acpi_enabled(vms)) {
>  > -        vms->acpi_dev = create_acpi_ged(vms);
>  > -    } else {
>  > +    if (!has_ged) {
>  >           create_gpio_devices(vms, VIRT_GPIO, sysmem);
>  >       }
>  >
>  
>  It's likely the GPIO device can be created before those disabled CPU objects
>  are destroyed. It means the whole chunk of code can be moved together, I
>  think.

I was not totally sure of this; hence, I kept the order of the rest as it was.
I can definitely check again whether we can do that to reduce the change.

Thanks
Salil.



>  
>  Thanks,
>  Gavin
>  



* RE: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-08-13  1:17   ` Gavin Shan
@ 2024-08-19 12:21     ` Salil Mehta via
  2024-08-20  0:05       ` Gavin Shan
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-19 12:21 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Tuesday, August 13, 2024 2:17 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > During `machvirt_init()`, QOM ARMCPU objects are pre-created along
>  > with the corresponding KVM vCPUs in the host for all possible vCPUs.
>  > This is necessary due to the architectural constraint that KVM
>  > restricts the deferred creation of KVM vCPUs and VGIC
>  > initialization/sizing after VM initialization. Hence, VGIC is pre-sized with possible vCPUs.
>  >
>  > After the initialization of the machine is complete, the disabled
>  > possible KVM vCPUs are parked in the per-virt-machine list
>  > "kvm_parked_vcpus," and we release the QOM ARMCPU objects for the
>  > disabled vCPUs. These will be re-created when the vCPU is hotplugged
>  > again. The QOM ARMCPU object is then re-attached to the corresponding parked KVM vCPU.
>  >
>  > Alternatively, we could have chosen not to release the QOM CPU objects
>  > and kept reusing them. This approach might require some modifications
>  > to the `qdevice_add()` interface to retrieve the old ARMCPU object
>  > instead of creating a new one for the hotplug request.
>  >
>  > Each of these approaches has its own pros and cons. This prototype
>  > uses the first approach (suggestions are welcome!).
>  >
>  > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > ---
>  >   hw/arm/virt.c | 32 ++++++++++++++++++++++++++++++++
>  >   1 file changed, 32 insertions(+)
>  >
>  > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>  > 9d33f30a6a..a72cd3b20d 100644
>  > --- a/hw/arm/virt.c
>  > +++ b/hw/arm/virt.c
>  > @@ -2050,6 +2050,7 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
>  >   {
>  >       CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
>  >       int max_cpus = MACHINE(vms)->smp.max_cpus;
>  > +    MachineState *ms = MACHINE(vms);
>  >       bool aarch64, steal_time;
>  >       CPUState *cpu;
>  >       int n;
>  > @@ -2111,6 +2112,37 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
>  >               }
>  >           }
>  >       }
>  > +
>  > +    if (kvm_enabled() || tcg_enabled()) {
>  > +        for (n = 0; n < possible_cpus->len; n++) {
>  > +            cpu = qemu_get_possible_cpu(n);
>  > +
>  > +            /*
>  > +             * Now, GIC has been sized with possible CPUs and we dont require
>  > +             * disabled vCPU objects to be represented in the QOM. Release the
>  > +             * disabled ARMCPU objects earlier used during init for pre-sizing.
>  > +             *
>  > +             * We fake to the guest through ACPI about the presence(_STA.PRES=1)
>  > +             * of these non-existent vCPUs at VMM/qemu and present these as
>  > +             * disabled vCPUs(_STA.ENA=0) so that they cant be used. These vCPUs
>  > +             * can be later added to the guest through hotplug exchanges when
>  > +             * ARMCPU objects are created back again using 'device_add' QMP
>  > +             * command.
>  > +             */
>  > +            /*
>  > +             * RFC: Question: Other approach could've been to keep them forever
>  > +             * and release it only once when qemu exits as part of finalize or
>  > +             * when new vCPU is hotplugged. In the later old could be released
>  > +             * for the newly created object for the same vCPU?
>  > +             */
>  > +            if (!qemu_enabled_cpu(cpu)) {
>  > +                CPUArchId *cpu_slot;
>  > +                cpu_slot = virt_find_cpu_slot(ms, cpu->cpu_index);
>  > +                cpu_slot->cpu = NULL;
>  > +                object_unref(OBJECT(cpu));
>  > +            }
>  > +        }
>  > +    }
>  >   }
>  
>  It's probably hard to keep those ARMCPU objects forever. First of all, one
>  vCPU can be hot-added first and then hot-removed afterwards. With those
>  ARMCPU objects kept forever, the syntax of 'device_add' and 'device_del'
>  become broken at least.

I had prototyped both approaches about 4 years back. Yes, the interface problem with
device_add was solved by keeping the old vCPU object around: on device_add,
instead of creating a new vCPU object, we reused the old one and then called
qdev_realize() on it.

But the bigger problem with this approach is migration. Only realized objects have
state that is migrated, so while it might look cleaner in one respect, it had its
own issues.

I think I did share a prototype of this with Igor, who was not in agreement with it
and wanted the vCPU objects to be destroyed as on x86. Hence, we stuck with
the current approach.
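For illustration, the lifecycle trade-off described above can be sketched with a toy model. This is not QEMU code; `VCpuObj`, `vcpu_realize()` and `vcpu_unrealize()` are invented names. The sketch assumes migration state is registered only while an object is realized, which is the reason a kept-but-unrealized vCPU object has nothing for migration to stream:

```c
/*
 * Illustrative sketch only -- not QEMU code. It models the point above:
 * migration state is assumed to be registered only while an object is
 * realized, so a vCPU object that is kept around but unrealized is
 * invisible to migration until it is realized again.
 */
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool realized;
    bool state_registered;   /* stand-in for vmstate_register() */
} VCpuObj;

void vcpu_realize(VCpuObj *c)
{
    c->realized = true;
    c->state_registered = true;    /* migration state exists from here... */
}

void vcpu_unrealize(VCpuObj *c)
{
    c->state_registered = false;   /* ...and is gone again from here */
    c->realized = false;
}
```

Under this model, a hot-unplugged vCPU whose object is retained sits in the unrealized state until a later device_add re-realizes it.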


>  The ideal mechanism would be to avoid instantiating those ARMCPU objects
>  and destroying them soon. I don't know if ms->possible_cpus->cpus[] can
>  fit and how much effort is needed.

This is what we are doing now in the current approach. Please see the KVMForum
2023 slides and the cover letter of RFC V3 for more details.



Thanks
Salil.

>  
>  Thanks,
>  Gavin


^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 18/29] arm/virt: Add/update basic hot-(un)plug framework
  2024-08-13  1:21   ` Gavin Shan
@ 2024-08-19 12:30     ` Salil Mehta via
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-08-19 12:30 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Tuesday, August 13, 2024 2:21 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > Add CPU hot-unplug hooks and update hotplug hooks with additional
>  > sanity checks for use in hotplug paths.
>  >
>  > Note: The functional contents of the hooks (currently left with TODO
>  > comments) will be gradually filled in subsequent patches in an
>  > incremental approach to patch and logic building, which would roughly
>  include the following:
>  >
>  > 1. (Un)wiring of interrupts between vCPU<->GIC.
>  > 2. Sending events to the guest for hot-(un)plug so that the guest can take
>  >     appropriate actions.
>  > 3. Notifying the GIC about the hot-(un)plug action so that the vCPU can be
>  >     (un)stitched to the GIC CPU interface.
>  > 4. Updating the guest with next boot information for this vCPU in the
>  firmware.
>  >
>  > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > ---
>  >   hw/arm/virt.c | 105 ++++++++++++++++++++++++++++++++++++++++++
>  >   1 file changed, 105 insertions(+)
>  >
>  > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>  > a72cd3b20d..f6b8c21f26 100644
>  > --- a/hw/arm/virt.c
>  > +++ b/hw/arm/virt.c
>  > @@ -85,6 +85,7 @@
>  >   #include "hw/virtio/virtio-iommu.h"
>  >   #include "hw/char/pl011.h"
>  >   #include "qemu/guest-random.h"
>  > +#include "qapi/qmp/qdict.h"
>  >
>  >   static GlobalProperty arm_virt_compat[] = {
>  >       { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "48" },
>  > @@ -3002,11 +3003,23 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
>  >   static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  >                                 Error **errp)
>  >   {
>  > +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>  >       MachineState *ms = MACHINE(hotplug_dev);
>  > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
>  >       ARMCPU *cpu = ARM_CPU(dev);
>  >       CPUState *cs = CPU(dev);
>  >       CPUArchId *cpu_slot;
>  >
>  > +    if (dev->hotplugged && !vms->acpi_dev) {
>  > +        error_setg(errp, "GED acpi device does not exists");
>  > +        return;
>  > +    }
>  > +
>  > +    if (dev->hotplugged && !mc->has_hotpluggable_cpus) {
>  > +        error_setg(errp, "CPU hotplug not supported on this machine");
>  > +        return;
>  > +    }
>  > +
>  >       /* sanity check the cpu */
>  >       if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
>  >           error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
>  > @@ -3049,6 +3062,22 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  >       }
>  >       virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
>  >
>  > +    /*
>  > +     * Fix the GIC for this new vCPU being plugged. The QOM CPU object for the
>  > +     * new vCPU need to be updated in the corresponding QOM GICv3CPUState object
>  > +     * We also need to re-wire the IRQs for this new CPU object. This update
>  > +     * is limited to the QOM only and does not affects the KVM. Later has
>  > +     * already been pre-sized with possible CPU at VM init time. This is a
>  > +     * workaround to the constraints posed by ARM architecture w.r.t supporting
>  > +     * CPU Hotplug. Specification does not exist for the later.
>  > +     * This patch-up is required both for {cold,hot}-plugged vCPUs. Cold-inited
>  > +     * vCPUs have their GIC state initialized during machvit_init().
>  > +     */
>  > +    if (vms->acpi_dev) {
>  > +        /* TODO: update GIC about this hotplug change here */
>  > +        /* TODO: wire the GIC<->CPU irqs */
>  > +    }
>  > +
>  >       /*
>  >        * To give persistent presence view of vCPUs to the guest, ACPI might need
>  >        * to fake the presence of the vCPUs to the guest but keep them disabled.
>  > @@ -3060,6 +3089,7 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  >   static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  >                             Error **errp)
>  >   {
>  > +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>  >       MachineState *ms = MACHINE(hotplug_dev);
>  >       CPUState *cs = CPU(dev);
>  >       CPUArchId *cpu_slot;
>  > @@ -3068,10 +3098,81 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  >       cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
>  >       cpu_slot->cpu = CPU(dev);
>  >
>  > +    /*
>  > +     * Update the ACPI Hotplug state both for vCPUs being {hot,cold}-plugged.
>  > +     * vCPUs can be cold-plugged using '-device' option. For vCPUs being hot
>  > +     * plugged, guest is also notified.
>  > +     */
>  > +    if (vms->acpi_dev) {
>  > +        /* TODO: update acpi hotplug state. Send cpu hotplug event to guest */
>  > +        /* TODO: register cpu for reset & update F/W info for the next boot */
>  > +    }
>  > +
>  >       cs->disabled = false;
>  >       return;
>  >   }
>  >
>  > +static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
>  > +                                    DeviceState *dev, Error **errp) {
>  > +    MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
>  > +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>  > +    ARMCPU *cpu = ARM_CPU(dev);
>  > +    CPUState *cs = CPU(dev);
>  > +
>  > +    if (!vms->acpi_dev || !dev->realized) {
>  > +        error_setg(errp, "GED does not exists or device is not realized!");
>  > +        return;
>  > +    }
>  > +
>  
>  How can a vCPU be unplugged even when it hasn't been realized? :)

😊

This is a relic from the past which exists because of the two approaches I prototyped
earlier; in the other approach the vCPU objects could still exist at this point.
Thanks for pointing it out.


>  
>  > +    if (!mc->has_hotpluggable_cpus) {
>  > +        error_setg(errp, "CPU hot(un)plug not supported on this machine");
>  > +        return;
>  > +    }
>  > +
>  > +    if (cs->cpu_index == first_cpu->cpu_index) {
>  > +        error_setg(errp, "Boot CPU(id%d=%d:%d:%d:%d) hot-unplug not supported",
>  > +                   first_cpu->cpu_index, cpu->socket_id, cpu->cluster_id,
>  > +                   cpu->core_id, cpu->thread_id);
>  > +        return;
>  > +    }
>  > +
>  > +    /* TODO: request cpu hotplug from guest */
>  > +
>  > +    return;
>  > +}
>  > +
>  > +static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  > +                            Error **errp) {
>  > +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>  > +    MachineState *ms = MACHINE(hotplug_dev);
>  > +    CPUState *cs = CPU(dev);
>  > +    CPUArchId *cpu_slot;
>  > +
>  > +    if (!vms->acpi_dev || !dev->realized) {
>  > +        error_setg(errp, "GED does not exists or device is not realized!");
>  > +        return;
>  > +    }
>  > +
>  
>  Same question as above.


Same answer. 😊


>  
>  > +    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
>  > +
>  > +    /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
>  > +
>  > +    /* TODO: unwire the gic-cpu irqs here */
>  > +    /* TODO: update the GIC about this hot unplug change */
>  > +
>  > +    /* TODO: unregister cpu for reset & update F/W info for the next boot */
>  > +
>  > +    qobject_unref(dev->opts);
>  > +    dev->opts = NULL;
>  > +
>  > +    cpu_slot->cpu = NULL;
>  > +    cs->disabled = true;
>  > +
>  > +    return;
>  > +}
>  > +
>  
>  The 'return' isn't needed.


Yep, well spotted. It actually does not exist in the code once all the patches are applied;
it looks like it was removed in a subsequent patch after being added here. I will fix this
instance, and the other one in the subsequent patch where it is removed.

Thanks for pointing.

Salil.


^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-08-16 15:37   ` Alex Bennée
  2024-08-16 15:50     ` Peter Maydell
@ 2024-08-19 12:35     ` Salil Mehta via
  2024-08-28 20:23       ` Gustavo Romero
  1 sibling, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-19 12:35 UTC (permalink / raw)
  To: Alex Bennée
  Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com,
	maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com, npiggin@gmail.com,
	harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Alex,

>  From: Alex Bennée <alex.bennee@linaro.org>
>  Sent: Friday, August 16, 2024 4:37 PM
>  To: Salil Mehta <salil.mehta@huawei.com>
>  
>  Salil Mehta <salil.mehta@huawei.com> writes:
>  
>  > vCPU Hot-unplug will result in QOM CPU object unrealization which will
>  > do away with all the vCPU thread creations, allocations, registrations
>  > that happened as part of the realization process. This change
>  > introduces the ARM CPU unrealize function taking care of exactly that.
>  >
>  > Note, initialized KVM vCPUs are not destroyed in host KVM but their
>  > Qemu context is parked at the QEMU KVM layer.
>  >
>  > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>  > [VP: Identified CPU stall issue & suggested probable fix]
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > ---
>  >  target/arm/cpu.c       | 101 +++++++++++++++++++++++++++++++++++++++++
>  >  target/arm/cpu.h       |  14 ++++++
>  >  target/arm/gdbstub.c   |   6 +++
>  >  target/arm/helper.c    |  25 ++++++++++
>  >  target/arm/internals.h |   3 ++
>  >  target/arm/kvm.c       |   5 ++
>  >  6 files changed, 154 insertions(+)
>  >
>  > diff --git a/target/arm/cpu.c b/target/arm/cpu.c index
>  > c92162fa97..a3dc669309 100644
>  > --- a/target/arm/cpu.c
>  > +++ b/target/arm/cpu.c
>  > @@ -157,6 +157,16 @@ void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
>  >      QLIST_INSERT_HEAD(&cpu->pre_el_change_hooks, entry, node);  }
>  >
>  > +void arm_unregister_pre_el_change_hooks(ARMCPU *cpu) {
>  > +    ARMELChangeHook *entry, *next;
>  > +
>  > +    QLIST_FOREACH_SAFE(entry, &cpu->pre_el_change_hooks, node, next) {
>  > +        QLIST_REMOVE(entry, node);
>  > +        g_free(entry);
>  > +    }
>  > +}
>  > +
>  >  void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
>  >                                   void *opaque)  {
>  > @@ -168,6 +178,16 @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
>  >      QLIST_INSERT_HEAD(&cpu->el_change_hooks, entry, node);  }
>  >
>  > +void arm_unregister_el_change_hooks(ARMCPU *cpu) {
>  > +    ARMELChangeHook *entry, *next;
>  > +
>  > +    QLIST_FOREACH_SAFE(entry, &cpu->el_change_hooks, node, next) {
>  > +        QLIST_REMOVE(entry, node);
>  > +        g_free(entry);
>  > +    }
>  > +}
>  > +
>  >  static void cp_reg_reset(gpointer key, gpointer value, gpointer
>  > opaque)  {
>  >      /* Reset a single ARMCPRegInfo register */
>  > @@ -2552,6 +2572,85 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>  >      acc->parent_realize(dev, errp);
>  >  }
>  >
>  > +static void arm_cpu_unrealizefn(DeviceState *dev) {
>  > +    ARMCPUClass *acc = ARM_CPU_GET_CLASS(dev);
>  > +    ARMCPU *cpu = ARM_CPU(dev);
>  > +    CPUARMState *env = &cpu->env;
>  > +    CPUState *cs = CPU(dev);
>  > +    bool has_secure;
>  > +
>  > +    has_secure = cpu->has_el3 || arm_feature(env,
>  > + ARM_FEATURE_M_SECURITY);
>  > +
>  > +    /* rock 'n' un-roll, whatever happened in the arm_cpu_realizefn cleanly */
>  > +    cpu_address_space_destroy(cs, ARMASIdx_NS);
>  
>  On current master this will fail:
>  
>  ../../target/arm/cpu.c: In function ‘arm_cpu_unrealizefn’:
>  ../../target/arm/cpu.c:2626:5: error: implicit declaration of function
>  ‘cpu_address_space_destroy’ [-Werror=implicit-function-declaration]
>   2626 |     cpu_address_space_destroy(cs, ARMASIdx_NS);
>        |     ^~~~~~~~~~~~~~~~~~~~~~~~~
>  ../../target/arm/cpu.c:2626:5: error: nested extern declaration of
>  ‘cpu_address_space_destroy’ [-Werror=nested-externs]
>  cc1: all warnings being treated as errors


The current master already has the arch-agnostic patch set merged. I have applied
RFC V3 on top of the latest master and compiled it, and I did not see this issue.

I've create a new branch for your reference.

https://github.com/salil-mehta/qemu/tree/virt-cpuhp-armv8/rfc-v4-rc4

Please let me know if this works for you.


Thanks
Salil.



>  
>  --
>  Alex Bennée
>  Virtualisation Tech Lead @ Linaro

^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-08-16 15:50     ` Peter Maydell
  2024-08-16 17:00       ` Peter Maydell
@ 2024-08-19 12:58       ` Salil Mehta via
  2024-08-19 13:46         ` Peter Maydell
  1 sibling, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-19 12:58 UTC (permalink / raw)
  To: Peter Maydell, Alex Bennée
  Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com,
	maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, richard.henderson@linaro.org,
	imammedo@redhat.com, andrew.jones@linux.dev, david@redhat.com,
	philmd@linaro.org, eric.auger@redhat.com, will@kernel.org,
	ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com,
	gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Peter,

>  From: Peter Maydell <peter.maydell@linaro.org>
>  Sent: Friday, August 16, 2024 4:51 PM
>  To: Alex Bennée <alex.bennee@linaro.org>
>  
>  On Fri, 16 Aug 2024 at 16:37, Alex Bennée <alex.bennee@linaro.org> wrote:
>  >
>  > Salil Mehta <salil.mehta@huawei.com> writes:
>  >
>  > > vCPU Hot-unplug will result in QOM CPU object unrealization which
>  > > will do away with all the vCPU thread creations, allocations,
>  > > registrations that happened as part of the realization process. This
>  > > change introduces the ARM CPU unrealize function taking care of exactly
>  that.
>  > >
>  > > Note, initialized KVM vCPUs are not destroyed in host KVM but their
>  > > Qemu context is parked at the QEMU KVM layer.
>  > >
>  > > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > > Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>  > > [VP: Identified CPU stall issue & suggested probable fix]
>  > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > > ---
>  > >  target/arm/cpu.c       | 101 +++++++++++++++++++++++++++++++++++++++++
>  > >  target/arm/cpu.h       |  14 ++++++
>  > >  target/arm/gdbstub.c   |   6 +++
>  > >  target/arm/helper.c    |  25 ++++++++++
>  > >  target/arm/internals.h |   3 ++
>  > >  target/arm/kvm.c       |   5 ++
>  > >  6 files changed, 154 insertions(+)
>  > >
>  > > diff --git a/target/arm/cpu.c b/target/arm/cpu.c index
>  > > c92162fa97..a3dc669309 100644
>  > > --- a/target/arm/cpu.c
>  > > +++ b/target/arm/cpu.c
>  > > @@ -157,6 +157,16 @@ void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
>  > >      QLIST_INSERT_HEAD(&cpu->pre_el_change_hooks, entry, node);  }
>  > >
>  > > +void arm_unregister_pre_el_change_hooks(ARMCPU *cpu) {
>  > > +    ARMELChangeHook *entry, *next;
>  > > +
>  > > +    QLIST_FOREACH_SAFE(entry, &cpu->pre_el_change_hooks, node, next) {
>  > > +        QLIST_REMOVE(entry, node);
>  > > +        g_free(entry);
>  > > +    }
>  > > +}
>  > > +
>  > >  void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
>  > >                                   void *opaque)  {
>  > > @@ -168,6 +178,16 @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
>  > >      QLIST_INSERT_HEAD(&cpu->el_change_hooks, entry, node);  }
>  > >
>  > > +void arm_unregister_el_change_hooks(ARMCPU *cpu) {
>  > > +    ARMELChangeHook *entry, *next;
>  > > +
>  > > +    QLIST_FOREACH_SAFE(entry, &cpu->el_change_hooks, node, next) {
>  > > +        QLIST_REMOVE(entry, node);
>  > > +        g_free(entry);
>  > > +    }
>  > > +}
>  > > +
>  > >  static void cp_reg_reset(gpointer key, gpointer value, gpointer
>  > > opaque)  {
>  > >      /* Reset a single ARMCPRegInfo register */
>  > > @@ -2552,6 +2572,85 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>  > >      acc->parent_realize(dev, errp);  }
>  > >
>  > > +static void arm_cpu_unrealizefn(DeviceState *dev) {
>  > > +    ARMCPUClass *acc = ARM_CPU_GET_CLASS(dev);
>  > > +    ARMCPU *cpu = ARM_CPU(dev);
>  > > +    CPUARMState *env = &cpu->env;
>  > > +    CPUState *cs = CPU(dev);
>  > > +    bool has_secure;
>  > > +
>  > > +    has_secure = cpu->has_el3 || arm_feature(env,
>  > > + ARM_FEATURE_M_SECURITY);
>  > > +
>  > > +    /* rock 'n' un-roll, whatever happened in the arm_cpu_realizefn cleanly */
>  > > +    cpu_address_space_destroy(cs, ARMASIdx_NS);
>  >
>  > On current master this will fail:
>  >
>  > ../../target/arm/cpu.c: In function ‘arm_cpu_unrealizefn’:
>  > ../../target/arm/cpu.c:2626:5: error: implicit declaration of function ‘cpu_address_space_destroy’ [-Werror=implicit-function-declaration]
>  >  2626 |     cpu_address_space_destroy(cs, ARMASIdx_NS);
>  >       |     ^~~~~~~~~~~~~~~~~~~~~~~~~
>  > ../../target/arm/cpu.c:2626:5: error: nested extern declaration of
>  > ‘cpu_address_space_destroy’ [-Werror=nested-externs]
>  > cc1: all warnings being treated as errors
>  
>  We shouldn't need to explicitly call cpu_address_space_destroy() from a
>  target-specific unrealize anyway: we can do it all from the base class (and I
>  think this would fix some leaks in current code for targets that hot-unplug,
>  though I should check that). Otherwise you need to duplicate all the logic for
>  figuring out which address spaces we created in realize, which is fragile and
>  not necessary when all we want to do is "delete every address space the
>  CPU object has"
>  and we want to do that for every target architecture always.


Agreed, but I would suggest making it optional, i.e. an architecture should be allowed
to release the address spaces from its own code if it wants to. That also keeps the
flows clear:

https://lore.kernel.org/qemu-devel/a308e1f4f06f4e3ab6ab51f353601f43@huawei.com/
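As a rough illustration of that optionality (invented names, not the QEMU API): the base class tears down every address space unconditionally, while an architecture may additionally install a hook and do the teardown from its own code first. The destroy pass below is written to be idempotent so the double call is harmless; a real implementation would need the same guarantee:

```c
/*
 * Illustrative sketch only -- not the QEMU API; all names are invented.
 * Models "base class destroys every CPU address space, but an architecture
 * may opt in to doing its own teardown first".
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_AS 4

typedef struct DemoCPU DemoCPU;
struct DemoCPU {
    bool as_live[MAX_AS];
    int num_ases;
    void (*arch_unrealize)(DemoCPU *cpu);  /* optional per-arch hook */
};

void destroy_all_ases(DemoCPU *cpu)
{
    for (int i = 0; i < cpu->num_ases; i++) {
        cpu->as_live[i] = false;  /* stand-in for cpu_address_space_destroy() */
    }
}

/* An architecture that wants explicit control tears its ASes down itself. */
void arm_like_unrealize(DemoCPU *cpu)
{
    destroy_all_ases(cpu);
}

void base_cpu_unrealize(DemoCPU *cpu)
{
    if (cpu->arch_unrealize) {
        cpu->arch_unrealize(cpu);  /* arch opted in to its own cleanup */
    }
    destroy_all_ases(cpu);         /* base class catches whatever is left */
}
```

Either way, every address space ends up destroyed, whether or not the architecture installs its own hook.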


Thanks
Salil.



>  
>  -- PMM

^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-08-16 17:00       ` Peter Maydell
@ 2024-08-19 12:59         ` Salil Mehta via
  2024-08-19 13:43           ` Peter Maydell
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-19 12:59 UTC (permalink / raw)
  To: Peter Maydell, Alex Bennée
  Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com,
	maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, richard.henderson@linaro.org,
	imammedo@redhat.com, andrew.jones@linux.dev, david@redhat.com,
	philmd@linaro.org, eric.auger@redhat.com, will@kernel.org,
	ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com,
	gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

>  From: Peter Maydell <peter.maydell@linaro.org>
>  Sent: Friday, August 16, 2024 6:00 PM
>  To: Alex Bennée <alex.bennee@linaro.org>
>  
>  On Fri, 16 Aug 2024 at 16:50, Peter Maydell <peter.maydell@linaro.org>
>  wrote:
>  > We shouldn't need to explicitly call cpu_address_space_destroy() from
>  > a target-specific unrealize anyway: we can do it all from the base
>  > class (and I think this would fix some leaks in current code for
>  > targets that hot-unplug, though I should check that). Otherwise you
>  > need to duplicate all the logic for figuring out which address spaces
>  > we created in realize, which is fragile and not necessary when all we
>  > want to do is "delete every address space the CPU object has"
>  > and we want to do that for every target architecture always.
>  
>  I have a patch to do this now, but I need to test it a bit more and confirm (or
>  disprove) my hypothesis that we're currently leaking memory on existing
>  architectures with vCPU hot-unplug before I send it out.

I think you are referring to this patch?

https://lore.kernel.org/qemu-devel/20230918160257.30127-9-philmd@linaro.org/


>  
>  -- PMM

^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 06/29] arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
  2024-08-19  5:31   ` Gavin Shan
@ 2024-08-19 13:06     ` Salil Mehta via
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-08-19 13:06 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Monday, August 19, 2024 6:32 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  On 6/14/24 9:36 AM, Salil Mehta wrote:
>  > In the ARMv8 architecture, the GIC must know all the CPUs it is
>  > connected to during its initialization, and this cannot change
>  > afterward. This must be ensured during the initialization of the VGIC
>  > as well in KVM, which requires all vCPUs to be created and present
>  > during its initialization. This is necessary
>  > because:
>  >
>  > 1. The association between GICC and MPIDR must be fixed at VM initialization
>  >     time. This is represented by the register `GIC_TYPER(mp_affinity, proc_num)`.
>  > 2. GICC (CPU interfaces), GICR (redistributors), etc., must all be initialized
>  >     at boot time.
>  > 3. Memory regions associated with GICR, etc., cannot be changed (added, deleted,
>  >     or modified) after the VM has been initialized.
>  >
>  > This patch adds support to pre-create all possible vCPUs within the
>  > host using the KVM interface as part of the virtual machine
>  > initialization. These vCPUs can later be attached to QOM/ACPI when
>  > they are actually hot-plugged and made present.
>  >
>  > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>  > [VP: Identified CPU stall issue & suggested probable fix]
>  > ---
>  >   hw/arm/virt.c         | 56 +++++++++++++++++++++++++++++++++++--------
>  >   include/hw/core/cpu.h |  1 +
>  >   target/arm/cpu64.c    |  1 +
>  >   target/arm/kvm.c      | 41 ++++++++++++++++++++++++++++++-
>  >   target/arm/kvm_arm.h  | 11 +++++++++
>  >   5 files changed, 99 insertions(+), 11 deletions(-)
>  >
>  
>  The vCPU file descriptor is associated with a feature bitmap when the file
>  descriptor is initialized by ioctl(vm_fd, KVM_ARM_VCPU_INIT, &init). The
>  feature bitmap is sorted out based on the vCPU properties. The vCPU
>  properties can be different when the vCPU file descriptor is initialized for
>  the first time when the vCPU is instantiated, and re-initialized when the
>  vCPU is hot added.

  
>  It can lead to a system crash as below. We probably need a mechanism to
>  disallow passing extra properties when a vCPU is hot added, to avoid
>  conflicts with the global properties from the command line "-cpu
>  host,pmu=on". Some of the properties like "id" and "socket-id"
>  are still needed.


Yes, good catch. I was aware of that, but it almost slipped past me. Thanks
for pointing it out and reminding me. We need a check there; will fix it.


>  
>  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64                  \
>  -accel kvm -machine virt,gic-version=host,nvdimm=on                      \
>  -cpu host -smp maxcpus=2,cpus=1,sockets=2,clusters=1,cores=1,threads=1 \
>  -m 4096M,slots=16,maxmem=128G                                            \
>  -object memory-backend-ram,id=mem0,size=2048M                            \
>  -object memory-backend-ram,id=mem1,size=2048M                            \
>  -numa node,nodeid=0,memdev=mem0,cpus=0-0                                 \
>  -numa node,nodeid=1,memdev=mem1,cpus=1-1                                 \
>  -L /home/gavin/sandbox/qemu.main/build/pc-bios                           \
>  -monitor none -serial mon:stdio -nographic                               \
>  -gdb tcp::6666 -qmp tcp:localhost:5555,server,wait=off                   \
>  -bios /home/gavin/sandbox/qemu.main/build/pc-bios/edk2-aarch64-code.fd   \
>  -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image            \
>  -initrd /home/gavin/sandbox/images/rootfs.cpio.xz                        \
>  -append memhp_default_state=online_movable                               \
>       :
>  (qemu) device_add host-arm-cpu,id=cpu1,socket-id=1,pmu=off
>  kvm_arch_init_vcpu: Error -22 from kvm_arm_vcpu_init()
>  qemu-system-aarch64: kvm_init_vcpu: kvm_arch_init_vcpu failed (1):
>  Invalid argument

Yes, thanks.

>  
>  Thanks,
>  Gavin
>  


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-08-19 12:59         ` Salil Mehta via
@ 2024-08-19 13:43           ` Peter Maydell
  0 siblings, 0 replies; 105+ messages in thread
From: Peter Maydell @ 2024-08-19 13:43 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Alex Bennée, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com, maz@kernel.org, jean-philippe@linaro.org,
	Jonathan Cameron, lpieralisi@kernel.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com, npiggin@gmail.com,
	harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

On Mon, 19 Aug 2024 at 13:59, Salil Mehta <salil.mehta@huawei.com> wrote:
>
> >  From: Peter Maydell <peter.maydell@linaro.org>
> >  Sent: Friday, August 16, 2024 6:00 PM
> >  To: Alex Bennée <alex.bennee@linaro.org>
> >
> >  On Fri, 16 Aug 2024 at 16:50, Peter Maydell <peter.maydell@linaro.org>
> >  wrote:
> >  > We shouldn't need to explicitly call cpu_address_space_destroy() from
> >  > a target-specific unrealize anyway: we can do it all from the base
> >  > class (and I think this would fix some leaks in current code for
> >  > targets that hot-unplug, though I should check that). Otherwise you
> >  > need to duplicate all the logic for figuring out which address spaces
> >  > we created in realize, which is fragile and not necessary when all we
> >  > want to do is "delete every address space the CPU object has"
> >  > and we want to do that for every target architecture always.
> >
> >  I have a patch to do this now, but I need to test it a bit more and confirm (or
> >  disprove) my hypothesis that we're currently leaking memory on existing
> >  architectures with vCPU hot-unplug before I send it out.
>
> I think you are referring to this patch?
>
> https://lore.kernel.org/qemu-devel/20230918160257.30127-9-philmd@linaro.org/

I'd forgotten that Phil had sent that patch out. My patch
is a bit different because it refactors cpu_address_space_destroy()
into a single function that destroys all the ASes (and so we
don't for instance need cpu->cpu_ases_count any more).

-- PMM


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-08-19 12:58       ` Salil Mehta via
@ 2024-08-19 13:46         ` Peter Maydell
  2024-08-20 15:34           ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Peter Maydell @ 2024-08-19 13:46 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Alex Bennée, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com, maz@kernel.org, jean-philippe@linaro.org,
	Jonathan Cameron, lpieralisi@kernel.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com, npiggin@gmail.com,
	harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

On Mon, 19 Aug 2024 at 13:58, Salil Mehta <salil.mehta@huawei.com> wrote:
>
> Hi Peter,
>
> >  From: Peter Maydell <peter.maydell@linaro.org>
> >
> >  We shouldn't need to explicitly call cpu_address_space_destroy() from a
> >  target-specific unrealize anyway: we can do it all from the base class (and I
> >  think this would fix some leaks in current code for targets that hot-unplug,
> >  though I should check that). Otherwise you need to duplicate all the logic for
> >  figuring out which address spaces we created in realize, which is fragile and
> >  not necessary when all we want to do is "delete every address space the
> >  CPU object has"
> >  and we want to do that for every target architecture always.
>
>
> Agreed, but I would suggest making it optional, i.e. in case an architecture
> wants to release it from its own code, that should be allowed. This also
> keeps the flows clear:
>
> https://lore.kernel.org/qemu-devel/a308e1f4f06f4e3ab6ab51f353601f43@huawei.com/

Do you have any concrete examples where a target arch would want to
explicitly release an AS from its own code? Unless there's a
real use case for doing that, I think that "common code always
does the cleanup of the ASes, nothing else ever does" is a
simple design rule that avoids the need for target-specific code
and means we don't need complicated handling for "some of the
ASes in cpu->cpu_ases are live and some have been released":
either the CPU is realized and they're all valid, or else
we're in the process of unrealizing the CPU and we get rid of
them all at once.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-08-19 12:21     ` Salil Mehta via
@ 2024-08-20  0:05       ` Gavin Shan
  2024-08-20 16:40         ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-20  0:05 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Salil,

On 8/19/24 10:21 PM, Salil Mehta wrote:
>>   From: Gavin Shan <gshan@redhat.com>
>>   Sent: Tuesday, August 13, 2024 2:17 AM
>>   To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>>   qemu-arm@nongnu.org; mst@redhat.com
>>   
>>   On 6/14/24 9:36 AM, Salil Mehta wrote:
>>   > During `machvirt_init()`, QOM ARMCPU objects are pre-created along
>>   > with the corresponding KVM vCPUs in the host for all possible vCPUs.
>>   > This is necessary due to the architectural constraint that KVM
>>   > restricts the deferred creation of KVM vCPUs and VGIC
>>   > initialization/sizing after VM initialization. Hence, VGIC is pre-sized with
>>   possible vCPUs.
>>   >
>>   > After the initialization of the machine is complete, the disabled
>>   > possible KVM vCPUs are parked in the per-virt-machine list
>>   > "kvm_parked_vcpus," and we release the QOM ARMCPU objects for the
>>   > disabled vCPUs. These will be re-created when the vCPU is hotplugged
>>   > again. The QOM ARMCPU object is then re-attached to the corresponding
>>   parked KVM vCPU.
>>   >
>>   > Alternatively, we could have chosen not to release the QOM CPU objects
>>   > and kept reusing them. This approach might require some modifications
>>   > to the `qdevice_add()` interface to retrieve the old ARMCPU object
>>   > instead of creating a new one for the hotplug request.
>>   >
>>   > Each of these approaches has its own pros and cons. This prototype
>>   > uses the first approach (suggestions are welcome!).
>>   >
>>   > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>>   > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>>   > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>   > ---
>>   >   hw/arm/virt.c | 32 ++++++++++++++++++++++++++++++++
>>   >   1 file changed, 32 insertions(+)
>>   >
>>   > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>>   > 9d33f30a6a..a72cd3b20d 100644
>>   > --- a/hw/arm/virt.c
>>   > +++ b/hw/arm/virt.c
>>   > @@ -2050,6 +2050,7 @@ static void virt_cpu_post_init(VirtMachineState
>>   *vms, MemoryRegion *sysmem)
>>   >   {
>>   >       CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
>>   >       int max_cpus = MACHINE(vms)->smp.max_cpus;
>>   > +    MachineState *ms = MACHINE(vms);
>>   >       bool aarch64, steal_time;
>>   >       CPUState *cpu;
>>   >       int n;
>>   > @@ -2111,6 +2112,37 @@ static void virt_cpu_post_init(VirtMachineState
>>   *vms, MemoryRegion *sysmem)
>>   >               }
>>   >           }
>>   >       }
>>   > +
>>   > +    if (kvm_enabled() || tcg_enabled()) {
>>   > +        for (n = 0; n < possible_cpus->len; n++) {
>>   > +            cpu = qemu_get_possible_cpu(n);
>>   > +
>>   > +            /*
>>   > +             * Now, GIC has been sized with possible CPUs and we dont
>>   require
>>   > +             * disabled vCPU objects to be represented in the QOM. Release
>>   the
>>   > +             * disabled ARMCPU objects earlier used during init for pre-sizing.
>>   > +             *
>>   > +             * We fake to the guest through ACPI about the
>>   presence(_STA.PRES=1)
>>   > +             * of these non-existent vCPUs at VMM/qemu and present these
>>   as
>>   > +             * disabled vCPUs(_STA.ENA=0) so that they cant be used. These
>>   vCPUs
>>   > +             * can be later added to the guest through hotplug exchanges
>>   when
>>   > +             * ARMCPU objects are created back again using 'device_add' QMP
>>   > +             * command.
>>   > +             */
>>   > +            /*
>>   > +             * RFC: Question: Other approach could've been to keep them
>>   forever
>>   > +             * and release it only once when qemu exits as part of finalize or
>>   > +             * when new vCPU is hotplugged. In the later old could be released
>>   > +             * for the newly created object for the same vCPU?
>>   > +             */
>>   > +            if (!qemu_enabled_cpu(cpu)) {
>>   > +                CPUArchId *cpu_slot;
>>   > +                cpu_slot = virt_find_cpu_slot(ms, cpu->cpu_index);
>>   > +                cpu_slot->cpu = NULL;
>>   > +                object_unref(OBJECT(cpu));
>>   > +            }
>>   > +        }
>>   > +    }
>>   >   }
>>   
>>   It's probably hard to keep those ARMCPU objects forever. First of all, one
>>   vCPU can be hot-added first and then hot-removed afterwards. With those
>>   ARMCPU objects kept forever, the syntax of 'device_add' and 'device_del'
>>   become broken at least.
> 
> I had prototyped both approaches 4 years back. Yes, interface problem with
> device_add was solved by a trick of keeping the old vCPU object and on
> device_add instead of creating a new vCPU object we could use the old vCPU
> object and then call qdev_realize() on it.
> 
> But bigger problem with this approach is that of migration. Only realized objects
> have state to be migrated. So it might look cleaner on one aspect but had its
> own issues.
> 
> I think I did share a prototype of this with Igor which he was not in agreement with
> and wanted vCPU objects to be destroyed like in x86. Hence, we stuck with
> the current approach.
> 

Migration needn't necessarily be the blocker. The states of those vCPUs which
have been instantiated but not yet realized can be moved around to be managed
under another migratable object (e.g. MachineState). In this way, those vCPU
states can be migrated together with MachineState. However, it sounds strange
to me to have vCPU objects instantiated, but unrealized until hot-add is
triggered. It's not what QOM is supposed to support.

Ok, I didn't realize x86 also follows this model: instantiate hotpluggable
vCPUs and destroy them on bootup.

> 
>>   The ideal mechanism would be to avoid instanciating those ARMCPU objects
>>   and destroying them soon. I don't know if ms->possible_cpus->cpus[] can
>>   fit and how much efforts needed.
> 
> This is what we are doing now in the current approach. Please read the KVMForum
> slides of 2023 for more details and the cover letter of RFC V3 for more details.
> 

My question was parsed in the wrong way. My question was whether it's doable to
avoid instantiating vCPU#1 on bootup, only to release it in machvirt_init(), when
we have the command line '-cpu host -smp cpus=1,max_cpus=2'. Instead,
ms->possible_cpus->cpus[] would be reused to create the GICv3 and the vCPU file
descriptors. It looks like a clean way at least. There may be significant effort
needed to reuse ms->possible_cpus->cpus[], but the vCPU#1 object has an ephemeral
life cycle. It looks unnecessary to create the ephemeral vCPU#1 object.

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 11/29] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  2024-08-19 12:10     ` Salil Mehta via
@ 2024-08-20  0:22       ` Gavin Shan
  2024-08-20 17:10         ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-20  0:22 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Salil,

On 8/19/24 10:10 PM, Salil Mehta wrote:
>>   From: Gavin Shan <gshan@redhat.com>
>>   Sent: Tuesday, August 13, 2024 2:05 AM
>>   To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>>   qemu-arm@nongnu.org; mst@redhat.com
>>   
>>   On 6/14/24 9:36 AM, Salil Mehta wrote:
>>   > ACPI CPU hotplug state (is_present=_STA.PRESENT,
>>   > is_enabled=_STA.ENABLED) for all the possible vCPUs MUST be
>>   > initialized during machine init. This is done during the creation of
>>   > the GED device. VMM/Qemu MUST expose/fake the ACPI state of the
>>   > disabled vCPUs to the Guest kernel as 'present' (_STA.PRESENT) always
>>   > i.e. ACPI persistent. if the 'disabled' vCPU objectes are destroyed
>>   > before the GED device has been created then their ACPI hotplug state
>>   > might not get initialized correctly as acpi_persistent flag is part of the
>>   CPUState. This will expose wrong status of the unplugged vCPUs to the
>>   Guest kernel.
>>   >
>>   > Hence, moving the GED device creation before disabled vCPU objects get
>>   > destroyed as part of the post CPU init routine.
>>   >
>>   > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>   > ---
>>   >   hw/arm/virt.c | 10 +++++++---
>>   >   1 file changed, 7 insertions(+), 3 deletions(-)
>>   >
>>   > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>>   > 918bcb9a1b..5f98162587 100644
>>   > --- a/hw/arm/virt.c
>>   > +++ b/hw/arm/virt.c
>>   > @@ -2467,6 +2467,12 @@ static void machvirt_init(MachineState
>>   > *machine)
>>   >
>>   >       create_gic(vms, sysmem);
>>   >
>>   > +    has_ged = has_ged && aarch64 && firmware_loaded &&
>>   > +              virt_is_acpi_enabled(vms);
>>   > +    if (has_ged) {
>>   > +        vms->acpi_dev = create_acpi_ged(vms);
>>   > +    }
>>   > +
>>   >       virt_cpu_post_init(vms, sysmem);
>>   >
>>   >       fdt_add_pmu_nodes(vms);
>>   > @@ -2489,9 +2495,7 @@ static void machvirt_init(MachineState
>>   *machine)
>>   >
>>   >       create_pcie(vms);
>>   >
>>   > -    if (has_ged && aarch64 && firmware_loaded &&
>>   virt_is_acpi_enabled(vms)) {
>>   > -        vms->acpi_dev = create_acpi_ged(vms);
>>   > -    } else {
>>   > +    if (!has_ged) {
>>   >           create_gpio_devices(vms, VIRT_GPIO, sysmem);
>>   >       }
>>   >
>>   
>>   It's likely the GPIO device can be created before those disabled CPU objects
>>   are destroyed. It means the whole chunk of code can be moved together, I
>>   think.
> 
> I was not totally sure of this. Hence, kept the order of the rest like that. I can
> definitely check again if we can do that to reduce the change.
> 

@has_ged is initially equivalent to '!vmc->no_ged' and is then overridden by
the following changes in this patch. The semantics of @has_ged have changed,
and it's no longer the best name for what it now means. There are two
solutions: (1) rename @has_ged to something meaningful that matches the
changes; (2) move the whole chunk of code together, which I'd prefer. The GPIO
device and the GED device complement each other: the GPIO device is created
when the GED device has been disallowed.

     has_ged = has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms);

The code to be moved together in virt.c, since these pieces are correlated:

     if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
         vms->acpi_dev = create_acpi_ged(vms);
     } else {
         create_gpio_devices(vms, VIRT_GPIO, sysmem);
     }
     
     if (vms->secure && !vmc->no_secure_gpio) {
         create_gpio_devices(vms, VIRT_SECURE_GPIO, secure_sysmem);
     }

      /* connect powerdown request */
      vms->powerdown_notifier.notify = virt_powerdown_req;
      qemu_register_powerdown_notifier(&vms->powerdown_notifier);

Thanks,
Gavin






^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-08-19 13:46         ` Peter Maydell
@ 2024-08-20 15:34           ` Salil Mehta via
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-08-20 15:34 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alex Bennée, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com, maz@kernel.org, jean-philippe@linaro.org,
	Jonathan Cameron, lpieralisi@kernel.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com, npiggin@gmail.com,
	harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

HI Peter,

>  From: Peter Maydell <peter.maydell@linaro.org>
>  Sent: Monday, August 19, 2024 2:47 PM
>  To: Salil Mehta <salil.mehta@huawei.com>
>  
>  On Mon, 19 Aug 2024 at 13:58, Salil Mehta <salil.mehta@huawei.com>
>  wrote:
>  >
>  > Hi Peter,
>  >
>  > >  From: Peter Maydell <peter.maydell@linaro.org>
>  > >
>  > >  We shouldn't need to explicitly call cpu_address_space_destroy()
>  > > from a  target-specific unrealize anyway: we can do it all from the
>  > > base class (and I  think this would fix some leaks in current code
>  > > for targets that hot-unplug,  though I should check that). Otherwise
>  > > you need to duplicate all the logic for  figuring out which address
>  > > spaces we created in realize, which is fragile and  not necessary
>  > > when all we want to do is "delete every address space the  CPU object
>  has"
>  > >  and we want to do that for every target architecture always.
>  >
>  >
>  > Agreed but I would suggest to make it optional i.e. in case
>  > architecture want to release to from its code. It should be allowed.
>  > This also ensures clarity of the flows,
>  >
>  > https://lore.kernel.org/qemu-devel/a308e1f4f06f4e3ab6ab51f353601f43@huawei.com/
>  
>  Do you have any concrete examples where a target arch would want to
>  explicitly release an AS from its own code? 


No, I don’t have one, but some of the reasons I had in mind were:

1. The order of destruction of the address spaces. Could it differ from the
   order that will be assumed in the loop?
2. What if something needs to be done or handled before destroying each
   address space?
3. Clarity of the flow.


>  Unless there's a real use case for doing that, I think that "common code
>  always does the cleanup of the ASes, nothing else ever does" is a simple
>  design rule that avoids the need for target-specific code and means we
>  don't need complicated handling for "some of the ASes in cpu->cpu_ases
>  are live and some have been released": either the CPU is realized and
>  they're all valid, or else we're in the process of unrealizing the CPU
>  and we get rid of them all at once.

I don’t have strong opinions on this. Please share the patch; I'll test it
with my vCPU hotplug branch.

Thanks
Salil.

>  
>  thanks
>  -- PMM

^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-08-20  0:05       ` Gavin Shan
@ 2024-08-20 16:40         ` Salil Mehta via
  2024-08-21  6:25           ` Gavin Shan
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-20 16:40 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Tuesday, August 20, 2024 1:06 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  Hi Salil,
>  
>  On 8/19/24 10:21 PM, Salil Mehta wrote:
>  >>   From: Gavin Shan <gshan@redhat.com>
>  >>   Sent: Tuesday, August 13, 2024 2:17 AM
>  >>   To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  >>   qemu-arm@nongnu.org; mst@redhat.com
>  >>
>  >>   On 6/14/24 9:36 AM, Salil Mehta wrote:
>  >>   > During `machvirt_init()`, QOM ARMCPU objects are pre-created along
>  >>   > with the corresponding KVM vCPUs in the host for all possible vCPUs.
>  >>   > This is necessary due to the architectural constraint that KVM
>  >>   > restricts the deferred creation of KVM vCPUs and VGIC
>  >>   > initialization/sizing after VM initialization. Hence, VGIC is pre-sized
>  with
>  >>   possible vCPUs.
>  >>   >
>  >>   > After the initialization of the machine is complete, the disabled
>  >>   > possible KVM vCPUs are parked in the per-virt-machine list
>  >>   > "kvm_parked_vcpus," and we release the QOM ARMCPU objects for
>  the
>  >>   > disabled vCPUs. These will be re-created when the vCPU is
>  hotplugged
>  >>   > again. The QOM ARMCPU object is then re-attached to the
>  corresponding
>  >>   parked KVM vCPU.
>  >>   >
>  >>   > Alternatively, we could have chosen not to release the QOM CPU
>  objects
>  >>   > and kept reusing them. This approach might require some
>  modifications
>  >>   > to the `qdevice_add()` interface to retrieve the old ARMCPU object
>  >>   > instead of creating a new one for the hotplug request.
>  >>   >
>  >>   > Each of these approaches has its own pros and cons. This prototype
>  >>   > uses the first approach (suggestions are welcome!).
>  >>   >
>  >>   > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>  >>   > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>  >>   > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  >>   > ---
>  >>   >   hw/arm/virt.c | 32 ++++++++++++++++++++++++++++++++
>  >>   >   1 file changed, 32 insertions(+)
>  >>   >
>  >>   > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>  >>   > 9d33f30a6a..a72cd3b20d 100644
>  >>   > --- a/hw/arm/virt.c
>  >>   > +++ b/hw/arm/virt.c
>  >>   > @@ -2050,6 +2050,7 @@ static void
>  virt_cpu_post_init(VirtMachineState
>  >>   *vms, MemoryRegion *sysmem)
>  >>   >   {
>  >>   >       CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
>  >>   >       int max_cpus = MACHINE(vms)->smp.max_cpus;
>  >>   > +    MachineState *ms = MACHINE(vms);
>  >>   >       bool aarch64, steal_time;
>  >>   >       CPUState *cpu;
>  >>   >       int n;
>  >>   > @@ -2111,6 +2112,37 @@ static void
>  virt_cpu_post_init(VirtMachineState
>  >>   *vms, MemoryRegion *sysmem)
>  >>   >               }
>  >>   >           }
>  >>   >       }
>  >>   > +
>  >>   > +    if (kvm_enabled() || tcg_enabled()) {
>  >>   > +        for (n = 0; n < possible_cpus->len; n++) {
>  >>   > +            cpu = qemu_get_possible_cpu(n);
>  >>   > +
>  >>   > +            /*
>  >>   > +             * Now, GIC has been sized with possible CPUs and we dont
>  >>   require
>  >>   > +             * disabled vCPU objects to be represented in the QOM.
>  Release
>  >>   the
>  >>   > +             * disabled ARMCPU objects earlier used during init for pre-
>  sizing.
>  >>   > +             *
>  >>   > +             * We fake to the guest through ACPI about the
>  >>   presence(_STA.PRES=1)
>  >>   > +             * of these non-existent vCPUs at VMM/qemu and present
>  these
>  >>   as
>  >>   > +             * disabled vCPUs(_STA.ENA=0) so that they cant be used.
>  These
>  >>   vCPUs
>  >>   > +             * can be later added to the guest through hotplug exchanges
>  >>   when
>  >>   > +             * ARMCPU objects are created back again using 'device_add'
>  QMP
>  >>   > +             * command.
>  >>   > +             */
>  >>   > +            /*
>  >>   > +             * RFC: Question: Other approach could've been to keep them
>  >>   forever
>  >>   > +             * and release it only once when qemu exits as part of finalize
>  or
>  >>   > +             * when new vCPU is hotplugged. In the later old could be
>  released
>  >>   > +             * for the newly created object for the same vCPU?
>  >>   > +             */
>  >>   > +            if (!qemu_enabled_cpu(cpu)) {
>  >>   > +                CPUArchId *cpu_slot;
>  >>   > +                cpu_slot = virt_find_cpu_slot(ms, cpu->cpu_index);
>  >>   > +                cpu_slot->cpu = NULL;
>  >>   > +                object_unref(OBJECT(cpu));
>  >>   > +            }
>  >>   > +        }
>  >>   > +    }
>  >>   >   }
>  >>
>  >>   It's probably hard to keep those ARMCPU objects forever. First of all,
>  one
>  >>   vCPU can be hot-added first and then hot-removed afterwards. With
>  those
>  >>   ARMCPU objects kept forever, the syntax of 'device_add' and
>  'device_del'
>  >>   become broken at least.
>  >
>  > I had prototyped both approaches 4 years back. Yes, interface problem
>  > with device_add was solved by a trick of keeping the old vCPU object
>  > and on device_add instead of creating a new vCPU object we could use
>  > the old vCPU object and then call qdev_realize() on it.
>  >
>  > But bigger problem with this approach is that of migration. Only
>  > realized objects have state to be migrated. So it might look cleaner
>  > on one aspect but had its own issues.
>  >
>  > I think I did share a prototype of this with Igor which he was not in
>  > agreement with and wanted vCPU objects to be destroyed like in x86.
>  > Hence, we stuck with the current approach.
>  >
>  
>  Migration needn't to be the blocker necessarily. For those states of vCPUs,
>  which have been instantiated and not realized yet, those states can be
>  moved around to be managed under another migratable object (e.g.
>  MachineState). In this way, those vCPU states can be migrated together
>  with MachineState. 


Migration was a blocker. I'll have to muddle through it again to regain the
full context of that, which at this juncture seems unnecessary.


>  However, it sounds strange to me to have vCPU objects instantiated, but
>  unrealized until hot-add is triggered.
>  It's not what QOM is supposed to support.


Yes, but for the user it does not matter, as it is an internal implementation
detail and we don't have to expose it to the external user either. It is just a
representation.


>  
>  Ok, I didn't realize x86 also follows this model: instantiate hotpluggable
>  vCPUs and destroy them on bootup.


Correct, so we have now kept it consistent with x86, but of course for the ARM
arch we have fake states for the vCPUs that don't exist, and this state is not
migrated. The ACPI persistent state gets initialized at the start by the
architecture-specific code. Hence, the vCPU state we expose to the kernel during
QEMU init and hot-(un)plug operations varies.


>  
>  >
>  >>   The ideal mechanism would be to avoid instantiating those ARMCPU objects
>  >>   and destroying them soon after. I don't know if ms->possible_cpus->cpus[]
>  >>   can fit and how much effort is needed.
>  >
>  > This is what we are doing now in the current approach. Please read the
>  > KVMForum slides of 2023 and the cover letter of RFC V3 for more
>  > details.
>  >
>  
>  My question has been parsed in a wrong way. 

Oh. 😊

>  My question was if it's doable
>  to avoid instantiating vCPU#1 on bootup when we have the command line
>  '-cpu host -smp cpus=1,max_cpus=2' and release it in machvirt_init().
>  Instead, ms->possible_cpus->cpus[] are reused to create
>  GICv3 and vCPU file descriptors.


We create QOM vCPU objects to initialize the KVM vCPU objects in the host
kernel, and only permanently populate the 'possible_cpus' list for the realized
vCPUs. We release the vCPU objects for the other vCPUs, which will be
hot-plugged in future, and only make them part of the 'possible_cpus' list
after they get plugged and realized.


>  It looks like a clean way at least. There may
>  be significant efforts to reuse mc->possible_cpus->cpus[], but the vCPU#1
>  object has an ephemeral life cycle.
>  It looks unnecessary to create the ephemeral vCPU#1 object.

I don't understand this clearly. Are you suggesting to reuse only a single
vCPU object to initialize all KVM vCPUs not yet plugged? If yes, then
I'm not sure what we gain here by adding this complexity. It does
not consume time or resources, because we are not realizing any of these
vCPU objects in any case.

Thanks
Salil.


>  
>  Thanks,
>  Gavin
>  


^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 11/29] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  2024-08-20  0:22       ` Gavin Shan
@ 2024-08-20 17:10         ` Salil Mehta via
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-08-20 17:10 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Tuesday, August 20, 2024 1:22 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  Hi Salil,
>  
>  On 8/19/24 10:10 PM, Salil Mehta wrote:
>  >>   From: Gavin Shan <gshan@redhat.com>
>  >>   Sent: Tuesday, August 13, 2024 2:05 AM
>  >>   To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  >>   qemu-arm@nongnu.org; mst@redhat.com
>  >>
>  >>   On 6/14/24 9:36 AM, Salil Mehta wrote:
>  >>   > ACPI CPU hotplug state (is_present=_STA.PRESENT,
>  >>   > is_enabled=_STA.ENABLED) for all the possible vCPUs MUST be
>  >>   > initialized during machine init. This is done during the creation of
>  >>   > the GED device. VMM/Qemu MUST expose/fake the ACPI state of the
>  >>   > disabled vCPUs to the Guest kernel as 'present' (_STA.PRESENT) always
>  >>   > i.e. ACPI persistent. If the 'disabled' vCPU objects are destroyed
>  >>   > before the GED device has been created, then their ACPI hotplug state
>  >>   > might not get initialized correctly, as the acpi_persistent flag is part
>  >>   > of the CPUState. This will expose the wrong status of the unplugged
>  >>   > vCPUs to the Guest kernel.
>  >>   >
>  >>   > Hence, moving the GED device creation before disabled vCPU objects get
>  >>   > destroyed as part of the post CPU init routine.
>  >>   >
>  >>   > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  >>   > ---
>  >>   >   hw/arm/virt.c | 10 +++++++---
>  >>   >   1 file changed, 7 insertions(+), 3 deletions(-)
>  >>   >
>  >>   > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
>  >>   > 918bcb9a1b..5f98162587 100644
>  >>   > --- a/hw/arm/virt.c
>  >>   > +++ b/hw/arm/virt.c
>  >>   > @@ -2467,6 +2467,12 @@ static void machvirt_init(MachineState
>  >>   > *machine)
>  >>   >
>  >>   >       create_gic(vms, sysmem);
>  >>   >
>  >>   > +    has_ged = has_ged && aarch64 && firmware_loaded &&
>  >>   > +              virt_is_acpi_enabled(vms);
>  >>   > +    if (has_ged) {
>  >>   > +        vms->acpi_dev = create_acpi_ged(vms);
>  >>   > +    }
>  >>   > +
>  >>   >       virt_cpu_post_init(vms, sysmem);
>  >>   >
>  >>   >       fdt_add_pmu_nodes(vms);
>  >>   > @@ -2489,9 +2495,7 @@ static void machvirt_init(MachineState
>  >>   *machine)
>  >>   >
>  >>   >       create_pcie(vms);
>  >>   >
>  >>   > -    if (has_ged && aarch64 && firmware_loaded &&
>  >>   virt_is_acpi_enabled(vms)) {
>  >>   > -        vms->acpi_dev = create_acpi_ged(vms);
>  >>   > -    } else {
>  >>   > +    if (!has_ged) {
>  >>   >           create_gpio_devices(vms, VIRT_GPIO, sysmem);
>  >>   >       }
>  >>   >
>  >>
>  >>   It's likely the GPIO device can be created before those disabled CPU objects
>  >>   are destroyed. It means the whole chunk of code can be moved together, I
>  >>   think.
>  >
>  > I was not totally sure of this. Hence, I kept the order of the rest like
>  > that. I can definitely check again whether we can do that to reduce the change.
>  >
>  
>  @has_ged is equivalent to '!vmc->no_ged' initially, and then it's
>  overridden by the following changes in this patch. The semantics of
>  @has_ged have changed and it's no longer the best name for what it
>  holds. There are two solutions: (1) Rename @has_ged to something
>  meaningful that matches the changes; (2) Move the whole chunk of code,
>  which I prefer. The GPIO device and GED device complement each
>  other, meaning the GPIO device will be created when the GED device has
>  been disallowed.
>  
>       has_ged = has_ged && aarch64 && firmware_loaded &&
>  virt_is_acpi_enabled(vms);
>  
>  The code to be moved together in virt.c since they're correlated:
>  
>       if (has_ged && aarch64 && firmware_loaded &&
>  virt_is_acpi_enabled(vms)) {
>           vms->acpi_dev = create_acpi_ged(vms);
>       } else {
>           create_gpio_devices(vms, VIRT_GPIO, sysmem);
>       }
>  
>       if (vms->secure && !vmc->no_secure_gpio) {
>           create_gpio_devices(vms, VIRT_SECURE_GPIO, secure_sysmem);
>       }
>  
>        /* connect powerdown request */
>        vms->powerdown_notifier.notify = virt_powerdown_req;
>        qemu_register_powerdown_notifier(&vms->powerdown_notifier);


If there is no dependency, then we can move it completely before virt_cpu_post_init().
I'll get back to you on this.

Thanks
Salil.

>  
>  Thanks,
>  Gavin
>  
>  
>  
>  


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-08-20 16:40         ` Salil Mehta via
@ 2024-08-21  6:25           ` Gavin Shan
  2024-08-21 10:23             ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-21  6:25 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Salil,

On 8/21/24 2:40 AM, Salil Mehta wrote:
> 
> I don’t understand this clearly. Are  you suggesting to reuse only single
> vCPU object to initialize all KVM vCPUs not yet plugged? If yes, then
> I'm not sure what do we gain here by adding this complexity? It does
> not consume time or resources because we are not realizing any of these
> vCPU object in any case.
> 

First of all, it seems we have different names and terms for those cold-booted vCPUs
and hotpluggable vCPUs. For example, vCPU-0 and vCPU-1 are cold-booted vCPUs while
vCPU-2 and vCPU-3 are hotpluggable vCPUs when we have '-smp maxcpus=4,cpus=2'. Let's
stick to that convention and those terms for easier discussion.

The idea is to avoid instantiating hotpluggable vCPUs in machvirt_init() and releasing
them in the same function. As I understand it, those hotpluggable vCPU instances
serve two purposes: (1) Relax the constraint that all vCPUs' (kvm) file descriptors
have to be created and populated; (2) Help to instantiate and realize the
GICv3 object.

For (1), I don't think we have to instantiate those hotpluggable vCPUs at all. In the
above example, where we have the command line '-smp maxcpus=4,cpus=2', it's unnecessary
to instantiate vCPU-2 and vCPU-3 just to create and populate their KVM file descriptors.
A vCPU's KVM file descriptor is created and populated by the following ioctls and function
calls. When the first vCPU (vCPU-0) is realized, the property corresponding to "&init" is
fixed for all vCPUs. It means all vCPUs have the same properties except the "vcpu_index".

   ioctl(vm-fd,   KVM_CREATE_VCPU,   vcpu_index);
   ioctl(vcpu-fd, KVM_ARM_VCPU_INIT, &init);
   kvm_park_vcpu(cs);

A vCPU's properties are determined by two sources and both are global. It means all vCPUs
should have the same properties: (a) Feature registers returned from the host. The function
kvm_arm_get_host_cpu_features() is called only once, meaning this source is the same for all
vCPUs; (b) The parameters provided by the user through '-cpu host,sve=off' are translated to
global properties and applied to all vCPUs when they're instantiated.

       (a)                                            (b)

   aarch64_host_initfn                          qemu_init
   kvm_arm_set_cpu_features_from_host           parse_cpu_option
     kvm_arm_get_host_cpu_features              cpu_common_parse_features
                                                qdev_prop_register_global
                                                  :
                                                device_post_init
                                                qdev_prop_set_globals

For (2), I'm still looking into the GICv3 code for a better understanding. Until now, I don't
see that we need the instantiated hotpluggable vCPUs either. For example, the redistributor
regions can be exposed based on 'maxcpus' instead of 'cpus'. The IRQ connection and teardown
can be done dynamically by connecting the board with GICv3 through callbacks in
ARMGICv3CommonClass. The connection between GICv3CPUState and CPUARMState can also be made
dynamically.
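[Editorial sketch] To illustrate the 'maxcpus' sizing point numerically: the redistributor
space a board reserves scales with the number of *possible* vCPUs, roughly as below. This is
an illustration only, not QEMU code; the 2x64KiB per-vCPU stride assumes a GICv3 without
VLPI support.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: each GICv3 redistributor occupies two 64KiB frames
 * (RD_base + SGI_base) when VLPIs are not supported. Sizing the region
 * by 'maxcpus' instead of 'cpus' leaves room for vCPUs hotplugged later. */
#define REDIST_FRAME_SIZE (2 * 64 * 1024)

static uint64_t redist_region_size(unsigned int maxcpus)
{
    /* One redistributor frame pair per possible vCPU. */
    return (uint64_t)maxcpus * REDIST_FRAME_SIZE;
}
```

So with '-smp maxcpus=4,cpus=2' the region would cover four frame pairs, not two.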

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-08-21  6:25           ` Gavin Shan
@ 2024-08-21 10:23             ` Salil Mehta via
  2024-08-21 13:32               ` Gavin Shan
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-21 10:23 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Wednesday, August 21, 2024 7:25 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  Hi Salil,
>  
>  On 8/21/24 2:40 AM, Salil Mehta wrote:
>  >
>  > I don’t understand this clearly. Are  you suggesting to reuse only
>  > single vCPU object to initialize all KVM vCPUs not yet plugged? If
>  > yes, then I'm not sure what do we gain here by adding this complexity?
>  > It does not consume time or resources because we are not realizing any
>  > of these vCPU object in any case.
>  >
>  
>  First of all, it seems we have different names and terms for those cold-
>  booted vCPUs and hotpluggable vCPUs. For example, vCPU-0 and vCPU-1
>  are cold-booted vCPUs while
>  vCPU-2 and vCPU-3 are hotpluggable vCPUs when we have '-smp
>  maxcpus=4,cpus=2'. Lets stick to convention and terms for easier discussion.
>  
>  The idea is to avoid instantiating hotpluggable vCPUs in machvirt_init() and
>  releasing them in the same function. As I can
>  understand, those hotpluggable vCPU instances are serving for two
>  purposes: (1) Relax the constraint that all vCPU's (kvm) file descriptor have
>  to be created and populated; 


We are devising *workarounds* in QEMU for the ARM CPU architectural constraints
in KVM and in the Guest Kernel, *not relaxing* them. We are not allowed to meddle
with the constraints. That is the whole point.

Not respecting those constraints led to the rejection of the earlier attempts to
upstream Virtual CPU Hotplug for ARM.


>  (2) Help to instantiate and realize
>  GICv3 object.
>  
>  For (1), I don't think we have to instantiate those hotpluggable vCPUs at all.
>  In the above example where we have command line '-smp
>  maxcpus=4,cpus=2', it's unnecessary to instantiate
>  vCPU-3 and vCPU-4 to create and populate their KVM file descriptors.


We cannot defer creating vCPUs in KVM until after the GIC has been initialized in KVM.
The GIC needs to know every vCPU that will ever exist right at the time it is getting
initialized. This is an ARM CPU architectural constraint.


>  A
>  vCPU's KVM file descriptor is created and populated by the following ioctls
>  and function calls. When the first vCPU (vCPU-0) is realized, the property
>  corresponding to "&init" is fixed for all vCPUs. It means all vCPUs have the
>  same properties except the "vcpu_index".
>  
>     ioctl(vm-fd,   KVM_CREATE_VCPU,   vcpu_index);
>     ioctl(vcpu-fd, KVM_ARM_VCPU_INIT, &init);
>     kvm_park_vcpu(cs);
>  
>  A vCPU's properties are determined by two sources and both are global. It
>  means all vCPUs should have same properties: (a) Feature registers
>  returned from the host. The function
>  kvm_arm_get_host_cpu_features() is called for once, meaning this source
>  is same to all vCPUs;


Sure, but what are you trying to save here?


>  (b) The parameters provided by user through '-cpu host,sve=off' are
>  translated to global properties and applied to all vCPUs when they're
>  instantiated.


Sure. Same is the case with PMU and other per-vCPU parameters.
We do not support heterogeneous computing and therefore we do not
have per-vCPU control of these features as of now.


>  
>         (a)                                            (b)
>  
>     aarch64_host_initfn                          qemu_init
>     kvm_arm_set_cpu_features_from_host           parse_cpu_option
>       kvm_arm_get_host_cpu_features              cpu_common_parse_features
>                                                  qdev_prop_register_global
>                                                    :
>                                                  device_post_init
>                                                  qdev_prop_set_globals


Sure, I understand the code flow but what are you trying to suggest here?


>  For (2), I'm still looking into the GICv3 code for better understanding. 


Oh, I thought you said you've finished your reviews 😊

Please take your time. For your reference, you might want to check:

KVMForum 2023:
https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf

KVMForum 2020:
https://kvm-forum.qemu.org/2020/Oct%2029_Challenges%20in%20Supporting%20Virtual%20CPU%20Hotplug%20in%20SoC%20Based%20Systems%20like%20ARM64_Salil%20Mehta.pdf


>  Until
>  now, I don't see that we need the instantiated hotpluggable vCPUs either.


I think I've already answered this above: it is because of the ARM architectural constraint.


>  For
>  example, the redistributor regions can be exposed based on 'maxcpus'
>  instead of 'cpus'.

You mean during the review of the code you found that we are not doing it?


>  The IRQ connection and teardown can be dynamically
>  done by connecting the board with GICv3 through callbacks in
>  ARMGICv3CommonClass.
>  The connection between GICv3CPUState and CPUARMState also can be
>  done dynamically.

Are you suggesting this after reviewing the code or you have to review it yet? 😉


Thanks
Salil.


>  
>  Thanks,
>  Gavin
>  


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-08-21 10:23             ` Salil Mehta via
@ 2024-08-21 13:32               ` Gavin Shan
  2024-08-22 10:58                 ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-21 13:32 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Salil,

On 8/21/24 8:23 PM, Salil Mehta wrote:
>>   
>>   On 8/21/24 2:40 AM, Salil Mehta wrote:
>>   >
>>   > I don’t understand this clearly. Are  you suggesting to reuse only
>>   > single vCPU object to initialize all KVM vCPUs not yet plugged? If
>>   > yes, then I'm not sure what do we gain here by adding this complexity?
>>   > It does not consume time or resources because we are not realizing any
>>   > of these vCPU object in any case.
>>   >
>>   
>>   First of all, it seems we have different names and terms for those cold-
>>   booted vCPUs and hotpluggable vCPUs. For example, vCPU-0 and vCPU-1
>>   are cold-booted vCPUs while
>>   vCPU-2 and vCPU-3 are hotpluggable vCPUs when we have '-smp
>>   maxcpus=4,cpus=2'. Lets stick to convention and terms for easier discussion.
>>   
>>   The idea is to avoid instantiating hotpluggable vCPUs in machvirt_init() and
>>   releasing them in the same function. As I can
>>   understand, those hotpluggable vCPU instances are serving for two
>>   purposes: (1) Relax the constraint that all vCPU's (kvm) file descriptor have
>>   to be created and populated;
> 
> 
> We are devising *workarounds* in Qemu for the ARM CPU architectural constraints
> in KVM and in Guest Kernel,  *not relaxing* them. We are not allowed to meddle with
> the constraints. That is the whole point.
> 
> Not having to respect those constraints led to rejection of the earlier attempts to
> upstream Virtual CPU Hotplug for ARM.
> 

I meant 'overcome' the constraints rather than 'relax' them. My apologies for any
confusion caused. Previously, you had attempted to create all vCPU objects and reuse
them when a vCPU is hot added. In the current implementation, the hotpluggable vCPUs
are instantiated and released pretty soon afterwards. I was bringing up a third
possibility, avoiding instantiating those hotpluggable vCPU objects, for discussion.
In this series, the life cycle of those hotpluggable vCPU objects is really short.
Again, I didn't say we must avoid instantiating those vCPU objects; I brought the
topic up ONLY for discussion.

> 
> (2) Help to instantiate and realize
>>   GICv3 object.
>>   
>>   For (1), I don't think we have to instantiate those hotpluggable vCPUs at all.
>>   In the above example where we have command line '-smp
>>   maxcpus=4,cpus=2', it's unnecessary to instantiate
>>   vCPU-3 and vCPU-4 to create and populate their KVM file descriptors.
> 
> 
> We cannot defer creating vCPUs in KVM until after the GIC has been initialized in KVM.
> It needs to know every vCPU that will ever exist right at the time it is getting
> initialized. This is an ARM CPU architectural constraint.
> 

It will be appreciated if more details other than 'an ARM CPU architectural constraint'
can be provided. I don't understand this constraint very well, at least.

> 
>   A
>>   vCPU's KVM file descriptor is create and populated by the following ioctls
>>   and function calls. When the first vCPU (vCPU-0) is realized, the property
>>   corresponding to "&init" is fixed for all vCPUs. It means all vCPUs have same
>>   properties except the "vcpu_index".
>>   
>>      ioctl(vm-fd,   KVM_CREATE_VCPU,   vcpu_index);
>>      ioctl(vcpu-fd, KVM_ARM_VCPU_INIT, &init);
>>      kvm_park_vcpu(cs);
>>   
>>   A vCPU's properties are determined by two sources and both are global. It
>>   means all vCPUs should have same properties: (a) Feature registers
>>   returned from the host. The function
>>   kvm_arm_get_host_cpu_features() is called for once, meaning this source
>>   is same to all vCPUs;
> 
> 
> Sure, but what are you trying to save here?
> 

As mentioned above, the life cycle of those hotpluggable vCPU objects is really
short. It still consumes time and memory to instantiate them. If I'm correct,
one of the primary goals of the vCPU hotplug feature is to save system boot-up
time, correct?

> 
>>   (b) The parameters provided by user through '-cpu host,sve=off' are
>>   translated to global properties and applied to all vCPUs when they're
>>   instantiated.
> 
> 
> Sure. Same is the case with PMU and other per-vCPU parameters.
> We do not support heterogenous computing and therefore we do not
> have per-vCPU control of these features as of now.
> 
> 
>>   
>>          (a)                                            (b)
>>   
>>      aarch64_host_initfn                          qemu_init
>>      kvm_arm_set_cpu_features_from_host           parse_cpu_option
>>        kvm_arm_get_host_cpu_features              cpu_common_parse_features
>>                                                   qdev_prop_register_global
>>                                                     :
>>                                                   device_post_init
>>                                                   qdev_prop_set_globals
> 
> 
> Sure, I understand the code flow but what are you trying to suggest here?
> 

I tried to explain that a vCPU object isn't needed to create and populate a vCPU's
file descriptor, as highlighted in (1). The information used to create the cold-booted
vCPU-0 can be reused, because all vCPUs have the same properties and feature set.

> 
>>   For (2), I'm still looking into the GICv3 code for better understanding.
> 
> 
> Oh, I thought you said you've finished your reviews 😊
> 
> Please take your time. For your reference, you might want to check:
> 
> KVMForum 2023:
> https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
> https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
> 
> KVMForum 2020:
> https://kvm-forum.qemu.org/2020/Oct%2029_Challenges%20in%20Supporting%20Virtual%20CPU%20Hotplug%20in%20SoC%20Based%20Systems%20like%20ARM64_Salil%20Mehta.pdf
> 

hmm, 'finished my review' has frankly been misread. By that, I meant I finished my tests and
provided all the comments I had. Some of them are questions and discussion points, which I
still need to follow up on.

> 
> Until
>>   now, I don't see we need the instantiated hotpluggable vCPUs either.
> 
> 
> I think, I've already answered this above it is because of ARM Architectural constraint.
> 
> 
>   For
>>   example, the redistributor regions can be exposed based on 'maxcpus'
>>   instead of 'cpus'.
> 
> You mean during the review of the code you found that we are not doing it?
> 

It's all about discussing the possibility of avoiding instantiating hotpluggable
vCPU objects.

> 
> The IRQ connection and teardown can be dynamically
>>   done by connecting the board with GICv3 through callbacks in
>>   ARMGICv3CommonClass.
>>   The connection between GICv3CPUState and CPUARMState also can be
>>   done dynamically.
> 
> Are you suggesting this after reviewing the code or you have to review it yet? 😉
> 

I was actually trying to ask for your input and feedback. I was hoping your input
would clear up my puzzles: why must vCPU objects be in place to create the GICv3 object?
Is it possible to create the GICv3 object without those vCPU objects? What kind
of effort would we need to avoid instantiating those hotpluggable vCPU objects?
The best way is perhaps to find the answer in the code myself ;-)

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-08-21 13:32               ` Gavin Shan
@ 2024-08-22 10:58                 ` Salil Mehta via
  2024-08-23 10:52                   ` Gavin Shan
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-22 10:58 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

>  From: qemu-arm-bounces+salil.mehta=huawei.com@nongnu.org <qemu-
>  arm-bounces+salil.mehta=huawei.com@nongnu.org> On Behalf Of Gavin Shan
>  Sent: Wednesday, August 21, 2024 2:33 PM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  Hi Salil,
>  
>  On 8/21/24 8:23 PM, Salil Mehta wrote:
>  >>
>  >>   On 8/21/24 2:40 AM, Salil Mehta wrote:
>  >>   >
>  >>   > I don’t understand this clearly. Are  you suggesting to reuse only
>  >>   > single vCPU object to initialize all KVM vCPUs not yet plugged? If
>  >>   > yes, then I'm not sure what do we gain here by adding this complexity?
>  >>   > It does not consume time or resources because we are not realizing any
>  >>   > of these vCPU object in any case.
>  >>   >
>  >>
>  >>   First of all, it seems we have different names and terms for those cold-
>  >>   booted vCPUs and hotpluggable vCPUs. For example, vCPU-0 and vCPU-1
>  >>   are cold-booted vCPUs while
>  >>   vCPU-2 and vCPU-3 are hotpluggable vCPUs when we have '-smp
>  >>   maxcpus=4,cpus=2'. Lets stick to convention and terms for easier discussion.
>  >>
>  >>   The idea is to avoid instantiating hotpluggable vCPUs in machvirt_init() and
>  >>   releasing them in the same function. As I can
>  >>   understand, those hotpluggable vCPU instances are serving for two
>  >>   purposes: (1) Relax the constraint that all vCPU's (kvm) file descriptor have
>  >>   to be created and populated;
>  >
>  >
>  > We are devising *workarounds* in Qemu for the ARM CPU architectural
>  > constraints in KVM and in Guest Kernel,  *not relaxing* them. We are
>  > not allowed to meddle with the constraints. That is the whole point.
>  >
>  > Not having to respect those constraints led to rejection of the
>  > earlier attempts to upstream Virtual CPU Hotplug for ARM.
>  >
>  
>  I meant 'overcome' the constraints rather than 'relax' them. My apologies for
>  any confusion caused.


Ok. No issues. It was important for me to clarify though.


>  Previously, you had attempted to create all vCPU objects
>  and reuse them when a vCPU is hot added.

Yes, at the QOM level. But that approach did not realize the unplugged/yet-to-be-plugged
vCPUs. We were just using the QOM vCPU objects as placeholders.

>  In the current implementation, the
>  hotpluggable vCPUs are instantiated and released pretty soon. I was
>  bringing the third possibility, to avoid instantiating those hotpluggable vCPU
>  objects, for discussion. 


Are you suggesting not calling the KVM_ARM_VCPU_INIT IOCTL at all for the vCPUs
which are part of the possible list but not yet plugged?

If yes, we cannot do that, as the KVM vCPUs should be fully initialized even before
the VGIC is initialized inside KVM. This is a constraint. I've explained this in
detail in the cover letter of this patch-set and in the slides I shared
earlier.
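[Editorial sketch] The ordering constraint described above can be reduced to a toy model.
Everything below is illustrative, not QEMU code: the point is only that every *possible*
vCPU gets its (kvm) descriptor created up front, before the GIC comes up, while only the
cold-booted vCPUs keep a realized object.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the constraint: all names and fields here are hypothetical. */
#define MAX_CPUS 4

typedef struct {
    bool fd_created;   /* models KVM_CREATE_VCPU + KVM_ARM_VCPU_INIT done */
    bool realized;     /* models a QOM object kept for a cold-booted vCPU */
} ToyVcpu;

static void toy_machine_init(ToyVcpu *v, int maxcpus, int cpus)
{
    /* Step 1: create (and park) descriptors for *all* possible vCPUs. */
    for (int i = 0; i < maxcpus; i++) {
        v[i].fd_created = true;
        v[i].realized = (i < cpus);  /* only cold-booted vCPUs stay realized */
    }
    /* Step 2 (not modeled): only now may the GIC be initialized, since it
     * relies on every possible vCPU's descriptor already existing. */
}
```

With '-smp maxcpus=4,cpus=2' this gives four created descriptors but only two
realized vCPUs, which is the shape of the current series.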


>  In this series, the life cycle of those hotpluggable
>  vCPU objects are really short. Again, I didn't say we must avoid instantiating
>  those vCPU objects, I brought the topic ONLY for discussion.

Sure, I appreciate that. For the details of the reasons, please see below:

1. Cover Letter of this patch-set (Constraints are explained there)
2. KVMForum Slides of 2020 and 2023


>  > (2) Help to instantiate and realize
>  >>   GICv3 object.
>  >>
>  >>   For (1), I don't think we have to instantiate those hotpluggable vCPUs at all.
>  >>   In the above example where we have command line '-smp
>  >>   maxcpus=4,cpus=2', it's unnecessary to instantiate
>  >>   vCPU-3 and vCPU-4 to create and populate their KVM file descriptors.
>  >
>  >
>  > We cannot defer create vCPU in KVM after GIC has been initialized in KVM.
>  > It needs to know every vCPU that will ever exists right at the time it
>  > is getting Initialized. This is an ARM CPU Architectural constraint.
>  >
>  
>  It will be appreciated if more details other than 'an ARM CPU architectural constraint'
>  can be provided. I don't understand this constrain very well at least.


We cannot do that, as we MUST present the KVM vCPUs to the VGIC fully initialized,
even before the VGIC starts its own initialization. Initializing a vCPU also initializes
the system registers of the corresponding KVM vCPU.

For example, MPIDR_EL1 must be initialized at vCPU INIT time. We cannot
avoid this. The MPIDR value is used by the VGIC during its initialization, so it MUST
be present for all of the possible KVM vCPUs right from the start of vgic_init().

Please check the cover letter of this patch-set and the KVMForum slides; I explained
these constraints there. Please review and comment there and let me know what is
not clear from the text.
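To make the ordering concrete, here is a toy model of the constraint (plain C, no real KVM calls; all the toy_* names and structures are invented for illustration only): the VGIC init walks every possible vCPU and consumes its MPIDR, so it must fail if any vCPU has not been through vCPU INIT first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VCPUS   4
#define MPIDR_UNSET (~0ULL)

/* Toy stand-in for KVM VM state; not a real KVM structure. */
typedef struct {
    unsigned long long mpidr[MAX_VCPUS]; /* fixed at vCPU INIT time */
    size_t num_possible;
    bool vgic_ready;
} ToyVM;

static void toy_vm_init(ToyVM *vm, size_t num_possible)
{
    vm->num_possible = num_possible;
    vm->vgic_ready = false;
    for (size_t i = 0; i < MAX_VCPUS; i++) {
        vm->mpidr[i] = MPIDR_UNSET;
    }
}

/* Models ioctl(vcpu_fd, KVM_ARM_VCPU_INIT): fixes MPIDR for the vCPU. */
static void toy_vcpu_init(ToyVM *vm, size_t idx)
{
    vm->mpidr[idx] = idx; /* affinity derived from the index in this toy */
}

/*
 * Models KVM's vgic_init(): it walks *all* possible vCPUs and consumes
 * their MPIDR values, so every vCPU must be initialized beforehand.
 * Returns 0 on success, -1 if any possible vCPU is uninitialized.
 */
static int toy_vgic_init(ToyVM *vm)
{
    for (size_t i = 0; i < vm->num_possible; i++) {
        if (vm->mpidr[i] == MPIDR_UNSET) {
            return -1; /* constraint violated */
        }
    }
    vm->vgic_ready = true;
    return 0;
}
```

With '-smp maxcpus=4,cpus=2', initializing only vCPU-0 and vCPU-1 leaves the toy VGIC init failing until the hotpluggable vCPU-2 and vCPU-3 are initialized as well.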


>  >   A
>  >>   vCPU's KVM file descriptor is create and populated by the following ioctls
>  >>   and function calls. When the first vCPU (vCPU-0) is realized, the property
>  >>   corresponding to "&init" is fixed for all vCPUs. It means all vCPUs have same
>  >>   properties except the "vcpu_index".
>  >>
>  >>      ioctl(vm-fd,   KVM_CREATE_VCPU,   vcpu_index);
>  >>      ioctl(vcpu-fd, KVM_ARM_VCPU_INIT, &init);
>  >>      kvm_park_vcpu(cs);
>  >>
>  >>   A vCPU's properties are determined by two sources and both are global. It
>  >>   means all vCPUs should have same properties: (a) Feature registers
>  >>   returned from the host. The function
>  >>   kvm_arm_get_host_cpu_features() is called for once, meaning this source
>  >>   is same to all vCPUs;
>  >
>  >
>  > Sure, but what are you trying to save here?
>  >
>  
>  As mentioned above, the life cycle of those hotpluggable vCPU objects are
>  really short. They still consume time and memory to instantiate them. If I'm
>  correct, one of the primary goal for vCPU hotplug feature is to save system
>  boot-up time, correct?


Correct. We targeted vCPU hotplug at Kata Containers, for on-demand resource
allocation and to save resources. Kata Containers can work with different types
of VMMs like Qemu, and with microVMs like Firecracker. AFAIK, the use case of a
microVM is slightly different from that of normal containers. They are short-lived
(say, around 15 minutes) and require ultra-fast boot-up times (say, less than 125 ms) -
these figures are from Amazon, who introduced the microVM concept in the last decade.

With the current patches, we have only partially achieved what we set out to do,
i.e. Kata/Qemu, but we also want to target Kata/microVM. In our case, we want
that microVM to be Qemu-based for ARM. I think x86 has already shed a lot
of legacy stuff and created a microVM in Qemu. I'm not sure how it compares
against a true microVM like Firecracker. It would be a good target to reduce the
memory footprint of the ARM Qemu virt machine, or to create a new one
just like x86 did. Using the vCPU hotplug feature, we were able to drastically reduce
the boot-up times of Qemu. Please check the calibrated performance figures in the
KVMForum 2023 slide 19G (Page 26) [1]

[1]  https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf

Last year, I prototyped a microVM for ARM. Michael Tsirkin suggested that if
the performance numbers of the ARM virt machine can match the x86 microVM, then
we might not require explicit microVM code for ARM. Hence, my current effort
is to reduce the memory footprint of the existing virt machine. But I do have a rough
prototype of the microVM as well. We can debate that later in a different
discussion.


>  >>   (b) The parameters provided by user through '-cpu host,sve=off' are
>  >>   translated to global properties and applied to all vCPUs when they're
>  >>   instantiated.
>  >
>  >
>  > Sure. Same is the case with PMU and other per-vCPU parameters.
>  > We do not support heterogenous computing and therefore we do not have
>  > per-vCPU control of these features as of now.
>  >
>  >
>  >>
>  >>          (a)                                            (b)
>  >>
>  >>      aarch64_host_initfn                          qemu_init
>  >>      kvm_arm_set_cpu_features_from_host           parse_cpu_option
>  >>        kvm_arm_get_host_cpu_features
>  cpu_common_parse_features
>  >>                                                   qdev_prop_register_global
>  >>                                                     :
>  >>                                                   device_post_init
>  >>
>  >> qdev_prop_set_globals
>  >
>  >
>  > Sure, I understand the code flow but what are you trying to suggest here?
>  >
>  
>  I tried to explain that vCPU object isn't needed to create and populate
>  vCPU's file descriptors, as highlight in (1). The information used to create the
>  cold-booted
>  vCPU-0 can be reused because all vCPUs have same properties and feature
>  set.


It does not matter. We use those QOM vCPU object states to initialize the Qemu
GICv3 model with the maximum possible vCPUs, and then release the QOM vCPU
objects which are yet-to-be-plugged.


>  >>   For (2), I'm still looking into the GICv3 code for better understanding.
>  >
>  >
>  > Oh, I thought you said you've finished your reviews 😊
>  >
>  > Please take your time. For your reference, you might want to check:
>  >
>  > KVMForum 2023:
>  > https://kvm-
>  forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Vir
>  > t_CPU_Hotplug_-__ii0iNb3.pdf
>  > https://kvm-forum.qemu.org/2023/KVM-forum-cpu-
>  hotplug_7OJ1YyJ.pdf
>  >
>  > KVMForum 2020:
>  > https://kvm-
>  forum.qemu.org/2020/Oct%2029_Challenges%20in%20Supporting%
>  >
>  20Virtual%20CPU%20Hotplug%20in%20SoC%20Based%20Systems%20like%2
>  0ARM64_
>  > Salil%20Mehta.pdf
>  >
>  
>  hmm, 'finished my review' has been misread frankly. By that, I meant I
>  finished my tests and provided all the comments I had. Some of them are
>  questions and discussions, which I still need to follow up.


Sure. No worries. Even if you miss something, you will have more chances to comment
on the upcoming RFC V4 😊


>  > Until
>  >>   now, I don't see we need the instantiated hotpluggable vCPUs either.
>  >
>  >
>  > I think, I've already answered this above it is because of ARM Architectural
>  constraint.
>  >
>  >
>  >   For
>  >>   example, the redistributor regions can be exposed based on 'maxcpus'
>  >>   instead of 'cpus'.
>  >
>  > You mean during the review of the code you found that we are not doing
>  it?
>  >
>  
>  It's all about the discussion to the possibility to avoid instantiating
>  hotpluggable vCPU objects.


As mentioned above, with the current KVM code you cannot. But if we
really wanted to, then perhaps we would need to change KVM.

I might be wrong, but AFAICS I don't see a reason why we cannot have
something like *batch* KVM vCPU create-and-initialize instead of the current
sequential KVM operations. This would avoid multiple calls into the KVM host
and could improve Qemu init time further. But it would require a separate
discussion on LKML including all the KVM folks.

This has the potential to delay acceptance of the vCPU hotplug feature, and I'm
really not in favor of that. We have already stretched it a lot because of
the earlier standards-change acceptance.
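Roughly, the *batch* idea could be sketched as below. This is a toy model only: KVM has no such batch ioctl today, and the toy_* names are invented purely to illustrate why one batched call would cut the number of user/kernel transitions from 2*N (CREATE + INIT per vCPU) down to 1.

```c
#include <assert.h>
#include <stddef.h>

/* Shared init parameters: no heterogeneity, so one set fits all vCPUs. */
typedef struct {
    unsigned int features;
} ToyInit;

/* Toy accounting of simulated user<->kernel transitions. */
typedef struct {
    size_t created;
    size_t host_calls;
} ToyKVM;

/* Current scheme: two calls (KVM_CREATE_VCPU + KVM_ARM_VCPU_INIT) per vCPU. */
static void toy_create_sequential(ToyKVM *kvm, const ToyInit *init, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        kvm->host_calls++;   /* models ioctl(vm_fd, KVM_CREATE_VCPU, i)   */
        kvm->host_calls++;   /* models ioctl(vcpu_fd, KVM_ARM_VCPU_INIT)  */
        kvm->created++;
        (void)init;
    }
}

/* Hypothetical batch scheme: one call creates and initializes the range. */
static void toy_create_batch(ToyKVM *kvm, const ToyInit *init, size_t n)
{
    kvm->host_calls++;       /* a single, invented batched ioctl */
    kvm->created += n;
    (void)init;
}
```

Whether the saved transitions matter in practice would need measurement, and the real interface design belongs in the LKML discussion mentioned above.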


>  > The IRQ connection and teardown can be dynamically
>  >>   done by connecting the board with GICv3 through callbacks in
>  >>   ARMGICv3CommonClass.
>  >>   The connection between GICv3CPUState and CPUARMState also can be
>  >>   done dynamically.
>  >
>  > Are you suggesting this after reviewing the code or you have to review
>  > it yet? 😉
>  >
>  
>  I was actually trying to ask for your input and feedback. I was hoping your
>  input to clear my puzzles: why vCPU objects must be in place to create
>  GICv3 object?
>  Is it possible to create the GICv3 object without those vCPU objects? 


No. The VGIC initializes IRQs to target KVM vCPUs; it expects the same KVM vCPU MPIDR
(MP-AFFINITY) that was configured when the KVM vCPUs were initialized in the first place,
otherwise the VGIC initialization will not happen correctly. Hence, the sequence.

The sequence of these initializations is generally strictly controlled by the specification,
which is closely tied to the hardware, including power-up initialization.
You need to honor the expectations of the KVM VGIC init, which in turn is
compliant with the ARM CPU architecture specification. It is not just loosely written code.


What
>  kinds of efforts we need to avoid instantiating those hotpluggable vCPU
>  objects.


I mentioned one of the ways above: introduce *batch* KVM vCPU create-and-initialize.
But it would have to undergo greater scrutiny because we would be touching
a common part that might affect many stakeholders. But this is a discussion
we can have later, as part of microVM for ARM.


Thanks
Salil.

>  The best way perhaps is to find the answer from the code by myself ;-)
>  
>  Thanks,
>  Gavin
>  


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-08-22 10:58                 ` Salil Mehta via
@ 2024-08-23 10:52                   ` Gavin Shan
  2024-08-23 13:17                     ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Gavin Shan @ 2024-08-23 10:52 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Salil,

On 8/22/24 8:58 PM, Salil Mehta wrote:
>>   On 8/21/24 8:23 PM, Salil Mehta wrote:
>>   >>
>>   >>   On 8/21/24 2:40 AM, Salil Mehta wrote:
>>   >>   >
>>   >>   > I don’t understand this clearly. Are  you suggesting to reuse only
>>   >>   > single vCPU object to initialize all KVM vCPUs not yet plugged? If
>>   >>   > yes, then I'm not sure what do we gain here by adding this complexity?
>>   >>   > It does not consume time or resources because we are not realizing any
>>   >>   > of these vCPU object in any case.
>>   >>   >
>>   >>
>>   >>   First of all, it seems we have different names and terms for those cold-
>>   >>   booted vCPUs and hotpluggable vCPUs. For example, vCPU-0 and vCPU-1
>>   >>   are cold-booted vCPUs while
>>   >>   vCPU-2 and vCPU-3 are hotpluggable vCPUs when we have '-smp
>>   >>   maxcpus=4,cpus=2'. Lets stick to convention and terms for easier discussion.
>>   >>
>>   >>   The idea is avoid instantiating hotpluggable vCPUs in virtmach_init() and
>>   >>   released in the same function for those hotpluggable vCPUs. As I can
>>   >>   understand, those hotpluggable vCPU instances are serving for two
>>   >>   purposes: (1) Relax the constraint that all vCPU's (kvm) file descriptor have
>>   >>   to be created and populated;
>>   >
>>   >
>>   > We are devising *workarounds* in Qemu for the ARM CPU architectural
>>   > constraints in KVM and in Guest Kernel,  *not relaxing* them. We are
>>   > not allowed to meddle with the constraints. That is the whole point.
>>   >
>>   > Not having to respect those constraints led to rejection of the
>>   > earlier attempts to upstream Virtual CPU Hotplug for ARM.
>>   >
>>   
>>   I meant to 'overcome' the constraints by 'relax'. My apologies if there're any
>>   caused confusions.
> 
> 
> Ok. No issues. It was important for me to clarify though.
> 
> 
>   Previously, you had attempt to create all vCPU objects
>>   and reuse them when vCPU hot added.
> 
> Yes, at QOM level. But that approach did not realize the unplugged/yet-to-be-plugged
> vCPUs. We were just using QOM vCPU objects as the place holders
> 

Right, my point was actually that vCPU objects are too heavy as placeholders. It was
the reason why I had the concern: why can't those hotpluggable vCPU objects be avoided
during boot-up time?

>   In current implementation, the
>>   hotpluggable vCPUs are instantiated and released pretty soon. I was
>>   bringing the third possibility, to avoid instantiating those hotpluggable vCPU
>>   objects, for discussion.
> 
> 
> Are you suggesting not calling KVM_ARM_VCPU_INIT IOCTL as all for the vCPUs
> which are part of possible list but not yet plugged?
> 
> If yes, we cannot do that as KVM vCPUs should be fully initialized even before
> VGIC is initialized inside the KVM. This is a constraint. I've explained this in
> detail in the cover letter of this patch-set and in the slides I have shared
> earlier.
> 

No, that's not what I was suggesting. What I suggested is to avoid creating those hotpluggable
vCPU objects (placeholders) during boot-up time. However, all vCPU file descriptors (KVM
objects) still need to be created and initialized before GICv3 is initialized. It's one of
the constraints. So we need to create and populate all vCPU file descriptors through
ioctl(vm_fd, CREATE_VCPU) and ioctl(vcpu_fd, INIT_VCPU) before the GICv3 object is created and
realized. As I explained in the previous reply, the hotpluggable vCPU objects (placeholders)
don't have to be created in order to initialize and populate the vCPU file descriptors for those
hotpluggable vCPUs. I think the parameters used to create and initialize vCPU-0's file descriptor
can be reused by all other vCPUs, because we don't support heterogeneous vCPUs.

What I suggested is something like below: the point is to avoid instantiating those hotpluggable
vCPUs, but their vCPU file descriptor (KVM object) are still created and initialized.

     static void machvirt_init(MachineState *machine)
     {

         /*
          * Instantiate and realize vCPU-0, record the parameter passed to
          * ioctl(vcpu-fd, VCPU_INIT, &init), or a better place to remember the parameter.
          * The point is the parameter can be shared by all vCPUs.
          */

         /*
          * Create vCPU descriptors for all other vCPUs (including hotpluggable vCPUs).
          * The remembered parameter is reused and passed to ioctl(vcpu-fd, VCPU_INIT, &init).
          */

         /* Instantiate and realize the other cold-booted vCPUs */

         /* Instantiate and realize GICv3 */

     }

> 
> In this series, the life cycle of those hotpluggable
>>   vCPU objects are really short. Again, I didn't say we must avoid instantiating
>>   those vCPU objects, I brought the topic ONLY for discussion.
> 
> Sure, I appreciate that. For the details of the reasons please follow below:
> 
> 1. Cover Letter of this patch-set (Constraints are explained there)
> 2. KVMForum Slides of 2020 and 2023
> 
> 
>>   > (2) Help to instantiate and realize
>>   >>   GICv3 object.
>>   >>
>>   >>   For (1), I don't think we have to instantiate those hotpluggable vCPUs at all.
>>   >>   In the above example where we have command line '-smp
>>   >>   maxcpus=4,cpus=2', it's unnecessary to instantiate
>>   >>   vCPU-3 and vCPU-4 to create and populate their KVM file descriptors.
>>   >
>>   >
>>   > We cannot defer create vCPU in KVM after GIC has been initialized in KVM.
>>   > It needs to know every vCPU that will ever exists right at the time it
>>   > is getting Initialized. This is an ARM CPU Architectural constraint.
>>   >
>>   
>>   It will be appreciated if more details other than 'an ARM CPU architectural constraint'
>>   can be provided. I don't understand this constrain very well at least.
> 
> 
> We cannot do that as we MUST present KVM vCPUs to the VGIC fully initialized,
> even before it starts its initialization. Initialization of the vCPUs also initializes
> the system registers for the corresponding KVM vCPU.
> 
> For example, MPIDR_EL1 must be initialized at VCPU INIT time. We cannot
> avoid this. MPIDR value is used by VGIC during its initialization. This MUST be
> present for all of the possible KVM vCPUs right from start during vgic_init()
> 
> Please check the cover letter of this patch-set, I explained these there and the
> KVMForum slides.  Please review and comment there and let me know what is
> not clear from the text.
> 

It seems my suggestion wasn't fully understood. I was suggesting avoiding instantiation of
those hotpluggable vCPU objects (placeholders) during QEMU start-up time. All vCPU file
descriptors (the vCPUs' corresponding KVM objects) still need to be in place before the GICv3
object is instantiated and realized.

> 
>>   >   A
>>   >>   vCPU's KVM file descriptor is create and populated by the following ioctls
>>   >>   and function calls. When the first vCPU (vCPU-0) is realized, the property
>>   >>   corresponding to "&init" is fixed for all vCPUs. It means all vCPUs have same
>>   >>   properties except the "vcpu_index".
>>   >>
>>   >>      ioctl(vm-fd,   KVM_CREATE_VCPU,   vcpu_index);
>>   >>      ioctl(vcpu-fd, KVM_ARM_VCPU_INIT, &init);
>>   >>      kvm_park_vcpu(cs);
>>   >>
>>   >>   A vCPU's properties are determined by two sources and both are global. It
>>   >>   means all vCPUs should have same properties: (a) Feature registers
>>   >>   returned from the host. The function
>>   >>   kvm_arm_get_host_cpu_features() is called for once, meaning this source
>>   >>   is same to all vCPUs;
>>   >
>>   >
>>   > Sure, but what are you trying to save here?
>>   >
>>   
>>   As mentioned above, the life cycle of those hotpluggable vCPU objects are
>>   really short. They still consume time and memory to instantiate them. If I'm
>>   correct, one of the primary goal for vCPU hotplug feature is to save system
>>   boot-up time, correct?
> 
> 
> Correct. We targeted vCPU hotplug for Kata-containers for on-demand resource
> allocation and saving the resources. Kata-containers can work with different types
> of VMM like Qemu and microVMs like Firecracker. AFAIK, Usecase of microVM is
> slightly different than the normal containers. They are short lived (say around
> 15 min) and require ultrafast boot-up times (say less than 125 ms) - these figures
> are from Amazon who invented the concept of microVM in the earlier decade.
> 
> With the current patches, we have only partially achieved what we had started
> i.e. Kata/Qemu but we also want to target Kata/microVM. In our case, we want
> that microVM to be Qemu based fro ARM. I think x86 already has reduced lots
> of legacy stuff and created a microVM in Qemu. I'm not sure how it compares
> against the true microVM like Firecracker. It will be a good target to reduce
> memory foot print of ARM Qemu Virt Machine. or think or creating a new one
> just like x86. Using the vCPU Hotplug feature we were drastically able to reduce
> the boot up times of Qemu. Please check the calibrated performance figures in
> KVmForum  2023 slide 19G (Page 26) [1]
> 
> [1]  https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
> 
> Last year, I had prototyped a microVM for ARM, Michael Tsirkin suggested that if
> the performance number of the ARM Virt machine can match the x86 microVM then
> we might not require an explicit microVM code for ARM. Hence, my current efforts
> are to reduce the memory foot print of existing Virt machine. But I do have a rough
> prototype of microVM as well. We can debate about that later in a different
> discussion.
> 

Thanks for the link to the slides. Yeah, it's nice to reduce the boot-up time
and memory footprint where possible. The vCPU hotplug feature may help to improve the
performance, but all the other paths can also impact it. In summary, reducing
the memory footprint and boot-up time is a comprehensive goal, and
other components (paths) need optimization too in order to achieve it.

> 
>>   >>   (b) The parameters provided by user through '-cpu host,sve=off' are
>>   >>   translated to global properties and applied to all vCPUs when they're
>>   >>   instantiated.
>>   >
>>   >
>>   > Sure. Same is the case with PMU and other per-vCPU parameters.
>>   > We do not support heterogenous computing and therefore we do not have
>>   > per-vCPU control of these features as of now.
>>   >
>>   >
>>   >>
>>   >>          (a)                                            (b)
>>   >>
>>   >>      aarch64_host_initfn                          qemu_init
>>   >>      kvm_arm_set_cpu_features_from_host           parse_cpu_option
>>   >>        kvm_arm_get_host_cpu_features
>>   cpu_common_parse_features
>>   >>                                                   qdev_prop_register_global
>>   >>                                                     :
>>   >>                                                   device_post_init
>>   >>
>>   >> qdev_prop_set_globals
>>   >
>>   >
>>   > Sure, I understand the code flow but what are you trying to suggest here?
>>   >
>>   
>>   I tried to explain that vCPU object isn't needed to create and populate
>>   vCPU's file descriptors, as highlight in (1). The information used to create the
>>   cold-booted
>>   vCPU-0 can be reused because all vCPUs have same properties and feature
>>   set.
> 
> 
> It does not matter. We use those QOM vCPU object states to initializes Qemu
> GICv3 model with max possible vCPUs and then release the QOM vCPU objects.
> which are yet-to-be-plugged.
> 

It's what has been implemented in this series. My concern remains: why can't those
hotpluggable vCPU objects be avoided? Again, their corresponding vCPU
file descriptors (KVM vCPU objects) still have to be in place before GICv3
is instantiated and realized.

> 
>>   >>   For (2), I'm still looking into the GICv3 code for better understanding.
>>   >
>>   >
>>   > Oh, I thought you said you've finished your reviews 😊
>>   >
>>   > Please take your time. For your reference, you might want to check:
>>   >
>>   > KVMForum 2023:
>>   > https://kvm-
>>   forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Vir
>>   > t_CPU_Hotplug_-__ii0iNb3.pdf
>>   > https://kvm-forum.qemu.org/2023/KVM-forum-cpu-
>>   hotplug_7OJ1YyJ.pdf
>>   >
>>   > KVMForum 2020:
>>   > https://kvm-
>>   forum.qemu.org/2020/Oct%2029_Challenges%20in%20Supporting%
>>   >
>>   20Virtual%20CPU%20Hotplug%20in%20SoC%20Based%20Systems%20like%2
>>   0ARM64_
>>   > Salil%20Mehta.pdf
>>   >
>>   
>>   hmm, 'finished my review' has been misread frankly. By that, I meant I
>>   finished my tests and provided all the comments I had. Some of them are
>>   questions and discussions, which I still need to follow up.
> 
> 
> Sure. No worries. Even if you miss, you will have more chance to comment on
> upcoming RFC V4 😊
> 

Ok :-)

> 
>>   > Until
>>   >>   now, I don't see we need the instantiated hotpluggable vCPUs either.
>>   >
>>   >
>>   > I think, I've already answered this above it is because of ARM Architectural
>>   constraint.
>>   >
>>   >
>>   >   For
>>   >>   example, the redistributor regions can be exposed based on 'maxcpus'
>>   >>   instead of 'cpus'.
>>   >
>>   > You mean during the review of the code you found that we are not doing
>>   it?
>>   >
>>   
>>   It's all about the discussion to the possibility to avoid instantiating
>>   hotpluggable vCPU objects.
> 
> 
> As mentioned above, with the current KVM code you cannot. But if we
> really want to then perhaps we would need to change KVM.
> 
> I might be wrong but AFAICS I don’t see a reason why we cannot have
> something like *batch* KVM vCPU create and  initialize instead of current
> sequential KVM operations. This will avoid multiple calls to the KVM Host
> and can improve Qemu init time further. But this will require a separate
> discussion in the LKML including all the KVM folks.
> 
> This has potential to delay the vCPU hotplug feature acceptance and I'm
> really not in favor of that. We have already stretched it a lot because of
> the standards change acceptance earlier.
> 

Again, my suggestion wasn't completely understood. I was suggesting avoiding instantiation
of those hotpluggable objects; their vCPU file descriptors still need to be in place
before GICv3's instantiation and realization.

Yes, I was also concerned that too many code changes would be needed if my suggestion were
accepted. It would definitely delay the feature's upstreaming process. That's why I said the
topic (avoiding the hotpluggable objects) is just for discussion now. We can do it (as a
separate optimization) after your current implementation is merged.

> 
>>   > The IRQ connection and teardown can be dynamically
>>   >>   done by connecting the board with GICv3 through callbacks in
>>   >>   ARMGICv3CommonClass.
>>   >>   The connection between GICv3CPUState and CPUARMState also can be
>>   >>   done dynamically.
>>   >
>>   > Are you suggesting this after reviewing the code or you have to review
>>   > it yet? 😉
>>   >
>>   
>>   I was actually trying to ask for your input and feedback. I was hoping your
>>   input to clear my puzzles: why vCPU objects must be in place to create
>>   GICv3 object?
>>   Is it possible to create the GICv3 object without those vCPU objects?
> 
> 
> No. VGIC initializes IRQs to target KVM vCPUs, it would expect same KVM vCPU MPIDR
> or MP-AFFINITY configured when KVM vCPUs were initialized at the first place
> otherwise the VGIC initialization will not happen correctly. Hence, the sequence.
> 
> The sequence of these initialization is generally strictly controlled by specification
> which is closely tied up with hardware including powering up initializations.
> You will need to honor the expectations of the KVM VGIC init which in turn are
> ARM CPU Architecture specification compliant. It is not just a loosely written code.
> 

umm, as I explained from the beginning, all KVM vCPU file descriptors are still
in place before GICv3 is instantiated and realized. With those KVM vCPU file
descriptors, we shouldn't have a problem beyond the extra code changes needed,
or am I missing something? :)

> 
> What
>>   kinds of efforts we need to avoid instantiating those hotpluggable vCPU
>>   objects.
> 
> 
> I mentioned one of the ways above. Introduce *batch* KVM vCPU create &
> initialize. But it will have to undergo greater scrutiny because we are touching
> a common part which might affect many stake holders.  But this is a discussion
> we can do later as part of microVM for ARM.
> 

Remember, you just had the architecture-agnostic series merged. The vCPU file
descriptor can be parked and picked up at a later time based on the vCPU index.
A KVM vCPU create in batch mode can be well supported this way. What you need to
do is record the parameters passed to ioctl(vm_fd, CREATE_VCPU, index) and ioctl(vcpu_fd,
INIT_CPU) for vCPU-0 in hw/arm/virt.c and create all the other vCPU file descriptors
based on the recorded parameters.
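For illustration, the park-and-pick-up flow keyed by vCPU index could be modeled like this. It is a toy sketch only: QEMU's real logic lives around kvm_park_vcpu() in the accel code, and the toy_* names and the fixed-size table are invented here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_VCPUS 8
#define TOY_FD_NONE   (-1)

/* Toy parking lot keyed by vCPU index; not QEMU's real data structure. */
typedef struct {
    int parked_fd[TOY_MAX_VCPUS];
} ToyParkingLot;

static void toy_lot_init(ToyParkingLot *lot)
{
    for (size_t i = 0; i < TOY_MAX_VCPUS; i++) {
        lot->parked_fd[i] = TOY_FD_NONE;
    }
}

/* Models parking a created-and-initialized vCPU fd at boot time. */
static void toy_park(ToyParkingLot *lot, size_t idx, int fd)
{
    lot->parked_fd[idx] = fd;
}

/*
 * Models picking the fd back up at hotplug time, based on the vCPU index.
 * Returns TOY_FD_NONE if no fd is parked for that index.
 */
static int toy_unpark(ToyParkingLot *lot, size_t idx)
{
    int fd = lot->parked_fd[idx];
    lot->parked_fd[idx] = TOY_FD_NONE;
    return fd;
}
```

The point of the model is only that the fd survives independently of any QOM vCPU object: once parked, it can be retrieved by index whenever the hotplug happens.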

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-08-23 10:52                   ` Gavin Shan
@ 2024-08-23 13:17                     ` Salil Mehta via
  2024-08-24 10:03                       ` Gavin Shan
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-08-23 13:17 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gavin,

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Friday, August 23, 2024 11:52 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  Hi Sail,
>  
>  On 8/22/24 8:58 PM, Salil Mehta wrote:
>  >>   On 8/21/24 8:23 PM, Salil Mehta wrote:
>  >>   >>
>  >>   >>   On 8/21/24 2:40 AM, Salil Mehta wrote:
>  >>   >>   >
>  >>   >>   > I don’t understand this clearly. Are  you suggesting to reuse only
>  >>   >>   > single vCPU object to initialize all KVM vCPUs not yet plugged? If
>  >>   >>   > yes, then I'm not sure what do we gain here by adding this complexity?
>  >>   >>   > It does not consume time or resources because we are not realizing any
>  >>   >>   > of these vCPU object in any case.
>  >>   >>   >
>  >>   >>
>  >>   >>   First of all, it seems we have different names and terms for those cold-
>  >>   >>   booted vCPUs and hotpluggable vCPUs. For example, vCPU-0 and vCPU-1
>  >>   >>   are cold-booted vCPUs while
>  >>   >>   vCPU-2 and vCPU-3 are hotpluggable vCPUs when we have '-smp
>  >>   >>   maxcpus=4,cpus=2'. Lets stick to convention and terms for easier discussion.
>  >>   >>
>  >>   >>   The idea is avoid instantiating hotpluggable vCPUs in virtmach_init() and
>  >>   >>   released in the same function for those hotpluggable vCPUs. As I can
>  >>   >>   understand, those hotpluggable vCPU instances are serving for two
>  >>   >>   purposes: (1) Relax the constraint that all vCPU's (kvm) file descriptor have
>  >>   >>   to be created and populated;
>  >>   >
>  >>   >
>  >>   > We are devising *workarounds* in Qemu for the ARM CPU architectural
>  >>   > constraints in KVM and in Guest Kernel,  *not relaxing* them. We are
>  >>   > not allowed to meddle with the constraints. That is the whole point.
>  >>   >
>  >>   > Not having to respect those constraints led to rejection of the
>  >>   > earlier attempts to upstream Virtual CPU Hotplug for ARM.
>  >>   >
>  >>
>  >>   I meant to 'overcome' the constraints by 'relax'. My apologies if there're any
>  >>   caused confusions.
>  >
>  >
>  > Ok. No issues. It was important for me to clarify though.
>  >
>  >
>  >   Previously, you had attempt to create all vCPU objects
>  >>   and reuse them when vCPU hot added.
>  >
>  > Yes, at QOM level. But that approach did not realize the
>  > unplugged/yet-to-be-plugged vCPUs. We were just using QOM vCPU objects
>  > as the place holders
>  >
>  
>  Right, my point was actually vCPU objects are too heavy as the place
>  holders. It was reason why I had the concern: why those hotpluggable vCPU
>  objects can't be avoided during the bootup time.


Sure, to list them down again, for the reasons I've already explained:

1. KVM MUST have the details of all the vCPUs and their features finalized before the VGIC
    initialization. This is an ARM CPU architecture constraint. It is an immutable requirement!
2. The QOM vCPUs are the representational counterparts of the KVM vCPUs, carrying the vcpu-id,
    feature list, etc. These MUST be finalized even before QOM begins its own GICv3
    initialization, which ends up initializing the KVM VGIC. The QOM vCPUs MUST be handed over to
    the GICv3 QOM in a fully initialized state. The same reason applies here as well.
    Up to this point we are architecture compliant. The only place where we are not ARM
    architecture compliant is where, in QOM, we dissociate the vCPU state from the GICv3
    CPU state during the unplug action, or for the yet-to-be-plugged vCPUs. The latter are
    released as part of virt_cpu_post_init() in our current design.


>  >   In current implementation, the
>  >>   hotpluggable vCPUs are instantiated and released pretty soon. I was
>  >>   bringing the third possibility, to avoid instantiating those hotpluggable vCPU
>  >>   objects, for discussion.
>  >
>  >
>  > Are you suggesting not calling KVM_ARM_VCPU_INIT IOCTL as all for the
>  > vCPUs which are part of possible list but not yet plugged?
>  >
>  > If yes, we cannot do that as KVM vCPUs should be fully initialized
>  > even before VGIC is initialized inside the KVM. This is a constraint.
>  > I've explained this in detail in the cover letter of this patch-set
>  > and in the slides I have shared earlier.
>  >
>  
>  No, it's not what I was suggesting. What I suggested is to avoid creating
>  those hotpluggable vCPU objects (place holders) during the bootup time.
>  However, all vCPU file descriptors (KVM
>  objects) still need to be created and initialized before GICv3 is initialized. It's
>  one of the constrains. So we need to create and populate all vCPU file
>  descriptors through ioctl(vm_fd, CREATE_VCPU) and ioctl(vcpu_fd,
>  INIT_VCPU) before GICv3 object is created and realized. As I explained in
>  the previous reply, the hotpluggable vCPU objects (place holders) haven't
>  to be created in order to initialize and populate the vCPU file descriptors for
>  those hotpluggable vCPUs. I think the parameters used to create and
>  initialize vCPU-0's file descriptor can be reused by all other vCPUs, because
>  we don't support heterogeneous vCPUs.
>  What I suggested is something like below: the point is to avoid instantiating
>  those hotpluggable vCPUs, but their vCPU file descriptor (KVM object) are
>  still created and initialized.
>  
>       static void machvirt_init(MachineState *machine)
>       {
>  
>           /*
>            * Instantiate and realize vCPU-0, record the parameter passed to
>            * ioctl(vcpu-fd, VCPU_INIT, &init), or a better place to remember the
>  parameter.
>            * The point is the parameter can be shared by all vCPUs.
>            */
>  
>           /*
>            * Create vCPU descriptors for all other vCPUs (including hotpluggable
>  vCPUs).
>            * The remembered parameter is reused and passed to ioctl(vcpu-fd,
>  VCPU_INIT, &init).
>            */
>  
>           /* Instantiate and realize other cold-booted vCPUs */
>  
>           /* Instantiate and realize GICv3 */
>  
>       }


No. For the reasons I've mentioned above, we MUST provide fully initialized QOM
vCPU objects before the initialization of the QOM GICv3 kicks in. This ensures that nothing
breaks during the initialization process of the QOM GICv3. Therefore, the optimization
steps mentioned above are unnecessary and could cause problems in the future.
Additionally, the QOM GICv3 can evolve independently of the ARM virt machine,
as it can be used with other machines as well, so we MUST treat it as a black
box which takes QOM vCPU objects as inputs during its initialization.


>  > In this series, the life cycle of those hotpluggable
>  >>   vCPU objects are really short. Again, I didn't say we must avoid instantiating
>  >>   those vCPU objects, I brought the topic ONLY for discussion.
>  >
>  > Sure, I appreciate that. For the details of the reasons please follow below:
>  >
>  > 1. Cover Letter of this patch-set (Constraints are explained there) 2.
>  > KVMForum Slides of 2020 and 2023
>  >
>  >
>  >>   > (2) Help to instantiate and realize
>  >>   >>   GICv3 object.
>  >>   >>
>  >>   >>   For (1), I don't think we have to instantiate those hotpluggable vCPUs at all.
>  >>   >>   In the above example where we have command line '-smp
>  >>   >>   maxcpus=4,cpus=2', it's unnecessary to instantiate
>  >>   >>   vCPU-3 and vCPU-4 to create and populate their KVM file descriptors.
>  >>   >
>  >>   >
>  >>   > We cannot defer create vCPU in KVM after GIC has been initialized in KVM.
>  >>   > It needs to know every vCPU that will ever exists right at the time it
>  >>   > is getting Initialized. This is an ARM CPU Architectural constraint.
>  >>   >
>  >>
>  >>   It will be appreciated if more details other than 'an ARM CPU architectural constraint'
>  >>   can be provided. I don't understand this constrain very well at least.
>  >
>  >
>  > We cannot do that as we MUST present KVM vCPUs to the VGIC fully
>  > initialized, even before it starts its initialization. Initialization
>  > of the vCPUs also initializes the system registers for the corresponding KVM vCPU.
>  >
>  > For example, MPIDR_EL1 must be initialized at VCPU INIT time. We
>  > cannot avoid this. MPIDR value is used by VGIC during its
>  > initialization. This MUST be present for all of the possible KVM vCPUs
>  > right from start during vgic_init()
>  >
>  > Please check the cover letter of this patch-set, I explained these
>  > there and the KVMForum slides.  Please review and comment there and
>  > let me know what is not clear from the text.
>  >
>  
>  It seems my suggestion wasn't fully understood. I was suggesting to avoid
>  instantiating those hotpluggable vCPU objects (place holders) during QEMU
>  startup time. All vCPU file descriptors (the vCPU's corresponding objects)
>  still need to be in place before GICv3 object is initiated and realized.


We need both the QOM and the KVM vCPU objects fully initialized even before the QOM GICv3,
and hence the KVM VGIC, starts to initialize itself, for the reasons I've explained above.
Yes, there is some duplication, especially since we are not supporting heterogeneous
computing, but let us not add the weight of that optimization to the current patch-set.


>  >>   >   A
>  >>   >>   vCPU's KVM file descriptor is create and populated by the following ioctls
>  >>   >>   and function calls. When the first vCPU (vCPU-0) is realized, the property
>  >>   >>   corresponding to "&init" is fixed for all vCPUs. It means all vCPUs have same
>  >>   >>   properties except the "vcpu_index".
>  >>   >>
>  >>   >>      ioctl(vm-fd,   KVM_CREATE_VCPU,   vcpu_index);
>  >>   >>      ioctl(vcpu-fd, KVM_ARM_VCPU_INIT, &init);
>  >>   >>      kvm_park_vcpu(cs);
>  >>   >>
>  >>   >>   A vCPU's properties are determined by two sources and both are global. It
>  >>   >>   means all vCPUs should have same properties: (a) Feature registers
>  >>   >>   returned from the host. The function
>  >>   >>   kvm_arm_get_host_cpu_features() is called for once, meaning this source
>  >>   >>   is same to all vCPUs;
>  >>   >
>  >>   >
>  >>   > Sure, but what are you trying to save here?
>  >>   >
>  >>
>  >>   As mentioned above, the life cycle of those hotpluggable vCPU objects are
>  >>   really short. They still consume time and memory to instantiate them. If I'm
>  >>   correct, one of the primary goal for vCPU hotplug feature is to save system
>  >>   boot-up time, correct?
>  >
>  >
>  > Correct. We targeted vCPU hotplug for Kata-containers for on-demand
>  > resource allocation and saving the resources. Kata-containers can work
>  > with different types of VMM like Qemu and microVMs like Firecracker.
>  > AFAIK, Usecase of microVM is slightly different than the normal
>  > containers. They are short lived (say around
>  > 15 min) and require ultrafast boot-up times (say less than 125 ms) -
>  > these figures are from Amazon who invented the concept of microVM in 
>  the earlier decade.
>  >
>  > With the current patches, we have only partially achieved what we had
>  > started i.e. Kata/Qemu but we also want to target Kata/microVM. In our
>  > case, we want that microVM to be Qemu based fro ARM. I think x86
>  > already has reduced lots of legacy stuff and created a microVM in
>  > Qemu. I'm not sure how it compares against the true microVM like
>  > Firecracker. It will be a good target to reduce memory foot print of
>  > ARM Qemu Virt Machine. or think or creating a new one just like x86.
>  > Using the vCPU Hotplug feature we were drastically able to reduce the
>  > boot up times of Qemu. Please check the calibrated performance figures
>  > in KVmForum  2023 slide 19G (Page 26) [1]
>  >
>  > [1]
>  > https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
>  >
>  > Last year, I had prototyped a microVM for ARM, Michael Tsirkin
>  > suggested that if the performance number of the ARM Virt machine can
>  > match the x86 microVM then we might not require an explicit microVM
>  > code for ARM. Hence, my current efforts are to reduce the memory foot
>  > print of existing Virt machine. But I do have a rough prototype of
>  > microVM as well. We can debate about that later in a different discussion.
>  >
>  
>  Thanks for the linker to the slides. Yeah, it's nice to reduce the bootup time
>  and memory footprint if possible. vCPU hotplug feature may help to
>  improve the performance, but all other paths might also impact the
>  performance. In summary, it's a comprehensive goal to reduce the memory
>  footprint and bootup time, and other components (paths) need
>  optimization to achieve this goal.


Agreed, there is a lot of scope for that. But let's prioritize and do first things first:

1. We want this patch-set to get accepted after all the fixes. This is the main goal.
2. Then think of optimizations like boot-up time and memory footprint reduction.


>  >>   >>   (b) The parameters provided by user through '-cpu host,sve=off' are
>  >>   >>   translated to global properties and applied to all vCPUs when they're
>  >>   >>   instantiated.
>  >>   >
>  >>   >
>  >>   > Sure. Same is the case with PMU and other per-vCPU parameters.
>  >>   > We do not support heterogenous computing and therefore we do not have
>  >>   > per-vCPU control of these features as of now.
>  >>   >
>  >>   >
>  >>   >>
>  >>   >>          (a)                                            (b)
>  >>   >>
>  >>   >>      aarch64_host_initfn                          qemu_init
>  >>   >>      kvm_arm_set_cpu_features_from_host           parse_cpu_option
>  >>   >>        kvm_arm_get_host_cpu_features
>  >>   cpu_common_parse_features
>  >>   >>                                                   qdev_prop_register_global
>  >>   >>                                                     :
>  >>   >>                                                   device_post_init
>  >>   >>
>  >>   >> qdev_prop_set_globals
>  >>   >
>  >>   >
>  >>   > Sure, I understand the code flow but what are you trying to suggest here?
>  >>   >
>  >>
>  >>   I tried to explain that vCPU object isn't needed to create and populate
>  >>   vCPU's file descriptors, as highlight in (1). The information used to create the
>  >>   cold-booted
>  >>   vCPU-0 can be reused because all vCPUs have same properties and feature
>  >>   set.
>  >
>  >
>  > It does not matter. We use those QOM vCPU object states to initializes
>  > Qemu
>  > GICv3 model with max possible vCPUs and then release the QOM vCPU objects.
>  > which are yet-to-be-plugged.
>  >
>  
>  It's what has been implemented in this series. My concern remains: why
>  those vCPU hotpluggable objects can't be avoided? Again, their
>  corresponding vCPU file descritpors (KVM vCPU objects) still have to be in
>  place before GICv3 is instantiated and realized.


Answered above.


>  >>   >>   For (2), I'm still looking into the GICv3 code for better understanding.
>  >>   >
>  >>   >
>  >>   > Oh, I thought you said you've finished your reviews 😊
>  >>   >
>  >>   > Please take your time. For your reference, you might want to check:
>  >>   >
>  >>   > KVMForum 2023:
>  >>   > https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
>  >>   > https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
>  >>   >
>  >>   > KVMForum 2020:
>  >>   > https://kvm-forum.qemu.org/2020/Oct%2029_Challenges%20in%20Supporting%20Virtual%20CPU%20Hotplug%20in%20SoC%20Based%20Systems%20like%20ARM64_Salil%20Mehta.pdf
>  >>   >
>  >>
>  >>   hmm, 'finished my review' has been misread frankly. By that, I meant I
>  >>   finished my tests and provided all the comments I had. Some of them
>  are
>  >>   questions and discussions, which I still need to follow up.
>  >
>  >
>  > Sure. No worries. Even if you miss, you will have more chance to
>  > comment on upcoming RFC V4 😊
>  >
>  
>  Ok :-)
>  
>  >
>  >>   > Until
>  >>   >>   now, I don't see we need the instantiated hotpluggable vCPUs either.
>  >>   >
>  >>   >
>  >>   > I think, I've already answered this above it is because of ARM Architectural
>  >>   constraint.
>  >>   >
>  >>   >
>  >>   >   For
>  >>   >>   example, the redistributor regions can be exposed based on 'maxcpus'
>  >>   >>   instead of 'cpus'.
>  >>   >
>  >>   > You mean during the review of the code you found that we are not doing
>  >>   it?
>  >>   >
>  >>
>  >>   It's all about the discussion to the possibility to avoid instantiating
>  >>   hotpluggable vCPU objects.
>  >
>  >
>  > As mentioned above, with the current KVM code you cannot. But if we
>  > really want to then perhaps we would need to change KVM.
>  >
>  > I might be wrong but AFAICS I don’t see a reason why we cannot have
>  > something like *batch* KVM vCPU create and  initialize instead of
>  > current sequential KVM operations. This will avoid multiple calls to
>  > the KVM Host and can improve Qemu init time further. But this will
>  > require a separate discussion in the LKML including all the KVM folks.
>  >
>  > This has potential to delay the vCPU hotplug feature acceptance and
>  > I'm really not in favor of that. We have already stretched it a lot
>  > because of the standards change acceptance earlier.
>  >
>  
>  Again, my suggestion wasn't completely understood. I was suggesting to
>  avoid instantiating those hotpluggable objects, but their vCPU file
>  descriptors still need to be in place before GICv3's instantiation and
>  realization.
>  Yes, I was also concerned that too much code changes would be needed if
>  my suggestion is accepted. It will definitely delay the feature's upstreaming
>  process. It's why I said the topic (to avoid the hotpluggable objects) are just
>  for discussion now. We can do it (as separate optimization) after your
>  current implementation is merged.


Answered above, why we cannot and should not do that.


>  >>   > The IRQ connection and teardown can be dynamically
>  >>   >>   done by connecting the board with GICv3 through callbacks in
>  >>   >>   ARMGICv3CommonClass.
>  >>   >>   The connection between GICv3CPUState and CPUARMState also can be
>  >>   >>   done dynamically.
>  >>   >
>  >>   > Are you suggesting this after reviewing the code or you have to review
>  >>   > it yet? 😉
>  >>   >
>  >>
>  >>   I was actually trying to ask for your input and feedback. I was hoping your
>  >>   input to clear my puzzles: why vCPU objects must be in place to create
>  >>   GICv3 object?
>  >>   Is it possible to create the GICv3 object without those vCPU objects?
>  >
>  >
>  > No. VGIC initializes IRQs to target KVM vCPUs, it would expect same
>  > KVM vCPU MPIDR or MP-AFFINITY configured when KVM vCPUs were
>  > initialized at the first place otherwise the VGIC initialization will not happen
>  correctly. Hence, the sequence.
>  >
>  > The sequence of these initialization is generally strictly controlled
>  > by specification which is closely tied up with hardware including powering
>  up initializations.
>  > You will need to honor the expectations of the KVM VGIC init which in
>  > turn are ARM CPU Architecture specification compliant. It is not just a
>  loosely written code.
>  >
>  
>  umm, As I explained from the beginning, all KVM vCPU file descriptors are
>  still in place before GICv3 is instantiated and realized. With those KVM vCPU
>  file descriptor, we shouldn't have problem except more code changes are
>  needed, or I miss something? :)


Same as explained above.


>  > What
>  >>   kinds of efforts we need to avoid instantiating those hotpluggable vCPU
>  >>   objects.
>  >
>  >
>  > I mentioned one of the ways above. Introduce *batch* KVM vCPU create &
>  > initialize. But it will have to undergo greater scrutiny because we
>  > are touching a common part which might affect many stake holders.  But
>  > this is a discussion we can do later as part of microVM for ARM.
>  >
>  
>  Remember you just had the architecture agnostic series merged. The vCPU
>  file descriptor can be parked and picked up at a later time based on the
>  vCPU index.
>  This KVM vCPU create in batch mode can be well supported. What you need
>  to is record the parameters passed to ioctl(vm_fd, CREATE_VCPU, index)
>  and ioctl(vcpu_fd,
>  INIT_CPU) for vCPU-0 in hw/arm/virt.c and create all other vCPU file
>  descriptors based on the recorded parameters.


Yes, but we would need to consider many other things, and we cannot decide that here.
Let's cross that bridge when we come to it. I'd suggest parking this part of the discussion
for now.


Thanks
Salil.

>  
>  Thanks,
>  Gavin
>  


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init
  2024-08-23 13:17                     ` Salil Mehta via
@ 2024-08-24 10:03                       ` Gavin Shan
  0 siblings, 0 replies; 105+ messages in thread
From: Gavin Shan @ 2024-08-24 10:03 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Salil,

On 8/23/24 11:17 PM, Salil Mehta wrote:
>>   On 8/22/24 8:58 PM, Salil Mehta wrote:
>>   >>   On 8/21/24 8:23 PM, Salil Mehta wrote:
>>   >>   >>
>>   >>   >>   On 8/21/24 2:40 AM, Salil Mehta wrote:
>>   >>   >>   >
>>   >>   >>   > I don’t understand this clearly. Are  you suggesting to reuse only
>>   >>   >>   > single vCPU object to initialize all KVM vCPUs not yet plugged? If
>>   >>   >>   > yes, then I'm not sure what do we gain here by adding this complexity?
>>   >>   >>   > It does not consume time or resources because we are not realizing any
>>   >>   >>   > of these vCPU object in any case.
>>   >>   >>   >
>>   >>   >>
>>   >>   >>   First of all, it seems we have different names and terms for those cold-
>>   >>   >>   booted vCPUs and hotpluggable vCPUs. For example, vCPU-0 and vCPU-1
>>   >>   >>   are cold-booted vCPUs while
>>   >>   >>   vCPU-2 and vCPU-3 are hotpluggable vCPUs when we have '-smp
>>   >>   >>   maxcpus=4,cpus=2'. Lets stick to convention and terms for easier discussion.
>>   >>   >>
>>   >>   >>   The idea is avoid instantiating hotpluggable vCPUs in virtmach_init() and
>>   >>   >>   released in the same function for those hotpluggable vCPUs. As I can
>>   >>   >>   understand, those hotpluggable vCPU instances are serving for two
>>   >>   >>   purposes: (1) Relax the constraint that all vCPU's (kvm) file descriptor have
>>   >>   >>   to be created and populated;
>>   >>   >
>>   >>   >
>>   >>   > We are devising *workarounds* in Qemu for the ARM CPU architectural
>>   >>   > constraints in KVM and in Guest Kernel,  *not relaxing* them. We are
>>   >>   > not allowed to meddle with the constraints. That is the whole point.
>>   >>   >
>>   >>   > Not having to respect those constraints led to rejection of the
>>   >>   > earlier attempts to upstream Virtual CPU Hotplug for ARM.
>>   >>   >
>>   >>
>>   >>   I meant to 'overcome' the constraints by 'relax'. My apologies if there're any
>>   >>   caused confusions.
>>   >
>>   >
>>   > Ok. No issues. It was important for me to clarify though.
>>   >
>>   >
>>   >   Previously, you had attempt to create all vCPU objects
>>   >>   and reuse them when vCPU hot added.
>>   >
>>   > Yes, at QOM level. But that approach did not realize the
>>   > unplugged/yet-to-be-plugged vCPUs. We were just using QOM vCPU objects
>>   > as the place holders
>>   >
>>   
>>   Right, my point was actually vCPU objects are too heavy as the place
>>   holders. It was reason why I had the concern: why those hotpluggable vCPU
>>   objects can't be avoided during the bootup time.
> 
> 
> Sure, to list down again. For the reasons I've already explained :
> 
> 1. KVM MUST have details about all the vCPUs and its features finalized before the VGIC
>      initialization. This is ARM CPU Architecture constraint. This is immutable requirement!
> 2. QOM vCPUs has got representational changes of the KVM vCPU like vcpu-id, features
>      list etc. These MUST be finalized even before QOM begins its own GICv3 initialization
>      which will end up in initialization of KVM VGIC. QOM vCPUs MUST be handed over to
>      the GICV3 QOM in fully initialized state. The same reason applies here as well.
>      Till this point we are architecture compliant. The only place where we are not ARM
>      Architecture compliant is where in QOM, we dissociate the vCPU state with GICV3
>      CPU State during unplug action or for the yet-to-be-plugged vCPUs. Later are
>      released as part of virt_cpu_post_init() in our current design.
> 

Thanks for taking the time to evaluate and reply. Sorry, but I'm not convinced. You're explaining
what we already have in the current design and implementation, which doesn't mean the current
design and implementation is 100% perfect and flawless.

In the current design and implementation, all (QOM) vCPU objects have to be instantiated, even
though the hotpluggable (QOM) vCPU objects aren't realized at boot time. After that, those
hotpluggable (QOM) vCPU objects are finalized and destroyed so that they can be hot-added
afterwards. In other words, we create (QOM) vCPU objects and then remove them at boot time
just so that they can be hot-added at run time. The duplicated work is obvious. This scheme
and design look just-for-working rather than optimized to me.

I'm giving up on the effort to convince you. Let's see whether other reviewers share the same
concern. Another possibility would be to have the current implementation (with all fixes) merged
upstream first, and then seek optimizations after that. At that point, the suggestion can be
re-evaluated.

> 
>>   >   In current implementation, the
>>   >>   hotpluggable vCPUs are instantiated and released pretty soon. I was
>>   >>   bringing the third possibility, to avoid instantiating those hotpluggable vCPU
>>   >>   objects, for discussion.
>>   >
>>   >
>>   > Are you suggesting not calling KVM_ARM_VCPU_INIT IOCTL as all for the
>>   > vCPUs which are part of possible list but not yet plugged?
>>   >
>>   > If yes, we cannot do that as KVM vCPUs should be fully initialized
>>   > even before VGIC is initialized inside the KVM. This is a constraint.
>>   > I've explained this in detail in the cover letter of this patch-set
>>   > and in the slides I have shared earlier.
>>   >
>>   
>>   No, it's not what I was suggesting. What I suggested is to avoid creating
>>   those hotpluggable vCPU objects (place holders) during the bootup time.
>>   However, all vCPU file descriptors (KVM
>>   objects) still need to be created and initialized before GICv3 is initialized. It's
>>   one of the constrains. So we need to create and populate all vCPU file
>>   descriptors through ioctl(vm_fd, CREATE_VCPU) and ioctl(vcpu_fd,
>>   INIT_VCPU) before GICv3 object is created and realized. As I explained in
>>   the previous reply, the hotpluggable vCPU objects (place holders) haven't
>>   to be created in order to initialize and populate the vCPU file descriptors for
>>   those hotpluggable vCPUs. I think the parameters used to create and
>>   initialize vCPU-0's file descriptor can be reused by all other vCPUs, because
>>   we don't support heterogeneous vCPUs.
>>   What I suggested is something like below: the point is to avoid instantiating
>>   those hotpluggable vCPUs, but their vCPU file descriptor (KVM object) are
>>   still created and initialized.
>>   
>>        static void machvirt_init(MachineState *machine)
>>        {
>>   
>>            /*
>>             * Instantiate and realize vCPU-0, record the parameter passed to
>>             * ioctl(vcpu-fd, VCPU_INIT, &init), or a better place to remember the
>>   parameter.
>>             * The point is the parameter can be shared by all vCPUs.
>>             */
>>   
>>            /*
>>             * Create vCPU descriptors for all other vCPUs (including hotpluggable
>>   vCPUs).
>>             * The remembered parameter is reused and passed to ioctl(vcpu-fd,
>>   VCPU_INIT, &init).
>>             */
>>   
>>            /* Instantiate and realize other cold-booted vCPUs */
>>   
>>            /* Instantiate and realize GICv3 */
>>   
>>        }
> 
> 
> No. For the reasons I've mentioned above, we MUST provide fully initialize the QOM
> vCPU objects before initialization of QOM GICV3 kicks in. This ensures that nothing
> breaks during initialization process of the QOM GICV3.  Therefore, the optimization
> steps mentioned above are unnecessary and could cause problems in future.
> Additionally, the evolution of the GICV3 QOM can be independent of the ARM Virt
> Machine as it can be used with other Machines as well so we MUST treat it as a black
> box which needs QOM vCPU objects as inputs during its initialization.
> 

Your explanation is not completely correct to me. It's what we have in the current design, but
it doesn't mean the design has to be like this. The (KVM) vCPU file descriptors must
be in place before the QOM GICv3 object is instantiated and realized, but the (QOM) vCPU objects
don't have to exist before that. However, we may need a lot of code changes to tame GICv3Common
and GICv3KVM so that they become vCPU hot-add/remove friendly and the pre-created
(QOM) vCPU objects can be avoided.

It can be something to be re-evaluated in the future, as I said above. Frankly, it's pointless
to spend our time on discussions without reaching any conclusions. From my side, I've tried
my best to provide my comments and explain my thoughts. I appreciate your patience, time on
evaluation, and replies :)

Thanks,
Gavin




* Re: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-08-19 12:35     ` Salil Mehta via
@ 2024-08-28 20:23       ` Gustavo Romero
  2024-09-04 13:53         ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Gustavo Romero @ 2024-08-28 20:23 UTC (permalink / raw)
  To: Salil Mehta, Alex Bennée
  Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com,
	maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com, npiggin@gmail.com,
	harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Salil,

On 8/19/24 9:35 AM, Salil Mehta via wrote:
> Hi Alex,
> 
>>   From: Alex Bennée <alex.bennee@linaro.org>
>>   Sent: Friday, August 16, 2024 4:37 PM
>>   To: Salil Mehta <salil.mehta@huawei.com>
>>   
>>   Salil Mehta <salil.mehta@huawei.com> writes:
>>   
>>   > vCPU Hot-unplug will result in QOM CPU object unrealization which will
>>   > do away with all the vCPU thread creations, allocations, registrations
>>   > that happened as part of the realization process. This change
>>   > introduces the ARM CPU unrealize function taking care of exactly that.
>>   >
>>   > Note, initialized KVM vCPUs are not destroyed in host KVM but their
>>   > Qemu context is parked at the QEMU KVM layer.
>>   >
>>   > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>>   > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>>   > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>   > Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>>   > [VP: Identified CPU stall issue & suggested probable fix]
>>   > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>   > ---
>>   >  target/arm/cpu.c       | 101
>>   +++++++++++++++++++++++++++++++++++++++++
>>   >  target/arm/cpu.h       |  14 ++++++
>>   >  target/arm/gdbstub.c   |   6 +++
>>   >  target/arm/helper.c    |  25 ++++++++++
>>   >  target/arm/internals.h |   3 ++
>>   >  target/arm/kvm.c       |   5 ++
>>   >  6 files changed, 154 insertions(+)
>>   >
>>   > diff --git a/target/arm/cpu.c b/target/arm/cpu.c index
>>   > c92162fa97..a3dc669309 100644
>>   > --- a/target/arm/cpu.c
>>   > +++ b/target/arm/cpu.c
>>   > @@ -157,6 +157,16 @@ void
>>   arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn
>>   *hook,
>>   >      QLIST_INSERT_HEAD(&cpu->pre_el_change_hooks, entry, node);  }
>>   >
>>   > +void arm_unregister_pre_el_change_hooks(ARMCPU *cpu) {
>>   > +    ARMELChangeHook *entry, *next;
>>   > +
>>   > +    QLIST_FOREACH_SAFE(entry, &cpu->pre_el_change_hooks, node,
>>   next) {
>>   > +        QLIST_REMOVE(entry, node);
>>   > +        g_free(entry);
>>   > +    }
>>   > +}
>>   > +
>>   >  void arm_register_el_change_hook(ARMCPU *cpu,
>>   ARMELChangeHookFn *hook,
>>   >                                   void *opaque)  { @@ -168,6 +178,16
>>   > @@ void arm_register_el_change_hook(ARMCPU *cpu,
>>   ARMELChangeHookFn *hook,
>>   >      QLIST_INSERT_HEAD(&cpu->el_change_hooks, entry, node);  }
>>   >
>>   > +void arm_unregister_el_change_hooks(ARMCPU *cpu) {
>>   > +    ARMELChangeHook *entry, *next;
>>   > +
>>   > +    QLIST_FOREACH_SAFE(entry, &cpu->el_change_hooks, node, next) {
>>   > +        QLIST_REMOVE(entry, node);
>>   > +        g_free(entry);
>>   > +    }
>>   > +}
>>   > +
>>   >  static void cp_reg_reset(gpointer key, gpointer value, gpointer
>>   > opaque)  {
>>   >      /* Reset a single ARMCPRegInfo register */ @@ -2552,6 +2572,85 @@
>>   > static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>>   >      acc->parent_realize(dev, errp);
>>   >  }
>>   >
>>   > +static void arm_cpu_unrealizefn(DeviceState *dev) {
>>   > +    ARMCPUClass *acc = ARM_CPU_GET_CLASS(dev);
>>   > +    ARMCPU *cpu = ARM_CPU(dev);
>>   > +    CPUARMState *env = &cpu->env;
>>   > +    CPUState *cs = CPU(dev);
>>   > +    bool has_secure;
>>   > +
>>   > +    has_secure = cpu->has_el3 || arm_feature(env,
>>   > + ARM_FEATURE_M_SECURITY);
>>   > +
>>   > +    /* rock 'n' un-roll, whatever happened in the arm_cpu_realizefn
>>   cleanly */
>>   > +    cpu_address_space_destroy(cs, ARMASIdx_NS);
>>   
>>   On current master this will fail:
>>   
>>   ../../target/arm/cpu.c: In function ‘arm_cpu_unrealizefn’:
>>   ../../target/arm/cpu.c:2626:5: error: implicit declaration of function
>>   ‘cpu_address_space_destroy’ [-Werror=implicit-function-declaration]
>>    2626 |     cpu_address_space_destroy(cs, ARMASIdx_NS);
>>         |     ^~~~~~~~~~~~~~~~~~~~~~~~~
>>   ../../target/arm/cpu.c:2626:5: error: nested extern declaration of
>>   ‘cpu_address_space_destroy’ [-Werror=nested-externs]
>>   cc1: all warnings being treated as errors
> 
> 
> The current master already has the arch-agnostic patch set. I've applied
> RFC V3 on top of the latest and compiled it. I did not see this issue?
> 
> I've created a new branch for your reference.
> 
> https://github.com/salil-mehta/qemu/tree/virt-cpuhp-armv8/rfc-v4-rc4
> 
> Please let me know if this works for you.

It still happens on the new branch. You need to configure Linux user mode
to reproduce it, e.g.:

$ ../configure --target-list=aarch64-linux-user,aarch64-softmmu [...]

If you just configure the 'aarch64-softmmu' target it doesn't happen.


Cheers,
Gustavo


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (31 preceding siblings ...)
  2024-08-07  9:53 ` Gavin Shan
@ 2024-08-28 20:35 ` Gustavo Romero
  2024-08-29  9:59   ` Alex Bennée
  2024-09-04 14:03   ` Salil Mehta via
  32 siblings, 2 replies; 105+ messages in thread
From: Gustavo Romero @ 2024-08-28 20:35 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm, mst
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, gshan, rafael,
	borntraeger, alex.bennee, npiggin, harshpb, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai,
	shahuang, zhao1.liu, linuxarm

Hi Salil,

On 6/13/24 8:36 PM, Salil Mehta via wrote:
> PROLOGUE
> ========
> 
> To assist in review and set the right expectations from this RFC, please first
> read the sections *APPENDED AT THE END* of this cover letter:
> 
> 1. Important *DISCLAIMER* [Section (X)]
> 2. Work presented at KVMForum Conference (slides available) [Section (V)F]
> 3. Organization of patches [Section (XI)]
> 4. References [Section (XII)]
> 5. Detailed TODO list of leftover work or work-in-progress [Section (IX)]
> 
> There has been interest shown by other organizations in adapting this series
> for their architecture. Hence, RFC V2 [21] has been split into architecture
> *agnostic* [22] and *specific* patch sets.
> 
> This is an ARM architecture-specific patch set carved out of RFC V2. Please
> check section (XI)B for details of architecture agnostic patches.
> 
> SECTIONS [I - XIII] are as follows:
> 
> (I) Key Changes [details in last section (XIV)]
> ==============================================
> 
> RFC V2 -> RFC V3
> 
> 1. Split into Architecture *agnostic* (V13) [22] and *specific* (RFC V3) patch sets.
> 2. Addressed comments by Gavin Shan (RedHat), Shaoqin Huang (RedHat), Philippe Mathieu-Daudé (Linaro),
>     Jonathan Cameron (Huawei), Zhao Liu (Intel).
> 
> RFC V1 -> RFC V2
> 
> RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> 
> 1. ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
>     *online-capable* or *enabled* to the Guest OS at boot time. This means
>     associated CPUs can have ACPI _STA as *enabled* or *disabled* even after boot.
>     See UEFI ACPI 6.5 Spec, Section 05, Table 5.37 GICC CPU Interface Flags[20].
> 2. SMCC/HVC Hypercall exit handling in userspace/Qemu for PSCI CPU_{ON,OFF}
>     request. This is required to {dis}allow online'ing a vCPU.
> 3. Always presenting unplugged vCPUs in CPUs ACPI AML code as ACPI _STA.PRESENT
>     to the Guest OS. Toggling ACPI _STA.Enabled gives the effect of the
>     hot{un}plug.
> 4. Live Migration works (some issues are still there).
> 5. TCG/HVF/qtest does not support Hotplug and falls back to default.
> 6. Code for TCG support exists in this release (it is a work-in-progress).
> 7. ACPI _OSC method can now be used by OSPM to negotiate Qemu VM platform
>     hotplug capability (_OSC Query support still pending).
> 8. Misc. Bug fixes.
> 
> (II) Summary
> ============
> 
> This patch set introduces virtual CPU hotplug support for the ARMv8 architecture
> in QEMU. The idea is to be able to hotplug and hot-unplug vCPUs while the guest VM
> is running, without requiring a reboot. This does *not* make any assumptions about
> the physical CPU hotplug availability within the host system but rather tries to
> solve the problem at the virtualizer/QEMU layer. It introduces ACPI CPU hotplug hooks
> and event handling to interface with the guest kernel, and code to initialize, plug,
> and unplug CPUs. No changes are required within the host kernel/KVM except the
> support of hypercall exit handling in the user-space/Qemu, which has recently
> been added to the kernel. Corresponding guest kernel changes have been
> posted on the mailing list [3] [4] by James Morse.
> 
> (III) Motivation
> ================
> 
> This allows scaling the guest VM compute capacity on-demand, which would be
> useful for the following example scenarios:
> 
> 1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the orchestration
>     framework that could adjust resource requests (CPU and Mem requests) for
>     the containers in a pod, based on usage.
> 2. Pay-as-you-grow Business Model: Infrastructure providers could allocate and
>     restrict the total number of compute resources available to the guest VM
>     according to the SLA (Service Level Agreement). VM owners could request more
>     compute to be hot-plugged for some cost.
> 
> For example, a Kata Container VM starts with a minimum amount of resources
> (i.e., the hotplug-everything approach). Why?
> 
> 1. Allowing faster *boot time* and
> 2. Reduction in *memory footprint*
> 
> Kata Container VM can boot with just 1 vCPU, and then later more vCPUs can be
> hot-plugged as needed.
> 
> (IV) Terminology
> ================
> 
> (*) Possible CPUs: Total vCPUs that could ever exist in the VM. This includes
>                     any cold-booted CPUs plus any CPUs that could be later
>                     hot-plugged.
>                     - Qemu parameter (-smp maxcpus=N)
> (*) Present CPUs:  Possible CPUs that are ACPI 'present'. These might or might
>                     not be ACPI 'enabled'.
>                     - Present vCPUs = Possible vCPUs (Always on ARM Arch)
> (*) Enabled CPUs:  Possible CPUs that are ACPI 'present' and 'enabled' and can
>                     now be ‘onlined’ (PSCI) for use by the Guest Kernel. All cold-
>                     booted vCPUs are ACPI 'enabled' at boot. Later, using
>                     device_add, more vCPUs can be hotplugged and made ACPI
>                     'enabled'.
>                     - Qemu parameter (-smp cpus=N). Can be used to specify some
> 	           cold-booted vCPUs during VM init. Some can be added using the
> 	           '-device' option.
> 
> (V) Constraints Due to ARMv8 CPU Architecture [+] Other Impediments
> ===================================================================
> 
> A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
>     1. ARMv8 CPU architecture does not support the concept of the physical CPU
>        hotplug.
>        a. There are many per-CPU components like PMU, SVE, MTE, Arch timers, etc.,
>           whose behavior needs to be clearly defined when the CPU is hot(un)plugged.
>           There is no specification for this.
> 
>     2. Other ARM components like GIC, etc., have not been designed to realize
>        physical CPU hotplug capability as of now. For example,
>        a. Every physical CPU has a unique GICC (GIC CPU Interface) by construct.
>           The architecture does not specify what CPU hot(un)plug would mean
>           in the context of any of these.
>        b. CPUs/GICC are physically connected to unique GICR (GIC Redistributor).
>           GIC Redistributors are always part of the always-on power domain. Hence,
>           they cannot be powered off as per specification.
> 
> B. Impediments in Firmware/ACPI (Architectural Constraint)
> 
>     1. Firmware has to expose GICC, GICR, and other per-CPU features like PMU,
>        SVE, MTE, Arch Timers, etc., to the OS. Due to the architectural constraint
>        stated in section A1(a), all interrupt controller structures of
>        MADT describing GIC CPU Interfaces and the GIC Redistributors MUST be
>        presented by firmware to the OSPM during boot time.
>     2. Architectures that support CPU hotplug can evaluate the ACPI _MAT method to
>        get this kind of information from the firmware even after boot, and the
>        OSPM has the capability to process these. The ARM kernel uses information
>        in MADT interrupt controller structures to identify the number of present
>        CPUs during boot and hence does not allow this number to change after
>        boot. The number of present CPUs cannot be changed; it is an
>        architectural constraint!
> 
> C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural Constraint)
> 
>     1. KVM VGIC:
>        a. Sizing of various VGIC resources like memory regions, etc., related to
>           the redistributor happens only once and is fixed at the VM init time
>           and cannot be changed later after initialization has happened.
>           KVM statically configures these resources based on the number of vCPUs
>           and the number/size of redistributor ranges.
>        b. Association between vCPU and its VGIC redistributor is fixed at the
>           VM init time within KVM, i.e., when the redistributor iodevs get
>           registered. VGIC does not allow this association to be set up or changed
>           after VM initialization has happened. Physically, every CPU/GICC is
>           uniquely connected with its redistributor, and there is no
>           architectural way to set this up.
>     2. KVM vCPUs:
>        a. Lack of specification means destruction of KVM vCPUs does not exist as
>           there is no reference to tell what to do with other per-vCPU
>           components like redistributors, arch timer, etc.
>        b. In fact, KVM does not implement the destruction of vCPUs for any
>           architecture. This is independent of whether the architecture
>           actually supports the CPU hotplug feature. For example, even for x86,
>           KVM does not implement the destruction of vCPUs.
> 
> D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)
> 
>     1. Qemu CPU Objects MUST be created to initialize all the Host KVM vCPUs to
>        overcome the KVM constraint. KVM vCPUs are created and initialized when Qemu
>        CPU Objects are realized. But keeping the QOM CPU objects realized for
>        'yet-to-be-plugged' vCPUs can create problems when these vCPUs are later
>        plugged using device_add and a new QOM CPU object is created.
>     2. GICV3State and GICV3CPUState objects MUST be sized over *possible vCPUs*
>        during VM init time while QOM GICV3 Object is realized. This is because
>        KVM VGIC can only be initialized once during init time. But every
>        GICV3CPUState has an associated QOM CPU Object. The latter might correspond
>        to vCPUs which are 'yet-to-be-plugged' (unplugged at init).
>     3. How should new QOM CPU objects be connected back to the GICV3CPUState
>        objects and disconnected from it in case the CPU is being hot(un)plugged?
>     4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented in the
>        QOM for which KVM vCPU already exists? For example, whether to keep,
>         a. No QOM CPU objects Or
>         b. Unrealized CPU Objects
>     5. How should vCPU state be exposed via ACPI to the Guest? Especially for
>        the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not exist
>        within the QOM but the Guest always expects all possible vCPUs to be
>        identified as ACPI *present* during boot.
>     6. How should Qemu expose GIC CPU interfaces for the unplugged or
>        yet-to-be-plugged vCPUs using ACPI MADT Table to the Guest?
> 
> E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)
> 
>     1. At VM Init, pre-create all the possible vCPUs in the Host KVM i.e., even
>        for the vCPUs which are yet-to-be-plugged in Qemu but keep them in the
>        powered-off state.
>     2. After the KVM vCPUs have been initialized in the Host, the KVM vCPU
>        objects corresponding to the unplugged/yet-to-be-plugged vCPUs are parked
>        at the existing per-VM "kvm_parked_vcpus" list in Qemu. (similar to x86)
>     3. GICV3State and GICV3CPUState objects are sized over possible vCPUs during
>        VM init time i.e., when Qemu GIC is realized. This, in turn, sizes KVM VGIC
>        resources like memory regions, etc., related to the redistributors with the
>        number of possible KVM vCPUs. This never changes after VM has initialized.
>     4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs are
>        released post Host KVM CPU and GIC/VGIC initialization.
>     5. Build ACPI MADT Table with the following updates:
>        a. Number of GIC CPU interface entries (=possible vCPUs)
>        b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable)
>        c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
>           - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
> 	 - vCPU can be ACPI enabled+onlined after Guest boots (Firmware Policy)
> 	 - Some issues with above (details in later sections)
>     6. Expose below ACPI Status to Guest kernel:
>        a. Always _STA.Present=1 (all possible vCPUs)
>        b. _STA.Enabled=1 (plugged vCPUs)
>        c. _STA.Enabled=0 (unplugged vCPUs)
>     7. vCPU hotplug *realizes* new QOM CPU object. The following happens:
>        a. Realizes, initializes QOM CPU Object & spawns Qemu vCPU thread.
>        b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list).
>           - Attaches to QOM CPU object.
>        c. Reinitializes KVM vCPU in the Host.
>           - Resets the core and sys regs, sets defaults, etc.
>        d. Runs KVM vCPU (created with "start-powered-off").
> 	 - vCPU thread sleeps (waits for vCPU reset via PSCI).
>        e. Updates Qemu GIC.
>           - Wires back IRQs related to this vCPU.
>           - GICV3CPUState association with QOM CPU Object.
>        f. Updates [6] ACPI _STA.Enabled=1.
>        g. Notifies Guest about the new vCPU (via ACPI GED interface).
> 	 - Guest checks _STA.Enabled=1.
> 	 - Guest adds processor (registers CPU with LDM) [3].
>        h. Plugs the QOM CPU object in the slot.
>           - slot-number = cpu-index {socket, cluster, core, thread}.
>        i. Guest online's vCPU (CPU_ON PSCI call over HVC/SMC).
>           - KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
>           - Qemu powers-on KVM vCPU in the Host.
>     8. vCPU hot-unplug *unrealizes* QOM CPU Object. The following happens:
>        a. Notifies Guest (via ACPI GED interface) vCPU hot-unplug event.
>           - Guest offline's vCPU (CPU_OFF PSCI call over HVC/SMC).
>        b. KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
>           - Qemu powers-off the KVM vCPU in the Host.
>        c. Guest signals *Eject* vCPU to Qemu.
>        d. Qemu updates [6] ACPI _STA.Enabled=0.
>        e. Updates GIC.
>           - Un-wires IRQs related to this vCPU.
>           - GICV3CPUState association with new QOM CPU Object is updated.
>        f. Unplugs the vCPU.
> 	 - Removes from slot.
>           - Parks KVM vCPU ("kvm_parked_vcpus" list).
>           - Unrealizes QOM CPU Object & joins back Qemu vCPU thread.
> 	 - Destroys QOM CPU object.
>        g. Guest checks ACPI _STA.Enabled=0.
>           - Removes processor (unregisters CPU with LDM) [3].
> 
> F. Work Presented at KVM Forum Conferences:
> ==========================================
> 
> Details of the above work have been presented at KVMForum2020 and KVMForum2023
> conferences. Slides & video are available at the links below:
> a. KVMForum 2023
>     - Challenges Revisited in Supporting Virt CPU Hotplug on architectures that don't Support CPU Hotplug (like ARM64).
>       https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
>       https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
>       https://www.youtube.com/watch?v=hyrw4j2D6I0&t=23970s
>       https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
> b. KVMForum 2020
>     - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems (like ARM64) - Salil Mehta, Huawei.
>       https://sched.co/eE4m
> 
> (VI) Commands Used
> ==================
> 
> A. Qemu launch commands to init the machine:
> 
>      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>        -cpu host -smp cpus=4,maxcpus=6 \
>        -m 300M \
>        -kernel Image \
>        -initrd rootfs.cpio.gz \
>        -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
>        -nographic \
>        -bios QEMU_EFI.fd \
> 
> B. Hot-(un)plug related commands:
> 
>    # Hotplug a host vCPU (accel=kvm):
>      $ device_add host-arm-cpu,id=core4,core-id=4
> 
>    # Hotplug a vCPU (accel=tcg):
>      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4

Since support for hotplug is disabled on TCG, should these
two lines be removed from the v4 cover letter?


Cheers,
Gustavo

>    # Delete the vCPU:
>      $ device_del core4
> 
> Sample output on guest after boot:
> 
>      $ cat /sys/devices/system/cpu/possible
>      0-5
>      $ cat /sys/devices/system/cpu/present
>      0-5
>      $ cat /sys/devices/system/cpu/enabled
>      0-3
>      $ cat /sys/devices/system/cpu/online
>      0-1
>      $ cat /sys/devices/system/cpu/offline
>      2-5
> 
> Sample output on guest after hotplug of vCPU=4:
> 
>      $ cat /sys/devices/system/cpu/possible
>      0-5
>      $ cat /sys/devices/system/cpu/present
>      0-5
>      $ cat /sys/devices/system/cpu/enabled
>      0-4
>      $ cat /sys/devices/system/cpu/online
>      0-1,4
>      $ cat /sys/devices/system/cpu/offline
>      2-3,5
> 
>      Note: vCPU=4 was explicitly 'onlined' after hot-plug
>      $ echo 1 > /sys/devices/system/cpu/cpu4/online
> 
> (VII) Latest Repository
> =======================
> 
> (*) Latest Qemu RFC V3 (Architecture Specific) patch set:
>      https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v3
> (*) Latest Qemu V13 (Architecture Agnostic) patch set:
>      https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v3.arch.agnostic.v13
> (*) QEMU changes for vCPU hotplug can be cloned from below site:
>      https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
> (*) Guest Kernel changes (by James Morse, ARM) are available here:
>      https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git virtual_cpu_hotplug/rfc/v2
> (*) Leftover patches of the kernel are available here:
>      https://lore.kernel.org/lkml/20240529133446.28446-1-Jonathan.Cameron@huawei.com/
>      https://github.com/salil-mehta/linux/commits/virtual_cpu_hotplug/rfc/v6.jic/ (not latest)
> 
> (VIII) KNOWN ISSUES
> ===================
> 
> 1. Migration has been lightly tested but has been found to work.
> 2. TCG is broken.
> 3. HVF and qtest are not supported yet.
> 4. ACPI MADT Table flags [7] MADT.GICC.Enabled and MADT.GICC.online-capable are
>     mutually exclusive, i.e., as per the change [6], a vCPU cannot be both
>     GICC.Enabled and GICC.online-capable. This means:
>        [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
>     a. If we have to support hot-unplug of the cold-booted vCPUs, then these MUST
>        be specified as GICC.online-capable in the MADT Table during boot by the
>        firmware/Qemu. But this requirement conflicts with the requirement to
>        support the new Qemu changes with a legacy OS that doesn't understand the
>        MADT.GICC.online-capable Bit. A legacy OS will ignore this bit during
>        boot, and hence these vCPUs will not appear on such an OS. This is
>        unexpected behavior.
>     b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to unplug
>        these cold-booted vCPUs from OS (which in actuality should be blocked by
>        returning error at Qemu), then features like 'kexec' will break.
>     c. As I understand, removal of the cold-booted vCPUs is a required feature
>        and the x86 world allows it.
>     d. Hence, either we need a specification change to make the MADT.GICC.Enabled
>        and MADT.GICC.online-capable Bits NOT mutually exclusive or NOT support
>        the removal of cold-booted vCPUs. In the latter case, a check can be
>        introduced to bar users from unplugging cold-booted vCPUs using QMP
>        commands. (Needs discussion!)
>        Please check the patch part of this patch set:
>        [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled].
>     
>        NOTE: This is definitely not a blocker!
> 5. Code related to the notification to GICV3 about the hot(un)plug of a vCPU event
>     might need further discussion.
> 
> 
> (IX) THINGS TO DO
> =================
> 
> 1. Fix issues related to TCG/Emulation support. (Not a blocker)
> 2. Comprehensive Testing is in progress. (Positive feedback from Oracle & Ampere)
> 3. Qemu Documentation (.rst) needs to be updated.
> 4. Fix qtest, HVF Support (Future).
> 5. Update the design issue related to ACPI MADT.GICC flags discussed in known
>     issues. This might require UEFI ACPI specification change (Not a blocker).
> 6. Add ACPI _OSC 'Query' support. Only part of _OSC support exists now. (Not a blocker).
> 
> The above is *not* a complete list. Will update later!
> 
> Best regards,
> Salil.
> 
> (X) DISCLAIMER
> ==============
> 
> This work is an attempt to present a proof-of-concept of the ARM64 vCPU hotplug
> implementation to the community. This is *not* production-level code and might
> have bugs. Comprehensive testing is being done on HiSilicon Kunpeng920 SoC,
> Oracle, and Ampere servers. We are nearing stable code and a non-RFC
> version shall be floated soon.
> 
> This work is *mostly* along the lines of the discussions that have happened in the
> previous years [see refs below] across different channels like the mailing list,
> Linaro Open Discussions platform, and various conferences like KVMForum, etc. This
> RFC is being used as a way to verify the idea mentioned in this cover letter and
> to get community views. Once this has been agreed upon, a formal patch shall be
> posted to the mailing list for review.
> 
> [The concept being presented has been found to work!]
> 
> (XI) ORGANIZATION OF PATCHES
> ============================
>   
> A. Architecture *specific* patches:
> 
>     [Patch 1-8, 17, 27, 29] logic required during machine init.
>      (*) Some validation checks.
>      (*) Introduces core-id property and some util functions required later.
>      (*) Logic to pre-create vCPUs.
>      (*) GIC initialization pre-sized with possible vCPUs.
>      (*) Some refactoring to have common hot and cold plug logic together.
>      (*) Release of disabled QOM CPU objects in post_cpu_init().
>      (*) Support of ACPI _OSC method to negotiate platform hotplug capabilities.
>     [Patch 9-16] logic related to ACPI at machine init time.
>      (*) Changes required to Enable ACPI for CPU hotplug.
>      (*) Initialization of ACPI GED framework to cater to CPU Hotplug Events.
>      (*) ACPI MADT/MAT changes.
>     [Patch 18-26] logic required during vCPU hot-(un)plug.
>      (*) Basic framework changes to support vCPU hot-(un)plug.
>      (*) ACPI GED changes for hot-(un)plug hooks.
>      (*) Wire-unwire the IRQs.
>      (*) GIC notification logic.
>      (*) ARMCPU unrealize logic.
>      (*) Handling of SMCC Hypercall Exits by KVM to Qemu.
>     
> B. Architecture *agnostic* patches:
> 
>     [PATCH V13 0/8] Add architecture agnostic code to support vCPU Hotplug.
>     https://lore.kernel.org/qemu-devel/20240607115649.214622-1-salil.mehta@huawei.com/T/#md0887eb07976bc76606a8204614ccc7d9a01c1f7
>      (*) Refactors vCPU create, Parking, unparking logic of vCPUs, and addition of traces.
>      (*) Build ACPI AML related to CPU control dev.
>      (*) Changes related to the destruction of CPU Address Space.
>      (*) Changes related to the uninitialization of GDB Stub.
>      (*) Updating of Docs.
> 
> (XII) REFERENCES
> ================
> 
> [1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
> [3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
> [4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> [5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
> [6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
> [7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> [8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> [9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
> [10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
> [11] https://lkml.org/lkml/2019/7/10/235
> [12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
> [13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
> [14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
> [15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
> [16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
> [17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
> [18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
> [19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> [20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
> [21] https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.mehta@huawei.com/
> [22] https://lore.kernel.org/qemu-devel/20240607115649.214622-1-salil.mehta@huawei.com/T/#md0887eb07976bc76606a8204614ccc7d9a01c1f7
> 
> (XIII) ACKNOWLEDGEMENTS
> =======================
> 
> I would like to take this opportunity to thank below people for various
> discussions with me over different channels during the development:
> 
> Marc Zyngier (Google)               Catalin Marinas (ARM),
> James Morse(ARM),                   Will Deacon (Google),
> Jean-Phillipe Brucker (Linaro),     Sudeep Holla (ARM),
> Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat),
> Jonathan Cameron (Huawei),          Darren Hart (Ampere),
> Igor Mamedov (Redhat),              Ilkka Koskinen (Ampere),
> Andrew Jones (Redhat),              Karl Heubaum (Oracle),
> Keqian Zhu (Huawei),                Miguel Luis (Oracle),
> Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
> Shameerali Kolothum (Huawei)        Russell King (Oracle)
> Xuwei/Joy (Huawei),                 Peter Maydell (Linaro)
> Zengtao/Prime (Huawei),             And all those whom I have missed!
> 
> Many thanks to the following people for their current or past contributions:
> 
> 1. James Morse (ARM)
>     (Current Kernel part of vCPU Hotplug Support on AARCH64)
> 2. Jean-Philippe Brucker (Linaro)
>     (Prototyped one of the earlier PSCI-based POC [17][18] based on RFC V1)
> 3. Keqian Zhu (Huawei)
>     (Co-developed Qemu prototype)
> 4. Xiongfeng Wang (Huawei)
>     (Co-developed an earlier kernel prototype with me)
> 5. Vishnu Pajjuri (Ampere)
>     (Verification on Ampere ARM64 Platforms + fixes)
> 6. Miguel Luis (Oracle)
>     (Verification on Oracle ARM64 Platforms + fixes)
> 7. Russell King (Oracle) & Jonathan Cameron (Huawei)
>     (Helping in upstreaming James Morse's Kernel patches).
> 
> (XIV) Change Log:
> =================
> 
> RFC V2 -> RFC V3:
> -----------------
> 1. Miscellaneous:
>     - Split the RFC V2 into arch-agnostic and arch-specific patch sets.
> 2. Addressed Gavin Shan's (RedHat) comments:
>     - Made CPU property accessors inline.
>       https://lore.kernel.org/qemu-devel/6cd28639-2cfa-f233-c6d9-d5d2ec5b1c58@redhat.com/
>     - Collected Reviewed-bys [PATCH RFC V2 4/37, 14/37, 22/37].
>     - Dropped the patch as it was not required after init logic was refactored.
>       https://lore.kernel.org/qemu-devel/4fb2eef9-6742-1eeb-721a-b3db04b1be97@redhat.com/
>     - Fixed the range check for the core during vCPU Plug.
>       https://lore.kernel.org/qemu-devel/1c5fa24c-6bf3-750f-4f22-087e4a9311af@redhat.com/
>     - Added has_hotpluggable_vcpus check to make build_cpus_aml() conditional.
>       https://lore.kernel.org/qemu-devel/832342cb-74bc-58dd-c5d7-6f995baeb0f2@redhat.com/
>     - Fixed the states initialization in cpu_hotplug_hw_init() to accommodate previous refactoring.
>       https://lore.kernel.org/qemu-devel/da5e5609-1883-8650-c7d8-6868c7b74f1c@redhat.com/
>     - Fixed typos.
>       https://lore.kernel.org/qemu-devel/eb1ac571-7844-55e6-15e7-3dd7df21366b@redhat.com/
>     - Removed the unnecessary 'goto fail'.
>       https://lore.kernel.org/qemu-devel/4d8980ac-f402-60d4-fe52-787815af8a7d@redhat.com/#t
>     - Added check for hotpluggable vCPUs in the _OSC method.
>       https://lore.kernel.org/qemu-devel/20231017001326.FUBqQ1PTowF2GxQpnL3kIW0AhmSqbspazwixAHVSi6c@z/
> 3. Addressed Shaoqin Huang's (RedHat) comments:
>     - Fixed the compilation break caused by a missing call to virt_cpu_properties()
>       and its missing definition.
>       https://lore.kernel.org/qemu-devel/3632ee24-47f7-ae68-8790-26eb2cf9950b@redhat.com/
> 4. Addressed Jonathan Cameron's (Huawei) comments:
>     - Gated the 'disabled vcpu message' for GIC version < 3.
>       https://lore.kernel.org/qemu-devel/20240116155911.00004fe1@Huawei.com/
> 
> RFC V1 -> RFC V2:
> -----------------
> 1. Addressed James Morse's (ARM) requirement as per Linaro Open Discussion:
>     - Exposed all possible vCPUs as always ACPI _STA.present and available during boot time.
>     - Added the _OSC handling as required by James's patches.
>     - Introduction of 'online-capable' bit handling in the flag of MADT GICC.
>     - SMCC Hypercall Exit handling in Qemu.
> 2. Addressed Marc Zyngier's comment:
>     - Fixed the note about GIC CPU Interface in the cover letter.
> 3. Addressed issues raised by Vishnu Pajjuru (Ampere) & Miguel Luis (Oracle) during testing:
>     - Live/Pseudo Migration crashes.
> 4. Others:
>     - Introduced the concept of persistent vCPU at QOM.
>     - Introduced wrapper APIs of present, possible, and persistent.
>     - Changes in the ACPI hotplug H/W init logic to accommodate initializing the is_present and is_enabled states.
>     - Check to avoid unplugging cold-booted vCPUs.
>     - Disabled hotplugging with TCG/HVF/QTEST.
>     - Introduced CPU Topology, {socket, cluster, core, thread}-id property.
>     - Extract virt CPU properties as a common virt_vcpu_properties() function.
> 
> Author Salil Mehta (1):
>    target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
> 
> Jean-Philippe Brucker (2):
>    hw/acpi: Make _MAT method optional
>    target/arm/kvm: Write CPU state back to KVM on reset
> 
> Miguel Luis (1):
>    tcg/mttcg: enable threads to unregister in tcg_ctxs[]
> 
> Salil Mehta (25):
>    arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id
>      property
>    cpu-common: Add common CPU utility for possible vCPUs
>    hw/arm/virt: Limit number of possible vCPUs for unsupported Accel or
>      GIC Type
>    hw/arm/virt: Move setting of common CPU properties in a function
>    arm/virt,target/arm: Machine init time change common to vCPU
>      {cold|hot}-plug
>    arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
>    arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine
>      init
>    arm/virt: Init PMU at host for all possible vcpus
>    arm/acpi: Enable ACPI support for vcpu hotplug
>    arm/virt: Add cpu hotplug events to GED during creation
>    arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
>    arm/virt/acpi: Build CPUs AML with CPU Hotplug support
>    arm/virt: Make ARM vCPU *present* status ACPI *persistent*
>    hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits
>      to Guest
>    hw/arm: MADT Tbl change to size the guest with possible vCPUs
>    arm/virt: Release objects for *disabled* possible vCPUs after init
>    arm/virt: Add/update basic hot-(un)plug framework
>    arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
>    hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
>    hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register
>      info
>    arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
>    hw/arm: Changes required for reset and to support next boot
>    target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
>    hw/arm: Support hotplug capability check using _OSC method
>    hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
> 
>   accel/tcg/tcg-accel-ops-mttcg.c    |   1 +
>   cpu-common.c                       |  37 ++
>   hw/acpi/cpu.c                      |  62 +-
>   hw/acpi/generic_event_device.c     |  11 +
>   hw/arm/Kconfig                     |   1 +
>   hw/arm/boot.c                      |   2 +-
>   hw/arm/virt-acpi-build.c           | 113 +++-
>   hw/arm/virt.c                      | 877 +++++++++++++++++++++++------
>   hw/core/gpio.c                     |   2 +-
>   hw/intc/arm_gicv3.c                |   1 +
>   hw/intc/arm_gicv3_common.c         |  66 ++-
>   hw/intc/arm_gicv3_cpuif.c          | 269 +++++----
>   hw/intc/arm_gicv3_cpuif_common.c   |   5 +
>   hw/intc/arm_gicv3_kvm.c            |  39 +-
>   hw/intc/gicv3_internal.h           |   2 +
>   include/hw/acpi/cpu.h              |   2 +
>   include/hw/arm/boot.h              |   2 +
>   include/hw/arm/virt.h              |  38 +-
>   include/hw/core/cpu.h              |  78 +++
>   include/hw/intc/arm_gicv3_common.h |  23 +
>   include/hw/qdev-core.h             |   2 +
>   include/tcg/startup.h              |   7 +
>   target/arm/arm-powerctl.c          |  51 +-
>   target/arm/cpu-qom.h               |  18 +-
>   target/arm/cpu.c                   | 112 ++++
>   target/arm/cpu.h                   |  18 +
>   target/arm/cpu64.c                 |  15 +
>   target/arm/gdbstub.c               |   6 +
>   target/arm/helper.c                |  27 +-
>   target/arm/internals.h             |  14 +-
>   target/arm/kvm.c                   | 146 ++++-
>   target/arm/kvm_arm.h               |  25 +
>   target/arm/meson.build             |   1 +
>   target/arm/{tcg => }/psci.c        |   8 +
>   target/arm/tcg/meson.build         |   4 -
>   tcg/tcg.c                          |  24 +
>   36 files changed, 1749 insertions(+), 360 deletions(-)
>   rename target/arm/{tcg => }/psci.c (97%)
> 


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-28 20:35 ` Gustavo Romero
@ 2024-08-29  9:59   ` Alex Bennée
  2024-09-04 14:24     ` Salil Mehta via
  2024-09-04 14:03   ` Salil Mehta via
  1 sibling, 1 reply; 105+ messages in thread
From: Alex Bennée @ 2024-08-29  9:59 UTC (permalink / raw)
  To: Gustavo Romero
  Cc: Salil Mehta, qemu-devel, qemu-arm, mst, maz, jean-philippe,
	jonathan.cameron, lpieralisi, peter.maydell, richard.henderson,
	imammedo, andrew.jones, david, philmd, eric.auger, will, ardb,
	oliver.upton, pbonzini, gshan, rafael, borntraeger, npiggin,
	harshpb, linux, darren, ilkka, vishnu, karl.heubaum, miguel.luis,
	salil.mehta, zhukeqian1, wangxiongfeng2, wangyanan55, jiakernel2,
	maobibo, lixianglai, shahuang, zhao1.liu, linuxarm

Gustavo Romero <gustavo.romero@linaro.org> writes:

> Hi Salil,
>
> On 6/13/24 8:36 PM, Salil Mehta via wrote:
<snip>
>> (VI) Commands Used
>> ==================
>> A. Qemu launch commands to init the machine:
>>      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3
>> \
>>        -cpu host -smp cpus=4,maxcpus=6 \
>>        -m 300M \
>>        -kernel Image \
>>        -initrd rootfs.cpio.gz \
>>        -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
>>        -nographic \
>>        -bios QEMU_EFI.fd \
>> B. Hot-(un)plug related commands:
>>    # Hotplug a host vCPU (accel=kvm):
>>      $ device_add host-arm-cpu,id=core4,core-id=4
>>    # Hotplug a vCPU (accel=tcg):
>>      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
>
> Since support for hotplug is disabled on TCG, remove
> these two lines in v4 cover letter?

Why is it disabled for TCG? We should aim for TCG being as close to KVM
as possible for developers even if it is not a production solution.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



* RE: [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2024-08-28 20:23       ` Gustavo Romero
@ 2024-09-04 13:53         ` Salil Mehta via
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-09-04 13:53 UTC (permalink / raw)
  To: Gustavo Romero, Alex Bennée
  Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com,
	maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com, npiggin@gmail.com,
	harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Gustavo,

Sorry for the delay in replying; I got pulled into something else.

>  From: Gustavo Romero <gustavo.romero@linaro.org>
>  Sent: Wednesday, August 28, 2024 9:24 PM
>  To: Salil Mehta <salil.mehta@huawei.com>; Alex Bennée
>  <alex.bennee@linaro.org>
>  
>  Hi Salil,
>  
>  On 8/19/24 9:35 AM, Salil Mehta via wrote:
>  > Hi Alex,
>  >
>  >>   From: Alex Bennée <alex.bennee@linaro.org>
>  >>   Sent: Friday, August 16, 2024 4:37 PM
>  >>   To: Salil Mehta <salil.mehta@huawei.com>
>  >>
>  >>   Salil Mehta <salil.mehta@huawei.com> writes:
>  >>
>  >>   > vCPU Hot-unplug will result in QOM CPU object unrealization which will
>  >>   > do away with all the vCPU thread creations, allocations, registrations
>  >>   > that happened as part of the realization process. This change
>  >>   > introduces the ARM CPU unrealize function taking care of exactly that.
>  >>   >
>  >>   > Note, initialized KVM vCPUs are not destroyed in host KVM but their
>  >>   > Qemu context is parked at the QEMU KVM layer.
>  >>   >
>  >>   > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>  >>   > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>  >>   > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  >>   > Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>  >>   > [VP: Identified CPU stall issue & suggested probable fix]
>  >>   > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  >>   > ---
>  >>   >  target/arm/cpu.c       | 101
>  >>   +++++++++++++++++++++++++++++++++++++++++
>  >>   >  target/arm/cpu.h       |  14 ++++++
>  >>   >  target/arm/gdbstub.c   |   6 +++
>  >>   >  target/arm/helper.c    |  25 ++++++++++
>  >>   >  target/arm/internals.h |   3 ++
>  >>   >  target/arm/kvm.c       |   5 ++
>  >>   >  6 files changed, 154 insertions(+)
>  >>   >
>  >>   > diff --git a/target/arm/cpu.c b/target/arm/cpu.c index
>  >>   > c92162fa97..a3dc669309 100644
>  >>   > --- a/target/arm/cpu.c
>  >>   > +++ b/target/arm/cpu.c
>  >>   > @@ -157,6 +157,16 @@ void
>  >>   arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn
>  >>   *hook,
>  >>   >      QLIST_INSERT_HEAD(&cpu->pre_el_change_hooks, entry, node);  }
>  >>   >
>  >>   > +void arm_unregister_pre_el_change_hooks(ARMCPU *cpu) {
>  >>   > +    ARMELChangeHook *entry, *next;
>  >>   > +
>  >>   > +    QLIST_FOREACH_SAFE(entry, &cpu->pre_el_change_hooks, node,
>  >>   next) {
>  >>   > +        QLIST_REMOVE(entry, node);
>  >>   > +        g_free(entry);
>  >>   > +    }
>  >>   > +}
>  >>   > +
>  >>   >  void arm_register_el_change_hook(ARMCPU *cpu,
>  >>   ARMELChangeHookFn *hook,
>  >>   >                                   void *opaque)  { @@ -168,6 +178,16
>  >>   > @@ void arm_register_el_change_hook(ARMCPU *cpu,
>  >>   ARMELChangeHookFn *hook,
>  >>   >      QLIST_INSERT_HEAD(&cpu->el_change_hooks, entry, node);  }
>  >>   >
>  >>   > +void arm_unregister_el_change_hooks(ARMCPU *cpu) {
>  >>   > +    ARMELChangeHook *entry, *next;
>  >>   > +
>  >>   > +    QLIST_FOREACH_SAFE(entry, &cpu->el_change_hooks, node,
>  next) {
>  >>   > +        QLIST_REMOVE(entry, node);
>  >>   > +        g_free(entry);
>  >>   > +    }
>  >>   > +}
>  >>   > +
>  >>   >  static void cp_reg_reset(gpointer key, gpointer value, gpointer
>  >>   > opaque)  {
>  >>   >      /* Reset a single ARMCPRegInfo register */ @@ -2552,6 +2572,85
>  @@
>  >>   > static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>  >>   >      acc->parent_realize(dev, errp);
>  >>   >  }
>  >>   >
>  >>   > +static void arm_cpu_unrealizefn(DeviceState *dev) {
>  >>   > +    ARMCPUClass *acc = ARM_CPU_GET_CLASS(dev);
>  >>   > +    ARMCPU *cpu = ARM_CPU(dev);
>  >>   > +    CPUARMState *env = &cpu->env;
>  >>   > +    CPUState *cs = CPU(dev);
>  >>   > +    bool has_secure;
>  >>   > +
>  >>   > +    has_secure = cpu->has_el3 || arm_feature(env,
>  >>   > + ARM_FEATURE_M_SECURITY);
>  >>   > +
>  >>   > +    /* rock 'n' un-roll, whatever happened in the arm_cpu_realizefn
>  >>   cleanly */
>  >>   > +    cpu_address_space_destroy(cs, ARMASIdx_NS);
>  >>
>  >>   On current master this will fail:
>  >>
>  >>   ../../target/arm/cpu.c: In function ‘arm_cpu_unrealizefn’:
>  >>   ../../target/arm/cpu.c:2626:5: error: implicit declaration of function
>  >>   ‘cpu_address_space_destroy’ [-Werror=implicit-function-declaration]
>  >>    2626 |     cpu_address_space_destroy(cs, ARMASIdx_NS);
>  >>         |     ^~~~~~~~~~~~~~~~~~~~~~~~~
>  >>   ../../target/arm/cpu.c:2626:5: error: nested extern declaration of
>  >>   ‘cpu_address_space_destroy’ [-Werror=nested-externs]
>  >>   cc1: all warnings being treated as errors
>  >
>  >
>  > The current master already has arch-agnostic patch-set. I've applied
>  > the RFC V3 to the latest and complied. I did not see this issue?
>  >
>  > I've create a new branch for your reference.
>  >
>  > https://github.com/salil-mehta/qemu/tree/virt-cpuhp-armv8/rfc-v4-rc4
>  >
>  > Please let me know if this works for you?
>  
>  It still happens on the new branch. You need to configure Linux user mode
>  to reproduce it, e.g.:
>  
>  $ ../configure --target-list=aarch64-linux-user,aarch64-softmmu [...]
>  
>  If you just configure the 'aarch64-softmmu' target it doesn't happen.


Aah, I see. I'll check it today. As vCPU hotplug does not make sense in
QEMU user-mode emulation, I think we might need to conditionally
compile certain code using the !CONFIG_USER_ONLY switch.

Thanks for the clarification.

Cheers

>  
>  
>  Cheers,
>  Gustavo


* RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-28 20:35 ` Gustavo Romero
  2024-08-29  9:59   ` Alex Bennée
@ 2024-09-04 14:03   ` Salil Mehta via
  1 sibling, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-09-04 14:03 UTC (permalink / raw)
  To: Gustavo Romero, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com
  Cc: maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com,
	alex.bennee@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
	linux@armlinux.org.uk, darren@os.amperecomputing.com,
	ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
	karl.heubaum@oracle.com, miguel.luis@oracle.com,
	salil.mehta@opnsrc.net, zhukeqian, wangxiongfeng (C),
	wangyanan (Y), jiakernel2@gmail.com, maobibo@loongson.cn,
	lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com,
	Linuxarm

Hi Gustavo,

>  From: Gustavo Romero <gustavo.romero@linaro.org>
>  Sent: Wednesday, August 28, 2024 9:36 PM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  Hi Salil,
>  
>  On 6/13/24 8:36 PM, Salil Mehta via wrote:
>  > PROLOGUE
>  > ========
>  >
>  > To assist in review and set the right expectations from this RFC,
>  > please first read the sections *APPENDED AT THE END* of this cover
>  letter:
>  >
>  > 1. Important *DISCLAIMER* [Section (X)] 2. Work presented at
>  KVMForum
>  > Conference (slides available) [Section (V)F] 3. Organization of
>  > patches [Section (XI)] 4. References [Section (XII)] 5. Detailed TODO
>  > list of leftover work or work-in-progress [Section (IX)]
>  >
>  > There has been interest shown by other organizations in adapting this
>  > series for their architecture. Hence, RFC V2 [21] has been split into
>  > architecture
>  > *agnostic* [22] and *specific* patch sets.
>  >
>  > This is an ARM architecture-specific patch set carved out of RFC V2.
>  > Please check section (XI)B for details of architecture agnostic patches.
>  >
>  > SECTIONS [I - XIII] are as follows:
>  >
>  > (I) Key Changes [details in last section (XIV)]
>  > ==============================================
>  >
>  > RFC V2 -> RFC V3
>  >

[...]

>  >
>  > (VI) Commands Used
>  > ==================
>  >
>  > A. Qemu launch commands to init the machine:
>  >
>  >      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>  >        -cpu host -smp cpus=4,maxcpus=6 \
>  >        -m 300M \
>  >        -kernel Image \
>  >        -initrd rootfs.cpio.gz \
>  >        -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2
>  acpi=force" \
>  >        -nographic \
>  >        -bios QEMU_EFI.fd \
>  >
>  > B. Hot-(un)plug related commands:
>  >
>  >    # Hotplug a host vCPU (accel=kvm):
>  >      $ device_add host-arm-cpu,id=core4,core-id=4
>  >
>  >    # Hotplug a vCPU (accel=tcg):
>  >      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
>  
>  Since support for hotplug is disabled on TCG, remove these two lines in v4
>  cover letter?


We are fixing that and it should be part of RFC V4.


Thanks
Salil.


>  
>  
>  Cheers,
>  Gustavo
>  
>  >    # Delete the vCPU:
>  >      $ device_del core4
>  >

[...]


* RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-08-29  9:59   ` Alex Bennée
@ 2024-09-04 14:24     ` Salil Mehta via
  2024-09-04 15:45       ` Alex Bennée
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-09-04 14:24 UTC (permalink / raw)
  To: Alex Bennée, Gustavo Romero
  Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com,
	maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com, npiggin@gmail.com,
	harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Alex,

>  -----Original Message-----
>  From: Alex Bennée <alex.bennee@linaro.org>
>  Sent: Thursday, August 29, 2024 11:00 AM
>  To: Gustavo Romero <gustavo.romero@linaro.org>
>  
>  Gustavo Romero <gustavo.romero@linaro.org> writes:
>  
>  > Hi Salil,
>  >
>  > On 6/13/24 8:36 PM, Salil Mehta via wrote:
>  <snip>
>  >> (VI) Commands Used
>  >> ==================
>  >> A. Qemu launch commands to init the machine:
>  >>      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>  >>        -cpu host -smp cpus=4,maxcpus=6 \
>  >>        -m 300M \
>  >>        -kernel Image \
>  >>        -initrd rootfs.cpio.gz \
>  >>        -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2
>  acpi=force" \
>  >>        -nographic \
>  >>        -bios QEMU_EFI.fd \
>  >> B. Hot-(un)plug related commands:
>  >>    # Hotplug a host vCPU (accel=kvm):
>  >>      $ device_add host-arm-cpu,id=core4,core-id=4
>  >>    # Hotplug a vCPU (accel=tcg):
>  >>      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
>  >
>  > Since support for hotplug is disabled on TCG, remove these two lines
>  > in v4 cover letter?
>  
>  Why is it disabled for TCG? We should aim for TCG being as close to KVM as
>  possible for developers even if it is not a production solution.

Agreed in principle. Yes, that would be of help.


Context on why it was disabled although most of the code to support TCG exists:

I had reported a crash in RFC V1 (June 2020): a TCGContext counter
overflow assertion during repeated hot-(un)plug operations. Miguel from Oracle
was able to reproduce this problem in February last year and also suggested a fix,
but he later found in his testing that there was a problem during migration.

RFC V1 June 2020:
https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
Scroll to below:
[...]
THINGS TO DO:
 (*) Migration support 
 (*) TCG/Emulation support is not proper right now. Works to a certain extent
     but is not complete. especially the unrealize part in which there is a
     overflow of tcg contexts. The last is due to the fact tcg maintains a 
     count on number of context(per thread instance) so as we hotplug the vcpus
     this counter keeps on incrementing. But during hot-unplug the counter is
     not decremented.

@ Feb 2023, [Linaro-open-discussions] Re: Qemu TCG support for virtual-cpuhotplug/online-policy 

https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/GMDFTEZE6WUUI7LZAYOWLXFHAPXLCND5/

The last status reported by Miguel was that there was a problem with the TCG support
and that he intended to fix it. He was on paternity leave, so I will try to gather the
exact status of the TCG work today.

Thanks
Salil


>  
>  --
>  Alex Bennée
>  Virtualisation Tech Lead @ Linaro


* Re: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2024-08-19 11:53     ` Salil Mehta via
@ 2024-09-04 14:42       ` zhao1.liu
  2024-09-04 17:37         ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: zhao1.liu @ 2024-09-04 14:42 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com, maz@kernel.org, jean-philippe@linaro.org,
	Jonathan Cameron, lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	Linuxarm

Hi Salil,

On Mon, Aug 19, 2024 at 11:53:52AM +0000, Salil Mehta wrote:
> Date: Mon, 19 Aug 2024 11:53:52 +0000
> From: Salil Mehta <salil.mehta@huawei.com>
> Subject: RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU
>  {socket,cluster,core,thread}-id property

[snip]

> >  > NULL); @@ -2708,6 +2716,7 @@ static const CPUArchIdList
> >  *virt_possible_cpu_arch_ids(MachineState *ms)
> >  >   {
> >  >       int n;
> >  >       unsigned int max_cpus = ms->smp.max_cpus;
> >  > +    unsigned int smp_threads = ms->smp.threads;
> >  >       VirtMachineState *vms = VIRT_MACHINE(ms);
> >  >       MachineClass *mc = MACHINE_GET_CLASS(vms);
> >  >
> >  > @@ -2721,6 +2730,7 @@ static const CPUArchIdList
> >  *virt_possible_cpu_arch_ids(MachineState *ms)
> >  >       ms->possible_cpus->len = max_cpus;
> >  >       for (n = 0; n < ms->possible_cpus->len; n++) {
> >  >           ms->possible_cpus->cpus[n].type = ms->cpu_type;
> >  > +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
> >  >           ms->possible_cpus->cpus[n].arch_id =
> >  >               virt_cpu_mp_affinity(vms, n);
> >  >
> >  
> >  Why @vcpus_count is initialized to @smp_threads? it needs to be
> >  documented in the commit log.
> 
> 
> Because every thread internally amounts to a vCPU in QOM, which is
> in a 1:1 relationship with a KVM vCPU. AFAIK, QOM does not strictly follow
> any architecture. Once you start to get into the details of threads, there
> are many aspects of shared resources one will have to consider, and
> these can vary across different implementations of the architecture.

For SPAPR CPUs, the granularity of ->possible_cpus->cpus[] is "core", and
for x86 it's "thread" granularity.

And smp.threads means how many threads are in one core, so for x86 the
vcpus_count of a "thread" is 1, while for spapr the vcpus_count of a
"core" equals smp.threads.

IIUC, your granularity is still "thread", so this field should be 1.

-Zhao

> It is a bigger problem than you think, which I've touched at very nascent
> stages while doing POC of vCPU hotplug but tried to avoid till now. 
> 
> 
> But I would like to hear other community members views on this.
> 
> Hi Igor/Peter,
> 
> What is your take on this?
> 
> Thanks
> Salil.



* Re: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-09-04 14:24     ` Salil Mehta via
@ 2024-09-04 15:45       ` Alex Bennée
  2024-09-04 15:59         ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Alex Bennée @ 2024-09-04 15:45 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Gustavo Romero, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com, maz@kernel.org, jean-philippe@linaro.org,
	Jonathan Cameron, lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com, npiggin@gmail.com,
	harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Salil Mehta <salil.mehta@huawei.com> writes:

> Hi Alex,
>
>>  -----Original Message-----
>>  From: Alex Bennée <alex.bennee@linaro.org>
>>  Sent: Thursday, August 29, 2024 11:00 AM
>>  To: Gustavo Romero <gustavo.romero@linaro.org>
>>  
>>  Gustavo Romero <gustavo.romero@linaro.org> writes:
>>  
>>  > Hi Salil,
>>  >
>>  > On 6/13/24 8:36 PM, Salil Mehta via wrote:
>>  <snip>
>>  >> (VI) Commands Used
>>  >> ==================
>>  >> A. Qemu launch commands to init the machine:
>>  >>      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>>  >>        -cpu host -smp cpus=4,maxcpus=6 \
>>  >>        -m 300M \
>>  >>        -kernel Image \
>>  >>        -initrd rootfs.cpio.gz \
>>  >>        -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2
>>  acpi=force" \
>>  >>        -nographic \
>>  >>        -bios QEMU_EFI.fd \
>>  >> B. Hot-(un)plug related commands:
>>  >>    # Hotplug a host vCPU (accel=kvm):
>>  >>      $ device_add host-arm-cpu,id=core4,core-id=4
>>  >>    # Hotplug a vCPU (accel=tcg):
>>  >>      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
>>  >
>>  > Since support for hotplug is disabled on TCG, remove these two lines
>>  > in v4 cover letter?
>>  
>>  Why is it disabled for TCG? We should aim for TCG being as close to KVM as
>>  possible for developers even if it is not a production solution.
>
> Agreed In principle. Yes, that would be of help.
>
>
> Context why it was disabled although most code to support TCG exist:
>
> I had reported a crash in the RFC V1 (June 2020) about TCGContext counter
> overflow assertion during repeated hot(un)plug operation. Miguel from Oracle
> was able to reproduce this problem last year in Feb and also suggested a fix but he
> later found out in his testing that there was a problem during migration.
>
> RFC V1 June 2020:
> https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> Scroll to below:
> [...]
> THINGS TO DO:
>  (*) Migration support 
>  (*) TCG/Emulation support is not proper right now. Works to a certain extent
>      but is not complete. especially the unrealize part in which there is a
>      overflow of tcg contexts. The last is due to the fact tcg maintains a 
>      count on number of context(per thread instance) so as we hotplug the vcpus
>      this counter keeps on incrementing. But during hot-unplug the counter is
>      not decremented.

Right so the translation cache is segmented by vCPU to support parallel
JIT operations. The easiest solution would be to ensure we dimension for
the maximum number of vCPUs, which it should already, see tcg_init_machine():

  unsigned max_cpus = ms->smp.max_cpus;
  ...
  tcg_init(s->tb_size * MiB, s->splitwx_enabled, max_cpus);

>
> @ Feb 2023, [Linaro-open-discussions] Re: Qemu TCG support for virtual-cpuhotplug/online-policy 
>
> https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/GMDFTEZE6WUUI7LZAYOWLXFHAPXLCND5/
>
> Last status reported by Miguel was that there was problem with the TCG and he intended
> to fix this. He was on paternity leave so I will try to gather the exact status of the TCG today.
>
> Thanks
> Salil
>
>
>>  
>>  --
>>  Alex Bennée
>>  Virtualisation Tech Lead @ Linaro

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



* RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-09-04 15:45       ` Alex Bennée
@ 2024-09-04 15:59         ` Salil Mehta via
  2024-09-06 15:06           ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-09-04 15:59 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Gustavo Romero, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com, maz@kernel.org, jean-philippe@linaro.org,
	Jonathan Cameron, lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com, npiggin@gmail.com,
	harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Alex,

>  From: Alex Bennée <alex.bennee@linaro.org>
>  Sent: Wednesday, September 4, 2024 4:46 PM
>  To: Salil Mehta <salil.mehta@huawei.com>
>  
>  Salil Mehta <salil.mehta@huawei.com> writes:
>  
>  > Hi Alex,
>  >
>  >>  -----Original Message-----
>  >>  From: Alex Bennée <alex.bennee@linaro.org>
>  >>  Sent: Thursday, August 29, 2024 11:00 AM
>  >>  To: Gustavo Romero <gustavo.romero@linaro.org>
>  >>
>  >>  Gustavo Romero <gustavo.romero@linaro.org> writes:
>  >>
>  >>  > Hi Salil,
>  >>  >
>  >>  > On 6/13/24 8:36 PM, Salil Mehta via wrote:
>  >>  <snip>
>  >>  >> (VI) Commands Used
>  >>  >> ==================
>  >>  >> A. Qemu launch commands to init the machine:
>  >>  >>      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3
>  \
>  >>  >>        -cpu host -smp cpus=4,maxcpus=6 \
>  >>  >>        -m 300M \
>  >>  >>        -kernel Image \
>  >>  >>        -initrd rootfs.cpio.gz \
>  >>  >>        -append "console=ttyAMA0 root=/dev/ram rdinit=/init
>  maxcpus=2
>  >>  acpi=force" \
>  >>  >>        -nographic \
>  >>  >>        -bios QEMU_EFI.fd \
>  >>  >> B. Hot-(un)plug related commands:
>  >>  >>    # Hotplug a host vCPU (accel=kvm):
>  >>  >>      $ device_add host-arm-cpu,id=core4,core-id=4
>  >>  >>    # Hotplug a vCPU (accel=tcg):
>  >>  >>      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
>  >>  >
>  >>  > Since support for hotplug is disabled on TCG, remove these two
>  >> lines  > in v4 cover letter?
>  >>
>  >>  Why is it disabled for TCG? We should aim for TCG being as close to
>  >> KVM as  possible for developers even if it is not a production solution.
>  >
>  > Agreed In principle. Yes, that would be of help.
>  >
>  >
>  > Context why it was disabled although most code to support TCG exist:
>  >
>  > I had reported a crash in the RFC V1 (June 2020) about TCGContext
>  > counter overflow assertion during repeated hot(un)plug operation.
>  > Miguel from Oracle was able to reproduce this problem last year in Feb
>  > and also suggested a fix but he later found out in his testing that there was
>  a problem during migration.
>  >
>  > RFC V1 June 2020:
>  > https://lore.kernel.org/qemu-devel/20200613213629.21984-1-
>  salil.mehta@
>  > huawei.com/
>  > Scroll to below:
>  > [...]
>  > THINGS TO DO:
>  >  (*) Migration support
>  >  (*) TCG/Emulation support is not proper right now. Works to a certain extent
>  >      but is not complete. especially the unrealize part in which there is a
>  >      overflow of tcg contexts. The last is due to the fact tcg maintains a
>  >      count on number of context(per thread instance) so as we hotplug the vcpus
>  >      this counter keeps on incrementing. But during hot-unplug the counter is
>  >      not decremented.
>  
>  Right so the translation cache is segmented by vCPU to support parallel JIT
>  operations. The easiest solution would be to ensure we dimension for the
>  maximum number of vCPUs, which it should already, see
>  tcg_init_machine():
>  
>    unsigned max_cpus = ms->smp.max_cpus;
>    ...
>    tcg_init(s->tb_size * MiB, s->splitwx_enabled, max_cpus);


Agreed. We have done that and have a patch for that as well. But it is still
a work-in-progress and I've lost context a bit.

https://github.com/salil-mehta/qemu/commit/107cf5ca7cf3716bc0f8c68e98e1da3939f449ce

For now, I've quickly tried to re-enable and run TCG to regain the context.
I've now hit a different problem during the TCG vCPU unrealization phase:
while pthread_join() waits on the halt condition variable for the MTTCG vCPU
thread to exit, there is a crash somewhere. It looks like a race condition.
I will dig into this further.
 

Best regards
Salil.

>  > @ Feb 2023, [Linaro-open-discussions] Re: Qemu TCG support for
>  > virtual-cpuhotplug/online-policy
>  >
>  > https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/GMDFTEZE6WUUI7LZAYOWLXFHAPXLCND5/
>  >
>  > Last status reported by Miguel was that there was problem with the TCG
>  > and he intended to fix this. He was on paternity leave so I will try to gather
>  the exact status of the TCG today.
>  >
>  > Thanks
>  > Salil
>  >
>  >
>  >>
>  >>  --
>  >>  Alex Bennée
>  >>  Virtualisation Tech Lead @ Linaro
>  
>  --
>  Alex Bennée
>  Virtualisation Tech Lead @ Linaro

^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2024-09-04 14:42       ` zhao1.liu
@ 2024-09-04 17:37         ` Salil Mehta via
  2024-09-09 15:28           ` Zhao Liu
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-09-04 17:37 UTC (permalink / raw)
  To: zhao1.liu@intel.com
  Cc: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com, maz@kernel.org, jean-philippe@linaro.org,
	Jonathan Cameron, lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	Linuxarm

Hi Zhao,

>  From: zhao1.liu@intel.com <zhao1.liu@intel.com>
>  Sent: Wednesday, September 4, 2024 3:43 PM
>  To: Salil Mehta <salil.mehta@huawei.com>
>  
>  Hi Salil,
>  
>  On Mon, Aug 19, 2024 at 11:53:52AM +0000, Salil Mehta wrote:
>  > Date: Mon, 19 Aug 2024 11:53:52 +0000
>  > From: Salil Mehta <salil.mehta@huawei.com>
>  > Subject: RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU
>  > {socket,cluster,core,thread}-id property
>  
>  [snip]
>  
>  > >  > NULL); @@ -2708,6 +2716,7 @@ static const CPUArchIdList
>  > > *virt_possible_cpu_arch_ids(MachineState *ms)
>  > >  >   {
>  > >  >       int n;
>  > >  >       unsigned int max_cpus = ms->smp.max_cpus;
>  > >  > +    unsigned int smp_threads = ms->smp.threads;
>  > >  >       VirtMachineState *vms = VIRT_MACHINE(ms);
>  > >  >       MachineClass *mc = MACHINE_GET_CLASS(vms);
>  > >  >
>  > >  > @@ -2721,6 +2730,7 @@ static const CPUArchIdList
>  > > *virt_possible_cpu_arch_ids(MachineState *ms)
>  > >  >       ms->possible_cpus->len = max_cpus;
>  > >  >       for (n = 0; n < ms->possible_cpus->len; n++) {
>  > >  >           ms->possible_cpus->cpus[n].type = ms->cpu_type;
>  > >  > +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
>  > >  >           ms->possible_cpus->cpus[n].arch_id =
>  > >  >               virt_cpu_mp_affinity(vms, n);
>  > >  >
>  > >
>  > >  Why @vcpus_count is initialized to @smp_threads? it needs to be
>  > > documented in the commit log.
>  >
>  >
>  > Because every thread internally amounts to a vCPU in QOM and which is
>  > in 1:1 relationship with KVM vCPU. AFAIK, QOM does not strictly
>  > follows any architecture. Once you start to get into details of
>  > threads there are many aspects of shared resources one will have to
>  > consider and these can vary across different implementations of
>  architecture.
>  
>  For SPAPR CPU, the granularity of >possible_cpus->cpus[] is "core", and for
>  x86, it's "thread" granularity.


We have threads-per-core at the microarchitecture level in ARM as well. But
each thread appears as a vCPU to the OS, and AFAICS there are no special
attributes attached to it. SMT can be enabled/disabled in firmware and should
get reflected in the configuration accordingly, i.e. the value of
*threads-per-core* changes between 1 and 'N'. This means 'vcpus_count' has to
reflect the correct configuration. But I think threads lack proper
representation in the QEMU QOM.

In QEMU, each vCPU reflects an execution context (which gets uniquely mapped
to a KVM vCPU). AFAICS, we only have *CPUState* (struct ArchCPU) as a
placeholder for this execution context, and there is no *ThreadState*
(derived from struct CPUState). Hence, we have to map all the threads to QOM
vCPUs. This means the array of present or possible CPUs represented by
'struct CPUArchIdList' contains all execution contexts, each of which might
actually be a vCPU or a thread. Hence, frankly, the usage of *vcpus_count*
seems quite superficial to me.

Also, AFAICS, KVM does not have a concept of threads and only has KVM vCPUs,
but you are still allowed to specify the topology with sockets, dies,
clusters, cores and threads in most architectures.


>  
>  And smp.threads means how many threads in one core, so for x86, the
>  vcpus_count of a "thread" is 1, and for spapr, the vcpus_count of a "core"
>  equals to smp.threads.


Sure, but does KVM specify this? And how do these threads map to the QOM
vCPU objects or execution contexts? AFAICS there is nothing but 'CPUState'
that will be made part of the possible vCPU list 'struct CPUArchIdList'.



>  
>  IIUC, your granularity is still "thread", so that this filed should be 1.


Well, again, we need more discussion on this. I've stated my concerns
against doing this. The user should be allowed to create a virtual topology
which includes 'threads' as one of the parameters.


>  
>  -Zhao
>  
>  > It is a bigger problem than you think, which I've touched at very
>  > nascent stages while doing POC of vCPU hotplug but tried to avoid till now.
>  >
>  >
>  > But I would like to hear other community members views on this.
>  >
>  > Hi Igor/Peter,
>  >
>  > What is your take on this?
>  >
>  > Thanks
>  > Salil.




* RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch
  2024-09-04 15:59         ` Salil Mehta via
@ 2024-09-06 15:06           ` Salil Mehta via
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta via @ 2024-09-06 15:06 UTC (permalink / raw)
  To: Salil Mehta, Alex Bennée
  Cc: Gustavo Romero, qemu-devel@nongnu.org, mst@redhat.com,
	maz@kernel.org, jean-philippe@linaro.org, Jonathan Cameron,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com, npiggin@gmail.com,
	harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	zhao1.liu@intel.com, Linuxarm

Hi Alex,

>  From: qemu-arm-bounces+salil.mehta=huawei.com@nongnu.org <qemu-
>  arm-bounces+salil.mehta=huawei.com@nongnu.org> On Behalf Of Salil
>  Mehta via
>  Sent: Wednesday, September 4, 2024 5:00 PM
>  To: Alex Bennée <alex.bennee@linaro.org>
>  
>  Hi Alex,
>  
>  >  From: Alex Bennée <alex.bennee@linaro.org>
>  >  Sent: Wednesday, September 4, 2024 4:46 PM
>  >  To: Salil Mehta <salil.mehta@huawei.com>
>  >
>  >  Salil Mehta <salil.mehta@huawei.com> writes:
>  >
>  >  > Hi Alex,
>  >  >
>  >  >>  -----Original Message-----
>  >  >>  From: Alex Bennée <alex.bennee@linaro.org>  >>  Sent: Thursday,
>  > August 29, 2024 11:00 AM  >>  To: Gustavo Romero
>  > <gustavo.romero@linaro.org>  >>  >>  Gustavo Romero
>  > <gustavo.romero@linaro.org> writes:
>  >  >>
>  >  >>  > Hi Salil,
>  >  >>  >
>  >  >>  > On 6/13/24 8:36 PM, Salil Mehta via wrote:
>  >  >>  <snip>
>  >  >>  >> (VI) Commands Used
>  >  >>  >> ==================
>  >  >>  >> A. Qemu launch commands to init the machine:
>  >  >>  >>      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3   \
>  >  >>  >>        -cpu host -smp cpus=4,maxcpus=6 \
>  >  >>  >>        -m 300M \
>  >  >>  >>        -kernel Image \
>  >  >>  >>        -initrd rootfs.cpio.gz \
>  >  >>  >>        -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2
>  >  >>  acpi=force" \
>  >  >>  >>        -nographic \
>  >  >>  >>        -bios QEMU_EFI.fd \
>  >  >>  >> B. Hot-(un)plug related commands:
>  >  >>  >>    # Hotplug a host vCPU (accel=kvm):
>  >  >>  >>      $ device_add host-arm-cpu,id=core4,core-id=4
>  >  >>  >>    # Hotplug a vCPU (accel=tcg):
>  >  >>  >>      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
>  >  >>  >
>  >  >>  > Since support for hotplug is disabled on TCG, remove these two
>  > >> lines  > in v4 cover letter?
>  >  >>
>  >  >>  Why is it disabled for TCG? We should aim for TCG being as close
>  > to  >> KVM as  possible for developers even if it is not a production solution.
>  >  >
>  >  > Agreed In principle. Yes, that would be of help.
>  >  >
>  >  >
>  >  > Context why it was disabled although most code to support TCG exist:
>  >  >
>  >  > I had reported a crash in the RFC V1 (June 2020) about TCGContext
>  > > counter overflow assertion during repeated hot(un)plug operation.
>  >  > Miguel from Oracle was able to reproduce this problem last year in
>  > Feb  > and also suggested a fix but he later found out in his testing
>  > that there was  a problem during migration.
>  >  >
>  >  > RFC V1 June 2020:
>  >  > https://lore.kernel.org/qemu-devel/20200613213629.21984-1-
>  >  salil.mehta@
>  >  > huawei.com/
>  >  > Scroll to below:
>  >  > [...]
>  >  > THINGS TO DO:
>  >  >  (*) Migration support
>  >  >  (*) TCG/Emulation support is not proper right now. Works to a certain  extent
>  >  >      but is not complete. especially the unrealize part in which there is a
>  >  >      overflow of tcg contexts. The last is due to the fact tcg maintains a
>  >  >      count on number of context(per thread instance) so as we hotplug the vcpus
>  >  >      this counter keeps on incrementing. But during hot-unplug the counter is
>  >  >      not decremented.
>  >
>  >  Right so the translation cache is segmented by vCPU to support
>  > parallel JIT  operations. The easiest solution would be to ensure we
>  > dimension for the  maximum number of vCPUs, which it should already, see
>  >  tcg_init_machine():
>  >
>  >    unsigned max_cpus = ms->smp.max_cpus;
>  >    ...
>  >    tcg_init(s->tb_size * MiB, s->splitwx_enabled, max_cpus);
>  
>  
>  Agreed. We have done that and have a patch for that as well. But it is still a
>  work-in-progress and I've lost context a bit.
>  
>  https://github.com/salil-mehta/qemu/commit/107cf5ca7cf3716bc0f8c68e98e1da3939f449ce
>  
>  For now, I've very quickly tried to enable and run the TCG to gain back the
>  context.
>  I've now hit a different problem during TCG vCPU unrealization phase, while
>  pthread_join() waits on halt condition variable for MTTCG vCPU thread to
>  exit, there is a crash somewhere. Look like some race condition. Will dig this
>  further.


It appears that there was a race condition between the destruction of the
CPU address space and the delayed processing of the tcg_commit_cpu()
function. The latter is primarily responsible for:

1. Updating the memory dispatch pointer
2. Performing the tlb_flush() operation

This process involves the CPU address space memory listener's tcg_commit(),
which queues this work item for the vCPU to execute at the earliest
opportunity. During ARM vCPU unrealization, we were destroying the address
space first and then calling cpu_remove_sync(). This resulted in the vCPU
thread being kicked out of its IO wait state and processing the vCPU
work-queue items. Since the CPU address space had already been destroyed,
this caused a segmentation fault.

I've resolved this issue by delaying the destruction of the CPU address
space until the cpu_remove_sync() operation has completed, but before the
parent is unrealized. This fixed the crash. The vCPU hotplug operation now
seems to work on TCG. I still need to test migration, which I plan to do in
the next couple of days. Please have a look at the patches below and the
repository.

https://github.com/salil-mehta/qemu/commit/9fbb8ecbc61c6405db342cc243b2be17b1c97e03
https://github.com/salil-mehta/qemu/commit/1900893449c1b6a10e1534635f29bfb545b825d0


Please check the below branch:
https://github.com/salil-mehta/qemu/commits/virt-cpuhp-armv8/rfc-v4-rc5


Best regards
Salil.


>  >  > @ Feb 2023, [Linaro-open-discussions] Re: Qemu TCG support for  >
>  > virtual-cpuhotplug/online-policy  >  >
>  > https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-l
>  >  > ists.linaro.org/message/GMDFTEZE6WUUI7LZAYOWLXFHAPXLCND5/
>  >  >
>  >  > Last status reported by Miguel was that there was problem with the
>  > TCG  > and he intended to fix this. He was on paternity leave so I
>  > will try to gather  the exact status of the TCG today.
>  >  >
>  >  > Thanks
>  >  > Salil
>  >  >
>  >  >
>  >  >>
>  >  >>  --
>  >  >>  Alex Bennée
>  >  >>  Virtualisation Tech Lead @ Linaro
>  >
>  >  --
>  >  Alex Bennée
>  >  Virtualisation Tech Lead @ Linaro


* Re: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2024-09-04 17:37         ` Salil Mehta via
@ 2024-09-09 15:28           ` Zhao Liu
  2024-09-10 11:01             ` Salil Mehta via
  0 siblings, 1 reply; 105+ messages in thread
From: Zhao Liu @ 2024-09-09 15:28 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com, maz@kernel.org, jean-philippe@linaro.org,
	Jonathan Cameron, lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	Linuxarm, Zhao Liu

On Wed, Sep 04, 2024 at 05:37:21PM +0000, Salil Mehta wrote:
> Date: Wed, 4 Sep 2024 17:37:21 +0000
> From: Salil Mehta <salil.mehta@huawei.com>
> Subject: RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU
>  {socket,cluster,core,thread}-id property
> 
> Hi Zhao,
> 
> >  From: zhao1.liu@intel.com <zhao1.liu@intel.com>
> >  Sent: Wednesday, September 4, 2024 3:43 PM
> >  To: Salil Mehta <salil.mehta@huawei.com>
> >  
> >  Hi Salil,
> >  
> >  On Mon, Aug 19, 2024 at 11:53:52AM +0000, Salil Mehta wrote:
> >  > Date: Mon, 19 Aug 2024 11:53:52 +0000
> >  > From: Salil Mehta <salil.mehta@huawei.com>
> >  > Subject: RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU
> >  > {socket,cluster,core,thread}-id property
> >  
> >  [snip]
> >  
> >  > >  > NULL); @@ -2708,6 +2716,7 @@ static const CPUArchIdList
> >  > > *virt_possible_cpu_arch_ids(MachineState *ms)
> >  > >  >   {
> >  > >  >       int n;
> >  > >  >       unsigned int max_cpus = ms->smp.max_cpus;
> >  > >  > +    unsigned int smp_threads = ms->smp.threads;
> >  > >  >       VirtMachineState *vms = VIRT_MACHINE(ms);
> >  > >  >       MachineClass *mc = MACHINE_GET_CLASS(vms);
> >  > >  >
> >  > >  > @@ -2721,6 +2730,7 @@ static const CPUArchIdList
> >  > > *virt_possible_cpu_arch_ids(MachineState *ms)
> >  > >  >       ms->possible_cpus->len = max_cpus;
> >  > >  >       for (n = 0; n < ms->possible_cpus->len; n++) {
> >  > >  >           ms->possible_cpus->cpus[n].type = ms->cpu_type;
> >  > >  > +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
> >  > >  >           ms->possible_cpus->cpus[n].arch_id =
> >  > >  >               virt_cpu_mp_affinity(vms, n);
> >  > >  >
> >  > >
> >  > >  Why @vcpus_count is initialized to @smp_threads? it needs to be
> >  > > documented in the commit log.
> >  >
> >  >
> >  > Because every thread internally amounts to a vCPU in QOM and which is
> >  > in 1:1 relationship with KVM vCPU. AFAIK, QOM does not strictly
> >  > follows any architecture. Once you start to get into details of
> >  > threads there are many aspects of shared resources one will have to
> >  > consider and these can vary across different implementations of
> >  architecture.
> >  
> >  For SPAPR CPU, the granularity of >possible_cpus->cpus[] is "core", and for
> >  x86, it's "thread" granularity.
> 
> 
> We have threads per-core at microarchitecture level in ARM as well. But each
> thread appears like a vCPU to OS and AFAICS there are no special attributes
> attached to it. SMT can be enabled/disabled at firmware and should get
> reflected in the configuration accordingly i.e. value of *threads-per-core* 
> changes between 1 and 'N'.  This means 'vcpus_count' has to reflect the
> correct configuration. But I think threads lack proper representation
> in Qemu QOM.

In the topology-related part, SMT (on x86) usually represents the logical
processor level, and 'thread' has the same meaning. Changing these meanings
is also possible, but I think it should be based on an actual use case; we
can consider the complexity of the implementation when there is a need.

> In Qemu, each vCPU reflects an execution context (which gets uniquely mapped
> to KVM vCPU). AFAICS, we only have *CPUState* (Struct ArchCPU) as a placeholder
> for this execution context and there is no *ThreadState* (derived out of
> Struct CPUState). Hence, we've  to map all the threads as QOM vCPUs. This means
> the array of present or possible CPUs represented by 'struct CPUArchIdList' contains
> all execution contexts which actually might be vCPU or a thread. Hence, usage of
> *vcpus_count* seems quite superficial to me frankly.
>
> Also, AFAICS, KVM does not have the concept of the threads and only has
> KVM vCPUs, but you are still allowed to specify the topology with sockets, dies,
> clusters, cores, threads in most architectures.  
 
Topology has some uses: for example, it affects scheduling behavior,
feature emulation, etc.
  
> >  And smp.threads means how many threads in one core, so for x86, the
> >  vcpus_count of a "thread" is 1, and for spapr, the vcpus_count of a "core"
> >  equals to smp.threads.
> 
> 
> Sure, but does the KVM specifies this? 

At least as you said, KVM (for x86) doesn't consider higher-level topologies
at the moment, but that's not to say that it won't in the future, as certain
registers do have topology dependencies.

> and how does these threads map to the QOM vCPU objects or execution context?

Each CPU object will create a (software) thread; you can refer to the
function "kvm_start_vcpu_thread(CPUState *cpu)", which is called when the
CPU object realizes.

> AFAICS there is nothing but 'CPUState'
> which will be made part of the  possible vCPU list 'struct CPUArchIdList'.
 
As I said, an example is spapr ("spapr_possible_cpu_arch_ids()"), which maps
possible_cpu to a core object. However, this is a very specific example,
and, as Igor's slides said, I understand it's an architectural requirement.

> >  
> >  IIUC, your granularity is still "thread", so that this filed should be 1.
> 
> 
> Well, again we need more discussion on this. I've stated my concerns against
> doing this. User should be allowed to create virtual topology which will
> include 'threads' as one of the parameter.
> 

I don't seem to understand... There is a "threads" parameter in -smp; does
this not satisfy your use case?

Regards,
Zhao




* RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2024-09-09 15:28           ` Zhao Liu
@ 2024-09-10 11:01             ` Salil Mehta via
  2024-09-11 11:35               ` Jonathan Cameron via
  0 siblings, 1 reply; 105+ messages in thread
From: Salil Mehta via @ 2024-09-10 11:01 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com, maz@kernel.org, jean-philippe@linaro.org,
	Jonathan Cameron, lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	Linuxarm

HI Zhao,

>  From: Zhao Liu <zhao1.liu@intel.com>
>  Sent: Monday, September 9, 2024 4:28 PM
>  To: Salil Mehta <salil.mehta@huawei.com>
>  
>  On Wed, Sep 04, 2024 at 05:37:21PM +0000, Salil Mehta wrote:
>  > Date: Wed, 4 Sep 2024 17:37:21 +0000
>  > From: Salil Mehta <salil.mehta@huawei.com>
>  > Subject: RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU
>  > {socket,cluster,core,thread}-id property
>  >
>  > Hi Zhao,
>  >
>  > >  From: zhao1.liu@intel.com <zhao1.liu@intel.com>
>  > >  Sent: Wednesday, September 4, 2024 3:43 PM
>  > >  To: Salil Mehta <salil.mehta@huawei.com>
>  > >
>  > >  Hi Salil,
>  > >
>  > >  On Mon, Aug 19, 2024 at 11:53:52AM +0000, Salil Mehta wrote:
>  > >  > Date: Mon, 19 Aug 2024 11:53:52 +0000  > From: Salil Mehta
>  > > <salil.mehta@huawei.com>  > Subject: RE: [PATCH RFC V3 01/29]
>  > > arm/virt,target/arm: Add new ARMCPU  >
>  > > {socket,cluster,core,thread}-id property
>  > >
>  > >  [snip]
>  > >
>  > >  > >  > NULL); @@ -2708,6 +2716,7 @@ static const CPUArchIdList  > >
>  > > *virt_possible_cpu_arch_ids(MachineState *ms)
>  > >  > >  >   {
>  > >  > >  >       int n;
>  > >  > >  >       unsigned int max_cpus = ms->smp.max_cpus;
>  > >  > >  > +    unsigned int smp_threads = ms->smp.threads;
>  > >  > >  >       VirtMachineState *vms = VIRT_MACHINE(ms);
>  > >  > >  >       MachineClass *mc = MACHINE_GET_CLASS(vms);
>  > >  > >  >
>  > >  > >  > @@ -2721,6 +2730,7 @@ static const CPUArchIdList  > >
>  > > *virt_possible_cpu_arch_ids(MachineState *ms)
>  > >  > >  >       ms->possible_cpus->len = max_cpus;
>  > >  > >  >       for (n = 0; n < ms->possible_cpus->len; n++) {
>  > >  > >  >           ms->possible_cpus->cpus[n].type = ms->cpu_type;
>  > >  > >  > +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
>  > >  > >  >           ms->possible_cpus->cpus[n].arch_id =
>  > >  > >  >               virt_cpu_mp_affinity(vms, n);
>  > >  > >  >
>  > >  > >
>  > >  > >  Why @vcpus_count is initialized to @smp_threads? it needs to
>  > > be  > > documented in the commit log.
>  > >  >
>  > >  >
>  > >  > Because every thread internally amounts to a vCPU in QOM and
>  > > which is  > in 1:1 relationship with KVM vCPU. AFAIK, QOM does not
>  > > strictly  > follows any architecture. Once you start to get into
>  > > details of  > threads there are many aspects of shared resources one
>  > > will have to  > consider and these can vary across different
>  > > implementations of  architecture.
>  > >
>  > >  For SPAPR CPU, the granularity of >possible_cpus->cpus[] is "core",
>  > > and for  x86, it's "thread" granularity.
>  >
>  >
>  > We have threads per-core at microarchitecture level in ARM as well.
>  > But each thread appears like a vCPU to OS and AFAICS there are no
>  > special attributes attached to it. SMT can be enabled/disabled at
>  > firmware and should get reflected in the configuration accordingly
>  > i.e. value of *threads-per-core* changes between 1 and 'N'.  This
>  > means 'vcpus_count' has to reflect the correct configuration. But I
>  > think threads lack proper representation in Qemu QOM.
>  
>  In topology related part, SMT (of x86) usually represents the logical
>  processor level. And thread has the same meaning.


Agreed. It is the same in ARM as well. The difference could be in how
hardware threads are implemented at the microarchitecture level.
Nevertheless, we do have such virtual configurations, and the meaning of
*threads* as in the QOM topology (socket, cluster, core, thread) is
virtualized similarly to the hardware threads in the host. And one should be
able to configure thread support in the virtual environment, regardless of
whether or not the underlying hardware supports threads. That's my take.

The other aspect is how we then expose these threads to the guest. The guest
kernel (just like the host kernel) should gather topology information from
the ACPI PPTT table (is this ARM-specific?). The latter is populated by QEMU
(just as firmware populates it for the host kernel) by making use of the
virtual topology. An ARM guest kernel, in the absence of PPTT support, can
detect the presence of hardware threads by reading the MT bit in the
MPIDR_EL1 register.

Every property in 'ms->possible_cpus->cpus[n].props' should be exactly the
same as finalized in, and part of, MachineState::CpuTopology. Hence, the
number of threads-per-core, 'vcpus_count', should not be treated
differently.

But there is a catch! (I explained that earlier.)


>  To change these meanings is also possible, but I think it should be based
>  on the actual use case. we can consider the complexity of the
>  implementation when there is a need.


Agreed. There is no ambiguity in the meaning of hardware threads or the
virtualized MachineState::CpuTopology. The properties of all possible vCPUs
should be exactly the same as those in the MachineState. This includes the
number of threads-per-core.

You mentioned 'vcpus_count' should be 1, but does that mean the user can
never specify threads > 1 in a virtual configuration for x86?


>  
>  > In Qemu, each vCPU reflects an execution context (which gets uniquely
>  > mapped to KVM vCPU). AFAICS, we only have *CPUState* (Struct
>  ArchCPU)
>  > as a placeholder for this execution context and there is no
>  > *ThreadState* (derived out of Struct CPUState). Hence, we've  to map
>  > all the threads as QOM vCPUs. This means the array of present or
>  > possible CPUs represented by 'struct CPUArchIdList' contains all
>  > execution contexts which actually might be vCPU or a thread. Hence,
>  > usage of
>  > *vcpus_count* seems quite superficial to me frankly.
>  >
>  > Also, AFAICS, KVM does not have the concept of the threads and only
>  > has KVM vCPUs, but you are still allowed to specify the topology with
>  > sockets, dies, clusters, cores, threads in most architectures.
>  
>  There are some uses for topology, such as it affects scheduling behavior,
>  and it affects feature emulation, etc.


True. And we should be flexible at the VMM level. We should let the admin of
the VMM control how the virtual topology is created so that it best fits the
underlying hardware features of the host. This includes NUMA, sub-NUMA,
cores, hardware threads, cache topology, etc.


>  
>  > >  And smp.threads means how many threads in one core, so for x86, the
>  > > vcpus_count of a "thread" is 1, and for spapr, the vcpus_count of a
>  "core" equals to smp.threads.
>  >
>  >
>  > Sure, but does the KVM specifies this?
>  
>  At least as you said, KVM (for x86) doesn't consider higher-level topologies
>  at the moment, but that's not to say that it won't in the future, as certain
>  registers do have topology dependencies.


Sure. So you mean that for an x86 virtual topology, smp.threads = 1 always?


>  
>  > and how does these threads map to the QOM vCPU objects or execution
>  context?
>  
>  Each CPU object will create a (software) thread, you can refer the function
>  "kvm_start_vcpu_thread(CPUState *cpu)", which will be called when CPU
>  object realizes.


Yes, sure, and each such QOM vCPU thread and its 'struct CPUState' is mapped
to the lowest granularity of execution specified within the QOM virtual
topology. It could be a 'thread' or a 'core'. And all of these will run as
KVM vCPUs, scheduled on some hardware core and maybe a hardware thread (if
enabled).

So there is no difference across architectures regarding this part. I was
trying to point out that in QOM, even the threads will have their own
'struct CPUState', and each one will be part of the "CPUArchIdList
*possible_cpus" maintained in the MachineState. At this level we lose the
relationship information between the cores and their corresponding threads
(given by 'vcpus_count').


>  > AFAICS there is nothing but 'CPUState'
>  > which will be made part of the  possible vCPU list 'struct CPUArchIdList'.
>  
>  As I said, an example is spapr ("spapr_possible_cpu_arch_ids()"), which
>  maps possible_cpu to core object. However, this is a very specific example,
>  and like Igor's slides said, I understand it's an architectural requirement.


I'm sure there must have been some. I'm trying to understand it. Can you
share the slides?


>  
>  > >
>  > >  IIUC, your granularity is still "thread", so that this filed should be 1.
>  >
>  >
>  > Well, again we need more discussion on this. I've stated my concerns
>  > against doing this. User should be allowed to create virtual topology
>  > which will include 'threads' as one of the parameter.
>  >
>  
>  I don't seem to understand...There is a "threads" parameter in -smp, does
>  this not satisfy your use case?

It certainly does. But shouldn't this get reflected in 'vcpus_count' as well?


Best regards
Salil

>  
>  Regards,
>  Zhao
>  




* Re: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2024-09-10 11:01             ` Salil Mehta via
@ 2024-09-11 11:35               ` Jonathan Cameron via
  2024-09-11 12:25                 ` Salil Mehta
  0 siblings, 1 reply; 105+ messages in thread
From: Jonathan Cameron via @ 2024-09-11 11:35 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Zhao Liu, Gavin Shan, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	mst@redhat.com, maz@kernel.org, jean-philippe@linaro.org,
	lpieralisi@kernel.org, peter.maydell@linaro.org,
	richard.henderson@linaro.org, imammedo@redhat.com,
	andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
	eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
	oliver.upton@linux.dev, pbonzini@redhat.com, rafael@kernel.org,
	borntraeger@linux.ibm.com, alex.bennee@linaro.org,
	npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk,
	darren@os.amperecomputing.com, ilkka@os.amperecomputing.com,
	vishnu@os.amperecomputing.com, karl.heubaum@oracle.com,
	miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	Linuxarm

On Tue, 10 Sep 2024 12:01:05 +0100
Salil Mehta <salil.mehta@huawei.com> wrote:

> HI Zhao,

A few trivial comments inline.

> 
> >  From: Zhao Liu <zhao1.liu@intel.com>
> >  Sent: Monday, September 9, 2024 4:28 PM
> >  To: Salil Mehta <salil.mehta@huawei.com>
> >  
> >  On Wed, Sep 04, 2024 at 05:37:21PM +0000, Salil Mehta wrote:  
> >  > Date: Wed, 4 Sep 2024 17:37:21 +0000
> >  > From: Salil Mehta <salil.mehta@huawei.com>
> >  > Subject: RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU
> >  > {socket,cluster,core,thread}-id property
> >  >
> >  > Hi Zhao,
> >  >  
> >  > >  From: zhao1.liu@intel.com <zhao1.liu@intel.com>
> >  > >  Sent: Wednesday, September 4, 2024 3:43 PM
> >  > >  To: Salil Mehta <salil.mehta@huawei.com>
> >  > >
> >  > >  Hi Salil,
> >  > >
> >  > >  On Mon, Aug 19, 2024 at 11:53:52AM +0000, Salil Mehta wrote:  
> >  > >  > Date: Mon, 19 Aug 2024 11:53:52 +0000  > From: Salil Mehta  
> >  > > <salil.mehta@huawei.com>  > Subject: RE: [PATCH RFC V3 01/29]
> >  > > arm/virt,target/arm: Add new ARMCPU  >
> >  > > {socket,cluster,core,thread}-id property
> >  > >
> >  > >  [snip]
> >  > >  
> >  > >  > >  > NULL); @@ -2708,6 +2716,7 @@ static const CPUArchIdList  > >  
> >  > > *virt_possible_cpu_arch_ids(MachineState *ms)  
> >  > >  > >  >   {
> >  > >  > >  >       int n;
> >  > >  > >  >       unsigned int max_cpus = ms->smp.max_cpus;
> >  > >  > >  > +    unsigned int smp_threads = ms->smp.threads;
> >  > >  > >  >       VirtMachineState *vms = VIRT_MACHINE(ms);
> >  > >  > >  >       MachineClass *mc = MACHINE_GET_CLASS(vms);
> >  > >  > >  >
> >  > >  > >  > @@ -2721,6 +2730,7 @@ static const CPUArchIdList  > >  
> >  > > *virt_possible_cpu_arch_ids(MachineState *ms)  
> >  > >  > >  >       ms->possible_cpus->len = max_cpus;
> >  > >  > >  >       for (n = 0; n < ms->possible_cpus->len; n++) {
> >  > >  > >  >           ms->possible_cpus->cpus[n].type = ms->cpu_type;
> >  > >  > >  > +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
> >  > >  > >  >           ms->possible_cpus->cpus[n].arch_id =
> >  > >  > >  >               virt_cpu_mp_affinity(vms, n);
> >  > >  > >  >  
> >  > >  > >
> >  > >  > >  Why @vcpus_count is initialized to @smp_threads? it needs to  
> >  > > be  > > documented in the commit log.  
> >  > >  >
> >  > >  >
> >  > >  > Because every thread internally amounts to a vCPU in QOM and  
> >  > > which is  > in 1:1 relationship with KVM vCPU. AFAIK, QOM does not
> >  > > strictly  > follows any architecture. Once you start to get into
> >  > > details of  > threads there are many aspects of shared resources one
> >  > > will have to  > consider and these can vary across different
> >  > > implementations of  architecture.
> >  > >
> >  > >  For SPAPR CPU, the granularity of >possible_cpus->cpus[] is "core",
> >  > > and for  x86, it's "thread" granularity.  
> >  >
> >  >
> >  > We have threads per-core at microarchitecture level in ARM as well.
> >  > But each thread appears like a vCPU to OS and AFAICS there are no
> >  > special attributes attached to it. SMT can be enabled/disabled at
> >  > firmware and should get reflected in the configuration accordingly
> >  > i.e. value of *threads-per-core* changes between 1 and 'N'.  This
> >  > means 'vcpus_count' has to reflect the correct configuration. But I
> >  > think threads lack proper representation in Qemu QOM.  
> >  
> >  In topology related part, SMT (of x86) usually represents the logical
> >  processor level. And thread has the same meaning.  
> 
> 
> Agreed. It is same in ARM as well. The difference could be in how hardware
> threads are implemented at microarchitecture level.  Nevertheless, we do
> have such virtual configurations, and the meaning of *threads* as-in QOM
> topology (socket,cluster,core,thread) is virtualized similar to the hardware
> threads in host. And One should be able to configure threads support in the
> virtual environment,  regardless whether or not underlying hardware
> supports threads. That's my take.
> 
> Other aspect is how we then expose these threads to the guest. The guest
> kernel (just like host kernel) should gather topology information using
> ACPI PPTT Table (This is ARM specific?). 

Not ARM specific, but not used by x86 in practice (I believe some risc-v boards
use it).
https://lore.kernel.org/linux-riscv/20240617131425.7526-3-cuiyunhui@bytedance.com/

> Later is populated by the Qemu
> (just like by firmware for the host kernel) by making use of the virtual
> topology. ARM guest kernel, in absence of PPTT support can detect
> presence of hardware threads by reading MT Bit within the MPIDR_EL1
> register.

Sadly no it can't.  Lots of CPUs cores that are single thread set that
bit anyway (so it's garbage and PPTT / DT is the only source of truth)
https://lore.kernel.org/all/CAFfO_h7vUEhqV15epf+_zVrbDhc3JrejkkOVsHzHgCXNk+nDdg@mail.gmail.com/T/

Jonathan



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2024-09-11 11:35               ` Jonathan Cameron via
@ 2024-09-11 12:25                 ` Salil Mehta
  0 siblings, 0 replies; 105+ messages in thread
From: Salil Mehta @ 2024-09-11 12:25 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Salil Mehta, Zhao Liu, Gavin Shan, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com, maz@kernel.org,
	jean-philippe@linaro.org, lpieralisi@kernel.org,
	peter.maydell@linaro.org, richard.henderson@linaro.org,
	imammedo@redhat.com, andrew.jones@linux.dev, david@redhat.com,
	philmd@linaro.org, eric.auger@redhat.com, will@kernel.org,
	ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com,
	rafael@kernel.org, borntraeger@linux.ibm.com,
	alex.bennee@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
	linux@armlinux.org.uk, darren@os.amperecomputing.com,
	ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
	karl.heubaum@oracle.com, miguel.luis@oracle.com, zhukeqian,
	wangxiongfeng (C), wangyanan (Y), jiakernel2@gmail.com,
	maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
	Linuxarm

[-- Attachment #1: Type: text/plain, Size: 6173 bytes --]

On Wed, Sep 11, 2024 at 11:35 AM Jonathan Cameron <
Jonathan.Cameron@huawei.com> wrote:

> On Tue, 10 Sep 2024 12:01:05 +0100
> Salil Mehta <salil.mehta@huawei.com> wrote:
>
> > HI Zhao,
>
> A few trivial comments inline.
>
> >
> > >  From: Zhao Liu <zhao1.liu@intel.com>
> > >  Sent: Monday, September 9, 2024 4:28 PM
> > >  To: Salil Mehta <salil.mehta@huawei.com>
> > >
> > >  On Wed, Sep 04, 2024 at 05:37:21PM +0000, Salil Mehta wrote:
> > >  > Date: Wed, 4 Sep 2024 17:37:21 +0000
> > >  > From: Salil Mehta <salil.mehta@huawei.com>
> > >  > Subject: RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new
> ARMCPU
> > >  > {socket,cluster,core,thread}-id property
> > >  >
> > >  > Hi Zhao,
> > >  >
> > >  > >  From: zhao1.liu@intel.com <zhao1.liu@intel.com>
> > >  > >  Sent: Wednesday, September 4, 2024 3:43 PM
> > >  > >  To: Salil Mehta <salil.mehta@huawei.com>
> > >  > >
> > >  > >  Hi Salil,
> > >  > >
> > >  > >  On Mon, Aug 19, 2024 at 11:53:52AM +0000, Salil Mehta wrote:
> > >  > >  > Date: Mon, 19 Aug 2024 11:53:52 +0000  > From: Salil Mehta
> > >  > > <salil.mehta@huawei.com>  > Subject: RE: [PATCH RFC V3 01/29]
> > >  > > arm/virt,target/arm: Add new ARMCPU  >
> > >  > > {socket,cluster,core,thread}-id property
> > >  > >
> > >  > >  [snip]
> > >  > >
> > >  > >  > >  > NULL); @@ -2708,6 +2716,7 @@ static const CPUArchIdList
> > >
> > >  > > *virt_possible_cpu_arch_ids(MachineState *ms)
> > >  > >  > >  >   {
> > >  > >  > >  >       int n;
> > >  > >  > >  >       unsigned int max_cpus = ms->smp.max_cpus;
> > >  > >  > >  > +    unsigned int smp_threads = ms->smp.threads;
> > >  > >  > >  >       VirtMachineState *vms = VIRT_MACHINE(ms);
> > >  > >  > >  >       MachineClass *mc = MACHINE_GET_CLASS(vms);
> > >  > >  > >  >
> > >  > >  > >  > @@ -2721,6 +2730,7 @@ static const CPUArchIdList  > >
> > >  > > *virt_possible_cpu_arch_ids(MachineState *ms)
> > >  > >  > >  >       ms->possible_cpus->len = max_cpus;
> > >  > >  > >  >       for (n = 0; n < ms->possible_cpus->len; n++) {
> > >  > >  > >  >           ms->possible_cpus->cpus[n].type = ms->cpu_type;
> > >  > >  > >  > +        ms->possible_cpus->cpus[n].vcpus_count =
> smp_threads;
> > >  > >  > >  >           ms->possible_cpus->cpus[n].arch_id =
> > >  > >  > >  >               virt_cpu_mp_affinity(vms, n);
> > >  > >  > >  >
> > >  > >  > >
> > >  > >  > >  Why @vcpus_count is initialized to @smp_threads? it needs
> to
> > >  > > be  > > documented in the commit log.
> > >  > >  >
> > >  > >  >
> > >  > >  > Because every thread internally amounts to a vCPU in QOM and
> > >  > > which is  > in 1:1 relationship with KVM vCPU. AFAIK, QOM does not
> > >  > > strictly  > follows any architecture. Once you start to get into
> > >  > > details of  > threads there are many aspects of shared resources
> one
> > >  > > will have to  > consider and these can vary across different
> > >  > > implementations of  architecture.
> > >  > >
> > >  > >  For SPAPR CPU, the granularity of >possible_cpus->cpus[] is
> "core",
> > >  > > and for  x86, it's "thread" granularity.
> > >  >
> > >  >
> > >  > We have threads per-core at microarchitecture level in ARM as well.
> > >  > But each thread appears like a vCPU to OS and AFAICS there are no
> > >  > special attributes attached to it. SMT can be enabled/disabled at
> > >  > firmware and should get reflected in the configuration accordingly
> > >  > i.e. value of *threads-per-core* changes between 1 and 'N'.  This
> > >  > means 'vcpus_count' has to reflect the correct configuration. But I
> > >  > think threads lack proper representation in Qemu QOM.
> > >
> > >  In topology related part, SMT (of x86) usually represents the logical
> > >  processor level. And thread has the same meaning.
> >
> >
> > Agreed. It is same in ARM as well. The difference could be in how
> hardware
> > threads are implemented at microarchitecture level.  Nevertheless, we do
> > have such virtual configurations, and the meaning of *threads* as-in QOM
> > topology (socket,cluster,core,thread) is virtualized similar to the
> hardware
> > threads in host. And One should be able to configure threads support in
> the
> > virtual environment,  regardless whether or not underlying hardware
> > supports threads. That's my take.
> >
> > Other aspect is how we then expose these threads to the guest. The guest
> > kernel (just like host kernel) should gather topology information using
> > ACPI PPTT Table (This is ARM specific?).
>
> Not ARM specific, but not used by x86 in practice (I believe some risc-v
> boards
> use it).
>
> https://lore.kernel.org/linux-riscv/20240617131425.7526-3-cuiyunhui@bytedance.com/
>
> > Later is populated by the Qemu
> > (just like by firmware for the host kernel) by making use of the virtual
> > topology. ARM guest kernel, in absence of PPTT support can detect
> > presence of hardware threads by reading MT Bit within the MPIDR_EL1
> > register.
>
> Sadly no it can't.  Lots of CPUs cores that are single thread set that
> bit anyway (so it's garbage and PPTT / DT is the only source of truth)
>
> https://lore.kernel.org/all/CAFfO_h7vUEhqV15epf+_zVrbDhc3JrejkkOVsHzHgCXNk+nDdg@mail.gmail.com/T/



Yes, agreed, this last explanation was not completely correct.
IIRC, Marc did point out in RFC V1 of June 2020 that MT=0 is set by KVM
to tell the guest kernel that vCPUs with the same affinity-1 fields are
independent. The problem was with the interpretation of a non-zero MT bit,
which was not consistent. The key point is that we should not rely on the
value of the MT bit in MPIDR to know whether multithreading exists. So yes,
to know the exact status of multithreading on ARM systems, parsing the PPTT
table is the only way.

I mentioned it because the handling still exists in the kernel code, but I
think it exists to handle those other cases. Maybe a comment is required
here (?):

https://elixir.bootlin.com/linux/v6.11-rc7/source/arch/arm64/kernel/topology.c#L34

Thanks for pointing this out.


Thanks
Salil.




>
>
> Jonathan
>
>

[-- Attachment #2: Type: text/html, Size: 9423 bytes --]

^ permalink raw reply	[flat|nested] 105+ messages in thread

end of thread, other threads:[~2024-09-11 12:31 UTC | newest]

Thread overview: 105+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-13 23:36 [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 01/29] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property Salil Mehta via
2024-08-12  4:35   ` [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property Gavin Shan
2024-08-12  8:15     ` Igor Mammedov
2024-08-13  0:31       ` Gavin Shan
2024-08-19 12:07       ` Salil Mehta via
2024-08-19 11:53     ` Salil Mehta via
2024-09-04 14:42       ` zhao1.liu
2024-09-04 17:37         ` Salil Mehta via
2024-09-09 15:28           ` Zhao Liu
2024-09-10 11:01             ` Salil Mehta via
2024-09-11 11:35               ` Jonathan Cameron via
2024-09-11 12:25                 ` Salil Mehta
2024-06-13 23:36 ` [PATCH RFC V3 02/29] cpu-common: Add common CPU utility for possible vCPUs Salil Mehta via
2024-07-04  3:12   ` Nicholas Piggin
2024-08-12  4:59   ` Gavin Shan
2024-08-12  5:41     ` Re: " liu ping
2024-06-13 23:36 ` [PATCH RFC V3 03/29] hw/arm/virt: Limit number of possible vCPUs for unsupported Accel or GIC Type Salil Mehta via
2024-08-12  5:09   ` Gavin Shan
2024-06-13 23:36 ` [PATCH RFC V3 04/29] hw/arm/virt: Move setting of common CPU properties in a function Salil Mehta via
2024-08-12  5:19   ` Gavin Shan
2024-06-13 23:36 ` [PATCH RFC V3 05/29] arm/virt, target/arm: Machine init time change common to vCPU {cold|hot}-plug Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 06/29] arm/virt, kvm: Pre-create disabled possible vCPUs @machine init Salil Mehta via
2024-08-13  0:58   ` [PATCH RFC V3 06/29] arm/virt,kvm: " Gavin Shan
2024-08-19  5:31   ` Gavin Shan
2024-08-19 13:06     ` Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 07/29] arm/virt, gicv3: Changes to pre-size GIC with possible vcpus " Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 08/29] arm/virt: Init PMU at host for all possible vcpus Salil Mehta via
2024-07-04  3:07   ` Nicholas Piggin
2024-07-04 12:03     ` Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 09/29] arm/acpi: Enable ACPI support for vcpu hotplug Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 10/29] arm/virt: Add cpu hotplug events to GED during creation Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 11/29] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed Salil Mehta via
2024-08-13  1:04   ` Gavin Shan
2024-08-19 12:10     ` Salil Mehta via
2024-08-20  0:22       ` Gavin Shan
2024-08-20 17:10         ` Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 12/29] arm/virt/acpi: Build CPUs AML with CPU Hotplug support Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 13/29] arm/virt: Make ARM vCPU *present* status ACPI *persistent* Salil Mehta via
2024-07-04  2:49   ` Nicholas Piggin
2024-07-04 11:23     ` Salil Mehta via
2024-07-05  0:08       ` Nicholas Piggin
2024-06-13 23:36 ` [PATCH RFC V3 14/29] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} Bits to Guest Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 15/29] hw/arm: MADT Tbl change to size the guest with possible vCPUs Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 16/29] hw/acpi: Make _MAT method optional Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 17/29] arm/virt: Release objects for *disabled* possible vCPUs after init Salil Mehta via
2024-08-13  1:17   ` Gavin Shan
2024-08-19 12:21     ` Salil Mehta via
2024-08-20  0:05       ` Gavin Shan
2024-08-20 16:40         ` Salil Mehta via
2024-08-21  6:25           ` Gavin Shan
2024-08-21 10:23             ` Salil Mehta via
2024-08-21 13:32               ` Gavin Shan
2024-08-22 10:58                 ` Salil Mehta via
2024-08-23 10:52                   ` Gavin Shan
2024-08-23 13:17                     ` Salil Mehta via
2024-08-24 10:03                       ` Gavin Shan
2024-06-13 23:36 ` [PATCH RFC V3 18/29] arm/virt: Add/update basic hot-(un)plug framework Salil Mehta via
2024-08-13  1:21   ` Gavin Shan
2024-08-19 12:30     ` Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 19/29] arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 20/29] hw/arm, gicv3: Changes to update GIC with vCPU hot-plug notification Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 21/29] hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 22/29] arm/virt: Update the guest(via GED) about CPU hot-(un)plug events Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 23/29] hw/arm: Changes required for reset and to support next boot Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 24/29] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
2024-08-16 15:37   ` Alex Bennée
2024-08-16 15:50     ` Peter Maydell
2024-08-16 17:00       ` Peter Maydell
2024-08-19 12:59         ` Salil Mehta via
2024-08-19 13:43           ` Peter Maydell
2024-08-19 12:58       ` Salil Mehta via
2024-08-19 13:46         ` Peter Maydell
2024-08-20 15:34           ` Salil Mehta via
2024-08-19 12:35     ` Salil Mehta via
2024-08-28 20:23       ` Gustavo Romero
2024-09-04 13:53         ` Salil Mehta via
2024-06-13 23:36 ` [PATCH RFC V3 25/29] target/arm/kvm: Write CPU state back to KVM on reset Salil Mehta via
2024-07-04  3:27   ` Nicholas Piggin
2024-07-04 12:27     ` Salil Mehta via
2024-06-14  0:15 ` [PATCH RFC V3 26/29] target/arm/kvm, tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu Salil Mehta via
2024-06-14  0:18 ` [PATCH RFC V3 27/29] hw/arm: Support hotplug capability check using _OSC method Salil Mehta via
2024-06-14  0:19 ` [PATCH RFC V3 28/29] tcg/mttcg: enable threads to unregister in tcg_ctxs[] Salil Mehta via
2024-06-14  0:20 ` [PATCH RFC V3 29/29] hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled Salil Mehta via
2024-06-26  9:53 ` [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch Vishnu Pajjuri
2024-06-26 18:01   ` Salil Mehta via
2024-07-01 11:38 ` Miguel Luis
2024-07-01 16:30   ` Salil Mehta via
2024-08-07  9:53 ` Gavin Shan
2024-08-07 13:27   ` Salil Mehta via
2024-08-07 16:07     ` Salil Mehta via
2024-08-08  5:00       ` Gavin Shan
2024-08-07 23:41     ` Gavin Shan
2024-08-07 23:48       ` Salil Mehta via
2024-08-08  0:29         ` Gavin Shan
2024-08-08  4:15           ` Gavin Shan
2024-08-08  8:39             ` Salil Mehta via
2024-08-08  8:36           ` Salil Mehta via
2024-08-28 20:35 ` Gustavo Romero
2024-08-29  9:59   ` Alex Bennée
2024-09-04 14:24     ` Salil Mehta via
2024-09-04 15:45       ` Alex Bennée
2024-09-04 15:59         ` Salil Mehta via
2024-09-06 15:06           ` Salil Mehta via
2024-09-04 14:03   ` Salil Mehta via

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).