[PATCH v2 0/3] hvf: arm: Support creating VMs with 64+GB of RAM on macOS 15+


* [PATCH v2 0/3] hvf: arm: Support creating VMs with 64+GB of RAM on macOS 15+
@ 2024-08-28 11:15 Danny Canter
  2024-08-28 11:15 ` [PATCH v2 1/3] hw/boards: Add hvf_get_physical_address_range to MachineClass Danny Canter
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Danny Canter @ 2024-08-28 11:15 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: dirty, rbolshakov, agraf, peter.maydell, pbonzini,
	richard.henderson, eduardo, mst, marcel.apfelbaum, philmd,
	wangyanan55, zhao1.liu, danny_canter

This patchset's focus is on lighting up the ability to create VMs with 64+GB
of RAM by using some new APIs introduced in macOS 13. Due to the IPA sizes
supported in macOS, the first version on which we can properly support this is
macOS 15: there (if the hardware also supports it) the kernel adds support for
a 40b IPA size, which is the first valid ARM PARange value after 36b, so we
can also advertise this properly to the guest in id_aa64mmfr0_el1.
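
To illustrate the PARange point above: ID_AA64MMFR0_EL1.PARange can only
encode a fixed set of physical address widths, so an IPA size has to be
rounded down to one of them before it can be advertised to the guest. The
sketch below is illustrative only and is not code from this series; the names
parange_bits and parange_round_down are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PA widths named by ID_AA64MMFR0_EL1.PARange encodings 0..6. */
static const uint8_t parange_bits[] = { 32, 36, 40, 42, 44, 48, 52 };

/* Hypothetical helper: largest PARange width that is <= ipa_bits. */
static uint8_t parange_round_down(uint8_t ipa_bits)
{
    size_t n = sizeof(parange_bits) / sizeof(parange_bits[0]);

    /* Anything below the smallest encodable width cannot be expressed. */
    assert(ipa_bits >= parange_bits[0]);

    /* Walk from the widest entry down to the first one that fits. */
    for (size_t i = n; i-- > 0;) {
        if (parange_bits[i] <= ipa_bits) {
            return parange_bits[i];
        }
    }
    return parange_bits[0]; /* not reached given the assert above */
}
```

With this table a 39-bit IPA size rounds down to 36 bits, while a 40-bit IPA
size is representable as is.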

Today if you asked for a > 64GB VM you'd be met with a pretty unwieldy
HV_BAD_ARGUMENT. On machines without 40b IPA support this patchset also
improves this, and the message mirrors the kvm_type error you'd get on ARM:

"qemu-system-aarch64: -accel hvf: Addressing limited to 36 bits, but memory
exceeds it by 18253611008 bytes"

Changes from V1 to V2 (Thanks Peter for review!):

- Added a new function pointer to MachineClass to be able to freeze the memory
map and compute the highest guest physical address. We use this to inform VM
creation on what IPA size we should ask the kernel for. This is very similar to
what ARM's kvm_type() does.

- Fixed redundant loop in `round_down_to_parange_bit_size`

- Moved the splitting up of hv_vm_create logic per platform to a separate patch.
This is mostly for readability.

Danny Canter (3):
  hw/boards: Add hvf_get_physical_address_range to MachineClass
  hvf: Split up hv_vm_create logic per arch
  hvf: arm: Allow creating VMs with 64+GB of RAM on macOS 15+

 accel/hvf/hvf-accel-ops.c | 16 +++++++---
 hw/arm/virt.c             | 42 ++++++++++++++++++++++++-
 hw/i386/x86.c             |  2 ++
 include/hw/boards.h       |  5 +++
 include/sysemu/hvf_int.h  |  1 +
 target/arm/hvf/hvf.c      | 66 +++++++++++++++++++++++++++++++++++++++
 target/arm/hvf_arm.h      |  3 ++
 target/arm/internals.h    | 19 +++++++++++
 target/arm/ptw.c          | 15 +++++++++
 target/i386/hvf/hvf.c     |  5 +++
 10 files changed, 168 insertions(+), 6 deletions(-)

-- 
2.39.5 (Apple Git-154)



* [PATCH v2 1/3] hw/boards: Add hvf_get_physical_address_range to MachineClass
  2024-08-28 11:15 [PATCH v2 0/3] hvf: arm: Support creating VMs with 64+GB of RAM on macOS 15+ Danny Canter
@ 2024-08-28 11:15 ` Danny Canter
  2024-09-06 15:30   ` Peter Maydell
  2024-08-28 11:15 ` [PATCH v2 2/3] hvf: Split up hv_vm_create logic per arch Danny Canter
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Danny Canter @ 2024-08-28 11:15 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: dirty, rbolshakov, agraf, peter.maydell, pbonzini,
	richard.henderson, eduardo, mst, marcel.apfelbaum, philmd,
	wangyanan55, zhao1.liu, danny_canter

This addition will be necessary for some HVF related work to follow.
For HVF on ARM there exists a set of APIs in macOS 13 to be able to
adjust the IPA size for a given VM. This is useful as by default HVF
uses 36 bits as the IPA size, so to support guests with > 64GB of RAM
we'll need to reach for this.

To have all the info necessary to carry this out however, we need some
plumbing to be able to grab the memory map and compute the highest GPA
prior to creating the VM. This is almost exactly like what kvm_type is
used for on ARM today, and is also what this will be used for. We will
compute the highest GPA and find what IPA size we'd need to satisfy this,
and if it's valid (macOS today caps at 40b) we'll set this to be the IPA
size in coming patches. This new method is only needed (today at least)
on ARM, and obviously only for HVF/macOS, so admittedly it is much less
generic than kvm_type today, but it seemed a somewhat sane way to get
the information we need from the memmap at VM creation time.
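
As a rough sketch of the computation described above (not the code from this
series), the step from "highest GPA" to "IPA size to request" could look like
the following. The helper names bits_for_gpa and pick_ipa_bits are
hypothetical, and rounding any above-default request up to the maximum is
only one possible policy, assumed here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of address bits needed to represent highest_gpa (inclusive). */
static int bits_for_gpa(uint64_t highest_gpa)
{
    int bits = 0;

    while (highest_gpa != 0) {
        bits++;
        highest_gpa >>= 1;
    }
    return bits;
}

/*
 * Hypothetical policy: use the default IPA size when it is enough,
 * otherwise the maximum, and return -1 when even the maximum is too small.
 */
static int pick_ipa_bits(uint64_t highest_gpa, int default_bits, int max_bits)
{
    int needed = bits_for_gpa(highest_gpa);

    assert(default_bits <= max_bits);
    if (needed <= default_bits) {
        return default_bits;
    }
    if (needed <= max_bits) {
        return max_bits;
    }
    return -1;
}
```

For example, with a 36-bit default and a 40-bit maximum, a highest GPA of
2^36 - 1 still fits the default, a highest GPA of 2^36 needs 37 bits and so
selects 40, and a highest GPA of 2^40 is rejected.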

Signed-off-by: Danny Canter <danny_canter@apple.com>
---
 hw/arm/virt.c       | 9 ++++++++-
 hw/i386/x86.c       | 2 ++
 include/hw/boards.h | 5 +++++
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 687fe0bb8b..62ee5f849b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2107,7 +2107,8 @@ static void machvirt_init(MachineState *machine)
 
     /*
      * In accelerated mode, the memory map is computed earlier in kvm_type()
-     * to create a VM with the right number of IPA bits.
+     * for Linux, or hvf_get_physical_address_range() for macOS to create a
+     * VM with the right number of IPA bits.
      */
     if (!vms->memmap) {
         Object *cpuobj;
@@ -3027,6 +3028,11 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
     return fixed_ipa ? 0 : requested_pa_size;
 }
 
+static int virt_hvf_get_physical_address_range(MachineState *ms)
+{
+    return 0;
+}
+
 static void virt_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -3086,6 +3092,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->valid_cpu_types = valid_cpu_types;
     mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
     mc->kvm_type = virt_kvm_type;
+    mc->hvf_get_physical_address_range = virt_hvf_get_physical_address_range;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
     hc->pre_plug = virt_machine_device_pre_plug_cb;
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 01fc5e6562..fa7a0f6b98 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -382,6 +382,8 @@ static void x86_machine_class_init(ObjectClass *oc, void *data)
     mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
     mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
     mc->kvm_type = x86_kvm_type;
+    /* Not needed for x86 */
+    mc->hvf_get_physical_address_range = NULL;
     x86mc->save_tsc_khz = true;
     x86mc->fwcfg_dma_enabled = true;
     nc->nmi_monitor_handler = x86_nmi;
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 48ff6d8b93..bfc7cc7f90 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -215,6 +215,10 @@ typedef struct {
  *    Return the type of KVM corresponding to the kvm-type string option or
  *    computed based on other criteria such as the host kernel capabilities.
  *    kvm-type may be NULL if it is not needed.
+ * @hvf_get_physical_address_range:
+ *    Returns the physical address range in bits to use for the HVF virtual
+ *    machine based on the current board's memory map. This may be NULL if it
+ *    is not needed.
  * @numa_mem_supported:
  *    true if '--numa node.mem' option is supported and false otherwise
  * @hotplug_allowed:
@@ -256,6 +260,7 @@ struct MachineClass {
     void (*reset)(MachineState *state, ShutdownCause reason);
     void (*wakeup)(MachineState *state);
     int (*kvm_type)(MachineState *machine, const char *arg);
+    int (*hvf_get_physical_address_range)(MachineState *machine);
 
     BlockInterfaceType block_default_type;
     int units_per_default_bus;
-- 
2.39.5 (Apple Git-154)




* Re: [PATCH v2 1/3] hw/boards: Add hvf_get_physical_address_range to MachineClass
  2024-08-28 11:15 ` [PATCH v2 1/3] hw/boards: Add hvf_get_physical_address_range to MachineClass Danny Canter
@ 2024-09-06 15:30   ` Peter Maydell
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Maydell @ 2024-09-06 15:30 UTC (permalink / raw)
  To: Danny Canter
  Cc: qemu-devel, qemu-arm, dirty, rbolshakov, agraf, pbonzini,
	richard.henderson, eduardo, mst, marcel.apfelbaum, philmd,
	wangyanan55, zhao1.liu

On Wed, 28 Aug 2024 at 12:16, Danny Canter <danny_canter@apple.com> wrote:
>
> This addition will be necessary for some HVF related work to follow.
> For HVF on ARM there exists a set of APIs in macOS 13 to be able to
> adjust the IPA size for a given VM. This is useful as by default HVF
> uses 36 bits as the IPA size, so to support guests with > 64GB of RAM
> we'll need to reach for this.
>
> To have all the info necessary to carry this out however, we need some
> plumbing to be able to grab the memory map and compute the highest GPA
> prior to creating the VM. This is almost exactly like what kvm_type is
> used for on ARM today, and is also what this will be used for. We will
> compute the highest GPA and find what IPA size we'd need to satisfy this,
> and if it's valid (macOS today caps at 40b) we'll set this to be the IPA
> size in coming patches. This new method is only needed (today at least)
> on ARM, and obviously only for HVF/macOS, so admittedly it is much less
> generic than kvm_type today, but it seemed a somewhat sane way to get
> the information we need from the memmap at VM creation time.
>
> Signed-off-by: Danny Canter <danny_canter@apple.com>

> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 01fc5e6562..fa7a0f6b98 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -382,6 +382,8 @@ static void x86_machine_class_init(ObjectClass *oc, void *data)
>      mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
>      mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
>      mc->kvm_type = x86_kvm_type;
> +    /* Not needed for x86 */
> +    mc->hvf_get_physical_address_range = NULL;
>      x86mc->save_tsc_khz = true;
>      x86mc->fwcfg_dma_enabled = true;
>      nc->nmi_monitor_handler = x86_nmi;

We guarantee that object and class structs are zero-initialized,
so we don't need to explicitly set this field to NULL.

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* [PATCH v2 2/3] hvf: Split up hv_vm_create logic per arch
  2024-08-28 11:15 [PATCH v2 0/3] hvf: arm: Support creating VMs with 64+GB of RAM on macOS 15+ Danny Canter
  2024-08-28 11:15 ` [PATCH v2 1/3] hw/boards: Add hvf_get_physical_address_range to MachineClass Danny Canter
@ 2024-08-28 11:15 ` Danny Canter
  2024-09-06 15:31   ` Peter Maydell
  2024-08-28 11:15 ` [PATCH v2 3/3] hvf: arm: Implement and use hvf_get_physical_address_range Danny Canter
  2024-09-06 15:32 ` [PATCH v2 0/3] hvf: arm: Support creating VMs with 64+GB of RAM on macOS 15+ Peter Maydell
  3 siblings, 1 reply; 12+ messages in thread
From: Danny Canter @ 2024-08-28 11:15 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: dirty, rbolshakov, agraf, peter.maydell, pbonzini,
	richard.henderson, eduardo, mst, marcel.apfelbaum, philmd,
	wangyanan55, zhao1.liu, danny_canter

This is preliminary work to split up hv_vm_create
logic per platform so we can support creating VMs
with > 64GB of RAM on Apple Silicon machines. This
is done via ARM HVF's hv_vm_config_create() (and
other APIs that modify this config that will be
coming in future patches). This should have no
behavioral difference at all as hv_vm_config_create()
just assigns the same default values as if you just
passed NULL to the function.

Signed-off-by: Danny Canter <danny_canter@apple.com>
---
 accel/hvf/hvf-accel-ops.c | 6 +-----
 include/sysemu/hvf_int.h  | 1 +
 target/arm/hvf/hvf.c      | 9 +++++++++
 target/i386/hvf/hvf.c     | 5 +++++
 4 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/accel/hvf/hvf-accel-ops.c b/accel/hvf/hvf-accel-ops.c
index ac08cfb9f3..dbebf209f4 100644
--- a/accel/hvf/hvf-accel-ops.c
+++ b/accel/hvf/hvf-accel-ops.c
@@ -61,10 +61,6 @@
 
 HVFState *hvf_state;
 
-#ifdef __aarch64__
-#define HV_VM_DEFAULT NULL
-#endif
-
 /* Memory slots */
 
 hvf_slot *hvf_find_overlap_slot(uint64_t start, uint64_t size)
@@ -324,7 +320,7 @@ static int hvf_accel_init(MachineState *ms)
     hv_return_t ret;
     HVFState *s;
 
-    ret = hv_vm_create(HV_VM_DEFAULT);
+    ret = hvf_arch_vm_create(ms, 0);
     assert_hvf_ok(ret);
 
     s = g_new0(HVFState, 1);
diff --git a/include/sysemu/hvf_int.h b/include/sysemu/hvf_int.h
index 5b28d17ba1..42ae18433f 100644
--- a/include/sysemu/hvf_int.h
+++ b/include/sysemu/hvf_int.h
@@ -65,6 +65,7 @@ void assert_hvf_ok_impl(hv_return_t ret, const char *file, unsigned int line,
 #define assert_hvf_ok(EX) assert_hvf_ok_impl((EX), __FILE__, __LINE__, #EX)
 const char *hvf_return_string(hv_return_t ret);
 int hvf_arch_init(void);
+hv_return_t hvf_arch_vm_create(MachineState *ms, uint32_t pa_range);
 int hvf_arch_init_vcpu(CPUState *cpu);
 void hvf_arch_vcpu_destroy(CPUState *cpu);
 int hvf_vcpu_exec(CPUState *);
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index ace83671b5..19964d241e 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -929,6 +929,15 @@ void hvf_arch_vcpu_destroy(CPUState *cpu)
 {
 }
 
+hv_return_t hvf_arch_vm_create(MachineState *ms, uint32_t pa_range)
+{
+    hv_vm_config_t config = hv_vm_config_create();
+    hv_return_t ret = hv_vm_create(config);
+    os_release(config);
+
+    return ret;
+}
+
 int hvf_arch_init_vcpu(CPUState *cpu)
 {
     ARMCPU *arm_cpu = ARM_CPU(cpu);
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index c9c64e2978..68dc5d9cf7 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -223,6 +223,11 @@ int hvf_arch_init(void)
     return 0;
 }
 
+hv_return_t hvf_arch_vm_create(MachineState *ms, uint32_t pa_range)
+{
+    return hv_vm_create(HV_VM_DEFAULT);
+}
+
 int hvf_arch_init_vcpu(CPUState *cpu)
 {
     X86CPU *x86cpu = X86_CPU(cpu);
-- 
2.39.5 (Apple Git-154)




* Re: [PATCH v2 2/3] hvf: Split up hv_vm_create logic per arch
  2024-08-28 11:15 ` [PATCH v2 2/3] hvf: Split up hv_vm_create logic per arch Danny Canter
@ 2024-09-06 15:31   ` Peter Maydell
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Maydell @ 2024-09-06 15:31 UTC (permalink / raw)
  To: Danny Canter
  Cc: qemu-devel, qemu-arm, dirty, rbolshakov, agraf, pbonzini,
	richard.henderson, eduardo, mst, marcel.apfelbaum, philmd,
	wangyanan55, zhao1.liu

On Wed, 28 Aug 2024 at 12:16, Danny Canter <danny_canter@apple.com> wrote:
>
> This is preliminary work to split up hv_vm_create
> logic per platform so we can support creating VMs
> with > 64GB of RAM on Apple Silicon machines. This
> is done via ARM HVF's hv_vm_config_create() (and
> other APIs that modify this config that will be
> coming in future patches). This should have no
> behavioral difference at all as hv_vm_config_create()
> just assigns the same default values as if you just
> passed NULL to the function.
>
> Signed-off-by: Danny Canter <danny_canter@apple.com>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* [PATCH v2 3/3] hvf: arm: Implement and use hvf_get_physical_address_range
  2024-08-28 11:15 [PATCH v2 0/3] hvf: arm: Support creating VMs with 64+GB of RAM on macOS 15+ Danny Canter
  2024-08-28 11:15 ` [PATCH v2 1/3] hw/boards: Add hvf_get_physical_address_range to MachineClass Danny Canter
  2024-08-28 11:15 ` [PATCH v2 2/3] hvf: Split up hv_vm_create logic per arch Danny Canter
@ 2024-08-28 11:15 ` Danny Canter
  2024-09-06 15:31   ` Peter Maydell
  2025-02-10 17:26   ` Philippe Mathieu-Daudé
  2024-09-06 15:32 ` [PATCH v2 0/3] hvf: arm: Support creating VMs with 64+GB of RAM on macOS 15+ Peter Maydell
  3 siblings, 2 replies; 12+ messages in thread
From: Danny Canter @ 2024-08-28 11:15 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: dirty, rbolshakov, agraf, peter.maydell, pbonzini,
	richard.henderson, eduardo, mst, marcel.apfelbaum, philmd,
	wangyanan55, zhao1.liu, danny_canter

This patch's main focus is to use the previously added
hvf_get_physical_address_range to inform VM creation
about the IPA size we need for the VM, so we can extend
the default 36b IPA size and support VMs with 64+GB of
RAM. This is done by freezing the memory map, computing
the highest GPA and then (depending on if the platform
supports an IPA size that large) telling the kernel to
use a size >= that for the VM. In pursuit of this a couple of
things related to how we handle the physical address range
we expose to guests were altered, so here is an explanation
of what we were doing:

Until now, to get the IPA size we were reading id_aa64mmfr0_el1's
PARange field from a newly made vcpu. Unfortunately, HVF just
returns the host's PARange directly for the initial value and
not the IPA size that will actually back the VM, so today we
appear to believe we have much more address space than we actually do.

Starting in macOS 13.0 some APIs were introduced to be able to
query the maximum IPA size the kernel supports, and to set the IPA
size for a given VM. However, this still has a couple of issues
on < macOS 15. Up until macOS 15 (and if the hardware supported
it) the max IPA size was 39 bits which is not a valid PARange
value, so we can't clamp down what we advertise in the vcpu's
id_aa64mmfr0_el1 to our IPA size. Starting in macOS 15 however,
the maximum IPA size is 40 bits (if it's supported in the hardware
as well) which is also a valid PARange value so we can set our IPA
size to the maximum as well as clamp down the PARange we advertise
to the guest. This allows VMs with 64+ GB of RAM and should fix the
oddness of the PARange situation as well.
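
For readers unfamiliar with the register layout, the adjustment of the
advertised PARange amounts to rewriting a 4-bit field. The sketch below is
illustrative rather than the patch's code: PARANGE_MASK is a local stand-in
for the field mask, and set_parange is a hypothetical helper. It relies only
on PARange occupying bits [3:0] of ID_AA64MMFR0_EL1 and on encodings 0..6
being the defined address widths.

```c
#include <assert.h>
#include <stdint.h>

/* ID_AA64MMFR0_EL1.PARange occupies bits [3:0]. */
#define PARANGE_MASK 0xfULL

/* Hypothetical helper: replace the PARange field, keep all other bits. */
static uint64_t set_parange(uint64_t id_aa64mmfr0, uint8_t parange_index)
{
    /* Encodings 0..6 map to 32, 36, 40, 42, 44, 48 and 52 bits. */
    assert(parange_index <= 6);
    return (id_aa64mmfr0 & ~PARANGE_MASK) | parange_index;
}
```

A register value whose low nibble is 5 (48 bits) becomes one whose low nibble
is 2 (40 bits) when called with index 2; every other field is left untouched.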

Signed-off-by: Danny Canter <danny_canter@apple.com>
---
 accel/hvf/hvf-accel-ops.c | 12 ++++++++-
 hw/arm/virt.c             | 31 +++++++++++++++++++++-
 target/arm/hvf/hvf.c      | 56 ++++++++++++++++++++++++++++++++++++++-
 target/arm/hvf_arm.h      | 19 +++++++++++++
 target/arm/internals.h    | 19 +++++++++++++
 target/arm/ptw.c          | 15 +++++++++++
 6 files changed, 149 insertions(+), 3 deletions(-)

diff --git a/accel/hvf/hvf-accel-ops.c b/accel/hvf/hvf-accel-ops.c
index dbebf209f4..d60874d3e6 100644
--- a/accel/hvf/hvf-accel-ops.c
+++ b/accel/hvf/hvf-accel-ops.c
@@ -53,6 +53,7 @@
 #include "exec/address-spaces.h"
 #include "exec/exec-all.h"
 #include "gdbstub/enums.h"
+#include "hw/boards.h"
 #include "sysemu/cpus.h"
 #include "sysemu/hvf.h"
 #include "sysemu/hvf_int.h"
@@ -319,8 +320,17 @@ static int hvf_accel_init(MachineState *ms)
     int x;
     hv_return_t ret;
     HVFState *s;
+    int pa_range = 36;
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
 
-    ret = hvf_arch_vm_create(ms, 0);
+    if (mc->hvf_get_physical_address_range) {
+        pa_range = mc->hvf_get_physical_address_range(ms);
+        if (pa_range < 0) {
+            return -EINVAL;
+        }
+    }
+
+    ret = hvf_arch_vm_create(ms, (uint32_t)pa_range);
     assert_hvf_ok(ret);
 
     s = g_new0(HVFState, 1);
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 62ee5f849b..b39c7924a0 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -66,6 +66,7 @@
 #include "hw/intc/arm_gicv3_its_common.h"
 #include "hw/irq.h"
 #include "kvm_arm.h"
+#include "hvf_arm.h"
 #include "hw/firmware/smbios.h"
 #include "qapi/visitor.h"
 #include "qapi/qapi-visit-common.h"
@@ -3030,7 +3031,35 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
 
 static int virt_hvf_get_physical_address_range(MachineState *ms)
 {
-    return 0;
+    VirtMachineState *vms = VIRT_MACHINE(ms);
+
+    int default_ipa_size = hvf_arm_get_default_ipa_bit_size();
+    int max_ipa_size = hvf_arm_get_max_ipa_bit_size();
+
+    /* We freeze the memory map to compute the highest gpa */
+    virt_set_memmap(vms, max_ipa_size);
+
+    int requested_ipa_size = 64 - clz64(vms->highest_gpa);
+
+    /*
+     * If we're <= the default IPA size just use the default.
+     * If we're above the default but below the maximum, round up to
+     * the maximum. hvf_arm_get_max_ipa_bit_size() conveniently only
+     * returns values that are valid ARM PARange values.
+     */
+    if (requested_ipa_size <= default_ipa_size) {
+        requested_ipa_size = default_ipa_size;
+    } else if (requested_ipa_size <= max_ipa_size) {
+        requested_ipa_size = max_ipa_size;
+    } else {
+        error_report("-m and ,maxmem option values "
+                     "require an IPA range (%d bits) larger than "
+                     "the one supported by the host (%d bits)",
+                     requested_ipa_size, max_ipa_size);
+        return -1;
+    }
+
+    return requested_ipa_size;
 }
 
 static void virt_machine_class_init(ObjectClass *oc, void *data)
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index 19964d241e..6cea483d42 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -22,6 +22,7 @@
 #include <mach/mach_time.h>
 
 #include "exec/address-spaces.h"
+#include "hw/boards.h"
 #include "hw/irq.h"
 #include "qemu/main-loop.h"
 #include "sysemu/cpus.h"
@@ -297,6 +298,8 @@ void hvf_arm_init_debug(void)
 
 static void hvf_wfi(CPUState *cpu);
 
+static uint32_t chosen_ipa_bit_size;
+
 typedef struct HVFVTimer {
     /* Vtimer value during migration and paused state */
     uint64_t vtimer_val;
@@ -839,6 +842,16 @@ static uint64_t hvf_get_reg(CPUState *cpu, int rt)
     return val;
 }
 
+static void clamp_id_aa64mmfr0_parange_to_ipa_size(uint64_t *id_aa64mmfr0)
+{
+    uint32_t ipa_size = chosen_ipa_bit_size ?
+            chosen_ipa_bit_size : hvf_arm_get_max_ipa_bit_size();
+
+    /* Clamp down the PARange to the IPA size the kernel supports. */
+    uint8_t index = round_down_to_parange_index(ipa_size);
+    *id_aa64mmfr0 = (*id_aa64mmfr0 & ~R_ID_AA64MMFR0_PARANGE_MASK) | index;
+}
+
 static bool hvf_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
 {
     ARMISARegisters host_isar = {};
@@ -882,6 +895,8 @@ static bool hvf_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
     r |= hv_vcpu_get_sys_reg(fd, HV_SYS_REG_MIDR_EL1, &ahcf->midr);
     r |= hv_vcpu_destroy(fd);
 
+    clamp_id_aa64mmfr0_parange_to_ipa_size(&host_isar.id_aa64mmfr0);
+
     ahcf->isar = host_isar;
 
     /*
@@ -904,6 +919,30 @@ static bool hvf_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
     return r == HV_SUCCESS;
 }
 
+uint32_t hvf_arm_get_default_ipa_bit_size(void)
+{
+    uint32_t default_ipa_size;
+    hv_return_t ret = hv_vm_config_get_default_ipa_size(&default_ipa_size);
+    assert_hvf_ok(ret);
+
+    return default_ipa_size;
+}
+
+uint32_t hvf_arm_get_max_ipa_bit_size(void)
+{
+    uint32_t max_ipa_size;
+    hv_return_t ret = hv_vm_config_get_max_ipa_size(&max_ipa_size);
+    assert_hvf_ok(ret);
+
+    /*
+     * We clamp any IPA size we want to back the VM with to a valid PARange
+     * value so the guest doesn't try and map memory outside of the valid range.
+     * This logic just clamps the passed in IPA bit size to the first valid
+     * PARange value <= to it.
+     */
+    return round_down_to_parange_bit_size(max_ipa_size);
+}
+
 void hvf_arm_set_cpu_features_from_host(ARMCPU *cpu)
 {
     if (!arm_host_cpu_features.dtb_compatible) {
@@ -931,8 +970,18 @@ void hvf_arch_vcpu_destroy(CPUState *cpu)
 
 hv_return_t hvf_arch_vm_create(MachineState *ms, uint32_t pa_range)
 {
+    hv_return_t ret;
     hv_vm_config_t config = hv_vm_config_create();
-    hv_return_t ret = hv_vm_create(config);
+
+    ret = hv_vm_config_set_ipa_size(config, pa_range);
+    if (ret != HV_SUCCESS) {
+        goto cleanup;
+    }
+    chosen_ipa_bit_size = pa_range;
+
+    ret = hv_vm_create(config);
+
+cleanup:
     os_release(config);
 
     return ret;
@@ -1004,6 +1053,11 @@ int hvf_arch_init_vcpu(CPUState *cpu)
                               &arm_cpu->isar.id_aa64mmfr0);
     assert_hvf_ok(ret);
 
+    clamp_id_aa64mmfr0_parange_to_ipa_size(&arm_cpu->isar.id_aa64mmfr0);
+    ret = hv_vcpu_set_sys_reg(cpu->accel->fd, HV_SYS_REG_ID_AA64MMFR0_EL1,
+                              arm_cpu->isar.id_aa64mmfr0);
+    assert_hvf_ok(ret);
+
     return 0;
 }
 
diff --git a/target/arm/hvf_arm.h b/target/arm/hvf_arm.h
index e848c1d27d..26c717b382 100644
--- a/target/arm/hvf_arm.h
+++ b/target/arm/hvf_arm.h
@@ -22,4 +22,23 @@ void hvf_arm_init_debug(void);
 
 void hvf_arm_set_cpu_features_from_host(ARMCPU *cpu);
 
+#ifdef CONFIG_HVF
+
+uint32_t hvf_arm_get_default_ipa_bit_size(void);
+uint32_t hvf_arm_get_max_ipa_bit_size(void);
+
+#else
+
+static inline uint32_t hvf_arm_get_default_ipa_bit_size(void)
+{
+    return 0;
+}
+
+static inline uint32_t hvf_arm_get_max_ipa_bit_size(void)
+{
+    return 0;
+}
+
+#endif
+
 #endif
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 203a2dae14..c5d7b0b492 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -450,6 +450,25 @@ static inline void update_spsel(CPUARMState *env, uint32_t imm)
  */
 unsigned int arm_pamax(ARMCPU *cpu);
 
+/*
+ * round_down_to_parange_index
+ * @bit_size: uint8_t
+ *
+ * Rounds down the bit_size supplied to the first supported ARM physical
+ * address range and returns the index for this. The index is intended to
+ * be used to set ID_AA64MMFR0_EL1's PARANGE bits.
+ */
+uint8_t round_down_to_parange_index(uint8_t bit_size);
+
+/*
+ * round_down_to_parange_bit_size
+ * @bit_size: uint8_t
+ *
+ * Rounds down the bit_size supplied to the first supported ARM physical
+ * address range bit size and returns this.
+ */
+uint8_t round_down_to_parange_bit_size(uint8_t bit_size);
+
 /* Return true if extended addresses are enabled.
  * This is always the case if our translation regime is 64 bit,
  * but depends on TTBCR.EAE for 32 bit.
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 278004661b..defd6b84de 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -96,6 +96,21 @@ static const uint8_t pamax_map[] = {
     [6] = 52,
 };
 
+uint8_t round_down_to_parange_index(uint8_t bit_size)
+{
+    for (int i = ARRAY_SIZE(pamax_map) - 1; i >= 0; i--) {
+        if (pamax_map[i] <= bit_size) {
+            return i;
+        }
+    }
+    g_assert_not_reached();
+}
+
+uint8_t round_down_to_parange_bit_size(uint8_t bit_size)
+{
+    return pamax_map[round_down_to_parange_index(bit_size)];
+}
+
 /*
  * The cpu-specific constant value of PAMax; also used by hw/arm/virt.
  * Note that machvirt_init calls this on a CPU that is inited but not realized!
-- 
2.39.5 (Apple Git-154)




* Re: [PATCH v2 3/3] hvf: arm: Implement and use hvf_get_physical_address_range
  2024-08-28 11:15 ` [PATCH v2 3/3] hvf: arm: Implement and use hvf_get_physical_address_range Danny Canter
@ 2024-09-06 15:31   ` Peter Maydell
  2025-02-10 17:26   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 12+ messages in thread
From: Peter Maydell @ 2024-09-06 15:31 UTC (permalink / raw)
  To: Danny Canter
  Cc: qemu-devel, qemu-arm, dirty, rbolshakov, agraf, pbonzini,
	richard.henderson, eduardo, mst, marcel.apfelbaum, philmd,
	wangyanan55, zhao1.liu

On Wed, 28 Aug 2024 at 12:16, Danny Canter <danny_canter@apple.com> wrote:
>
> This patch's main focus is to use the previously added
> hvf_get_physical_address_range to inform VM creation
> about the IPA size we need for the VM, so we can extend
> the default 36b IPA size and support VMs with 64+GB of
> RAM. This is done by freezing the memory map, computing
> the highest GPA and then (depending on if the platform
> supports an IPA size that large) telling the kernel to
> use a size >= for the VM. In pursuit of this a couple of
> things related to how we handle the physical address range
> we expose to guests were altered, but for an explanation of
> what we were doing:
>
> Today, to get the IPA size we were reading id_aa64mmfr0_el1's
> PARange field from a newly made vcpu. Unfortunately, HVF just
> returns the hosts PARange directly for the initial value and
> not the IPA size that will actually back the VM, so we believe
> we have much more address space than we actually do today it seems.
>
> Starting in macOS 13.0 some APIs were introduced to be able to
> query the maximum IPA size the kernel supports, and to set the IPA
> size for a given VM. However, this still has a couple of issues
> on < macOS 15. Up until macOS 15 (and if the hardware supported
> it) the max IPA size was 39 bits which is not a valid PARange
> value, so we can't clamp down what we advertise in the vcpu's
> id_aa64mmfr0_el1 to our IPA size. Starting in macOS 15 however,
> the maximum IPA size is 40 bits (if it's supported in the hardware
> as well) which is also a valid PARange value so we can set our IPA
> size to the maximum as well as clamp down the PARange we advertise
> to the guest. This allows VMs with 64+ GB of RAM and should fix the
> oddness of the PARange situation as well.
>
> Signed-off-by: Danny Canter <danny_canter@apple.com>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH v2 3/3] hvf: arm: Implement and use hvf_get_physical_address_range
  2024-08-28 11:15 ` [PATCH v2 3/3] hvf: arm: Implement and use hvf_get_physical_address_range Danny Canter
  2024-09-06 15:31   ` Peter Maydell
@ 2025-02-10 17:26   ` Philippe Mathieu-Daudé
  2025-02-10 18:20     ` Danny Canter
  1 sibling, 1 reply; 12+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-02-10 17:26 UTC (permalink / raw)
  To: Danny Canter, qemu-devel, qemu-arm, Itaru Kitayama
  Cc: dirty, rbolshakov, agraf, peter.maydell, pbonzini,
	richard.henderson, eduardo, mst, marcel.apfelbaum, wangyanan55,
	zhao1.liu

Hi Danny,

On 28/8/24 13:15, Danny Canter wrote:
> This patch's main focus is to use the previously added
> hvf_get_physical_address_range to inform VM creation
> about the IPA size we need for the VM, so we can extend
> the default 36b IPA size and support VMs with 64+GB of
> RAM. This is done by freezing the memory map, computing
> the highest GPA and then (depending on if the platform
> supports an IPA size that large) telling the kernel to
> use a size >= for the VM. In pursuit of this a couple of
> things related to how we handle the physical address range
> we expose to guests were altered, but for an explanation of
> what we were doing:
> 
> Today, to get the IPA size we were reading id_aa64mmfr0_el1's
> PARange field from a newly made vcpu. Unfortunately, HVF just
> returns the hosts PARange directly for the initial value and
> not the IPA size that will actually back the VM, so we believe
> we have much more address space than we actually do today it seems.
> 
> Starting in macOS 13.0 some APIs were introduced to be able to
> query the maximum IPA size the kernel supports, and to set the IPA
> size for a given VM. However, this still has a couple of issues
> on < macOS 15. Up until macOS 15 (and if the hardware supported
> it) the max IPA size was 39 bits which is not a valid PARange
> value, so we can't clamp down what we advertise in the vcpu's
> id_aa64mmfr0_el1 to our IPA size. Starting in macOS 15 however,
> the maximum IPA size is 40 bits (if it's supported in the hardware
> as well) which is also a valid PARange value so we can set our IPA
> size to the maximum as well as clamp down the PARange we advertise
> to the guest. This allows VMs with 64+ GB of RAM and should fix the
> oddness of the PARange situation as well.

Could you have a look at the following issue related to your patch?
https://gitlab.com/qemu-project/qemu/-/issues/2800


> 
> Signed-off-by: Danny Canter <danny_canter@apple.com>
> ---
>   accel/hvf/hvf-accel-ops.c | 12 ++++++++-
>   hw/arm/virt.c             | 31 +++++++++++++++++++++-
>   target/arm/hvf/hvf.c      | 56 ++++++++++++++++++++++++++++++++++++++-
>   target/arm/hvf_arm.h      | 19 +++++++++++++
>   target/arm/internals.h    | 19 +++++++++++++
>   target/arm/ptw.c          | 15 +++++++++++
>   6 files changed, 149 insertions(+), 3 deletions(-)
> 
> diff --git a/accel/hvf/hvf-accel-ops.c b/accel/hvf/hvf-accel-ops.c
> index dbebf209f4..d60874d3e6 100644
> --- a/accel/hvf/hvf-accel-ops.c
> +++ b/accel/hvf/hvf-accel-ops.c
> @@ -53,6 +53,7 @@
>   #include "exec/address-spaces.h"
>   #include "exec/exec-all.h"
>   #include "gdbstub/enums.h"
> +#include "hw/boards.h"
>   #include "sysemu/cpus.h"
>   #include "sysemu/hvf.h"
>   #include "sysemu/hvf_int.h"
> @@ -319,8 +320,17 @@ static int hvf_accel_init(MachineState *ms)
>       int x;
>       hv_return_t ret;
>       HVFState *s;
> +    int pa_range = 36;
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
>   
> -    ret = hvf_arch_vm_create(ms, 0);
> +    if (mc->hvf_get_physical_address_range) {
> +        pa_range = mc->hvf_get_physical_address_range(ms);
> +        if (pa_range < 0) {
> +            return -EINVAL;
> +        }
> +    }
> +
> +    ret = hvf_arch_vm_create(ms, (uint32_t)pa_range);
>       assert_hvf_ok(ret);
>   
>       s = g_new0(HVFState, 1);
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 62ee5f849b..b39c7924a0 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -66,6 +66,7 @@
>   #include "hw/intc/arm_gicv3_its_common.h"
>   #include "hw/irq.h"
>   #include "kvm_arm.h"
> +#include "hvf_arm.h"
>   #include "hw/firmware/smbios.h"
>   #include "qapi/visitor.h"
>   #include "qapi/qapi-visit-common.h"
> @@ -3030,7 +3031,35 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
>   
>   static int virt_hvf_get_physical_address_range(MachineState *ms)
>   {
> -    return 0;
> +    VirtMachineState *vms = VIRT_MACHINE(ms);
> +
> +    int default_ipa_size = hvf_arm_get_default_ipa_bit_size();
> +    int max_ipa_size = hvf_arm_get_max_ipa_bit_size();
> +
> +    /* We freeze the memory map to compute the highest gpa */
> +    virt_set_memmap(vms, max_ipa_size);
> +
> +    int requested_ipa_size = 64 - clz64(vms->highest_gpa);
> +
> +    /*
> +     * If we're <= the default IPA size just use the default.
> +     * If we're above the default but below the maximum, round up to
> +     * the maximum. hvf_arm_get_max_ipa_bit_size() conveniently only
> +     * returns values that are valid ARM PARange values.
> +     */
> +    if (requested_ipa_size <= default_ipa_size) {
> +        requested_ipa_size = default_ipa_size;
> +    } else if (requested_ipa_size <= max_ipa_size) {
> +        requested_ipa_size = max_ipa_size;
> +    } else {
> +        error_report("-m and ,maxmem option values "
> +                     "require an IPA range (%d bits) larger than "
> +                     "the one supported by the host (%d bits)",
> +                     requested_ipa_size, max_ipa_size);
> +        return -1;
> +    }
> +
> +    return requested_ipa_size;
>   }
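
For readers following along, the selection logic in the function
above can be sketched in isolation as below. This is an illustration
under stated assumptions, not the QEMU code: bits_needed stands in
for "64 - clz64(highest_gpa)", and choose_ipa_bits with its
parameters is a hypothetical name for the three-way decision.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to address highest_gpa (0 for address 0). */
static int bits_needed(uint64_t highest_gpa)
{
    int bits = 0;

    while (highest_gpa) {
        bits++;
        highest_gpa >>= 1;
    }
    return bits;
}

/*
 * Pick the IPA size for the VM: the default if the memory map fits in
 * it, the maximum if it fits there instead, or -1 if it fits in neither.
 */
static int choose_ipa_bits(uint64_t highest_gpa, int default_bits,
                           int max_bits)
{
    int needed = bits_needed(highest_gpa);

    assert(default_bits <= max_bits);

    if (needed <= default_bits) {
        return default_bits;
    }
    if (needed <= max_bits) {
        return max_bits;
    }
    return -1;
}
```

With a 36-bit default and a 40-bit maximum, a highest GPA just below
2^36 keeps the default, one at 2^36 or above moves to 40 bits, and
anything needing 41 bits or more is rejected.
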
>   
>   static void virt_machine_class_init(ObjectClass *oc, void *data)
> diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
> index 19964d241e..6cea483d42 100644
> --- a/target/arm/hvf/hvf.c
> +++ b/target/arm/hvf/hvf.c
> @@ -22,6 +22,7 @@
>   #include <mach/mach_time.h>
>   
>   #include "exec/address-spaces.h"
> +#include "hw/boards.h"
>   #include "hw/irq.h"
>   #include "qemu/main-loop.h"
>   #include "sysemu/cpus.h"
> @@ -297,6 +298,8 @@ void hvf_arm_init_debug(void)
>   
>   static void hvf_wfi(CPUState *cpu);
>   
> +static uint32_t chosen_ipa_bit_size;
> +
>   typedef struct HVFVTimer {
>       /* Vtimer value during migration and paused state */
>       uint64_t vtimer_val;
> @@ -839,6 +842,16 @@ static uint64_t hvf_get_reg(CPUState *cpu, int rt)
>       return val;
>   }
>   
> +static void clamp_id_aa64mmfr0_parange_to_ipa_size(uint64_t *id_aa64mmfr0)
> +{
> +    uint32_t ipa_size = chosen_ipa_bit_size ?
> +            chosen_ipa_bit_size : hvf_arm_get_max_ipa_bit_size();
> +
> +    /* Clamp down the PARange to the IPA size the kernel supports. */
> +    uint8_t index = round_down_to_parange_index(ipa_size);
> +    *id_aa64mmfr0 = (*id_aa64mmfr0 & ~R_ID_AA64MMFR0_PARANGE_MASK) | index;
> +}
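
The register update in the helper above amounts to replacing one
4-bit field. A minimal sketch, assuming only that PARange occupies
bits [3:0] of ID_AA64MMFR0_EL1; PARANGE_MASK and set_parange are
hypothetical names for this example, not QEMU identifiers:

```c
#include <assert.h>
#include <stdint.h>

/* ID_AA64MMFR0_EL1.PARange is the low four bits of the register. */
#define PARANGE_MASK 0xfULL

/* Return id_aa64mmfr0 with its PARange field replaced by index. */
static uint64_t set_parange(uint64_t id_aa64mmfr0, unsigned index)
{
    assert(index <= PARANGE_MASK);
    return (id_aa64mmfr0 & ~PARANGE_MASK) | index;
}
```

All other fields of the register are left untouched; only the
advertised physical address range changes.
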
> +
>   static bool hvf_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
>   {
>       ARMISARegisters host_isar = {};
> @@ -882,6 +895,8 @@ static bool hvf_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
>       r |= hv_vcpu_get_sys_reg(fd, HV_SYS_REG_MIDR_EL1, &ahcf->midr);
>       r |= hv_vcpu_destroy(fd);
>   
> +    clamp_id_aa64mmfr0_parange_to_ipa_size(&host_isar.id_aa64mmfr0);
> +
>       ahcf->isar = host_isar;
>   
>       /*
> @@ -904,6 +919,30 @@ static bool hvf_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
>       return r == HV_SUCCESS;
>   }
>   
> +uint32_t hvf_arm_get_default_ipa_bit_size(void)
> +{
> +    uint32_t default_ipa_size;
> +    hv_return_t ret = hv_vm_config_get_default_ipa_size(&default_ipa_size);
> +    assert_hvf_ok(ret);
> +
> +    return default_ipa_size;
> +}
> +
> +uint32_t hvf_arm_get_max_ipa_bit_size(void)
> +{
> +    uint32_t max_ipa_size;
> +    hv_return_t ret = hv_vm_config_get_max_ipa_size(&max_ipa_size);
> +    assert_hvf_ok(ret);
> +
> +    /*
> +     * We clamp any IPA size we want to back the VM with to a valid PARange
> +     * value so the guest doesn't try and map memory outside of the valid range.
> +     * This logic just clamps the passed in IPA bit size to the first valid
> +     * PARange value <= to it.
> +     */
> +    return round_down_to_parange_bit_size(max_ipa_size);
> +}
> +
>   void hvf_arm_set_cpu_features_from_host(ARMCPU *cpu)
>   {
>       if (!arm_host_cpu_features.dtb_compatible) {
> @@ -931,8 +970,18 @@ void hvf_arch_vcpu_destroy(CPUState *cpu)
>   
>   hv_return_t hvf_arch_vm_create(MachineState *ms, uint32_t pa_range)
>   {
> +    hv_return_t ret;
>       hv_vm_config_t config = hv_vm_config_create();
> -    hv_return_t ret = hv_vm_create(config);
> +
> +    ret = hv_vm_config_set_ipa_size(config, pa_range);
> +    if (ret != HV_SUCCESS) {
> +        goto cleanup;
> +    }
> +    chosen_ipa_bit_size = pa_range;
> +
> +    ret = hv_vm_create(config);
> +
> +cleanup:
>       os_release(config);
>   
>       return ret;
> @@ -1004,6 +1053,11 @@ int hvf_arch_init_vcpu(CPUState *cpu)
>                                 &arm_cpu->isar.id_aa64mmfr0);
>       assert_hvf_ok(ret);
>   
> +    clamp_id_aa64mmfr0_parange_to_ipa_size(&arm_cpu->isar.id_aa64mmfr0);
> +    ret = hv_vcpu_set_sys_reg(cpu->accel->fd, HV_SYS_REG_ID_AA64MMFR0_EL1,
> +                              arm_cpu->isar.id_aa64mmfr0);
> +    assert_hvf_ok(ret);
> +
>       return 0;
>   }
>   
> diff --git a/target/arm/hvf_arm.h b/target/arm/hvf_arm.h
> index e848c1d27d..26c717b382 100644
> --- a/target/arm/hvf_arm.h
> +++ b/target/arm/hvf_arm.h
> @@ -22,4 +22,23 @@ void hvf_arm_init_debug(void);
>   
>   void hvf_arm_set_cpu_features_from_host(ARMCPU *cpu);
>   
> +#ifdef CONFIG_HVF
> +
> +uint32_t hvf_arm_get_default_ipa_bit_size(void);
> +uint32_t hvf_arm_get_max_ipa_bit_size(void);
> +
> +#else
> +
> +static inline uint32_t hvf_arm_get_default_ipa_bit_size(void)
> +{
> +    return 0;
> +}
> +
> +static inline uint32_t hvf_arm_get_max_ipa_bit_size(void)
> +{
> +    return 0;
> +}
> +
> +#endif
> +
>   #endif
> diff --git a/target/arm/internals.h b/target/arm/internals.h
> index 203a2dae14..c5d7b0b492 100644
> --- a/target/arm/internals.h
> +++ b/target/arm/internals.h
> @@ -450,6 +450,25 @@ static inline void update_spsel(CPUARMState *env, uint32_t imm)
>    */
>   unsigned int arm_pamax(ARMCPU *cpu);
>   
> +/*
> + * round_down_to_parange_index
> + * @bit_size: uint8_t
> + *
> + * Rounds down the bit_size supplied to the first supported ARM physical
> + * address range and returns the index for this. The index is intended to
> + * be used to set ID_AA64MMFR0_EL1's PARANGE bits.
> + */
> +uint8_t round_down_to_parange_index(uint8_t bit_size);
> +
> +/*
> + * round_down_to_parange_bit_size
> + * @bit_size: uint8_t
> + *
> + * Rounds down the bit_size supplied to the first supported ARM physical
> + * address range bit size and returns this.
> + */
> +uint8_t round_down_to_parange_bit_size(uint8_t bit_size);
> +
>   /* Return true if extended addresses are enabled.
>    * This is always the case if our translation regime is 64 bit,
>    * but depends on TTBCR.EAE for 32 bit.
> diff --git a/target/arm/ptw.c b/target/arm/ptw.c
> index 278004661b..defd6b84de 100644
> --- a/target/arm/ptw.c
> +++ b/target/arm/ptw.c
> @@ -96,6 +96,21 @@ static const uint8_t pamax_map[] = {
>       [6] = 52,
>   };
>   
> +uint8_t round_down_to_parange_index(uint8_t bit_size)
> +{
> +    for (int i = ARRAY_SIZE(pamax_map) - 1; i >= 0; i--) {
> +        if (pamax_map[i] <= bit_size) {
> +            return i;
> +        }
> +    }
> +    g_assert_not_reached();
> +}
> +
> +uint8_t round_down_to_parange_bit_size(uint8_t bit_size)
> +{
> +    return pamax_map[round_down_to_parange_index(bit_size)];
> +}
> +
>   /*
>    * The cpu-specific constant value of PAMax; also used by hw/arm/virt.
>    * Note that machvirt_init calls this on a CPU that is inited but not realized!




* Re: [PATCH v2 3/3] hvf: arm: Implement and use hvf_get_physical_address_range
  2025-02-10 17:26   ` Philippe Mathieu-Daudé
@ 2025-02-10 18:20     ` Danny Canter
  2025-02-10 18:24       ` Peter Maydell
  0 siblings, 1 reply; 12+ messages in thread
From: Danny Canter @ 2025-02-10 18:20 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: qemu-devel, qemu-arm, Itaru Kitayama, dirty, rbolshakov, agraf,
	peter.maydell, pbonzini, richard.henderson, eduardo, mst,
	marcel.apfelbaum, wangyanan55, zhao1.liu

Will do. I’ll reach out if I need extra info. The issue appears to be closed, though; was this already fixed or found to be no-repro?

-Danny 

> On Feb 10, 2025, at 9:26 AM, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> 
> Hi Danny,
> 
> On 28/8/24 13:15, Danny Canter wrote:
>> This patch's main focus is to use the previously added
>> hvf_get_physical_address_range to inform VM creation
>> about the IPA size we need for the VM, so we can extend
>> the default 36b IPA size and support VMs with 64+GB of
>> RAM. This is done by freezing the memory map, computing
>> the highest GPA and then (depending on if the platform
>> supports an IPA size that large) telling the kernel to
>> use a size >= for the VM. In pursuit of this a couple of
>> things related to how we handle the physical address range
>> we expose to guests were altered, but for an explanation of
>> what we were doing:
>> Today, to get the IPA size we were reading id_aa64mmfr0_el1's
>> PARange field from a newly made vcpu. Unfortunately, HVF just
>> returns the hosts PARange directly for the initial value and
>> not the IPA size that will actually back the VM, so we believe
>> we have much more address space than we actually do today it seems.
>> Starting in macOS 13.0 some APIs were introduced to be able to
>> query the maximum IPA size the kernel supports, and to set the IPA
>> size for a given VM. However, this still has a couple of issues
>> on < macOS 15. Up until macOS 15 (and if the hardware supported
>> it) the max IPA size was 39 bits which is not a valid PARange
>> value, so we can't clamp down what we advertise in the vcpu's
>> id_aa64mmfr0_el1 to our IPA size. Starting in macOS 15 however,
>> the maximum IPA size is 40 bits (if it's supported in the hardware
>> as well) which is also a valid PARange value so we can set our IPA
>> size to the maximum as well as clamp down the PARange we advertise
>> to the guest. This allows VMs with 64+ GB of RAM and should fix the
>> oddness of the PARange situation as well.
> 
> Could you have a look at the following issue related to your patch?
> https://gitlab.com/qemu-project/qemu/-/issues/2800
> 
> 
> 




* Re: [PATCH v2 3/3] hvf: arm: Implement and use hvf_get_physical_address_range
  2025-02-10 18:20     ` Danny Canter
@ 2025-02-10 18:24       ` Peter Maydell
  2025-02-10 20:39         ` Danny Canter
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Maydell @ 2025-02-10 18:24 UTC (permalink / raw)
  To: Danny Canter
  Cc: Philippe Mathieu-Daudé, qemu-devel, qemu-arm, Itaru Kitayama,
	dirty, rbolshakov, agraf, pbonzini, richard.henderson, eduardo,
	mst, marcel.apfelbaum, wangyanan55, zhao1.liu

On Mon, 10 Feb 2025 at 18:20, Danny Canter <danny_canter@apple.com> wrote:
>
> Will do. I’ll reach out if I need extra info. The issue appears to be closed though, was this fixed/no-repro already though?

Whoops, no, that must have been a mis-click on my part.

While you're looking at address-space related bugs,
https://gitlab.com/qemu-project/qemu/-/issues/2713
is another recent one -- user reports that QEMU says they're
limited to 32 bits even though their Mac/macOS has a 40-bit
IPA space.

thanks
-- PMM



* Re: [PATCH v2 3/3] hvf: arm: Implement and use hvf_get_physical_address_range
  2025-02-10 18:24       ` Peter Maydell
@ 2025-02-10 20:39         ` Danny Canter
  0 siblings, 0 replies; 12+ messages in thread
From: Danny Canter @ 2025-02-10 20:39 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Philippe Mathieu-Daudé, qemu-devel, qemu-arm, Itaru Kitayama,
	dirty, rbolshakov, agraf, pbonzini, richard.henderson, eduardo,
	mst, marcel.apfelbaum, wangyanan55, zhao1.liu

No worries, will get a machine on 15.2 at least as it seems both reports are on that or higher. I’ll likely have time mid to late week to debug this. Thanks!

-Danny  

> On Feb 10, 2025, at 10:24 AM, Peter Maydell <peter.maydell@linaro.org> wrote:
> 
> On Mon, 10 Feb 2025 at 18:20, Danny Canter <danny_canter@apple.com> wrote:
>> 
>> Will do. I’ll reach out if I need extra info. The issue appears to be closed though, was this fixed/no-repro already though?
> 
> Whoops, no, that must have been a mis-click on my part.
> 
> While you're looking at address-space related bugs,
> https://gitlab.com/qemu-project/qemu/-/issues/2713
> is another recent one -- user reports that QEMU says they're
> limited to 32 bits even though their Mac/macOS has a 40-bit
> IPA space.
> 
> thanks
> -- PMM




* Re: [PATCH v2 0/3] hvf: arm: Support creating VMs with 64+GB of RAM on macOS 15+
  2024-08-28 11:15 [PATCH v2 0/3] hvf: arm: Support creating VMs with 64+GB of RAM on macOS 15+ Danny Canter
                   ` (2 preceding siblings ...)
  2024-08-28 11:15 ` [PATCH v2 3/3] hvf: arm: Implement and use hvf_get_physical_address_range Danny Canter
@ 2024-09-06 15:32 ` Peter Maydell
  3 siblings, 0 replies; 12+ messages in thread
From: Peter Maydell @ 2024-09-06 15:32 UTC (permalink / raw)
  To: Danny Canter
  Cc: qemu-devel, qemu-arm, dirty, rbolshakov, agraf, pbonzini,
	richard.henderson, eduardo, mst, marcel.apfelbaum, philmd,
	wangyanan55, zhao1.liu

On Wed, 28 Aug 2024 at 12:16, Danny Canter <danny_canter@apple.com> wrote:
>
> This patchsets focus is on lighting up the ability to create VMs with 64+GB
> of RAM through using some new APIs introduced in macOS 13. Due to the IPA sizes
> supported in macOS, the first version we can properly support this requirement
> is macOS 15 as (if the hardware supports it also) the kernel adds support for a
> 40b IPA size, which is the first supported ARM PARange value after 36b, so we
> can advertise this to the guest properly as well in id_aa64mmfr0_el1.
>
> Today if you asked for a > 64GB VM you'd be met with a pretty unwieldy
> HV_BAD_ARGUMENT. On machines without 40b IPA support this patchset also
> improves this, and the message mirrors the kvm_type error you'd get on ARM:
>
> "qemu-system-aarch64: -accel hvf: Addressing limited to 36 bits, but memory
> exceeds it by 18253611008 bytes"
>
> Changes from V1 to V2 (Thanks Peter for review!):
>
> - Added a new function pointer to MachineClass to be able to freeze the memory
> map and compute the highest guest physical address. We use this to inform VM
> creation on what IPA size we should ask the kernel for. This is very similar to
> what ARM's kvm_type() does.
>
> - Fixed redundant loop in `round_down_to_parange_bit_size`
>
> - Move the splitting up of hv_vm_create logic per platform to a separate patch.
> This is mostly for readability.

I only had one minor comment on patch 1, so I've applied the
series to target-arm.next and made that tweak there.

thanks
-- PMM



Thread overview: 12+ messages
2024-08-28 11:15 [PATCH v2 0/3] hvf: arm: Support creating VMs with 64+GB of RAM on macOS 15+ Danny Canter
2024-08-28 11:15 ` [PATCH v2 1/3] hw/boards: Add hvf_get_physical_address_range to MachineClass Danny Canter
2024-09-06 15:30   ` Peter Maydell
2024-08-28 11:15 ` [PATCH v2 2/3] hvf: Split up hv_vm_create logic per arch Danny Canter
2024-09-06 15:31   ` Peter Maydell
2024-08-28 11:15 ` [PATCH v2 3/3] hvf: arm: Implement and use hvf_get_physical_address_range Danny Canter
2024-09-06 15:31   ` Peter Maydell
2025-02-10 17:26   ` Philippe Mathieu-Daudé
2025-02-10 18:20     ` Danny Canter
2025-02-10 18:24       ` Peter Maydell
2025-02-10 20:39         ` Danny Canter
2024-09-06 15:32 ` [PATCH v2 0/3] hvf: arm: Support creating VMs with 64+GB of RAM on macOS 15+ Peter Maydell
