qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations
@ 2024-12-09 20:36 phil
  2024-12-09 20:36 ` [PATCH 01/11] hvf: Add facility for initialisation code prior to first vCPU run phil
                   ` (10 more replies)
  0 siblings, 11 replies; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

These patches are a loosely related series of changes that I've
collected while working on the VMApple/PVG patch set and my
doomed project of integrating Hypervisor.framework's APIC
implementation. (The latter turns out to suffer from a number of
serious bugs which prevent it from being usable with
level-triggered interrupts, at least as of macOS 15.1.)

The motivations behind the patches are:

 * Correctness. A bunch of these fix bugs or improve compliance
   with the appropriate specification.
 * Performance. The biggest performance problem on macOS hosts
   and the hvf accel is BQL contention. Depending on workload
   and configuration, busy vCPU threads seem to spend between
   10% and 30% of their time in bql_lock(). A number of these
   patches either reduce total work done, or move work out of
   the BQL in certain configurations.
 * Code hygiene. I've encountered a few instances of incorrect
   types used, outdated comments, magic numbers, etc. which
   were easy to fix.

There are some dependencies between the patches:

 * Patch 1 is required for both patches 2 and 7. This is why
   I have not split the aarch64 (patch 2) and the x86 (3-10)
   specific changes into separate patch sets.
 * Patch 6 will not apply cleanly without 5 in its current state
   but can be modified to work.
 * Patches 6 and 7 are only useful in combination.

Beyond that, we can mix and match - i.e. pull
reviewed-and-approved patches while further iterating on any
that might need refinement or dropping unwanted ones entirely.

The patches, in brief:

 1. Adds a function for architecture-specific vCPU setup
    before the vCPU's first VM entry, but after QEMU init has
    concluded. This is useful for correctly setting the aarch64
    GIC flag (patch 2) and for telling Hypervisor.framework the
    initial APICBASE address. (patch 7)
 2. Set the GICv3 flag in the PFR0_EL1 system register after
    the GIC has actually been set up. Previously the flag was
    computed before the GIC was ready, and thus always 0.
 3. Don't send signal to thread when kicking. cpus_kick_thread
    calls pthread_kill() to send a SIG_IPI to the target vCPU's
    thread. There does not appear to be any need to do this
    with HVF, so let's not call cpus_kick_thread() when kicking
    a vCPU there.
 4. Prefetch all bytes of an instruction when emulating it in
    software. The x86_decode system in the HVF accel read guest
    memory in 1-, 2-, 4-, or 8-byte parcels when decoding
    instructions. The VM exit tells us how big the instruction
    is, so let's read all of it in one go for efficiency.
 5. x86 instruction decoding is somewhat expensive, and if we
    know that it must definitely be done before we acquire the
    BQL, we can actually move decoding to before the lock. The
    simplest case for this are APIC accesses, so we start with
    those.
 6. APIC access fast paths. EPT faults are somewhat expensive
    to process. This includes xAPIC MMIO. We can skip walking
    the memory graph and fully software-emulating the faulting
    instruction by obtaining the access type and MMIO offset
    provided by VMX/HVF's virtualised APIC access feature, and
    detecting common MOV instructions to call the APIC register
    read/write functions directly.
 7. To enable the APIC_ACCESS VM exit type, both the APIC base
    and the relevant ctl must be set in the VMCS; this change
    implements the former and toggles the latter according to
    APIC state, thus enabling the fast path implemented in patch
    6 and also adding better support for APIC relocation.
 8. This fixes a technically incorrect variable type. No change
    in functionality.
 9. The HVF "failed to decode instruction" error prints
    instruction bytes in variable-length hex; it's easier to
    read when the hex digits are always paired, even if the
    first one is 0.
10. This APIC change replaces a magic number with a symbolic
    constant and removes a comment which is no longer accurate.
11. This change causes the software APIC to raise an exception
    when the guest attempts to write to the reserved bits of the
    APICBASE MSR.


In combination, the patches improve QEMU/hvf's relevant
kvm-unit-tests results in the APIC/xAPIC/x2APIC suites (after
disabling the crashing test_pv_ipi test) from:

FAIL apic-split (56 tests, 9 unexpected failures, 2 skipped)
FAIL x2apic (56 tests, 9 unexpected failures, 2 skipped)
FAIL xapic (45 tests, 7 unexpected failures, 3 skipped)

To:

FAIL apic-split (56 tests, 2 unexpected failures, 2 skipped)
FAIL x2apic (56 tests, 2 unexpected failures, 2 skipped)
FAIL xapic (45 tests, 2 unexpected failures, 3 skipped)

The remaining test failures are, in each case:

FAIL: multiple nmi (received: 1)
FAIL: TMCCT should stay at zero

These are not regressions, they were part of the original
test failures; APIC logic relevant to them was not touched
in this patch set.


Performance improvements vary a lot depending on workload.
MMIO-based APIC accesses are most affected, so APIC-heavy
loads with x2apic disabled benefit most.

macOS guests seem to use the APIC very heavily, so it comes
as no surprise that the improvement is noticeable, with
the guest OS booting in about 33 seconds with xAPIC, down
from about 43 with unmodified QEMU 9.2-rc3. Even with x2APIC
enabled, the changes shaved about 1-2 seconds off boot times.

"Warm" builds of QEMU 9.2-rc3 with
make clean ; time ( ./configure --target-list=x86_64-softmmu ; make -j4)
inside an OpenSUSE 15.6 VM with -smp 4 and -cpu host,-x2apic
on my 8-core MacBook Pro 2019 with macOS 14 repeatably went from
  2m 58s (+/- 1.2s) real, 8m 39s (+/- 2s) user, 1m 14s (+/- <1s) sys
with 9.2-rc3 to
  2m 56s (+/- 1.2s) real, 8m 32s (+/- 2s) user, 1m 13s (+/- <1s) sys
with this patch set applied. It's not a huge improvement
(>1% for real/user times, 2% for sys) but the runs are very
consistently in favour of the modifications.

My wall-clock measurements with boot times for Windows 10,
Linux, and FreeBSD guests were all within margin of error,
though profiling the QEMU process did indicate a reduction of
vCPU thread time spent in bql_lock().

I did not spot any performance regressions, but I can't
entirely rule them out either.

There is scope for similar optimisations that yield more widely
applicable improvements. In particular EPT fault MMIO handling
can be split into non-BQL and BQL portions. I measured around 1-2%
additional improvement with a naive implementation of this, but the
hvf memory slot data structure needs to be made RCU-safe before such
an optimisation can be upstreamed.


I have done my best to do a reasonable amount of testing with
these patches, in particular I have tested Linux, FreeBSD,
macOS, Windows 10, and Windows XP guests on x86-64, each with
x2APIC alternately enabled and disabled. On aarch64, I have
checked Linux and macOS (VMApple) guests. All of this on
macOS hosts.

Phil Dennis-Jordan (11):
  hvf: Add facility for initialisation code prior to first vCPU run
  arm/hvf: Initialise GICv3 state just before first vCPU run
  i386/hvf: Don't send signal to thread when kicking
  i386/hvf: Pre-fetch emulated instructions
  i386/hvf: Decode APIC access x86 instruction outside BQL
  i386/hvf: APIC access exit with fast-path for common mov cases
  i386/hvf: Enables APIC_ACCESS VM exits by setting APICBASE
  i386/hvf: Variable type fixup in decoder
  i386/hvf: Print hex pairs for each opcode byte in decode error
  hw/intc/apic: Fixes magic number use, removes outdated comment
  hw/intc/apic: Raise exception when setting reserved APICBASE bits

 accel/hvf/hvf-accel-ops.c    |   5 ++
 hw/intc/apic.c               |  12 +++--
 include/hw/i386/apic.h       |   2 +
 include/sysemu/hvf_int.h     |   1 +
 meson.build                  |   1 +
 target/arm/hvf/hvf.c         |  21 +++++---
 target/i386/hvf/hvf.c        |  50 +++++++++++++----
 target/i386/hvf/trace-events |   9 ++++
 target/i386/hvf/trace.h      |   1 +
 target/i386/hvf/x86_decode.c |  24 ++++++---
 target/i386/hvf/x86_decode.h |   5 +-
 target/i386/hvf/x86_emu.c    | 102 +++++++++++++++++++++++++++++++++++
 target/i386/hvf/x86_emu.h    |   2 +
 13 files changed, 206 insertions(+), 29 deletions(-)
 create mode 100644 target/i386/hvf/trace-events
 create mode 100644 target/i386/hvf/trace.h

-- 
2.39.3 (Apple Git-146)



^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 01/11] hvf: Add facility for initialisation code prior to first vCPU run
  2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
@ 2024-12-09 20:36 ` phil
  2024-12-10 10:05   ` Alexander Graf
  2025-02-05 21:02   ` Philippe Mathieu-Daudé
  2024-12-09 20:36 ` [PATCH 02/11] arm/hvf: Initialise GICv3 state just before " phil
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

Some VM state required for fully configuring vCPUs is only available
after all devices have been through their init phase. This extra
function, called just before each vCPU makes its first VM entry,
allows us to perform such architecture-specific initialisation.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 accel/hvf/hvf-accel-ops.c | 5 +++++
 include/sysemu/hvf_int.h  | 1 +
 target/arm/hvf/hvf.c      | 4 ++++
 target/i386/hvf/hvf.c     | 4 ++++
 4 files changed, 14 insertions(+)

diff --git a/accel/hvf/hvf-accel-ops.c b/accel/hvf/hvf-accel-ops.c
index d60874d3e6..c17a9a10de 100644
--- a/accel/hvf/hvf-accel-ops.c
+++ b/accel/hvf/hvf-accel-ops.c
@@ -442,6 +442,11 @@ static void *hvf_cpu_thread_fn(void *arg)
     cpu_thread_signal_created(cpu);
     qemu_guest_random_seed_thread_part2(cpu->random_seed);
 
+    if (!cpu_can_run(cpu)) {
+        qemu_wait_io_event(cpu);
+    }
+    hvf_vcpu_before_first_run(cpu);
+
     do {
         if (cpu_can_run(cpu)) {
             r = hvf_vcpu_exec(cpu);
diff --git a/include/sysemu/hvf_int.h b/include/sysemu/hvf_int.h
index 42ae18433f..2775bd82d7 100644
--- a/include/sysemu/hvf_int.h
+++ b/include/sysemu/hvf_int.h
@@ -67,6 +67,7 @@ const char *hvf_return_string(hv_return_t ret);
 int hvf_arch_init(void);
 hv_return_t hvf_arch_vm_create(MachineState *ms, uint32_t pa_range);
 int hvf_arch_init_vcpu(CPUState *cpu);
+void hvf_vcpu_before_first_run(CPUState *cpu);
 void hvf_arch_vcpu_destroy(CPUState *cpu);
 int hvf_vcpu_exec(CPUState *);
 hvf_slot *hvf_find_overlap_slot(uint64_t, uint64_t);
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index ca7ea92774..0b334c268e 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -1061,6 +1061,10 @@ int hvf_arch_init_vcpu(CPUState *cpu)
     return 0;
 }
 
+void hvf_vcpu_before_first_run(CPUState *cpu)
+{
+}
+
 void hvf_kick_vcpu_thread(CPUState *cpu)
 {
     cpus_kick_thread(cpu);
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index c5d025d557..3b6ee79fb2 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -338,6 +338,10 @@ int hvf_arch_init_vcpu(CPUState *cpu)
     return 0;
 }
 
+void hvf_vcpu_before_first_run(CPUState *cpu)
+{
+}
+
 static void hvf_store_events(CPUState *cpu, uint32_t ins_len, uint64_t idtvec_info)
 {
     X86CPU *x86_cpu = X86_CPU(cpu);
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 02/11] arm/hvf: Initialise GICv3 state just before first vCPU run
  2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
  2024-12-09 20:36 ` [PATCH 01/11] hvf: Add facility for initialisation code prior to first vCPU run phil
@ 2024-12-09 20:36 ` phil
  2024-12-10 10:05   ` Alexander Graf
  2025-02-05 14:47   ` Zenghui Yu
  2024-12-09 20:36 ` [PATCH 03/11] i386/hvf: Don't send signal to thread when kicking phil
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

Initialising the vCPU PFR0_EL1 system register with the GIC flag in
hvf_arch_init_vcpu() does not actually work because the GIC state is
not yet available at that time.

If we set this flag just before running each vCPU for the first time,
the GIC will definitely be fully initialised at that point.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 target/arm/hvf/hvf.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index 0b334c268e..bc431f25cc 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -993,7 +993,6 @@ int hvf_arch_init_vcpu(CPUState *cpu)
     CPUARMState *env = &arm_cpu->env;
     uint32_t sregs_match_len = ARRAY_SIZE(hvf_sreg_match);
     uint32_t sregs_cnt = 0;
-    uint64_t pfr;
     hv_return_t ret;
     int i;
 
@@ -1042,12 +1041,6 @@ int hvf_arch_init_vcpu(CPUState *cpu)
                               arm_cpu->mp_affinity);
     assert_hvf_ok(ret);
 
-    ret = hv_vcpu_get_sys_reg(cpu->accel->fd, HV_SYS_REG_ID_AA64PFR0_EL1, &pfr);
-    assert_hvf_ok(ret);
-    pfr |= env->gicv3state ? (1 << 24) : 0;
-    ret = hv_vcpu_set_sys_reg(cpu->accel->fd, HV_SYS_REG_ID_AA64PFR0_EL1, pfr);
-    assert_hvf_ok(ret);
-
     /* We're limited to underlying hardware caps, override internal versions */
     ret = hv_vcpu_get_sys_reg(cpu->accel->fd, HV_SYS_REG_ID_AA64MMFR0_EL1,
                               &arm_cpu->isar.id_aa64mmfr0);
@@ -1063,6 +1056,16 @@ int hvf_arch_init_vcpu(CPUState *cpu)
 
 void hvf_vcpu_before_first_run(CPUState *cpu)
 {
+    uint64_t pfr;
+    hv_return_t ret;
+    ARMCPU *arm_cpu = ARM_CPU(cpu);
+    CPUARMState *env = &arm_cpu->env;
+
+    ret = hv_vcpu_get_sys_reg(cpu->accel->fd, HV_SYS_REG_ID_AA64PFR0_EL1, &pfr);
+    assert_hvf_ok(ret);
+    pfr |= env->gicv3state ? (1 << 24) : 0;
+    ret = hv_vcpu_set_sys_reg(cpu->accel->fd, HV_SYS_REG_ID_AA64PFR0_EL1, pfr);
+    assert_hvf_ok(ret);
 }
 
 void hvf_kick_vcpu_thread(CPUState *cpu)
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 03/11] i386/hvf: Don't send signal to thread when kicking
  2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
  2024-12-09 20:36 ` [PATCH 01/11] hvf: Add facility for initialisation code prior to first vCPU run phil
  2024-12-09 20:36 ` [PATCH 02/11] arm/hvf: Initialise GICv3 state just before " phil
@ 2024-12-09 20:36 ` phil
  2024-12-09 21:22   ` Philippe Mathieu-Daudé
  2024-12-09 20:36 ` [PATCH 04/11] i386/hvf: Pre-fetch emulated instructions phil
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

This seems to be entirely superfluous and is costly enough to show up in
profiling. hv_vcpu_interrupt() has been demonstrated to very reliably
cause VM exits - even if the target vCPU isn't even running, it will
immediately exit on entry.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 target/i386/hvf/hvf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 3b6ee79fb2..936c31dbdd 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -214,7 +214,7 @@ static inline bool apic_bus_freq_is_known(CPUX86State *env)
 
 void hvf_kick_vcpu_thread(CPUState *cpu)
 {
-    cpus_kick_thread(cpu);
+    cpu->thread_kicked = true;
     hv_vcpu_interrupt(&cpu->accel->fd, 1);
 }
 
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 04/11] i386/hvf: Pre-fetch emulated instructions
  2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
                   ` (2 preceding siblings ...)
  2024-12-09 20:36 ` [PATCH 03/11] i386/hvf: Don't send signal to thread when kicking phil
@ 2024-12-09 20:36 ` phil
  2024-12-09 20:36 ` [PATCH 05/11] i386/hvf: Decode APIC access x86 instruction outside BQL phil
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

The HVF x86 instruction decoder has previously read each instruction
component a few bytes at a time. The HVF vCPU VM exit reports the length
of the faulted instruction, so we can just pre-fetch the memory for the
whole thing in one go, saving extra round-trips for most instructions.

The old code path is retained in case there is a race between VM exit
and another thread overwriting the faulted instruction. In this case,
the instruction length could be wrong, so we allow fetching additional
instruction bytes the traditional way if the prefetched bytes are
overrun.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 target/i386/hvf/hvf.c        |  6 +++---
 target/i386/hvf/x86_decode.c | 18 +++++++++++++++---
 target/i386/hvf/x86_decode.h |  5 ++++-
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 936c31dbdd..095f934923 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -522,7 +522,7 @@ int hvf_vcpu_exec(CPUState *cpu)
                 struct x86_decode decode;
 
                 load_regs(cpu);
-                decode_instruction(env, &decode);
+                decode_instruction(env, &decode, ins_len);
                 exec_instruction(env, &decode);
                 store_regs(cpu);
                 break;
@@ -562,7 +562,7 @@ int hvf_vcpu_exec(CPUState *cpu)
             struct x86_decode decode;
 
             load_regs(cpu);
-            decode_instruction(env, &decode);
+            decode_instruction(env, &decode, ins_len);
             assert(ins_len == decode.len);
             exec_instruction(env, &decode);
             store_regs(cpu);
@@ -667,7 +667,7 @@ int hvf_vcpu_exec(CPUState *cpu)
             struct x86_decode decode;
 
             load_regs(cpu);
-            decode_instruction(env, &decode);
+            decode_instruction(env, &decode, ins_len);
             exec_instruction(env, &decode);
             store_regs(cpu);
             break;
diff --git a/target/i386/hvf/x86_decode.c b/target/i386/hvf/x86_decode.c
index a4a28f113f..79dfc30408 100644
--- a/target/i386/hvf/x86_decode.c
+++ b/target/i386/hvf/x86_decode.c
@@ -73,8 +73,13 @@ static inline uint64_t decode_bytes(CPUX86State *env, struct x86_decode *decode,
         VM_PANIC_EX("%s invalid size %d\n", __func__, size);
         break;
     }
-    target_ulong va  = linear_rip(env_cpu(env), env->eip) + decode->len;
-    vmx_read_mem(env_cpu(env), &val, va, size);
+
+    if (decode->len + size < decode->prefetch_len) {
+        memcpy(&val, decode->prefetch_buf + decode->len, size);
+    } else {
+        target_ulong va  = linear_rip(env_cpu(env), env->eip) + decode->len;
+        vmx_read_mem(env_cpu(env), &val, va, size);
+    }
     decode->len += size;
     
     return val;
@@ -2099,9 +2104,16 @@ static void decode_opcodes(CPUX86State *env, struct x86_decode *decode)
     }
 }
 
-uint32_t decode_instruction(CPUX86State *env, struct x86_decode *decode)
+uint32_t decode_instruction(CPUX86State *env, x86_decode *decode,
+                            uint32_t ins_len)
 {
     memset(decode, 0, sizeof(*decode));
+
+    target_ulong va = linear_rip(env_cpu(env), env->eip);
+    uint32_t prefetch_len = MIN(ins_len, sizeof(sizeof(decode->prefetch_buf)));
+    vmx_read_mem(env_cpu(env), decode->prefetch_buf, va, prefetch_len);
+    decode->prefetch_len = prefetch_len;
+
     decode_prefix(env, decode);
     set_addressing_size(env, decode);
     set_operand_size(env, decode);
diff --git a/target/i386/hvf/x86_decode.h b/target/i386/hvf/x86_decode.h
index a2d7a2a27b..0ff368210b 100644
--- a/target/i386/hvf/x86_decode.h
+++ b/target/i386/hvf/x86_decode.h
@@ -297,11 +297,14 @@ typedef struct x86_decode {
     bool is_fpu;
     uint32_t flags_mask;
 
+    uint8_t prefetch_buf[16];
+    uint16_t prefetch_len;
 } x86_decode;
 
 uint64_t sign(uint64_t val, int size);
 
-uint32_t decode_instruction(CPUX86State *env, struct x86_decode *decode);
+uint32_t decode_instruction(CPUX86State *env, x86_decode *decode,
+                            uint32_t ins_len);
 
 target_ulong get_reg_ref(CPUX86State *env, int reg, int rex_present,
                          int is_extended, int size);
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 05/11] i386/hvf: Decode APIC access x86 instruction outside BQL
  2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
                   ` (3 preceding siblings ...)
  2024-12-09 20:36 ` [PATCH 04/11] i386/hvf: Pre-fetch emulated instructions phil
@ 2024-12-09 20:36 ` phil
  2024-12-09 20:36 ` [PATCH 06/11] i386/hvf: APIC access exit with fast-path for common mov cases phil
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

The HVF accelerator suffers from severe BQL contention under common
practical workloads. x86 instruction decoding for software-emulating
faulted instructions is a somewhat expensive operation, and there
is no need to hold the BQL while performing it. Except in very
unusual edge cases, only an RCU read lock is acquired during the
instruction fetch from memory.

This change therefore moves instruction decoding for APIC access
VM exits to before the BQL is acquired. This improves performance
on APIC-heavy workloads.

It would be nice to eventually move instruction decoding outside
the BQL for MMIO EPT faults as well, but that case is more
complicated as not every EPT fault exit needs decoding/executing.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 target/i386/hvf/hvf.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 095f934923..3f1ff0f013 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -444,6 +444,7 @@ int hvf_vcpu_exec(CPUState *cpu)
     CPUX86State *env = &x86_cpu->env;
     int ret = 0;
     uint64_t rip = 0;
+    struct x86_decode decode;
 
     if (hvf_process_events(cpu)) {
         return EXCP_HLT;
@@ -481,6 +482,11 @@ int hvf_vcpu_exec(CPUState *cpu)
         rip = rreg(cpu->accel->fd, HV_X86_RIP);
         env->eflags = rreg(cpu->accel->fd, HV_X86_RFLAGS);
 
+        if (exit_reason == EXIT_REASON_APIC_ACCESS) {
+            load_regs(cpu);
+            decode_instruction(env, &decode, ins_len);
+        }
+
         bql_lock();
 
         update_apic_tpr(cpu);
@@ -519,8 +525,6 @@ int hvf_vcpu_exec(CPUState *cpu)
             slot = hvf_find_overlap_slot(gpa, 1);
             /* mmio */
             if (ept_emulation_fault(slot, gpa, exit_qual)) {
-                struct x86_decode decode;
-
                 load_regs(cpu);
                 decode_instruction(env, &decode, ins_len);
                 exec_instruction(env, &decode);
@@ -559,7 +563,6 @@ int hvf_vcpu_exec(CPUState *cpu)
                 macvm_set_rip(cpu, rip + ins_len);
                 break;
             }
-            struct x86_decode decode;
 
             load_regs(cpu);
             decode_instruction(env, &decode, ins_len);
@@ -664,10 +667,6 @@ int hvf_vcpu_exec(CPUState *cpu)
             break;
         }
         case EXIT_REASON_APIC_ACCESS: { /* TODO */
-            struct x86_decode decode;
-
-            load_regs(cpu);
-            decode_instruction(env, &decode, ins_len);
             exec_instruction(env, &decode);
             store_regs(cpu);
             break;
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 06/11] i386/hvf: APIC access exit with fast-path for common mov cases
  2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
                   ` (4 preceding siblings ...)
  2024-12-09 20:36 ` [PATCH 05/11] i386/hvf: Decode APIC access x86 instruction outside BQL phil
@ 2024-12-09 20:36 ` phil
  2024-12-09 20:36 ` [PATCH 07/11] i386/hvf: Enables APIC_ACCESS VM exits by setting APICBASE phil
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

The implementation of the EXIT_REASON_APIC_ACCESS vm exit handler has so far
been essentially the same as a regular EPT fault handler, performing a full
simulation of the faulted instruction. The code path has also not been used at
all because the APIC base address setter in Hypervisor.framework was never
called. This change improves the former.

In particular, the APIC_ACCESS exit provides us some additional metadata which
in many cases allows us to avoid a full instruction emulation.

There is no need to walk the memory hierarchy, because exit_qual contains the
APIC MMIO offset. It also tells us whether it's an MMIO read or write. So
we can detect common mov instructions and directly call the relevant APIC
accessor functions.

For more complex instructions, we can fall back to the usual instruction
emulation.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 hw/intc/apic.c               |  4 +-
 include/hw/i386/apic.h       |  2 +
 meson.build                  |  1 +
 target/i386/hvf/hvf.c        | 18 +++++++-
 target/i386/hvf/trace-events |  9 ++++
 target/i386/hvf/trace.h      |  1 +
 target/i386/hvf/x86_emu.c    | 84 ++++++++++++++++++++++++++++++++++++
 target/i386/hvf/x86_emu.h    |  2 +
 8 files changed, 117 insertions(+), 4 deletions(-)
 create mode 100644 target/i386/hvf/trace-events
 create mode 100644 target/i386/hvf/trace.h

diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index 4186c57b34..add99f01e5 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -788,7 +788,7 @@ static void apic_timer(void *opaque)
     apic_timer_update(s, s->next_time);
 }
 
-static int apic_register_read(int index, uint64_t *value)
+int apic_register_read(int index, uint64_t *value)
 {
     DeviceState *dev;
     APICCommonState *s;
@@ -936,7 +936,7 @@ static void apic_send_msi(MSIMessage *msi)
     apic_deliver_irq(dest, dest_mode, delivery, vector, trigger_mode);
 }
 
-static int apic_register_write(int index, uint64_t val)
+int apic_register_write(int index, uint64_t val)
 {
     DeviceState *dev;
     APICCommonState *s;
diff --git a/include/hw/i386/apic.h b/include/hw/i386/apic.h
index eb606d6076..47946e5581 100644
--- a/include/hw/i386/apic.h
+++ b/include/hw/i386/apic.h
@@ -20,6 +20,8 @@ void apic_designate_bsp(DeviceState *d, bool bsp);
 int apic_get_highest_priority_irr(DeviceState *dev);
 int apic_msr_read(int index, uint64_t *val);
 int apic_msr_write(int index, uint64_t val);
+int apic_register_read(int index, uint64_t *value);
+int apic_register_write(int index, uint64_t val);
 bool is_x2apic_mode(DeviceState *d);
 
 /* pc.c */
diff --git a/meson.build b/meson.build
index 147097c652..0846c09bdb 100644
--- a/meson.build
+++ b/meson.build
@@ -3606,6 +3606,7 @@ if have_system or have_user
     'target/arm/hvf',
     'target/hppa',
     'target/i386',
+    'target/i386/hvf',
     'target/i386/kvm',
     'target/loongarch',
     'target/mips/tcg',
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 3f1ff0f013..2a13a9e49b 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -75,6 +75,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/accel.h"
 #include "target/i386/cpu.h"
+#include "trace.h"
 
 static Error *invtsc_mig_blocker;
 
@@ -666,8 +667,21 @@ int hvf_vcpu_exec(CPUState *cpu)
             store_regs(cpu);
             break;
         }
-        case EXIT_REASON_APIC_ACCESS: { /* TODO */
-            exec_instruction(env, &decode);
+        case EXIT_REASON_APIC_ACCESS: {
+            bool is_load = (exit_qual & 0x1000) == 0;
+            uint32_t apic_register_idx = (exit_qual & 0xff0) >> 4;
+
+            if (simulate_fast_path_apic_mmio(is_load, apic_register_idx,
+                                             env, &decode)) {
+                env->eip += ins_len;
+            } else {
+                trace_hvf_x86_vcpu_exec_apic_access_slowpath(
+                    is_load ? "load from" : "store to", apic_register_idx,
+                    ins_len, decode.prefetch_buf[0], decode.prefetch_buf[1],
+                    decode.prefetch_buf[2], decode.prefetch_buf[3],
+                    decode.prefetch_buf[4], decode.prefetch_buf[5]);
+                exec_instruction(env, &decode);
+            }
             store_regs(cpu);
             break;
         }
diff --git a/target/i386/hvf/trace-events b/target/i386/hvf/trace-events
new file mode 100644
index 0000000000..7d0230fb37
--- /dev/null
+++ b/target/i386/hvf/trace-events
@@ -0,0 +1,9 @@
+# See docs/devel/tracing.rst for syntax documentation.
+
+# hvf.c
+hvf_x86_vcpu_exec_apic_access_slowpath(const char *access_type, uint32_t apic_register_idx, uint32_t ins_len, uint8_t ins_byte_0, uint8_t ins_byte_1, uint8_t ins_byte_2, uint8_t ins_byte_3, uint8_t ins_byte_4, uint8_t ins_byte_5) "xAPIC %s register 0x%" PRIx32" taking slow path; instruction length: %" PRIu32 ", bytes: %02x %02x %02x %02x  %02x %02x ..."
+
+# x86_emu.c
+hvf_x86_emu_mmio_load_instruction_fastpath(int cmd, int operand_size, int opcode_len, uint8_t opcode_byte_0, uint8_t opcode_byte_1, uint8_t opcode_byte_2) "slow path apic load: cmd = %d, operand_size = %u, opcode_len = %u, opcode = [ %02x %02x %02x ... ]"
+hvf_x86_emu_mmio_store_instruction_fastpath(int cmd, int operand_size, int opcode_len, uint8_t opcode_byte_0, uint8_t opcode_byte_1, uint8_t opcode_byte_2) "slow path apic store: cmd = %d, operand_size = %u, opcode_len = %u, opcode = [ %02x %02x %02x ... ]"
+hvf_x86_fast_path_apic_mmio_failed(const char *access_type, uint32_t apic_register_idx, uint64_t value, int result) "xAPIC %s register 0x%"PRIx32", value 0x%"PRIx64" returned error %d from APIC"
diff --git a/target/i386/hvf/trace.h b/target/i386/hvf/trace.h
new file mode 100644
index 0000000000..14f15a752a
--- /dev/null
+++ b/target/i386/hvf/trace.h
@@ -0,0 +1 @@
+#include "trace/trace-target_i386_hvf.h"
diff --git a/target/i386/hvf/x86_emu.c b/target/i386/hvf/x86_emu.c
index 015f760acb..197fa155a0 100644
--- a/target/i386/hvf/x86_emu.c
+++ b/target/i386/hvf/x86_emu.c
@@ -44,6 +44,7 @@
 #include "x86_flags.h"
 #include "vmcs.h"
 #include "vmx.h"
+#include "trace.h"
 
 void hvf_handle_io(CPUState *cs, uint16_t port, void *data,
                    int direction, int size, uint32_t count);
@@ -897,6 +898,89 @@ static void exec_wrmsr(CPUX86State *env, struct x86_decode *decode)
     env->eip += decode->len;
 }
 
+static bool mmio_load_instruction_fastpath(x86_decode *decode, CPUX86State *env,
+                                           int *load_dest_reg)
+{
+    if (decode->cmd == X86_DECODE_CMD_MOV && decode->operand_size == 4
+        && decode->opcode_len == 1) {
+        if (decode->opcode[0] == 0x8b) {
+            g_assert(decode->op[0].type == X86_VAR_REG);
+            g_assert(decode->op[1].type == X86_VAR_RM);
+
+            *load_dest_reg = decode->op[0].reg | (decode->rex.r ? R_R8 : 0);
+            return true;
+        } else if (decode->opcode[0] == 0xa1) {
+            *load_dest_reg = R_EAX;
+            return true;
+        }
+    }
+
+    trace_hvf_x86_emu_mmio_load_instruction_fastpath(
+        decode->cmd, decode->operand_size, decode->opcode_len,
+        decode->opcode[0], decode->opcode[1], decode->opcode[2]);
+
+    return false;
+}
+
+static bool mmio_store_instruction_fastpath(x86_decode *decode, CPUX86State *env,
+                                            uint64_t *store_val)
+{
+    if (decode->cmd == X86_DECODE_CMD_MOV && decode->operand_size == 4 &&
+        decode->opcode_len == 1) {
+        if (decode->opcode[0] == 0x89) { /* mov DWORD PTR [reg0+off],reg1 */
+            g_assert(decode->op[1].type == X86_VAR_REG);
+            g_assert(decode->op[0].type == X86_VAR_RM);
+
+            *store_val = RRX(env, decode->op[1].reg | (decode->rex.r ? R_R8 : 0));
+            return true;
+        } else if (decode->opcode[0] == 0xc7) { /* mov DWORD PTR [reg0+off],imm*/
+            g_assert(decode->op[0].type == X86_VAR_RM);
+            g_assert(decode->op[1].type == X86_VAR_IMMEDIATE);
+            *store_val = decode->op[1].val;
+            return true;
+        } else if (decode->opcode[0] == 0xa3) { /* movabs ds:immaddr,eax */
+            *store_val = RRX(env, R_EAX);
+            return true;
+        }
+    }
+
+    trace_hvf_x86_emu_mmio_store_instruction_fastpath(
+        decode->cmd, decode->operand_size, decode->opcode_len,
+        decode->opcode[0], decode->opcode[1], decode->opcode[2]);
+    return false;
+}
+
+
+bool simulate_fast_path_apic_mmio(bool is_load, uint32_t apic_register_idx,
+                                  CPUX86State *env, x86_decode* decode)
+{
+    uint64_t value;
+    int load_dest_reg;
+    int res;
+
+    if (is_load) {
+        if (!mmio_load_instruction_fastpath(decode, env, &load_dest_reg)) {
+            return false;
+        }
+        res = apic_register_read(apic_register_idx, &value);
+        if (res == 0) {
+            RRX(env, load_dest_reg) = value;
+        }
+    } else {
+        if (!mmio_store_instruction_fastpath(decode, env, &value)) {
+            return false;
+        }
+        res = apic_register_write(apic_register_idx, value);
+    }
+
+    if (res != 0) {
+        trace_hvf_x86_fast_path_apic_mmio_failed(
+            is_load ? "load from" : "store to", apic_register_idx, value, res);
+        raise_exception(env, EXCP0D_GPF, 0);
+    }
+    return true;
+}
+
 /*
  * flag:
  * 0 - bt, 1 - btc, 2 - bts, 3 - btr
diff --git a/target/i386/hvf/x86_emu.h b/target/i386/hvf/x86_emu.h
index 8bd97608c4..6726ca2240 100644
--- a/target/i386/hvf/x86_emu.h
+++ b/target/i386/hvf/x86_emu.h
@@ -31,6 +31,8 @@ void store_regs(CPUState *cpu);
 
 void simulate_rdmsr(CPUX86State *env);
 void simulate_wrmsr(CPUX86State *env);
+bool simulate_fast_path_apic_mmio(bool is_load, uint32_t apic_register_idx,
+                                  CPUX86State *env, x86_decode* decode);
 
 target_ulong read_reg(CPUX86State *env, int reg, int size);
 void write_reg(CPUX86State *env, int reg, target_ulong val, int size);
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 07/11] i386/hvf: Enables APIC_ACCESS VM exits by setting APICBASE
  2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
                   ` (5 preceding siblings ...)
  2024-12-09 20:36 ` [PATCH 06/11] i386/hvf: APIC access exit with fast-path for common mov cases phil
@ 2024-12-09 20:36 ` phil
  2024-12-09 20:36 ` [PATCH 08/11] i386/hvf: Variable type fixup in decoder phil
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

This change activates virtualised APIC access VM exits so the new
fast-pathed implementation will be taken.

Two parts are required for enabling APIC_ACCESS exits rather than
falling back to "regular" MMIO EPT faults: Hypervisor.framework
needs to know the current APIC base address, and the APIC access
virtualisation ctl, VMCS_PRI_PROC_BASED2_CTLS_APIC_ACCESSES,
must be set in the VMCS. The latter has already been set in QEMU's
HVF accel, but setting the APIC base address has been missing.

This change calls hv_vmx_vcpu_set_apic_address() before a vCPU
runs for the first time, and whenever the APICBASE MSR is modified
and the xAPIC is enabled. Additionally, the APIC access ctl is
toggled when the APIC is enabled or disabled, or changes mode.

In addition to making APIC access VM exits occur at all, it also
makes APIC relocation work, at least on the fast path. (QEMU does
not currently support different address spaces per vCPU, which
is why the purely EPT fault based software APIC - and thus the slow
path - does not properly support relocation.)

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 target/i386/hvf/hvf.c     | 11 +++++++++++
 target/i386/hvf/x86_emu.c | 18 ++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 2a13a9e49b..a7b8d124bb 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -341,6 +341,17 @@ int hvf_arch_init_vcpu(CPUState *cpu)
 
 void hvf_vcpu_before_first_run(CPUState *cpu)
 {
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    hv_vcpuid_t vcpu = cpu->accel->fd;
+    uint64_t apic_base;
+    hv_return_t apicbase_result;
+
+    if (cpu_is_apic_enabled(x86_cpu->apic_state)
+        && !is_x2apic_mode(x86_cpu->apic_state)) {
+        apic_base = MSR_IA32_APICBASE_BASE & cpu_get_apic_base(x86_cpu->apic_state);
+        apicbase_result = hv_vmx_vcpu_set_apic_address(vcpu, apic_base);
+        assert_hvf_ok(apicbase_result);
+    }
 }
 
 static void hvf_store_events(CPUState *cpu, uint32_t ins_len, uint64_t idtvec_info)
diff --git a/target/i386/hvf/x86_emu.c b/target/i386/hvf/x86_emu.c
index 197fa155a0..88a946cb0f 100644
--- a/target/i386/hvf/x86_emu.c
+++ b/target/i386/hvf/x86_emu.c
@@ -797,10 +797,28 @@ void simulate_wrmsr(CPUX86State *env)
         break;
     case MSR_IA32_APICBASE: {
         int r;
+        hv_return_t res;
 
         r = cpu_set_apic_base(cpu->apic_state, data);
         if (r < 0) {
             raise_exception(env, EXCP0D_GPF, 0);
+        } else {
+            uint64_t pbc = rvmcs(cs->accel->fd, VMCS_SEC_PROC_BASED_CTLS);
+            uint64_t new_pbc;
+            if (cpu_is_apic_enabled(cpu->apic_state)
+                && !is_x2apic_mode(cpu->apic_state)) {
+                res = hv_vmx_vcpu_set_apic_address(cs->accel->fd,
+                                                   data & MSR_IA32_APICBASE_BASE);
+                assert_hvf_ok(res);
+
+                new_pbc = pbc | VMCS_PRI_PROC_BASED2_CTLS_APIC_ACCESSES;
+            } else {
+                new_pbc = pbc & ~VMCS_PRI_PROC_BASED2_CTLS_APIC_ACCESSES;
+            }
+            if (new_pbc != pbc) {
+                wvmcs(cs->accel->fd, VMCS_SEC_PROC_BASED_CTLS,
+                    cap2ctrl(hvf_state->hvf_caps->vmx_cap_procbased2, new_pbc));
+            }
         }
 
         break;
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 08/11] i386/hvf: Variable type fixup in decoder
  2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
                   ` (6 preceding siblings ...)
  2024-12-09 20:36 ` [PATCH 07/11] i386/hvf: Enables APIC_ACCESS VM exits by setting APICBASE phil
@ 2024-12-09 20:36 ` phil
  2024-12-09 20:49   ` Philippe Mathieu-Daudé
  2024-12-09 20:36 ` [PATCH 09/11] i386/hvf: Print hex pairs for each opcode byte in decode error phil
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

decode_bytes reads 1, 2, 4, or 8 bytes at a time. The destination
variable should therefore be a uint64_t, not a target_ulong.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 target/i386/hvf/x86_decode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/i386/hvf/x86_decode.c b/target/i386/hvf/x86_decode.c
index 79dfc30408..6c7cfc820f 100644
--- a/target/i386/hvf/x86_decode.c
+++ b/target/i386/hvf/x86_decode.c
@@ -61,8 +61,8 @@ uint64_t sign(uint64_t val, int size)
 static inline uint64_t decode_bytes(CPUX86State *env, struct x86_decode *decode,
                                     int size)
 {
-    target_ulong val = 0;
-    
+    uint64_t val = 0;
+
     switch (size) {
     case 1:
     case 2:
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 09/11] i386/hvf: Print hex pairs for each opcode byte in decode error
  2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
                   ` (7 preceding siblings ...)
  2024-12-09 20:36 ` [PATCH 08/11] i386/hvf: Variable type fixup in decoder phil
@ 2024-12-09 20:36 ` phil
  2024-12-09 20:54   ` Philippe Mathieu-Daudé
  2024-12-09 20:36 ` [PATCH 10/11] hw/intc/apic: Fixes magic number use, removes outdated comment phil
  2024-12-09 20:36 ` [PATCH 11/11] hw/intc/apic: Raise exception when setting reserved APICBASE bits phil
  10 siblings, 1 reply; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

Printing a sequence of bytes as hex with leading zeroes omitted just looks odd.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 target/i386/hvf/x86_decode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/hvf/x86_decode.c b/target/i386/hvf/x86_decode.c
index 6c7cfc820f..f8d37f2d53 100644
--- a/target/i386/hvf/x86_decode.c
+++ b/target/i386/hvf/x86_decode.c
@@ -30,7 +30,7 @@ static void decode_invalid(CPUX86State *env, struct x86_decode *decode)
 {
     printf("%llx: failed to decode instruction ", env->eip);
     for (int i = 0; i < decode->opcode_len; i++) {
-        printf("%x ", decode->opcode[i]);
+        printf("%02x ", decode->opcode[i]);
     }
     printf("\n");
     VM_PANIC("decoder failed\n");
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 10/11] hw/intc/apic: Fixes magic number use, removes outdated comment
  2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
                   ` (8 preceding siblings ...)
  2024-12-09 20:36 ` [PATCH 09/11] i386/hvf: Print hex pairs for each opcode byte in decode error phil
@ 2024-12-09 20:36 ` phil
  2024-12-09 20:55   ` Philippe Mathieu-Daudé
  2024-12-09 20:36 ` [PATCH 11/11] hw/intc/apic: Raise exception when setting reserved APICBASE bits phil
  10 siblings, 1 reply; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

This changes replaces the use of an explicit literal constant for
the APIC base address mask with the existing symbolic constant
intended for this purpose.

Additionally, we remove the comment about not being able to
re-enable the APIC after disabling it. This is no longer
the case after the APIC implementation's state machine was
modified in 9.0.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 hw/intc/apic.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index add99f01e5..d72cbb2a8f 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -350,9 +350,8 @@ static int apic_set_base(APICCommonState *s, uint64_t val)
         return -1;
     }
 
-    s->apicbase = (val & 0xfffff000) |
+    s->apicbase = (val & MSR_IA32_APICBASE_BASE) |
         (s->apicbase & (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE));
-    /* if disabled, cannot be enabled again */
     if (!(val & MSR_IA32_APICBASE_ENABLE)) {
         s->apicbase &= ~MSR_IA32_APICBASE_ENABLE;
         cpu_clear_apic_feature(&s->cpu->env);
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 11/11] hw/intc/apic: Raise exception when setting reserved APICBASE bits
  2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
                   ` (9 preceding siblings ...)
  2024-12-09 20:36 ` [PATCH 10/11] hw/intc/apic: Fixes magic number use, removes outdated comment phil
@ 2024-12-09 20:36 ` phil
  10 siblings, 0 replies; 26+ messages in thread
From: phil @ 2024-12-09 20:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Phil Dennis-Jordan

From: Phil Dennis-Jordan <phil@philjordan.eu>

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 hw/intc/apic.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index d72cbb2a8f..83e626a45e 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -309,6 +309,11 @@ bool is_x2apic_mode(DeviceState *dev)
 
 static int apic_set_base_check(APICCommonState *s, uint64_t val)
 {
+    /* Refuse to set reserved bits */
+    if (val & MSR_IA32_APICBASE_RESERVED) {
+        return -1;
+    }
+
     /* Enable x2apic when x2apic is not supported by CPU */
     if (!cpu_has_x2apic_feature(&s->cpu->env) &&
         val & MSR_IA32_APICBASE_EXTD) {
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 08/11] i386/hvf: Variable type fixup in decoder
  2024-12-09 20:36 ` [PATCH 08/11] i386/hvf: Variable type fixup in decoder phil
@ 2024-12-09 20:49   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-12-09 20:49 UTC (permalink / raw)
  To: phil, qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm

On 9/12/24 21:36, phil@philjordan.eu wrote:
> From: Phil Dennis-Jordan <phil@philjordan.eu>
> 
> decode_bytes reads 1, 2, 4, or 8 bytes at a time. The destination
> variable should therefore be a uint64_t, not a target_ulong.
> 

Fixes: ff2de1668c9 ("i386: hvf: remove addr_t")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
> ---
>   target/i386/hvf/x86_decode.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/hvf/x86_decode.c b/target/i386/hvf/x86_decode.c
> index 79dfc30408..6c7cfc820f 100644
> --- a/target/i386/hvf/x86_decode.c
> +++ b/target/i386/hvf/x86_decode.c
> @@ -61,8 +61,8 @@ uint64_t sign(uint64_t val, int size)
>   static inline uint64_t decode_bytes(CPUX86State *env, struct x86_decode *decode,
>                                       int size)
>   {
> -    target_ulong val = 0;
> -
> +    uint64_t val = 0;
> +
>       switch (size) {
>       case 1:
>       case 2:



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 09/11] i386/hvf: Print hex pairs for each opcode byte in decode error
  2024-12-09 20:36 ` [PATCH 09/11] i386/hvf: Print hex pairs for each opcode byte in decode error phil
@ 2024-12-09 20:54   ` Philippe Mathieu-Daudé
  2024-12-10 11:28     ` Phil Dennis-Jordan
  0 siblings, 1 reply; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-12-09 20:54 UTC (permalink / raw)
  To: phil, qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm

On 9/12/24 21:36, phil@philjordan.eu wrote:
> From: Phil Dennis-Jordan <phil@philjordan.eu>
> 
> Printing a sequence of bytes as hex with leading zeroes omitted just looks odd.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

> 
> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
> ---
>   target/i386/hvf/x86_decode.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/i386/hvf/x86_decode.c b/target/i386/hvf/x86_decode.c
> index 6c7cfc820f..f8d37f2d53 100644
> --- a/target/i386/hvf/x86_decode.c
> +++ b/target/i386/hvf/x86_decode.c
> @@ -30,7 +30,7 @@ static void decode_invalid(CPUX86State *env, struct x86_decode *decode)
>   {
>       printf("%llx: failed to decode instruction ", env->eip);
>       for (int i = 0; i < decode->opcode_len; i++) {
> -        printf("%x ", decode->opcode[i]);
> +        printf("%02x ", decode->opcode[i]);
>       }
>       printf("\n");

Maybe we should use monitor_printf() here?

>       VM_PANIC("decoder failed\n");



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 10/11] hw/intc/apic: Fixes magic number use, removes outdated comment
  2024-12-09 20:36 ` [PATCH 10/11] hw/intc/apic: Fixes magic number use, removes outdated comment phil
@ 2024-12-09 20:55   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-12-09 20:55 UTC (permalink / raw)
  To: phil, qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm

On 9/12/24 21:36, phil@philjordan.eu wrote:
> From: Phil Dennis-Jordan <phil@philjordan.eu>
> 
> This changes replaces the use of an explicit literal constant for
> the APIC base address mask with the existing symbolic constant
> intended for this purpose.
> 
> Additionally, we remove the comment about not being able to
> re-enable the APIC after disabling it. This is no longer
> the case after the APIC implementation's state machine was
> modified in 9.0.
> 
> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
> ---
>   hw/intc/apic.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 03/11] i386/hvf: Don't send signal to thread when kicking
  2024-12-09 20:36 ` [PATCH 03/11] i386/hvf: Don't send signal to thread when kicking phil
@ 2024-12-09 21:22   ` Philippe Mathieu-Daudé
  2024-12-10  8:17     ` Phil Dennis-Jordan
  2024-12-10  9:21     ` Roman Bolshakov
  0 siblings, 2 replies; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-12-09 21:22 UTC (permalink / raw)
  To: phil, qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Richard Henderson

On 9/12/24 21:36, phil@philjordan.eu wrote:
> From: Phil Dennis-Jordan <phil@philjordan.eu>
> 
> This seems to be entirely superfluous and is costly enough to show up in

So the pthread_kill(cpu->thread, SIG_IPI) is entirely superfluous?

> profiling. hv_vcpu_interrupt() has been demonstrated to very reliably
> cause VM exits - even if the target vCPU isn't even running, it will
> immediately exit on entry.
> 
> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
> ---
>   target/i386/hvf/hvf.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
> index 3b6ee79fb2..936c31dbdd 100644
> --- a/target/i386/hvf/hvf.c
> +++ b/target/i386/hvf/hvf.c
> @@ -214,7 +214,7 @@ static inline bool apic_bus_freq_is_known(CPUX86State *env)
>   
>   void hvf_kick_vcpu_thread(CPUState *cpu)
>   {
> -    cpus_kick_thread(cpu);
> +    cpu->thread_kicked = true;
>       hv_vcpu_interrupt(&cpu->accel->fd, 1);
>   }
>   



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 03/11] i386/hvf: Don't send signal to thread when kicking
  2024-12-09 21:22   ` Philippe Mathieu-Daudé
@ 2024-12-10  8:17     ` Phil Dennis-Jordan
  2024-12-10  9:21     ` Roman Bolshakov
  1 sibling, 0 replies; 26+ messages in thread
From: Phil Dennis-Jordan @ 2024-12-10  8:17 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: qemu-devel, Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 2095 bytes --]

On Mon 9. Dec 2024 at 22:22, Philippe Mathieu-Daudé <philmd@linaro.org>
wrote:

> On 9/12/24 21:36, phil@philjordan.eu wrote:
> > From: Phil Dennis-Jordan <phil@philjordan.eu>
> >
> > This seems to be entirely superfluous and is costly enough to show up in
>
> So the pthread_kill(cpu->thread, SIG_IPI) is entirely superfluous?
>

Yes, that is my conclusion after looking at the code and testing without
the signal sending. (I am only talking about HVF/x86 here!)

By my understanding, the HVF vCPU thread is always in one of 3 states:
- running hv_vcpu_run_until (hv_vcpu_interrupt() forces the transition out
of this, even in edge cases)
- waiting in qemu_wait_io_event In the main  hvf_cpu_thread_fn loop
(signalling the condition variable will bring it out of this)
- actively processing a VM exit, I/O request, etc (whatever it’s doing
can’t be interrupted, but it will make progress and check its job queue on
completion)

So I can’t see any purpose the signal is supposed to be serving. I mean,
maybe I’ve missed something important - always happy to be corrected and
learn something new. :-)

I’ll still need to investigate if the arm64 version is also doing this
unnecessarily and whether we can remove the signal handler entirely.


> > profiling. hv_vcpu_interrupt() has been demonstrated to very reliably
> > cause VM exits - even if the target vCPU isn't even running, it will
> > immediately exit on entry.
> >
> > Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
> > ---
> >   target/i386/hvf/hvf.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
> > index 3b6ee79fb2..936c31dbdd 100644
> > --- a/target/i386/hvf/hvf.c
> > +++ b/target/i386/hvf/hvf.c
> > @@ -214,7 +214,7 @@ static inline bool
> apic_bus_freq_is_known(CPUX86State *env)
> >
> >   void hvf_kick_vcpu_thread(CPUState *cpu)
> >   {
> > -    cpus_kick_thread(cpu);
> > +    cpu->thread_kicked = true;
> >       hv_vcpu_interrupt(&cpu->accel->fd, 1);
> >   }
> >
>
>

[-- Attachment #2: Type: text/html, Size: 4002 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 03/11] i386/hvf: Don't send signal to thread when kicking
  2024-12-09 21:22   ` Philippe Mathieu-Daudé
  2024-12-10  8:17     ` Phil Dennis-Jordan
@ 2024-12-10  9:21     ` Roman Bolshakov
  2024-12-10  9:52       ` Phil Dennis-Jordan
  1 sibling, 1 reply; 26+ messages in thread
From: Roman Bolshakov @ 2024-12-10  9:21 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, phil@philjordan.eu,
	qemu-devel@nongnu.org
  Cc: Cameron Esfahani, Michael S. Tsirkin, Paolo Bonzini,
	Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm@nongnu.org, Richard Henderson

On 10.12.2024 04:22, Philippe Mathieu-Daudé wrote:
> On 9/12/24 21:36, phil@philjordan.eu wrote:
>> From: Phil Dennis-Jordan <phil@philjordan.eu>
>>
>> This seems to be entirely superfluous and is costly enough to show up in
>
> So the pthread_kill(cpu->thread, SIG_IPI) is entirely superfluous?
>
>> profiling. hv_vcpu_interrupt() has been demonstrated to very reliably
>> cause VM exits - even if the target vCPU isn't even running, it will
>> immediately exit on entry.
>>
>> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
>> ---
>>   target/i386/hvf/hvf.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
>> index 3b6ee79fb2..936c31dbdd 100644
>> --- a/target/i386/hvf/hvf.c
>> +++ b/target/i386/hvf/hvf.c
>> @@ -214,7 +214,7 @@ static inline bool 
>> apic_bus_freq_is_known(CPUX86State *env)
>>     void hvf_kick_vcpu_thread(CPUState *cpu)
>>   {
>> -    cpus_kick_thread(cpu);
>> +    cpu->thread_kicked = true;
>>       hv_vcpu_interrupt(&cpu->accel->fd, 1);
>>   }
>
SIG_IPI is macOS crutch handled in XNU kernel that was essential until 
Phil submitted proper kick support with hv_vcpu_interrupt().


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 03/11] i386/hvf: Don't send signal to thread when kicking
  2024-12-10  9:21     ` Roman Bolshakov
@ 2024-12-10  9:52       ` Phil Dennis-Jordan
  2025-02-05 20:58         ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 26+ messages in thread
From: Phil Dennis-Jordan @ 2024-12-10  9:52 UTC (permalink / raw)
  To: Roman Bolshakov
  Cc: Philippe Mathieu-Daudé, qemu-devel@nongnu.org,
	Cameron Esfahani, Michael S. Tsirkin, Paolo Bonzini,
	Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm@nongnu.org, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 1503 bytes --]

On Tue 10. Dec 2024 at 10:21, Roman Bolshakov <rbolshakov@ddn.com> wrote:

> On 10.12.2024 04:22, Philippe Mathieu-Daudé wrote:
> > On 9/12/24 21:36, phil@philjordan.eu wrote:
> >> From: Phil Dennis-Jordan <phil@philjordan.eu>
> >>
> >> This seems to be entirely superfluous and is costly enough to show up in
> >
> > So the pthread_kill(cpu->thread, SIG_IPI) is entirely superfluous?
> >
> >> profiling. hv_vcpu_interrupt() has been demonstrated to very reliably
> >> cause VM exits - even if the target vCPU isn't even running, it will
> >> immediately exit on entry.
> >>
> >> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
> >> ---
> >>   target/i386/hvf/hvf.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
> >> index 3b6ee79fb2..936c31dbdd 100644
> >> --- a/target/i386/hvf/hvf.c
> >> +++ b/target/i386/hvf/hvf.c
> >> @@ -214,7 +214,7 @@ static inline bool
> >> apic_bus_freq_is_known(CPUX86State *env)
> >>     void hvf_kick_vcpu_thread(CPUState *cpu)
> >>   {
> >> -    cpus_kick_thread(cpu);
> >> +    cpu->thread_kicked = true;
> >>       hv_vcpu_interrupt(&cpu->accel->fd, 1);
> >>   }
> >
> SIG_IPI is macOS crutch handled in XNU kernel that was essential until
> Phil submitted proper kick support with hv_vcpu_interrupt().
>
> Ah yes, perhaps it allowed exit from hv_vcpu_run(). hv_vcpu_run_until()
definitely does not exit early upon receiving SIG_IPI (USR1).

[-- Attachment #2: Type: text/html, Size: 2910 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/11] hvf: Add facility for initialisation code prior to first vCPU run
  2024-12-09 20:36 ` [PATCH 01/11] hvf: Add facility for initialisation code prior to first vCPU run phil
@ 2024-12-10 10:05   ` Alexander Graf
  2025-02-05 21:02   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 26+ messages in thread
From: Alexander Graf @ 2024-12-10 10:05 UTC (permalink / raw)
  To: phil, qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Peter Maydell, qemu-arm


On 09.12.24 21:36, phil@philjordan.eu wrote:
> From: Phil Dennis-Jordan <phil@philjordan.eu>
>
> Some VM state required for fully configuring vCPUs is only available
> after all devices have been through their init phase. This extra
> function, called just before each vCPU makes its first VM entry,
> allows us to perform such architecture-specific initialisation.
>
> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>


Reviewed-by: Alexander Graf <agraf@csgraf.de>

Alex




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 02/11] arm/hvf: Initialise GICv3 state just before first vCPU run
  2024-12-09 20:36 ` [PATCH 02/11] arm/hvf: Initialise GICv3 state just before " phil
@ 2024-12-10 10:05   ` Alexander Graf
  2025-02-05 14:47   ` Zenghui Yu
  1 sibling, 0 replies; 26+ messages in thread
From: Alexander Graf @ 2024-12-10 10:05 UTC (permalink / raw)
  To: phil, qemu-devel
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Peter Maydell, qemu-arm


On 09.12.24 21:36, phil@philjordan.eu wrote:
> From: Phil Dennis-Jordan <phil@philjordan.eu>
>
> Initialising the vCPU PFR0_EL1 system register with the GIC flag in
> hvf_arch_init_vcpu() does not actually work because the GIC state is
> not yet available at that time.
>
> If we set this flag just before running each vCPU for the first time,
> the GIC will definitely be fully initialised at that point.
>
> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>


Reviewed-by: Alexander Graf <agraf@csgraf.de>

Alex




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 09/11] i386/hvf: Print hex pairs for each opcode byte in decode error
  2024-12-09 20:54   ` Philippe Mathieu-Daudé
@ 2024-12-10 11:28     ` Phil Dennis-Jordan
  0 siblings, 0 replies; 26+ messages in thread
From: Phil Dennis-Jordan @ 2024-12-10 11:28 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: qemu-devel, Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm

[-- Attachment #1: Type: text/plain, Size: 1312 bytes --]

On Mon, 9 Dec 2024 at 21:54, Philippe Mathieu-Daudé <philmd@linaro.org>
wrote:

> On 9/12/24 21:36, phil@philjordan.eu wrote:
> > From: Phil Dennis-Jordan <phil@philjordan.eu>
> >
> > Printing a sequence of bytes as hex with leading zeroes omitted just
> looks odd.
>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>
> >
> > Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
> > ---
> >   target/i386/hvf/x86_decode.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/target/i386/hvf/x86_decode.c b/target/i386/hvf/x86_decode.c
> > index 6c7cfc820f..f8d37f2d53 100644
> > --- a/target/i386/hvf/x86_decode.c
> > +++ b/target/i386/hvf/x86_decode.c
> > @@ -30,7 +30,7 @@ static void decode_invalid(CPUX86State *env, struct
> x86_decode *decode)
> >   {
> >       printf("%llx: failed to decode instruction ", env->eip);
> >       for (int i = 0; i < decode->opcode_len; i++) {
> > -        printf("%x ", decode->opcode[i]);
> > +        printf("%02x ", decode->opcode[i]);
> >       }
> >       printf("\n");
>
> Maybe we should use monitor_printf() here?
>

Or perhaps snprintf it into a buffer, then change this…


> >       VM_PANIC("decoder failed\n");
>
>
… to a VM_PANIC_EX() that also writes out the opcode buffer?

[-- Attachment #2: Type: text/html, Size: 2288 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 02/11] arm/hvf: Initialise GICv3 state just before first vCPU run
  2024-12-09 20:36 ` [PATCH 02/11] arm/hvf: Initialise GICv3 state just before " phil
  2024-12-10 10:05   ` Alexander Graf
@ 2025-02-05 14:47   ` Zenghui Yu
  1 sibling, 0 replies; 26+ messages in thread
From: Zenghui Yu @ 2025-02-05 14:47 UTC (permalink / raw)
  To: phil
  Cc: qemu-devel, Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm

On 2024/12/10 04:36, phil@philjordan.eu wrote:
> From: Phil Dennis-Jordan <phil@philjordan.eu>
> 
> Initialising the vCPU PFR0_EL1 system register with the GIC flag in
> hvf_arch_init_vcpu() does not actually work because the GIC state is
> not yet available at that time.
> 
> If we set this flag just before running each vCPU for the first time,
> the GIC will definitely be fully initialised at that point.
> 
> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
> ---
>  target/arm/hvf/hvf.c | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)

Tested-by: Zenghui Yu <zenghui.yu@linux.dev>

Thanks!


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 03/11] i386/hvf: Don't send signal to thread when kicking
  2024-12-10  9:52       ` Phil Dennis-Jordan
@ 2025-02-05 20:58         ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-02-05 20:58 UTC (permalink / raw)
  To: Phil Dennis-Jordan, Roman Bolshakov
  Cc: qemu-devel@nongnu.org, Cameron Esfahani, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm@nongnu.org, Richard Henderson

On 10/12/24 10:52, Phil Dennis-Jordan wrote:
> 
> 
> On Tue 10. Dec 2024 at 10:21, Roman Bolshakov <rbolshakov@ddn.com 
> <mailto:rbolshakov@ddn.com>> wrote:
> 
>     On 10.12.2024 04:22, Philippe Mathieu-Daudé wrote:
>      > On 9/12/24 21:36, phil@philjordan.eu <mailto:phil@philjordan.eu>
>     wrote:
>      >> From: Phil Dennis-Jordan <phil@philjordan.eu
>     <mailto:phil@philjordan.eu>>
>      >>
>      >> This seems to be entirely superfluous and is costly enough to
>     show up in
>      >
>      > So the pthread_kill(cpu->thread, SIG_IPI) is entirely superfluous?
>      >
>      >> profiling. hv_vcpu_interrupt() has been demonstrated to very
>     reliably
>      >> cause VM exits - even if the target vCPU isn't even running, it will
>      >> immediately exit on entry.
>      >>
>      >> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu
>     <mailto:phil@philjordan.eu>>
>      >> ---
>      >>   target/i386/hvf/hvf.c | 2 +-
>      >>   1 file changed, 1 insertion(+), 1 deletion(-)
>      >>
>      >> diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
>      >> index 3b6ee79fb2..936c31dbdd 100644
>      >> --- a/target/i386/hvf/hvf.c
>      >> +++ b/target/i386/hvf/hvf.c
>      >> @@ -214,7 +214,7 @@ static inline bool
>      >> apic_bus_freq_is_known(CPUX86State *env)
>      >>     void hvf_kick_vcpu_thread(CPUState *cpu)
>      >>   {
>      >> -    cpus_kick_thread(cpu);
>      >> +    cpu->thread_kicked = true;
>      >>       hv_vcpu_interrupt(&cpu->accel->fd, 1);
>      >>   }
>      >
>     SIG_IPI is macOS crutch handled in XNU kernel that was essential until
>     Phil submitted proper kick support with hv_vcpu_interrupt().
> 
> Ah yes, perhaps it allowed exit from hv_vcpu_run(). hv_vcpu_run_until() 
> definitely does not exit early upon receiving SIG_IPI (USR1).

Maybe worth explaining and mentioning commit bf9bf2306cc ("i386/hvf:
In kick_vcpu use hv_vcpu_interrupt to force exit") in this patch
description?


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/11] hvf: Add facility for initialisation code prior to first vCPU run
  2024-12-09 20:36 ` [PATCH 01/11] hvf: Add facility for initialisation code prior to first vCPU run phil
  2024-12-10 10:05   ` Alexander Graf
@ 2025-02-05 21:02   ` Philippe Mathieu-Daudé
  2025-02-09 19:45     ` Phil Dennis-Jordan
  1 sibling, 1 reply; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-02-05 21:02 UTC (permalink / raw)
  To: phil, qemu-devel, Igor Mammedov, Richard Henderson
  Cc: Cameron Esfahani, Roman Bolshakov, Michael S. Tsirkin,
	Paolo Bonzini, Marcel Apfelbaum, Alexander Graf, Peter Maydell,
	qemu-arm

+Igor

On 9/12/24 21:36, phil@philjordan.eu wrote:
> From: Phil Dennis-Jordan <phil@philjordan.eu>
> 
> Some VM state required for fully configuring vCPUs is only available
> after all devices have been through their init phase. This extra
> function, called just before each vCPU makes its first VM entry,
> allows us to perform such architecture-specific initialisation.
> 
> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
> ---
>   accel/hvf/hvf-accel-ops.c | 5 +++++
>   include/sysemu/hvf_int.h  | 1 +
>   target/arm/hvf/hvf.c      | 4 ++++
>   target/i386/hvf/hvf.c     | 4 ++++
>   4 files changed, 14 insertions(+)
> 
> diff --git a/accel/hvf/hvf-accel-ops.c b/accel/hvf/hvf-accel-ops.c
> index d60874d3e6..c17a9a10de 100644
> --- a/accel/hvf/hvf-accel-ops.c
> +++ b/accel/hvf/hvf-accel-ops.c
> @@ -442,6 +442,11 @@ static void *hvf_cpu_thread_fn(void *arg)
>       cpu_thread_signal_created(cpu);
>       qemu_guest_random_seed_thread_part2(cpu->random_seed);
>   
> +    if (!cpu_can_run(cpu)) {
> +        qemu_wait_io_event(cpu);
> +    }
> +    hvf_vcpu_before_first_run(cpu);

Could this be fixed by the cpu_list_add() split?
https://lore.kernel.org/qemu-devel/20250128142152.9889-1-philmd@linaro.org/

>       do {
>           if (cpu_can_run(cpu)) {
>               r = hvf_vcpu_exec(cpu);


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/11] hvf: Add facility for initialisation code prior to first vCPU run
  2025-02-05 21:02   ` Philippe Mathieu-Daudé
@ 2025-02-09 19:45     ` Phil Dennis-Jordan
  0 siblings, 0 replies; 26+ messages in thread
From: Phil Dennis-Jordan @ 2025-02-09 19:45 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: qemu-devel, Igor Mammedov, Richard Henderson, Cameron Esfahani,
	Roman Bolshakov, Michael S. Tsirkin, Paolo Bonzini,
	Marcel Apfelbaum, Alexander Graf, Peter Maydell, qemu-arm

[-- Attachment #1: Type: text/plain, Size: 2380 bytes --]

On Wed, 5 Feb 2025 at 22:02, Philippe Mathieu-Daudé <philmd@linaro.org>
wrote:

> +Igor
>
> On 9/12/24 21:36, phil@philjordan.eu wrote:
> > From: Phil Dennis-Jordan <phil@philjordan.eu>
> >
> > Some VM state required for fully configuring vCPUs is only available
> > after all devices have been through their init phase. This extra
> > function, called just before each vCPU makes its first VM entry,
> > allows us to perform such architecture-specific initialisation.
> >
> > Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
> > ---
> >   accel/hvf/hvf-accel-ops.c | 5 +++++
> >   include/sysemu/hvf_int.h  | 1 +
> >   target/arm/hvf/hvf.c      | 4 ++++
> >   target/i386/hvf/hvf.c     | 4 ++++
> >   4 files changed, 14 insertions(+)
> >
> > diff --git a/accel/hvf/hvf-accel-ops.c b/accel/hvf/hvf-accel-ops.c
> > index d60874d3e6..c17a9a10de 100644
> > --- a/accel/hvf/hvf-accel-ops.c
> > +++ b/accel/hvf/hvf-accel-ops.c
> > @@ -442,6 +442,11 @@ static void *hvf_cpu_thread_fn(void *arg)
> >       cpu_thread_signal_created(cpu);
> >       qemu_guest_random_seed_thread_part2(cpu->random_seed);
> >
> > +    if (!cpu_can_run(cpu)) {
> > +        qemu_wait_io_event(cpu);
> > +    }
> > +    hvf_vcpu_before_first_run(cpu);
>
> Could this be fixed by the cpu_list_add() split?
> https://lore.kernel.org/qemu-devel/20250128142152.9889-1-philmd@linaro.org/
>
>
You mean by implementing a wire() handler for HVF CPU classes? Possibly -
I'll need to apply those patches locally and trace in what context those
wire methods would run. HVF wants most vCPU-specific functions to be run on
the thread owning the vCPU, so if wire() runs on the main QEMU event
handling thread (or anything other than the vCPU's own thread), it won't
work for patches 2 & 7 from this series which actually do stuff in these
before_first_run() handlers.

I notice that Igor's v2 of the cpu_list_add patch set no longer includes
the wire()/unwire() handlers…
https://patchew.org/QEMU/20250207162048.1890669-1-imammedo@redhat.com/

Another option might be to use async_run_on_cpu() for such early
on-vCPU-thread initialisation, but I figured that option would perhaps be a
little to indirect to readers of the code and difficult to reason about.

>       do {
> >           if (cpu_can_run(cpu)) {
> >               r = hvf_vcpu_exec(cpu);
>

[-- Attachment #2: Type: text/html, Size: 3571 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2025-02-09 19:46 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-09 20:36 [PATCH 00/11] hvf and APIC fixes, improvements, and optimisations phil
2024-12-09 20:36 ` [PATCH 01/11] hvf: Add facility for initialisation code prior to first vCPU run phil
2024-12-10 10:05   ` Alexander Graf
2025-02-05 21:02   ` Philippe Mathieu-Daudé
2025-02-09 19:45     ` Phil Dennis-Jordan
2024-12-09 20:36 ` [PATCH 02/11] arm/hvf: Initialise GICv3 state just before " phil
2024-12-10 10:05   ` Alexander Graf
2025-02-05 14:47   ` Zenghui Yu
2024-12-09 20:36 ` [PATCH 03/11] i386/hvf: Don't send signal to thread when kicking phil
2024-12-09 21:22   ` Philippe Mathieu-Daudé
2024-12-10  8:17     ` Phil Dennis-Jordan
2024-12-10  9:21     ` Roman Bolshakov
2024-12-10  9:52       ` Phil Dennis-Jordan
2025-02-05 20:58         ` Philippe Mathieu-Daudé
2024-12-09 20:36 ` [PATCH 04/11] i386/hvf: Pre-fetch emulated instructions phil
2024-12-09 20:36 ` [PATCH 05/11] i386/hvf: Decode APIC access x86 instruction outside BQL phil
2024-12-09 20:36 ` [PATCH 06/11] i386/hvf: APIC access exit with fast-path for common mov cases phil
2024-12-09 20:36 ` [PATCH 07/11] i386/hvf: Enables APIC_ACCESS VM exits by setting APICBASE phil
2024-12-09 20:36 ` [PATCH 08/11] i386/hvf: Variable type fixup in decoder phil
2024-12-09 20:49   ` Philippe Mathieu-Daudé
2024-12-09 20:36 ` [PATCH 09/11] i386/hvf: Print hex pairs for each opcode byte in decode error phil
2024-12-09 20:54   ` Philippe Mathieu-Daudé
2024-12-10 11:28     ` Phil Dennis-Jordan
2024-12-09 20:36 ` [PATCH 10/11] hw/intc/apic: Fixes magic number use, removes outdated comment phil
2024-12-09 20:55   ` Philippe Mathieu-Daudé
2024-12-09 20:36 ` [PATCH 11/11] hw/intc/apic: Raise exception when setting reserved APICBASE bits phil

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).