qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Cc: Igor Mammedov <imammedo@redhat.com>, Peter Xu <peterx@redhat.com>
Subject: [PULL 27/28] kvm: i386: irqchip: take BQL only if there is an interrupt
Date: Fri, 29 Aug 2025 14:59:34 +0200	[thread overview]
Message-ID: <20250829125935.1526984-28-pbonzini@redhat.com> (raw)
In-Reply-To: <20250829125935.1526984-1-pbonzini@redhat.com>

From: Igor Mammedov <imammedo@redhat.com>

when kernel-irqchip=split is used, QEMU still hits BQL
contention issue when reading ACPI PM/HPET timers
(despite of timer[s] access being lock-less).

So Windows with more than 255 cpus is still not able to
boot (since it requires iommu -> split irqchip).

Problematic path is in kvm_arch_pre_run() where BQL is taken
unconditionally when split irqchip is in use.

There are a few parts that BQL protects there:
  1. interrupt check and injecting

    however we do not take BQL when checking for pending
    interrupt (even within the same function), so the patch
    takes the same approach for cpu->interrupt_request checks
    and takes BQL only if there is a job to do.

  2. request_interrupt_window access
      CPUState::kvm_run::request_interrupt_window doesn't need BQL
      as it's accessed by its own vCPU thread.

  3. cr8/cpu_get_apic_tpr access
      the same (as #2) applies to CPUState::kvm_run::cr8,
      and APIC registers are also cached/synced (get/put) within
      the vCPU thread it belongs to.

Taking BQL only when is necessary, eleminates BQL bottleneck on
IO/MMIO only exit path, improoving latency by 80% on HPET micro
benchmark.

This lets Windows to boot succesfully (in case hv-time isn't used)
when more than 255 vCPUs are in use.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20250814160600.2327672-8-imammedo@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/kvm.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index a7b5c8f81bc..306430a0521 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -5478,9 +5478,6 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
         }
     }
 
-    if (!kvm_pic_in_kernel()) {
-        bql_lock();
-    }
 
     /* Force the VCPU out of its inner loop to process any INIT requests
      * or (for userspace APIC, but it is cheap to combine the checks here)
@@ -5489,10 +5486,10 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
     if (cpu_test_interrupt(cpu, CPU_INTERRUPT_INIT | CPU_INTERRUPT_TPR)) {
         if (cpu_test_interrupt(cpu, CPU_INTERRUPT_INIT) &&
             !(env->hflags & HF_SMM_MASK)) {
-            cpu->exit_request = 1;
+            qatomic_set(&cpu->exit_request, 1);
         }
         if (cpu_test_interrupt(cpu, CPU_INTERRUPT_TPR)) {
-            cpu->exit_request = 1;
+            qatomic_set(&cpu->exit_request, 1);
         }
     }
 
@@ -5503,6 +5500,8 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
             (env->eflags & IF_MASK)) {
             int irq;
 
+            bql_lock();
+
             cpu->interrupt_request &= ~CPU_INTERRUPT_HARD;
             irq = cpu_get_pic_interrupt(env);
             if (irq >= 0) {
@@ -5517,6 +5516,7 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
                             strerror(-ret));
                 }
             }
+            bql_unlock();
         }
 
         /* If we have an interrupt but the guest is not ready to receive an
@@ -5531,8 +5531,6 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
 
         DPRINTF("setting tpr\n");
         run->cr8 = cpu_get_apic_tpr(x86_cpu->apic_state);
-
-        bql_unlock();
     }
 }
 
-- 
2.51.0



  parent reply	other threads:[~2025-08-30 16:01 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-29 12:59 [PULL 00/28] i386, accel, memory patches for 2025-08-29 Paolo Bonzini
2025-08-29 12:59 ` [PULL 01/28] hw/i386/pc_piix.c: restrict isapc machine to 32-bit CPUs Paolo Bonzini
2025-08-29 12:59 ` [PULL 02/28] hw/i386/pc_piix.c: restrict isapc machine to 3.5G memory Paolo Bonzini
2025-08-29 12:59 ` [PULL 03/28] hw/i386/pc_piix.c: remove include for loader.h Paolo Bonzini
2025-08-29 12:59 ` [PULL 04/28] hw/i386/pc_piix.c: inline pc_xen_hvm_init_pci() into pc_xen_hvm_init() Paolo Bonzini
2025-08-29 12:59 ` [PULL 05/28] hw/i386/pc_piix.c: duplicate pc_init1() into pc_isa_init() Paolo Bonzini
2025-08-29 12:59 ` [PULL 06/28] hw/i386/pc_piix.c: remove pcmc->pci_enabled dependent initialisation from pc_init_isa() Paolo Bonzini
2025-08-29 12:59 ` [PULL 07/28] hw/i386/pc_piix.c: remove igvm " Paolo Bonzini
2025-08-29 12:59 ` [PULL 08/28] hw/i386/pc_piix.c: remove SMI and piix4_pm " Paolo Bonzini
2025-08-29 12:59 ` [PULL 09/28] hw/i386/pc_piix.c: remove SGX " Paolo Bonzini
2025-08-29 12:59 ` [PULL 10/28] hw/i386/pc_piix.c: remove nvdimm " Paolo Bonzini
2025-08-29 12:59 ` [PULL 11/28] hw/i386/pc_piix.c: simplify RAM size logic in pc_init_isa() Paolo Bonzini
2025-08-29 12:59 ` [PULL 12/28] hw/i386/pc_piix.c: hardcode hole64_size to 0 " Paolo Bonzini
2025-08-29 12:59 ` [PULL 13/28] hw/i386/pc_piix.c: remove pc_system_flash_cleanup_unused() from pc_init_isa() Paolo Bonzini
2025-08-29 12:59 ` [PULL 14/28] hw/i386/pc_piix.c: always initialise ISA IDE drives in pc_init_isa() Paolo Bonzini
2025-08-29 12:59 ` [PULL 15/28] hw/i386/pc_piix.c: assume pcmc->pci_enabled is always true in pc_init1() Paolo Bonzini
2025-09-01 10:43   ` Peter Maydell
2025-09-01 13:27     ` Mark Cave-Ayland
2025-08-29 12:59 ` [PULL 16/28] hw/i386: move isapc machine to separate isapc.c file Paolo Bonzini
2025-08-29 12:59 ` [PULL 17/28] hw/i386/pc_piix.c: remove unused headers after isapc machine split Paolo Bonzini
2025-08-29 12:59 ` [PULL 18/28] hw/i386/pc_piix.c: replace rom_memory with pci_memory Paolo Bonzini
2025-08-29 12:59 ` [PULL 19/28] hw/i386/isapc.c: replace rom_memory with system_memory Paolo Bonzini
2025-08-29 12:59 ` [PULL 20/28] user-exec: ensure interrupt_request is not used Paolo Bonzini
2025-08-29 12:59 ` [PULL 21/28] add cpu_test_interrupt()/cpu_set_interrupt() helpers and use them tree wide Paolo Bonzini
2025-08-29 12:59 ` [PULL 22/28] memory: reintroduce BQL-free fine-grained PIO/MMIO Paolo Bonzini
2025-08-29 12:59 ` [PULL 23/28] acpi: mark PMTIMER as unlocked Paolo Bonzini
2025-08-29 12:59 ` [PULL 24/28] hpet: switch to fine-grained device locking Paolo Bonzini
2025-09-08 14:30   ` Daniel P. Berrangé
2025-09-10 11:16     ` Igor Mammedov
2025-09-10 11:23       ` Paolo Bonzini
2025-09-10 12:56         ` Igor Mammedov
2025-09-15 13:26           ` Peter Maydell
2025-09-10 14:25     ` [PATCH] hpet: guard IRQ handling with BQL Igor Mammedov
2025-09-11 13:40       ` Paolo Bonzini
2025-08-29 12:59 ` [PULL 25/28] hpet: move out main counter read into a separate block Paolo Bonzini
2025-08-29 12:59 ` [PULL 26/28] hpet: make main counter read lock-less Paolo Bonzini
2025-08-29 12:59 ` Paolo Bonzini [this message]
2025-08-29 12:59 ` [PULL 28/28] tcg: move interrupt caching and single step masking closer to user Paolo Bonzini
2025-08-31  7:28 ` [PULL 00/28] i386, accel, memory patches for 2025-08-29 Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250829125935.1526984-28-pbonzini@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=imammedo@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).