All of lore.kernel.org
 help / color / mirror / Atom feed
From: Igor Mammedov <imammedo@redhat.com>
To: "Philippe Mathieu-Daudé" <philmd@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	qemu-ppc <qemu-ppc@nongnu.org>, Peter Xu <peterx@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	mtosatti@redhat.com, qemu-devel@nongnu.org,
	Peter Collingbourne <pcc@google.com>,
	Alexander Graf <agraf@csgraf.de>,
	Roman Bolshakov <r.bolshakov@yadro.com>,
	Sergio Lopez <slp@redhat.com>, qemu-arm <qemu-arm@nongnu.org>,
	Mohamed Mediouni <mohamed@unpredictable.fr>
Subject: Re: [PATCH v4 7/8] kvm: i386: irqchip: take BQL only if there is an interrupt
Date: Wed, 27 Aug 2025 10:40:52 +0200	[thread overview]
Message-ID: <20250827104052.655134e9@fedora> (raw)
In-Reply-To: <c16db8b9-9b73-49c4-8233-370c1b46a382@linaro.org>

On Mon, 25 Aug 2025 12:46:07 +0200
Philippe Mathieu-Daudé <philmd@linaro.org> wrote:

> On 14/8/25 18:05, Igor Mammedov wrote:
> > when kernel-irqchip=split is used, QEMU still hits BQL
> > contention issue when reading ACPI PM/HPET timers
> > (despite of timer[s] access being lock-less).
> > 
> > So Windows with more than 255 cpus is still not able to
> > boot (since it requires iommu -> split irqchip).
> > 
> > Problematic path is in kvm_arch_pre_run() where BQL is taken
> > unconditionally when split irqchip is in use.
> > 
> > There are a few parts that BQL protects there:
> >    1. interrupt check and injecting
> > 
> >      however we do not take BQL when checking for pending
> >      interrupt (even within the same function), so the patch
> >      takes the same approach for cpu->interrupt_request checks
> >      and takes BQL only if there is a job to do.
> > 
> >    2. request_interrupt_window access
> >        CPUState::kvm_run::request_interrupt_window doesn't need BQL
> >        as it's accessed by its own vCPU thread.
> > 
> >    3. cr8/cpu_get_apic_tpr access
> >        the same (as #2) applies to CPUState::kvm_run::cr8,
> >        and APIC registers are also cached/synced (get/put) within
> >        the vCPU thread it belongs to.
> > 
> > Taking BQL only when is necessary, eleminates BQL bottleneck on
> > IO/MMIO only exit path, improoving latency by 80% on HPET micro
> > benchmark.
> > 
> > This lets Windows to boot succesfully (in case hv-time isn't used)
> > when more than 255 vCPUs are in use.
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > Reviewed-by: Peter Xu <peterx@redhat.com>
> > ---
> > v3:
> >    * drop net needed pair of () in cpu->interrupt_request & CPU_INTERRUPT_HARD
> >      check
> >    * Paolo Bonzini <pbonzini@redhat.com>
> >       * don't take BQL when setting exit_request, use qatomic_set() instead
> >       * after above simplification take/release BQL unconditionally
> >       * drop smp_mb() after run->cr8/run->request_interrupt_window update
> > ---
> >   target/i386/kvm/kvm.c | 12 +++++-------
> >   1 file changed, 5 insertions(+), 7 deletions(-)  
> 
> 
> >       /* Force the VCPU out of its inner loop to process any INIT requests
> >        * or (for userspace APIC, but it is cheap to combine the checks here)
> > @@ -5489,10 +5486,10 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
> >       if (cpu_test_interrupt(cpu, CPU_INTERRUPT_INIT | CPU_INTERRUPT_TPR)) {
> >           if (cpu_test_interrupt(cpu, CPU_INTERRUPT_INIT) &&
> >               !(env->hflags & HF_SMM_MASK)) {
> > -            cpu->exit_request = 1;
> > +            qatomic_set(&cpu->exit_request, 1);
> >           }
> >           if (cpu_test_interrupt(cpu, CPU_INTERRUPT_TPR)) {
> > -            cpu->exit_request = 1;
> > +            qatomic_set(&cpu->exit_request, 1);
> >           }
> >       }  
> 
> Interesting. IMHO to avoid future similar problems, we should declare
> CPUState::exit_request a "private" field and access it via a helper,
> where atomicity is enforced.

I did only bare minimum here, while
Paolo took over sanitizing/fixing exit_request part
see
https://patchew.org/QEMU/20250808185905.62776-1-pbonzini@redhat.com/
so this suggestion better fits there.
 
> The only non-accelerator use is in PPC sPAPR:
> 
> hw/ppc/spapr_hcall.c:512:        cs->exit_request = 1;
> hw/ppc/spapr_hcall.c:534:    cs->exit_request = 1;
> hw/ppc/spapr_hcall.c:627:    cs->exit_request = 1;
> 
> FYI last week we noticed a similar problem with CPUState::thread_kicked
> when using HVF and I'll post in a few days a series containing this
> change:
> 
> -- >8 --  
> diff --git a/system/cpus.c b/system/cpus.c
> index 26fb3bd69c3..39daf85bae7 100644
> --- a/system/cpus.c
> +++ b/system/cpus.c
> @@ -464,10 +464,10 @@ void qemu_wait_io_event(CPUState *cpu)
> 
>   void cpus_kick_thread(CPUState *cpu)
>   {
> -    if (cpu->thread_kicked) {
> +    if (qatomic_read(&cpu->thread_kicked)) {
>           return;
>       }
> -    cpu->thread_kicked = true;
> +    qatomic_set(&cpu->thread_kicked, true);
> 
> ---
> 
> It only affects HVF because this is the single access to thread_kicked
> out of accelerator code:
> 
> $ git grep -w thread_kicked
> include/hw/core/cpu.h:484:    bool thread_kicked;
> system/cpus.c:449:    qatomic_set_mb(&cpu->thread_kicked, false);
> system/cpus.c:476:    if (cpu->thread_kicked) {
> system/cpus.c:479:    cpu->thread_kicked = true;
> target/arm/hvf/hvf.c:1825:    qatomic_set_mb(&cpu->thread_kicked, false);
> 
> (Call introduced in commit 219c101fa7f "arm/hvf: Add a WFI handler").
> 



  reply	other threads:[~2025-08-27  8:41 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-14 16:05 [PATCH v4 0/8] Reinvent BQL-free PIO/MMIO Igor Mammedov
2025-08-14 16:05 ` [PATCH v4 1/8] memory: reintroduce BQL-free fine-grained PIO/MMIO Igor Mammedov
2025-08-25 10:55   ` Philippe Mathieu-Daudé
2025-08-14 16:05 ` [PATCH v4 2/8] acpi: mark PMTIMER as unlocked Igor Mammedov
2025-08-14 16:05 ` [PATCH v4 3/8] hpet: switch to fain-grained device locking Igor Mammedov
2025-08-25 14:43   ` Zhao Liu
2025-08-14 16:05 ` [PATCH v4 4/8] hpet: move out main counter read into a separate block Igor Mammedov
2025-08-25 14:44   ` Zhao Liu
2025-08-14 16:05 ` [PATCH v4 5/8] hpet: make main counter read lock-less Igor Mammedov
2025-08-25 14:55   ` Zhao Liu
2025-08-25 15:10     ` Igor Mammedov
2025-08-14 16:05 ` [PATCH v4 6/8] add cpu_test_interrupt()/cpu_set_interrupt() helpers and use them tree wide Igor Mammedov
2025-08-14 19:05   ` Peter Xu
2025-08-20 15:01   ` Jason J. Herne
2025-08-21 15:57     ` Igor Mammedov
2025-08-21 15:56   ` [PATCH v5 " Igor Mammedov
2025-08-25  8:16     ` Harsh Prateek Bora
2025-08-25 15:06       ` Igor Mammedov
2025-08-25 10:35     ` Philippe Mathieu-Daudé
2025-08-25 15:02       ` Igor Mammedov
2025-08-25 15:28     ` Zhao Liu
2025-08-25 15:19       ` Igor Mammedov
2025-08-26  7:45         ` Zhao Liu
2025-08-26  8:47           ` Igor Mammedov
2025-08-26  9:27             ` Zhao Liu
2025-08-29  8:18             ` Paolo Bonzini
2025-08-29 12:33               ` Paolo Bonzini
2025-09-01 12:05                 ` Igor Mammedov
2025-09-01 12:06                   ` Paolo Bonzini
2025-08-14 16:05 ` [PATCH v4 7/8] kvm: i386: irqchip: take BQL only if there is an interrupt Igor Mammedov
2025-08-25 10:46   ` Philippe Mathieu-Daudé
2025-08-27  8:40     ` Igor Mammedov [this message]
2025-08-14 16:06 ` [PATCH v4 8/8] tcg: move interrupt caching and single step masking closer to user Igor Mammedov
2025-08-29  8:19 ` [PATCH v4 0/8] Reinvent BQL-free PIO/MMIO Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250827104052.655134e9@fedora \
    --to=imammedo@redhat.com \
    --cc=agraf@csgraf.de \
    --cc=mohamed@unpredictable.fr \
    --cc=mst@redhat.com \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=pcc@google.com \
    --cc=peterx@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=r.bolshakov@yadro.com \
    --cc=slp@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.