From: "Michael S. Tsirkin" <mst@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: qemu-devel@nongnu.org, pbonzini@redhat.com, peterx@redhat.com,
david@redhat.com, philmd@linaro.org, mtosatti@redhat.com
Subject: Re: [PATCH v3 00/10] Reinvent BQL-free PIO/MMIO
Date: Mon, 11 Aug 2025 01:36:13 -0400
Message-ID: <20250811013556-mutt-send-email-mst@kernel.org>
In-Reply-To: <20250808120137.2208800-1-imammedo@redhat.com>
On Fri, Aug 08, 2025 at 02:01:27PM +0200, Igor Mammedov wrote:
> v3:
> * hpet: replace explicit atomics with the seqlock API (PeterX)
> * introduce cpu_test_interrupt() (Paolo)
> and use it tree wide for checking interrupts
> * don't take BQL for setting exit_request, use qatomic_set() instead. (Paolo)
> * after above change, replace conditional BQL with unconditional
> to simplify things a bit (Paolo)
> * drop unneeded barriers (Paolo)
> * minor tcg:cpu_handle_interrupt() cleanup
>
> v2:
> * Make both read and write paths BQL-less (Gerd)
> * Refactor HPET to handle lock-less access correctly
> when stopping/starting counter in parallel. (Peter Maydell)
> * Publish kvm-unit-tests HPET bench/torture test [1] to verify
> HPET lock-less handling
nice
acpi things:
Acked-by: Michael S. Tsirkin <mst@redhat.com>
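The seqlock approach named in the v3 changelog can be illustrated with a minimal standalone sketch. It uses plain C11 atomics rather than QEMU's seqlock API, and counter_state, counter_init(), counter_update() and counter_read() are hypothetical names invented for this illustration, not code from the series. The offset/enabled semantics are likewise an assumption:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for device state; not QEMU's HPET state. */
typedef struct {
    atomic_uint seq;          /* even: stable, odd: write in progress */
    _Atomic uint64_t offset;  /* counter offset, protected by seq */
    atomic_int enabled;       /* counter running flag, protected by seq */
} counter_state;

static void counter_init(counter_state *s)
{
    atomic_init(&s->seq, 0);
    atomic_init(&s->offset, 0);
    atomic_init(&s->enabled, 0);
}

/* Writer side. Writers must be serialized by the caller (e.g. a device lock). */
static void counter_update(counter_state *s, uint64_t offset, int enabled)
{
    unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

    assert((seq & 1u) == 0);  /* no other writer may be in progress */
    /* Mark the write as in progress (odd sequence number). */
    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->offset, offset, memory_order_relaxed);
    atomic_store_explicit(&s->enabled, enabled, memory_order_relaxed);
    /* Publish the new state (even sequence number again). */
    atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: takes no lock, retries if a writer interfered. */
static uint64_t counter_read(counter_state *s, uint64_t now)
{
    unsigned seq0, seq1;
    uint64_t offset;
    int enabled;

    do {
        seq0 = atomic_load_explicit(&s->seq, memory_order_acquire);
        offset = atomic_load_explicit(&s->offset, memory_order_relaxed);
        enabled = atomic_load_explicit(&s->enabled, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        seq1 = atomic_load_explicit(&s->seq, memory_order_relaxed);
    } while ((seq0 & 1u) || seq0 != seq1);

    /* Assumed semantics: running counter is clock + offset, else frozen. */
    return enabled ? now + offset : offset;
}
```

Readers never block; they retry only if a writer was active during the read. Writers still need mutual exclusion among themselves, which this sketch leaves to the caller.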
> When booting WS2025 with the following CLI
> 1) -M q35,hpet=off -cpu host -enable-kvm -smp 240,sockets=4
> the guest boots very slowly and is sluggish after boot,
> or it gets stuck at the spinning circle during boot (most of the time).
>
> perf shows that the VM is experiencing heavy BQL contention on the IO path,
> which happens to be ACPI PM timer read access. A variation with
> HPET enabled moves the contention to HPET timer read access.
> And it only gets worse with an increasing number of vCPUs.
>
> The series prevents vCPUs of large VMs from contending on the BQL due to
> PM|HPET timer access and lets Windows move on with the boot process.
>
> Testing lock-less IO with the HPET micro benchmark [2] shows approx 80%
> better performance than the current BQL-locked path.
> [chart https://ibb.co/MJY9999 shows much better scaling of lock-less
> IO compared to the BQL one.]
>
> In my tests, with the CLI above the WS2025 guest wasn't able to boot
> within 30min on either host:
> * 32 cores, 2 NUMA nodes
> * 448 cores, 8 NUMA nodes
> With ACPI PM timer in BQL-free read mode, guest boots within approx:
> * 2min
> * 1min
> respectively.
>
> With HPET enabled boot time shrinks ~2x
> * 4m13 -> 2m21
> * 2m19 -> 1m15
> respectively.
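As a quick check of the "~2x" figure, the quoted timings work out to roughly 1.8x on both hosts. A minimal sketch of that arithmetic follows; the helper names are hypothetical and the integer division truncates:

```c
#include <assert.h>

/* Convert a "XmYY" boot time into seconds. */
static unsigned to_seconds(unsigned minutes, unsigned seconds)
{
    assert(seconds < 60);
    return minutes * 60 + seconds;
}

/* Speedup as a percentage: 200 would mean boot takes half the time. */
static unsigned speedup_percent(unsigned before_s, unsigned after_s)
{
    assert(after_s > 0);
    return before_s * 100 / after_s;
}
```

With the numbers above: 4m13 -> 2m21 is 253s -> 141s, about 1.79x, and 2m19 -> 1m15 is 139s -> 75s, about 1.85x.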
>
> 2) "[kvm-unit-tests PATCH v4 0/5] x86: add HPET counter tests"
> https://lore.kernel.org/kvm/20250725095429.1691734-1-imammedo@redhat.com/T/#t
> PS:
> Using the hv-time=on cpu option helps a lot (when it works) and
> lets the [1] guest boot fine in ~1-2min. The series doesn't make
> a significant impact in this case.
>
> PS2:
> Tested series with a bunch of different guests:
> RHEL-[6..10]x64, WS2012R2, WS2016, WS2022, WS2025
>
> PS3:
> dropped mention of https://bugzilla.redhat.com/show_bug.cgi?id=1322713
> as it's not reproducible with current software stack or even with
> the same qemu/seabios as reported (kernel versions mentioned in
> the report were interim ones and no longer available,
> so I've used nearest released at the time for testing)
>
> Igor Mammedov (10):
> memory: reintroduce BQL-free fine-grained PIO/MMIO
> acpi: mark PMTIMER as unlocked
> hpet: switch to fain-grained device locking
> hpet: move out main counter read into a separate block
> hpet: make main counter read lock-less
> introduce cpu_test_interrupt() that will replace open coded checks
> x86: kvm: use cpu_test_interrupt() instead of oppen coding checks
> kvm: i386: irqchip: take BQL only if there is an interrupt
> use cpu_test_interrupt() instead of oppen coding checks tree wide
> tcg: move interrupt caching and single step masking closer to user
>
> include/hw/core/cpu.h | 12 ++++++++
> include/system/memory.h | 10 +++++++
> accel/tcg/cpu-exec.c | 25 +++++++---------
> accel/tcg/tcg-accel-ops.c | 3 +-
> hw/acpi/core.c | 1 +
> hw/timer/hpet.c | 38 +++++++++++++++++++-----
> system/cpus.c | 3 +-
> system/memory.c | 15 ++++++++++
> system/physmem.c | 2 +-
> target/alpha/cpu.c | 8 ++---
> target/arm/cpu.c | 20 ++++++-------
> target/arm/helper.c | 16 +++++-----
> target/arm/hvf/hvf.c | 6 ++--
> target/avr/cpu.c | 2 +-
> target/hppa/cpu.c | 2 +-
> target/i386/hvf/hvf.c | 4 +--
> target/i386/hvf/x86hvf.c | 21 +++++++------
> target/i386/kvm/kvm.c | 46 ++++++++++++++---------------
> target/i386/nvmm/nvmm-all.c | 24 +++++++--------
> target/i386/tcg/system/seg_helper.c | 2 +-
> target/i386/whpx/whpx-all.c | 34 ++++++++++-----------
> target/loongarch/cpu.c | 2 +-
> target/m68k/cpu.c | 2 +-
> target/microblaze/cpu.c | 2 +-
> target/mips/cpu.c | 6 ++--
> target/mips/kvm.c | 2 +-
> target/openrisc/cpu.c | 3 +-
> target/ppc/cpu_init.c | 2 +-
> target/ppc/kvm.c | 2 +-
> target/rx/cpu.c | 3 +-
> target/rx/helper.c | 2 +-
> target/s390x/cpu-system.c | 2 +-
> target/sh4/cpu.c | 2 +-
> target/sh4/helper.c | 2 +-
> target/sparc/cpu.c | 2 +-
> target/sparc/int64_helper.c | 4 +--
> 36 files changed, 193 insertions(+), 139 deletions(-)
>
> --
> 2.47.1
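The idea behind a single interrupt-test helper, and behind setting exit_request without the BQL, can be modelled outside QEMU with C11 atomics. This is only a sketch under assumptions: vcpu_model, the IRQ_* bits and the vcpu_*() functions are hypothetical names, the memory orderings are my choice for the illustration, and none of it is the series' actual cpu_test_interrupt() implementation:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bits for this sketch; not QEMU's CPU_INTERRUPT_* values. */
enum { IRQ_HARD = 1 << 0, IRQ_EXIT = 1 << 1 };

/* Hypothetical stand-in for per-vCPU state. */
typedef struct {
    atomic_int interrupt_request;  /* bitmask of pending interrupts */
    atomic_bool exit_request;      /* ask the vCPU loop to exit */
} vcpu_model;

static void vcpu_init(vcpu_model *cpu)
{
    atomic_init(&cpu->interrupt_request, 0);
    atomic_init(&cpu->exit_request, false);
}

/* One helper for all callers instead of open-coded "request & mask". */
static inline bool vcpu_test_interrupt(vcpu_model *cpu, int mask)
{
    int req = atomic_load_explicit(&cpu->interrupt_request,
                                   memory_order_acquire);
    return (req & mask) != 0;
}

/* Raise interrupt bits; pairs with the acquire load in the test helper. */
static inline void vcpu_raise_interrupt(vcpu_model *cpu, int mask)
{
    assert(mask != 0);
    atomic_fetch_or_explicit(&cpu->interrupt_request, mask,
                             memory_order_release);
}

/* Clear interrupt bits once they have been serviced. */
static inline void vcpu_clear_interrupt(vcpu_model *cpu, int mask)
{
    atomic_fetch_and_explicit(&cpu->interrupt_request, ~mask,
                              memory_order_release);
}

/* Set the exit flag with a plain atomic store; no big lock is taken. */
static inline void vcpu_request_exit(vcpu_model *cpu)
{
    atomic_store_explicit(&cpu->exit_request, true, memory_order_relaxed);
}
```

A caller can test the mask first and take a global lock only when a bit is actually set, which keeps the common no-interrupt path lock-free.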