qemu-devel.nongnu.org archive mirror
From: Salil Mehta via <qemu-devel@nongnu.org>
To: Richard Henderson <richard.henderson@linaro.org>,
	"salil.mehta@opnsrc.net" <salil.mehta@opnsrc.net>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	 "qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
	"mst@redhat.com" <mst@redhat.com>
Subject: RE: [PATCH RFC V6 24/24] tcg: Defer TB flush for 'lazy realized' vCPUs on first region alloc
Date: Tue, 7 Oct 2025 10:14:00 +0000	[thread overview]
Message-ID: <524d3e6e5f44438285b1a74d4bbb933e@huawei.com> (raw)
In-Reply-To: <7605b216-8aa1-4897-a96e-6ed9953f4e91@linaro.org>

Hi Richard,

Sorry for the delay in replying.

> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Thursday, October 2, 2025 4:41 PM
> 
> On 10/2/25 05:27, Salil Mehta wrote:
> > Hi Richard,
> >
> > Thanks for the reply. Please find my response inline.
> >
> > Cheers.
> >
> >> From: qemu-devel-bounces+salil.mehta=huawei.com@nongnu.org
> >> <qemu-devel-bounces+salil.mehta=huawei.com@nongnu.org>
> >> On Behalf Of Richard Henderson
> >> Sent: Wednesday, October 1, 2025 10:34 PM
> >> To: salil.mehta@opnsrc.net; qemu-devel@nongnu.org;
> >> qemu-arm@nongnu.org; mst@redhat.com
> >> Subject: Re: [PATCH RFC V6 24/24] tcg: Defer TB flush for 'lazy
> >> realized' vCPUs on first region alloc
> >>
> >> On 9/30/25 18:01, salil.mehta@opnsrc.net wrote:
> >>> From: Salil Mehta <salil.mehta@huawei.com>
> >>>
> >>> The TCG code cache is split into regions shared by vCPUs under MTTCG.
> >>> For cold-boot (early realized) vCPUs, regions are sized/allocated
> >>> during bring-up.
> >>> However, when a vCPU is *lazy_realized* (administratively "disabled"
> >>> at boot and realized later on demand), its TCGContext may fail the
> >>> very first code region allocation if the shared TB cache is
> >>> saturated by already-running vCPUs.
> >>>
> >>> Flushing the TB cache is the right remediation, but `tb_flush()`
> >>> must be performed from the safe execution context
> >>> (cpu_exec_loop()/tb_gen_code()). This patch wires a deferred flush:
> >>>
> >>>     * In `tcg_region_initial_alloc__locked()`, treat an initial allocation
> >>>       failure for a lazily realized vCPU as non-fatal: set `s->tbflush_pend`
> >>>       and return.
> >>>
> >>>     * In `tcg_tb_alloc()`, if `s->tbflush_pend` is observed, clear it
> >>>       and return NULL so the caller performs a synchronous `tb_flush()`
> >>>       and then retries allocation.
> >>>
> >>> This avoids hangs observed when a newly realized vCPU cannot obtain
> >>> its first region under TB-cache pressure, while keeping the flush
> >>> at a safe point.
> >>>
> >>> No change for cold-boot vCPUs, or when the accel ops are KVM.
> >>>
> >>> In the earlier series, this patch was named:
> >>> 'tcg: Update tcg_register_thread() leg to handle region alloc for
> >>> hotplugged vCPU'
> >>
> >>
> >> I don't see why you need two different booleans for this.
> >
> >
> > I can see your point. Maybe I can move `s->tbflush_pend` to 'CPUState'
> > instead?
> >
> >
> >> It seems to me that you could create the cpu in a state for which the
> >> first call to tcg_tb_alloc() sees highwater state, and everything after
> >> that happens per usual: allocating a new region, and possibly flushing
> >> the full buffer.
> >
> >
> > Correct, but with a distinction: the highwater state is relevant to a
> > TCGContext, while the regions are allocated from a common pool, the
> > 'Code Generation Buffer'. 'code_gen_highwater' is used to detect
> > whether the current context needs another region allocation for the
> > dynamic translation to continue. This is a different condition from
> > the one we are encountering, which is the worst case where the entire
> > code generation buffer is saturated and cannot allocate even a single
> > free TCG region successfully.
> 
> I think you misunderstand "and everything after that happens per usual".
> 
> When allocating a tb, if a cpu finds that its current region is full, then
> it tries to allocate another region.  If that is not successful, then we
> flush the entire code_gen_buffer and try again.
> 
> Thus tbflush_pend is exactly equivalent to setting
> 
>      s->code_gen_ptr > s->code_gen_highwater.
> 
> As far as lazy_realized...  The utility of the assert under these conditions may
> be called into question; we could just remove it.


I understand your point. I'll remove the 'tbflush_pend' flag and directly use
'code_gen_highwater = NULL' so that we hit the highwater condition early when
the TCG thread gets lazily realized. And yes, we might have to either remove
or conditionally bypass the assert(). I will dig further and validate.

Many thanks for this optimization!

Best regards
Salil.


> 
> 
> r~


Thread overview: 51+ messages
2025-10-01  1:01 [PATCH RFC V6 00/24] Support of Virtual CPU Hotplug-like Feature for ARMv8+ Arch salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 01/24] hw/core: Introduce administrative power-state property and its accessors salil.mehta
2025-10-09 10:48   ` Miguel Luis
2025-10-01  1:01 ` [PATCH RFC V6 02/24] hw/core, qemu-options.hx: Introduce 'disabledcpus' SMP parameter salil.mehta
2025-10-09 11:28   ` Miguel Luis
2025-10-09 13:17     ` Igor Mammedov
2025-10-09 11:51   ` Markus Armbruster
2025-10-01  1:01 ` [PATCH RFC V6 03/24] hw/arm/virt: Clamp 'maxcpus' as-per machine's vCPU deferred online-capability salil.mehta
2025-10-09 12:32   ` Miguel Luis
2025-10-09 13:11     ` Igor Mammedov
2025-10-01  1:01 ` [PATCH RFC V6 04/24] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 05/24] arm/virt, kvm: Pre-create KVM vCPUs for 'disabled' QOM vCPUs at machine init salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 06/24] arm/virt, gicv3: Pre-size GIC with possible " salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 07/24] arm/gicv3: Refactor CPU interface init for shared TCG/KVM use salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 08/24] arm/virt, gicv3: Guard CPU interface access for admin disabled vCPUs salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 09/24] hw/intc/arm_gicv3_common: Migrate & check 'GICv3CPUState' accessibility mismatch salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 10/24] arm/virt: Init PMU at host for all present vCPUs salil.mehta
2025-10-03 15:02   ` Igor Mammedov
2025-10-01  1:01 ` [PATCH RFC V6 11/24] hw/arm/acpi: MADT change to size the guest with possible vCPUs salil.mehta
2025-10-03 15:09   ` Igor Mammedov
     [not found]     ` <0175e40f70424dd9a29389b8a4f16c42@huawei.com>
2025-10-07 12:20       ` Igor Mammedov
2025-10-10  3:15         ` Salil Mehta
2025-10-01  1:01 ` [PATCH RFC V6 12/24] hw/core: Introduce generic device power-state handler interface salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 13/24] qdev: make admin power state changes trigger platform transitions via ACPI salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 14/24] arm/acpi: Introduce dedicated CPU OSPM interface for ARM-like platforms salil.mehta
2025-10-03 14:58   ` Igor Mammedov
     [not found]     ` <7da6a9c470684754810414f0abd23a62@huawei.com>
2025-10-07 12:06       ` Igor Mammedov
2025-10-10  3:00         ` Salil Mehta
2025-10-01  1:01 ` [PATCH RFC V6 15/24] acpi/ged: Notify OSPM of CPU administrative state changes via GED salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 16/24] arm/virt/acpi: Update ACPI DSDT Tbl to include 'Online-Capable' CPUs AML salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 17/24] hw/arm/virt, acpi/ged: Add PowerStateHandler hooks for runtime CPU state changes salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 18/24] target/arm/kvm, tcg: Handle SMCCC hypercall exits in VMM during PSCI_CPU_{ON, OFF} salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 19/24] target/arm/cpu: Add the Accessor hook to fetch ARM CPU arch-id salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 20/24] target/arm/kvm: Write vCPU's state back to KVM on cold-reset salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 21/24] hw/intc/arm-gicv3-kvm: Pause all vCPUs & cache ICC_CTLR_EL1 for userspace PSCI CPU_ON salil.mehta
2025-10-01  1:01 ` [PATCH RFC V6 22/24] monitor, qdev: Introduce 'device_set' to change admin state of existing devices salil.mehta
2025-10-09  8:55   ` [PATCH RFC V6 22/24] monitor,qdev: " Markus Armbruster
2025-10-09 12:51     ` Igor Mammedov
2025-10-09 14:03       ` Daniel P. Berrangé
2025-10-09 14:55       ` Markus Armbruster
2025-10-09 15:19         ` Peter Maydell
2025-10-10  4:59           ` Markus Armbruster
2025-10-01  1:01 ` [PATCH RFC V6 23/24] monitor, qapi: add 'info cpus-powerstate' and QMP query (Admin + Oper states) salil.mehta
2025-10-09 11:53   ` [PATCH RFC V6 23/24] monitor,qapi: " Markus Armbruster
2025-10-01  1:01 ` [PATCH RFC V6 24/24] tcg: Defer TB flush for 'lazy realized' vCPUs on first region alloc salil.mehta
2025-10-01 21:34   ` Richard Henderson
2025-10-02 12:27     ` Salil Mehta via
2025-10-02 15:41       ` Richard Henderson
2025-10-07 10:14         ` Salil Mehta via [this message]
2025-10-06 14:00 ` [PATCH RFC V6 00/24] Support of Virtual CPU Hotplug-like Feature for ARMv8+ Arch Igor Mammedov
2025-10-13  0:34 ` Gavin Shan
