From: Richard Henderson <richard.henderson@linaro.org>
To: Salil Mehta <salil.mehta@huawei.com>,
"salil.mehta@opnsrc.net" <salil.mehta@opnsrc.net>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
"mst@redhat.com" <mst@redhat.com>
Subject: Re: [PATCH RFC V6 24/24] tcg: Defer TB flush for 'lazy realized' vCPUs on first region alloc
Date: Thu, 2 Oct 2025 08:41:12 -0700 [thread overview]
Message-ID: <7605b216-8aa1-4897-a96e-6ed9953f4e91@linaro.org> (raw)
In-Reply-To: <bc780e0c68fa44da975d8f6fcdb38cd7@huawei.com>
On 10/2/25 05:27, Salil Mehta wrote:
> Hi Richard,
>
> Thanks for the reply. Please find my response inline.
>
> Cheers.
>
>> From: qemu-devel-bounces+salil.mehta=huawei.com@nongnu.org <qemu-
>> devel-bounces+salil.mehta=huawei.com@nongnu.org> On Behalf Of Richard
>> Henderson
>> Sent: Wednesday, October 1, 2025 10:34 PM
>> To: salil.mehta@opnsrc.net; qemu-devel@nongnu.org; qemu-
>> arm@nongnu.org; mst@redhat.com
>> Subject: Re: [PATCH RFC V6 24/24] tcg: Defer TB flush for 'lazy realized' vCPUs
>> on first region alloc
>>
>> On 9/30/25 18:01, salil.mehta@opnsrc.net wrote:
>>> From: Salil Mehta <salil.mehta@huawei.com>
>>>
>>> The TCG code cache is split into regions shared by vCPUs under MTTCG.
>>> For cold-boot (early realized) vCPUs, regions are sized/allocated during
>> bring-up.
>>> However, when a vCPU is *lazy_realized* (administratively "disabled"
>>> at boot and realized later on demand), its TCGContext may fail the
>>> very first code region allocation if the shared TB cache is saturated
>>> by already-running vCPUs.
>>>
>>> Flushing the TB cache is the right remediation, but `tb_flush()` must
>>> be performed from the safe execution context
>> (cpu_exec_loop()/tb_gen_code()).
>>> This patch wires a deferred flush:
>>>
>>> * In `tcg_region_initial_alloc__locked()`, treat an initial allocation
>>> failure for a lazily realized vCPU as non-fatal: set `s->tbflush_pend`
>>> and return.
>>>
>>> * In `tcg_tb_alloc()`, if `s->tbflush_pend` is observed, clear it and
>>> return NULL so the caller performs a synchronous `tb_flush()` and then
>>> retries allocation.
>>>
>>> This avoids hangs observed when a newly realized vCPU cannot obtain
>>> its first region under TB-cache pressure, while keeping the flush at a safe
>> point.
>>>
>>> No change for cold-boot vCPUs and when accel ops is KVM.
>>>
>>> In earlier series, this patch was with below named,
>>> 'tcg: Update tcg_register_thread() leg to handle region alloc for hotplugged
>> vCPU'
>>
>>
>> I don't see why you need two different booleans for this.
>
>
> I can see your point. Maybe I can move `s->tbflush_pend` to 'CPUState' instead?
>
>
>> It seems to me that you could create the cpu in a state for which the first call
>> to
>> tcg_tb_alloc() sees highwater state, and everything after that happens per
>> usual allocating a new region, and possibly flushing the full buffer.
>
>
> Correct. but with a distinction that highwater state is relevant to a TCGContext
> and the regions are allocated from a common pool 'Code Generation Buffer'.
> 'code_gen_highwater' is use to detect whether current context needs more
> region allocation for the dynamic translation to continue. This is a different
> condition than what we are encountering; which is the worst case condition
> that the entire code generation buffer is saturated and cannot even allocate
> a single free TCG region successfully.
I think you misunderstand "and everything after that happens per usual".
When allocating a tb, if a cpu finds that it's current region is full, then it tries to
allocate another region. If that is not successful, then we flush the entire
code_gen_buffer and try again.
Thus tbflush_pend is exactly equivalent to setting
s->code_gen_ptr > s->code_gen_highwater.
As far as lazy_realized... The utility of the assert under these conditions may be called
into question; we could just remove it.
r~
next prev parent reply other threads:[~2025-10-02 15:43 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-01 1:01 [PATCH RFC V6 00/24] Support of Virtual CPU Hotplug-like Feature for ARMv8+ Arch salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 01/24] hw/core: Introduce administrative power-state property and its accessors salil.mehta
2025-10-09 10:48 ` Miguel Luis
2025-10-01 1:01 ` [PATCH RFC V6 02/24] hw/core, qemu-options.hx: Introduce 'disabledcpus' SMP parameter salil.mehta
2025-10-09 11:28 ` Miguel Luis
2025-10-09 13:17 ` Igor Mammedov
2025-10-09 11:51 ` Markus Armbruster
2025-10-01 1:01 ` [PATCH RFC V6 03/24] hw/arm/virt: Clamp 'maxcpus' as-per machine's vCPU deferred online-capability salil.mehta
2025-10-09 12:32 ` Miguel Luis
2025-10-09 13:11 ` Igor Mammedov
2025-10-01 1:01 ` [PATCH RFC V6 04/24] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 05/24] arm/virt, kvm: Pre-create KVM vCPUs for 'disabled' QOM vCPUs at machine init salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 06/24] arm/virt, gicv3: Pre-size GIC with possible " salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 07/24] arm/gicv3: Refactor CPU interface init for shared TCG/KVM use salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 08/24] arm/virt, gicv3: Guard CPU interface access for admin disabled vCPUs salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 09/24] hw/intc/arm_gicv3_common: Migrate & check 'GICv3CPUState' accessibility mismatch salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 10/24] arm/virt: Init PMU at host for all present vCPUs salil.mehta
2025-10-03 15:02 ` Igor Mammedov
2025-10-01 1:01 ` [PATCH RFC V6 11/24] hw/arm/acpi: MADT change to size the guest with possible vCPUs salil.mehta
2025-10-03 15:09 ` Igor Mammedov
[not found] ` <0175e40f70424dd9a29389b8a4f16c42@huawei.com>
2025-10-07 12:20 ` Igor Mammedov
2025-10-10 3:15 ` Salil Mehta
2025-10-01 1:01 ` [PATCH RFC V6 12/24] hw/core: Introduce generic device power-state handler interface salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 13/24] qdev: make admin power state changes trigger platform transitions via ACPI salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 14/24] arm/acpi: Introduce dedicated CPU OSPM interface for ARM-like platforms salil.mehta
2025-10-03 14:58 ` Igor Mammedov
[not found] ` <7da6a9c470684754810414f0abd23a62@huawei.com>
2025-10-07 12:06 ` Igor Mammedov
2025-10-10 3:00 ` Salil Mehta
2025-10-01 1:01 ` [PATCH RFC V6 15/24] acpi/ged: Notify OSPM of CPU administrative state changes via GED salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 16/24] arm/virt/acpi: Update ACPI DSDT Tbl to include 'Online-Capable' CPUs AML salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 17/24] hw/arm/virt, acpi/ged: Add PowerStateHandler hooks for runtime CPU state changes salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 18/24] target/arm/kvm, tcg: Handle SMCCC hypercall exits in VMM during PSCI_CPU_{ON, OFF} salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 19/24] target/arm/cpu: Add the Accessor hook to fetch ARM CPU arch-id salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 20/24] target/arm/kvm: Write vCPU's state back to KVM on cold-reset salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 21/24] hw/intc/arm-gicv3-kvm: Pause all vCPUs & cache ICC_CTLR_EL1 for userspace PSCI CPU_ON salil.mehta
2025-10-01 1:01 ` [PATCH RFC V6 22/24] monitor, qdev: Introduce 'device_set' to change admin state of existing devices salil.mehta
2025-10-09 8:55 ` [PATCH RFC V6 22/24] monitor,qdev: " Markus Armbruster
2025-10-09 12:51 ` Igor Mammedov
2025-10-09 14:03 ` Daniel P. Berrangé
2025-10-09 14:55 ` Markus Armbruster
2025-10-09 15:19 ` Peter Maydell
2025-10-10 4:59 ` Markus Armbruster
2025-10-01 1:01 ` [PATCH RFC V6 23/24] monitor, qapi: add 'info cpus-powerstate' and QMP query (Admin + Oper states) salil.mehta
2025-10-09 11:53 ` [PATCH RFC V6 23/24] monitor,qapi: " Markus Armbruster
2025-10-01 1:01 ` [PATCH RFC V6 24/24] tcg: Defer TB flush for 'lazy realized' vCPUs on first region alloc salil.mehta
2025-10-01 21:34 ` Richard Henderson
2025-10-02 12:27 ` Salil Mehta via
2025-10-02 15:41 ` Richard Henderson [this message]
2025-10-07 10:14 ` Salil Mehta via
2025-10-06 14:00 ` [PATCH RFC V6 00/24] Support of Virtual CPU Hotplug-like Feature for ARMv8+ Arch Igor Mammedov
2025-10-13 0:34 ` Gavin Shan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7605b216-8aa1-4897-a96e-6ed9953f4e91@linaro.org \
--to=richard.henderson@linaro.org \
--cc=mst@redhat.com \
--cc=qemu-arm@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=salil.mehta@huawei.com \
--cc=salil.mehta@opnsrc.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).