From: "Alex Bennée" <alex.bennee@linaro.org>
To: Igor Mammedov <imammedo@redhat.com>
Cc: qemu-devel@nongnu.org, richard.henderson@linaro.org,
pbonzini@redhat.com, npiggin@gmail.com
Subject: Re: [PATCH] tcg: drop qemu_cpu_is_self() in tlb_flush_by_mmuidx[_async_work]
Date: Fri, 24 Jan 2025 15:15:01 +0000 [thread overview]
Message-ID: <87wmeki6q2.fsf@draig.linaro.org> (raw)
In-Reply-To: <20250124152148.706e88c2@imammedo.users.ipa.redhat.com> (Igor Mammedov's message of "Fri, 24 Jan 2025 15:21:48 +0100")
Igor Mammedov <imammedo@redhat.com> writes:
> On Thu, 23 Jan 2025 12:23:43 +0100
> Igor Mammedov <imammedo@redhat.com> wrote:
>
>> On Thu, 23 Jan 2025 10:52:15 +0000
>> Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>> > Igor Mammedov <imammedo@redhat.com> writes:
>> >
>> > > QEMU will crash with following debug enabled
>> > > # define DEBUG_TLB_GATE 1
>> > > # define DEBUG_TLB_LOG_GATE 1
>> > > due to [1] introduced assert and as it happenstlb_flush_by_mmuidx[_async_work]
>> > > functions are called not only from vcpu thread but also from reset handler
>> > > that is called from main thread at cpu realize time when vcpu is already
>> > > created
>> > > x86_cpu_new -> ... ->
>> > > x86_cpu_realizefn -> cpu_reset -> ... ->
>> > > tcg_cpu_reset_hold
>> > >
>> > > drop assert to fix crash.
>> >
>> > Hmm the assert is there for a good reason because we do not want to be
>> > flushing another CPUs state. However the assert itself:
>> >
>> > g_assert(!(cpu)->created || qemu_cpu_is_self(cpu));
>> >
>> > was trying to account for pre-initialised vCPUs. What has changed?
>> >
>> > cpu_thread_signal_created(cpu) is called just before we start running
>> > the main loop in mttcg_cpu_thread_fn. So any other thread messing with
>> > the CPUs TLB can potentially mess things up.
>>
>> it reproduces on current master, so yes it likely has changed over time.
>> I've just stumbled on it when attempting to get rid of cpu->created usage.
>>
>>
>> > > 1)
>> > > Fixes: f0aff0f124028 ("cputlb: add assert_cpu_is_self checks")
>
> bisection points to (so above fixes was a wrong one):
> 30933c4fb4f3d tcg/cputlb: remove other-cpu capability from TLB flushing
> which has replaced a check with assert:
>
> - if (cpu->created && !qemu_cpu_is_self(cpu)) {
> - async_run_on_cpu(cpu, tlb_flush_by_mmuidx_async_work,
> - RUN_ON_CPU_HOST_INT(idxmap));
> - } else {
> - tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_INT(idxmap));
> - }
> + assert_cpu_is_self(cpu);
>
> should we revert that instead?
>
> perhaps also drop 'cpu->created' check in assert_cpu_is_self as it
> obviously doesn't work.
That's because the asserts only check when built with debug due to
concerns about the hot path. I think in the case of cputlb we should
probably use tcg_debug_assert() which at least gets enabled for
--enable-debug builds without needing to manually patch the define for
DEBUG_TLB.
Converting the custom TLB log implementation to using tracepoints would
also be nice.
>
>
>> > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>> > > ---
>> > > accel/tcg/cputlb.c | 4 ----
>> > > 1 file changed, 4 deletions(-)
>> > >
>> > > diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
>> > > index b26c0e088f..2da803103c 100644
>> > > --- a/accel/tcg/cputlb.c
>> > > +++ b/accel/tcg/cputlb.c
>> > > @@ -381,8 +381,6 @@ static void tlb_flush_by_mmuidx_async_work(CPUState *cpu, run_on_cpu_data data)
>> > > uint16_t all_dirty, work, to_clean;
>> > > int64_t now = get_clock_realtime();
>> > >
>> > > - assert_cpu_is_self(cpu);
>> > > -
>> > > tlb_debug("mmu_idx:0x%04" PRIx16 "\n", asked);
>> > >
>> > > qemu_spin_lock(&cpu->neg.tlb.c.lock);
>> > > @@ -419,8 +417,6 @@ void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
>> > > {
>> > > tlb_debug("mmu_idx: 0x%" PRIx16 "\n", idxmap);
>> > >
>> > > - assert_cpu_is_self(cpu);
>> > > -
>> > > tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_INT(idxmap));
>> > > }
>> >
>>
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
prev parent reply other threads:[~2025-01-24 15:15 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-23 9:45 [PATCH] tcg: drop qemu_cpu_is_self() in tlb_flush_by_mmuidx[_async_work] Igor Mammedov
2025-01-23 10:52 ` Alex Bennée
2025-01-23 11:23 ` Igor Mammedov
2025-01-23 12:09 ` Alex Bennée
2025-01-23 13:38 ` Igor Mammedov
2025-01-27 13:24 ` Philippe Mathieu-Daudé
2025-01-27 13:33 ` Igor Mammedov
2025-01-27 13:36 ` Philippe Mathieu-Daudé
2025-01-24 14:21 ` Igor Mammedov
2025-01-24 15:15 ` Alex Bennée [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87wmeki6q2.fsf@draig.linaro.org \
--to=alex.bennee@linaro.org \
--cc=imammedo@redhat.com \
--cc=npiggin@gmail.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.