From: "Alex Bennée" <alex.bennee@linaro.org>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: QEMU Developers <qemu-devel@nongnu.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Richard Henderson <rth@twiddle.net>,
"Emilio G. Cota" <cota@braap.org>
Subject: Re: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()
Date: Mon, 01 Oct 2018 19:12:38 +0100
Message-ID: <87lg7hlend.fsf@linaro.org>
In-Reply-To: <CAFEAcA-T7-PcKLwPr0VO0wrwW3x+w2WbdEjTjN2YfWOmZXyfUg@mail.gmail.com>
Peter Maydell <peter.maydell@linaro.org> writes:
> I've been investigating a race condition where sometimes when my
> guest writes to a device register which triggers a
> qemu_system_reset_request(), it doesn't actually cause a clean reset,
> but instead the guest CPU continues to execute instructions.
> I managed to repro it under 'rr', which let me walk through enough
> of what was going on to determine the following:
>
> When a guest CPU thread calls qemu_system_reset_request(), this
> results in a call to qemu_cpu_stop(current_cpu, true), to
> make the CPU come back out to the main loop. We also set the
> reset_requested flag, to get the IO thread to actually do the
> reset.
>
> The main loop thread runs main_loop_should_exit(). If there is a
> pending reset, it calls pause_all_vcpus(), with the intention
> that this quiesces all the guest CPUs before it starts messing
> with reset actions.
>
> pause_all_vcpus() just waits for every cpu to have cpu->stopped set.
> However, if the running cpu has just called qemu_cpu_stop() on
> itself then it will have set cpu->stopped true but not actually
> made it out to the main loop yet. (In the case I'm looking at,
> what happens is that as soon as the CPU thread unlocks the
> iothread mutex in io_writex() after the device write, the
> main thread runs and does all the reset operations.)
>
> The reset code in the iothread then proceeds to start calling
> various reset functions while the CPU thread is still inside
> the exec loop, running generated code and so on. This doesn't
> seem like what ought to happen. In particular it includes
> calling cpu_common_reset(), which clears all kinds of flags
> relevant to the still-executing CPU...
I would have thought the reset code should be scheduled via safe async
work to run in the vCPU context. Why should the main loop get involved
at all here?
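
A minimal sketch of that idea, as a toy single-threaded model rather than
QEMU code. ToyCPU, toy_queue_safe_work() and toy_cpu_loop_iteration() are
hypothetical names invented for illustration; they only mimic the general
shape of queueing work with async_safe_run_on_cpu() so that it runs once
the vCPU has left its exec loop. The real mechanism also waits for every
other vCPU to be out of its exec loop, which this sketch omits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are QEMU APIs. */
typedef struct ToyCPU ToyCPU;
typedef void (*toy_work_fn)(ToyCPU *cpu);

struct ToyCPU {
    bool in_exec_loop;    /* true while "running generated code" */
    toy_work_fn pending;  /* single-slot work queue, for brevity */
    int resets_done;      /* number of resets that have run */
    int unsafe_resets;    /* resets that ran while in_exec_loop was true */
};

/* The reset itself: records whether it ran at an unsafe moment. */
static void toy_cpu_reset(ToyCPU *cpu)
{
    if (cpu->in_exec_loop) {
        cpu->unsafe_resets++;
    }
    cpu->resets_done++;
}

/* Queue work for the vCPU instead of running it from another context. */
static void toy_queue_safe_work(ToyCPU *cpu, toy_work_fn fn)
{
    cpu->pending = fn;
}

/* One iteration of the vCPU thread's outer loop. */
static void toy_cpu_loop_iteration(ToyCPU *cpu)
{
    cpu->in_exec_loop = true;
    /* ... guest code would execute here; a device write may queue work ... */
    cpu->in_exec_loop = false;

    /* Only now, outside the exec loop, run the queued work. */
    if (cpu->pending) {
        toy_work_fn fn = cpu->pending;
        cpu->pending = NULL;
        fn(cpu);
    }
}
```

In this model a reset queued while in_exec_loop is true does not run
until toy_cpu_loop_iteration() has cleared the flag, so toy_cpu_reset()
never observes a CPU that is still executing.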
>
> Any suggestions for how we should fix this?
>
> thanks
> -- PMM
--
Alex Bennée