From: "Alex Bennée" <alex.bennee@linaro.org>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
"Richard Henderson" <rth@twiddle.net>,
"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
qemu-devel <qemu-devel@nongnu.org>,
maciej.borzecki@rndity.com
Subject: Re: [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime
Date: Wed, 05 Jul 2017 22:46:52 +0100 [thread overview]
Message-ID: <87r2xua75f.fsf@linaro.org> (raw)
In-Reply-To: <CAFEAcA_DcOMzo=X1Y9O+5VrfeA32i012vpQmqB++aEXH6J1qeg@mail.gmail.com>
Peter Maydell <peter.maydell@linaro.org> writes:
> On 5 July 2017 at 20:30, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>
>>> On 5 July 2017 at 17:01, Alex Bennée <alex.bennee@linaro.org> wrote:
>>>> An interesting bug was reported on #qemu today. It was bisected to
>>>> 8d04fb55 (drop global lock for TCG) and only occurred when QEMU was run
>>>> with taskset -c 0. Originally the fingers where pointed at mttcg but it
>>>> occurs in both single and multi-threaded modes.
>>>>
>>>> I think the problem is qemu_system_reset_request() is certainly racy
>>>> when resetting a running CPU. AFAICT:
>>>>
>>>> - Guest resets board, writing to some hw address (e.g.
>>>> arm_sysctl_write)
>>>> - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
>>>> - We exit iowrite and drop the BQL
>>>> - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
>>>> - we start writing new values to CPU env while still in TCG code
>>>> - CHAOS!
>>>>
>>>> The general solution for this is to ensure these sort of tasks are done
>>>> with safe work in the CPUs context when we know nothing else is running.
>>>> It seems this is probably best done by modifying
>>>> qemu_system_reset_request to queue work up on current_cpu and execute it
>>>> as safe work - I don't think the vl.c thread should ever be messing
>>>> about with calling cpu_reset directly.
>>>
>>> My first thought is that qemu_system_reset() should absolutely
>>> stop every CPU (or other runnable thing like a DMA agent) in the
>>> system.
>>
>> Are all these reset calls system wide though?
>
> It's called 'system_reset' because it resets the entire system...
>
>> After all with PCSI you
>> can bring individual cores up and down. I appreciate the vexpress stuff
>> pre-dates those well defined semantics though.
>
> It's individual core reset that's a more ad-hoc afterthought,
> really.
>
>> vm_stop certainly tries to deal with things gracefully as well as send
>> qapi events, drain IO queues and the rest of it. My only concern is it
>> handles two cases - external vm_stops and those from the current CPU.
>>
>> I think it may be cleaner for CPU originated halts to use the
>> async_safe_run_on_cpu() mechanism.
>
> System reset already has an async component to it -- you call
> qemu_system_reset_request(), which just says "schedule a system
> reset as soon as convenient". qemu_system_reset() is the thing
> that runs later and actually does the job (from the io thread,
> not the CPU thread).
>
> Looking more closely at the vl.c code, it looks like it
> calls pause_all_vcpus() before calling qemu_system_reset():
> shouldn't that be pausing all the TCG CPUs?
Looking deeper it seems cpu_stop_current() is doing the wrong thing.
Because it sets cpu->stopped the pause_all_vcpus() in the vl.c thread
doesn't wait.
I suspect it should really be doing a cpu_loop_exit. I'll see if I can
work up a patch.
>
> thanks
> -- PMM
--
Alex Bennée
next prev parent reply other threads:[~2017-07-05 21:46 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-05 16:01 [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime Alex Bennée
2017-07-05 16:14 ` Peter Maydell
2017-07-05 16:21 ` Paolo Bonzini
2017-07-05 19:31 ` Alex Bennée
2017-07-06 8:37 ` Alex Bennée
2017-07-05 19:30 ` Alex Bennée
2017-07-05 19:42 ` Peter Maydell
2017-07-05 20:10 ` Alex Bennée
2017-07-05 21:46 ` Alex Bennée [this message]
[not found] <mailman.82700.1499272965.22738.qemu-devel@nongnu.org>
2017-07-05 16:54 ` G 3
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87r2xua75f.fsf@linaro.org \
--to=alex.bennee@linaro.org \
--cc=f4bug@amsat.org \
--cc=maciej.borzecki@rndity.com \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=rth@twiddle.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.