From: Paolo Bonzini <pbonzini@redhat.com>
To: "Alex Bennée" <alex.bennee@linaro.org>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
Date: Mon, 13 Mar 2017 18:16:43 +0100
Message-ID: <834de521-e5cc-8857-afeb-8734037f95f3@redhat.com>
In-Reply-To: <871su1qfsu.fsf@linaro.org>
On 13/03/2017 17:53, Alex Bennée wrote:
>
> Paolo Bonzini <pbonzini@redhat.com> writes:
>
>> icount has become much slower after tcg_cpu_exec has stopped
>> using the BQL. There is also a latent bug that is masked by
>> the slowness.
>>
>> The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL
>> timer now has to wake up the I/O thread and wait for it. The rendez-vous
>> is mediated by the BQL QemuMutex:
>>
>> - handle_icount_deadline wakes up the I/O thread with BQL taken
>> - the I/O thread wakes up and waits on the BQL
>> - the VCPU thread releases the BQL a little later
>> - the I/O thread raises an interrupt, which calls qemu_cpu_kick
>> - the VCPU thread notices the interrupt, tries to take the BQL to
>> process it, and has to wait on the mutex
>>
>> All this back and forth is extremely expensive, causing a 6 to 8-fold
>> slowdown when icount is turned on.
>>
>> One may think that the issue is that the VCPU thread is too dependent
>> on the BQL, but then the latent bug comes in. I first tried removing
>> the BQL completely from the x86 cpu_exec, only to see everything break.
>> The only way to fix it (and make everything slow again) was to add a dummy
>> BQL lock/unlock pair.
>>
>> This is because in -icount mode you really have to process the events
>> before the CPU restarts executing the next instruction. Therefore, this
>> series moves the processing of QEMU_CLOCK_VIRTUAL timers straight into
>> the vCPU thread when running in icount mode.
>>
>> The required changes include:
>>
>> - make the timer notification callback wake up TCG's single vCPU thread
>> when run from another thread. By using async_run_on_cpu, the callback
>> can override all_cpu_threads_idle() when the CPU is halted.
>>
>> - move handle_icount_deadline after qemu_tcg_wait_io_event, so that
>> the timer notification callback is invoked after the dummy work item
>> wakes up the vCPU thread
>>
>> - make handle_icount_deadline run the timers instead of just waking the
>> I/O thread.
>>
>> - stop processing the timers in the main loop
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>> cpus.c | 26 +++++++++++++++++++++++---
>> include/qemu/timer.h | 24 ++++++++++++++++++++++++
>> util/qemu-timer.c | 4 +++-
>> 3 files changed, 50 insertions(+), 4 deletions(-)
>>
>> diff --git a/cpus.c b/cpus.c
>> index 747addd..209c196 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -804,9 +804,23 @@ static void qemu_cpu_kick_rr_cpu(void)
>> } while (cpu != atomic_mb_read(&tcg_current_rr_cpu));
>> }
>>
>> +static void do_nothing(CPUState *cpu, run_on_cpu_data unused)
>> +{
>> +}
>> +
>> void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
>> {
>> - qemu_notify_event();
>> + if (!use_icount || type != QEMU_CLOCK_VIRTUAL) {
>> + qemu_notify_event();
>> + return;
>> + }
>> +
>> + if (!qemu_in_vcpu_thread() && first_cpu) {
>> + /* run_on_cpu will also kick the CPU out of halt so that
>> + * handle_icount_deadline runs
>> + */
>> + async_run_on_cpu(first_cpu, do_nothing, RUN_ON_CPU_NULL);
>> + }
>
> This doesn't read quite right - otherwise you'd get the same effect by
> calling qemu_cpu_kick(), or even by modifying and calling
> qemu_cpu_kick_rr_cpu()?
>
> The only real effect of having something in the work queue is that you
> run the outside of the loop before returning into tcg_cpu_exec.
Yes, and "the outside of the loop" here means handle_icount_deadline.
But qemu_cpu_kick_rr_cpu won't signal condition variables, and
qemu_cpu_kick does but it won't make cpu_thread_is_idle return false.
Therefore, qemu_tcg_wait_io_event keeps looping.
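
To spell it out, the relevant bits look roughly like this (paraphrased
from my reading of cpus.c, details trimmed, so take it as a sketch
rather than the literal code):

    /* A queued work item is what makes the difference, because the
     * idle check looks at the work queue.
     */
    static bool cpu_thread_is_idle(CPUState *cpu)
    {
        if (cpu->stop || cpu->queued_work_first) {
            return false;  /* async_run_on_cpu() queued work -> not idle */
        }
        if (cpu_is_stopped(cpu)) {
            return true;
        }
        if (!cpu->halted || cpu_has_work(cpu)) {
            return false;  /* a plain qemu_cpu_kick() leaves a halted CPU here */
        }
        return true;
    }

    static void qemu_tcg_wait_io_event(CPUState *cpu)
    {
        while (all_cpu_threads_idle()) {
            /* qemu_cpu_kick() signals halt_cond, but with the CPU still
             * halted and nothing queued the loop goes straight back to
             * sleep; the do_nothing work item is what breaks out of it.
             */
            qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
        }
        /* ...and then handle_icount_deadline() runs in the caller. */
    }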
>> }
>>
>> static void kick_tcg_thread(void *opaque)
>> @@ -1220,12 +1234,15 @@ static int64_t tcg_get_icount_limit(void)
>>
>> static void handle_icount_deadline(void)
>> {
>> + assert(qemu_in_vcpu_thread());
>> if (use_icount) {
>> int64_t deadline =
>> qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
>>
>> if (deadline == 0) {
>> + /* Wake up other AioContexts. */
>> qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
>> + qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
>> }
>> }
>> }
>> @@ -1338,6 +1355,11 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
>> /* Account partial waits to QEMU_CLOCK_VIRTUAL. */
>> qemu_account_warp_timer();
>>
>> + /* Run the timers here. This is much more efficient than
>> + * waking up the I/O thread and waiting for completion.
>> + */
>> + handle_icount_deadline();
>> +
>> if (!cpu) {
>> cpu = first_cpu;
>> }
>> @@ -1379,8 +1401,6 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
>> atomic_mb_set(&cpu->exit_request, 0);
>> }
>>
>> - handle_icount_deadline();
>> -
>
> I guess we could just as easily move the handling to after
> qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus))?
>
> I noticed we still call handle_icount_deadline in the multi-thread case
> which is probably superfluous unless/until we figure out a way for it to
> work with MTTCG.

Should I remove the call, or add an assert(!use_icount)?
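
Something like this in the multi-threaded TCG loop, in place of the
handle_icount_deadline() call (just a sketch of the second option,
exact placement to be decided):

    /* icount is not supported together with MTTCG, so instead of
     * calling handle_icount_deadline() here, make that explicit.
     */
    assert(!use_icount);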
>> qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus));
>> deal_with_unplugged_cpus();
>> }
>> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
>> index 1441b42..e1742f2 100644
>> --- a/include/qemu/timer.h
>> +++ b/include/qemu/timer.h
>> @@ -533,6 +533,12 @@ static inline QEMUTimer *timer_new_tl(QEMUTimerList *timer_list,
>> * Create a new timer and associate it with the default
>> * timer list for the clock type @type.
>> *
>> + * The default timer list has one special feature: in icount mode,
>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread. This is
>> + * not true of other timer lists, which are typically associated
>> + * with an AioContext---each of them runs its timer callbacks in its own
>> + * AioContext thread.
>> + *
>> * Returns: a pointer to the timer
>> */
>> static inline QEMUTimer *timer_new(QEMUClockType type, int scale,
>> @@ -550,6 +556,12 @@ static inline QEMUTimer *timer_new(QEMUClockType type, int scale,
>> * Create a new timer with nanosecond scale on the default timer list
>> * associated with the clock.
>> *
>> + * The default timer list has one special feature: in icount mode,
>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread. This is
>> + * not true of other timer lists, which are typically associated
>> + * with an AioContext---each of them runs its timer callbacks in its own
>> + * AioContext thread.
>> + *
>> * Returns: a pointer to the newly created timer
>> */
>> static inline QEMUTimer *timer_new_ns(QEMUClockType type, QEMUTimerCB *cb,
>> @@ -564,6 +576,12 @@ static inline QEMUTimer *timer_new_ns(QEMUClockType type, QEMUTimerCB *cb,
>> * @cb: the callback to call when the timer expires
>> * @opaque: the opaque pointer to pass to the callback
>> *
>> + * The default timer list has one special feature: in icount mode,
>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread. This is
>> + * not true of other timer lists, which are typically associated
>> + * with an AioContext---each of them runs its timer callbacks in its own
>> + * AioContext thread.
>> + *
>> * Create a new timer with microsecond scale on the default timer list
>> * associated with the clock.
>> *
>> @@ -581,6 +599,12 @@ static inline QEMUTimer *timer_new_us(QEMUClockType type, QEMUTimerCB *cb,
>> * @cb: the callback to call when the timer expires
>> * @opaque: the opaque pointer to pass to the callback
>> *
>> + * The default timer list has one special feature: in icount mode,
>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread. This is
>> + * not true of other timer lists, which are typically associated
>> + * with an AioContext---each of them runs its timer callbacks in its own
>> + * AioContext thread.
>> + *
>> * Create a new timer with millisecond scale on the default timer list
>> * associated with the clock.
>> *
>> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
>> index dc3181e..82d5650 100644
>> --- a/util/qemu-timer.c
>> +++ b/util/qemu-timer.c
>> @@ -658,7 +658,9 @@ bool qemu_clock_run_all_timers(void)
>> QEMUClockType type;
>>
>> for (type = 0; type < QEMU_CLOCK_MAX; type++) {
>> - progress |= qemu_clock_run_timers(type);
>> + if (qemu_clock_use_for_deadline(type)) {
>> + progress |= qemu_clock_run_timers(type);
>> + }
>
> minor nit: it's not really qemu_clock_run_all_timers() now, is it? We run
> all but the virtual timers (if icount is enabled).

Well yeah, it's all those that pass qemu_clock_use_for_deadline.
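
If I remember the helper correctly it boils down to something like this
(sketch from memory of util/qemu-timer.c, not copied from the tree):

    bool qemu_clock_use_for_deadline(QEMUClockType type)
    {
        /* In icount mode the virtual clock is excluded from deadline
         * computation (and, with this patch, from
         * qemu_clock_run_all_timers), because the vCPU thread handles it.
         */
        return !(use_icount && (type == QEMU_CLOCK_VIRTUAL));
    }
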
Paolo
>> }
>>
>> return progress;
>
>
> --
> Alex Bennée
>