From: Paolo Bonzini <pbonzini@redhat.com>
To: Sebastian Tanase <sebastian.tanase@openwide.fr>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, peter.maydell@linaro.org,
jeremy.rosen@openwide.fr, aliguori@amazon.com,
wenchaoqemu@gmail.com, quintela@redhat.com, mst@redhat.com,
stefanha@redhat.com, armbru@redhat.com, lcapitulino@redhat.com,
michael@walle.cc, camille.begue@openwide.fr, alex@alex.org.uk,
crobinso@redhat.com, pierre.lemagourou@openwide.fr,
afaerber@suse.de, rth@twiddle.net
Subject: Re: [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
Date: Fri, 25 Jul 2014 12:13:44 +0200 [thread overview]
Message-ID: <53D22DD8.10601@redhat.com> (raw)
In-Reply-To: <1406282193-9664-5-git-send-email-sebastian.tanase@openwide.fr>
Il 25/07/2014 11:56, Sebastian Tanase ha scritto:
> The goal is to sleep qemu whenever the guest clock
> is in advance compared to the host clock (we use
> the monotonic clocks). The amount of time to sleep
> is calculated in the execution loop in cpu_exec.
>
> At first, we tried to approximate at each for loop the real time elapsed
> while searching for a TB (generating or retrieving from cache) and
> executing it. We would then approximate the virtual time corresponding
> to the number of virtual instructions executed. The difference between
> these 2 values would allow us to know if the guest is in advance or delayed.
> However, the function used for measuring the real time
> (qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive.
> We had an added overhead of 13% of the total run time.
>
> Therefore, we modified the algorithm and only take into account the
> difference between the 2 clocks at the begining of the cpu_exec function.
> During the for loop we try to reduce the advance of the guest only by
> computing the virtual time elapsed and sleeping if necessary. The overhead
> is thus reduced to 3%. Even though this method still has a noticeable
> overhead, it no longer is a bottleneck in trying to achieve a better
> guest frequency for which the guest clock is faster than the host one.
>
> As for the the alignement of the 2 clocks, with the first algorithm
> the guest clock was oscillating between -1 and 1ms compared to the host clock.
> Using the second algorithm we notice that the guest is 5ms behind the host, which
> is still acceptable for our use case.
>
> The tests where conducted using fio and stress. The host machine in an i5 CPU at
> 3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm versatile-pb
> built with buildroot.
>
> Currently, on our test machine, the lowest icount we can achieve that is suitable for
> aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) are
> slower than the cpu tests (using stress).
>
> Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
> Tested-by: Camille Bégué <camille.begue@openwide.fr>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> cpu-exec.c | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> cpus.c | 17 ++++++++++
> include/qemu/timer.h | 1 +
> 3 files changed, 109 insertions(+)
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 38e5f02..1a725b6 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -22,6 +22,84 @@
> #include "tcg.h"
> #include "qemu/atomic.h"
> #include "sysemu/qtest.h"
> +#include "qemu/timer.h"
> +
> +/* -icount align implementation. */
> +
> +typedef struct SyncClocks {
> + int64_t diff_clk;
> + int64_t original_instr_counter;
> +} SyncClocks;
> +
> +#if !defined(CONFIG_USER_ONLY)
> +/* Allow the guest to have a max 3ms advance.
> + * The difference between the 2 clocks could therefore
> + * oscillate around 0.
> + */
> +#define VM_CLOCK_ADVANCE 3000000
> +
> +static int64_t delay_host(int64_t diff_clk)
> +{
> + if (diff_clk > VM_CLOCK_ADVANCE) {
> +#ifndef _WIN32
> + struct timespec sleep_delay, rem_delay;
> + sleep_delay.tv_sec = diff_clk / 1000000000LL;
> + sleep_delay.tv_nsec = diff_clk % 1000000000LL;
> + if (nanosleep(&sleep_delay, &rem_delay) < 0) {
> + diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 1000000000LL;
> + diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec;
> + } else {
> + diff_clk = 0;
> + }
> +#else
> + Sleep(diff_clk / SCALE_MS);
> + diff_clk = 0;
> +#endif
> + }
> + return diff_clk;
> +}
> +
> +static int64_t instr_to_vtime(int64_t instr_counter, const CPUState *cpu)
> +{
> + int64_t instr_exec_time;
> + instr_exec_time = instr_counter -
> + (cpu->icount_extra +
> + cpu->icount_decr.u16.low);
> + instr_exec_time = instr_exec_time << icount_time_shift;
> +
> + return instr_exec_time;
> +}
> +
> +static void align_clocks(SyncClocks *sc, const CPUState *cpu)
> +{
> + if (!icount_align_option) {
> + return;
> + }
> + sc->diff_clk += instr_to_vtime(sc->original_instr_counter, cpu);
> + sc->original_instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low;
> + sc->diff_clk = delay_host(sc->diff_clk);
> +}
Just two comments:
1) perhaps s/original/last/ in original_instr_counter?
2) I think I prefer this to be written like:
instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low;
instr_exec_time = sc->original_instr_counter - instr_counter;
sc->original_instr_counter = instr_counter
sc->diff_clk += instr_exec_time << icount_time_shift;
sc->diff_clk = delay_host(sc->diff_clk);
If you agree, I can do it when applying the patches.
Thanks for your persistence, I'm very happy with this version!
As a follow up, do you think it's possible to modify the places where
you run align_clocks, so that you sleep with the iothread mutex *not* taken?
Paolo
next prev parent reply other threads:[~2014-07-25 10:14 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-25 9:56 [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Sebastian Tanase
2014-07-25 9:56 ` [Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount Sebastian Tanase
2014-08-08 6:51 ` Markus Armbruster
2014-09-15 7:04 ` TeLeMan
2014-09-15 11:41 ` Paolo Bonzini
2014-07-25 9:56 ` [Qemu-devel] [PATCH V5 2/6] icount: Add align option to icount Sebastian Tanase
2014-07-25 9:56 ` [Qemu-devel] [PATCH V5 3/6] icount: Make icount_time_shift available everywhere Sebastian Tanase
2014-07-25 9:56 ` [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm Sebastian Tanase
2014-07-25 10:13 ` Paolo Bonzini [this message]
2014-07-25 14:28 ` Sebastian Tanase
2014-07-25 9:56 ` [Qemu-devel] [PATCH V5 5/6] cpu_exec: Print to console if the guest is late Sebastian Tanase
2014-07-25 9:56 ` [Qemu-devel] [PATCH V5 6/6] monitor: Add drift info to 'info jit' Sebastian Tanase
2014-07-25 10:00 ` [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Andreas Färber
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53D22DD8.10601@redhat.com \
--to=pbonzini@redhat.com \
--cc=afaerber@suse.de \
--cc=alex@alex.org.uk \
--cc=aliguori@amazon.com \
--cc=armbru@redhat.com \
--cc=camille.begue@openwide.fr \
--cc=crobinso@redhat.com \
--cc=jeremy.rosen@openwide.fr \
--cc=kwolf@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=michael@walle.cc \
--cc=mst@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=pierre.lemagourou@openwide.fr \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=rth@twiddle.net \
--cc=sebastian.tanase@openwide.fr \
--cc=stefanha@redhat.com \
--cc=wenchaoqemu@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.