qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks
@ 2014-07-25  9:56 Sebastian Tanase
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount Sebastian Tanase
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Sebastian Tanase @ 2014-07-25  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, peter.maydell, Sebastian Tanase, jeremy.rosen, alex,
	wenchaoqemu, quintela, mst, stefanha, armbru, lcapitulino,
	michael, camille.begue, aliguori, crobinso, pbonzini,
	pierre.lemagourou, afaerber, rth

The icount option already implemented in QEMU allows the guest to run at a theoretical
frequency of 1/(2^N) GHz (N is the icount parameter). The goal of this patch is to have a
real guest frequency close to the one imposed by using the icount option.

The main idea behind the algorithm is that we compare the virtual monotonic clock and the
host monotonic clock. For big icounts (on our test machine, an i5 CPU @ 3.10GHz, icounts
starting at 6) the guest clock will be ahead of the host clock. In this case, we try to
sleep QEMU for the difference between the 2 clocks. Therefore, the guest would have
executed for a period almost equally to the one imposed by icount. We should point out
that the algorithm works only for those icounts that allow the guest clock to be in front
of the host clock.

The first patch adds support for QemuOpts for the 'icount' parameter. It also adds a
suboption called 'shift' that will hold the value for 'icount'. Therefore we now have
-icount shift=N|auto or -icount N|auto.

The second patch adds the 'align' suboption for icount.

The third patch exports 'icount_time_shift' so that it can be used in places other than
cpus.c; we need it in cpu-exec.c for calculating for how long we want QEMU to sleep.

The forth patch implements the algorithm used for calculating the delay we want to sleep.
It uses the number of instructions executed by the virtual cpu and also the icount_time_shift.

The fifth patch prints to the console whenever the guest clock runs behind the host
clock. The fastest printing speed is every 2 seconds, and we only print if the align option
is enabled. We also have a limit to 100 printed messages.

The sixth patch adds information about the difference between the host and guest clocks
(taking into account the offset) in the 'info jit' command. We also print the maximum
delay and advance of the guest clock compared to the host clock.

v4 -> v5

* Use existing offset value (timers_state.cpu_clock_offset) for computation -> patch 4
* Use cpu_get_clock and cpu_get_icount for showing drift information -> patch 6

Important note: This patch series relies on the bug fix discussed here
http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg03208.html

Sebastian Tanase (6):
  icount: Add QemuOpts for icount
  icount: Add align option to icount
  icount: Make icount_time_shift available everywhere
  cpu_exec: Add sleeping algorithm
  cpu_exec: Print to console if the guest is late
  monitor: Add drift info to 'info jit'

 cpu-exec.c            | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++
 cpus.c                |  59 +++++++++++++++++++++--
 include/qemu-common.h |   9 +++-
 include/qemu/timer.h  |   1 +
 monitor.c             |   1 +
 qemu-options.hx       |  17 +++++--
 qtest.c               |  13 +++++-
 vl.c                  |  39 +++++++++++++---
 8 files changed, 248 insertions(+), 18 deletions(-)

-- 
2.0.0.rc2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount
  2014-07-25  9:56 [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Sebastian Tanase
@ 2014-07-25  9:56 ` Sebastian Tanase
  2014-08-08  6:51   ` Markus Armbruster
  2014-09-15  7:04   ` TeLeMan
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 2/6] icount: Add align option to icount Sebastian Tanase
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 13+ messages in thread
From: Sebastian Tanase @ 2014-07-25  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, peter.maydell, Sebastian Tanase, jeremy.rosen, alex,
	wenchaoqemu, quintela, mst, stefanha, armbru, lcapitulino,
	michael, camille.begue, aliguori, crobinso, pbonzini,
	pierre.lemagourou, afaerber, rth

Make icount parameter use QemuOpts style options in order
to easily add other suboptions.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpus.c                | 10 +++++++++-
 include/qemu-common.h |  3 ++-
 qemu-options.hx       |  4 ++--
 qtest.c               | 13 +++++++++++--
 vl.c                  | 35 ++++++++++++++++++++++++++++-------
 5 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/cpus.c b/cpus.c
index 5e7f2cf..dcca96a 100644
--- a/cpus.c
+++ b/cpus.c
@@ -440,13 +440,21 @@ static const VMStateDescription vmstate_timers = {
     }
 };
 
-void configure_icount(const char *option)
+void configure_icount(QemuOpts *opts, Error **errp)
 {
+    const char *option;
+
     seqlock_init(&timers_state.vm_clock_seqlock, NULL);
     vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
+    option = qemu_opt_get(opts, "shift");
     if (!option) {
         return;
     }
+    /* When using -icount shift, the shift option will be
+       misinterpreted as a boolean */
+    if (strcmp(option, "on") == 0 || strcmp(option, "off") == 0) {
+        error_setg(errp, "The shift option must be a number or auto");
+    }
 
     icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
                                           icount_warp_rt, NULL);
diff --git a/include/qemu-common.h b/include/qemu-common.h
index ae76197..cc346ec 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -41,6 +41,7 @@
 #include <assert.h>
 #include <signal.h>
 #include "glib-compat.h"
+#include "qemu/option.h"
 
 #ifdef _WIN32
 #include "sysemu/os-win32.h"
@@ -105,7 +106,7 @@ static inline char *realpath(const char *path, char *resolved_path)
 #endif
 
 /* icount */
-void configure_icount(const char *option);
+void configure_icount(QemuOpts *opts, Error **errp);
 extern int use_icount;
 
 #include "qemu/osdep.h"
diff --git a/qemu-options.hx b/qemu-options.hx
index 9e54686..143def4 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3011,11 +3011,11 @@ re-inject them.
 ETEXI
 
 DEF("icount", HAS_ARG, QEMU_OPTION_icount, \
-    "-icount [N|auto]\n" \
+    "-icount [shift=N|auto]\n" \
     "                enable virtual instruction counter with 2^N clock ticks per\n" \
     "                instruction\n", QEMU_ARCH_ALL)
 STEXI
-@item -icount [@var{N}|auto]
+@item -icount [shift=@var{N}|auto]
 @findex -icount
 Enable virtual instruction counter.  The virtual cpu will execute one
 instruction every 2^@var{N} ns of virtual time.  If @code{auto} is specified
diff --git a/qtest.c b/qtest.c
index 04a6dc1..ef0d991 100644
--- a/qtest.c
+++ b/qtest.c
@@ -19,6 +19,9 @@
 #include "hw/irq.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/cpus.h"
+#include "qemu/config-file.h"
+#include "qemu/option.h"
+#include "qemu/error-report.h"
 
 #define MAX_IRQ 256
 
@@ -509,10 +512,16 @@ static void qtest_event(void *opaque, int event)
     }
 }
 
-int qtest_init_accel(MachineClass *mc)
+static void configure_qtest_icount(const char *options)
 {
-    configure_icount("0");
+    QemuOpts *opts  = qemu_opts_parse(qemu_find_opts("icount"), options, 1);
+    configure_icount(opts, &error_abort);
+    qemu_opts_del(opts);
+}
 
+int qtest_init_accel(MachineClass *mc)
+{
+    configure_qtest_icount("0");
     return 0;
 }
 
diff --git a/vl.c b/vl.c
index 41ddcd2..103027f 100644
--- a/vl.c
+++ b/vl.c
@@ -537,6 +537,20 @@ static QemuOptsList qemu_mem_opts = {
     },
 };
 
+static QemuOptsList qemu_icount_opts = {
+    .name = "icount",
+    .implied_opt_name = "shift",
+    .merge_lists = true,
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_icount_opts.head),
+    .desc = {
+        {
+            .name = "shift",
+            .type = QEMU_OPT_STRING,
+        },
+        { /* end of list */ }
+    },
+};
+
 /**
  * Get machine options
  *
@@ -2896,13 +2910,12 @@ int main(int argc, char **argv, char **envp)
 {
     int i;
     int snapshot, linux_boot;
-    const char *icount_option = NULL;
     const char *initrd_filename;
     const char *kernel_filename, *kernel_cmdline;
     const char *boot_order;
     DisplayState *ds;
     int cyls, heads, secs, translation;
-    QemuOpts *hda_opts = NULL, *opts, *machine_opts;
+    QemuOpts *hda_opts = NULL, *opts, *machine_opts, *icount_opts = NULL;
     QemuOptsList *olist;
     int optind;
     const char *optarg;
@@ -2967,6 +2980,7 @@ int main(int argc, char **argv, char **envp)
     qemu_add_opts(&qemu_msg_opts);
     qemu_add_opts(&qemu_name_opts);
     qemu_add_opts(&qemu_numa_opts);
+    qemu_add_opts(&qemu_icount_opts);
 
     runstate_init();
 
@@ -3817,7 +3831,11 @@ int main(int argc, char **argv, char **envp)
                 }
                 break;
             case QEMU_OPTION_icount:
-                icount_option = optarg;
+                icount_opts = qemu_opts_parse(qemu_find_opts("icount"),
+                                              optarg, 1);
+                if (!icount_opts) {
+                    exit(1);
+                }
                 break;
             case QEMU_OPTION_incoming:
                 incoming = optarg;
@@ -4294,11 +4312,14 @@ int main(int argc, char **argv, char **envp)
     qemu_spice_init();
 #endif
 
-    if (icount_option && (kvm_enabled() || xen_enabled())) {
-        fprintf(stderr, "-icount is not allowed with kvm or xen\n");
-        exit(1);
+    if (icount_opts) {
+        if (kvm_enabled() || xen_enabled()) {
+            fprintf(stderr, "-icount is not allowed with kvm or xen\n");
+            exit(1);
+        }
+        configure_icount(icount_opts, &error_abort);
+        qemu_opts_del(icount_opts);
     }
-    configure_icount(icount_option);
 
     /* clean up network at qemu process termination */
     atexit(&net_cleanup);
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH V5 2/6] icount: Add align option to icount
  2014-07-25  9:56 [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Sebastian Tanase
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount Sebastian Tanase
@ 2014-07-25  9:56 ` Sebastian Tanase
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 3/6] icount: Make icount_time_shift available everywhere Sebastian Tanase
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Sebastian Tanase @ 2014-07-25  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, peter.maydell, Sebastian Tanase, jeremy.rosen, alex,
	wenchaoqemu, quintela, mst, stefanha, armbru, lcapitulino,
	michael, camille.begue, aliguori, crobinso, pbonzini,
	pierre.lemagourou, afaerber, rth

The align option is used for activating the align algorithm
in order to synchronise the host clock and the guest clock.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
---
 cpus.c                | 19 ++++++++++++-------
 include/qemu-common.h |  1 +
 qemu-options.hx       | 15 +++++++++++++--
 vl.c                  |  4 ++++
 4 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/cpus.c b/cpus.c
index dcca96a..34cc4c8 100644
--- a/cpus.c
+++ b/cpus.c
@@ -443,25 +443,30 @@ static const VMStateDescription vmstate_timers = {
 void configure_icount(QemuOpts *opts, Error **errp)
 {
     const char *option;
+    char *rem_str = NULL;
 
     seqlock_init(&timers_state.vm_clock_seqlock, NULL);
     vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
     option = qemu_opt_get(opts, "shift");
     if (!option) {
+        if (qemu_opt_get(opts, "align") != NULL) {
+            error_setg(errp, "Please specify shift option when using align");
+        }
         return;
     }
-    /* When using -icount shift, the shift option will be
-       misinterpreted as a boolean */
-    if (strcmp(option, "on") == 0 || strcmp(option, "off") == 0) {
-        error_setg(errp, "The shift option must be a number or auto");
-    }
-
+    icount_align_option = qemu_opt_get_bool(opts, "align", false);
     icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
                                           icount_warp_rt, NULL);
     if (strcmp(option, "auto") != 0) {
-        icount_time_shift = strtol(option, NULL, 0);
+        errno = 0;
+        icount_time_shift = strtol(option, &rem_str, 0);
+        if (errno != 0 || *rem_str != '\0' || !strlen(option)) {
+            error_setg(errp, "icount: Invalid shift value");
+        }
         use_icount = 1;
         return;
+    } else if (icount_align_option) {
+        error_setg(errp, "shift=auto and align=on are incompatible");
     }
 
     use_icount = 2;
diff --git a/include/qemu-common.h b/include/qemu-common.h
index cc346ec..860bb15 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -108,6 +108,7 @@ static inline char *realpath(const char *path, char *resolved_path)
 /* icount */
 void configure_icount(QemuOpts *opts, Error **errp);
 extern int use_icount;
+extern int icount_align_option;
 
 #include "qemu/osdep.h"
 #include "qemu/bswap.h"
diff --git a/qemu-options.hx b/qemu-options.hx
index 143def4..379d932 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3011,9 +3011,9 @@ re-inject them.
 ETEXI
 
 DEF("icount", HAS_ARG, QEMU_OPTION_icount, \
-    "-icount [shift=N|auto]\n" \
+    "-icount [shift=N|auto][,align=on|off]\n" \
     "                enable virtual instruction counter with 2^N clock ticks per\n" \
-    "                instruction\n", QEMU_ARCH_ALL)
+    "                instruction and enable aligning the host and virtual clocks\n", QEMU_ARCH_ALL)
 STEXI
 @item -icount [shift=@var{N}|auto]
 @findex -icount
@@ -3026,6 +3026,17 @@ Note that while this option can give deterministic behavior, it does not
 provide cycle accurate emulation.  Modern CPUs contain superscalar out of
 order cores with complex cache hierarchies.  The number of instructions
 executed often has little or no correlation with actual performance.
+
+@option{align=on} will activate the delay algorithm which will try to
+to synchronise the host clock and the virtual clock. The goal is to
+have a guest running at the real frequency imposed by the shift option.
+Whenever the guest clock is behind the host clock and if
+@option{align=on} is specified then we print a messsage to the user
+to inform about the delay.
+Currently this option does not work when @option{shift} is @code{auto}.
+Note: The sync algorithm will work for those shift values for which
+the guest clock runs ahead of the host clock. Typically this happens
+when the shift value is high (how high depends on the host machine).
 ETEXI
 
 DEF("watchdog", HAS_ARG, QEMU_OPTION_watchdog, \
diff --git a/vl.c b/vl.c
index 103027f..d270070 100644
--- a/vl.c
+++ b/vl.c
@@ -183,6 +183,7 @@ uint8_t *boot_splash_filedata;
 size_t boot_splash_filedata_size;
 uint8_t qemu_extra_params_fw[2];
 
+int icount_align_option;
 typedef struct FWBootEntry FWBootEntry;
 
 struct FWBootEntry {
@@ -546,6 +547,9 @@ static QemuOptsList qemu_icount_opts = {
         {
             .name = "shift",
             .type = QEMU_OPT_STRING,
+        }, {
+            .name = "align",
+            .type = QEMU_OPT_BOOL,
         },
         { /* end of list */ }
     },
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH V5 3/6] icount: Make icount_time_shift available everywhere
  2014-07-25  9:56 [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Sebastian Tanase
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount Sebastian Tanase
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 2/6] icount: Add align option to icount Sebastian Tanase
@ 2014-07-25  9:56 ` Sebastian Tanase
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm Sebastian Tanase
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Sebastian Tanase @ 2014-07-25  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, peter.maydell, Sebastian Tanase, jeremy.rosen, alex,
	wenchaoqemu, quintela, mst, stefanha, armbru, lcapitulino,
	michael, camille.begue, aliguori, crobinso, pbonzini,
	pierre.lemagourou, afaerber, rth

icount_time_shift is used for calculting the delay
qemu has to sleep in order to synchronise the
host and guest clocks. Therefore, we need it in
cpu-exec.c.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpus.c                | 8 ++++++--
 include/qemu-common.h | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/cpus.c b/cpus.c
index 34cc4c8..de0a8b2 100644
--- a/cpus.c
+++ b/cpus.c
@@ -105,8 +105,12 @@ static bool all_cpu_threads_idle(void)
 /* Compensate for varying guest execution speed.  */
 static int64_t qemu_icount_bias;
 static int64_t vm_clock_warp_start;
-/* Conversion factor from emulated instructions to virtual clock ticks.  */
-static int icount_time_shift;
+/* Conversion factor from emulated instructions to virtual clock ticks.
+ * icount_time_shift is defined as extern in include/qemu-common.h because
+ * it is used (in cpu-exec.c) for calculating the delay for sleeping
+ * qemu in order to align the host and virtual clocks.
+ */
+int icount_time_shift;
 /* Arbitrarily pick 1MIPS as the minimum allowable speed.  */
 #define MAX_ICOUNT_SHIFT 10
 
diff --git a/include/qemu-common.h b/include/qemu-common.h
index 860bb15..d47aa02 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -109,6 +109,7 @@ static inline char *realpath(const char *path, char *resolved_path)
 void configure_icount(QemuOpts *opts, Error **errp);
 extern int use_icount;
 extern int icount_align_option;
+extern int icount_time_shift;
 
 #include "qemu/osdep.h"
 #include "qemu/bswap.h"
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
  2014-07-25  9:56 [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Sebastian Tanase
                   ` (2 preceding siblings ...)
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 3/6] icount: Make icount_time_shift available everywhere Sebastian Tanase
@ 2014-07-25  9:56 ` Sebastian Tanase
  2014-07-25 10:13   ` Paolo Bonzini
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 5/6] cpu_exec: Print to console if the guest is late Sebastian Tanase
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Sebastian Tanase @ 2014-07-25  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, peter.maydell, Sebastian Tanase, jeremy.rosen, alex,
	wenchaoqemu, quintela, mst, stefanha, armbru, lcapitulino,
	michael, camille.begue, aliguori, crobinso, pbonzini,
	pierre.lemagourou, afaerber, rth

The goal is to sleep qemu whenever the guest clock
is in advance compared to the host clock (we use
the monotonic clocks). The amount of time to sleep
is calculated in the execution loop in cpu_exec.

At first, we tried to approximate at each for loop the real time elapsed
while searching for a TB (generating or retrieving from cache) and
executing it. We would then approximate the virtual time corresponding
to the number of virtual instructions executed. The difference between
these 2 values would allow us to know if the guest is in advance or delayed.
However, the function used for measuring the real time
(qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive.
We had an added overhead of 13% of the total run time.

Therefore, we modified the algorithm and only take into account the
difference between the 2 clocks at the begining of the cpu_exec function.
During the for loop we try to reduce the advance of the guest only by
computing the virtual time elapsed and sleeping if necessary. The overhead
is thus reduced to 3%. Even though this method still has a noticeable
overhead, it no longer is a bottleneck in trying to achieve a better
guest frequency for which the guest clock is faster than the host one.

As for the the alignement of the 2 clocks, with the first algorithm
the guest clock was oscillating between -1 and 1ms compared to the host clock.
Using the second algorithm we notice that the guest is 5ms behind the host, which
is still acceptable for our use case.

The tests where conducted using fio and stress. The host machine in an i5 CPU at
3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm versatile-pb
built with buildroot.

Currently, on our test machine, the lowest icount we can achieve that is suitable for
aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) are
slower than the cpu tests (using stress).

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpu-exec.c           | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 cpus.c               | 17 ++++++++++
 include/qemu/timer.h |  1 +
 3 files changed, 109 insertions(+)

diff --git a/cpu-exec.c b/cpu-exec.c
index 38e5f02..1a725b6 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -22,6 +22,84 @@
 #include "tcg.h"
 #include "qemu/atomic.h"
 #include "sysemu/qtest.h"
+#include "qemu/timer.h"
+
+/* -icount align implementation. */
+
+typedef struct SyncClocks {
+    int64_t diff_clk;
+    int64_t original_instr_counter;
+} SyncClocks;
+
+#if !defined(CONFIG_USER_ONLY)
+/* Allow the guest to have a max 3ms advance.
+ * The difference between the 2 clocks could therefore
+ * oscillate around 0.
+ */
+#define VM_CLOCK_ADVANCE 3000000
+
+static int64_t delay_host(int64_t diff_clk)
+{
+    if (diff_clk > VM_CLOCK_ADVANCE) {
+#ifndef _WIN32
+        struct timespec sleep_delay, rem_delay;
+        sleep_delay.tv_sec = diff_clk / 1000000000LL;
+        sleep_delay.tv_nsec = diff_clk % 1000000000LL;
+        if (nanosleep(&sleep_delay, &rem_delay) < 0) {
+            diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 1000000000LL;
+            diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec;
+        } else {
+            diff_clk = 0;
+        }
+#else
+        Sleep(diff_clk / SCALE_MS);
+        diff_clk = 0;
+#endif
+    }
+    return diff_clk;
+}
+
+static int64_t instr_to_vtime(int64_t instr_counter, const CPUState *cpu)
+{
+    int64_t instr_exec_time;
+    instr_exec_time = instr_counter -
+                      (cpu->icount_extra +
+                       cpu->icount_decr.u16.low);
+    instr_exec_time = instr_exec_time << icount_time_shift;
+
+    return instr_exec_time;
+}
+
+static void align_clocks(SyncClocks *sc, const CPUState *cpu)
+{
+    if (!icount_align_option) {
+        return;
+    }
+    sc->diff_clk += instr_to_vtime(sc->original_instr_counter, cpu);
+    sc->original_instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low;
+    sc->diff_clk = delay_host(sc->diff_clk);
+}
+
+static void init_delay_params(SyncClocks *sc,
+                              const CPUState *cpu)
+{
+    if (!icount_align_option) {
+        return;
+    }
+    sc->diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
+                   qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
+                   cpu_get_clock_offset();
+    sc->original_instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low;
+}
+#else
+static void align_clocks(SyncClocks *sc, const CPUState *cpu)
+{
+}
+
+static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
+{
+}
+#endif /* CONFIG USER ONLY */
 
 void cpu_loop_exit(CPUState *cpu)
 {
@@ -227,6 +305,8 @@ int cpu_exec(CPUArchState *env)
     TranslationBlock *tb;
     uint8_t *tc_ptr;
     uintptr_t next_tb;
+    SyncClocks sc;
+
     /* This must be volatile so it is not trashed by longjmp() */
     volatile bool have_tb_lock = false;
 
@@ -283,6 +363,13 @@ int cpu_exec(CPUArchState *env)
 #endif
     cpu->exception_index = -1;
 
+    /* Calculate difference between guest clock and host clock.
+     * This delay includes the delay of the last cycle, so
+     * what we have to do is sleep until it is 0. As for the
+     * advance/delay we gain here, we try to fix it next time.
+     */
+    init_delay_params(&sc, cpu);
+
     /* prepare setjmp context for exception handling */
     for(;;) {
         if (sigsetjmp(cpu->jmp_env, 0) == 0) {
@@ -672,6 +759,7 @@ int cpu_exec(CPUArchState *env)
                             if (insns_left > 0) {
                                 /* Execute remaining instructions.  */
                                 cpu_exec_nocache(env, insns_left, tb);
+                                align_clocks(&sc, cpu);
                             }
                             cpu->exception_index = EXCP_INTERRUPT;
                             next_tb = 0;
@@ -684,6 +772,9 @@ int cpu_exec(CPUArchState *env)
                     }
                 }
                 cpu->current_tb = NULL;
+                /* Try to align the host and virtual clocks
+                   if the guest is in advance */
+                align_clocks(&sc, cpu);
                 /* reset soft MMU for next block (it can currently
                    only be set by a memory fault) */
             } /* for(;;) */
diff --git a/cpus.c b/cpus.c
index de0a8b2..fcc5308 100644
--- a/cpus.c
+++ b/cpus.c
@@ -218,6 +218,23 @@ int64_t cpu_get_clock(void)
     return ti;
 }
 
+/* return the offset between the host clock and virtual CPU clock */
+int64_t cpu_get_clock_offset(void)
+{
+    int64_t ti;
+    unsigned start;
+
+    do {
+        start = seqlock_read_begin(&timers_state.vm_clock_seqlock);
+        ti = timers_state.cpu_clock_offset;
+        if (!timers_state.cpu_ticks_enabled) {
+            ti -= get_clock();
+        }
+    } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
+
+    return -ti;
+}
+
 /* enable cpu_get_ticks()
  * Caller must hold BQL which server as mutex for vm_clock_seqlock.
  */
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 7f9a074..102f442 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -745,6 +745,7 @@ static inline int64_t get_clock(void)
 /* icount */
 int64_t cpu_get_icount(void);
 int64_t cpu_get_clock(void);
+int64_t cpu_get_clock_offset(void);
 
 /*******************************************/
 /* host CPU ticks (if available) */
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH V5 5/6] cpu_exec: Print to console if the guest is late
  2014-07-25  9:56 [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Sebastian Tanase
                   ` (3 preceding siblings ...)
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm Sebastian Tanase
@ 2014-07-25  9:56 ` Sebastian Tanase
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 6/6] monitor: Add drift info to 'info jit' Sebastian Tanase
  2014-07-25 10:00 ` [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Andreas Färber
  6 siblings, 0 replies; 13+ messages in thread
From: Sebastian Tanase @ 2014-07-25  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, peter.maydell, Sebastian Tanase, jeremy.rosen, alex,
	wenchaoqemu, quintela, mst, stefanha, armbru, lcapitulino,
	michael, camille.begue, aliguori, crobinso, pbonzini,
	pierre.lemagourou, afaerber, rth

If the align option is enabled, we print to the user whenever
the guest clock is behind the host clock in order for he/she
to have a hint about the actual performance. The maximum
print interval is 2s and we limit the number of messages to 100.
If desired, this can be changed in cpu-exec.c

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>

Conflicts:
	cpu-exec.c
---
 cpu-exec.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 1a725b6..7259c94 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -29,6 +29,7 @@
 typedef struct SyncClocks {
     int64_t diff_clk;
     int64_t original_instr_counter;
+    int64_t realtime_clock;
 } SyncClocks;
 
 #if !defined(CONFIG_USER_ONLY)
@@ -37,6 +38,9 @@ typedef struct SyncClocks {
  * oscillate around 0.
  */
 #define VM_CLOCK_ADVANCE 3000000
+#define THRESHOLD_REDUCE 1.5
+#define MAX_DELAY_PRINT_RATE 2
+#define MAX_NB_PRINTS 100
 
 static int64_t delay_host(int64_t diff_clk)
 {
@@ -80,16 +84,42 @@ static void align_clocks(SyncClocks *sc, const CPUState *cpu)
     sc->diff_clk = delay_host(sc->diff_clk);
 }
 
+static void print_delay(const SyncClocks *sc)
+{
+    static float threshold_delay;
+    static int64_t last_realtime_clock;
+    static int nb_prints;
+
+    if (icount_align_option &&
+        (sc->realtime_clock - last_realtime_clock) / 1000000000LL
+        >= MAX_DELAY_PRINT_RATE && nb_prints < MAX_NB_PRINTS) {
+        if ((-sc->diff_clk / (float)1000000000LL > threshold_delay) ||
+            (-sc->diff_clk / (float)1000000000LL <
+             (threshold_delay - THRESHOLD_REDUCE))) {
+            threshold_delay = (-sc->diff_clk / 1000000000LL) + 1;
+            printf("Warning: The guest is now late by %.1f to %.1f seconds\n",
+                   threshold_delay - 1,
+                   threshold_delay);
+            nb_prints++;
+            last_realtime_clock = sc->realtime_clock;
+        }
+    }
+}
+
 static void init_delay_params(SyncClocks *sc,
                               const CPUState *cpu)
 {
     if (!icount_align_option) {
         return;
     }
+    sc->realtime_clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     sc->diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
-                   qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
+                   sc->realtime_clock +
                    cpu_get_clock_offset();
     sc->original_instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low;
+    /* Print every 2s max if the guest is late. We limit the number
+       of printed messages to NB_PRINT_MAX(currently 100) */
+    print_delay(sc);
 }
 #else
 static void align_clocks(SyncClocks *sc, const CPUState *cpu)
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH V5 6/6] monitor: Add drift info to 'info jit'
  2014-07-25  9:56 [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Sebastian Tanase
                   ` (4 preceding siblings ...)
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 5/6] cpu_exec: Print to console if the guest is late Sebastian Tanase
@ 2014-07-25  9:56 ` Sebastian Tanase
  2014-07-25 10:00 ` [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Andreas Färber
  6 siblings, 0 replies; 13+ messages in thread
From: Sebastian Tanase @ 2014-07-25  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, peter.maydell, Sebastian Tanase, jeremy.rosen, alex,
	wenchaoqemu, quintela, mst, stefanha, armbru, lcapitulino,
	michael, camille.begue, aliguori, crobinso, pbonzini,
	pierre.lemagourou, afaerber, rth

Show in 'info jit' the current delay between the host clock
and the guest clock. In addition, print the maximum advance
and delay of the guest compared to the host.

Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
---
 cpu-exec.c            |  6 ++++++
 cpus.c                | 15 +++++++++++++++
 include/qemu-common.h |  4 ++++
 monitor.c             |  1 +
 4 files changed, 26 insertions(+)

diff --git a/cpu-exec.c b/cpu-exec.c
index 7259c94..533e0bf 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -117,6 +117,12 @@ static void init_delay_params(SyncClocks *sc,
                    sc->realtime_clock +
                    cpu_get_clock_offset();
     sc->original_instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low;
+    if (sc->diff_clk < max_delay) {
+        max_delay = sc->diff_clk;
+    }
+    if (sc->diff_clk > max_advance) {
+        max_advance = sc->diff_clk;
+    }
     /* Print every 2s max if the guest is late. We limit the number
        of printed messages to NB_PRINT_MAX(currently 100) */
     print_delay(sc);
diff --git a/cpus.c b/cpus.c
index fcc5308..c98036e 100644
--- a/cpus.c
+++ b/cpus.c
@@ -64,6 +64,8 @@
 #endif /* CONFIG_LINUX */
 
 static CPUState *next_cpu;
+int64_t max_delay;
+int64_t max_advance;
 
 bool cpu_is_stopped(CPUState *cpu)
 {
@@ -1521,3 +1523,16 @@ void qmp_inject_nmi(Error **errp)
     error_set(errp, QERR_UNSUPPORTED);
 #endif
 }
+
+void dump_drift_info(FILE *f, fprintf_function cpu_fprintf)
+{
+    cpu_fprintf(f, "Host - Guest clock  %ld(ms)\n",
+                (cpu_get_clock() - cpu_get_icount())/SCALE_MS);
+    if (icount_align_option) {
+        cpu_fprintf(f, "Max guest delay     %ld(ms)\n", -max_delay/SCALE_MS);
+        cpu_fprintf(f, "Max guest advance   %ld(ms)\n", max_advance/SCALE_MS);
+    } else {
+        cpu_fprintf(f, "Max guest delay     NA\n");
+        cpu_fprintf(f, "Max guest advance   NA\n");
+    }
+}
diff --git a/include/qemu-common.h b/include/qemu-common.h
index d47aa02..55971df 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -110,6 +110,10 @@ void configure_icount(QemuOpts *opts, Error **errp);
 extern int use_icount;
 extern int icount_align_option;
 extern int icount_time_shift;
+/* drift information for info jit command */
+extern int64_t max_delay;
+extern int64_t max_advance;
+void dump_drift_info(FILE *f, fprintf_function cpu_fprintf);
 
 #include "qemu/osdep.h"
 #include "qemu/bswap.h"
diff --git a/monitor.c b/monitor.c
index 5bc70a6..cdbaa60 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1047,6 +1047,7 @@ static void do_info_registers(Monitor *mon, const QDict *qdict)
 static void do_info_jit(Monitor *mon, const QDict *qdict)
 {
     dump_exec_info((FILE *)mon, monitor_fprintf);
+    dump_drift_info((FILE *)mon, monitor_fprintf);
 }
 
 static void do_info_history(Monitor *mon, const QDict *qdict)
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks
  2014-07-25  9:56 [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Sebastian Tanase
                   ` (5 preceding siblings ...)
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 6/6] monitor: Add drift info to 'info jit' Sebastian Tanase
@ 2014-07-25 10:00 ` Andreas Färber
  6 siblings, 0 replies; 13+ messages in thread
From: Andreas Färber @ 2014-07-25 10:00 UTC (permalink / raw)
  To: Sebastian Tanase, qemu-devel
  Cc: kwolf, peter.maydell, jeremy.rosen, aliguori, wenchaoqemu,
	quintela, mst, stefanha, armbru, lcapitulino, michael,
	camille.begue, alex, crobinso, pbonzini, pierre.lemagourou, rth

Am 25.07.2014 11:56, schrieb Sebastian Tanase:
> Sebastian Tanase (6):
>   icount: Add QemuOpts for icount
>   icount: Add align option to icount
>   icount: Make icount_time_shift available everywhere
>   cpu_exec: Add sleeping algorithm
>   cpu_exec: Print to console if the guest is late

FWIW the file name is cpu-exec, please fix if you need to resend.

Thanks,
Andreas

>   monitor: Add drift info to 'info jit'
> 
>  cpu-exec.c            | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  cpus.c                |  59 +++++++++++++++++++++--
>  include/qemu-common.h |   9 +++-
>  include/qemu/timer.h  |   1 +
>  monitor.c             |   1 +
>  qemu-options.hx       |  17 +++++--
>  qtest.c               |  13 +++++-
>  vl.c                  |  39 +++++++++++++---
>  8 files changed, 248 insertions(+), 18 deletions(-)

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm Sebastian Tanase
@ 2014-07-25 10:13   ` Paolo Bonzini
  2014-07-25 14:28     ` Sebastian Tanase
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2014-07-25 10:13 UTC (permalink / raw)
  To: Sebastian Tanase, qemu-devel
  Cc: kwolf, peter.maydell, jeremy.rosen, aliguori, wenchaoqemu,
	quintela, mst, stefanha, armbru, lcapitulino, michael,
	camille.begue, alex, crobinso, pierre.lemagourou, afaerber, rth

Il 25/07/2014 11:56, Sebastian Tanase ha scritto:
> The goal is to sleep qemu whenever the guest clock
> is in advance compared to the host clock (we use
> the monotonic clocks). The amount of time to sleep
> is calculated in the execution loop in cpu_exec.
> 
> At first, we tried to approximate at each for loop the real time elapsed
> while searching for a TB (generating or retrieving from cache) and
> executing it. We would then approximate the virtual time corresponding
> to the number of virtual instructions executed. The difference between
> these 2 values would allow us to know if the guest is in advance or delayed.
> However, the function used for measuring the real time
> (qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive.
> We had an added overhead of 13% of the total run time.
> 
> Therefore, we modified the algorithm and only take into account the
> difference between the 2 clocks at the begining of the cpu_exec function.
> During the for loop we try to reduce the advance of the guest only by
> computing the virtual time elapsed and sleeping if necessary. The overhead
> is thus reduced to 3%. Even though this method still has a noticeable
> overhead, it no longer is a bottleneck in trying to achieve a better
> guest frequency for which the guest clock is faster than the host one.
> 
> As for the the alignement of the 2 clocks, with the first algorithm
> the guest clock was oscillating between -1 and 1ms compared to the host clock.
> Using the second algorithm we notice that the guest is 5ms behind the host, which
> is still acceptable for our use case.
> 
> The tests where conducted using fio and stress. The host machine in an i5 CPU at
> 3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm versatile-pb
> built with buildroot.
> 
> Currently, on our test machine, the lowest icount we can achieve that is suitable for
> aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) are
> slower than the cpu tests (using stress).
> 
> Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
> Tested-by: Camille Bégué <camille.begue@openwide.fr>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  cpu-exec.c           | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  cpus.c               | 17 ++++++++++
>  include/qemu/timer.h |  1 +
>  3 files changed, 109 insertions(+)
> 
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 38e5f02..1a725b6 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -22,6 +22,84 @@
>  #include "tcg.h"
>  #include "qemu/atomic.h"
>  #include "sysemu/qtest.h"
> +#include "qemu/timer.h"
> +
> +/* -icount align implementation. */
> +
> +typedef struct SyncClocks {
> +    int64_t diff_clk;
> +    int64_t original_instr_counter;
> +} SyncClocks;
> +
> +#if !defined(CONFIG_USER_ONLY)
> +/* Allow the guest to have a max 3ms advance.
> + * The difference between the 2 clocks could therefore
> + * oscillate around 0.
> + */
> +#define VM_CLOCK_ADVANCE 3000000
> +
> +static int64_t delay_host(int64_t diff_clk)
> +{
> +    if (diff_clk > VM_CLOCK_ADVANCE) {
> +#ifndef _WIN32
> +        struct timespec sleep_delay, rem_delay;
> +        sleep_delay.tv_sec = diff_clk / 1000000000LL;
> +        sleep_delay.tv_nsec = diff_clk % 1000000000LL;
> +        if (nanosleep(&sleep_delay, &rem_delay) < 0) {
> +            diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 1000000000LL;
> +            diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec;
> +        } else {
> +            diff_clk = 0;
> +        }
> +#else
> +        Sleep(diff_clk / SCALE_MS);
> +        diff_clk = 0;
> +#endif
> +    }
> +    return diff_clk;
> +}
> +
> +static int64_t instr_to_vtime(int64_t instr_counter, const CPUState *cpu)
> +{
> +    int64_t instr_exec_time;
> +    instr_exec_time = instr_counter -
> +                      (cpu->icount_extra +
> +                       cpu->icount_decr.u16.low);
> +    instr_exec_time = instr_exec_time << icount_time_shift;
> +
> +    return instr_exec_time;
> +}
> +
> +static void align_clocks(SyncClocks *sc, const CPUState *cpu)
> +{
> +    if (!icount_align_option) {
> +        return;
> +    }
> +    sc->diff_clk += instr_to_vtime(sc->original_instr_counter, cpu);
> +    sc->original_instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low;
> +    sc->diff_clk = delay_host(sc->diff_clk);
> +}

Just two comments:

1) perhaps s/original/last/ in original_instr_counter?

2) I think I prefer this to be written like:

    instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low;
    instr_exec_time = sc->original_instr_counter - instr_counter;
    sc->original_instr_counter = instr_counter
    sc->diff_clk += instr_exec_time << icount_time_shift;
    sc->diff_clk = delay_host(sc->diff_clk);

If you agree, I can do it when applying the patches.

Thanks for your persistence, I'm very happy with this version!

As a follow up, do you think it's possible to modify the places where
you run align_clocks, so that you sleep with the iothread mutex *not* taken?

Paolo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
  2014-07-25 10:13   ` Paolo Bonzini
@ 2014-07-25 14:28     ` Sebastian Tanase
  0 siblings, 0 replies; 13+ messages in thread
From: Sebastian Tanase @ 2014-07-25 14:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kwolf, peter maydell, jeremy rosen, aliguori, wenchaoqemu,
	quintela, qemu-devel, mst, stefanha, armbru, lcapitulino, michael,
	camille begue, alex, crobinso, pierre lemagourou, afaerber, rth



----- Mail original -----
> De: "Paolo Bonzini" <pbonzini@redhat.com>
> À: "Sebastian Tanase" <sebastian.tanase@openwide.fr>, qemu-devel@nongnu.org
> Cc: aliguori@amazon.com, afaerber@suse.de, rth@twiddle.net, "peter maydell" <peter.maydell@linaro.org>,
> michael@walle.cc, alex@alex.org.uk, stefanha@redhat.com, lcapitulino@redhat.com, crobinso@redhat.com,
> armbru@redhat.com, wenchaoqemu@gmail.com, quintela@redhat.com, kwolf@redhat.com, mst@redhat.com, "pierre lemagourou"
> <pierre.lemagourou@openwide.fr>, "jeremy rosen" <jeremy.rosen@openwide.fr>, "camille begue"
> <camille.begue@openwide.fr>
> Envoyé: Vendredi 25 Juillet 2014 12:13:44
> Objet: Re: [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
> 
> Il 25/07/2014 11:56, Sebastian Tanase ha scritto:
> > The goal is to sleep qemu whenever the guest clock
> > is in advance compared to the host clock (we use
> > the monotonic clocks). The amount of time to sleep
> > is calculated in the execution loop in cpu_exec.
> > 
> > At first, we tried to approximate at each for loop the real time
> > elapsed
> > while searching for a TB (generating or retrieving from cache) and
> > executing it. We would then approximate the virtual time
> > corresponding
> > to the number of virtual instructions executed. The difference
> > between
> > these 2 values would allow us to know if the guest is in advance or
> > delayed.
> > However, the function used for measuring the real time
> > (qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very
> > expensive.
> > We had an added overhead of 13% of the total run time.
> > 
> > Therefore, we modified the algorithm and only take into account the
> > difference between the 2 clocks at the begining of the cpu_exec
> > function.
> > During the for loop we try to reduce the advance of the guest only
> > by
> > computing the virtual time elapsed and sleeping if necessary. The
> > overhead
> > is thus reduced to 3%. Even though this method still has a
> > noticeable
> > overhead, it no longer is a bottleneck in trying to achieve a
> > better
> > guest frequency for which the guest clock is faster than the host
> > one.
> > 
> > As for the the alignement of the 2 clocks, with the first algorithm
> > the guest clock was oscillating between -1 and 1ms compared to the
> > host clock.
> > Using the second algorithm we notice that the guest is 5ms behind
> > the host, which
> > is still acceptable for our use case.
> > 
> > The tests where conducted using fio and stress. The host machine in
> > an i5 CPU at
> > 3.10GHz running Debian Jessie (kernel 3.12). The guest machine is
> > an arm versatile-pb
> > built with buildroot.
> > 
> > Currently, on our test machine, the lowest icount we can achieve
> > that is suitable for
> > aligning the 2 clocks is 6. However, we observe that the IO tests
> > (using fio) are
> > slower than the cpu tests (using stress).
> > 
> > Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
> > Tested-by: Camille Bégué <camille.begue@openwide.fr>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  cpu-exec.c           | 91
> >  ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  cpus.c               | 17 ++++++++++
> >  include/qemu/timer.h |  1 +
> >  3 files changed, 109 insertions(+)
> > 
> > diff --git a/cpu-exec.c b/cpu-exec.c
> > index 38e5f02..1a725b6 100644
> > --- a/cpu-exec.c
> > +++ b/cpu-exec.c
> > @@ -22,6 +22,84 @@
> >  #include "tcg.h"
> >  #include "qemu/atomic.h"
> >  #include "sysemu/qtest.h"
> > +#include "qemu/timer.h"
> > +
> > +/* -icount align implementation. */
> > +
> > +typedef struct SyncClocks {
> > +    int64_t diff_clk;
> > +    int64_t original_instr_counter;
> > +} SyncClocks;
> > +
> > +#if !defined(CONFIG_USER_ONLY)
> > +/* Allow the guest to have a max 3ms advance.
> > + * The difference between the 2 clocks could therefore
> > + * oscillate around 0.
> > + */
> > +#define VM_CLOCK_ADVANCE 3000000
> > +
> > +static int64_t delay_host(int64_t diff_clk)
> > +{
> > +    if (diff_clk > VM_CLOCK_ADVANCE) {
> > +#ifndef _WIN32
> > +        struct timespec sleep_delay, rem_delay;
> > +        sleep_delay.tv_sec = diff_clk / 1000000000LL;
> > +        sleep_delay.tv_nsec = diff_clk % 1000000000LL;
> > +        if (nanosleep(&sleep_delay, &rem_delay) < 0) {
> > +            diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) *
> > 1000000000LL;
> > +            diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec;
> > +        } else {
> > +            diff_clk = 0;
> > +        }
> > +#else
> > +        Sleep(diff_clk / SCALE_MS);
> > +        diff_clk = 0;
> > +#endif
> > +    }
> > +    return diff_clk;
> > +}
> > +
> > +static int64_t instr_to_vtime(int64_t instr_counter, const
> > CPUState *cpu)
> > +{
> > +    int64_t instr_exec_time;
> > +    instr_exec_time = instr_counter -
> > +                      (cpu->icount_extra +
> > +                       cpu->icount_decr.u16.low);
> > +    instr_exec_time = instr_exec_time << icount_time_shift;
> > +
> > +    return instr_exec_time;
> > +}
> > +
> > +static void align_clocks(SyncClocks *sc, const CPUState *cpu)
> > +{
> > +    if (!icount_align_option) {
> > +        return;
> > +    }
> > +    sc->diff_clk += instr_to_vtime(sc->original_instr_counter,
> > cpu);
> > +    sc->original_instr_counter = cpu->icount_extra +
> > cpu->icount_decr.u16.low;
> > +    sc->diff_clk = delay_host(sc->diff_clk);
> > +}
> 
> Just two comments:
> 
> 1) perhaps s/original/last/ in original_instr_counter?
> 
> 2) I think I prefer this to be written like:
> 
>     instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low;
>     instr_exec_time = sc->original_instr_counter - instr_counter;
>     sc->original_instr_counter = instr_counter
>     sc->diff_clk += instr_exec_time << icount_time_shift;
>     sc->diff_clk = delay_host(sc->diff_clk);
> 
> If you agree, I can do it when applying the patches.
> 

Sure, no problem.

> Thanks for your persistence, I'm very happy with this version!
> 
> As a follow up, do you think it's possible to modify the places where
> you run align_clocks, so that you sleep with the iothread mutex *not*
> taken?
> 
> Paolo

I'll consider that and run some tests.
> 

Thank you very much for your help and guidance.


Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount Sebastian Tanase
@ 2014-08-08  6:51   ` Markus Armbruster
  2014-09-15  7:04   ` TeLeMan
  1 sibling, 0 replies; 13+ messages in thread
From: Markus Armbruster @ 2014-08-08  6:51 UTC (permalink / raw)
  To: Sebastian Tanase
  Cc: kwolf, peter.maydell, jeremy.rosen, wenchaoqemu, quintela, rth,
	mst, stefanha, qemu-devel, lcapitulino, michael, camille.begue,
	alex, crobinso, pbonzini, pierre.lemagourou, afaerber, aliguori

Sebastian Tanase <sebastian.tanase@openwide.fr> writes:

> Make icount parameter use QemuOpts style options in order
> to easily add other suboptions.
>
> Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
> Tested-by: Camille Bégué <camille.begue@openwide.fr>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[...]
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 9e54686..143def4 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -3011,11 +3011,11 @@ re-inject them.
>  ETEXI
>  
>  DEF("icount", HAS_ARG, QEMU_OPTION_icount, \
> -    "-icount [N|auto]\n" \
> +    "-icount [shift=N|auto]\n" \
>      "                enable virtual instruction counter with 2^N clock ticks per\n" \
>      "                instruction\n", QEMU_ARCH_ALL)
>  STEXI
> -@item -icount [@var{N}|auto]
> +@item -icount [shift=@var{N}|auto]
>  @findex -icount
>  Enable virtual instruction counter.  The virtual cpu will execute one
>  instruction every 2^@var{N} ns of virtual time.  If @code{auto} is specified


"shift=" is documented to be required, but...

[...]
> diff --git a/vl.c b/vl.c
> index 41ddcd2..103027f 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -537,6 +537,20 @@ static QemuOptsList qemu_mem_opts = {
>      },
>  };
>  
> +static QemuOptsList qemu_icount_opts = {
> +    .name = "icount",
> +    .implied_opt_name = "shift",

... it's actually optional.  Please fix the documentation.
I guess [[shift=][N|auto] would be correct.

> +    .merge_lists = true,
> +    .head = QTAILQ_HEAD_INITIALIZER(qemu_icount_opts.head),
> +    .desc = {
> +        {
> +            .name = "shift",
> +            .type = QEMU_OPT_STRING,
> +        },
> +        { /* end of list */ }
> +    },
> +};
> +
>  /**
>   * Get machine options
>   *
[...]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount
  2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount Sebastian Tanase
  2014-08-08  6:51   ` Markus Armbruster
@ 2014-09-15  7:04   ` TeLeMan
  2014-09-15 11:41     ` Paolo Bonzini
  1 sibling, 1 reply; 13+ messages in thread
From: TeLeMan @ 2014-09-15  7:04 UTC (permalink / raw)
  To: Sebastian Tanase
  Cc: Kevin Wolf, Peter Maydell, pierre.lemagourou, jeremy.rosen,
	wenchaoqemu, Juan Quintela, rth, Michael S. Tsirkin,
	Stefan Hajnoczi, qemu-devel, armbru, michael, camille.begue, alex,
	crobinso, Paolo Bonzini, Luiz Capitulino, Andreas Färber,
	aliguori

On Fri, Jul 25, 2014 at 5:56 PM, Sebastian Tanase
<sebastian.tanase@openwide.fr> wrote:
> Make icount parameter use QemuOpts style options in order
> to easily add other suboptions.
>
> Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
> Tested-by: Camille Bégué <camille.begue@openwide.fr>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  cpus.c                | 10 +++++++++-
>  include/qemu-common.h |  3 ++-
>  qemu-options.hx       |  4 ++--
>  qtest.c               | 13 +++++++++++--
>  vl.c                  | 35 ++++++++++++++++++++++++++++-------
>  5 files changed, 52 insertions(+), 13 deletions(-)
>
> diff --git a/cpus.c b/cpus.c
> index 5e7f2cf..dcca96a 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -440,13 +440,21 @@ static const VMStateDescription vmstate_timers = {
>      }
>  };
>
> -void configure_icount(const char *option)
> +void configure_icount(QemuOpts *opts, Error **errp)
>  {
> +    const char *option;
> +
>      seqlock_init(&timers_state.vm_clock_seqlock, NULL);
>      vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
> +    option = qemu_opt_get(opts, "shift");
>      if (!option) {
>          return;
>      }
> +    /* When using -icount shift, the shift option will be
> +       misinterpreted as a boolean */
> +    if (strcmp(option, "on") == 0 || strcmp(option, "off") == 0) {
> +        error_setg(errp, "The shift option must be a number or auto");
> +    }
>
>      icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
>                                            icount_warp_rt, NULL);
> diff --git a/vl.c b/vl.c
> index 41ddcd2..103027f 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -537,6 +537,20 @@ static QemuOptsList qemu_mem_opts = {
>      },
>  };
>
> +static QemuOptsList qemu_icount_opts = {
> +    .name = "icount",
> +    .implied_opt_name = "shift",
> +    .merge_lists = true,
> +    .head = QTAILQ_HEAD_INITIALIZER(qemu_icount_opts.head),
> +    .desc = {
> +        {
> +            .name = "shift",
> +            .type = QEMU_OPT_STRING,
> +        },
> +        { /* end of list */ }
> +    },
> +};
> +
>  /**
>   * Get machine options
>   *
> @@ -2896,13 +2910,12 @@ int main(int argc, char **argv, char **envp)
>  {
>      int i;
>      int snapshot, linux_boot;
> -    const char *icount_option = NULL;
>      const char *initrd_filename;
>      const char *kernel_filename, *kernel_cmdline;
>      const char *boot_order;
>      DisplayState *ds;
>      int cyls, heads, secs, translation;
> -    QemuOpts *hda_opts = NULL, *opts, *machine_opts;
> +    QemuOpts *hda_opts = NULL, *opts, *machine_opts, *icount_opts = NULL;
>      QemuOptsList *olist;
>      int optind;
>      const char *optarg;
> @@ -2967,6 +2980,7 @@ int main(int argc, char **argv, char **envp)
>      qemu_add_opts(&qemu_msg_opts);
>      qemu_add_opts(&qemu_name_opts);
>      qemu_add_opts(&qemu_numa_opts);
> +    qemu_add_opts(&qemu_icount_opts);
>
>      runstate_init();
>
> @@ -3817,7 +3831,11 @@ int main(int argc, char **argv, char **envp)
>                  }
>                  break;
>              case QEMU_OPTION_icount:
> -                icount_option = optarg;
> +                icount_opts = qemu_opts_parse(qemu_find_opts("icount"),
> +                                              optarg, 1);
> +                if (!icount_opts) {
> +                    exit(1);
> +                }
>                  break;
>              case QEMU_OPTION_incoming:
>                  incoming = optarg;
> @@ -4294,11 +4312,14 @@ int main(int argc, char **argv, char **envp)
>      qemu_spice_init();
>  #endif
>
> -    if (icount_option && (kvm_enabled() || xen_enabled())) {
> -        fprintf(stderr, "-icount is not allowed with kvm or xen\n");
> -        exit(1);
> +    if (icount_opts) {
> +        if (kvm_enabled() || xen_enabled()) {
> +            fprintf(stderr, "-icount is not allowed with kvm or xen\n");
> +            exit(1);
> +        }
> +        configure_icount(icount_opts, &error_abort);
> +        qemu_opts_del(icount_opts);
>      }
> -    configure_icount(icount_option);

When icount_opts is null, seqlock_init() & vmstate_register() in
configure_icount() cannot be executed.

>      /* clean up network at qemu process termination */
>      atexit(&net_cleanup);
> --
> 2.0.0.rc2
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount
  2014-09-15  7:04   ` TeLeMan
@ 2014-09-15 11:41     ` Paolo Bonzini
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2014-09-15 11:41 UTC (permalink / raw)
  To: TeLeMan, Sebastian Tanase
  Cc: Kevin Wolf, Peter Maydell, jeremy.rosen, wenchaoqemu,
	Juan Quintela, alex, Michael S. Tsirkin, aliguori, qemu-devel,
	armbru, michael, camille.begue, Stefan Hajnoczi, crobinso,
	Luiz Capitulino, pierre.lemagourou, Andreas Färber, rth

Il 15/09/2014 09:04, TeLeMan ha scritto:
>> > -    configure_icount(icount_option);
> When icount_opts is null, seqlock_init() & vmstate_register() in
> configure_icount() cannot be executed.
> 

There's already a pull request sent with a fix.

Paolo

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2014-09-15 11:41 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-07-25  9:56 [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Sebastian Tanase
2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount Sebastian Tanase
2014-08-08  6:51   ` Markus Armbruster
2014-09-15  7:04   ` TeLeMan
2014-09-15 11:41     ` Paolo Bonzini
2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 2/6] icount: Add align option to icount Sebastian Tanase
2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 3/6] icount: Make icount_time_shift available everywhere Sebastian Tanase
2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm Sebastian Tanase
2014-07-25 10:13   ` Paolo Bonzini
2014-07-25 14:28     ` Sebastian Tanase
2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 5/6] cpu_exec: Print to console if the guest is late Sebastian Tanase
2014-07-25  9:56 ` [Qemu-devel] [PATCH V5 6/6] monitor: Add drift info to 'info jit' Sebastian Tanase
2014-07-25 10:00 ` [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks Andreas Färber

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).