qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
@ 2013-07-18 15:02 fred.konrad
  2013-07-18 15:02 ` [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount fred.konrad
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: fred.konrad @ 2013-07-18 15:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, mark.burton, fred.konrad

From: KONRAD Frederic <fred.konrad@greensocs.com>

Hi everybody,

As I said in the last email, we have issues with determinism with icount.
We are wondering if determinism is really ensured with icount?

We saw that rt_clock is used in multiple places, which is a pain for
replaying the simulation the same way. So we simply base rt_clock on icount.

When the CPU is idle, vm_clock is synchronized with the host clock, which is bad
for determinism, and this happens regularly even if there is no I/O.
We chose to synchronize vm_clock on the next event to make sure the VM will
restart, and to avoid non-determinism.

Is this the right thing to do?

Another issue:

A timer callback is triggered to take the snapshot regularly. Unfortunately this
was not at all regular. For example: if we use the step function in GDB instead
of cont, the callback is triggered more often.

So we:
    * created a new clock (ic_clock) based only on icount without the bias.
    * and we repaired the icount_extra mechanism by computing it from ic_clock.

Then the snapshots are taken regularly with or without stepping in gdb; the timer
became accurate and calls the callback at the exact ic_clock time.
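
A minimal sketch of the distinction between a clock that includes the bias and
one that does not. This is not QEMU code: the IcountModel struct and the
model_* helpers are hypothetical names used only for illustration, and the
shift factor mirrors the role icount_time_shift plays in the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the global icount state (not a QEMU type). */
typedef struct {
    int64_t insns_executed; /* instructions executed so far */
    int64_t bias_ns;        /* time added while the CPU was idle */
    int     time_shift;     /* one instruction counts as 2^time_shift ns */
} IcountModel;

/* Time that moves only with the instruction counter (the ic_clock idea). */
static int64_t model_icount_ns(const IcountModel *m)
{
    assert(m->time_shift >= 0 && m->time_shift < 32);
    return m->insns_executed * ((int64_t)1 << m->time_shift);
}

/* Time including the idle bias (the vm_clock idea when icount is in use). */
static int64_t model_virtual_ns(const IcountModel *m)
{
    return m->bias_ns + model_icount_ns(m);
}

/* Warp the biased clock forward while idle; instruction time is untouched. */
static void model_warp(IcountModel *m, int64_t delta_ns)
{
    if (delta_ns > 0) {
        m->bias_ns += delta_ns;
    }
}
```

In this model a warp changes model_virtual_ns() but never model_icount_ns(),
which is why a timer armed on the unbiased clock fires at the same instruction
count regardless of how long the CPU sat idle.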

Both icount and reverse execution need an instruction counter. icount uses a
count-down mechanism but reverse execution needs a continuous counter. For now
we have built a separate counter and we think that these two counters can be
merged. However we would like feedback about this before modifying it.
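
As a rough illustration of how a continuous counter can be derived from a
count-down budget, here is a sketch under simplifying assumptions. The Budget
struct and budget_* functions are hypothetical; they only imitate the split
between a 16-bit decrementer and an "extra" remainder that appears in
tcg_cpu_exec() in the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a count-down instruction budget. */
typedef struct {
    int64_t  granted;  /* total instructions handed out so far */
    uint16_t decr_low; /* 16-bit count-down consumed by the CPU */
    int64_t  extra;    /* budget that did not fit in decr_low */
} Budget;

/* Continuous counter: everything granted minus what is still unused. */
static int64_t budget_executed(const Budget *b)
{
    return b->granted - (b->decr_low + b->extra);
}

/* Hand out `count` instructions, folding back any unused budget first. */
static void budget_grant(Budget *b, int64_t count)
{
    assert(count >= 0);
    b->granted -= b->decr_low + b->extra;
    b->granted += count;
    b->decr_low = (uint16_t)(count > 0xffff ? 0xffff : count);
    b->extra = count - b->decr_low;
}

/* Simulate the CPU consuming n instructions from the budget. */
static void budget_run(Budget *b, int64_t n)
{
    int64_t remaining = b->decr_low + b->extra;

    assert(n >= 0 && n <= remaining);
    remaining -= n;
    b->decr_low = (uint16_t)(remaining > 0xffff ? 0xffff : remaining);
    b->extra = remaining - b->decr_low;
}
```

The point of the sketch is that budget_executed() never goes backwards when a
new budget is granted, so a single piece of state can serve both the count-down
and the continuous view.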

Adding these features has moved us forward, but QEMU is STILL not
deterministic. We believe this is likely due to I/O, which we will start
investigating next.

Fred

KONRAD Frederic (3):
  icount: base rt_clock on icount.
  icount: sync vm_clock on the next event.
  icount: create a new icount based timer.

 cpus.c               | 21 +++++++++++++++++----
 include/qemu/timer.h |  4 ++++
 main-loop.c          |  5 +++++
 qemu-timer.c         | 12 +++++++++++-
 4 files changed, 37 insertions(+), 5 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount.
  2013-07-18 15:02 [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount fred.konrad
@ 2013-07-18 15:02 ` fred.konrad
  2013-07-18 15:36   ` Paolo Bonzini
  2013-07-18 15:02 ` [Qemu-devel] [RFC 2/3] icount: sync vm_clock on the next event fred.konrad
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: fred.konrad @ 2013-07-18 15:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, mark.burton, fred.konrad

From: KONRAD Frederic <fred.konrad@greensocs.com>

This bases rt_clock on icount, as is already done for vm_clock,
so that vm_clock = rt_clock.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
---
 qemu-timer.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index b2d95e2..6c607e5 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -401,7 +401,11 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
 
     switch(clock->type) {
     case QEMU_CLOCK_REALTIME:
-        return get_clock();
+        if (use_icount) {
+            return cpu_get_icount();
+        } else {
+            return get_clock();
+        }
     default:
     case QEMU_CLOCK_VIRTUAL:
         if (use_icount) {
-- 
1.8.1.4


* [Qemu-devel] [RFC 2/3] icount: sync vm_clock on the next event.
  2013-07-18 15:02 [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount fred.konrad
  2013-07-18 15:02 ` [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount fred.konrad
@ 2013-07-18 15:02 ` fred.konrad
  2013-07-18 15:02 ` [Qemu-devel] [RFC 3/3] icount: create a new icount based timer fred.konrad
  2013-07-18 15:06 ` [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount Peter Maydell
  3 siblings, 0 replies; 18+ messages in thread
From: fred.konrad @ 2013-07-18 15:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, mark.burton, fred.konrad

From: KONRAD Frederic <fred.konrad@greensocs.com>

We don't want vm_clock to be synchronized with rt_clock as it is
not deterministic for replay.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
---
 cpus.c      | 11 +++++++++--
 main-loop.c |  5 +++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/cpus.c b/cpus.c
index 46504d0..fb83153 100644
--- a/cpus.c
+++ b/cpus.c
@@ -64,6 +64,8 @@
 
 static CPUArchState *next_cpu;
 
+void icount_warp_rt(void *opaque);
+
 static bool cpu_thread_is_idle(CPUState *cpu)
 {
     if (cpu->stop || cpu->queued_work_first) {
@@ -277,7 +279,7 @@ static int64_t qemu_icount_round(int64_t count)
     return (count + (1 << icount_time_shift) - 1) >> icount_time_shift;
 }
 
-static void icount_warp_rt(void *opaque)
+void icount_warp_rt(void *opaque)
 {
     if (vm_clock_warp_start == -1) {
         return;
@@ -286,8 +288,13 @@ static void icount_warp_rt(void *opaque)
     if (runstate_is_running()) {
         int64_t clock = qemu_get_clock_ns(rt_clock);
         int64_t warp_delta = clock - vm_clock_warp_start;
+        int64_t next_vm_deadline = qemu_clock_deadline(vm_clock);
         if (use_icount == 1) {
-            qemu_icount_bias += warp_delta;
+            if (next_vm_deadline > 0) {
+                qemu_icount_bias += next_vm_deadline;
+            } else {
+                qemu_notify_event();
+            }
         } else {
             /*
              * In adaptive mode, do not let the vm_clock run too
diff --git a/main-loop.c b/main-loop.c
index a44fff6..16c9b85 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -33,6 +33,8 @@
 
 #include "qemu/compatfd.h"
 
+void icount_warp_rt(void *opaque);
+
 /* If we have signalfd, we mask out the signals we want to handle and then
  * use signalfd to listen for them.  We rely on whatever the current signal
  * handler is to dispatch the signals when we receive them.
@@ -470,6 +472,9 @@ int main_loop_wait(int nonblocking)
 
     qemu_run_all_timers();
 
+    if (use_icount == 1) {
+        icount_warp_rt(NULL);
+    }
     return ret;
 }
 
-- 
1.8.1.4


* [Qemu-devel] [RFC 3/3] icount: create a new icount based timer.
  2013-07-18 15:02 [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount fred.konrad
  2013-07-18 15:02 ` [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount fred.konrad
  2013-07-18 15:02 ` [Qemu-devel] [RFC 2/3] icount: sync vm_clock on the next event fred.konrad
@ 2013-07-18 15:02 ` fred.konrad
  2013-07-18 15:08   ` Peter Maydell
  2013-07-18 15:06 ` [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount Peter Maydell
  3 siblings, 1 reply; 18+ messages in thread
From: fred.konrad @ 2013-07-18 15:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, mark.burton, fred.konrad

From: KONRAD Frederic <fred.konrad@greensocs.com>

This creates a new icount based timer, with no bias.

It moves only with the instruction counter.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
---
 cpus.c               | 10 ++++++++--
 include/qemu/timer.h |  4 ++++
 qemu-timer.c         |  6 ++++++
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/cpus.c b/cpus.c
index fb83153..86fe82b 100644
--- a/cpus.c
+++ b/cpus.c
@@ -156,6 +156,12 @@ void restore_icount(CPUArchState *env, int save)
 /* Return the virtual CPU time, based on the instruction counter.  */
 int64_t cpu_get_icount(void)
 {
+    return qemu_icount_bias + cpu_get_icount_wo_bias();
+}
+
+/* Return the virtual CPU time, really based on the instruction counter.  */
+int64_t cpu_get_icount_wo_bias(void)
+{
     int64_t icount;
     CPUArchState *env = cpu_single_env;
 
@@ -166,7 +172,7 @@ int64_t cpu_get_icount(void)
         }
         icount -= (env->icount_decr.u16.low + env->icount_extra);
     }
-    return qemu_icount_bias + (icount << icount_time_shift);
+    return icount << icount_time_shift;
 }
 
 /* return the host CPU cycle counter and handle stop/restart */
@@ -1165,7 +1171,7 @@ static int tcg_cpu_exec(CPUArchState *env)
         qemu_icount -= (env->icount_decr.u16.low + env->icount_extra);
         env->icount_decr.u16.low = 0;
         env->icount_extra = 0;
-        count = qemu_icount_round(qemu_clock_deadline(vm_clock));
+        count = qemu_icount_round(qemu_clock_deadline(ic_clock));
         qemu_icount += count;
         decr = (count > 0xffff) ? 0xffff : count;
         count -= decr;
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index b4d8229..6e53f22 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -32,6 +32,9 @@ extern QEMUClock *vm_clock;
    the virtual clock. */
 extern QEMUClock *host_clock;
 
+/* A new clock based on icount only. */
+extern QEMUClock *ic_clock;
+
 int64_t qemu_get_clock_ns(QEMUClock *clock);
 int64_t qemu_clock_has_timers(QEMUClock *clock);
 int64_t qemu_clock_expired(QEMUClock *clock);
@@ -136,6 +139,7 @@ void qemu_put_timer(QEMUFile *f, QEMUTimer *ts);
 
 /* icount */
 int64_t cpu_get_icount(void);
+int64_t cpu_get_icount_wo_bias(void);
 int64_t cpu_get_clock(void);
 
 /*******************************************/
diff --git a/qemu-timer.c b/qemu-timer.c
index 6c607e5..79d5dcb 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -43,6 +43,7 @@
 #define QEMU_CLOCK_REALTIME 0
 #define QEMU_CLOCK_VIRTUAL  1
 #define QEMU_CLOCK_HOST     2
+#define QEMU_CLOCK_ICOUNT   3
 
 struct QEMUClock {
     QEMUTimer *active_timers;
@@ -230,6 +231,7 @@ next:
 QEMUClock *rt_clock;
 QEMUClock *vm_clock;
 QEMUClock *host_clock;
+QEMUClock *ic_clock;
 
 static QEMUClock *qemu_new_clock(int type)
 {
@@ -413,6 +415,8 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
         } else {
             return cpu_get_clock();
         }
+    case QEMU_CLOCK_ICOUNT:
+        return cpu_get_icount_wo_bias();
     case QEMU_CLOCK_HOST:
         now = get_clock_realtime();
         last = clock->last;
@@ -440,6 +444,7 @@ void init_clocks(void)
         rt_clock = qemu_new_clock(QEMU_CLOCK_REALTIME);
         vm_clock = qemu_new_clock(QEMU_CLOCK_VIRTUAL);
         host_clock = qemu_new_clock(QEMU_CLOCK_HOST);
+        ic_clock = qemu_new_clock(QEMU_CLOCK_ICOUNT);
     }
 }
 
@@ -456,6 +461,7 @@ void qemu_run_all_timers(void)
     qemu_run_timers(vm_clock);
     qemu_run_timers(rt_clock);
     qemu_run_timers(host_clock);
+    qemu_run_timers(ic_clock);
 
     /* rearm timer, if not periodic */
     if (alarm_timer->expired) {
-- 
1.8.1.4


* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
  2013-07-18 15:02 [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount fred.konrad
                   ` (2 preceding siblings ...)
  2013-07-18 15:02 ` [Qemu-devel] [RFC 3/3] icount: create a new icount based timer fred.konrad
@ 2013-07-18 15:06 ` Peter Maydell
  2013-07-18 15:09   ` Frederic Konrad
  2013-07-18 15:35   ` Paolo Bonzini
  3 siblings, 2 replies; 18+ messages in thread
From: Peter Maydell @ 2013-07-18 15:06 UTC (permalink / raw)
  To: fred.konrad; +Cc: pbonzini, mark.burton, qemu-devel

On 18 July 2013 16:02,  <fred.konrad@greensocs.com> wrote:
> As I said in the last email, we have issues with determinism with icount.
> We are wondering if determinism is really ensured with icount?

My opinion is that it *should* be deterministic but it would
be unsurprising if the determinism had got broken along the way.

> Both icount and reverse execution need an instruction counter. icount use a
> count-down mechanism but reverse execution need a continuous counter. For now
> we have build a separate counter and we think that these two counters can be
> merged. However we would like feedback about this before modifying this.

I definitely think that there should only be one counter, not two.

thanks
-- PMM


* Re: [Qemu-devel] [RFC 3/3] icount: create a new icount based timer.
  2013-07-18 15:02 ` [Qemu-devel] [RFC 3/3] icount: create a new icount based timer fred.konrad
@ 2013-07-18 15:08   ` Peter Maydell
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Maydell @ 2013-07-18 15:08 UTC (permalink / raw)
  To: fred.konrad; +Cc: pbonzini, mark.burton, qemu-devel

On 18 July 2013 16:02,  <fred.konrad@greensocs.com> wrote:
> @@ -156,6 +156,12 @@ void restore_icount(CPUArchState *env, int save)
>  /* Return the virtual CPU time, based on the instruction counter.  */
>  int64_t cpu_get_icount(void)
>  {
> +    return qemu_icount_bias + cpu_get_icount_wo_bias();
> +}
> +
> +/* Return the virtual CPU time, really based on the instruction counter.  */
> +int64_t cpu_get_icount_wo_bias(void)
> +{

The comments for these two functions don't make any sense.
You need to explain what they're actually doing (and
when you'd want one and when the other).

-- PMM


* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
  2013-07-18 15:06 ` [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount Peter Maydell
@ 2013-07-18 15:09   ` Frederic Konrad
  2013-07-18 15:12     ` Peter Maydell
  2013-07-18 15:35   ` Paolo Bonzini
  1 sibling, 1 reply; 18+ messages in thread
From: Frederic Konrad @ 2013-07-18 15:09 UTC (permalink / raw)
  To: Peter Maydell; +Cc: pbonzini, mark.burton, qemu-devel

On 18/07/2013 17:06, Peter Maydell wrote:
> On 18 July 2013 16:02,  <fred.konrad@greensocs.com> wrote:
>> As I said in the last email, we have issues with determinism with icount.
>> We are wondering if determinism is really ensured with icount?
> My opinion is that it *should* be deterministic but it would
> be unsurprising if the determinism had got broken along the way.

Yes, the documentation says that this option can give deterministic results,
which is why we asked.
>
>> Both icount and reverse execution need an instruction counter. icount use a
>> count-down mechanism but reverse execution need a continuous counter. For now
>> we have build a separate counter and we think that these two counters can be
>> merged. However we would like feedback about this before modifying this.
> I definitely think that there should only be one counter, not two.
>
> thanks
> -- PMM


* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
  2013-07-18 15:09   ` Frederic Konrad
@ 2013-07-18 15:12     ` Peter Maydell
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Maydell @ 2013-07-18 15:12 UTC (permalink / raw)
  To: Frederic Konrad; +Cc: pbonzini, mark.burton, qemu-devel

On 18 July 2013 16:09, Frederic Konrad <fred.konrad@greensocs.com> wrote:
> On 18/07/2013 17:06, Peter Maydell wrote:
>>
>> On 18 July 2013 16:02,  <fred.konrad@greensocs.com> wrote:
>>>
>>> As I said in the last email, we have issues with determinism with icount.
>>> We are wondering if determinism is really ensured with icount?
>>
>> My opinion is that it *should* be deterministic but it would
>> be unsurprising if the determinism had got broken along the way.
>
>
> Yes, the documentation say that this command can give deterministic results
> that's why we asked.

As part of working through this it would be great if you could
write some developer documentation (in docs/ or possibly as a
comment somewhere sensible) that summarises how icount works,
what you need to do in a target-* to support it, etc.
[I suspect that by the time you're done you're going to be the
expert on icount...]

thanks
-- PMM


* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
  2013-07-18 15:06 ` [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount Peter Maydell
  2013-07-18 15:09   ` Frederic Konrad
@ 2013-07-18 15:35   ` Paolo Bonzini
  2013-07-18 16:31     ` Frederic Konrad
  2013-07-29 15:27     ` Frederic Konrad
  1 sibling, 2 replies; 18+ messages in thread
From: Paolo Bonzini @ 2013-07-18 15:35 UTC (permalink / raw)
  To: Peter Maydell; +Cc: mark.burton, qemu-devel, fred.konrad

Il 18/07/2013 17:06, Peter Maydell ha scritto:
> On 18 July 2013 16:02,  <fred.konrad@greensocs.com> wrote:
>> As I said in the last email, we have issues with determinism with icount.
>> We are wondering if determinism is really ensured with icount?
> 
> My opinion is that it *should* be deterministic but it would
> be unsurprising if the determinism had got broken along the way.

First of all, it can only be deterministic if the guest satisfies (at
least) all of the following conditions:

1) only uses timers that QEMU bases on vm_clock (which means that you
should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
previous answer);

2) never does any network operation nor any asynchronous disk I/O operation

3) never halts the VCPU waiting for an interrupt


Point 1 is obvious.


To explain point 2, let's consider what happens if a block device uses
synchronous vs. asynchronous I/O.

With synchronous I/O, each block device operation will complete
immediately.  All clocks are stalled during the operation.

With asynchronous I/O, each block device operation will be done while
the CPU is running.  If the CPU is polling a completion flag, the number
of instructions executed (thus icount) depends on how long it takes to
do I/O.
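
A toy model of why polling breaks determinism, assuming a guest that spins on a
completion flag. Nothing here is QEMU API: the function name and all three
parameters are hypothetical, and real guests are of course not this regular.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Guest instructions spent polling a completion flag.
 *   host_latency_ns: host time the asynchronous I/O takes (varies per run)
 *   ns_per_poll:     host time one poll iteration takes
 *   insns_per_poll:  guest instructions per poll iteration
 * All three are illustrative parameters, not values QEMU exposes.
 */
static int64_t insns_spent_polling(int64_t host_latency_ns,
                                   int64_t ns_per_poll,
                                   int64_t insns_per_poll)
{
    int64_t polls;

    assert(host_latency_ns >= 0 && ns_per_poll > 0 && insns_per_poll > 0);

    /* The guest always checks the flag at least once. */
    polls = (host_latency_ns + ns_per_poll - 1) / ns_per_poll;
    if (polls < 1) {
        polls = 1;
    }
    return polls * insns_per_poll;
}
```

With synchronous I/O the operation is complete by the first check, so the
instruction count is fixed; with asynchronous I/O the count follows
host_latency_ns, which differs from run to run.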


To explain point 3 (which is the only one that _might_ be fixable),
let's see what happens if the VCPU halts waiting for an interrupt.  If
that is the case, and you haven't done any asynchronous I/O, there
should be active vm_clock timers, and you have another possible source
of non-deterministic behavior.

The current QEMU behavior is (and has always been) to start tracking
rt_clock.  This is obviously not deterministic.  Note that with the
switch to separate threads for iothread/VCPU, the algorithm to do this
has become much better.  Let's look at a couple possibilities:

2) jump to the next vm_clock deadline.  This sounds appealing, but it is
still nondeterministic in the general case when the guest *is* doing
asynchronous I/O too.  How many vm_clock timers do you run before I/O
finishes?  Furthermore, the vm_clock might move too fast.  Think of an
RTC clock whose alarm registers are 0/0/0 so it fires at midnight; if it
is the only active vm_clock timer, you end up in 2107 even before the
kernel boots!

3) do not process vm_clock timers at all unless there is no pending I/O
(block/network); if there is none, track rt_clock as in current
behavior.  I just made it up, but it sounds promising and similar to
synchronous I/O.  It should not be extremely hard to implement, and it
can remove this kind of nondeterminism.  But it won't fix the case when
the CPU is polling.

Paolo

ps: I'm not an expert on icount at all, I'm only reasoning about the
possible interactions with the main loop.

>> Both icount and reverse execution need an instruction counter. icount use a
>> count-down mechanism but reverse execution need a continuous counter. For now
>> we have build a separate counter and we think that these two counters can be
>> merged. However we would like feedback about this before modifying this.
> 
> I definitely think that there should only be one counter, not two.
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount.
  2013-07-18 15:02 ` [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount fred.konrad
@ 2013-07-18 15:36   ` Paolo Bonzini
  2013-07-18 16:23     ` Frederic Konrad
  0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2013-07-18 15:36 UTC (permalink / raw)
  To: fred.konrad; +Cc: mark.burton, qemu-devel

Il 18/07/2013 17:02, fred.konrad@greensocs.com ha scritto:
> From: KONRAD Frederic <fred.konrad@greensocs.com>
> 
> This bases rt_clock on icount, as vm_clock.
> So vm_clock = rt_clock.
> 
> Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
> ---
>  qemu-timer.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-timer.c b/qemu-timer.c
> index b2d95e2..6c607e5 100644
> --- a/qemu-timer.c
> +++ b/qemu-timer.c
> @@ -401,7 +401,11 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
>  
>      switch(clock->type) {
>      case QEMU_CLOCK_REALTIME:
> -        return get_clock();
> +        if (use_icount) {
> +            return cpu_get_icount();
> +        } else {
> +            return get_clock();
> +        }
>      default:
>      case QEMU_CLOCK_VIRTUAL:
>          if (use_icount) {
> 

rt_clock is very little used in general.  You should use "-rtc clock=vm"
if you want to base the RTC on vm_clock.

Paolo


* Re: [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount.
  2013-07-18 15:36   ` Paolo Bonzini
@ 2013-07-18 16:23     ` Frederic Konrad
  2013-07-18 16:26       ` Paolo Bonzini
  0 siblings, 1 reply; 18+ messages in thread
From: Frederic Konrad @ 2013-07-18 16:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: mark.burton, qemu-devel

On 18/07/2013 17:36, Paolo Bonzini wrote:
> Il 18/07/2013 17:02, fred.konrad@greensocs.com ha scritto:
>> From: KONRAD Frederic <fred.konrad@greensocs.com>
>>
>> This bases rt_clock on icount, as vm_clock.
>> So vm_clock = rt_clock.
>>
>> Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
>> ---
>>   qemu-timer.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/qemu-timer.c b/qemu-timer.c
>> index b2d95e2..6c607e5 100644
>> --- a/qemu-timer.c
>> +++ b/qemu-timer.c
>> @@ -401,7 +401,11 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
>>   
>>       switch(clock->type) {
>>       case QEMU_CLOCK_REALTIME:
>> -        return get_clock();
>> +        if (use_icount) {
>> +            return cpu_get_icount();
>> +        } else {
>> +            return get_clock();
>> +        }
>>       default:
>>       case QEMU_CLOCK_VIRTUAL:
>>           if (use_icount) {
>>
> rt_clock is very little used in general.  You should use "-rtc clock=vm"
> if you want to base the RTC on vm_clock.
>
> Paolo

True, but it seems to be used in some places.

For example, in ui/console.c:
ds->gui_timer = qemu_new_timer_ms(rt_clock, gui_update, ds);

Maybe it can cause trouble, no?

Fred


* Re: [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount.
  2013-07-18 16:23     ` Frederic Konrad
@ 2013-07-18 16:26       ` Paolo Bonzini
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2013-07-18 16:26 UTC (permalink / raw)
  To: Frederic Konrad; +Cc: mark.burton, qemu-devel

Il 18/07/2013 18:23, Frederic Konrad ha scritto:
> On 18/07/2013 17:36, Paolo Bonzini wrote:
>> Il 18/07/2013 17:02, fred.konrad@greensocs.com ha scritto:
>>> From: KONRAD Frederic <fred.konrad@greensocs.com>
>>>
>>> This bases rt_clock on icount, as vm_clock.
>>> So vm_clock = rt_clock.
>>>
>>> Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
>>> ---
>>>   qemu-timer.c | 6 +++++-
>>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qemu-timer.c b/qemu-timer.c
>>> index b2d95e2..6c607e5 100644
>>> --- a/qemu-timer.c
>>> +++ b/qemu-timer.c
>>> @@ -401,7 +401,11 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
>>>         switch(clock->type) {
>>>       case QEMU_CLOCK_REALTIME:
>>> -        return get_clock();
>>> +        if (use_icount) {
>>> +            return cpu_get_icount();
>>> +        } else {
>>> +            return get_clock();
>>> +        }
>>>       default:
>>>       case QEMU_CLOCK_VIRTUAL:
>>>           if (use_icount) {
>>>
>> rt_clock is very little used in general.  You should use "-rtc clock=vm"
>> if you want to base the RTC on vm_clock.
>>
>> Paolo
> 
> True but it seems used in some place:
> 
> For example: ui/console.c:
> ds->gui_timer = qemu_new_timer_ms(rt_clock, gui_update, ds);
> 
> Maybe it can cause trouble no?

In theory it is only used in places where it shouldn't cause trouble
(those that do should use the similarly named rtc_clock variable).

Paolo


* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
  2013-07-18 15:35   ` Paolo Bonzini
@ 2013-07-18 16:31     ` Frederic Konrad
  2013-07-18 16:35       ` Paolo Bonzini
  2013-07-29 15:27     ` Frederic Konrad
  1 sibling, 1 reply; 18+ messages in thread
From: Frederic Konrad @ 2013-07-18 16:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Peter Maydell, mark.burton, qemu-devel

On 18/07/2013 17:35, Paolo Bonzini wrote:
> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>> On 18 July 2013 16:02,  <fred.konrad@greensocs.com> wrote:
>>> As I said in the last email, we have issues with determinism with icount.
>>> We are wondering if determinism is really ensured with icount?
>> My opinion is that it *should* be deterministic but it would
>> be unsurprising if the determinism had got broken along the way.
> First of all, it can only be deterministic if the guest satisfies (at
> least) all the following condition:
>
> 1) only uses timer that QEMU bases on vm_clock (which means that you
> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
> previous answer);

Oops, sorry, I didn't mention that, but we used "-rtc clock=vm" for our tests.
> 2) never does any network operation nor any asynchronous disk I/O operation
>
> 3) never halts the VCPU waiting for an interrupt
>
>
> Point 1 is obvious.
>
>
> To explain points 2, let's consider what happens if a block device uses
> synchronous vs. asynchronous I/O.
>
> With synchronous I/O, each block device operation will complete
> immediately.  All clocks are stalled during the operation.
>
> With asynchronous I/O, each block device operation will be done while
> the CPU is running.  If the CPU is polling a completion flag, the number
> of instructions executed (thus icount) depends on how long it takes to
> do I/O.

So I suppose this can happen as soon as there is any network card or block
device.

We probably need to disable it until we can finally save and replay I/O, to
get this thing working.

>
>
> To explain point 3 (which is the only one that _might_ be fixable),
> let's see what happens if the VCPU halts waiting for an interrupt.  If
> that is the case, and you haven't done any asynchronous I/O, there
> should be active vm_clock timers, and you have another possible source
> of non-deterministic behavior.
>
> The current QEMU behavior is (and has always been) to start tracking
> rt_clock.  This is obviously not deterministic.  Note that with the
> switch to separate threads for iothread/VCPU, the algorithm to do this
> has become much better.  Let's look at a couple possibilities:
>
> 2) jump to the next vm_clock deadline.  This sounds appealing, but it is
> still nondeterministic in the general case when the guest *is* doing
> asynchronous I/O too.  How many vm_clock timers do you run before I/O
> finishes?  Furthermore, the vm_clock might move too fast.  Think of an
> RTC clock whose alarm registers are 0/0/0 so it fires at midnight; if it
> is the only active vm_clock timer, you end up in 2107 even before the
> kernel boots!

Yes, I didn't think about that :).
>
> 3) do not process vm_clock timers at all unless there is no pending I/O
> (block/network); if there is none, track rt_clock as in current
> behavior.  I just made it up, but it sounds promising and similar to
> synchronous I/O.  It should not be extremely hard to implement, and it
> can remove this kind of nondeterminism.  But it won't fix the case when
> the CPU is polling.

Thanks, I need to take a look at all this.
>
> Paolo
>
> ps: I'm not an expert on icount at all, I'm only reasoning of the
> possible interactions with the main loop.
>
>>> Both icount and reverse execution need an instruction counter. icount use a
>>> count-down mechanism but reverse execution need a continuous counter. For now
>>> we have build a separate counter and we think that these two counters can be
>>> merged. However we would like feedback about this before modifying this.
>> I definitely think that there should only be one counter, not two.
>>
>> thanks
>> -- PMM
>>


* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
  2013-07-18 16:31     ` Frederic Konrad
@ 2013-07-18 16:35       ` Paolo Bonzini
  2013-07-19 15:26         ` Frederic Konrad
  0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2013-07-18 16:35 UTC (permalink / raw)
  To: Frederic Konrad; +Cc: Peter Maydell, mark.burton, qemu-devel, Orit Wasserman

Il 18/07/2013 18:31, Frederic Konrad ha scritto:
> On 18/07/2013 17:35, Paolo Bonzini wrote:
>> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>>> On 18 July 2013 16:02,  <fred.konrad@greensocs.com> wrote:
>>>> As I said in the last email, we have issues with determinism with
>>>> icount.
>>>> We are wondering if determinism is really ensured with icount?
>>> My opinion is that it *should* be deterministic but it would
>>> be unsurprising if the determinism had got broken along the way.
>> First of all, it can only be deterministic if the guest satisfies (at
>> least) all the following condition:
>>
>> 1) only uses timer that QEMU bases on vm_clock (which means that you
>> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
>> previous answer);
> 
> Oops sorry, I didn't mentioned that, but we used rtc clock=vm for our
> tests.
>> 2) never does any network operation nor any asynchronous disk I/O
>> operation
>>
>> 3) never halts the VCPU waiting for an interrupt
>>
>>
>> Point 1 is obvious.
>>
>>
>> To explain points 2, let's consider what happens if a block device uses
>> synchronous vs. asynchronous I/O.
>>
>> With synchronous I/O, each block device operation will complete
>> immediately.  All clocks are stalled during the operation.
>>
>> With asynchronous I/O, each block device operation will be done while
>> the CPU is running.  If the CPU is polling a completion flag, the number
>> of instructions executed (thus icount) depends on how long it takes to
>> do I/O.
> 
> So I suppose this can happen even if there are any network card or block
> device.
> 
> We probably need to disable it until we finally save and replay IO, to
> get this thing working.

Are you aware of the work that was done on fault tolerance (Kemari)?
Orit is working on resurrecting it.

Paolo


* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
  2013-07-18 16:35       ` Paolo Bonzini
@ 2013-07-19 15:26         ` Frederic Konrad
  0 siblings, 0 replies; 18+ messages in thread
From: Frederic Konrad @ 2013-07-19 15:26 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Peter Maydell, mark.burton, qemu-devel, Orit Wasserman

On 18/07/2013 18:35, Paolo Bonzini wrote:
> Il 18/07/2013 18:31, Frederic Konrad ha scritto:
>> On 18/07/2013 17:35, Paolo Bonzini wrote:
>>> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>>>> On 18 July 2013 16:02,  <fred.konrad@greensocs.com> wrote:
>>>>> As I said in the last email, we have issues with determinism with
>>>>> icount.
>>>>> We are wondering if determinism is really ensured with icount?
>>>> My opinion is that it *should* be deterministic but it would
>>>> be unsurprising if the determinism had got broken along the way.
>>> First of all, it can only be deterministic if the guest satisfies (at
>>> least) all the following condition:
>>>
>>> 1) only uses timer that QEMU bases on vm_clock (which means that you
>>> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
>>> previous answer);
>> Oops sorry, I didn't mentioned that, but we used rtc clock=vm for our
>> tests.
>>> 2) never does any network operation nor any asynchronous disk I/O
>>> operation
>>>
>>> 3) never halts the VCPU waiting for an interrupt
>>>
>>>
>>> Point 1 is obvious.
>>>
>>>
>>> To explain points 2, let's consider what happens if a block device uses
>>> synchronous vs. asynchronous I/O.
>>>
>>> With synchronous I/O, each block device operation will complete
>>> immediately.  All clocks are stalled during the operation.
>>>
>>> With asynchronous I/O, each block device operation will be done while
>>> the CPU is running.  If the CPU is polling a completion flag, the number
>>> of instructions executed (thus icount) depends on how long it takes to
>>> do I/O.
>> So I suppose this can happen even if there are any network card or block
>> device.
>>
>> We probably need to disable it until we finally save and replay IO, to
>> get this thing working.
> Are you aware of the work that was done on fault tolerance (Kemari)?
> Orit is working on resurrecting it.
>
> Paolo

No, but I will take a look; that could be really useful for IO.

Thanks,
Fred

* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
  2013-07-18 15:35   ` Paolo Bonzini
  2013-07-18 16:31     ` Frederic Konrad
@ 2013-07-29 15:27     ` Frederic Konrad
  2013-07-29 16:42       ` Paolo Bonzini
  1 sibling, 1 reply; 18+ messages in thread
From: Frederic Konrad @ 2013-07-29 15:27 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Peter Maydell, mark.burton, qemu-devel

On 18/07/2013 17:35, Paolo Bonzini wrote:
> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>> On 18 July 2013 16:02,<fred.konrad@greensocs.com>  wrote:
>>> As I said in the last email, we have issues with determinism with icount.
>>> We are wondering if determinism is really ensured with icount?
>> My opinion is that it *should* be deterministic but it would
>> be unsurprising if the determinism had got broken along the way.
> First of all, it can only be deterministic if the guest satisfies (at
> least) all the following condition:
>
> 1) only uses timer that QEMU bases on vm_clock (which means that you
> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
> previous answer);
>
> 2) never does any network operation nor any asynchronous disk I/O operation
>
> 3) never halts the VCPU waiting for an interrupt

Hi,

qemu_alarm is making the replay non-deterministic too.

We tried to remove those alarms and it seems to replay well (at least 
far better).

So the question is: how can we solve that?

We thought of two possibilities:
   * record/replay them, like IO.
   * base them on our new ic_clock.

Both have drawbacks:
   * record/replay won't make icount more deterministic (run to run).
   * ic_clock's speed is apparently not constant.
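
For the second possibility, a minimal sketch of a timer driven purely by an instruction counter is given here. It is not the patch series' code: `run`, `chunk` and `period` are hypothetical names, and the model assumes execution can always be stopped exactly at the next deadline. It says nothing about the second drawback listed above.

```python
def run(total_insns, chunk, period):
    """Execute total_insns in slices of at most `chunk` instructions and
    fire a timer every `period` instructions of guest execution."""
    icount = 0
    next_deadline = period
    fired = []
    while icount < total_insns:
        # Never run past the next deadline or past the end of the run.
        budget = min(chunk, next_deadline - icount, total_insns - icount)
        icount += budget
        if icount == next_deadline:
            fired.append(icount)  # timer callback would run here
            next_deadline += period
    return fired


stepping = run(1000, chunk=1, period=300)       # like single-stepping in gdb
continuing = run(1000, chunk=1000, period=300)  # like "cont"
print(stepping == continuing)
```

Because the deadline is expressed in instructions, the callback fires at the same counts however execution is sliced.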

Thanks,
Fred

* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
  2013-07-29 15:27     ` Frederic Konrad
@ 2013-07-29 16:42       ` Paolo Bonzini
  2013-07-30  7:06         ` Frederic Konrad
  0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2013-07-29 16:42 UTC (permalink / raw)
  To: Frederic Konrad; +Cc: Peter Maydell, mark.burton, qemu-devel

Il 29/07/2013 17:27, Frederic Konrad ha scritto:
> On 18/07/2013 17:35, Paolo Bonzini wrote:
>> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>>> On 18 July 2013 16:02,<fred.konrad@greensocs.com>  wrote:
>>>> As I said in the last email, we have issues with determinism with
>>>> icount.
>>>> We are wondering if determinism is really ensured with icount?
>>> My opinion is that it *should* be deterministic but it would
>>> be unsurprising if the determinism had got broken along the way.
>> First of all, it can only be deterministic if the guest satisfies (at
>> least) all the following condition:
>>
>> 1) only uses timer that QEMU bases on vm_clock (which means that you
>> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
>> previous answer);
>>
>> 2) never does any network operation nor any asynchronous disk I/O
>> operation
>>
>> 3) never halts the VCPU waiting for an interrupt
> 
> Hi,
> 
> qemu_alarm is making the replay non-deterministic too.

What is qemu_alarm?  If you mean qemu_alarm_timer, then that means
rt_clock and host_clock (item 1 above)?

If so, yes, I believe you need to record/replay them.  When doing replay
for reverse execution, you certainly want to execute at full speed
without waiting for real time to pass again.
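
A minimal sketch of what recording and replaying host-clock reads could look like, assuming each read is logged together with the instruction count at which it happened. `RecordedClock` and `run_guest` are hypothetical names, not QEMU APIs.

```python
import time


class RecordedClock:
    """Records host-clock reads, or replays them from an earlier log."""

    def __init__(self, log=None):
        self.replaying = log is not None
        self.log = list(log) if self.replaying else []
        self.pos = 0

    def read_ns(self, icount):
        if self.replaying:
            logged_icount, value = self.log[self.pos]
            if logged_icount != icount:
                raise RuntimeError("replay diverged from recording")
            self.pos += 1
            return value  # returned at once, no waiting for real time
        value = time.monotonic_ns()
        self.log.append((icount, value))
        return value


def run_guest(clock):
    """Toy guest that reads the clock at fixed instruction counts."""
    return [clock.read_ns(icount) for icount in (100, 250, 900)]


recorder = RecordedClock()
first_run = run_guest(recorder)
replayed_run = run_guest(RecordedClock(recorder.log))
print(first_run == replayed_run)
```

The replayed run sees exactly the values of the recorded run and never touches the host clock, so it can proceed at full speed.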

Paolo

> We tried to remove those alarms and it seems to replay well (at least
> far better).
> 
> So the question is: how can we solve that?
> 
> We thought of two possibilities:
>   * record/replay them, like IO.
>   * base them on our new ic_clock.
> 
> Both have drawbacks:
>   * record/replay won't make icount more deterministic (run to run).
>   * ic_clock's speed is apparently not constant.
> 
> Thanks,
> Fred
> 

* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
  2013-07-29 16:42       ` Paolo Bonzini
@ 2013-07-30  7:06         ` Frederic Konrad
  0 siblings, 0 replies; 18+ messages in thread
From: Frederic Konrad @ 2013-07-30  7:06 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Peter Maydell, mark.burton, qemu-devel

On 29/07/2013 18:42, Paolo Bonzini wrote:
> Il 29/07/2013 17:27, Frederic Konrad ha scritto:
>> On 18/07/2013 17:35, Paolo Bonzini wrote:
>>> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>>>> On 18 July 2013 16:02,<fred.konrad@greensocs.com>  wrote:
>>>>> As I said in the last email, we have issues with determinism with
>>>>> icount.
>>>>> We are wondering if determinism is really ensured with icount?
>>>> My opinion is that it *should* be deterministic but it would
>>>> be unsurprising if the determinism had got broken along the way.
>>> First of all, it can only be deterministic if the guest satisfies (at
>>> least) all the following condition:
>>>
>>> 1) only uses timer that QEMU bases on vm_clock (which means that you
>>> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
>>> previous answer);
>>>
>>> 2) never does any network operation nor any asynchronous disk I/O
>>> operation
>>>
>>> 3) never halts the VCPU waiting for an interrupt
>> Hi,
>>
>> qemu_alarm is making the replay non-deterministic too.
> What is qemu_alarm?  If you mean qemu_alarm_timer, then that means
> rt_clock and host_clock (item 1 above)?
>
> If so, yes, I believe you need to record/replay them.  When doing replay
> for reverse execution, you certainly want to execute at full speed
> without waiting for real time to pass again.
>
> Paolo

Yes, it was what we believed too. :)

Thanks,
Fred

>> We tried to remove those alarms and it seems to replay well (at least
>> far better).
>>
>> So the question is: how can we solve that?
>>
>> We thought of two possibilities:
>>    * record/replay them, like IO.
>>    * base them on our new ic_clock.
>>
>> Both have drawbacks:
>>    * record/replay won't make icount more deterministic (run to run).
>>    * ic_clock's speed is apparently not constant.
>>
>> Thanks,
>> Fred
>>

Thread overview: 18+ messages
2013-07-18 15:02 [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount fred.konrad
2013-07-18 15:02 ` [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount fred.konrad
2013-07-18 15:36   ` Paolo Bonzini
2013-07-18 16:23     ` Frederic Konrad
2013-07-18 16:26       ` Paolo Bonzini
2013-07-18 15:02 ` [Qemu-devel] [RFC 2/3] icount: sync vm_clock on the next event fred.konrad
2013-07-18 15:02 ` [Qemu-devel] [RFC 3/3] icount: create a new icount based timer fred.konrad
2013-07-18 15:08   ` Peter Maydell
2013-07-18 15:06 ` [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount Peter Maydell
2013-07-18 15:09   ` Frederic Konrad
2013-07-18 15:12     ` Peter Maydell
2013-07-18 15:35   ` Paolo Bonzini
2013-07-18 16:31     ` Frederic Konrad
2013-07-18 16:35       ` Paolo Bonzini
2013-07-19 15:26         ` Frederic Konrad
2013-07-29 15:27     ` Frederic Konrad
2013-07-29 16:42       ` Paolo Bonzini
2013-07-30  7:06         ` Frederic Konrad