* [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount.
2013-07-18 15:02 [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount fred.konrad
@ 2013-07-18 15:02 ` fred.konrad
2013-07-18 15:36 ` Paolo Bonzini
2013-07-18 15:02 ` [Qemu-devel] [RFC 2/3] icount: sync vm_clock on the next event fred.konrad
` (2 subsequent siblings)
3 siblings, 1 reply; 18+ messages in thread
From: fred.konrad @ 2013-07-18 15:02 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mark.burton, fred.konrad
From: KONRAD Frederic <fred.konrad@greensocs.com>
This bases rt_clock on icount when icount is enabled, in the same way as
vm_clock. As a result, vm_clock = rt_clock.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
---
qemu-timer.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/qemu-timer.c b/qemu-timer.c
index b2d95e2..6c607e5 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -401,7 +401,11 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
switch(clock->type) {
case QEMU_CLOCK_REALTIME:
- return get_clock();
+ if (use_icount) {
+ return cpu_get_icount();
+ } else {
+ return get_clock();
+ }
default:
case QEMU_CLOCK_VIRTUAL:
if (use_icount) {
--
1.8.1.4
* Re: [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount.
2013-07-18 15:02 ` [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount fred.konrad
@ 2013-07-18 15:36 ` Paolo Bonzini
2013-07-18 16:23 ` Frederic Konrad
0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2013-07-18 15:36 UTC (permalink / raw)
To: fred.konrad; +Cc: mark.burton, qemu-devel
Il 18/07/2013 17:02, fred.konrad@greensocs.com ha scritto:
> From: KONRAD Frederic <fred.konrad@greensocs.com>
>
> This bases rt_clock on icount, as vm_clock.
> So vm_clock = rt_clock.
>
> Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
> ---
> qemu-timer.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/qemu-timer.c b/qemu-timer.c
> index b2d95e2..6c607e5 100644
> --- a/qemu-timer.c
> +++ b/qemu-timer.c
> @@ -401,7 +401,11 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
>
> switch(clock->type) {
> case QEMU_CLOCK_REALTIME:
> - return get_clock();
> + if (use_icount) {
> + return cpu_get_icount();
> + } else {
> + return get_clock();
> + }
> default:
> case QEMU_CLOCK_VIRTUAL:
> if (use_icount) {
>
rt_clock is very little used in general. You should use "-rtc clock=vm"
if you want to base the RTC on vm_clock.
Paolo
* Re: [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount.
2013-07-18 15:36 ` Paolo Bonzini
@ 2013-07-18 16:23 ` Frederic Konrad
2013-07-18 16:26 ` Paolo Bonzini
0 siblings, 1 reply; 18+ messages in thread
From: Frederic Konrad @ 2013-07-18 16:23 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: mark.burton, qemu-devel
On 18/07/2013 17:36, Paolo Bonzini wrote:
> Il 18/07/2013 17:02, fred.konrad@greensocs.com ha scritto:
>> From: KONRAD Frederic <fred.konrad@greensocs.com>
>>
>> This bases rt_clock on icount, as vm_clock.
>> So vm_clock = rt_clock.
>>
>> Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
>> ---
>> qemu-timer.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/qemu-timer.c b/qemu-timer.c
>> index b2d95e2..6c607e5 100644
>> --- a/qemu-timer.c
>> +++ b/qemu-timer.c
>> @@ -401,7 +401,11 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
>>
>> switch(clock->type) {
>> case QEMU_CLOCK_REALTIME:
>> - return get_clock();
>> + if (use_icount) {
>> + return cpu_get_icount();
>> + } else {
>> + return get_clock();
>> + }
>> default:
>> case QEMU_CLOCK_VIRTUAL:
>> if (use_icount) {
>>
> rt_clock is very little used in general. You should use "-rtc clock=vm"
> if you want to base the RTC on vm_clock.
>
> Paolo
True, but it seems to be used in some places.
For example, in ui/console.c:
ds->gui_timer = qemu_new_timer_ms(rt_clock, gui_update, ds);
Couldn't that cause trouble?
Fred
* Re: [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount.
2013-07-18 16:23 ` Frederic Konrad
@ 2013-07-18 16:26 ` Paolo Bonzini
0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2013-07-18 16:26 UTC (permalink / raw)
To: Frederic Konrad; +Cc: mark.burton, qemu-devel
Il 18/07/2013 18:23, Frederic Konrad ha scritto:
> On 18/07/2013 17:36, Paolo Bonzini wrote:
>> Il 18/07/2013 17:02, fred.konrad@greensocs.com ha scritto:
>>> From: KONRAD Frederic <fred.konrad@greensocs.com>
>>>
>>> This bases rt_clock on icount, as vm_clock.
>>> So vm_clock = rt_clock.
>>>
>>> Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
>>> ---
>>> qemu-timer.c | 6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qemu-timer.c b/qemu-timer.c
>>> index b2d95e2..6c607e5 100644
>>> --- a/qemu-timer.c
>>> +++ b/qemu-timer.c
>>> @@ -401,7 +401,11 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
>>> switch(clock->type) {
>>> case QEMU_CLOCK_REALTIME:
>>> - return get_clock();
>>> + if (use_icount) {
>>> + return cpu_get_icount();
>>> + } else {
>>> + return get_clock();
>>> + }
>>> default:
>>> case QEMU_CLOCK_VIRTUAL:
>>> if (use_icount) {
>>>
>> rt_clock is very little used in general. You should use "-rtc clock=vm"
>> if you want to base the RTC on vm_clock.
>>
>> Paolo
>
> True but it seems used in some place:
>
> For example: ui/console.c:
> ds->gui_timer = qemu_new_timer_ms(rt_clock, gui_update, ds);
>
> Maybe it can cause trouble no?
In theory it is only used in places where it shouldn't cause trouble
(those that do should use the similarly named rtc_clock variable).
Paolo
* [Qemu-devel] [RFC 2/3] icount: sync vm_clock on the next event.
2013-07-18 15:02 [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount fred.konrad
2013-07-18 15:02 ` [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount fred.konrad
@ 2013-07-18 15:02 ` fred.konrad
2013-07-18 15:02 ` [Qemu-devel] [RFC 3/3] icount: create a new icount based timer fred.konrad
2013-07-18 15:06 ` [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount Peter Maydell
3 siblings, 0 replies; 18+ messages in thread
From: fred.konrad @ 2013-07-18 15:02 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mark.burton, fred.konrad
From: KONRAD Frederic <fred.konrad@greensocs.com>
We don't want vm_clock to be synchronized with rt_clock as it is
not deterministic for replay.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
---
cpus.c | 11 +++++++++--
main-loop.c | 5 +++++
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/cpus.c b/cpus.c
index 46504d0..fb83153 100644
--- a/cpus.c
+++ b/cpus.c
@@ -64,6 +64,8 @@
static CPUArchState *next_cpu;
+void icount_warp_rt(void *opaque);
+
static bool cpu_thread_is_idle(CPUState *cpu)
{
if (cpu->stop || cpu->queued_work_first) {
@@ -277,7 +279,7 @@ static int64_t qemu_icount_round(int64_t count)
return (count + (1 << icount_time_shift) - 1) >> icount_time_shift;
}
-static void icount_warp_rt(void *opaque)
+void icount_warp_rt(void *opaque)
{
if (vm_clock_warp_start == -1) {
return;
@@ -286,8 +288,13 @@ static void icount_warp_rt(void *opaque)
if (runstate_is_running()) {
int64_t clock = qemu_get_clock_ns(rt_clock);
int64_t warp_delta = clock - vm_clock_warp_start;
+ int64_t next_vm_deadline = qemu_clock_deadline(vm_clock);
if (use_icount == 1) {
- qemu_icount_bias += warp_delta;
+ if (next_vm_deadline > 0) {
+ qemu_icount_bias += next_vm_deadline;
+ } else {
+ qemu_notify_event();
+ }
} else {
/*
* In adaptive mode, do not let the vm_clock run too
diff --git a/main-loop.c b/main-loop.c
index a44fff6..16c9b85 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -33,6 +33,8 @@
#include "qemu/compatfd.h"
+void icount_warp_rt(void *opaque);
+
/* If we have signalfd, we mask out the signals we want to handle and then
* use signalfd to listen for them. We rely on whatever the current signal
* handler is to dispatch the signals when we receive them.
@@ -470,6 +472,9 @@ int main_loop_wait(int nonblocking)
qemu_run_all_timers();
+ if (use_icount == 1) {
+ icount_warp_rt(NULL);
+ }
return ret;
}
--
1.8.1.4
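To make the intent of the icount_warp_rt() hunk above easier to follow, here is a minimal standalone sketch of the patched branch. It is only a model under stated assumptions: WarpModel and warp_to_next_deadline are hypothetical names, not QEMU symbols, and the deadline is passed in as a plain argument instead of being read from qemu_clock_deadline(vm_clock).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the global icount state; not QEMU's layout. */
typedef struct {
    int64_t icount_bias;  /* models qemu_icount_bias, in ns */
    int     notified;     /* counts what would be qemu_notify_event() calls */
} WarpModel;

/*
 * Model of the patched branch: advance the bias by the distance to the
 * next vm_clock deadline instead of by the elapsed host (rt_clock) time.
 * next_vm_deadline_ns stands for qemu_clock_deadline(vm_clock).
 */
static void warp_to_next_deadline(WarpModel *m, int64_t next_vm_deadline_ns)
{
    assert(m != NULL);
    if (next_vm_deadline_ns > 0) {
        /* Jump straight to the next timer: no host time is involved. */
        m->icount_bias += next_vm_deadline_ns;
    } else {
        /* A timer is already due: wake the main loop instead of warping. */
        m->notified++;
    }
}
```

The point of the sketch is that the amount added to the bias depends only on guest-visible timer state, which is what makes it a candidate for deterministic replay.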
* [Qemu-devel] [RFC 3/3] icount: create a new icount based timer.
2013-07-18 15:02 [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount fred.konrad
2013-07-18 15:02 ` [Qemu-devel] [RFC 1/3] icount: base rt_clock on icount fred.konrad
2013-07-18 15:02 ` [Qemu-devel] [RFC 2/3] icount: sync vm_clock on the next event fred.konrad
@ 2013-07-18 15:02 ` fred.konrad
2013-07-18 15:08 ` Peter Maydell
2013-07-18 15:06 ` [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount Peter Maydell
3 siblings, 1 reply; 18+ messages in thread
From: fred.konrad @ 2013-07-18 15:02 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mark.burton, fred.konrad
From: KONRAD Frederic <fred.konrad@greensocs.com>
This creates a new icount based timer, with no bias.
It moves only with the instruction counter.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
---
cpus.c | 10 ++++++++--
include/qemu/timer.h | 4 ++++
qemu-timer.c | 6 ++++++
3 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/cpus.c b/cpus.c
index fb83153..86fe82b 100644
--- a/cpus.c
+++ b/cpus.c
@@ -156,6 +156,12 @@ void restore_icount(CPUArchState *env, int save)
/* Return the virtual CPU time, based on the instruction counter. */
int64_t cpu_get_icount(void)
{
+ return qemu_icount_bias + cpu_get_icount_wo_bias();
+}
+
+/* Return the virtual CPU time, really based on the instruction counter. */
+int64_t cpu_get_icount_wo_bias(void)
+{
int64_t icount;
CPUArchState *env = cpu_single_env;
@@ -166,7 +172,7 @@ int64_t cpu_get_icount(void)
}
icount -= (env->icount_decr.u16.low + env->icount_extra);
}
- return qemu_icount_bias + (icount << icount_time_shift);
+ return icount << icount_time_shift;
}
/* return the host CPU cycle counter and handle stop/restart */
@@ -1165,7 +1171,7 @@ static int tcg_cpu_exec(CPUArchState *env)
qemu_icount -= (env->icount_decr.u16.low + env->icount_extra);
env->icount_decr.u16.low = 0;
env->icount_extra = 0;
- count = qemu_icount_round(qemu_clock_deadline(vm_clock));
+ count = qemu_icount_round(qemu_clock_deadline(ic_clock));
qemu_icount += count;
decr = (count > 0xffff) ? 0xffff : count;
count -= decr;
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index b4d8229..6e53f22 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -32,6 +32,9 @@ extern QEMUClock *vm_clock;
the virtual clock. */
extern QEMUClock *host_clock;
+/* A new clock based on icount only. */
+extern QEMUClock *ic_clock;
+
int64_t qemu_get_clock_ns(QEMUClock *clock);
int64_t qemu_clock_has_timers(QEMUClock *clock);
int64_t qemu_clock_expired(QEMUClock *clock);
@@ -136,6 +139,7 @@ void qemu_put_timer(QEMUFile *f, QEMUTimer *ts);
/* icount */
int64_t cpu_get_icount(void);
+int64_t cpu_get_icount_wo_bias(void);
int64_t cpu_get_clock(void);
/*******************************************/
diff --git a/qemu-timer.c b/qemu-timer.c
index 6c607e5..79d5dcb 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -43,6 +43,7 @@
#define QEMU_CLOCK_REALTIME 0
#define QEMU_CLOCK_VIRTUAL 1
#define QEMU_CLOCK_HOST 2
+#define QEMU_CLOCK_ICOUNT 3
struct QEMUClock {
QEMUTimer *active_timers;
@@ -230,6 +231,7 @@ next:
QEMUClock *rt_clock;
QEMUClock *vm_clock;
QEMUClock *host_clock;
+QEMUClock *ic_clock;
static QEMUClock *qemu_new_clock(int type)
{
@@ -413,6 +415,8 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
} else {
return cpu_get_clock();
}
+ case QEMU_CLOCK_ICOUNT:
+ return cpu_get_icount_wo_bias();
case QEMU_CLOCK_HOST:
now = get_clock_realtime();
last = clock->last;
@@ -440,6 +444,7 @@ void init_clocks(void)
rt_clock = qemu_new_clock(QEMU_CLOCK_REALTIME);
vm_clock = qemu_new_clock(QEMU_CLOCK_VIRTUAL);
host_clock = qemu_new_clock(QEMU_CLOCK_HOST);
+ ic_clock = qemu_new_clock(QEMU_CLOCK_ICOUNT);
}
}
@@ -456,6 +461,7 @@ void qemu_run_all_timers(void)
qemu_run_timers(vm_clock);
qemu_run_timers(rt_clock);
qemu_run_timers(host_clock);
+ qemu_run_timers(ic_clock);
/* rearm timer, if not periodic */
if (alarm_timer->expired) {
--
1.8.1.4
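For readers unfamiliar with the tcg_cpu_exec() hunk above, the following standalone sketch models how a clock deadline in nanoseconds is turned into an instruction budget. The rounding mirrors qemu_icount_round() as quoted in patch 2/3, and the 0xffff split mirrors the lines shown in the diff. Budget, icount_round and split_budget are hypothetical names, and storing the remainder in an "extra" field is an assumption about what happens after the lines visible in the hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two halves of the budget. */
typedef struct {
    uint16_t decr_low;  /* models env->icount_decr.u16.low */
    int64_t  extra;     /* models env->icount_extra (assumed destination) */
} Budget;

/* Round a duration in ns up to a whole number of instructions,
 * where one instruction is taken to last 2^shift ns. */
static int64_t icount_round(int64_t ns, int shift)
{
    assert(ns >= 0 && shift >= 0 && shift < 31);
    return (ns + ((int64_t)1 << shift) - 1) >> shift;
}

/* Split the instruction count into a 16-bit decrementer part
 * and the remainder that does not fit in 16 bits. */
static Budget split_budget(int64_t deadline_ns, int shift)
{
    int64_t count = icount_round(deadline_ns, shift);
    int64_t decr = (count > 0xffff) ? 0xffff : count;
    Budget b = { (uint16_t)decr, count - decr };
    return b;
}
```

With patch 3/3 applied, the deadline fed into this computation comes from ic_clock rather than vm_clock.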
* Re: [Qemu-devel] [RFC 3/3] icount: create a new icount based timer.
2013-07-18 15:02 ` [Qemu-devel] [RFC 3/3] icount: create a new icount based timer fred.konrad
@ 2013-07-18 15:08 ` Peter Maydell
0 siblings, 0 replies; 18+ messages in thread
From: Peter Maydell @ 2013-07-18 15:08 UTC (permalink / raw)
To: fred.konrad; +Cc: pbonzini, mark.burton, qemu-devel
On 18 July 2013 16:02, <fred.konrad@greensocs.com> wrote:
> @@ -156,6 +156,12 @@ void restore_icount(CPUArchState *env, int save)
> /* Return the virtual CPU time, based on the instruction counter. */
> int64_t cpu_get_icount(void)
> {
> + return qemu_icount_bias + cpu_get_icount_wo_bias();
> +}
> +
> +/* Return the virtual CPU time, really based on the instruction counter. */
> +int64_t cpu_get_icount_wo_bias(void)
> +{
The comments for these two functions don't make any sense.
You need to explain what they're actually doing (and
when you'd want one and when the other).
-- PMM
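As an illustration of the distinction the two comments would need to spell out, here is a small standalone model. It is a sketch only: IcountModel and both function names are hypothetical, and the comments are one possible wording, not text from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the state the two accessors read. */
typedef struct {
    int64_t executed_insns;  /* instructions executed so far */
    int     time_shift;      /* models icount_time_shift: 1 insn = 2^shift ns */
    int64_t bias_ns;         /* models qemu_icount_bias */
} IcountModel;

/*
 * Time in ns derived from executed instructions only. It never moves
 * while the CPU is idle, so it depends on nothing but the guest's
 * instruction stream.
 */
static int64_t icount_ns_without_bias(const IcountModel *m)
{
    assert(m->executed_insns >= 0);
    assert(m->time_shift >= 0 && m->time_shift < 31);
    return m->executed_insns << m->time_shift;
}

/*
 * Virtual time in ns as seen by vm_clock: instruction-derived time plus
 * the bias accumulated by warping while the CPU was idle.
 */
static int64_t icount_ns(const IcountModel *m)
{
    return m->bias_ns + icount_ns_without_bias(m);
}
```

In this model, the unbiased value is the one you would want for a counter that must be reproducible across runs, and the biased value is the one that timers on vm_clock observe.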
* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
2013-07-18 15:02 [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount fred.konrad
` (2 preceding siblings ...)
2013-07-18 15:02 ` [Qemu-devel] [RFC 3/3] icount: create a new icount based timer fred.konrad
@ 2013-07-18 15:06 ` Peter Maydell
2013-07-18 15:09 ` Frederic Konrad
2013-07-18 15:35 ` Paolo Bonzini
3 siblings, 2 replies; 18+ messages in thread
From: Peter Maydell @ 2013-07-18 15:06 UTC (permalink / raw)
To: fred.konrad; +Cc: pbonzini, mark.burton, qemu-devel
On 18 July 2013 16:02, <fred.konrad@greensocs.com> wrote:
> As I said in the last email, we have issues with determinism with icount.
> We are wondering if determinism is really ensured with icount?
My opinion is that it *should* be deterministic but it would
be unsurprising if the determinism had got broken along the way.
> Both icount and reverse execution need an instruction counter. icount use a
> count-down mechanism but reverse execution need a continuous counter. For now
> we have build a separate counter and we think that these two counters can be
> merged. However we would like feedback about this before modifying this.
I definitely think that there should only be one counter, not two.
thanks
-- PMM
* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
2013-07-18 15:06 ` [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount Peter Maydell
@ 2013-07-18 15:09 ` Frederic Konrad
2013-07-18 15:12 ` Peter Maydell
2013-07-18 15:35 ` Paolo Bonzini
1 sibling, 1 reply; 18+ messages in thread
From: Frederic Konrad @ 2013-07-18 15:09 UTC (permalink / raw)
To: Peter Maydell; +Cc: pbonzini, mark.burton, qemu-devel
On 18/07/2013 17:06, Peter Maydell wrote:
> On 18 July 2013 16:02, <fred.konrad@greensocs.com> wrote:
>> As I said in the last email, we have issues with determinism with icount.
>> We are wondering if determinism is really ensured with icount?
> My opinion is that it *should* be deterministic but it would
> be unsurprising if the determinism had got broken along the way.
Yes, the documentation says that this option can give deterministic results,
which is why we asked.
>
>> Both icount and reverse execution need an instruction counter. icount use a
>> count-down mechanism but reverse execution need a continuous counter. For now
>> we have build a separate counter and we think that these two counters can be
>> merged. However we would like feedback about this before modifying this.
> I definitely think that there should only be one counter, not two.
>
> thanks
> -- PMM
* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
2013-07-18 15:09 ` Frederic Konrad
@ 2013-07-18 15:12 ` Peter Maydell
0 siblings, 0 replies; 18+ messages in thread
From: Peter Maydell @ 2013-07-18 15:12 UTC (permalink / raw)
To: Frederic Konrad; +Cc: pbonzini, mark.burton, qemu-devel
On 18 July 2013 16:09, Frederic Konrad <fred.konrad@greensocs.com> wrote:
> On 18/07/2013 17:06, Peter Maydell wrote:
>>
>> On 18 July 2013 16:02, <fred.konrad@greensocs.com> wrote:
>>>
>>> As I said in the last email, we have issues with determinism with icount.
>>> We are wondering if determinism is really ensured with icount?
>>
>> My opinion is that it *should* be deterministic but it would
>> be unsurprising if the determinism had got broken along the way.
>
>
> Yes, the documentation say that this command can give deterministic results
> that's why we asked.
As part of working through this it would be great if you could
write some developer documentation (in docs/ or possibly as a
comment somewhere sensible) that summarises how icount works,
what you need to do in a target-* to support it, etc.
[I suspect that by the time you're done you're going to be the
expert on icount...]
thanks
-- PMM
* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
2013-07-18 15:06 ` [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount Peter Maydell
2013-07-18 15:09 ` Frederic Konrad
@ 2013-07-18 15:35 ` Paolo Bonzini
2013-07-18 16:31 ` Frederic Konrad
2013-07-29 15:27 ` Frederic Konrad
1 sibling, 2 replies; 18+ messages in thread
From: Paolo Bonzini @ 2013-07-18 15:35 UTC (permalink / raw)
To: Peter Maydell; +Cc: mark.burton, qemu-devel, fred.konrad
Il 18/07/2013 17:06, Peter Maydell ha scritto:
> On 18 July 2013 16:02, <fred.konrad@greensocs.com> wrote:
>> As I said in the last email, we have issues with determinism with icount.
>> We are wondering if determinism is really ensured with icount?
>
> My opinion is that it *should* be deterministic but it would
> be unsurprising if the determinism had got broken along the way.
First of all, it can only be deterministic if the guest satisfies (at
least) all of the following conditions:
1) only uses timers that QEMU bases on vm_clock (which means that you
should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
previous answer);
2) never does any network operation nor any asynchronous disk I/O operation;
3) never halts the VCPU waiting for an interrupt.
Point 1 is obvious.
To explain point 2, let's consider what happens if a block device uses
synchronous vs. asynchronous I/O.
With synchronous I/O, each block device operation will complete
immediately. All clocks are stalled during the operation.
With asynchronous I/O, each block device operation will be done while
the CPU is running. If the CPU is polling a completion flag, the number
of instructions executed (thus icount) depends on how long it takes to
do I/O.
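The polling case can be put in numbers with a toy model. This is only a sketch under simplifying assumptions: the guest spins in a loop of poll_insns instructions, the host needs host_ns_per_insn of real time per emulated instruction, and the I/O completes after host_latency_ns of real time. The function name and all three parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of guest instructions executed while spinning on a completion
 * flag. The result depends on host_latency_ns, i.e. on how long the
 * host took to finish the I/O, which differs from run to run.
 */
static int64_t insns_spent_polling(int64_t host_latency_ns,
                                   int64_t host_ns_per_insn,
                                   int64_t poll_insns)
{
    assert(host_latency_ns >= 0 && host_ns_per_insn > 0 && poll_insns > 0);
    /* Instructions the host can emulate before the I/O completes. */
    int64_t insns = host_latency_ns / host_ns_per_insn;
    /* The guest only notices the flag at the end of a loop iteration. */
    int64_t iterations = (insns + poll_insns - 1) / poll_insns;
    return iterations * poll_insns;
}
```

Two runs that differ only in disk latency therefore end the polling loop at different icount values, and every later vm_clock reading is shifted accordingly.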
To explain point 3 (which is the only one that _might_ be fixable),
let's see what happens if the VCPU halts waiting for an interrupt. If
that is the case, and you haven't done any asynchronous I/O, there
should be active vm_clock timers, and you have another possible source
of non-deterministic behavior.
The current QEMU behavior is (and has always been) to start tracking
rt_clock. This is obviously not deterministic. Note that with the
switch to separate threads for iothread/VCPU, the algorithm to do this
has become much better. Let's look at a couple of other possibilities:
2) jump to the next vm_clock deadline. This sounds appealing, but it is
still nondeterministic in the general case when the guest *is* doing
asynchronous I/O too. How many vm_clock timers do you run before I/O
finishes? Furthermore, the vm_clock might move too fast. Think of an
RTC clock whose alarm registers are 0/0/0 so it fires at midnight; if it
is the only active vm_clock timer, you end up in 2107 even before the
kernel boots!
3) do not process vm_clock timers at all unless there is no pending I/O
(block/network); if there is none, track rt_clock as in current
behavior. I just made it up, but it sounds promising and similar to
synchronous I/O. It should not be extremely hard to implement, and it
can remove this kind of nondeterminism. But it won't fix the case when
the CPU is polling.
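Possibility 3 can be written down as a tiny decision function. This is a sketch of the idea as described above, not an implementation proposal: IdleAction and idle_policy are hypothetical names, and the pending-I/O counts are assumed to be available to the main loop.

```c
#include <assert.h>

/* What to do when the VCPU is halted waiting for an interrupt. */
typedef enum {
    IDLE_WAIT_FOR_IO,     /* leave vm_clock timers alone until I/O is done */
    IDLE_TRACK_RT_CLOCK   /* no I/O in flight: current behavior */
} IdleAction;

static IdleAction idle_policy(int pending_block_io, int pending_net_io)
{
    assert(pending_block_io >= 0 && pending_net_io >= 0);
    if (pending_block_io > 0 || pending_net_io > 0) {
        /* Processing timers now would make their order relative to the
         * I/O completion depend on host timing. */
        return IDLE_WAIT_FOR_IO;
    }
    return IDLE_TRACK_RT_CLOCK;
}
```

As noted above, a policy like this says nothing about a guest that polls instead of halting.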
Paolo
ps: I'm not an expert on icount at all, I'm only reasoning about the
possible interactions with the main loop.
>> Both icount and reverse execution need an instruction counter. icount use a
>> count-down mechanism but reverse execution need a continuous counter. For now
>> we have build a separate counter and we think that these two counters can be
>> merged. However we would like feedback about this before modifying this.
>
> I definitely think that there should only be one counter, not two.
>
> thanks
> -- PMM
>
* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
2013-07-18 15:35 ` Paolo Bonzini
@ 2013-07-18 16:31 ` Frederic Konrad
2013-07-18 16:35 ` Paolo Bonzini
2013-07-29 15:27 ` Frederic Konrad
1 sibling, 1 reply; 18+ messages in thread
From: Frederic Konrad @ 2013-07-18 16:31 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Peter Maydell, mark.burton, qemu-devel
On 18/07/2013 17:35, Paolo Bonzini wrote:
> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>> On 18 July 2013 16:02, <fred.konrad@greensocs.com> wrote:
>>> As I said in the last email, we have issues with determinism with icount.
>>> We are wondering if determinism is really ensured with icount?
>> My opinion is that it *should* be deterministic but it would
>> be unsurprising if the determinism had got broken along the way.
> First of all, it can only be deterministic if the guest satisfies (at
> least) all the following condition:
>
> 1) only uses timer that QEMU bases on vm_clock (which means that you
> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
> previous answer);
Oops, sorry, I didn't mention that, but we used -rtc clock=vm for our tests.
> 2) never does any network operation nor any asynchronous disk I/O operation
>
> 3) never halts the VCPU waiting for an interrupt
>
>
> Point 1 is obvious.
>
>
> To explain points 2, let's consider what happens if a block device uses
> synchronous vs. asynchronous I/O.
>
> With synchronous I/O, each block device operation will complete
> immediately. All clocks are stalled during the operation.
>
> With asynchronous I/O, each block device operation will be done while
> the CPU is running. If the CPU is polling a completion flag, the number
> of instructions executed (thus icount) depends on how long it takes to
> do I/O.
So I suppose this can happen whenever there is a network card or block
device.
We probably need to disable it until we can finally save and replay I/O, to
get this thing working.
>
>
> To explain point 3 (which is the only one that _might_ be fixable),
> let's see what happens if the VCPU halts waiting for an interrupt. If
> that is the case, and you haven't done any asynchronous I/O, there
> should be active vm_clock timers, and you have another possible source
> of non-deterministic behavior.
>
> The current QEMU behavior is (and has always been) to start tracking
> rt_clock. This is obviously not deterministic. Note that with the
> switch to separate threads for iothread/VCPU, the algorithm to do this
> has become much better. Let's look at a couple possibilities:
>
> 2) jump to the next vm_clock deadline. This sounds appealing, but it is
> still nondeterministic in the general case when the guest *is* doing
> asynchronous I/O too. How many vm_clock timers do you run before I/O
> finishes? Furthermore, the vm_clock might move too fast. Think of an
> RTC clock whose alarm registers are 0/0/0 so it fires at midnight; if it
> is the only active vm_clock timer, you end up in 2107 even before the
> kernel boots!
Yes, I didn't think about that :).
>
> 3) do not process vm_clock timers at all unless there is no pending I/O
> (block/network); if there is none, track rt_clock as in current
> behavior. I just made it up, but it sounds promising and similar to
> synchronous I/O. It should not be extremely hard to implement, and it
> can remove this kind of nondeterminism. But it won't fix the case when
> the CPU is polling.
Thanks, I need to take a look at all this.
>
> Paolo
>
> ps: I'm not an expert on icount at all, I'm only reasoning of the
> possible interactions with the main loop.
>
>>> Both icount and reverse execution need an instruction counter. icount use a
>>> count-down mechanism but reverse execution need a continuous counter. For now
>>> we have build a separate counter and we think that these two counters can be
>>> merged. However we would like feedback about this before modifying this.
>> I definitely think that there should only be one counter, not two.
>>
>> thanks
>> -- PMM
>>
* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
2013-07-18 16:31 ` Frederic Konrad
@ 2013-07-18 16:35 ` Paolo Bonzini
2013-07-19 15:26 ` Frederic Konrad
0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2013-07-18 16:35 UTC (permalink / raw)
To: Frederic Konrad; +Cc: Peter Maydell, mark.burton, qemu-devel, Orit Wasserman
Il 18/07/2013 18:31, Frederic Konrad ha scritto:
> On 18/07/2013 17:35, Paolo Bonzini wrote:
>> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>>> On 18 July 2013 16:02, <fred.konrad@greensocs.com> wrote:
>>>> As I said in the last email, we have issues with determinism with
>>>> icount.
>>>> We are wondering if determinism is really ensured with icount?
>>> My opinion is that it *should* be deterministic but it would
>>> be unsurprising if the determinism had got broken along the way.
>> First of all, it can only be deterministic if the guest satisfies (at
>> least) all the following condition:
>>
>> 1) only uses timer that QEMU bases on vm_clock (which means that you
>> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
>> previous answer);
>
> Oops sorry, I didn't mentioned that, but we used rtc clock=vm for our
> tests.
>> 2) never does any network operation nor any asynchronous disk I/O
>> operation
>>
>> 3) never halts the VCPU waiting for an interrupt
>>
>>
>> Point 1 is obvious.
>>
>>
>> To explain points 2, let's consider what happens if a block device uses
>> synchronous vs. asynchronous I/O.
>>
>> With synchronous I/O, each block device operation will complete
>> immediately. All clocks are stalled during the operation.
>>
>> With asynchronous I/O, each block device operation will be done while
>> the CPU is running. If the CPU is polling a completion flag, the number
>> of instructions executed (thus icount) depends on how long it takes to
>> do I/O.
>
> So I suppose this can happen even if there are any network card or block
> device.
>
> We probably need to disable it until we finally save and replay IO, to
> get this thing working.
Are you aware of the work that was done on fault tolerance (Kemari)?
Orit is working on resurrecting it.
Paolo
* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
2013-07-18 16:35 ` Paolo Bonzini
@ 2013-07-19 15:26 ` Frederic Konrad
0 siblings, 0 replies; 18+ messages in thread
From: Frederic Konrad @ 2013-07-19 15:26 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Peter Maydell, mark.burton, qemu-devel, Orit Wasserman
On 18/07/2013 18:35, Paolo Bonzini wrote:
> Il 18/07/2013 18:31, Frederic Konrad ha scritto:
>> On 18/07/2013 17:35, Paolo Bonzini wrote:
>>> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>>>> On 18 July 2013 16:02, <fred.konrad@greensocs.com> wrote:
>>>>> As I said in the last email, we have issues with determinism with
>>>>> icount.
>>>>> We are wondering if determinism is really ensured with icount?
>>>> My opinion is that it *should* be deterministic but it would
>>>> be unsurprising if the determinism had got broken along the way.
>>> First of all, it can only be deterministic if the guest satisfies (at
>>> least) all the following condition:
>>>
>>> 1) only uses timer that QEMU bases on vm_clock (which means that you
>>> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
>>> previous answer);
>> Oops sorry, I didn't mentioned that, but we used rtc clock=vm for our
>> tests.
>>> 2) never does any network operation nor any asynchronous disk I/O
>>> operation
>>>
>>> 3) never halts the VCPU waiting for an interrupt
>>>
>>>
>>> Point 1 is obvious.
>>>
>>>
>>> To explain points 2, let's consider what happens if a block device uses
>>> synchronous vs. asynchronous I/O.
>>>
>>> With synchronous I/O, each block device operation will complete
>>> immediately. All clocks are stalled during the operation.
>>>
>>> With asynchronous I/O, each block device operation will be done while
>>> the CPU is running. If the CPU is polling a completion flag, the number
>>> of instructions executed (thus icount) depends on how long it takes to
>>> do I/O.
>> So I suppose this can happen even if there are any network card or block
>> device.
>>
>> We probably need to disable it until we finally save and replay IO, to
>> get this thing working.
> Are you aware of the work that was done on fault tolerance (Kemari)?
> Orit is working on resurrecting it.
>
> Paolo
No, but I will take a look; that could be really useful for I/O.
Thanks,
Fred
* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
2013-07-18 15:35 ` Paolo Bonzini
2013-07-18 16:31 ` Frederic Konrad
@ 2013-07-29 15:27 ` Frederic Konrad
2013-07-29 16:42 ` Paolo Bonzini
1 sibling, 1 reply; 18+ messages in thread
From: Frederic Konrad @ 2013-07-29 15:27 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Peter Maydell, mark.burton, qemu-devel
On 18/07/2013 17:35, Paolo Bonzini wrote:
> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>> On 18 July 2013 16:02,<fred.konrad@greensocs.com> wrote:
>>> As I said in the last email, we have issues with determinism with icount.
>>> We are wondering if determinism is really ensured with icount?
>> My opinion is that it *should* be deterministic but it would
>> be unsurprising if the determinism had got broken along the way.
> First of all, it can only be deterministic if the guest satisfies (at
> least) all the following condition:
>
> 1) only uses timer that QEMU bases on vm_clock (which means that you
> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
> previous answer);
>
> 2) never does any network operation nor any asynchronous disk I/O operation
>
> 3) never halts the VCPU waiting for an interrupt
Hi,
qemu_alarm is making the replay non-deterministic too.
We tried removing those alarms and it seems to replay well (or at least
far better).
So the question is: how can we solve that?
We thought of two possibilities:
* record/replay them, like I/O.
* base them on our new ic_clock.
Both have drawbacks:
* record/replay won't make icount more deterministic (run to run).
* the speed at which ic_clock advances is apparently not constant.
Thanks,
Fred
* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
2013-07-29 15:27 ` Frederic Konrad
@ 2013-07-29 16:42 ` Paolo Bonzini
2013-07-30 7:06 ` Frederic Konrad
0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2013-07-29 16:42 UTC (permalink / raw)
To: Frederic Konrad; +Cc: Peter Maydell, mark.burton, qemu-devel
Il 29/07/2013 17:27, Frederic Konrad ha scritto:
> On 18/07/2013 17:35, Paolo Bonzini wrote:
>> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>>> On 18 July 2013 16:02,<fred.konrad@greensocs.com> wrote:
>>>> As I said in the last email, we have issues with determinism with
>>>> icount.
>>>> We are wondering if determinism is really ensured with icount?
>>> My opinion is that it *should* be deterministic but it would
>>> be unsurprising if the determinism had got broken along the way.
>> First of all, it can only be deterministic if the guest satisfies (at
>> least) all the following condition:
>>
>> 1) only uses timer that QEMU bases on vm_clock (which means that you
>> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
>> previous answer);
>>
>> 2) never does any network operation nor any asynchronous disk I/O
>> operation
>>
>> 3) never halts the VCPU waiting for an interrupt
>
> Hi,
>
> qemu_alarm is making the replay not deterministic too.
What is qemu_alarm? If you mean qemu_alarm_timer, then that means
rt_clock and host_clock (item 1 above)?
If so, yes, I believe you need to record/replay them. When doing replay
for reverse execution, you certainly want to execute at full speed
without waiting for real time to pass again.
Paolo
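A minimal sketch of what record/replay of such clock reads could look like, assuming a simple in-memory log. ClockLog, ReplayMode and logged_clock_read are hypothetical names and the fixed capacity is only there to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_CAPACITY 16

typedef enum { MODE_RECORD, MODE_REPLAY } ReplayMode;

/* Hypothetical in-memory log of host clock readings. */
typedef struct {
    ReplayMode mode;
    int64_t    values[LOG_CAPACITY];
    size_t     len;  /* number of entries recorded */
    size_t     pos;  /* next entry to hand out on replay */
} ClockLog;

/*
 * Read a host-based clock through the log. When recording, the real
 * value (host_now_ns) is stored and returned. When replaying, the real
 * value is ignored and the recorded one is returned, so replay does not
 * have to wait for real time to pass.
 */
static int64_t logged_clock_read(ClockLog *log, int64_t host_now_ns)
{
    assert(log != NULL);
    if (log->mode == MODE_RECORD) {
        assert(log->len < LOG_CAPACITY);
        log->values[log->len++] = host_now_ns;
        return host_now_ns;
    }
    assert(log->pos < log->len);
    return log->values[log->pos++];
}
```

A real implementation would also have to record at which instruction count each read happened, so that replay can deliver timer events at the same point in the guest's execution.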
> We tried to remove those alarms and it seems to replay well (at least
> far better).
>
> So the question is: how we can solve that?
>
> We thought at two possibilities :
> * record/replay them, like IO.
> * base them on our new ic_clock.
>
> Both have drawbacks:
> * record/replay won't make icount more deterministic (run to run).
> * ic_clock speed time is apparently not constant.
>
> Thanks,
> Fred
>
* Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
2013-07-29 16:42 ` Paolo Bonzini
@ 2013-07-30 7:06 ` Frederic Konrad
0 siblings, 0 replies; 18+ messages in thread
From: Frederic Konrad @ 2013-07-30 7:06 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Peter Maydell, mark.burton, qemu-devel
On 29/07/2013 18:42, Paolo Bonzini wrote:
> Il 29/07/2013 17:27, Frederic Konrad ha scritto:
>> On 18/07/2013 17:35, Paolo Bonzini wrote:
>>> Il 18/07/2013 17:06, Peter Maydell ha scritto:
>>>> On 18 July 2013 16:02,<fred.konrad@greensocs.com> wrote:
>>>>> As I said in the last email, we have issues with determinism with
>>>>> icount.
>>>>> We are wondering if determinism is really ensured with icount?
>>>> My opinion is that it *should* be deterministic but it would
>>>> be unsurprising if the determinism had got broken along the way.
>>> First of all, it can only be deterministic if the guest satisfies (at
>>> least) all the following condition:
>>>
>>> 1) only uses timer that QEMU bases on vm_clock (which means that you
>>> should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
>>> previous answer);
>>>
>>> 2) never does any network operation nor any asynchronous disk I/O
>>> operation
>>>
>>> 3) never halts the VCPU waiting for an interrupt
>> Hi,
>>
>> qemu_alarm is making the replay not deterministic too.
> What is qemu_alarm? If you mean qemu_alarm_timer, then that means
> rt_clock and host_clock (item 1 above)?
>
> If so, yes, I believe you need to record/replay them. When doing replay
> for reverse execution, you certainly want to execute at full speed
> without waiting for real time to pass again.
>
> Paolo
Yes, that was what we believed too. :)
Thanks,
Fred
>> We tried to remove those alarms and it seems to replay well (at least
>> far better).
>>
>> So the question is: how we can solve that?
>>
>> We thought at two possibilities :
>> * record/replay them, like IO.
>> * base them on our new ic_clock.
>>
>> Both have drawbacks:
>> * record/replay won't make icount more deterministic (run to run).
>> * ic_clock speed time is apparently not constant.
>>
>> Thanks,
>> Fred
>>