From: Marcelo Tosatti <mtosatti@redhat.com>
To: Glauber Costa <glommer@redhat.com>
Cc: aliguori@us.ibm.com, kvm@vger.kernel.org, qemu-devel@nongnu.org,
jan.kiszka@siemens.com, Ulrich Obergfell <uobergfe@redhat.com>,
gcosta@redhat.com, avi@redhat.com
Subject: Re: [PATCH v3 5/5] hpet 'driftfix': add code in hpet_timer() to compensate delayed callbacks and coalesced interrupts
Date: Tue, 3 May 2011 19:55:35 -0300 [thread overview]
Message-ID: <20110503225535.GA16599@amt.cnet> (raw)
In-Reply-To: <1304460505.4737.4.camel@mothafucka.localdomain>
On Tue, May 03, 2011 at 07:08:25PM -0300, Glauber Costa wrote:
> On Tue, 2011-05-03 at 16:03 -0300, Marcelo Tosatti wrote:
> > On Thu, Apr 28, 2011 at 04:25:00PM +0200, Ulrich Obergfell wrote:
> > > Loss of periodic timer interrupts caused by delayed callbacks and by
> > > interrupt coalescing is compensated by gradually injecting additional
> > > interrupts during subsequent timer intervals, starting at a rate of
> > > one additional interrupt per interval. The injection of additional
> > > interrupts is based on a backlog of unaccounted HPET clock periods
> > > (new HPETTimer field 'ticks_not_accounted'). The backlog increases
> > > due to delayed callbacks and coalesced interrupts, and it decreases
> > > if an interrupt was injected successfully. If the backlog increases
> > > while compensation is still in progress, the rate at which additional
> > > interrupts are injected is increased too. A limit is imposed on the
> > > backlog and on the rate.
> > >
> > > Injecting additional timer interrupts to compensate lost interrupts
> > > can alleviate long term time drift. However, on a short time scale,
> > > this method can have the side effect of making virtual machine time
> > > intermittently pass slower and faster than real time (depending on
> > > the guest's time keeping algorithm). Compensation is disabled by
> > > default and can be enabled for guests where this behaviour may be
> > > acceptable.
> > >
> > > Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
> > > ---
> > > hw/hpet.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> > > 1 files changed, 61 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/hw/hpet.c b/hw/hpet.c
> > > index 35466ae..92d5f58 100644
> > > --- a/hw/hpet.c
> > > +++ b/hw/hpet.c
> > > @@ -32,6 +32,7 @@
> > > #include "sysbus.h"
> > > #include "mc146818rtc.h"
> > > #include "sysemu.h"
> > > +#include <assert.h>
> > >
> > > //#define HPET_DEBUG
> > > #ifdef HPET_DEBUG
> > > @@ -42,6 +43,9 @@
> > >
> > > #define HPET_MSI_SUPPORT 0
> > >
> > > +#define MAX_TICKS_NOT_ACCOUNTED (uint64_t)500000000 /* 5 sec */
> > > +#define MAX_IRQ_RATE (uint32_t)10
> > > +
> > > struct HPETState;
> > > typedef struct HPETTimer { /* timers */
> > > uint8_t tn; /*timer number*/
> > > @@ -326,28 +330,63 @@ static const VMStateDescription vmstate_hpet = {
> > > }
> > > };
> > >
> > > +static bool hpet_timer_has_tick_backlog(HPETTimer *t)
> > > +{
> > > + uint64_t backlog = t->ticks_not_accounted - (t->period + t->prev_period);
> > > + return (backlog >= t->period);
> > > +}
> > > +
> > > /*
> > > * timer expiration callback
> > > */
> > > static void hpet_timer(void *opaque)
> > > {
> > > HPETTimer *t = opaque;
> > > + HPETState *s = t->state;
> > > uint64_t diff;
> > > -
> > > + int irq_delivered = 0;
> > > + uint32_t irq_count = 0;
> > > uint64_t period = t->period;
> > > uint64_t cur_tick = hpet_get_ticks(t->state);
> > >
> > > + if (s->driftfix && !t->ticks_not_accounted) {
> > > + t->ticks_not_accounted = t->prev_period = t->period;
> > > + }
> > > if (timer_is_periodic(t) && period != 0) {
> > > if (t->config & HPET_TN_32BIT) {
> > > while (hpet_time_after(cur_tick, t->cmp)) {
> > > t->cmp = (uint32_t)(t->cmp + t->period);
> > > + t->ticks_not_accounted += t->period;
> > > + irq_count++;
> > > }
> > > } else {
> > > while (hpet_time_after64(cur_tick, t->cmp)) {
> > > t->cmp += period;
> > > + t->ticks_not_accounted += period;
> > > + irq_count++;
> > > }
> > > }
> > > diff = hpet_calculate_diff(t, cur_tick);
> > > + if (s->driftfix) {
> > > + if (t->ticks_not_accounted > MAX_TICKS_NOT_ACCOUNTED) {
> > > + t->ticks_not_accounted = t->period + t->prev_period;
> > > + }
> > > + if (hpet_timer_has_tick_backlog(t)) {
> > > + if (t->irq_rate == 1 || irq_count > 1) {
> > > + t->irq_rate++;
> > > + t->irq_rate = MIN(t->irq_rate, MAX_IRQ_RATE);
> > > + }
> > > + if (t->divisor == 0) {
> > > + assert(irq_count);
> > > + }
> > > + if (irq_count) {
> > > + t->divisor = t->irq_rate;
> > > + }
> > > + diff /= t->divisor--;
> > > + } else {
> > > + t->irq_rate = 1;
> > > + }
> > > + }
> > > qemu_mod_timer(t->qemu_timer,
> > > qemu_get_clock_ns(vm_clock) + (int64_t)ticks_to_ns(diff));
> > > } else if (t->config & HPET_TN_32BIT && !timer_is_periodic(t)) {
> > > @@ -358,7 +397,22 @@ static void hpet_timer(void *opaque)
> > > t->wrap_flag = 0;
> > > }
> > > }
> > > - update_irq(t, 1);
> > > + if (s->driftfix && timer_is_periodic(t) && period != 0) {
> > > + if (t->ticks_not_accounted >= t->period + t->prev_period) {
^^^^^^^^^^
> > > + irq_delivered = update_irq(t, 1);
> > > + if (irq_delivered) {
> > > + t->ticks_not_accounted -= t->prev_period;
> > > + t->prev_period = t->period;
> > > + } else {
> > > + if (irq_count) {
> > > + t->irq_rate++;
> > > + t->irq_rate = MIN(t->irq_rate, MAX_IRQ_RATE);
> > > + }
> > > + }
> > > + }
> > > + } else {
> > > + update_irq(t, 1);
> > > + }
> > > }
> >
> > Hi Ulrich,
> >
> > Whats prev_period for, since in practice the period will not change
> > between interrupts (OS programs comparator once, or perhaps twice during
> > bootup) ?
>
> Actually, some systems, like Windows 2000-and-something change this
> period quite heavily under multimedia workloads.
I see. Still, using the period at the previous timer handler instance in
the decision whether to inject an interrupt to the guest makes no sense
to me.
> > Other than that, shouldnt reset accounting variables to init state on
> > write to GLOBAL_ENABLE_CFG / writes to main counter?
What i meant to ask here is whether the reinjection state (including
ticks_not_accounted) should be reset whenever a different period is
programmed.
WARNING: multiple messages have this Message-ID (diff)
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Glauber Costa <glommer@redhat.com>
Cc: aliguori@us.ibm.com, kvm@vger.kernel.org, qemu-devel@nongnu.org,
jan.kiszka@siemens.com, Ulrich Obergfell <uobergfe@redhat.com>,
gcosta@redhat.com, avi@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 5/5] hpet 'driftfix': add code in hpet_timer() to compensate delayed callbacks and coalesced interrupts
Date: Tue, 3 May 2011 19:55:35 -0300 [thread overview]
Message-ID: <20110503225535.GA16599@amt.cnet> (raw)
In-Reply-To: <1304460505.4737.4.camel@mothafucka.localdomain>
On Tue, May 03, 2011 at 07:08:25PM -0300, Glauber Costa wrote:
> On Tue, 2011-05-03 at 16:03 -0300, Marcelo Tosatti wrote:
> > On Thu, Apr 28, 2011 at 04:25:00PM +0200, Ulrich Obergfell wrote:
> > > Loss of periodic timer interrupts caused by delayed callbacks and by
> > > interrupt coalescing is compensated by gradually injecting additional
> > > interrupts during subsequent timer intervals, starting at a rate of
> > > one additional interrupt per interval. The injection of additional
> > > interrupts is based on a backlog of unaccounted HPET clock periods
> > > (new HPETTimer field 'ticks_not_accounted'). The backlog increases
> > > due to delayed callbacks and coalesced interrupts, and it decreases
> > > if an interrupt was injected successfully. If the backlog increases
> > > while compensation is still in progress, the rate at which additional
> > > interrupts are injected is increased too. A limit is imposed on the
> > > backlog and on the rate.
> > >
> > > Injecting additional timer interrupts to compensate lost interrupts
> > > can alleviate long term time drift. However, on a short time scale,
> > > this method can have the side effect of making virtual machine time
> > > intermittently pass slower and faster than real time (depending on
> > > the guest's time keeping algorithm). Compensation is disabled by
> > > default and can be enabled for guests where this behaviour may be
> > > acceptable.
> > >
> > > Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
> > > ---
> > > hw/hpet.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> > > 1 files changed, 61 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/hw/hpet.c b/hw/hpet.c
> > > index 35466ae..92d5f58 100644
> > > --- a/hw/hpet.c
> > > +++ b/hw/hpet.c
> > > @@ -32,6 +32,7 @@
> > > #include "sysbus.h"
> > > #include "mc146818rtc.h"
> > > #include "sysemu.h"
> > > +#include <assert.h>
> > >
> > > //#define HPET_DEBUG
> > > #ifdef HPET_DEBUG
> > > @@ -42,6 +43,9 @@
> > >
> > > #define HPET_MSI_SUPPORT 0
> > >
> > > +#define MAX_TICKS_NOT_ACCOUNTED (uint64_t)500000000 /* 5 sec */
> > > +#define MAX_IRQ_RATE (uint32_t)10
> > > +
> > > struct HPETState;
> > > typedef struct HPETTimer { /* timers */
> > > uint8_t tn; /*timer number*/
> > > @@ -326,28 +330,63 @@ static const VMStateDescription vmstate_hpet = {
> > > }
> > > };
> > >
> > > +static bool hpet_timer_has_tick_backlog(HPETTimer *t)
> > > +{
> > > + uint64_t backlog = t->ticks_not_accounted - (t->period + t->prev_period);
> > > + return (backlog >= t->period);
> > > +}
> > > +
> > > /*
> > > * timer expiration callback
> > > */
> > > static void hpet_timer(void *opaque)
> > > {
> > > HPETTimer *t = opaque;
> > > + HPETState *s = t->state;
> > > uint64_t diff;
> > > -
> > > + int irq_delivered = 0;
> > > + uint32_t irq_count = 0;
> > > uint64_t period = t->period;
> > > uint64_t cur_tick = hpet_get_ticks(t->state);
> > >
> > > + if (s->driftfix && !t->ticks_not_accounted) {
> > > + t->ticks_not_accounted = t->prev_period = t->period;
> > > + }
> > > if (timer_is_periodic(t) && period != 0) {
> > > if (t->config & HPET_TN_32BIT) {
> > > while (hpet_time_after(cur_tick, t->cmp)) {
> > > t->cmp = (uint32_t)(t->cmp + t->period);
> > > + t->ticks_not_accounted += t->period;
> > > + irq_count++;
> > > }
> > > } else {
> > > while (hpet_time_after64(cur_tick, t->cmp)) {
> > > t->cmp += period;
> > > + t->ticks_not_accounted += period;
> > > + irq_count++;
> > > }
> > > }
> > > diff = hpet_calculate_diff(t, cur_tick);
> > > + if (s->driftfix) {
> > > + if (t->ticks_not_accounted > MAX_TICKS_NOT_ACCOUNTED) {
> > > + t->ticks_not_accounted = t->period + t->prev_period;
> > > + }
> > > + if (hpet_timer_has_tick_backlog(t)) {
> > > + if (t->irq_rate == 1 || irq_count > 1) {
> > > + t->irq_rate++;
> > > + t->irq_rate = MIN(t->irq_rate, MAX_IRQ_RATE);
> > > + }
> > > + if (t->divisor == 0) {
> > > + assert(irq_count);
> > > + }
> > > + if (irq_count) {
> > > + t->divisor = t->irq_rate;
> > > + }
> > > + diff /= t->divisor--;
> > > + } else {
> > > + t->irq_rate = 1;
> > > + }
> > > + }
> > > qemu_mod_timer(t->qemu_timer,
> > > qemu_get_clock_ns(vm_clock) + (int64_t)ticks_to_ns(diff));
> > > } else if (t->config & HPET_TN_32BIT && !timer_is_periodic(t)) {
> > > @@ -358,7 +397,22 @@ static void hpet_timer(void *opaque)
> > > t->wrap_flag = 0;
> > > }
> > > }
> > > - update_irq(t, 1);
> > > + if (s->driftfix && timer_is_periodic(t) && period != 0) {
> > > + if (t->ticks_not_accounted >= t->period + t->prev_period) {
^^^^^^^^^^
> > > + irq_delivered = update_irq(t, 1);
> > > + if (irq_delivered) {
> > > + t->ticks_not_accounted -= t->prev_period;
> > > + t->prev_period = t->period;
> > > + } else {
> > > + if (irq_count) {
> > > + t->irq_rate++;
> > > + t->irq_rate = MIN(t->irq_rate, MAX_IRQ_RATE);
> > > + }
> > > + }
> > > + }
> > > + } else {
> > > + update_irq(t, 1);
> > > + }
> > > }
> >
> > Hi Ulrich,
> >
> > Whats prev_period for, since in practice the period will not change
> > between interrupts (OS programs comparator once, or perhaps twice during
> > bootup) ?
>
> Actually, some systems, like Windows 2000-and-something change this
> period quite heavily under multimedia workloads.
I see. Still, using the period at the previous timer handler instance in
the decision whether to inject an interrupt to the guest makes no sense
to me.
> > Other than that, shouldnt reset accounting variables to init state on
> > write to GLOBAL_ENABLE_CFG / writes to main counter?
What i meant to ask here is whether the reinjection state (including
ticks_not_accounted) should be reset whenever a different period is
programmed.
next prev parent reply other threads:[~2011-05-03 22:55 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-28 14:24 [PATCH v3 0/5] hpet 'driftfix': alleviate time drift with HPET periodic timers Ulrich Obergfell
2011-04-28 14:24 ` [Qemu-devel] " Ulrich Obergfell
2011-04-28 14:24 ` [PATCH v3 1/5] hpet 'driftfix': add hooks required to detect coalesced interrupts (x86 apic only) Ulrich Obergfell
2011-04-28 14:24 ` [Qemu-devel] " Ulrich Obergfell
2011-04-28 18:51 ` Blue Swirl
2011-04-28 18:51 ` Blue Swirl
2011-04-28 22:50 ` Jan Kiszka
2011-04-28 22:50 ` Jan Kiszka
2011-04-29 9:45 ` Ulrich Obergfell
2011-04-29 9:45 ` Ulrich Obergfell
2011-04-29 10:15 ` Jan Kiszka
2011-04-29 10:15 ` [Qemu-devel] " Jan Kiszka
2011-04-29 21:44 ` Blue Swirl
2011-04-29 21:44 ` Blue Swirl
2011-04-28 14:24 ` [PATCH v3 2/5] hpet 'driftfix': add driftfix property to HPETState and DeviceInfo Ulrich Obergfell
2011-04-28 14:24 ` [Qemu-devel] " Ulrich Obergfell
2011-04-28 14:24 ` [PATCH v3 3/5] hpet 'driftfix': add fields to HPETTimer and VMStateDescription Ulrich Obergfell
2011-04-28 14:24 ` [Qemu-devel] " Ulrich Obergfell
2011-04-28 14:24 ` [PATCH v3 4/5] hpet 'driftfix': add code in update_irq() to detect coalesced interrupts (x86 apic only) Ulrich Obergfell
2011-04-28 14:24 ` [Qemu-devel] " Ulrich Obergfell
2011-04-28 14:25 ` [PATCH v3 5/5] hpet 'driftfix': add code in hpet_timer() to compensate delayed callbacks and coalesced interrupts Ulrich Obergfell
2011-04-28 14:25 ` [Qemu-devel] " Ulrich Obergfell
2011-05-03 19:03 ` Marcelo Tosatti
2011-05-03 19:03 ` [Qemu-devel] " Marcelo Tosatti
2011-05-03 22:08 ` Glauber Costa
2011-05-03 22:08 ` [Qemu-devel] " Glauber Costa
2011-05-03 22:55 ` Marcelo Tosatti [this message]
2011-05-03 22:55 ` Marcelo Tosatti
2011-05-04 8:06 ` Ulrich Obergfell
2011-05-04 8:06 ` [Qemu-devel] " Ulrich Obergfell
2011-05-04 9:09 ` Marcelo Tosatti
2011-05-04 9:09 ` [Qemu-devel] " Marcelo Tosatti
2011-05-04 13:36 ` Glauber Costa
2011-05-04 13:36 ` [Qemu-devel] " Glauber Costa
2011-05-04 13:46 ` Gleb Natapov
2011-05-04 13:46 ` [Qemu-devel] " Gleb Natapov
2011-05-04 13:47 ` Glauber Costa
2011-05-04 13:47 ` [Qemu-devel] " Glauber Costa
2011-05-04 13:55 ` Gleb Natapov
2011-05-04 13:55 ` [Qemu-devel] " Gleb Natapov
2011-05-05 8:07 ` Ulrich Obergfell
2011-05-05 8:07 ` [Qemu-devel] " Ulrich Obergfell
2011-05-06 14:39 ` Marcelo Tosatti
2011-05-06 14:39 ` [Qemu-devel] " Marcelo Tosatti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110503225535.GA16599@amt.cnet \
--to=mtosatti@redhat.com \
--cc=aliguori@us.ibm.com \
--cc=avi@redhat.com \
--cc=gcosta@redhat.com \
--cc=glommer@redhat.com \
--cc=jan.kiszka@siemens.com \
--cc=kvm@vger.kernel.org \
--cc=qemu-devel@nongnu.org \
--cc=uobergfe@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.