From: Marcelo Tosatti <mtosatti@redhat.com>
To: Sheng Yang <sheng@linux.intel.com>
Cc: Alexander Graf <agraf@suse.de>,
kvm@vger.kernel.org, avi@redhat.com, Kevin Wolf <kwolf@suse.de>
Subject: Re: [PATCH] Fix almost infinite loop in APIC
Date: Tue, 13 Jan 2009 20:01:17 -0200 [thread overview]
Message-ID: <20090113220117.GB8945@amt.cnet> (raw)
In-Reply-To: <200901131547.50626.sheng@linux.intel.com>
On Tue, Jan 13, 2009 at 03:47:49PM +0800, Sheng Yang wrote:
> On Sunday 11 January 2009 12:55:58 Marcelo Tosatti wrote:
> The original apic_timer_intr_post() got:
>
> > if (apic && apic_lvt_vector(apic, APIC_LVTT) == vec)
> > apic->timer.last_update = ktime_add_ns(
> > apic->timer.last_update,
> > apic->timer.period);
>
> I think the purpose is let guest see a reasonable TMCCT. Which means:
>
> 1. Timer start to fire at start_apic_timer(). last_update=now().
>
> 2. Every one LAPIC timer interrupt was injected into the guest, so the time at
> least elapsed timer.period(the first alarm set at "period" later), then
> last_update is increased by period. Notice last_update's base is set before
> timer fire, so it's not indicated the next one, but *the time this interrupt
> should be injected*(time can be delayed)... So last_update = n*period + base.
Right.
> 3. If there is pending interrupt, last_update won't be updated, so we have
> this:
>
> > while (counter_passed > tmcct)
> > counter_passed -= tmcct;
> > tmcct -= counter_passed;
> (notice tmcct is TMICT here.)
>
> And the returned tmcct value indicated (counter_passed % tmict), a offset
> regardless of pending interrupt numbers.
Right. The problem is that for accumulated interrupts, the guest will
receive the interrupt as fast as the qemu process can be scheduled (as
long as its not masked, of course). There could be higher priority
vectors in the mix, but thats not the common case.
So calculating the offset using last interrupt injection is not very
reasonable in this case.
> I think now the overflow seems OK, but I am not sure why last_update can be
> bigger than ktime_get()? Maybe due to vcpu migration? Suspect some racy or
> boundary condition existed...
Yep, it seems to be some small inaccuracy in the accounting that
causes it.
> And base on this, I don't think my quick fix is correct...
>
> >
> > 3) And then there's interrupt reinjection. Once interrupts accumulate,
> > and we reinject, current count reads become completly bogus. This is
> > the reason for time drift on RHEL4-style kernels with "clock=tsc". The
> > algorithm calculates the interrupt handling delay by using the PIT
> > count (equivalent to APIC_TMCCT). But PIT count emulation uses the
> > hrtimer expiration time, which has nothing to do with the accumulated
> > interrupts.
> >
> > So what I propose is to first switch lapic current count emulation to a
> > straightforward scaled hrtimer_get_remaining equivalent.
> >
> > For the reinjection case, maintain an average of the delay between host
> > interrupt and interrupt injection (this can be generic, so both PIT
> > and LAPIC timer can use it). Return that scaled average on APIC_TMCCT
> > emulation whenever pending > 2.
> >
> > What you think?
>
> 1. If we simply use ktime_get() to update last_update, the interval can't be
> same between different last_update. I think we may got some problem here, but
> not for sure. For example, three delayed interrupt was injected one after one,
> last_update would show very little interval, and APIC_TMCCT may got trouble.
I mean get rid of last_update and use hrtimer_get_remaining on
apic_get_tmcct for no pending interrupts. And estimated delay between
timer fire and injection if there are pending interrupts.
> Maintain an average of the delay seems a little tricky here, and I am not sure
> if it would help the problem... Average is average at most, not the real
> affection at the time...
Yes. Probably using the last delay between timer fire and injection is
more accurate than an average.
>
> 2. For the current mechanism, the interval is the same, so last_update always
> equal to n*period + base. If there are more than one pending interrupts, TMCCT
> should also can return the relative correct value.
>
> So I am leaning toward to fix current problem, though I haven't find the root
> cause yet...
I think it can be simpler, without the need to deal with overflow at
all.
Understanding the present root cause is important though.
next prev parent reply other threads:[~2009-01-13 22:01 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-08 16:36 [PATCH] Fix almost infinite loop in APIC Alexander Graf
2009-01-09 6:34 ` Sheng Yang
2009-01-09 10:49 ` Alexander Graf
2009-01-09 12:57 ` Alexander Graf
2009-01-10 11:21 ` Sheng Yang
2009-01-11 4:55 ` Marcelo Tosatti
2009-01-13 7:47 ` Sheng Yang
2009-01-13 22:01 ` Marcelo Tosatti [this message]
2009-01-14 9:17 ` Sheng Yang
2009-01-14 17:03 ` Marcelo Tosatti
2009-01-15 7:20 ` Sheng Yang
2009-01-16 5:01 ` Marcelo Tosatti
2009-01-20 10:41 ` Alexander Graf
2009-01-20 11:20 ` Sheng Yang
2009-01-20 12:09 ` Alexander Graf
2009-01-20 12:30 ` Sheng Yang
2009-01-20 13:43 ` Sheng Yang
2009-01-20 18:51 ` Marcelo Tosatti
2009-01-21 2:40 ` Sheng Yang
2009-01-21 4:23 ` Marcelo Tosatti
2009-01-21 5:11 ` Sheng Yang
2009-01-21 15:07 ` Marcelo Tosatti
2009-01-21 16:01 ` Alexander Graf
2009-01-21 16:03 ` Alexander Graf
2009-01-21 16:18 ` Alexander Graf
2009-01-21 16:55 ` Marcelo Tosatti
2009-01-22 13:08 ` Avi Kivity
2009-01-23 17:58 ` Alex Williamson
2009-01-10 11:25 ` Sheng Yang
2009-01-10 11:28 ` Sheng Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090113220117.GB8945@amt.cnet \
--to=mtosatti@redhat.com \
--cc=agraf@suse.de \
--cc=avi@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=kwolf@suse.de \
--cc=sheng@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox