From: Marcelo Tosatti <mtosatti@redhat.com>
To: Sheng Yang <sheng@linux.intel.com>
Cc: Alexander Graf <agraf@suse.de>,
kvm@vger.kernel.org, avi@redhat.com, Kevin Wolf <kwolf@suse.de>
Subject: Re: [PATCH] Fix almost infinite loop in APIC
Date: Fri, 16 Jan 2009 03:01:43 -0200 [thread overview]
Message-ID: <20090116050143.GA13032@amt.cnet> (raw)
In-Reply-To: <200901151520.07458.sheng@linux.intel.com>
[-- Attachment #1: Type: text/plain, Size: 2169 bytes --]
On Thu, Jan 15, 2009 at 03:20:06PM +0800, Sheng Yang wrote:
> > + * Since reinjection is not rate-limited, use the delay
> > + * to inject the last interrupt as an estimate.
> > + */
> > + if (unlikely(atomic_read(&apic->timer.pending) > 0)) {
> > + remaining = apic->timer.injection_delay;
> > + if (ktime_to_ns(remaining) > apic->timer.period)
> > + remaining = ns_to_ktime(apic->timer.period);
> > + } else
> > + remaining = hrtimer_expires_remaining(&apic->timer.dev);
>
> A little doubt...
>
> A: time_fire
> B: intr_post
> C: read TMCCT
>
> The sequence can be ABC or ACB.
>
> injection_delay = time(B) - time(A)
>
> So it didn't count time from read TMCCT... And guest get interrupt at time(B),
> not quite understand why time(B) - time(A) matters here...
Its an estimate of the delay it takes to inject an interrupt to the
guest once fired. Its only used if there have been accumulated ones, so
ACB sequence with pending=0 will use hrtimer_expires_remaining().
> I think the reasonable here means, this interval is usable later after the
> accumulated interrupts are injected. From this point of view, I think current
> solution is reasonable. It just assume the delayed interrupts have been
> injected.
>
> However, seriously, any value here is wrong, no elegant one.
I agree.
> But I still prefer to the current solution...
Why? The proposed version is smaller and simpler than the current
one IMO, and also not vulnerable to whatever bug is causing now <
last_update.
And more precise, since the current scheme uses interrupt injection time
as if it was "count-down restart" time, which is not true.
> And here is not the really problem for now I think. The current mechanism is
> mostly OK, but where is the bug... We have either have a simple fix(e.g. if
> now < last_update, then return 0) or dig into it. Did it worth a try? Anyway,
> it would return a buggy result if we have pending interrupts...
The overflow calculation is not necessary as discussed. Alexander, can
you please try the following? Sheng, do you have any other suggestion to
test?
/proc/timer_list output of the host when ESX is running too.
[-- Attachment #2: kvm-lapic-debug-loop.patch --]
[-- Type: text/plain, Size: 3426 bytes --]
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index afac68c..e9f13cc 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -509,6 +509,37 @@ static void apic_send_ipi(struct kvm_lapic *apic)
}
}
+
+static void record_last_update(struct kvm_lapic *apic)
+{
+ int rec_idx = apic->timer.rec_idx;
+
+ apic->timer.rec[rec_idx].when = ktime_get();
+ apic->timer.rec[rec_idx].last_update = apic->timer.last_update;
+ apic->timer.rec[rec_idx].pending = atomic_read(&apic->timer.pending);
+ apic->timer.rec_idx++;
+
+ if (apic->timer.rec_idx >= KVM_LAPIC_REC_NR)
+ apic->timer.rec_idx = 0;
+}
+
+static void print_last_update_record(struct kvm_lapic *apic)
+{
+ int iter = 0;
+ int i = apic->timer.rec_idx;
+
+ while (iter < KVM_LAPIC_REC_NR) {
+ printk("rec[%d] when=%lld last_update=%lld pend=%d\n",
+ i, ktime_to_ns(apic->timer.rec[i].when),
+ ktime_to_ns(apic->timer.rec[i].last_update),
+ apic->timer.rec[i].pending);
+ iter++;
+ i--;
+ if (i < 0)
+ i = KVM_LAPIC_REC_NR-1;
+ }
+}
+
static u32 apic_get_tmcct(struct kvm_lapic *apic)
{
u64 counter_passed;
@@ -532,7 +563,10 @@ static u32 apic_get_tmcct(struct kvm_lapic *apic)
.tv64 = KTIME_MAX -
(apic->timer.last_update).tv64}; }
), now);
- apic_debug("time elapsed\n");
+ printk("last_update = %lld now = %lld\n",
+ ktime_to_ns(apic->timer.last_update),
+ ktime_to_ns(now));
+ print_last_update_record(apic);
} else
passed = ktime_sub(now, apic->timer.last_update);
@@ -544,14 +578,7 @@ static u32 apic_get_tmcct(struct kvm_lapic *apic)
/* one-shot timers stick at 0 until reset */
tmcct = 0;
} else {
- /*
- * periodic timers reset to APIC_TMICT when they
- * hit 0. The while loop simulates this happening N
- * times. (counter_passed %= tmcct) would also work,
- * but might be slower or not work on 32-bit??
- */
- while (counter_passed > tmcct)
- counter_passed -= tmcct;
+ counter_passed %= (u64)tmcct;
tmcct -= counter_passed;
}
} else {
@@ -666,7 +693,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
ktime_add_ns(now, apic->timer.period),
HRTIMER_MODE_ABS);
- apic_debug("%s: bus cycle is %" PRId64 "ns, now 0x%016"
+ printk("%s: bus cycle is %" PRId64 "ns, now 0x%016"
PRIx64 ", "
"timer initial count 0x%x, period %lldns, "
"expire @ 0x%016" PRIx64 ".\n", __func__,
@@ -1114,10 +1141,12 @@ void kvm_apic_timer_intr_post(struct kvm_vcpu *vcpu, int vec)
{
struct kvm_lapic *apic = vcpu->arch.apic;
- if (apic && apic_lvt_vector(apic, APIC_LVTT) == vec)
+ if (apic && apic_lvt_vector(apic, APIC_LVTT) == vec) {
apic->timer.last_update = ktime_add_ns(
apic->timer.last_update,
apic->timer.period);
+ record_last_update(apic);
+ }
}
int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 8185888..27baadb 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -5,6 +5,8 @@
#include <linux/kvm_host.h>
+#define KVM_LAPIC_REC_NR 50
+
struct kvm_lapic {
unsigned long base_address;
struct kvm_io_device dev;
@@ -13,6 +15,13 @@ struct kvm_lapic {
s64 period; /* unit: ns */
u32 divide_count;
ktime_t last_update;
+ struct {
+ ktime_t when;
+ ktime_t last_update;
+ int pending;
+ } rec[KVM_LAPIC_REC_NR];
+ int rec_idx;
+
struct hrtimer dev;
} timer;
struct kvm_vcpu *vcpu;
next prev parent reply other threads:[~2009-01-16 5:02 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-08 16:36 [PATCH] Fix almost infinite loop in APIC Alexander Graf
2009-01-09 6:34 ` Sheng Yang
2009-01-09 10:49 ` Alexander Graf
2009-01-09 12:57 ` Alexander Graf
2009-01-10 11:21 ` Sheng Yang
2009-01-11 4:55 ` Marcelo Tosatti
2009-01-13 7:47 ` Sheng Yang
2009-01-13 22:01 ` Marcelo Tosatti
2009-01-14 9:17 ` Sheng Yang
2009-01-14 17:03 ` Marcelo Tosatti
2009-01-15 7:20 ` Sheng Yang
2009-01-16 5:01 ` Marcelo Tosatti [this message]
2009-01-20 10:41 ` Alexander Graf
2009-01-20 11:20 ` Sheng Yang
2009-01-20 12:09 ` Alexander Graf
2009-01-20 12:30 ` Sheng Yang
2009-01-20 13:43 ` Sheng Yang
2009-01-20 18:51 ` Marcelo Tosatti
2009-01-21 2:40 ` Sheng Yang
2009-01-21 4:23 ` Marcelo Tosatti
2009-01-21 5:11 ` Sheng Yang
2009-01-21 15:07 ` Marcelo Tosatti
2009-01-21 16:01 ` Alexander Graf
2009-01-21 16:03 ` Alexander Graf
2009-01-21 16:18 ` Alexander Graf
2009-01-21 16:55 ` Marcelo Tosatti
2009-01-22 13:08 ` Avi Kivity
2009-01-23 17:58 ` Alex Williamson
2009-01-10 11:25 ` Sheng Yang
2009-01-10 11:28 ` Sheng Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090116050143.GA13032@amt.cnet \
--to=mtosatti@redhat.com \
--cc=agraf@suse.de \
--cc=avi@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=kwolf@suse.de \
--cc=sheng@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox