From mboxrd@z Thu Jan 1 00:00:00 1970 From: Sheng Yang Subject: Re: [PATCH] Fix almost infinite loop in APIC Date: Thu, 15 Jan 2009 15:20:06 +0800 Message-ID: <200901151520.07458.sheng@linux.intel.com> References: <1231432566-9864-1-git-send-email-agraf@suse.de> <200901141717.23587.sheng@linux.intel.com> <20090114170359.GA5865@amt.cnet> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Cc: Alexander Graf , kvm@vger.kernel.org, avi@redhat.com, Kevin Wolf To: Marcelo Tosatti Return-path: Received: from mga06.intel.com ([134.134.136.21]:3480 "EHLO orsmga101.jf.intel.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1755733AbZAOHUM (ORCPT ); Thu, 15 Jan 2009 02:20:12 -0500 In-Reply-To: <20090114170359.GA5865@amt.cnet> Content-Disposition: inline Sender: kvm-owner@vger.kernel.org List-ID: On Thursday 15 January 2009 01:03:59 Marcelo Tosatti wrote: > On Wed, Jan 14, 2009 at 05:17:22PM +0800, Sheng Yang wrote: > > > So calculating the offset using last interrupt injection is not very > > > reasonable in this case. > > > > Um... I think it's not easy to find a complete reasonable result here, I > > haven't look into how OS us TMCCT, maybe it can help us to find a more > > reasonable result here. > > Linux, FreeBSD use it for calibration only. And Windows too, apparently > (at least a couple of versions tested). > > What about this? > > Index: kvm/arch/x86/kvm/lapic.c > =================================================================== > --- kvm.orig/arch/x86/kvm/lapic.c > +++ kvm/arch/x86/kvm/lapic.c > @@ -511,52 +511,31 @@ static void apic_send_ipi(struct kvm_lap > > static u32 apic_get_tmcct(struct kvm_lapic *apic) > { > - u64 counter_passed; > - ktime_t passed, now; > - u32 tmcct; > + ktime_t remaining; > + u32 tmcct = 0; > > ASSERT(apic != NULL); > > - now = apic->timer.dev.base->get_time(); > - tmcct = apic_get_reg(apic, APIC_TMICT); > - > /* if initial count is 0, current count should also be 0 */ > - if (tmcct == 0) > + if (apic_get_reg(apic, APIC_TMICT) == 0) > return 0; > > - if (unlikely(ktime_to_ns(now) <= > - ktime_to_ns(apic->timer.last_update))) { > - /* Wrap around */ > - passed = ktime_add(( { > - (ktime_t) { > - .tv64 = KTIME_MAX - > - (apic->timer.last_update).tv64}; } > - ), now); > - apic_debug("time elapsed\n"); > - } else > - passed = ktime_sub(now, apic->timer.last_update); > - > - counter_passed = div64_u64(ktime_to_ns(passed), > - (APIC_BUS_CYCLE_NS * apic->timer.divide_count)); > - > - if (counter_passed > tmcct) { > - if (unlikely(!apic_lvtt_period(apic))) { > - /* one-shot timers stick at 0 until reset */ > - tmcct = 0; > - } else { > - /* > - * periodic timers reset to APIC_TMICT when they > - * hit 0. The while loop simulates this happening N > - * times. (counter_passed %= tmcct) would also work, > - * but might be slower or not work on 32-bit?? > - */ > - while (counter_passed > tmcct) > - counter_passed -= tmcct; > - tmcct -= counter_passed; > - } > - } else { > - tmcct -= counter_passed; > - } > + /* > + * Since reinjection is not rate-limited, use the delay > + * to inject the last interrupt as an estimate. > + */ > + if (unlikely(atomic_read(&apic->timer.pending) > 0)) { > + remaining = apic->timer.injection_delay; > + if (ktime_to_ns(remaining) > apic->timer.period) > + remaining = ns_to_ktime(apic->timer.period); > + } else > + remaining = hrtimer_expires_remaining(&apic->timer.dev); A little doubt... A: time_fire B: intr_post C: read TMCCT The sequence can be ABC or ACB. injection_delay = time(B) - time(A) So it didn't count time from read TMCCT... And guest get interrupt at time(B), not quite understand why time(B) - time(A) matters here... I think the reasonable here means, this interval is usable later after the accumulated interrupts are injected. From this point of view, I think current solution is reasonable. It just assume the delayed interrupts have been injected. However, seriously, any value here is wrong, no elegant one. But I still prefer to the current solution... And here is not the really problem for now I think. The current mechanism is mostly OK, but where is the bug... We have either have a simple fix(e.g. if now < last_update, then return 0) or dig into it. Did it worth a try? Anyway, it would return a buggy result if we have pending interrupts... -- regards Yang, Sheng > + > + if (remaining.tv64 > 0) > + tmcct = div64_u64(ktime_to_ns(remaining), > + (APIC_BUS_CYCLE_NS * apic->timer.divide_count)); > + > + WARN_ON(tmcct > apic_get_reg(apic, APIC_TMICT)); > > return tmcct; > } > @@ -653,8 +632,6 @@ static void start_apic_timer(struct kvm_ > { > ktime_t now = apic->timer.dev.base->get_time(); > > - apic->timer.last_update = now; > - > apic->timer.period = apic_get_reg(apic, APIC_TMICT) * > APIC_BUS_CYCLE_NS * apic->timer.divide_count; > atomic_set(&apic->timer.pending, 0); > @@ -972,6 +949,7 @@ static int __apic_timer_fn(struct kvm_la > set_bit(KVM_REQ_PENDING_TIMER, &apic->vcpu->requests); > if (waitqueue_active(q)) > wake_up_interruptible(q); > + apic->timer.int_trigger = ktime_get(); > > if (apic_lvtt_period(apic)) { > result = 1; > @@ -1112,12 +1090,14 @@ void kvm_inject_apic_timer_irqs(struct k > > void kvm_apic_timer_intr_post(struct kvm_vcpu *vcpu, int vec) > { > + ktime_t delta; > struct kvm_lapic *apic = vcpu->arch.apic; > > - if (apic && apic_lvt_vector(apic, APIC_LVTT) == vec) > - apic->timer.last_update = ktime_add_ns( > - apic->timer.last_update, > - apic->timer.period); > + if (apic && apic_lvt_vector(apic, APIC_LVTT) == vec && > + atomic_read(&apic->timer.pending) == 0) { > + delta = ktime_sub(ktime_get(), apic->timer.int_trigger); > + apic->timer.injection_delay = delta; > + } > } > > int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu) > Index: kvm/arch/x86/kvm/lapic.h > =================================================================== > --- kvm.orig/arch/x86/kvm/lapic.h > +++ kvm/arch/x86/kvm/lapic.h > @@ -12,7 +12,8 @@ struct kvm_lapic { > atomic_t pending; > s64 period; /* unit: ns */ > u32 divide_count; > - ktime_t last_update; > + ktime_t injection_delay; > + ktime_t int_trigger; > struct hrtimer dev; > } timer; > struct kvm_vcpu *vcpu;