From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Graf Subject: Re: [PATCH] Fix almost infinite loop in APIC Date: Tue, 20 Jan 2009 11:41:21 +0100 Message-ID: <4975AA51.2060705@suse.de> References: <1231432566-9864-1-git-send-email-agraf@suse.de> <200901141717.23587.sheng@linux.intel.com> <20090114170359.GA5865@amt.cnet> <200901151520.07458.sheng@linux.intel.com> <20090116050143.GA13032@amt.cnet> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Sheng Yang , kvm@vger.kernel.org, avi@redhat.com, Kevin Wolf To: Marcelo Tosatti Return-path: Received: from cantor2.suse.de ([195.135.220.15]:46979 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1760605AbZATKlZ (ORCPT ); Tue, 20 Jan 2009 05:41:25 -0500 In-Reply-To: <20090116050143.GA13032@amt.cnet> Sender: kvm-owner@vger.kernel.org List-ID: Marcelo Tosatti wrote: > On Thu, Jan 15, 2009 at 03:20:06PM +0800, Sheng Yang wrote: > >>> + * Since reinjection is not rate-limited, use the delay >>> + * to inject the last interrupt as an estimate. >>> + */ >>> + if (unlikely(atomic_read(&apic->timer.pending) > 0)) { >>> + remaining = apic->timer.injection_delay; >>> + if (ktime_to_ns(remaining) > apic->timer.period) >>> + remaining = ns_to_ktime(apic->timer.period); >>> + } else >>> + remaining = hrtimer_expires_remaining(&apic->timer.dev); >>> >> A little doubt... >> >> A: time_fire >> B: intr_post >> C: read TMCCT >> >> The sequence can be ABC or ACB. >> >> injection_delay = time(B) - time(A) >> >> So it didn't count time from read TMCCT... And guest get interrupt at time(B), >> not quite understand why time(B) - time(A) matters here... >> > > Its an estimate of the delay it takes to inject an interrupt to the > guest once fired. Its only used if there have been accumulated ones, so > ACB sequence with pending=0 will use hrtimer_expires_remaining(). > > >> I think the reasonable here means, this interval is usable later after the >> accumulated interrupts are injected. From this point of view, I think current >> solution is reasonable. It just assume the delayed interrupts have been >> injected. >> >> However, seriously, any value here is wrong, no elegant one. >> > > I agree. > > >> But I still prefer to the current solution... >> > > Why? The proposed version is smaller and simpler than the current > one IMO, and also not vulnerable to whatever bug is causing now < > last_update. > > And more precise, since the current scheme uses interrupt injection time > as if it was "count-down restart" time, which is not true. > > >> And here is not the really problem for now I think. The current mechanism is >> mostly OK, but where is the bug... We have either have a simple fix(e.g. if >> now < last_update, then return 0) or dig into it. Did it worth a try? Anyway, >> it would return a buggy result if we have pending interrupts... >> > > The overflow calculation is not necessary as discussed. Alexander, can > you please try the following? Sheng, do you have any other suggestion to > test? > > /proc/timer_list output of the host when ESX is running too. > Sorry for the late reply. Here's the dmesg output and /proc/timer_list after the issue occured. I left the warning message in that I put there in case the value is too high: counter_passed = div64_u64(ktime_to_ns(passed), (APIC_BUS_CYCLE_NS * apic->timer.divide_count)); + if (counter_passed > 0x7f00000000000000) { + /* If we're in here we probably encountered a bug! */ + printk(KERN_INFO "LAPIC: Too high counter_passed value: 0x%lx | 0x%lx (0x%lx) | 0x%lx (0x%lx)\n", counter_passed, ktime_to_ns(passed), passed, now.tv64, apic ->timer.last_update.tv64); + } + if (counter_passed > tmcct) { if (unlikely(!apic_lvtt_period(apic))) { /* one-shot timers stick at 0 until reset */ start_apic_timer: bus cycle is 1ns, now 0x00039a6f377cfbc4, timer initial count 0x186a0, period 100000ns, expire @ 0x00039a6f377e8264. start_apic_timer: bus cycle is 1ns, now 0x00039a6f3a7b5aef, timer initial count 0x203a0, period 132000ns, expire @ 0x00039a6f3a7d5e8f. last_update = 1014860818546063 now = 1014860818426085 rec[47] when=1014860811948512 last_update=1014860812078063 pend=0 rec[46] when=1014860818417773 last_update=1014860818546063 pend=0 rec[45] when=1014860818285454 last_update=1014860818414063 pend=0 rec[44] when=1014860818153119 last_update=1014860818282063 pend=0 rec[43] when=1014860818021746 last_update=1014860818150063 pend=0 rec[42] when=1014860817889326 last_update=1014860818018063 pend=0 rec[41] when=1014860817757297 last_update=1014860817886063 pend=0 rec[40] when=1014860817625682 last_update=1014860817754063 pend=0 rec[39] when=1014860817492484 last_update=1014860817622063 pend=0 rec[38] when=1014860817360531 last_update=1014860817490063 pend=0 rec[37] when=1014860817228489 last_update=1014860817358063 pend=0 rec[36] when=1014860817101250 last_update=1014860817226063 pend=0 rec[35] when=1014860816964378 last_update=1014860817094063 pend=0 rec[34] when=1014860816832459 last_update=1014860816962063 pend=0 rec[33] when=1014860816700514 last_update=1014860816830063 pend=0 rec[32] when=1014860816568558 last_update=1014860816698063 pend=0 rec[31] when=1014860816436510 last_update=1014860816566063 pend=0 rec[30] when=1014860816305079 last_update=1014860816434063 pend=0 rec[29] when=1014860816172560 last_update=1014860816302063 pend=0 rec[28] when=1014860816040449 last_update=1014860816170063 pend=0 rec[27] when=1014860815908500 last_update=1014860816038063 pend=0 rec[26] when=1014860815776451 last_update=1014860815906063 pend=0 rec[25] when=1014860815644499 last_update=1014860815774063 pend=0 rec[24] when=1014860815513100 last_update=1014860815642063 pend=0 rec[23] when=1014860815380476 last_update=1014860815510063 pend=0 rec[22] when=1014860815248609 last_update=1014860815378063 pend=0 rec[21] when=1014860815116494 last_update=1014860815246063 pend=0 rec[20] when=1014860814984441 last_update=1014860815114063 pend=0 rec[19] when=1014860814852498 last_update=1014860814982063 pend=0 rec[18] when=1014860814721082 last_update=1014860814850063 pend=0 rec[17] when=1014860814588467 last_update=1014860814718063 pend=0 rec[16] when=1014860814456530 last_update=1014860814586063 pend=0 rec[15] when=1014860814324452 last_update=1014860814454063 pend=0 rec[14] when=1014860814192601 last_update=1014860814322063 pend=0 rec[13] when=1014860814060489 last_update=1014860814190063 pend=0 rec[12] when=1014860813928546 last_update=1014860814058063 pend=0 rec[11] when=1014860813797571 last_update=1014860813926063 pend=0 rec[10] when=1014860813664475 last_update=1014860813794063 pend=0 rec[9] when=1014860813532520 last_update=1014860813662063 pend=0 rec[8] when=1014860813400463 last_update=1014860813530063 pend=0 rec[7] when=1014860813268519 last_update=1014860813398063 pend=0 rec[6] when=1014860813136452 last_update=1014860813266063 pend=0 rec[5] when=1014860813007312 last_update=1014860813134063 pend=0 rec[4] when=1014860812872314 last_update=1014860813002063 pend=0 rec[3] when=1014860812740791 last_update=1014860812870063 pend=0 rec[2] when=1014860812652998 last_update=1014860812738063 pend=0 rec[1] when=1014860812476526 last_update=1014860812606063 pend=0 rec[0] when=1014860812344635 last_update=1014860812474063 pend=0 rec[49] when=1014860812213098 last_update=1014860812342063 pend=0 rec[48] when=1014860812080466 last_update=1014860812210063 pend=0 LAPIC: Too high counter_passed value: 0x7ffffffffffe2b55 | 0x7ffffffffffe2b55 (0x7ffffffffffe2b55) | 0x39b02b29428e5 (0x39b02b295fd8f) start_apic_timer: bus cycle is 1ns, now 0x00039b02b2998ada, timer initial count 0x8400, period 33792ns, expire @ 0x00039b02b29a0eda. agraf@busu:~/git/kvm-userspace/patch-virtio> cat /proc/timer_list Timer List Version: v0.3 HRTIMER_MAX_CLOCK_BASES: 2 now at 1015003705479836 nsecs cpu: 0 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1231432907192509789 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: , tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0 # expires at 1015003708000000 nsecs [in 2520164 nsecs] #1: , posix_timer_fn, S:01, common_timer_set, qemu-system-x86/12319 # expires at 1015003730134958 nsecs [in 24655122 nsecs] #2: , hrtimer_wakeup, S:01, futex_wait, thunderbird-bin/16571 # expires at 1015004228913679 nsecs [in 523433843 nsecs] #3: , hrtimer_wakeup, S:01, futex_wait, thunderbird-bin/5575 # expires at 1015004235164609 nsecs [in 529684773 nsecs] #4: , it_real_fn, S:01, do_setitimer, ntpd/3897 # expires at 1015004515252084 nsecs [in 809772248 nsecs] #5: , hrtimer_wakeup, S:01, do_nanosleep, mono/4506 # expires at 1015007076193303 nsecs [in 3370713467 nsecs] #6: , hrtimer_wakeup, S:01, futex_wait, thunderbird-bin/5568 # expires at 1015011817382709 nsecs [in 8111902873 nsecs] #7: , hrtimer_wakeup, S:01, do_nanosleep, ypbind/3441 # expires at 1015018457521563 nsecs [in 14752041727 nsecs] #8: , hrtimer_wakeup, S:01, do_nanosleep, cron/4093 # expires at 1015033940939337 nsecs [in 30235459501 nsecs] #9: , it_real_fn, S:01, do_setitimer, qmgr/4085 # expires at 1015280112163402 nsecs [in 276406683566 nsecs] #10: , it_real_fn, S:01, do_setitimer, pickup/10231 # expires at 1015285116223756 nsecs [in 281410743920 nsecs] #11: , it_real_fn, S:01, do_setitimer, master/4058 # expires at 1015285116254144 nsecs [in 281410774308 nsecs] .expires_next : 1015003708000000 nsecs .hres_active : 1 .nr_events : 580379958 .nohz_mode : 2 .idle_tick : 1015003700000000 nsecs .tick_stopped : 0 .idle_jiffies : 4548643220 .idle_calls : 593550517 .idle_sleeps : 362170032 .idle_entrytime : 1015003701647385 nsecs .idle_waketime : 1015003701630992 nsecs .idle_exittime : 1015003702532497 nsecs .idle_sleeptime : 571458026968578 nsecs .last_jiffies : 4548643221 .next_jiffies : 4548643224 .idle_expires : 1015003712000000 nsecs jiffies: 4548643222 cpu: 1 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1231432907192509789 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: , hrtimer_wakeup, S:01, futex_wait, beagled/4716 # expires at 1015003708490644 nsecs [in 3010808 nsecs] #1: , tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0 # expires at 1015003708500000 nsecs [in 3020164 nsecs] #2: , hrtimer_wakeup, S:01, do_nanosleep, xosview.bin/9582 # expires at 1015003726118330 nsecs [in 20638494 nsecs] #3: , hrtimer_wakeup, S:01, futex_wait, beagled/4718 # expires at 1015003794490837 nsecs [in 89011001 nsecs] #4: , hrtimer_wakeup, S:01, futex_wait, automount/3918 # expires at 1015003984221642 nsecs [in 278741806 nsecs] #5: , hrtimer_wakeup, S:01, futex_wait, thunderbird-bin/9982 # expires at 1015004091979648 nsecs [in 386499812 nsecs] #6: , hrtimer_wakeup, S:01, do_nanosleep, beagled-helper/10041 # expires at 1015004496780704 nsecs [in 791300868 nsecs] #7: , hrtimer_wakeup, S:01, do_nanosleep, mono/10031 # expires at 1015004573838387 nsecs [in 868358551 nsecs] #8: , hrtimer_wakeup, S:01, do_nanosleep, mono/4568 # expires at 1015007037900538 nsecs [in 3332420702 nsecs] #9: , hrtimer_wakeup, S:01, do_nanosleep, irqbalance/3631 # expires at 1015010807493897 nsecs [in 7102014061 nsecs] #10: , hrtimer_wakeup, S:01, futex_wait, thunderbird-bin/12567 # expires at 1015052672245807 nsecs [in 48966765971 nsecs] #11: , hrtimer_wakeup, S:01, do_nanosleep, smartd/4119 # expires at 1015236365925605 nsecs [in 232660445769 nsecs] .expires_next : 1015003708490644 nsecs .hres_active : 1 .nr_events : 1177733430 .nohz_mode : 2 .idle_tick : 1015003704500000 nsecs .tick_stopped : 0 .idle_jiffies : 4548643221 .idle_calls : 1078408244 .idle_sleeps : 846829238 .idle_entrytime : 1015003704530991 nsecs .idle_waketime : 1015003698212870 nsecs .idle_exittime : 1015003701647606 nsecs .idle_sleeptime : 775344327548400 nsecs .last_jiffies : 4548643221 .next_jiffies : 4548643247 .idle_expires : 1015003804000000 nsecs jiffies: 4548643222 cpu: 2 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1231432907192509789 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: , apic_timer_fn, S:01, start_apic_timer, qemu-system-x86/12327 # expires at 1015003705716442 nsecs [in 236606 nsecs] #1: , tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0 # expires at 1015003709000000 nsecs [in 3520164 nsecs] #2: , pit_timer_fn, S:01, pit_ioport_write, qemu-system-x86/12327 # expires at 1015003712306931 nsecs [in 6827095 nsecs] #3: , hrtimer_wakeup, S:01, futex_wait, firefox/19140 # expires at 1015003723685421 nsecs [in 18205585 nsecs] #4: , hrtimer_wakeup, S:01, futex_wait, automount/3919 # expires at 1015003987442435 nsecs [in 281962599 nsecs] #5: , it_real_fn, S:01, do_setitimer, screen/14384 # expires at 1015005458349005 nsecs [in 1752869169 nsecs] .expires_next : 1015003705817818 nsecs .hres_active : 1 .nr_events : 523400970 .nohz_mode : 2 .idle_tick : 1014859089000000 nsecs .tick_stopped : 0 .idle_jiffies : 4548607067 .idle_calls : 336510407 .idle_sleeps : 258867154 .idle_entrytime : 1014859085481448 nsecs .idle_waketime : 1014859053575960 nsecs .idle_exittime : 1014859085485692 nsecs .idle_sleeptime : 395655240495416 nsecs .last_jiffies : 4548607067 .next_jiffies : 4548607171 .idle_expires : 1014859500000000 nsecs jiffies: 4548643222 cpu: 3 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1231432907192509789 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: , tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0 # expires at 1015003709500000 nsecs [in 4020164 nsecs] #1: , hrtimer_wakeup, S:01, do_nanosleep, xosview.bin/1922 # expires at 1015003770033068 nsecs [in 64553232 nsecs] #2: , hrtimer_wakeup, S:01, futex_wait, thunderbird-bin/7131 # expires at 1015003800127636 nsecs [in 94647800 nsecs] #3: , hrtimer_wakeup, S:01, futex_wait, thunderbird-bin/10001 # expires at 1015003962537676 nsecs [in 257057840 nsecs] #4: , sched_rt_period_timer, S:01, __enqueue_rt_entity, swapper/1 # expires at 1015004000000000 nsecs [in 294520164 nsecs] #5: , hrtimer_wakeup, S:01, futex_wait, thunderbird-bin/5577 # expires at 1015004228745798 nsecs [in 523265962 nsecs] .expires_next : 1015003709500000 nsecs .hres_active : 1 .nr_events : 1260974322 .nohz_mode : 2 .idle_tick : 1015003381500000 nsecs .tick_stopped : 0 .idle_jiffies : 4548643141 .idle_calls : 1146068541 .idle_sleeps : 910709854 .idle_entrytime : 1015003705511942 nsecs .idle_waketime : 1015003380002663 nsecs .idle_exittime : 1015003408508278 nsecs .idle_sleeptime : 782489673980131 nsecs .last_jiffies : 4548643222 .next_jiffies : 4548643241 .idle_expires : 1015003660000000 nsecs jiffies: 4548643222 Tick Device: mode: 1 Clock Event Device: hpet max_delta_ns: 85899346200 min_delta_ns: 5000 mult: 107374182 shift: 32 mode: 3 next_event: 9223372036854775807 nsecs set_next_event: hpet_legacy_next_event set_mode: hpet_legacy_set_mode event_handler: tick_handle_oneshot_broadcast tick_broadcast_mask: 00000000 tick_broadcast_oneshot_mask: 00000000 Tick Device: mode: 1 Clock Event Device: lapic max_delta_ns: 669608197 min_delta_ns: 1197 mult: 53805782 shift: 32 mode: 3 next_event: 1015003708000000 nsecs set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt Tick Device: mode: 1 Clock Event Device: lapic max_delta_ns: 669608197 min_delta_ns: 1197 mult: 53805782 shift: 32 mode: 3 next_event: 1015003708490644 nsecs set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt Tick Device: mode: 1 Clock Event Device: lapic max_delta_ns: 669608197 min_delta_ns: 1197 mult: 53805782 shift: 32 mode: 3 next_event: 1015003705919194 nsecs set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt Tick Device: mode: 1 Clock Event Device: lapic max_delta_ns: 669608197 min_delta_ns: 1197 mult: 53805782 shift: 32 mode: 3 next_event: 1015003709500000 nsecs set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt