From: Matt Lupfer <mlupfer@ddn.com>
To: QEMU Developers <qemu-devel@nongnu.org>
Cc: alex@alex.org.uk
Subject: [Qemu-devel] CentOS 5.x intermittently fails to boot on QEMU 1.7.0
Date: Thu, 20 Feb 2014 21:34:19 -0700
Message-ID: <5306D74B.7070104@ddn.com>
Hello,
After upgrading to QEMU 1.7.0, CentOS 5.x guests often fail to boot
with the following kernel apic=debug output:
> ACPI: Core revision 20060707
> enabled ExtINT on CPU#0
> ENABLING IO-APIC IRQs
> ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> ...trying to set up timer (IRQ0) through the 8259A ... failed.
> ...trying to set up timer as Virtual Wire IRQ... failed.
> ...trying to set up timer as ExtINT IRQ... failed .
> Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 'noapic' kernel parameter
This happens on more than 50% of boots in my configuration.
Adding the noapic or no_timer_check parameter causes the guest to boot
properly. I'd like to find a way to restore the previous behavior, which
didn't require these guest kernel parameters.
Host is a fully updated Fedora 20, kernel 3.12.10-300.fc20.x86_64 with an Intel
Core i5-2500 CPU. Guest is a fully updated base install of CentOS 5.10, kernel
2.6.18-371.4.1.el5.x86_64 (installed with "noapic", but booted with default
parameters).
QEMU invocation:
./x86_64-softmmu/qemu-system-x86_64 -m 4096 -cpu host -enable-kvm -drive file=~/ddn-001.img,cache=off -serial telnet:0.0.0.0:4444,server,nowait
A git bisect points to this commit as the culprit:
b1bbfe7 aio / timers: On timer modification, qemu_notify or aio_notify
which was part of the Aug 2013 timer rewrite. Reverting this hunk in
particular makes the issue go away:
> @@ -522,9 +531,7 @@ void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time)
> }
> /* Interrupt execution to force deadline recalculation. */
> qemu_clock_warp(ts->timer_list->clock);
> - if (use_icount) {
> - timerlist_notify(ts->timer_list);
> - }
> + timerlist_notify(ts->timer_list);
> }
> }
(Note this was later refactored into timerlist_rearm() in 1.7.0, so I
mean that I modified timerlist_rearm() in 1.7.0 to read as that hunk
did before the b1bbfe7 commit.)
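For reference, the two behaviours can be sketched as a standalone model.
This is only an illustration and not QEMU code: QEMUTimerListModel,
clock_warp_stub() and timerlist_notify_stub() are hypothetical stand-ins
for the real structure and for qemu_clock_warp() and timerlist_notify();
only use_icount mirrors an actual QEMU global.

```c
#include <assert.h>

/* Hypothetical stand-in for QEMUTimerList: it only counts calls. */
typedef struct QEMUTimerListModel {
    int warp_count;    /* times a clock warp was requested */
    int notify_count;  /* times the list owner was notified */
} QEMUTimerListModel;

/* Mirrors QEMU's use_icount global; 0 for a plain KVM guest. */
int use_icount = 0;

/* Stubs standing in for qemu_clock_warp() and timerlist_notify(). */
void clock_warp_stub(QEMUTimerListModel *tl) { tl->warp_count++; }
void timerlist_notify_stub(QEMUTimerListModel *tl) { tl->notify_count++; }

/* Behaviour after b1bbfe7: every rearm notifies. */
void rearm_always_notify(QEMUTimerListModel *tl)
{
    assert(tl);
    clock_warp_stub(tl);
    timerlist_notify_stub(tl);
}

/* Behaviour with the hunk reverted: notify only under icount. */
void rearm_notify_if_icount(QEMUTimerListModel *tl)
{
    assert(tl);
    clock_warp_stub(tl);
    if (use_icount) {
        timerlist_notify_stub(tl);
    }
}
```

With use_icount left at 0, as for a KVM guest, rearm_notify_if_icount()
never notifies, which is the behaviour that makes the panic go away here.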
This doesn't appear to be a real solution, however. With the timer rewrite,
the periodic (1 ms) qemu_notify_event() call that breaks out of the main
event loop moved from a SIGALRM handler to the rearm of a QEMU timer, and
presumably QEMU is counting on these generic callbacks.
It appears that in QEMU 1.7.0, QEMU/KVM doesn't inject timer interrupts, or
alternatively the guest doesn't handle them, quickly enough to pass
the timer check in the guest kernel reliably.
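As a rough illustration of the guest-side check, here is a minimal model
of what timer_irq_works() does in 2.6-era kernels, as far as I recall it:
let about ten ticks pass, then require that jiffies advanced by more than
four. The constants and the function names below are assumptions made for
this sketch and should be checked against the actual guest kernel source.

```c
#include <assert.h>

/* ASSUMPTION: constants recalled from 2.6-era timer_irq_works(). */
#define TICKS_TO_WAIT 10      /* ticks the guest lets pass */
#define MIN_TICKS_EXCLUSIVE 4 /* more than this many must be seen */

/* Length of the busy-wait in milliseconds for a given guest HZ. */
unsigned long timer_check_delay_ms(unsigned long hz)
{
    assert(hz > 0);
    return (TICKS_TO_WAIT * 1000UL) / hz;
}

/* Nonzero if enough timer interrupts advanced jiffies during the wait. */
int timer_check_passes(unsigned long jiffies_before, unsigned long jiffies_after)
{
    /* Unsigned subtraction stays correct if jiffies wraps. */
    return (jiffies_after - jiffies_before) > MIN_TICKS_EXCLUSIVE;
}
```

If the guest runs with HZ=1000 (an assumption about the CentOS 5 kernel
configuration), the window is only 10 ms, so at least five of the roughly
ten expected ticks have to be delivered and handled within it.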
I've found that if I suppress the first 20 ms of calls to timerlist_notify()
in timerlist_rearm() for timers on QEMU_CLOCK_VIRTUAL, the guest boots
successfully and remains stable. Not calling
qemu_notify_event() on the first 20 ticks of QEMU_CLOCK_VIRTUAL seems to
alter the timings enough to produce a reliable result. I tried this after
realizing that the guest kernel enables the HPET, which enables the QEMU
virtual clock, immediately before the guest timer check occurs. I also
observed that the kernel boots fine with the "nohpet" parameter, and
I suspected that this could be a source of resource contention.
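The suppression experiment might look roughly like the following. It is a
sketch, not the actual patch: SuppressState and should_notify() are
hypothetical names, and it assumes the 20 ms window is measured on the
virtual clock starting from the first rearm seen on QEMU_CLOCK_VIRTUAL,
which the description above leaves open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SUPPRESS_WINDOW_NS (20 * 1000000LL) /* 20 ms in nanoseconds */

/* Hypothetical bookkeeping for the suppression window. */
typedef struct SuppressState {
    bool seen_first;   /* has any virtual-clock rearm been seen yet? */
    int64_t first_ns;  /* virtual time of that first rearm */
} SuppressState;

/* Returns true if the rearm path should go on to notify. */
bool should_notify(SuppressState *s, bool is_virtual_clock, int64_t now_ns)
{
    assert(s);
    if (!is_virtual_clock) {
        return true;  /* other clocks are never suppressed */
    }
    if (!s->seen_first) {
        s->seen_first = true;
        s->first_ns = now_ns;
    }
    /* Suppress until 20 ms of virtual time have elapsed. */
    return (now_ns - s->first_ns) >= SUPPRESS_WINDOW_NS;
}
```

In this model a caller would consult should_notify() before the notify
call in the rearm path and skip the notify when it returns false.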
Finally, the QEMU options to disable KVM PIT IRQ reinjection and to disable
the KVM in-kernel irqchip altogether make the panics less frequent, but the
guest still panics within 100 boots.
Thanks for any assistance you can provide.
Matt