From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
To: akataria@vmware.com
Cc: Ingo Molnar <mingo@elte.hu>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Robert Hancock <hancockr@shaw.ca>,
Arjan van de Ven <arjan@infradead.org>,
Pavel Machek <pavel@suse.cz>
Subject: Re: upstream regression (IO-APIC?)
Date: Mon, 3 Nov 2008 20:01:54 +0100
Message-ID: <200811032001.54806.bzolnier@gmail.com>
In-Reply-To: <1225735444.8168.24.camel@alok-dev1>
On Monday 03 November 2008, Alok Kataria wrote:
> On Mon, 2008-11-03 at 09:28 -0800, Bartlomiej Zolnierkiewicz wrote:
> > On Sunday 02 November 2008, Bartlomiej Zolnierkiewicz wrote:
> > > On Sunday 02 November 2008, Bartlomiej Zolnierkiewicz wrote:
> > > > On Thursday 30 October 2008, Robert Hancock wrote:
> > > > > Bartlomiej Zolnierkiewicz wrote:
> > > > > > The current Linus tree as of commit e946217e4fdaa67681bbabfa8e6b18641921f750
> > > > > > is broken for me. I get either the following panic (see log from qemu below)
> > > > > > or lost IRQs on ATA init... Is this a known issue?
> > > > > >
> > > > > > PS The tree that I used before and was supposedly good (sorry, I'm too tired
> > > > > > to verify it now) had commit 57f8f7b60db6f1ed2c6918ab9230c4623a9dbe37 at head.
> > > >
> > > > Unfortunately 57f8f7b60db6f1ed2c6918ab9230c4623a9dbe37 (v2.6.28-rc1)
> > > > is also bad. Bisecting it further was a real pain (i.e. I hit broken
> > > > build with x86 irqbalance changes, broken build with netfilter nat
> > > > changes and jbd journal problem). In the end it turned out that 2.6.27
> > > > is bad too! However with 2.6.27 the panic occurs only once per several
> > > > attempts and if there is no panic kernel boots normally (no lost IRQs).
> > > >
> > > > [...]
> > > >
> > > > I finally managed to narrow it down to change making x86 use tsc_khz
> > > > for loops_per_jiffy -- commit 3da757daf86e498872855f0b5e101f763ba79499
> > > > ("x86: use cpu_khz for loops_per_jiffy calculation"). This approach
> > > > seems too simplistic (as I see now Arjan & Pavel expressed concerns
> > > > about it back when the patch was posted initially [1][2]). Also it
> > > > would probably be preferred to re-use existing preset_lpj variable
> > > > (just like KVM does it for similar purpose [3]) instead of adding a
> > > > lpj_tsc one and increasing complexity.
> > >
> > > It turned out that I can boot a kernel with different config with
> > > HZ == 250 just fine and switching to HZ == 1000 makes it fail.
> > >
> > >
> > > Looking into it some more:
> > >
> > > HZ == 250 kernel (good):
> > >
> > > Calibrating delay loop (skipped), value calculated using timer frequency.. 2986.79 BogoMIPS (lpj=5973580)
> > >
> > > HZ == 1000 kernel (bad):
> > >
> > > Calibrating delay loop (skipped), using tsc calculated value.. 2990.35 BogoMIPS (lpj=1495176)
> > >
> > > HZ == 1000 kernel with hackyfix (good):
> > >
> > > Calibrating delay using timer specific routine.. 3016.68 BogoMIPS (lpj=6033376)
> > >
> > >
> > > Argggh... lpj is used for udelay() & friends so this bug is quite
> > > dangerous (since udelay() & friends are used for hardware delays)...
> >
> > It may not be as severe as I initially thought;
> > (obviously) real hardware works fine:
> >
> > calibrate_delay_direct(): lpj=1495884
> > Calibrating delay loop (skipped), value calculated using timer frequency.. 2990.36 BogoMIPS (lpj=1495183)
> >
> > So the issue only affects qemu ATM.
> Oh, so it's on an emulator; is something wrong in the timer emulation logic in
> qemu?
Probably. I have now noticed that the problem happens only with an HZ == 250
host and an HZ == 1000 guest. When the host and guest use the same HZ setting,
everything works fine.
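
For reference, the BogoMIPS figures in the logs quoted above can be reproduced from the lpj values and HZ. The following is a minimal sketch in Python, assuming the integer formula the kernel uses when it prints the calibration line; the function names bogomips() and loops_per_usec() are made up for illustration and are not kernel symbols.

```python
def bogomips(lpj: int, hz: int) -> str:
    """Format BogoMIPS from loops_per_jiffy, using integer arithmetic.

    Assumption: mirrors the kernel's calibration printout, i.e.
    lpj / (500000 / HZ) for the whole part and
    (lpj / (5000 / HZ)) % 100 for the two-digit fraction.
    """
    whole = lpj // (500000 // hz)
    frac = (lpj // (5000 // hz)) % 100
    return f"{whole}.{frac:02d}"


def loops_per_usec(lpj: int, hz: int) -> float:
    """Rough number of delay-loop iterations per microsecond.

    One jiffy lasts 1/HZ seconds, so lpj * HZ is loops per second.
    """
    return lpj * hz / 1_000_000


if __name__ == "__main__":
    # lpj values taken from the log lines quoted in this thread.
    print(bogomips(5973580, 250))    # 2986.79 (HZ == 250 guest)
    print(bogomips(1495176, 1000))   # 2990.35 (HZ == 1000 guest)
    print(bogomips(1495183, 1000))   # 2990.36 (real hardware)
    # The same CPU should give about the same loops per microsecond
    # regardless of HZ, so lpj itself scales with 1/HZ.
    print(round(loops_per_usec(5973580, 250)))
    print(round(loops_per_usec(1495176, 1000)))
```

Since udelay() & friends derive their loop counts from lpj, an lpj that does not match the HZ the guest actually ticks at would scale every such delay by the same factor.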
Thanks,
Bart