From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
To: akataria@vmware.com
Cc: Ingo Molnar <mingo@elte.hu>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Robert Hancock <hancockr@shaw.ca>,
Arjan van de Ven <arjan@infradead.org>,
Pavel Machek <pavel@suse.cz>
Subject: Re: upstream regression (IO-APIC?)
Date: Mon, 3 Nov 2008 20:01:54 +0100
Message-ID: <200811032001.54806.bzolnier@gmail.com>
In-Reply-To: <1225735444.8168.24.camel@alok-dev1>
On Monday 03 November 2008, Alok Kataria wrote:
> On Mon, 2008-11-03 at 09:28 -0800, Bartlomiej Zolnierkiewicz wrote:
> > On Sunday 02 November 2008, Bartlomiej Zolnierkiewicz wrote:
> > > On Sunday 02 November 2008, Bartlomiej Zolnierkiewicz wrote:
> > > > On Thursday 30 October 2008, Robert Hancock wrote:
> > > > > Bartlomiej Zolnierkiewicz wrote:
> > > > > > The current Linus tree as of commit e946217e4fdaa67681bbabfa8e6b18641921f750
> > > > > > is broken for me. I get either the following panic (see log from qemu below)
> > > > > > or lost IRQs on ATA init... Is this a known issue?
> > > > > >
> > > > > > PS The tree that I used before and was supposedly good (sorry, I'm too tired
> > > > > > to verify it now) had commit 57f8f7b60db6f1ed2c6918ab9230c4623a9dbe37 at head.
> > > >
> > > > Unfortunately 57f8f7b60db6f1ed2c6918ab9230c4623a9dbe37 (v2.6.28-rc1)
> > > > is also bad. Bisecting it further was a real pain (i.e. I hit broken
> > > > build with x86 irqbalance changes, broken build with netfilter nat
> > > > changes and jbd journal problem). In the end it turned out that 2.6.27
> > > > is bad too! However with 2.6.27 the panic occurs only once per several
> > > > attempts and if there is no panic kernel boots normally (no lost IRQs).
> > > >
> > > > [...]
> > > >
> > > > I finally managed to narrow it down to change making x86 use tsc_khz
> > > > for loops_per_jiffy -- commit 3da757daf86e498872855f0b5e101f763ba79499
> > > > ("x86: use cpu_khz for loops_per_jiffy calculation"). This approach
> > > > seems too simplistic (as I see now Arjan & Pavel expressed concerns
> > > > about it back when the patch was posted initially [1][2]). Also it
> > > > would probably be preferred to re-use existing preset_lpj variable
> > > > (just like KVM does it for similar purpose [3]) instead of adding a
> > > > lpj_tsc one and increasing complexity.
> > >
> > > It turned out that I can boot a kernel with different config with
> > > HZ == 250 just fine and switching to HZ == 1000 makes it fail.
> > >
> > >
> > > Looking into it some more:
> > >
> > > HZ == 250 kernel (good):
> > >
> > > Calibrating delay loop (skipped), value calculated using timer frequency.. 2986.79 BogoMIPS (lpj=5973580)
> > >
> > > HZ == 1000 kernel (bad):
> > >
> > > Calibrating delay loop (skipped), using tsc calculated value.. 2990.35 BogoMIPS (lpj=1495176)
> > >
> > > HZ == 1000 kernel with hackyfix (good):
> > >
> > > Calibrating delay using timer specific routine.. 3016.68 BogoMIPS (lpj=6033376)
> > >
> > >
> > > Argggh... lpj is used for udelay() & friends so this bug is quite
> > > dangerous (since udelay() & friends are used for hardware delays)...
> >
> > It may be not as severe as I initially thought,
> > (obviously) the real hardware works fine:
> >
> > calibrate_delay_direct(): lpj=1495884
> > Calibrating delay loop (skipped), value calculated using timer frequency.. 2990.36 BogoMIPS (lpj=1495183)
> >
> > So the issue only affects qemu ATM.
> Oh so its on a emulator, something wrong in the timer emulation logic in
> qemu ?
Probably. I have now noticed that the problem happens only with an HZ == 250
host and an HZ == 1000 guest. When the host and the guest use the same HZ
setting, everything works fine.
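
For reference, the arithmetic behind the log lines quoted above can be
sketched roughly as follows. This is only an illustration in Python, not the
kernel code. It assumes that the BogoMIPS figure is printed as
lpj / (500000 / HZ) with integer division, that the TSC-derived value is
tsc_khz * 1000 / HZ, and that a microsecond delay is scaled linearly from lpj.
The function names are hypothetical, and the tsc_khz value is picked only so
that it reproduces the quoted lpj.

```python
# Illustrative sketch only: function names are hypothetical and the
# formulas are assumptions about how the quoted log values are related.

def bogomips(lpj, hz):
    """Format BogoMIPS from loops_per_jiffy, using integer division."""
    whole = lpj // (500000 // hz)
    frac = (lpj // (5000 // hz)) % 100
    return "%d.%02d" % (whole, frac)

def lpj_from_tsc(tsc_khz, hz):
    """Derive loops_per_jiffy from a TSC frequency in kHz (assumed formula)."""
    return (tsc_khz * 1000) // hz

def udelay_loops(usecs, lpj, hz):
    """Busy-loop iterations for a delay of usecs microseconds (simplified)."""
    return usecs * lpj * hz // 1000000

if __name__ == "__main__":
    # lpj values taken from the boot logs quoted above.
    print(bogomips(1495176, 1000))   # 2990.35
    print(bogomips(5973580, 250))    # 2986.79
    # tsc_khz = 1495176 is an assumed value, not taken from any log.
    print(lpj_from_tsc(1495176, 1000))
    print(udelay_loops(10, 1495176, 1000))
```

The two lpj values quoted above (5973580 and 1495176) differ by roughly a
factor of four, which is the same as the ratio between HZ == 1000 and
HZ == 250. Under the linear scaling assumed in this sketch, a delay computed
from one value instead of the other would be off by about the same factor.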
Thanks,
Bart