From: Thomas Gleixner <tglx@linutronix.de>
To: Len Brown <lenb@kernel.org>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>,
linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
Andrew Morton <akpm@linux-foundation.org>,
Adrian Bunk <bunk@stusta.de>,
Arjan van de Ven <arjan@infradead.org>
Subject: Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
Date: Sat, 17 Mar 2007 10:56:44 +0100 [thread overview]
Message-ID: <1174125404.13341.396.camel@localhost.localdomain> (raw)
In-Reply-To: <200703162132.53906.lenb@kernel.org>
Len,
On Fri, 2007-03-16 at 21:32 -0400, Len Brown wrote:
> > > [ 36.433917] APIC timer disabled due to verification failure.
> > >
> > > And NO_HZ is disabled due to that (I get 1000/s timer's interrupts)
> > > I haven't investigated that yet.
> > > It looks like another new test that my hardware fails to perform...
> >
> > Yes, this is probably caused by SMM code trying to emulate a PS/2
> > keyboard from a (maybe connected or not) USB keyboard. Unfortunately we
> > have no way to disable this BIOS misfeature in the early boot process.
> > Arjan, Len ?????
>
> Nope. By definition, SMM is invisible to the OS -- we don't even
> get a bit that said it occurred (though we'd like one -- it would
> be really helpful to diagnose issues like this one)
I know that it is invisible. Nevertheless I know that the BIOSes emulate
PS/2 keyboards from USB via SMM during the boot process until we call
the usb_handoff function.
> So go into BIOS SETUP and see if there is a USB Legacy Emulation
> feature that you can disable. Sometimes there is not, but disabling
> onboard USB altogether may help at least prove the issue is in that area.
I have more than one box (even original Intel mainboards), where either
plugging a PS/2 keyboard or switching off USB makes this problem go
away.
> > I built in this test to rule out bogus LAPIC timer calibration values
> > which are sometimes off by factor 2-10.
> >
> > But I also built in a calibration against the PM-Timer, which turned out
> > to be quite reliable and I think the additional verification step is
> > only necessary for sytems without PM-Timer.
> >
> > That was a bit over cautious from my side. I send a patch to avoid this
> > when PM-Timer is available in a separate mail.
>
> PM-Timer was invented to work-around the issue that the TSC became unreliable
> in the face of power management on laptops. In particular, to be able
> to time duration of OS idle where TSC stopped.
>
> While it is not fine grain, and it is not low-latency, is should
> be very reliable. My understanding is that it is implemented as
> a simple divider right off the system 14MHz clock -- the signal
> which most motherboard clocks are PLL multiplied up from --
> including the 100MHz front-side bus which drives the LAPIC timer.
>
> But that said, I don't understand why calibrating the LAPIC timer
> using the PM-timer is going to be more reliable -- exactly how
> and why did the previous calibration scheme fail?
> Maybe I could follow the new logic in apic.c if I saw the "apic=debug"
> output for this box.
calibrating APIC timer ...
... lapic delta = 2426884
... PM timer delta = 833908
APIC calibration PIT not consistent with PM Timer: 232ms instead of 100ms
APIC delta adjusted to PM-Timer: 1041737 (2426884)
..... delta 1041737
..... mult: 44749065
..... calibration result: 166677
..... CPU clock speed is 4659.0624 MHz.
..... host bus clock speed is 166.0677 MHz.
This box is off by factor 2.3 and using the PM-Timer instead of the
PIT/jiffies values gives me a correct result.
Another one:
APIC calibration not consistent with PM Timer: 2020ms instead of 100ms
APIC delta adjusted to PM-Timer: 1254436 (25341111)
Off by factor 20 !!
The original APIC timer calibration did:
local_irq_disable();
wait_until_pit_underflows();
t1 = read_apic_counter();
for (i = 0; i < HZ/10; i++)
wait_until_pit_underflows();
t2 = read_apic_counter();
and calculated the APIC timer frequency from the delta of t1 and t2 vs.
the 100ms time.
This had 2 problems:
1. It gave results, which are off by factor 2-10 on a couple of boxen.
2. Some systems stop there dead as the PIT readout is broken.
I changed it to do:
local_irq_disable();
original_pit_handler = pit->handler;
pit->handler = lapic_calibration_handler;
loops = 0;
local_irq_enable();
wait_until_handler_has_done_HZ/10_loops();
The handler does:
if (!loops++) {
t1_apic = read_apic_counter();
t1_jiffies = jiffies;
t1_pm = read_pm_timer();
}
if (loops == HZ/10) {
t2_apic = read_apic_counter();
t2_jiffies = jiffies;
t2_pm = read_pm_timer();
done = 1;
}
If the pmtimer is available, then calculate the APIC timer frequency
from the t1_pm/t2_pm delta, otherwise use jiffies.
When pm_timer is there, we can trust the calculated value, if not we do
a verify run of the periodic apic timer and the pit timer. If this fails
- and it fails often due to the SMM crap - then I use the PIT and IPIs.
In the first version I did a verification run even when pm_timer was
there, but this produced false positives as well, because the lapic
timer interrupt is in the same way delayed as the PIT interrupt. I
removed this to avoid unnecessary switching to IPIs after I verified,
that it always produced false positives when the calibration was done
against the PM-Timer.
tglx
next prev parent reply other threads:[~2007-03-17 9:49 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-03-16 10:30 [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Maxim Levitsky
2007-03-16 23:19 ` Len Brown
2007-03-17 23:00 ` Maxim
2007-03-17 23:32 ` Thomas Gleixner
2007-12-30 7:50 ` Mike Galbraith
2007-12-30 14:53 ` Ingo Molnar
2007-03-20 11:54 ` sysfs ugly timer interface (was Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far) Pavel Machek
2007-03-22 15:28 ` Greg KH
2007-03-22 15:41 ` Thomas Gleixner
2007-03-23 1:24 ` sysfs q [was: sysfs ugly timer interface] Jan Engelhardt
2007-03-23 4:48 ` Greg KH
2007-03-23 6:05 ` Jan Engelhardt
2007-03-16 23:39 ` [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Thomas Gleixner
2007-03-17 23:01 ` Maxim
2007-03-16 23:44 ` Thomas Gleixner
2007-03-17 0:04 ` [PATCH] i386: trust the PM-Timer calibration of the local APIC timer Thomas Gleixner
2007-03-17 7:22 ` Ingo Molnar
2007-03-17 13:24 ` Andi Kleen
2007-03-18 8:12 ` Andrew Morton
2007-03-18 8:45 ` Thomas Gleixner
2007-03-17 1:32 ` [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Len Brown
2007-03-17 9:56 ` Thomas Gleixner [this message]
2007-03-17 11:05 ` Thomas Gleixner
2007-03-17 16:52 ` Thomas Gleixner
2007-03-17 10:32 ` Arjan van de Ven
2007-03-17 13:26 ` Andi Kleen
2007-03-20 4:27 ` Greg KH
2007-03-20 6:30 ` Thomas Gleixner
2007-03-20 9:14 ` Arjan van de Ven
2007-03-20 11:36 ` Andi Kleen
2007-03-20 11:41 ` Oliver Neukum
2007-03-17 22:45 ` Maxim
2007-03-20 5:04 ` Lee Revell
2007-03-20 5:36 ` Eric St-Laurent
2007-03-20 9:15 ` Arjan van de Ven
2007-03-20 18:04 ` Andy Lutomirski
2007-03-20 22:58 ` Eric St-Laurent
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1174125404.13341.396.camel@localhost.localdomain \
--to=tglx@linutronix.de \
--cc=akpm@linux-foundation.org \
--cc=arjan@infradead.org \
--cc=bunk@stusta.de \
--cc=lenb@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maximlevitsky@gmail.com \
--cc=mingo@elte.hu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox