* [PATCH 1/5] Skip timer works.patch
@ 2006-10-20 0:09 Zachary Amsden
2006-10-27 14:56 ` Andi Kleen
2006-11-16 22:53 ` Andrew Morton
0 siblings, 2 replies; 15+ messages in thread
From: Zachary Amsden @ 2006-10-20 0:09 UTC (permalink / raw)
To: Andrew Morton, Rusty Russell, Andi Kleen, Jeremy Fitzhardinge,
Chris Wright, Virtualization Mailing List,
Linux Kernel Mailing List, Zachary Amsden
Add a way to disable the timer IRQ routing check via a boot option. The
VMI timer code uses this to avoid triggering the pester Mingo code, which
probes for some very unusual and broken motherboard routings. It fires
100% of the time when using a paravirtual delay mechanism instead of
using a realtime delay, since there is no elapsed real time, and the 4 timer
IRQs have not yet been delivered.
In addition, it is entirely possible, though improbable, that this bug
could surface on real hardware which picks a particularly bad time to enter
SMM mode, causing a long latency during one of the timer IRQs.
While here, make check_timer be __init.
Signed-off-by: Zachary Amsden <zach@vmware.com>
===================================================================
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -603,8 +603,6 @@ and is between 256 and 4096 characters.
hugepages= [HW,IA-32,IA-64] Maximal number of HugeTLB pages.
- noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing
-
i8042.direct [HW] Put keyboard port into non-translated mode
i8042.dumbkbd [HW] Pretend that controller can only read data from
keyboard and cannot control its state
@@ -1060,8 +1058,13 @@ and is between 256 and 4096 characters.
in certain environments such as networked servers or
real-time systems.
+ noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing
+
noirqdebug [IA-32] Disables the code which attempts to detect and
disable unhandled interrupt sources.
+
+ noirqtest [IA-32,APIC] Disables the code which tests for broken
+ timer IRQ sources.
noisapnp [ISAPNP] Disables ISA PnP code.
===================================================================
--- a/arch/i386/kernel/io_apic.c
+++ b/arch/i386/kernel/io_apic.c
@@ -1864,6 +1864,15 @@ static void __init setup_ioapic_ids_from
static void __init setup_ioapic_ids_from_mpc(void) { }
#endif
+int timer_irq_really_works __initdata;
+int __init irqtest_disable(char *str)
+{
+ timer_irq_really_works = 1;
+ return 1;
+}
+
+__setup("noirqtest", irqtest_disable);
+
/*
* There is a nasty bug in some older SMP boards, their mptable lies
* about the timer IRQ. We do the following to work around the situation:
@@ -1872,9 +1881,12 @@ static void __init setup_ioapic_ids_from
* - if this function detects that timer IRQs are defunct, then we fall
* back to ISA timer IRQs
*/
-static int __init timer_irq_works(void)
+int __init timer_irq_works(void)
{
unsigned long t1 = jiffies;
+
+ if (timer_irq_really_works)
+ return 1;
local_irq_enable();
/* Let ten ticks pass... */
@@ -2146,7 +2158,7 @@ int timer_uses_ioapic_pin_0;
* is so screwy. Thanks to Brian Perkins for testing/hacking this beast
* fanatically on his truly buggy board.
*/
-static inline void check_timer(void)
+static inline void __init check_timer(void)
{
int apic1, pin1, apic2, pin2;
int vector;
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH 1/5] Skip timer works.patch 2006-10-20 0:09 [PATCH 1/5] Skip timer works.patch Zachary Amsden @ 2006-10-27 14:56 ` Andi Kleen 2006-10-27 19:09 ` Zachary Amsden 2006-11-16 22:53 ` Andrew Morton 1 sibling, 1 reply; 15+ messages in thread From: Andi Kleen @ 2006-10-27 14:56 UTC (permalink / raw) To: Zachary Amsden Cc: Andrew Morton, Rusty Russell, Jeremy Fitzhardinge, Chris Wright, Virtualization Mailing List, Linux Kernel Mailing List On Thu, Oct 19, 2006 at 05:09:22PM -0700, Zachary Amsden wrote: > Add a way to disable the timer IRQ routing check via a boot option. The > VMI timer code uses this to avoid triggering the pester Mingo code, which > probes for some very unusual and broken motherboard routings. It fires > 100% of the time when using a paravirtual delay mechanism instead of > using a realtime delay, since there is no elapsed real time, and the 4 timer > IRQs have not yet been delivered. You mean paravirtualized udelay will not actually wait? This implies that you can't ever use any real timer in that kind of guest, right? > > In addition, it is entirely possible, though improbable, that this bug > could surface on real hardware which picks a particularly bad time to enter > SMM mode, causing a long latency during one of the timer IRQs. We already have a no timer check option. But: > > While here, make check_timer be __init. So how is this supposed to work? The hypervisor would always pass that option? If yes that would seem rather hackish to me. We should probably instead probe in some way if we have the required timer hardware. The paravirt kernel should know anyways it is paravirt and that it doesn't need to probe for flakey hardware. -Andi ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-10-27 14:56 ` Andi Kleen @ 2006-10-27 19:09 ` Zachary Amsden 2006-10-27 21:16 ` Andi Kleen 0 siblings, 1 reply; 15+ messages in thread From: Zachary Amsden @ 2006-10-27 19:09 UTC (permalink / raw) To: Andi Kleen Cc: Andrew Morton, Rusty Russell, Jeremy Fitzhardinge, Chris Wright, Virtualization Mailing List, Linux Kernel Mailing List Andi Kleen wrote: > On Thu, Oct 19, 2006 at 05:09:22PM -0700, Zachary Amsden wrote: > >> Add a way to disable the timer IRQ routing check via a boot option. The >> VMI timer code uses this to avoid triggering the pester Mingo code, which >> probes for some very unusual and broken motherboard routings. It fires >> 100% of the time when using a paravirtual delay mechanism instead of >> using a realtime delay, since there is no elapsed real time, and the 4 timer >> IRQs have not yet been delivered. >> > > You mean paravirtualized udelay will not actually wait? > Yes, but even putting that problem aside, the timing element here is tricky to get right in a VM. > This implies that you can't ever use any real timer in that kind of guest, > right? > No. You can use a real timer just fine. But there is no reason ever to use udelay to busy wait for "hardware" in a virtual machine. Drivers which are used for real hardware may turn udelay back on selectively; but this is another patch. >> In addition, it is entirely possible, though improbable, that this bug >> could surface on real hardware which picks a particularly bad time to enter >> SMM mode, causing a long latency during one of the timer IRQs. >> > > We already have a no timer check option. But: > Really? I didn't see one that disabled the broken motherboard detection / workaround code, which is what we are trying to avoid here. >> While here, make check_timer be __init. >> > > So how is this supposed to work? The hypervisor would always pass that > option? If yes that would seem rather hackish to me. We should probably > instead probe in some way if we have the required timer hardware. > The paravirt kernel should know anyways it is paravirt and that it doesn't > need to probe for flakey hardware. > That is what this patch is building towards, but the boot option is "free", so why not? In the meantime, it helps non-paravirt kernels booted in a VM. Zach ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-10-27 19:09 ` Zachary Amsden @ 2006-10-27 21:16 ` Andi Kleen 2006-10-30 20:54 ` Zachary Amsden 0 siblings, 1 reply; 15+ messages in thread From: Andi Kleen @ 2006-10-27 21:16 UTC (permalink / raw) To: virtualization Cc: Zachary Amsden, Andi Kleen, Andrew Morton, Chris Wright, Linux Kernel Mailing List On Friday 27 October 2006 12:09, Zachary Amsden wrote: > Andi Kleen wrote: > > On Thu, Oct 19, 2006 at 05:09:22PM -0700, Zachary Amsden wrote: > >> Add a way to disable the timer IRQ routing check via a boot option. The > >> VMI timer code uses this to avoid triggering the pester Mingo code, > >> which probes for some very unusual and broken motherboard routings. It > >> fires 100% of the time when using a paravirtual delay mechanism instead > >> of using a realtime delay, since there is no elapsed real time, and the > >> 4 timer IRQs have not yet been delivered. > > > > You mean paravirtualized udelay will not actually wait? > > Yes, but even putting that problem aside, the timing element here is > tricky to get right in a VM. > > > This implies that you can't ever use any real timer in that kind of > > guest, right? > > No. You can use a real timer just fine. But there is no reason ever to > use udelay to busy wait for "hardware" in a virtual machine. Drivers > which are used for real hardware may turn udelay back on selectively; > but this is another patch. > > >> In addition, it is entirely possible, though improbable, that this bug > >> could surface on real hardware which picks a particularly bad time to > >> enter SMM mode, causing a long latency during one of the timer IRQs. > > > > We already have a no timer check option. But: > > Really? I didn't see one that disabled the broken motherboard detection > / workaround code, which is what we are trying to avoid here. no_timer_check. But it's only there on x86-64 in mainline - although there were some patches to add it to i386 too. > That is what this patch is building towards, but the boot option is > "free", so why not? In the meantime, it helps non-paravirt kernels > booted in a VM. Hmm, you meant they paniced before? If they just fail a few tests that is not particularly worrying (real hardware does that often too) -Andi ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-10-27 21:16 ` Andi Kleen @ 2006-10-30 20:54 ` Zachary Amsden 2006-10-30 22:50 ` Andi Kleen 0 siblings, 1 reply; 15+ messages in thread From: Zachary Amsden @ 2006-10-30 20:54 UTC (permalink / raw) To: Andi Kleen Cc: virtualization, Andi Kleen, Andrew Morton, Chris Wright, Linux Kernel Mailing List Andi Kleen wrote: > no_timer_check. But it's only there on x86-64 in mainline - although there > were some patches to add it to i386 too. > I can rename to match the x86-64 name. >> That is what this patch is building towards, but the boot option is >> "free", so why not? In the meantime, it helps non-paravirt kernels >> booted in a VM. >> > > Hmm, you meant they paniced before? If they just fail a few tests > that is not particularly worrying (real hardware does that often too) > Yes, they sometimes fail to boot, and the failure message used to ask us to pester mingo. Zach ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-10-30 20:54 ` Zachary Amsden @ 2006-10-30 22:50 ` Andi Kleen 2006-10-30 23:09 ` Zachary Amsden 2006-11-15 8:03 ` Chris Wright 0 siblings, 2 replies; 15+ messages in thread From: Andi Kleen @ 2006-10-30 22:50 UTC (permalink / raw) To: Zachary Amsden Cc: Andi Kleen, virtualization, Andrew Morton, Chris Wright, Linux Kernel Mailing List On Mon, Oct 30, 2006 at 12:54:55PM -0800, Zachary Amsden wrote: > Andi Kleen wrote: > >no_timer_check. But it's only there on x86-64 in mainline - although there > >were some patches to add it to i386 too. > > > > I can rename to match the x86-64 name. I will do that in my tree. > >>That is what this patch is building towards, but the boot option is > >>"free", so why not? In the meantime, it helps non-paravirt kernels > >>booted in a VM. > >> > > > >Hmm, you meant they paniced before? If they just fail a few tests > >that is not particularly worrying (real hardware does that often too) > > > > Yes, they sometimes fail to boot, and the failure message used to ask us > to pester mingo. I still think we should figure that out automatically. Letting the Hypervisor pass magic boot options seems somehow unclean. But i suppose it will only work for the paravirtualized case, not for the case of kernel running "native" under a hypervisor I suppose? Or does that one not panic? -Andi ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-10-30 22:50 ` Andi Kleen @ 2006-10-30 23:09 ` Zachary Amsden 2006-10-30 23:12 ` Andi Kleen 2006-11-15 8:03 ` Chris Wright 1 sibling, 1 reply; 15+ messages in thread From: Zachary Amsden @ 2006-10-30 23:09 UTC (permalink / raw) To: Andi Kleen Cc: Andi Kleen, virtualization, Andrew Morton, Chris Wright, Linux Kernel Mailing List Andi Kleen wrote: > On Mon, Oct 30, 2006 at 12:54:55PM -0800, Zachary Amsden wrote: > >> Andi Kleen wrote: >> >>> no_timer_check. But it's only there on x86-64 in mainline - although there >>> were some patches to add it to i386 too. >>> >>> >> I can rename to match the x86-64 name. >> > > I will do that in my tree. > > >>>> That is what this patch is building towards, but the boot option is >>>> "free", so why not? In the meantime, it helps non-paravirt kernels >>>> booted in a VM. >>>> >>>> >>> Hmm, you meant they paniced before? If they just fail a few tests >>> that is not particularly worrying (real hardware does that often too) >>> >>> >> Yes, they sometimes fail to boot, and the failure message used to ask us >> to pester mingo. >> > > I still think we should figure that out automatically. Letting > the Hypervisor pass magic boot options seems somehow unclean. > > But i suppose it will only work for the paravirtualized case, > not for the case of kernel running "native" under a hypervisor > I suppose? Or does that one not panic? > That is the one that can panic, for now. Fixing the paravirtualized case is easy, but we can't assume paravirtualization just yet. Zach ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-10-30 23:09 ` Zachary Amsden @ 2006-10-30 23:12 ` Andi Kleen 2006-10-30 23:24 ` Zachary Amsden 0 siblings, 1 reply; 15+ messages in thread From: Andi Kleen @ 2006-10-30 23:12 UTC (permalink / raw) To: Zachary Amsden Cc: Andi Kleen, virtualization, Andrew Morton, Chris Wright, Linux Kernel Mailing List > That is the one that can panic, for now. Fixing the paravirtualized > case is easy, but we can't assume paravirtualization just yet. Hmm, this means standard vmware boot is not reliable unless that magic option is set? That doesn't sound good. -Andi ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-10-30 23:12 ` Andi Kleen @ 2006-10-30 23:24 ` Zachary Amsden 2006-10-30 23:50 ` Andi Kleen 0 siblings, 1 reply; 15+ messages in thread From: Zachary Amsden @ 2006-10-30 23:24 UTC (permalink / raw) To: Andi Kleen Cc: Andi Kleen, virtualization, Andrew Morton, Chris Wright, Linux Kernel Mailing List Andi Kleen wrote: >> That is the one that can panic, for now. Fixing the paravirtualized >> case is easy, but we can't assume paravirtualization just yet. >> > > Hmm, this means standard vmware boot is not reliable unless that magic option > is set? That doesn't sound good. > It doesn't happen often, but it is a possibility that the kernel calibrates the delay wrong because of timing glitches caused by CPU migration, paging, or other phenomena which are supposed to be transparent to the kernel (but cause temporal lapse). In that case, the kernel may not make enough progress in a spin delay loop to properly reach the number of microseconds required for N number of timer ticks to occur. In theory this can happen on a real machine, as SMM mode could be active, doing USB device emulation or something that takes a while during the lpj calibration and throwing the computation off. By changing the parameters (N ticks at K Hz in T seconds), it is easy to create an unstable measurement that can achieve high failure rates, although in practice the Linux parameters appear to be reasonable enough that it is not a major problem. Zach ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-10-30 23:24 ` Zachary Amsden @ 2006-10-30 23:50 ` Andi Kleen 0 siblings, 0 replies; 15+ messages in thread From: Andi Kleen @ 2006-10-30 23:50 UTC (permalink / raw) To: Zachary Amsden Cc: virtualization, Andrew Morton, Chris Wright, Linux Kernel Mailing List > It doesn't happen often, but it is a possibility that the kernel > calibrates the delay wrong because of timing glitches caused by CPU > migration, paging, or other phenomena which are supposed to be > transparent to the kernel (but cause temporal lapse). We're supposed to handle those because they happen on real hardware too with long running SMM handlers. Or at least there was a effort some time ago to do this. If it wasn't enough we'll likely need to fix the code. > In that case, the > kernel may not make enough progress in a spin delay loop to properly > reach the number of microseconds required for N number of timer ticks to > occur. Hmm, mdelay is polling RDTSC and assumes it makes forward progress and waits until the time that was estimated at the original TSC<->PIT calibration passed. While there is a spin loop it is definitely polling a timer that is supposed to tick properly even in virtualization. You're saying that doesn't work on vmware? Does it have trouble with RDTSC? Anyways if polling against TSC doesn't work I suppose we could change it to poll against some other timer. > In theory this can happen on a real machine, as SMM mode could > be active, doing USB device emulation or something that takes a while > during the lpj calibration and throwing the computation off. Yep. > By changing the parameters (N ticks at K Hz in T seconds), it is easy to > create an unstable measurement that can achieve high failure rates, > although in practice the Linux parameters appear to be reasonable enough > that it is not a major problem. Hmm, why exactly? -Andi ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-10-30 22:50 ` Andi Kleen 2006-10-30 23:09 ` Zachary Amsden @ 2006-11-15 8:03 ` Chris Wright 1 sibling, 0 replies; 15+ messages in thread From: Chris Wright @ 2006-11-15 8:03 UTC (permalink / raw) To: Andi Kleen Cc: Zachary Amsden, Andi Kleen, virtualization, Andrew Morton, Chris Wright, Linux Kernel Mailing List * Andi Kleen (ak@muc.de) wrote: > On Mon, Oct 30, 2006 at 12:54:55PM -0800, Zachary Amsden wrote: > > Andi Kleen wrote: > > >no_timer_check. But it's only there on x86-64 in mainline - although there > > >were some patches to add it to i386 too. > > > > > > > I can rename to match the x86-64 name. > > I will do that in my tree. Looks like this one might have been lost. Here's the renamed version (it's against 2.6.19-rc5-mm2, which should have your tree in it). thanks, -chris -- From: Zachary Amsden <zach@vmware.com> Add a way to disable the timer IRQ routing check via a boot option. The VMI timer code uses this to avoid triggering the pester Mingo code, which probes for some very unusual and broken motherboard routings. It fires 100% of the time when using a paravirtual delay mechanism instead of using a realtime delay, since there is no elapsed real time, and the 4 timer IRQs have not yet been delivered. In addition, it is entirely possible, though improbable, that this bug could surface on real hardware which picks a particularly bad time to enter SMM mode, causing a long latency during one of the timer IRQs. While here, make check_timer be __init. Signed-off-by: Zachary Amsden <zach@vmware.com> [chrisw: use no_timer_check to bring inline with x86_64 as per Andi's request] Signed-off-by: Chris Wright <chrisw@sous-sol.org> =================================================================== --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -610,8 +610,6 @@ and is between 256 and 4096 characters. hugepages= [HW,IA-32,IA-64] Maximal number of HugeTLB pages. - noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing - i8042.direct [HW] Put keyboard port into non-translated mode i8042.dumbkbd [HW] Pretend that controller can only read data from keyboard and cannot control its state @@ -1081,8 +1079,13 @@ and is between 256 and 4096 characters. in certain environments such as networked servers or real-time systems. + noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing + noirqdebug [IA-32] Disables the code which attempts to detect and disable unhandled interrupt sources. + + no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for + broken timer IRQ sources. noisapnp [ISAPNP] Disables ISA PnP code. =================================================================== --- a/arch/i386/kernel/io_apic.c +++ b/arch/i386/kernel/io_apic.c @@ -1931,6 +1931,15 @@ static void __init setup_ioapic_ids_from static void __init setup_ioapic_ids_from_mpc(void) { } #endif +static int no_timer_check __initdata; + +static int __init notimercheck(char *s) +{ + no_timer_check = 1; + return 1; +} +__setup("no_timer_check", notimercheck); + /* * There is a nasty bug in some older SMP boards, their mptable lies * about the timer IRQ. We do the following to work around the situation: @@ -1939,9 +1948,12 @@ static void __init setup_ioapic_ids_from * - if this function detects that timer IRQs are defunct, then we fall * back to ISA timer IRQs */ -static int __init timer_irq_works(void) +int __init timer_irq_works(void) { unsigned long t1 = jiffies; + + if (no_timer_check) + return 1; local_irq_enable(); /* Let ten ticks pass... */ @@ -2219,7 +2231,7 @@ int timer_uses_ioapic_pin_0; * is so screwy. Thanks to Brian Perkins for testing/hacking this beast * fanatically on his truly buggy board. */ -static inline void check_timer(void) +static inline void __init check_timer(void) { int apic1, pin1, apic2, pin2; int vector; ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-10-20 0:09 [PATCH 1/5] Skip timer works.patch Zachary Amsden 2006-10-27 14:56 ` Andi Kleen @ 2006-11-16 22:53 ` Andrew Morton 2006-11-16 23:08 ` Zachary Amsden ` (2 more replies) 1 sibling, 3 replies; 15+ messages in thread From: Andrew Morton @ 2006-11-16 22:53 UTC (permalink / raw) To: Zachary Amsden Cc: Rusty Russell, Andi Kleen, Jeremy Fitzhardinge, Chris Wright, Virtualization Mailing List, Linux Kernel Mailing List On Thu, 19 Oct 2006 17:09:22 -0700 Zachary Amsden <zach@vmware.com> wrote: > Add a way to disable the timer IRQ routing check via a boot option. The > VMI timer code uses this to avoid triggering the pester Mingo code, which > probes for some very unusual and broken motherboard routings. It fires > 100% of the time when using a paravirtual delay mechanism instead of > using a realtime delay, since there is no elapsed real time, and the 4 timer > IRQs have not yet been delivered. > > In addition, it is entirely possible, though improbable, that this bug > could surface on real hardware which picks a particularly bad time to enter > SMM mode, causing a long latency during one of the timer IRQs. > > While here, make check_timer be __init. > Andi seems to have merged this patch but from somewhere I picked up a different version, below. I think the version I have is better. Because the patch Andi has merged is cast in terms of "irq testing", which is broad. But that's not what the patch does - the patch handles only timers. IOW, this: > + > + noirqtest [IA-32,APIC] Disables the code which tests for broken > + timer IRQ sources. is misleadingly named. This: + no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for + broken timer IRQ sources. + is better, no? But right now, I'll settle for anything which usually compiles. From: Zachary Amsden <zach@vmware.com> Add a way to disable the timer IRQ routing check via a boot option. The VMI timer code uses this to avoid triggering the pester Mingo code, which probes for some very unusual and broken motherboard routings. It fires 100% of the time when using a paravirtual delay mechanism instead of using a realtime delay, since there is no elapsed real time, and the 4 timer IRQs have not yet been delivered. In addition, it is entirely possible, though improbable, that this bug could surface on real hardware which picks a particularly bad time to enter SMM mode, causing a long latency during one of the timer IRQs. While here, make check_timer be __init. Signed-off-by: Zachary Amsden <zach@vmware.com> [chrisw: use no_timer_check to bring inline with x86_64 as per Andi's request] Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> --- Documentation/kernel-parameters.txt | 7 +++++-- arch/i386/kernel/io_apic.c | 16 ++++++++++++++-- 2 files changed, 19 insertions(+), 4 deletions(-) diff -puN arch/i386/kernel/io_apic.c~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option arch/i386/kernel/io_apic.c --- a/arch/i386/kernel/io_apic.c~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option +++ a/arch/i386/kernel/io_apic.c @@ -1930,6 +1930,15 @@ static void __init setup_ioapic_ids_from static void __init setup_ioapic_ids_from_mpc(void) { } #endif +static int no_timer_check __initdata; + +static int __init notimercheck(char *s) +{ + no_timer_check = 1; + return 1; +} +__setup("no_timer_check", notimercheck); + /* * There is a nasty bug in some older SMP boards, their mptable lies * about the timer IRQ. We do the following to work around the situation: @@ -1938,10 +1947,13 @@ static void __init setup_ioapic_ids_from * - if this function detects that timer IRQs are defunct, then we fall * back to ISA timer IRQs */ -static int __init timer_irq_works(void) +int __init timer_irq_works(void) { unsigned long t1 = jiffies; + if (no_timer_check) + return 1; + local_irq_enable(); /* Let ten ticks pass... */ mdelay((10 * 1000) / HZ); @@ -2212,7 +2224,7 @@ int timer_uses_ioapic_pin_0; * is so screwy. Thanks to Brian Perkins for testing/hacking this beast * fanatically on his truly buggy board. */ -static inline void check_timer(void) +static inline void __init check_timer(void) { int apic1, pin1, apic2, pin2; int vector; diff -puN Documentation/kernel-parameters.txt~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option Documentation/kernel-parameters.txt --- a/Documentation/kernel-parameters.txt~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option +++ a/Documentation/kernel-parameters.txt @@ -599,8 +599,6 @@ and is between 256 and 4096 characters. hugepages= [HW,IA-32,IA-64] Maximal number of HugeTLB pages. - noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing - i8042.direct [HW] Put keyboard port into non-translated mode i8042.dumbkbd [HW] Pretend that controller can only read data from keyboard and cannot control its state @@ -1056,9 +1054,14 @@ and is between 256 and 4096 characters. in certain environments such as networked servers or real-time systems. + noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing + noirqdebug [IA-32] Disables the code which attempts to detect and disable unhandled interrupt sources. + no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for + broken timer IRQ sources. + noisapnp [ISAPNP] Disables ISA PnP code. noinitrd [RAM] Tells the kernel not to load any configured _ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-11-16 22:53 ` Andrew Morton @ 2006-11-16 23:08 ` Zachary Amsden 2006-11-16 23:10 ` Chris Wright 2006-11-17 5:05 ` Andi Kleen 2 siblings, 0 replies; 15+ messages in thread From: Zachary Amsden @ 2006-11-16 23:08 UTC (permalink / raw) To: Andrew Morton Cc: Rusty Russell, Andi Kleen, Jeremy Fitzhardinge, Chris Wright, Virtualization Mailing List, Linux Kernel Mailing List Andrew Morton wrote: > On Thu, 19 Oct 2006 17:09:22 -0700 > Zachary Amsden <zach@vmware.com> wrote: > > >> Add a way to disable the timer IRQ routing check via a boot option. The >> VMI timer code uses this to avoid triggering the pester Mingo code, which >> probes for some very unusual and broken motherboard routings. It fires >> 100% of the time when using a paravirtual delay mechanism instead of >> using a realtime delay, since there is no elapsed real time, and the 4 timer >> IRQs have not yet been delivered. >> >> In addition, it is entirely possible, though improbable, that this bug >> could surface on real hardware which picks a particularly bad time to enter >> SMM mode, causing a long latency during one of the timer IRQs. >> >> While here, make check_timer be __init. >> >> > > Andi seems to have merged this patch but from somewhere I picked up a > different version, below. > > I think the version I have is better. Because the patch Andi has merged is > cast in terms of "irq testing", which is broad. But that's not what the > patch does - the patch handles only timers. > > IOW, this: > > >> + >> + noirqtest [IA-32,APIC] Disables the code which tests for broken >> + timer IRQ sources. >> > > is misleadingly named. This: > > + no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for > + broken timer IRQ sources. > + > > is better, no? > > But right now, I'll settle for anything which usually compiles. > > Yes, the name sucks. There is no real reason to actually have a boot parameter at all once the paravirt / VMI patches are in, but I wanted something to be able to set timer_irq_really_works until then to avoid someone accidentally removing it. Zach ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-11-16 22:53 ` Andrew Morton 2006-11-16 23:08 ` Zachary Amsden @ 2006-11-16 23:10 ` Chris Wright 2006-11-17 5:05 ` Andi Kleen 2 siblings, 0 replies; 15+ messages in thread From: Chris Wright @ 2006-11-16 23:10 UTC (permalink / raw) To: Andrew Morton Cc: Zachary Amsden, Rusty Russell, Andi Kleen, Jeremy Fitzhardinge, Chris Wright, Virtualization Mailing List, Linux Kernel Mailing List * Andrew Morton (akpm@osdl.org) wrote: > Andi seems to have merged this patch but from somewhere I picked up a > different version, below. That was from me. The earlier one had fallen through the cracks, so I picked it up and reworked it (so that it uses no_timer_check) as per Andi's suggestion. > I think the version I have is better. Agreed ;-) > Because the patch Andi has merged is > cast in terms of "irq testing", which is broad. But that's not what the > patch does - the patch handles only timers. > > IOW, this: > > > + > > + noirqtest [IA-32,APIC] Disables the code which tests for broken > > + timer IRQ sources. > > is misleadingly named. This: > > + no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for > + broken timer IRQ sources. > + > > is better, no? Clearer, and also same as x86_64. > But right now, I'll settle for anything which usually compiles. I believe the current round has fixed all the know compilation issues. There's one outstanding runtime issue from Andi, which I believe is due to an old glibc which requires COMPAT_VDSO (and which CONFIG_PARAVIRT must disable). So current set is clean from the reports I have (and assuming above vdso analysis is correct). thanks, -chris -- > From: Zachary Amsden <zach@vmware.com> > > Add a way to disable the timer IRQ routing check via a boot option. The > VMI timer code uses this to avoid triggering the pester Mingo code, which > probes for some very unusual and broken motherboard routings. It fires > 100% of the time when using a paravirtual delay mechanism instead of using > a realtime delay, since there is no elapsed real time, and the 4 timer IRQs > have not yet been delivered. > > In addition, it is entirely possible, though improbable, that this bug > could surface on real hardware which picks a particularly bad time to enter > SMM mode, causing a long latency during one of the timer IRQs. > > While here, make check_timer be __init. > > Signed-off-by: Zachary Amsden <zach@vmware.com> > [chrisw: use no_timer_check to bring inline with x86_64 as per Andi's request] > Signed-off-by: Chris Wright <chrisw@sous-sol.org> > Cc: Andi Kleen <ak@suse.de> > Signed-off-by: Andrew Morton <akpm@osdl.org> > --- > > Documentation/kernel-parameters.txt | 7 +++++-- > arch/i386/kernel/io_apic.c | 16 ++++++++++++++-- > 2 files changed, 19 insertions(+), 4 deletions(-) > > diff -puN arch/i386/kernel/io_apic.c~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option arch/i386/kernel/io_apic.c > --- a/arch/i386/kernel/io_apic.c~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option > +++ a/arch/i386/kernel/io_apic.c > @@ -1930,6 +1930,15 @@ static void __init setup_ioapic_ids_from > static void __init setup_ioapic_ids_from_mpc(void) { } > #endif > > +static int no_timer_check __initdata; > + > +static int __init notimercheck(char *s) > +{ > + no_timer_check = 1; > + return 1; > +} > +__setup("no_timer_check", notimercheck); > + > /* > * There is a nasty bug in some older SMP boards, their mptable lies > * about the timer IRQ. We do the following to work around the situation: > @@ -1938,10 +1947,13 @@ static void __init setup_ioapic_ids_from > * - if this function detects that timer IRQs are defunct, then we fall > * back to ISA timer IRQs > */ > -static int __init timer_irq_works(void) > +int __init timer_irq_works(void) > { > unsigned long t1 = jiffies; > > + if (no_timer_check) > + return 1; > + > local_irq_enable(); > /* Let ten ticks pass... */ > mdelay((10 * 1000) / HZ); > @@ -2212,7 +2224,7 @@ int timer_uses_ioapic_pin_0; > * is so screwy. Thanks to Brian Perkins for testing/hacking this beast > * fanatically on his truly buggy board. > */ > -static inline void check_timer(void) > +static inline void __init check_timer(void) > { > int apic1, pin1, apic2, pin2; > int vector; > diff -puN Documentation/kernel-parameters.txt~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option Documentation/kernel-parameters.txt > --- a/Documentation/kernel-parameters.txt~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option > +++ a/Documentation/kernel-parameters.txt > @@ -599,8 +599,6 @@ and is between 256 and 4096 characters. > > hugepages= [HW,IA-32,IA-64] Maximal number of HugeTLB pages. > > - noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing > - > i8042.direct [HW] Put keyboard port into non-translated mode > i8042.dumbkbd [HW] Pretend that controller can only read data from > keyboard and cannot control its state > @@ -1056,9 +1054,14 @@ and is between 256 and 4096 characters. > in certain environments such as networked servers or > real-time systems. > > + noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing > + > noirqdebug [IA-32] Disables the code which attempts to detect and > disable unhandled interrupt sources. > > + no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for > + broken timer IRQ sources. > + > noisapnp [ISAPNP] Disables ISA PnP code. > > noinitrd [RAM] Tells the kernel not to load any configured > _ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] Skip timer works.patch 2006-11-16 22:53 ` Andrew Morton 2006-11-16 23:08 ` Zachary Amsden 2006-11-16 23:10 ` Chris Wright @ 2006-11-17 5:05 ` Andi Kleen 2 siblings, 0 replies; 15+ messages in thread From: Andi Kleen @ 2006-11-17 5:05 UTC (permalink / raw) To: Andrew Morton Cc: Zachary Amsden, Rusty Russell, Jeremy Fitzhardinge, Chris Wright, Virtualization Mailing List, Linux Kernel Mailing List > Andi seems to have merged this patch but from somewhere I picked up a > different version, below. > > I think the version I have is better. Because the patch Andi has merged is > cast in terms of "irq testing", which is broad. But that's not what the > patch does - the patch handles only timers. Agreed. I updated to your version now. -Andi ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2006-11-17 5:05 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-10-20 0:09 [PATCH 1/5] Skip timer works.patch Zachary Amsden 2006-10-27 14:56 ` Andi Kleen 2006-10-27 19:09 ` Zachary Amsden 2006-10-27 21:16 ` Andi Kleen 2006-10-30 20:54 ` Zachary Amsden 2006-10-30 22:50 ` Andi Kleen 2006-10-30 23:09 ` Zachary Amsden 2006-10-30 23:12 ` Andi Kleen 2006-10-30 23:24 ` Zachary Amsden 2006-10-30 23:50 ` Andi Kleen 2006-11-15 8:03 ` Chris Wright 2006-11-16 22:53 ` Andrew Morton 2006-11-16 23:08 ` Zachary Amsden 2006-11-16 23:10 ` Chris Wright 2006-11-17 5:05 ` Andi Kleen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).