* [PATCH] I/O APIC: Timer through 8259A revamp
@ 2008-05-18 3:35 Maciej W. Rozycki
2008-05-19 12:30 ` Ingo Molnar
2008-05-19 12:52 ` Andi Kleen
0 siblings, 2 replies; 8+ messages in thread
From: Maciej W. Rozycki @ 2008-05-18 3:35 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner; +Cc: linux-kernel
There is some seemingly unnecessary cruft accumulated in the 8254 timer
setup path for the I/O APIC. Some attempts have been made to cater for
broken BIOSes reporting the ExtINTA I/O APIC pin cascaded from the master
8259A as the native timer interrupt input, which interfere with the setup
sequence in a non-obvious way and require a pair of additional kernel
command-line parameters.
I think this is unnecessary. All the bits are already in place. With
the removal of AEOI bits for non-i82489DX systems the whole setup can be
simplified. Here is a set of changes that reuse the existing code for
potentially broken systems. The observation is broken systems do not have
a notion of ExtINTA pins, so only one pin/apic pair will be determined as
useful for the timer interrupt. Therefore it should be safe to select
this single pin/apic to test both directly and through the 8259A.
This change implements this approach by copying timer pin information
found to the other pin if one has been found only. This way both tests
are always performed. The timer input of the 8259A is carefully unmasked
if needed only and masked again if found unneeded. This is because some
systems seem sensitive for the master 8259A interrupt being active even if
all the APIC inputs this line is wired to are masked.
The "disable_8254_timer" and "enable_8254_timer" kernel parameters are
removed as no longer needed.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
---
It works for my system, but it definitely requires much, much more
testing. The patch depends on patch-2.6.26-rc1-20080505-timer_ack-1 sent
earlier.
Of course these "if (pin? != -1)" statements are useless (though
harmless) now, but I think that change deserves a separate patch not to
obfuscate changes to code which is obscure enough already.
Maciej
patch-2.6.26-rc1-20080505-timer-8259-0
diff -up --recursive --new-file linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/early-quirks.c linux-2.6.26-rc1-20080505/arch/x86/kernel/early-quirks.c
--- linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/early-quirks.c 2008-05-05 02:55:24.000000000 +0000
+++ linux-2.6.26-rc1-20080505/arch/x86/kernel/early-quirks.c 2008-05-17 23:33:42.000000000 +0000
@@ -98,17 +98,6 @@ static void __init nvidia_bugs(int num,
}
-static void __init ati_bugs(int num, int slot, int func)
-{
-#ifdef CONFIG_X86_IO_APIC
- if (timer_over_8254 == 1) {
- timer_over_8254 = 0;
- printk(KERN_INFO
- "ATI board detected. Disabling timer routing over 8254.\n");
- }
-#endif
-}
-
#define QFLAG_APPLY_ONCE 0x1
#define QFLAG_APPLIED 0x2
#define QFLAG_DONE (QFLAG_APPLY_ONCE|QFLAG_APPLIED)
@@ -126,8 +115,6 @@ static struct chipset early_qrk[] __init
PCI_CLASS_BRIDGE_PCI, PCI_ANY_ID, QFLAG_APPLY_ONCE, nvidia_bugs },
{ PCI_VENDOR_ID_VIA, PCI_ANY_ID,
PCI_CLASS_BRIDGE_PCI, PCI_ANY_ID, QFLAG_APPLY_ONCE, via_bugs },
- { PCI_VENDOR_ID_ATI, PCI_ANY_ID,
- PCI_CLASS_BRIDGE_PCI, PCI_ANY_ID, QFLAG_APPLY_ONCE, ati_bugs },
{ PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_K8_NB,
PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, 0, fix_hypertransport_config },
{}
diff -up --recursive --new-file linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/io_apic_32.c linux-2.6.26-rc1-20080505/arch/x86/kernel/io_apic_32.c
--- linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/io_apic_32.c 2008-05-16 20:08:32.000000000 +0000
+++ linux-2.6.26-rc1-20080505/arch/x86/kernel/io_apic_32.c 2008-05-18 00:24:24.000000000 +0000
@@ -58,7 +58,7 @@ static struct { int pin, apic; } ioapic_
static DEFINE_SPINLOCK(ioapic_lock);
static DEFINE_SPINLOCK(vector_lock);
-int timer_over_8254 __initdata = 1;
+int timer_through_8259 __initdata;
/*
* Is the SiS APIC rmw bug present ?
@@ -2129,6 +2129,7 @@ static inline void __init unlock_ExtINT_
static inline void __init check_timer(void)
{
int apic1, pin1, apic2, pin2;
+ int pin2_fixup = 0;
int vector;
unsigned int ver;
unsigned long flags;
@@ -2157,8 +2158,6 @@ static inline void __init check_timer(vo
apic_write_around(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_EXTINT);
init_8259A(1);
timer_ack = (nmi_watchdog == NMI_IO_APIC && !APIC_INTEGRATED(ver));
- if (timer_over_8254 > 0)
- enable_8259A_irq(0);
pin1 = find_isa_irq_pin(0, mp_INT);
apic1 = find_isa_irq_apic(0, mp_INT);
@@ -2168,6 +2167,22 @@ static inline void __init check_timer(vo
printk(KERN_INFO "..TIMER: vector=0x%02X apic1=%d pin1=%d apic2=%d pin2=%d\n",
vector, apic1, pin1, apic2, pin2);
+ /*
+ * Some BIOS writers are clueless and report the ExtINTA
+ * I/O APIC input from the cascaded 8259A as the timer
+ * interrupt input. So just in case, if only one pin
+ * was found above, try it both directly and through the
+ * 8259A. --macro
+ */
+ if (pin1 == -1) {
+ pin1 = pin2;
+ apic1 = apic2;
+ pin2_fixup = 1;
+ } else if (pin2 == -1) {
+ pin2 = pin1;
+ apic2 = apic1;
+ }
+
if (pin1 != -1) {
/*
* Ok, does IRQ0 through the IOAPIC work?
@@ -2175,7 +2190,6 @@ static inline void __init check_timer(vo
unmask_IO_APIC_irq(0);
if (timer_irq_works()) {
if (nmi_watchdog == NMI_IO_APIC) {
- disable_8259A_irq(0);
setup_nmi();
enable_8259A_irq(0);
}
@@ -2195,20 +2209,25 @@ static inline void __init check_timer(vo
* legacy devices should be connected to IO APIC #0
*/
setup_ExtINT_IRQ0_pin(apic2, pin2, vector);
+ enable_8259A_irq(0);
if (timer_irq_works()) {
printk("works.\n");
- if (pin1 != -1)
+ timer_through_8259 = 1;
+ if (pin2 != pin1 || apic2 != apic1)
replace_pin_at_irq(0, apic1, pin1, apic2, pin2);
- else
+ else if (pin2_fixup)
add_pin_to_irq(0, apic2, pin2);
if (nmi_watchdog == NMI_IO_APIC) {
+ disable_8259A_irq(0);
setup_nmi();
+ enable_8259A_irq(0);
}
goto out;
}
/*
* Cleanup, just in case ...
*/
+ disable_8259A_irq(0);
clear_IO_APIC_pin(apic2, pin2);
}
printk(" failed.\n");
@@ -2220,7 +2239,6 @@ static inline void __init check_timer(vo
printk(KERN_INFO "...trying to set up timer as Virtual Wire IRQ...");
- disable_8259A_irq(0);
set_irq_chip_and_handler_name(0, &lapic_chip, handle_fasteoi_irq,
"fasteoi");
apic_write_around(APIC_LVT0, APIC_DM_FIXED | vector); /* Fixed mode */
@@ -2230,6 +2248,7 @@ static inline void __init check_timer(vo
printk(" works.\n");
goto out;
}
+ disable_8259A_irq(0);
apic_write_around(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_FIXED | vector);
printk(" failed.\n");
@@ -2292,20 +2311,6 @@ void __init setup_IO_APIC(void)
print_IO_APIC();
}
-static int __init setup_disable_8254_timer(char *s)
-{
- timer_over_8254 = -1;
- return 1;
-}
-static int __init setup_enable_8254_timer(char *s)
-{
- timer_over_8254 = 2;
- return 1;
-}
-
-__setup("disable_8254_timer", setup_disable_8254_timer);
-__setup("enable_8254_timer", setup_enable_8254_timer);
-
/*
* Called after all the initialization is done. If we didnt find any
* APIC bugs then we can allow the modify fast path
diff -up --recursive --new-file linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/io_apic_64.c linux-2.6.26-rc1-20080505/arch/x86/kernel/io_apic_64.c
--- linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/io_apic_64.c 2008-05-17 03:43:40.000000000 +0000
+++ linux-2.6.26-rc1-20080505/arch/x86/kernel/io_apic_64.c 2008-05-18 00:24:43.000000000 +0000
@@ -90,7 +90,7 @@ static int no_timer_check;
static int disable_timer_pin_1 __initdata;
-int timer_over_8254 __initdata = 1;
+int timer_through_8259 __initdata;
/* Where if anywhere is the i8259 connect in external int mode */
static struct { int pin, apic; } ioapic_i8259 = { -1, -1 };
@@ -430,20 +430,6 @@ static int __init disable_timer_pin_setu
}
__setup("disable_timer_pin_1", disable_timer_pin_setup);
-static int __init setup_disable_8254_timer(char *s)
-{
- timer_over_8254 = -1;
- return 1;
-}
-static int __init setup_enable_8254_timer(char *s)
-{
- timer_over_8254 = 2;
- return 1;
-}
-
-__setup("disable_8254_timer", setup_disable_8254_timer);
-__setup("enable_8254_timer", setup_enable_8254_timer);
-
/*
* Find the IRQ entry number of a certain pin.
@@ -1674,8 +1660,6 @@ static inline void __init check_timer(vo
*/
apic_write(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_EXTINT);
init_8259A(1);
- if (timer_over_8254 > 0)
- enable_8259A_irq(0);
pin1 = find_isa_irq_pin(0, mp_INT);
apic1 = find_isa_irq_apic(0, mp_INT);
@@ -1685,6 +1669,21 @@ static inline void __init check_timer(vo
apic_printk(APIC_VERBOSE,KERN_INFO "..TIMER: vector=0x%02X apic1=%d pin1=%d apic2=%d pin2=%d\n",
cfg->vector, apic1, pin1, apic2, pin2);
+ /*
+ * Some BIOS writers are clueless and report the ExtINTA
+ * I/O APIC input from the cascaded 8259A as the timer
+ * interrupt input. So just in case, if only one pin
+ * was found above, try it both directly and through the
+ * 8259A. --macro
+ */
+ if (pin1 == -1) {
+ pin1 = pin2;
+ apic1 = apic2;
+ } else if (pin2 == -1) {
+ pin2 = pin1;
+ apic2 = apic1;
+ }
+
if (pin1 != -1) {
/*
* Ok, does IRQ0 through the IOAPIC work?
@@ -1693,7 +1692,6 @@ static inline void __init check_timer(vo
if (!no_timer_check && timer_irq_works()) {
nmi_watchdog_default();
if (nmi_watchdog == NMI_IO_APIC) {
- disable_8259A_irq(0);
setup_nmi();
enable_8259A_irq(0);
}
@@ -1715,17 +1713,22 @@ static inline void __init check_timer(vo
* legacy devices should be connected to IO APIC #0
*/
setup_ExtINT_IRQ0_pin(apic2, pin2, cfg->vector);
+ enable_8259A_irq(0);
if (timer_irq_works()) {
apic_printk(APIC_VERBOSE," works.\n");
+ timer_through_8259 = 1;
nmi_watchdog_default();
if (nmi_watchdog == NMI_IO_APIC) {
+ disable_8259A_irq(0);
setup_nmi();
+ enable_8259A_irq(0);
}
goto out;
}
/*
* Cleanup, just in case ...
*/
+ disable_8259A_irq(0);
clear_IO_APIC_pin(apic2, pin2);
}
apic_printk(APIC_VERBOSE," failed.\n");
@@ -1737,7 +1740,6 @@ static inline void __init check_timer(vo
apic_printk(APIC_VERBOSE, KERN_INFO "...trying to set up timer as Virtual Wire IRQ...");
- disable_8259A_irq(0);
irq_desc[0].chip = &lapic_irq_type;
apic_write(APIC_LVT0, APIC_DM_FIXED | cfg->vector); /* Fixed mode */
enable_8259A_irq(0);
@@ -1746,6 +1748,7 @@ static inline void __init check_timer(vo
apic_printk(APIC_VERBOSE," works.\n");
goto out;
}
+ disable_8259A_irq(0);
apic_write(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_FIXED | cfg->vector);
apic_printk(APIC_VERBOSE," failed.\n");
diff -up --recursive --new-file linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/nmi_32.c linux-2.6.26-rc1-20080505/arch/x86/kernel/nmi_32.c
--- linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/nmi_32.c 2008-05-16 20:11:41.000000000 +0000
+++ linux-2.6.26-rc1-20080505/arch/x86/kernel/nmi_32.c 2008-05-18 02:06:46.000000000 +0000
@@ -24,6 +24,8 @@
#include <linux/kdebug.h>
#include <linux/slab.h>
+#include <asm/apic.h>
+#include <asm/i8259.h>
#include <asm/smp.h>
#include <asm/nmi.h>
#include <asm/timer.h>
@@ -131,6 +133,8 @@ int __init check_nmi_watchdog(void)
kfree(prev_nmi_count);
return 0;
error:
+ if (nmi_watchdog == NMI_IO_APIC && !timer_through_8259)
+ disable_8259A_irq(0);
timer_ack = 0;
return -1;
diff -up --recursive --new-file linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/nmi_64.c linux-2.6.26-rc1-20080505/arch/x86/kernel/nmi_64.c
--- linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/nmi_64.c 2008-05-05 02:55:24.000000000 +0000
+++ linux-2.6.26-rc1-20080505/arch/x86/kernel/nmi_64.c 2008-05-18 00:15:51.000000000 +0000
@@ -21,6 +21,8 @@
#include <linux/cpumask.h>
#include <linux/kdebug.h>
+#include <asm/apic.h>
+#include <asm/i8259.h>
#include <asm/smp.h>
#include <asm/nmi.h>
#include <asm/proto.h>
@@ -90,7 +92,7 @@ int __init check_nmi_watchdog(void)
prev_nmi_count = kmalloc(NR_CPUS * sizeof(int), GFP_KERNEL);
if (!prev_nmi_count)
- return -1;
+ goto error;
printk(KERN_INFO "Testing NMI watchdog ... ");
@@ -121,7 +123,7 @@ int __init check_nmi_watchdog(void)
if (!atomic_read(&nmi_active)) {
kfree(prev_nmi_count);
atomic_set(&nmi_active, -1);
- return -1;
+ goto error;
}
printk("OK.\n");
@@ -132,6 +134,11 @@ int __init check_nmi_watchdog(void)
kfree(prev_nmi_count);
return 0;
+error:
+ if (nmi_watchdog == NMI_IO_APIC && !timer_through_8259)
+ disable_8259A_irq(0);
+
+ return -1;
}
static int __init setup_nmi_watchdog(char *str)
diff -up --recursive --new-file linux-2.6.26-rc1-20080505.macro/include/asm-x86/apic.h linux-2.6.26-rc1-20080505/include/asm-x86/apic.h
--- linux-2.6.26-rc1-20080505.macro/include/asm-x86/apic.h 2008-05-05 02:55:57.000000000 +0000
+++ linux-2.6.26-rc1-20080505/include/asm-x86/apic.h 2008-05-18 00:16:42.000000000 +0000
@@ -36,7 +36,7 @@ extern void generic_apic_probe(void);
#ifdef CONFIG_X86_LOCAL_APIC
extern int apic_verbosity;
-extern int timer_over_8254;
+extern int timer_through_8259;
extern int local_apic_timer_c2_ok;
extern int local_apic_timer_disabled;
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] I/O APIC: Timer through 8259A revamp
2008-05-18 3:35 [PATCH] I/O APIC: Timer through 8259A revamp Maciej W. Rozycki
@ 2008-05-19 12:30 ` Ingo Molnar
2008-05-19 13:57 ` Maciej W. Rozycki
2008-05-19 12:52 ` Andi Kleen
1 sibling, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2008-05-19 12:30 UTC (permalink / raw)
To: Maciej W. Rozycki; +Cc: Thomas Gleixner, linux-kernel, H. Peter Anvin
* Maciej W. Rozycki <macro@linux-mips.org> wrote:
> There is some seemingly unnecessary cruft accumulated in the 8254
> timer setup path for the I/O APIC. Some attempts have been made to
> cater for broken BIOSes reporting the ExtINTA I/O APIC pin cascaded
> from the master 8259A as the native timer interrupt input, which
> interfere with the setup sequence in a non-obvious way and require a
> pair of additional kernel command-line parameters.
>
> I think this is unnecessary. All the bits are already in place. With
> the removal of AEOI bits for non-i82489DX systems the whole setup can
> be simplified. Here is a set of changes that reuse the existing code
> for potentially broken systems. The observation is broken systems do
> not have a notion of ExtINTA pins, so only one pin/apic pair will be
> determined as useful for the timer interrupt. Therefore it should be
> safe to select this single pin/apic to test both directly and through
> the 8259A.
>
> This change implements this approach by copying timer pin information
> found to the other pin if one has been found only. This way both
> tests are always performed. The timer input of the 8259A is carefully
> unmasked if needed only and masked again if found unneeded. This is
> because some systems seem sensitive for the master 8259A interrupt
> being active even if all the APIC inputs this line is wired to are
> masked.
>
> The "disable_8254_timer" and "enable_8254_timer" kernel parameters are
> removed as no longer needed.
applied to latest -tip, thanks.
> Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
> ---
> It works for my system, but it definitely requires much, much more
> testing. The patch depends on patch-2.6.26-rc1-20080505-timer_ack-1
> sent earlier.
agreed, scary patch - but nice cleanups!
> Of course these "if (pin? != -1)" statements are useless (though
> harmless) now, but I think that change deserves a separate patch not
> to obfuscate changes to code which is obscure enough already.
please send another patch for that. If you can think of a good way of
splitting up this patch into smaller units feel free to do that too ...
Dangerous changes are much better if they happen in small incremental
steps. (even if the sum of the changes is not any less dangerous - it
just makes any trouble easier to bisect and fix)
Ingo
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] I/O APIC: Timer through 8259A revamp
2008-05-18 3:35 [PATCH] I/O APIC: Timer through 8259A revamp Maciej W. Rozycki
2008-05-19 12:30 ` Ingo Molnar
@ 2008-05-19 12:52 ` Andi Kleen
2008-05-19 14:25 ` Maciej W. Rozycki
1 sibling, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2008-05-19 12:52 UTC (permalink / raw)
To: Maciej W. Rozycki; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
"Maciej W. Rozycki" <macro@linux-mips.org> writes:
> ---
> It works for my system, but it definitely requires much, much more
> testing. The patch depends on patch-2.6.26-rc1-20080505-timer_ack-1 sent
> earlier.
I tried to clean up this code in the past too, but one experience I got
was that even a lot of relatively modern systems (<5 years old) fall into the fallback
paths unfortunately and it's quite difficult to find out why.
Also Windows uses a different timer set up so this configuration is often
not well tested.
-Andi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] I/O APIC: Timer through 8259A revamp
2008-05-19 12:30 ` Ingo Molnar
@ 2008-05-19 13:57 ` Maciej W. Rozycki
0 siblings, 0 replies; 8+ messages in thread
From: Maciej W. Rozycki @ 2008-05-19 13:57 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, linux-kernel, H. Peter Anvin
On Mon, 19 May 2008, Ingo Molnar wrote:
> > It works for my system, but it definitely requires much, much more
> > testing. The patch depends on patch-2.6.26-rc1-20080505-timer_ack-1
> > sent earlier.
>
> agreed, scary patch - but nice cleanups!
Well, I think somebody has to understand these pieces of code after all,
right? ;-)
> > Of course these "if (pin? != -1)" statements are useless (though
> > harmless) now, but I think that change deserves a separate patch not
> > to obfuscate changes to code which is obscure enough already.
>
> please send another patch for that. If you can think of a good way of
That's syntactic sugar -- I am not sure whether it's worth doing it yet.
I think let's test semantic changes first -- there may be more needed --
and once things have settled, fix up the syntax. That's boring enough it
can be done in 30 seconds :-) and can certainly happen before you push
changes to Linus.
> splitting up this patch into smaller units feel free to do that too ...
Well, actually the NMI failure clean-up and the associated
"timer_through_8259" variable can certainly be separate. They are a new
feature that builds on top of the clean-up and without them the fix is
consistent as well. I'll do that and see whether there is anything else
that could be split off.
Additionally I think the following enhancements should be done as the
next step:
1. Also mask LVT0/LVTPC entries as necessary when the NMI watchdog fails.
2. I think the 64-bit version wants the I/O APIC rerouting bits as well.
I recall they were needed for the affinity setting to work for the
timer interrupt -- has anybody tested it with the 64-bit architecture?
3. Nobody will probably care, but since we still claim support for the
i82489DX -- for this APIC the NMI return path for the watchdog should
reassert the motherboard NMI source through the mask register at the
I/O port 0x80 like it is done for the other reasons.
In case somebody reading this is about to head for the skip now -- I am
still interested in test equipment for the case #2.
> Dangerous changes are much better if they happen in small incremental
> steps. (even if the sum of the changes is not any less dangerous - it
> just makes any trouble easier to bisect and fix)
Yes, of course. But sometimes they are not written in the order that
would make the split obvious from the beginning. ;-)
Maciej
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] I/O APIC: Timer through 8259A revamp
2008-05-19 12:52 ` Andi Kleen
@ 2008-05-19 14:25 ` Maciej W. Rozycki
2008-05-19 14:50 ` Andi Kleen
0 siblings, 1 reply; 8+ messages in thread
From: Maciej W. Rozycki @ 2008-05-19 14:25 UTC (permalink / raw)
To: Andi Kleen; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
On Mon, 19 May 2008, Andi Kleen wrote:
> I tried to clean up this code in the past too, but one experience I got
> was that even a lot of relatively modern systems (<5 years old) fall into the fallback
> paths unfortunately and it's quite difficult to find out why.
The new burst of breakage came with the invention of ACPI and its tables
for interrupt routing for the APIC.
Old MP tables coming from the MP spec used to get the timer interrupt
setup right almost always if not in all the cases. The code we had got up
to that point was meant to handle hardware variations the best we could.
The most common problem was some integrated chipset components predated
the existence of the APIC and had the output of the timer #0 of their
embedded 8254 core routed directly to the IRQ0 input of their embedded
8259A component only. The I/O APIC was external and there was no way to
route IRQ0 there directly.
With the ACPI tables, hardware is modern enough the timer interrupt
should be directly available, but is usually wired in the way recommended
by the MP spec. That is the output of the master 8259A goes to INT0 of
the I/O APIC (or is only connected to local APICs) and the timer is routed
to INT2. However the default for ACPI tables is to map 8259A one-to-one
to the I/O APIC. Perhaps the intention of the authors of the spec was
that hardware will gradually get designed this way or perhaps there was no
clear reason at all. Anyway the reality is most systems out there need
an override for the timer interrupt and practice has shown it is sometimes
missing. As a result the pin that our code assumes is for the timer
interrupt in fact is the 8259A ExtINTA interrupt, which as it happens, can
be used for the timer, but we have to be careful
The end result is we are trying to reuse the same logic in the code for
a different purpose and while it is a reasonable approach, care has to be
taken to take the slightly different circumstances into account.
Of course if there is no INT2 override in the ACPI table, we might
consider blindly checking whether it actually is a timer interrupt,
because for systems which have the legacy 8259A chips IRQ2 makes no sense.
> Also Windows uses a different timer set up so this configuration is often
> not well tested.
Well, they must have figured out the setup of the 8254 timer is
unreliable as well. ;-) Actually the original i82489DX documents
explicitly discouraged SMP operating systems from using the 8254 because
of the missing wiring to the I/O APIC in some, especially early, systems
mentioned above. The use of the mixed interrupt mode was implied if the
use of timer was found inevitable, which was recognised as not fully in
the spirit of an SMP system. Later on the mixed mode proved problematic
too, because of various APIC errata. Still we arrange for such a set-up
as the last resort in check_timer().
The use of the local APIC timer as a replacement was recommended back
then, but I think IRQ8 from the RTC was also a good solution, as it was
still before the time the functionality started getting integrated into
chipsets and the line was always available to the I/O APIC.
Maciej
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] I/O APIC: Timer through 8259A revamp
2008-05-19 14:25 ` Maciej W. Rozycki
@ 2008-05-19 14:50 ` Andi Kleen
2008-05-19 22:08 ` Maciej W. Rozycki
0 siblings, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2008-05-19 14:50 UTC (permalink / raw)
To: Maciej W. Rozycki; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
> With the ACPI tables, hardware is modern enough the timer interrupt
> should be directly available, but is usually wired in the way recommended
> by the MP spec.
I originally had that assumption too, but in practice that doesn't seem
to be always the case unfortunately. Or sometimes the tests just fail
for whatever reason and they drop into the fallback path.
> That is the output of the master 8259A goes to INT0 of
> the I/O APIC (or is only connected to local APICs) and the timer is routed
> to INT2.
There are already a couple of wiring differences in common chipsets.
Mostly hurts on the fallback path especially when multiple sources are
enabled (then you might end up with duplicated interrupts, happened often
on ATI systems)
> The use of the local APIC timer as a replacement was recommended back
> then,
That was before CPU power saving was invented I guess @)
> but I think IRQ8 from the RTC was also a good solution,
Trouble on Linux is that RTC doesn't support any of the traditional HZ
values. I considered writing a RTC driver back then, but didn't do
it because of this.
On the other hand with wider use of tickless kernels and hrtimers it might
be tolerable to use non standard HZ,
-Andi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] I/O APIC: Timer through 8259A revamp
2008-05-19 14:50 ` Andi Kleen
@ 2008-05-19 22:08 ` Maciej W. Rozycki
2008-05-19 22:34 ` Andi Kleen
0 siblings, 1 reply; 8+ messages in thread
From: Maciej W. Rozycki @ 2008-05-19 22:08 UTC (permalink / raw)
To: Andi Kleen; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
On Mon, 19 May 2008, Andi Kleen wrote:
> > With the ACPI tables, hardware is modern enough the timer interrupt
> > should be directly available, but is usually wired in the way recommended
> > by the MP spec.
>
> I originally had that assumption too, but in practice that doesn't seem
> to be always the case unfortunately. Or sometimes the tests just fail
> for whatever reason and they drop into the fallback path.
Well, the fallback path was meant to be used with hardware suffering from
limitations and not from broken configuration tables. If the
configuration recorded in the tables is broken, then who knows what the
hardware looks like? Anything may happen. This should really be the case
for: "Fix your *** firmware!" With Flash memories in wide use today,
there is really no excuse and the technology behind these tables does not
require a rocket scientist.
> > That is the output of the master 8259A goes to INT0 of
> > the I/O APIC (or is only connected to local APICs) and the timer is routed
> > to INT2.
>
> There are already a couple of wiring differences in common chipsets.
> Mostly hurts on the fallback path especially when multiple sources are
> enabled (then you might end up with duplicated interrupts, happened often
> on ATI systems)
As soon as you hit the through-8259A mode chances are you will start
suffering from things like assumptions about the state of the hardware
made by whoever has written SMM code for this system, which may result in
reassertion of the interrupt that has already been handled (the mode of
operation of the 8254 does not help here). Otherwise if the timer
interrupt genuinely arrives through multiple I/O APIC inputs, then we have
a piece of sloppy code somewhere -- we should never have more than one
input at a time with the same interrupt vector enabled; this is also a
grey area of operation of the APIC itself, at least with some
implementations.
> > The use of the local APIC timer as a replacement was recommended back
> > then,
>
> That was before CPU power saving was invented I guess @)
Well, the i386SL was *the* power-saving Intel CPU when the i82489DX was
being worked on and certainly back then nobody sane would go through the
pain of putting additional CPUs into a system only to make them go to
sleep. ;-) The SMM had already been committed in that CPU in case you
were asking...
> > but I think IRQ8 from the RTC was also a good solution,
>
> Trouble on Linux is that RTC doesn't support any of the traditional HZ
> values. I considered writing a RTC driver back then, but didn't do
> it because of this.
>
> On the other hand with wider use of tickless kernels and hrtimers it might
> be tolerable to use non standard HZ,
I don't think the value of HZ should be a problem anymore now that we
have a separate user-visible HZ -- and it's been a while like that. We
have a choice of three values of HZ already and for example the MIPS port
uses the closest powers of two to the usual values of 100, 250 and 1000
where necessary because of the very same reason and I haven't heard about
any problems resulting from that.
Note some systems have the polarity reported incorrectly for the RTC
interrupt -- that can be easily solved, but care has to be taken.
Maciej
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] I/O APIC: Timer through 8259A revamp
2008-05-19 22:08 ` Maciej W. Rozycki
@ 2008-05-19 22:34 ` Andi Kleen
0 siblings, 0 replies; 8+ messages in thread
From: Andi Kleen @ 2008-05-19 22:34 UTC (permalink / raw)
To: Maciej W. Rozycki; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
Maciej W. Rozycki wrote:
> On Mon, 19 May 2008, Andi Kleen wrote:
>
>>> With the ACPI tables, hardware is modern enough the timer interrupt
>>> should be directly available, but is usually wired in the way recommended
>>> by the MP spec.
>> I originally had that assumption too, but in practice that doesn't seem
>> to be always the case unfortunately. Or sometimes the tests just fail
>> for whatever reason and they drop into the fallback path.
>
> Well, the fallback path was meant to be used with hardware suffering from
> limitations and not from broken configuration tables.
I don't think it was wrong tables generally, but more screwy hardware.
The main table problem were with nvidia with the wrong timer overrides,
but that was handled elsewhere. And a long time ago ACPI couldn't handle
the VIA interrupt configuration properly, but that was also fixed later.
>> On the other hand with wider use of tickless kernels and hrtimers it might
>> be tolerable to use non standard HZ,
>
> I don't think the value of HZ should be a problem anymore now that we
> have a separate user-visible HZ
I wasn't worried about the units in sysctl, but about the minimum delay
time in select() and other timeout syscalls. And yes changing that
changes user behaviour significantly. And it's still HZ granuality.
I don't think anybody dared to change it to hrtimers yet because it has
potentially large impact on applications.
Of course you could probably define a "true user hz" that is emulated
with hrtimers even when the system tick is different. Problem is also
that a lot of systems don't support hrtimers yet because they have both
broken APIC timer and missing HPET.
-Andi
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2008-05-19 22:34 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-18 3:35 [PATCH] I/O APIC: Timer through 8259A revamp Maciej W. Rozycki
2008-05-19 12:30 ` Ingo Molnar
2008-05-19 13:57 ` Maciej W. Rozycki
2008-05-19 12:52 ` Andi Kleen
2008-05-19 14:25 ` Maciej W. Rozycki
2008-05-19 14:50 ` Andi Kleen
2008-05-19 22:08 ` Maciej W. Rozycki
2008-05-19 22:34 ` Andi Kleen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox