* 2.6.29-rt1+ irqbalance = OOPS
@ 2009-03-27 16:16 Robin Gareus
2009-03-27 16:49 ` Gregory Haskins
From: Robin Gareus @ 2009-03-27 16:16 UTC
To: linux-rt-users
Hello,
Starting the irqbalance daemon ( http://www.irqbalance.org/ ) on
2.6.29-rt1 causes the system to OOPS and freeze. irqbalance works fine
with previous realtime kernels (2.6.24.7-rt27).
The OOPS seems to be caused by the PID of the [timer] IRQ-1. However, the
OOPS message is too long and I can't read the first lines on the
terminal. The system hangs, so I can't scroll back, and nothing is written
to any log. SysRq also does not work after the OOPS.
A bit off topic, but how can I capture those OOPSes?
I could get a larger screen and a digital camera ;) The lkcd.sf.net
patch does not apply to 2.6.29. Running a realtime kernel in qemu does
not make much sense, although it may be sufficient to reproduce this OOPS.
Is there a way to use USB as a serial console using a hub and two
computers? I guess I'll need a dongle or something. Besides, would a printk()
to a serial port work even if some IRQ handler hangs?
Any ideas? links?
robin
* Re: 2.6.29-rt1+ irqbalance = OOPS
2009-03-27 16:16 2.6.29-rt1+ irqbalance = OOPS Robin Gareus
@ 2009-03-27 16:49 ` Gregory Haskins
2009-03-27 17:24 ` Robin Gareus
From: Gregory Haskins @ 2009-03-27 16:49 UTC
To: Robin Gareus; +Cc: linux-rt-users, Thomas Gleixner
Robin Gareus wrote:
> Hello,
>
> Starting the irqbalance daemon ( http://www.irqbalance.org/ ) on
> 2.6.29-rt1 causes the system to OOPS and freeze. irqbalance works fine
> with previous realtime kernels (2.6.24.7-rt27).
>
> The OOPS seems to be caused by the PID of the [timer] IRQ-1
This in and of itself might be part of the problem. I didn't think the
timer was supposed to be threaded. Thomas?
> . However the
> OOPS message is too long and I can't read the beginning lines on the
> terminal. The system hangs so I can't scroll back and it's not written
> to any log. SysRq does also not work after the OOPS.
>
> A bit off topic, but how can I capture those OOPSes?
>
A serial-console or netconsole is generally better than a VT for
catching the whole output.
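For the netconsole route, the receiving machine only has to collect UDP datagrams and write them to disk. Below is a minimal sketch of such a receiver, not a definitive tool. The function names, the log file name `oops.log`, and the choice of port 6666 are assumptions for illustration; the port must match whatever target port the crashing machine's netconsole is configured with.

```python
import socket


def open_listener(bind_addr="0.0.0.0", port=6666):
    """Open a UDP socket for incoming console datagrams.

    The port is an assumption: use the target port configured for
    netconsole on the machine that produces the OOPS.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def capture(sock, log, max_packets=None):
    """Append every received datagram to the binary file object `log`.

    Flushes after each packet so the text survives even if the capture
    is interrupted. Runs until `max_packets` datagrams have arrived, or
    forever when `max_packets` is None. Returns the packet count.
    """
    count = 0
    while max_packets is None or count < max_packets:
        data, _sender = sock.recvfrom(65535)
        log.write(data)
        log.flush()
        count += 1
    return count


# Typical use on the receiving machine (stop with Ctrl-C):
#
#     with open("oops.log", "ab") as log:
#         capture(open_listener(), log)
```

Because every datagram is flushed to the file as it arrives, the beginning of a long OOPS is preserved even when it scrolls off the local terminal.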
> I could get a larger screen and a digital camera ;) The lkcd.sf.net
> patch does not apply to 2.6.29. Running a realtime kernel in qemu does
> not make much sense, although it may be sufficient to reproduce this OOPS.
> Is there a way to use USB as serial console using a hub and two
> computers?
I think so, though I have never tried.
> I guess I'll need a dongle or sth. Besides would a printk()
> to a serial port work even if some IRQ handler hangs?
>
Usually yes. Of course, it depends on what is actually broken, but
*most* issues still allow the printks to work.
Good luck!
-Greg
* Re: 2.6.29-rt1+ irqbalance = OOPS
2009-03-27 16:49 ` Gregory Haskins
@ 2009-03-27 17:24 ` Robin Gareus
2009-03-28 14:37 ` Thomas Gleixner
From: Robin Gareus @ 2009-03-27 17:24 UTC
To: Gregory Haskins; +Cc: linux-rt-users, Thomas Gleixner
Gregory Haskins wrote:
> Robin Gareus wrote:
>> Hello,
>>
>> Starting the irqbalance daemon ( http://www.irqbalance.org/ ) on
>> 2.6.29-rt1 causes the system to OOPS and freeze. irqbalance works fine
>> with previous realtime kernels (2.6.24.7-rt27).
>>
>> The OOPS seems to be caused by the PID of the [timer] IRQ-1
> This in of itself might be part of the problem. I didn't think the
> timer was supposed to be threaded. Thomas?
OOPS ;) my bad. IRQ-0 is the timer and it's indeed not threaded.
IRQ-1 is i8042
The OOPS can be triggered by launching irqbalance and then pressing any key.
For whatever reason, the OOPS also takes place after a random amount of
time without pressing any key.
robin
* Re: 2.6.29-rt1+ irqbalance = OOPS
2009-03-27 17:24 ` Robin Gareus
@ 2009-03-28 14:37 ` Thomas Gleixner
2009-03-28 15:57 ` Robin Gareus
From: Thomas Gleixner @ 2009-03-28 14:37 UTC
To: Robin Gareus; +Cc: Gregory Haskins, linux-rt-users
On Fri, 27 Mar 2009, Robin Gareus wrote:
> Gregory Haskins wrote:
> > Robin Gareus wrote:
> >> Hello,
> >>
> >> Starting the irqbalance daemon ( http://www.irqbalance.org/ ) on
> >> 2.6.29-rt1 causes the system to OOPS and freeze. irqbalance works fine
> >> with previous realtime kernels (2.6.24.7-rt27).
> >>
> >> The OOPS seems to be caused by the PID of the [timer] IRQ-1
> > This in of itself might be part of the problem. I didn't think the
> > timer was supposed to be threaded. Thomas?
>
> OOPS ;) my bad. IRQ-0 is the timer and it's indeed not threaded.
>
> IRQ-1 is i8042
>
> The OOPS can be triggered by launching irqbalance. pressing any key.
> For whatever reason the OOPS also takes place after a random amount of
> time without pressing any key..
Hmm. Works fine here. Can you please provide the output of
# cat /proc/interrupts
Thanks,
tglx
* Re: 2.6.29-rt1+ irqbalance = OOPS
2009-03-28 14:37 ` Thomas Gleixner
@ 2009-03-28 15:57 ` Robin Gareus
2009-03-30 13:43 ` Robin Gareus
From: Robin Gareus @ 2009-03-28 15:57 UTC
To: Thomas Gleixner; +Cc: Gregory Haskins, linux-rt-users
Thomas Gleixner wrote:
> On Fri, 27 Mar 2009, Robin Gareus wrote:
>> Gregory Haskins wrote:
>>> Robin Gareus wrote:
>>>> Hello,
>>>>
>>>> Starting the irqbalance daemon ( http://www.irqbalance.org/ ) on
>>>> 2.6.29-rt1 causes the system to OOPS and freeze. irqbalance works fine
>>>> with previous realtime kernels (2.6.24.7-rt27).
>>>>
>>>> The OOPS seems to be caused by the PID of the [timer] IRQ-1
>>> This in of itself might be part of the problem. I didn't think the
>>> timer was supposed to be threaded. Thomas?
>> OOPS ;) my bad. IRQ-0 is the timer and it's indeed not threaded.
>>
>> IRQ-1 is i8042
>>
>> The OOPS can be triggered by launching irqbalance. then pressing any key.
>> For whatever reason the OOPS also takes place after a random amount of
>> time without pressing any key..
>
> Hmm. Works fine here. Can you please provide the output of
>
> # cat /proc/interrupts
           CPU0       CPU1
  0:    2592396          0  IO-APIC-edge     timer
  1:      31901          0  IO-APIC-edge     i8042
  8:         79          0  IO-APIC-edge     rtc0
  9:     294004          0  IO-APIC-fasteoi  acpi
 12:      55544          0  IO-APIC-edge     i8042
 14:      28478          0  IO-APIC-edge     ata_piix
 15:          0          0  IO-APIC-edge     ata_piix
 16:          0          0  IO-APIC-fasteoi  uhci_hcd:usb1, yenta, i915@pci:0000:00:02.0
 17:        166          0  IO-APIC-fasteoi  uhci_hcd:usb2, HDA Intel, ohci1394
 18:          0          0  IO-APIC-fasteoi  uhci_hcd:usb4, mmc0
 19:         66          0  IO-APIC-fasteoi  ehci_hcd:usb3, uhci_hcd:usb5
 28:       1504          0  PCI-MSI-edge     iwl3945
NMI:          0          0  Non-maskable interrupts
LOC:     398393    1538764  Local timer interrupts
SPU:          0          0  Spurious interrupts
CNT:          0          0  Performance counter interrupts
RES:     120526     181396  Rescheduling interrupts
CAL:      42459      30754  Function call interrupts
TLB:        839       1056  TLB shootdowns
TRM:          0          0  Thermal event interrupts
ERR:          0
MIS:          0
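When comparing such dumps between a working and a failing machine, it can help to see at a glance which CPUs actually serviced each IRQ. The following is a small sketch under the assumption that the text has one entry per line (no wrapped device lists); the function names `parse_interrupts` and `busy_cpus` are made up for illustration.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into {label: {"counts": [...], "desc": str}}.

    Assumes the first non-empty line is the CPU header and that every
    following entry sits on a single line.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header looks like "CPU0 CPU1 ..."
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        # Take at most one counter per CPU; ERR and MIS carry only one.
        while fields and len(counts) < ncpus and fields[0].isdigit():
            counts.append(int(fields.pop(0)))
        table[label.strip()] = {"counts": counts, "desc": " ".join(fields)}
    return table


def busy_cpus(table, label):
    """Return the indexes of the CPUs with a non-zero count for this entry."""
    return [cpu for cpu, n in enumerate(table[label]["counts"]) if n > 0]


# Typical use:
#
#     with open("/proc/interrupts") as f:
#         table = parse_interrupts(f.read())
#     for label in table:
#         print(label, busy_cpus(table, label), table[label]["desc"])
```

Applied to the dump above, every numbered device IRQ has a non-zero count on CPU0 only, while entries such as LOC and RES are spread over both CPUs.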
The corresponding .config is available at
http://rg42.org/_media/wiki/kernel/config-2.6.29-rt1.txt
I just tested Fernando's CCRMA 2.6.29-rt1 kernel which is x686 plus
generic x86-compat (mine is MCORE2) and irqbalance does not produce an
OOPS there.
I'll see if I can get a dump of the OOPS on Monday using netconsole.
HTH,
robin
* Re: 2.6.29-rt1+ irqbalance = OOPS
2009-03-28 15:57 ` Robin Gareus
@ 2009-03-30 13:43 ` Robin Gareus
2009-04-01 6:43 ` Thomas Gleixner
From: Robin Gareus @ 2009-03-30 13:43 UTC
To: Thomas Gleixner; +Cc: Gregory Haskins, linux-rt-users
Robin Gareus wrote:
> The corresponding .config is available at
> http://rg42.org/_media/wiki/kernel/config-2.6.29-rt1.txt
BTW. This is a vanilla 2.6.29 with only the 2.6.29-rt1 patch applied.
> I just tested Fernando's CCRMA 2.6.29-rt1 kernel which is x686 plus
> generic x86-compat (mine is MCORE2) and irqbalance does not produce an
> OOPS there.
>
> I'll see if can get a dump of the OOPS on Monday using netconsole.
If the netconsole module is loaded, the OOPS is produced without the
Call Trace and Stack Trace (both on the local terminal and remotely), and
the cause is no longer the keyboard interrupt but ethernet (IRQ 28).
Starting `irqbalance` still reliably hangs this system (Thinkpad X60s).
------------[ cut here ]------------
kernel BUG at kernel/sched_rt.c:1602!
invalid opcode: 0000 [#1] PREEMPT SMP kernel BUG at kernel/sched_rt.c:1602!
invalid opcode: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/irq
Modules linked in: netconsole ipv6 nfsd exportfs nfs lockd nfs_acl
auth_rpcgss sunrpc deflate zlib_deflate twofish twofish_common camellia
serpent blowfish des_generic cbc aes_i586 aes_generic xcbc rmd160
sha256_generic sha1_generic
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/irq
Modules linked in: netconsole ipv6 nfsd exportfs nfs lockd nfs_acl
auth_rpcgss sunrpc deflate zlib_deflate twofish twofish_common camellia
serpent blowfish des_generic cbc aes_i586 aes_generic xcbc rmd160
sha256_generic sha1_generic
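The capture above appears to contain each line twice, with two copies run together on one line. A short sketch for pulling the unique BUG locations out of such a log follows; the function name `bug_locations` is hypothetical, and the pattern only assumes the "kernel BUG at <file>:<line>!" wording seen above. Note that the location is where the assertion fired, which is not necessarily where the fault originated.

```python
import re

# Matches the marker exactly as it appears in the capture above.
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+?):(?P<line>\d+)!")


def bug_locations(log_text):
    """Return unique (file, line) pairs for each BUG marker, in first-seen order."""
    seen = []
    for match in BUG_RE.finditer(log_text):
        location = (match.group("file"), int(match.group("line")))
        if location not in seen:
            seen.append(location)
    return seen
```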
-=-=-=-=-
# lspci ..
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express
Port 1 (rev 02) (prog-if 00 [Normal decode])
# cat /proc/interrupts
           CPU0       CPU1
  0:     416869          0  IO-APIC-edge     timer
  1:       2921          0  IO-APIC-edge     i8042
  8:         48          0  IO-APIC-edge     rtc0
  9:      36105          0  IO-APIC-fasteoi  acpi
 12:      23294          0  IO-APIC-edge     i8042
 14:      19922          0  IO-APIC-edge     ata_piix
 15:          0          0  IO-APIC-edge     ata_piix
 16:          0          0  IO-APIC-fasteoi  uhci_hcd:usb2, yenta, i915@pci:0000:00:02.0
 17:      51665          0  IO-APIC-fasteoi  uhci_hcd:usb3, ohci1394, HDA Intel
 18:          0          0  IO-APIC-fasteoi  uhci_hcd:usb4, mmc0
 19:         28          0  IO-APIC-fasteoi  ehci_hcd:usb1, uhci_hcd:usb5
 28:       2669          0  PCI-MSI-edge     eth1
 29:        449          0  PCI-MSI-edge     iwl3945
NMI:          0          0  Non-maskable interrupts
LOC:      79098     252004  Local timer interrupts
SPU:          0          0  Spurious interrupts
CNT:          0          0  Performance counter interrupts
RES:      86816     141218  Rescheduling interrupts
CAL:       6852       5954  Function call interrupts
TLB:        268        266  TLB shootdowns
TRM:          0          0  Thermal event interrupts
ERR:          0
MIS:          0
* Re: 2.6.29-rt1+ irqbalance = OOPS
2009-03-30 13:43 ` Robin Gareus
@ 2009-04-01 6:43 ` Thomas Gleixner
2009-04-04 11:20 ` Robin Gareus
From: Thomas Gleixner @ 2009-04-01 6:43 UTC
To: Robin Gareus; +Cc: Gregory Haskins, linux-rt-users
On Mon, 30 Mar 2009, Robin Gareus wrote:
> Robin Gareus wrote:
>
> > The corresponding .config is available at
> > http://rg42.org/_media/wiki/kernel/config-2.6.29-rt1.txt
>
> BTW. This is a vanilla 2.6.29 with only the 2.6.29-rt1 patch applied.
>
> > I just tested Fernando's CCRMA 2.6.29-rt1 kernel which is x686 plus
> > generic x86-compat (mine is MCORE2) and irqbalance does not produce an
> > OOPS there.
Ah, so there is some delta in the .configs which exposes this
problem. I got your .config and will try to reproduce.
> > I'll see if can get a dump of the OOPS on Monday using netconsole.
>
> If the netconsole module is loaded, the OOPS is produced without the
> Call and Stack Trace (both on local terminal and remote) and the cause
> is no longer the keyboard-interrupt but ethernet (IRQ 28)
>
> Starting `irqbalance` still reliably hangs this system (Thinkpad X60s).
>
>
> ------------[ cut here ]------------
> kernel BUG at kernel/sched_rt.c:1602!
> invalid opcode: 0000 [#1] PREEMPT SMP kernel BUG at kernel/sched_rt.c:1602!
Unfortunately this does not tell us much. It tells us that something
went wrong, but not where and when :)
Thanks,
tglx
* Re: 2.6.29-rt1+ irqbalance = OOPS
2009-04-01 6:43 ` Thomas Gleixner
@ 2009-04-04 11:20 ` Robin Gareus
From: Robin Gareus @ 2009-04-04 11:20 UTC
To: Thomas Gleixner; +Cc: Gregory Haskins, linux-rt-users
I'm just testing 2.6.29-rt3 and irqbalance works fine again. Many thanks
to Peter Zijlstra.
cheers,
robin