linux-rt-users.vger.kernel.org archive mirror
* 2.6.29-rt1+ irqbalance = OOPS
@ 2009-03-27 16:16 Robin Gareus
  2009-03-27 16:49 ` Gregory Haskins
  0 siblings, 1 reply; 8+ messages in thread
From: Robin Gareus @ 2009-03-27 16:16 UTC (permalink / raw)
  To: linux-rt-users

Hello,

Starting the irqbalance daemon ( http://www.irqbalance.org/ ) on
2.6.29-rt1 causes the system to OOPS and freeze. irqbalance works fine
with previous realtime kernels (2.6.24.7-rt27).

The OOPS seems to be caused by the PID of the [timer] IRQ-1. However the
OOPS message is too long and I can't read the beginning lines on the
terminal. The system hangs so I can't scroll back and it's not written
to any log. SysRq also does not work after the OOPS.

A bit off topic, but how can I capture those OOPSes?

I could get a larger screen and a digital camera ;) The lkcd.sf.net
patch does not apply to 2.6.29. Running a realtime kernel in qemu does
not make much sense, although it may be sufficient to reproduce this OOPS.
Is there a way to use USB as serial console using a hub and two
computers? I guess I'll need a dongle or something. Besides, would a printk()
to a serial port work even if some IRQ handler hangs?

Any ideas? links?
robin


* Re: 2.6.29-rt1+ irqbalance = OOPS
  2009-03-27 16:16 2.6.29-rt1+ irqbalance = OOPS Robin Gareus
@ 2009-03-27 16:49 ` Gregory Haskins
  2009-03-27 17:24   ` Robin Gareus
  0 siblings, 1 reply; 8+ messages in thread
From: Gregory Haskins @ 2009-03-27 16:49 UTC (permalink / raw)
  To: Robin Gareus; +Cc: linux-rt-users, Thomas Gleixner


Robin Gareus wrote:
> Hello,
>
> Starting the irqbalance daemon ( http://www.irqbalance.org/ ) on
> 2.6.29-rt1 causes the system to OOPS and freeze. irqbalance works fine
> with previous realtime kernels (2.6.24.7-rt27).
>
> The OOPS seems to be caused by the PID of the [timer] IRQ-1
This in and of itself might be part of the problem.  I didn't think the
timer was supposed to be threaded.  Thomas?

> . However the
> OOPS message is too long and I can't read the beginning lines on the
> terminal. The system hangs so I can't scroll back and it's not written
> to any log. SysRq does also not work after the OOPS.
>
> A bit off topic, but how can I capture those OOPSes?
>   
A serial-console or netconsole is generally better than a VT for
catching the whole output.

> I could get a larger screen and a digital camera ;) The lkcd.sf.net
> patch does not apply to 2.6.29. Running a realtime kernel in qemu does
> not make much sense, although it may be sufficient to reproduce this OOPS.
> Is there a way to use USB as serial console using a hub and two
> computers?
I think so, though I have never tried.

> I guess I'll need a dongle or sth. Besides would a printk()
> to a serial port work even if some IRQ handler hangs?
>   
Usually yes.  Of course, it depends on what is actually broken, but
*most* issues still allow the printks to work.

Good luck!
-Greg
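
For the netconsole route suggested above, the receiving machine only needs
something that listens for UDP datagrams and writes them to disk. The
following is a minimal sketch of such a receiver, not a tested recipe for
this particular setup. It assumes the sender targets UDP port 6666 (the
netconsole default target port); the function names and the log file name
are made up for illustration. The sending side is configured through the
netconsole module parameter described in the kernel's netconsole
documentation.

```python
import socket


def open_listener(bind_addr="0.0.0.0", port=6666):
    """Bind a UDP socket; netconsole sends console output as UDP datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def capture(sock, out, max_packets=None):
    """Append received datagrams to `out`.

    The output is flushed after every packet so that whatever arrived
    before the sender froze is already in the file.
    """
    count = 0
    while max_packets is None or count < max_packets:
        data, _addr = sock.recvfrom(65535)
        out.write(data.decode("utf-8", errors="replace"))
        out.flush()
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical log file name; run this on the *receiving* machine.
    with open("oops-capture.log", "a") as log:
        capture(open_listener(), log)
```

Flushing per packet is the main design choice here: the interesting lines
are the last ones sent before the hang, so buffering them would defeat the
purpose.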






* Re: 2.6.29-rt1+ irqbalance = OOPS
  2009-03-27 16:49 ` Gregory Haskins
@ 2009-03-27 17:24   ` Robin Gareus
  2009-03-28 14:37     ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Robin Gareus @ 2009-03-27 17:24 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: linux-rt-users, Thomas Gleixner

Gregory Haskins wrote:
> Robin Gareus wrote:
>> Hello,
>>
>> Starting the irqbalance daemon ( http://www.irqbalance.org/ ) on
>> 2.6.29-rt1 causes the system to OOPS and freeze. irqbalance works fine
>> with previous realtime kernels (2.6.24.7-rt27).
>>
>> The OOPS seems to be caused by the PID of the [timer] IRQ-1
> This in of itself might be part of the problem.  I didn't think the
> timer was supposed to be threaded.  Thomas?

OOPS ;) my bad. IRQ-0 is the timer and it's indeed not threaded.

IRQ-1 is i8042

The OOPS can be triggered by launching irqbalance, then pressing any key.
For whatever reason, the OOPS also takes place after a random amount of
time without pressing any key.

robin




* Re: 2.6.29-rt1+ irqbalance = OOPS
  2009-03-27 17:24   ` Robin Gareus
@ 2009-03-28 14:37     ` Thomas Gleixner
  2009-03-28 15:57       ` Robin Gareus
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2009-03-28 14:37 UTC (permalink / raw)
  To: Robin Gareus; +Cc: Gregory Haskins, linux-rt-users

On Fri, 27 Mar 2009, Robin Gareus wrote:
> Gregory Haskins wrote:
> > Robin Gareus wrote:
> >> Hello,
> >>
> >> Starting the irqbalance daemon ( http://www.irqbalance.org/ ) on
> >> 2.6.29-rt1 causes the system to OOPS and freeze. irqbalance works fine
> >> with previous realtime kernels (2.6.24.7-rt27).
> >>
> >> The OOPS seems to be caused by the PID of the [timer] IRQ-1
> > This in of itself might be part of the problem.  I didn't think the
> > timer was supposed to be threaded.  Thomas?
> 
> OOPS ;) my bad. IRQ-0 is the timer and it's indeed not threaded.
> 
> IRQ-1 is i8042
> 
> The OOPS can be triggered by launching irqbalance. pressing any key.
> For whatever reason the OOPS also takes place after a random amount of
> time without pressing any key..

Hmm. Works fine here. Can you please provide the output of

# cat /proc/interrupts

Thanks,

	tglx


* Re: 2.6.29-rt1+ irqbalance = OOPS
  2009-03-28 14:37     ` Thomas Gleixner
@ 2009-03-28 15:57       ` Robin Gareus
  2009-03-30 13:43         ` Robin Gareus
  0 siblings, 1 reply; 8+ messages in thread
From: Robin Gareus @ 2009-03-28 15:57 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Gregory Haskins, linux-rt-users

Thomas Gleixner wrote:
> On Fri, 27 Mar 2009, Robin Gareus wrote:
>> Gregory Haskins wrote:
>>> Robin Gareus wrote:
>>>> Hello,
>>>>
>>>> Starting the irqbalance daemon ( http://www.irqbalance.org/ ) on
>>>> 2.6.29-rt1 causes the system to OOPS and freeze. irqbalance works fine
>>>> with previous realtime kernels (2.6.24.7-rt27).
>>>>
>>>> The OOPS seems to be caused by the PID of the [timer] IRQ-1
>>> This in of itself might be part of the problem.  I didn't think the
>>> timer was supposed to be threaded.  Thomas?
>> OOPS ;) my bad. IRQ-0 is the timer and it's indeed not threaded.
>>
>> IRQ-1 is i8042
>>
>> The OOPS can be triggered by launching irqbalance. then pressing any key.
>> For whatever reason the OOPS also takes place after a random amount of
>> time without pressing any key..
> 
> Hmm. Works fine here. Can you please provide the output of
> 
> # cat /proc/interrupts
           CPU0       CPU1
  0:    2592396          0   IO-APIC-edge      timer
  1:      31901          0   IO-APIC-edge      i8042
  8:         79          0   IO-APIC-edge      rtc0
  9:     294004          0   IO-APIC-fasteoi   acpi
 12:      55544          0   IO-APIC-edge      i8042
 14:      28478          0   IO-APIC-edge      ata_piix
 15:          0          0   IO-APIC-edge      ata_piix
 16:          0          0   IO-APIC-fasteoi   uhci_hcd:usb1, yenta, i915@pci:0000:00:02.0
 17:        166          0   IO-APIC-fasteoi   uhci_hcd:usb2, HDA Intel, ohci1394
 18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4, mmc0
 19:         66          0   IO-APIC-fasteoi   ehci_hcd:usb3, uhci_hcd:usb5
 28:       1504          0   PCI-MSI-edge      iwl3945
NMI:          0          0   Non-maskable interrupts
LOC:     398393    1538764   Local timer interrupts
SPU:          0          0   Spurious interrupts
CNT:          0          0   Performance counter interrupts
RES:     120526     181396   Rescheduling interrupts
CAL:      42459      30754   Function call interrupts
TLB:        839       1056   TLB shootdowns
TRM:          0          0   Thermal event interrupts
ERR:          0
MIS:          0

The corresponding .config is available at
http://rg42.org/_media/wiki/kernel/config-2.6.29-rt1.txt

I just tested Fernando's CCRMA 2.6.29-rt1 kernel which is i686 plus
generic x86-compat (mine is MCORE2) and irqbalance does not produce an
OOPS there.

I'll see if I can get a dump of the OOPS on Monday using netconsole.

HTH,
robin
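
In the listing above, every numbered IRQ has been counted on CPU0 only.
When comparing such listings before and after starting irqbalance, it can
help to extract that information mechanically. The sketch below is one way
to do it, under the assumption that the text has the layout shown in this
thread; the function names and the shortened sample are illustrative only,
and continuation lines produced by mail wrapping are simply skipped.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into (cpus, table).

    table maps the row label ("0", "28", "LOC", ...) to a tuple of
    (per-CPU counts, description).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:len(cpus)]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if not counts:
            # Wrapped continuation line (mail line-wrap); nothing to count.
            continue
        table[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return cpus, table


def single_cpu_irqs(table):
    """Labels of numbered IRQs that have fired on exactly one CPU."""
    result = []
    for label, (counts, _desc) in table.items():
        busy = sum(1 for c in counts if c)
        if label.isdigit() and len(counts) > 1 and busy == 1:
            result.append(label)
    return result


# Shortened sample taken from the listing in this message.
SAMPLE = """\
           CPU0       CPU1
  0:    2592396          0   IO-APIC-edge      timer
  1:      31901          0   IO-APIC-edge      i8042
 28:       1504          0   PCI-MSI-edge      iwl3945
LOC:     398393    1538764   Local timer interrupts
ERR:          0
"""

if __name__ == "__main__":
    cpus, table = parse_interrupts(SAMPLE)
    print(cpus, single_cpu_irqs(table))
```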


* Re: 2.6.29-rt1+ irqbalance = OOPS
  2009-03-28 15:57       ` Robin Gareus
@ 2009-03-30 13:43         ` Robin Gareus
  2009-04-01  6:43           ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Robin Gareus @ 2009-03-30 13:43 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Gregory Haskins, linux-rt-users

Robin Gareus wrote:

> The corresponding .config is available at
> http://rg42.org/_media/wiki/kernel/config-2.6.29-rt1.txt

BTW. This is a vanilla 2.6.29 with only the 2.6.29-rt1 patch applied.

> I just tested Fernando's CCRMA 2.6.29-rt1 kernel which is x686 plus
> generic x86-compat (mine is MCORE2) and irqbalance does not produce an
> OOPS there.
> 
> I'll see if can get a dump of the OOPS on Monday using netconsole.

If the netconsole module is loaded, the OOPS is produced without the
Call and Stack Trace (both on the local terminal and remote), and the cause
is no longer the keyboard interrupt but Ethernet (IRQ 28).

Starting `irqbalance` still reliably hangs this system (Thinkpad X60s).

------------[ cut here ]------------
kernel BUG at kernel/sched_rt.c:1602!
invalid opcode: 0000 [#1] PREEMPT SMP kernel BUG at kernel/sched_rt.c:1602!
invalid opcode: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/irq
Modules linked in: netconsole ipv6 nfsd exportfs nfs lockd nfs_acl
auth_rpcgss sunrpc deflate zlib_deflate twofish twofish_common camellia
serpent blowfish des_generic cbc aes_i586 aes_generic xcbc rmd160
sha256_generic sha1_generic
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/irq
Modules linked in: netconsole ipv6 nfsd exportfs nfs lockd nfs_acl
auth_rpcgss sunrpc deflate zlib_deflate twofish twofish_common camellia
serpent blowfish des_generic cbc aes_i586 aes_generic xcbc rmd160
sha256_generic sha1_generic


-=-=-=-=-

# lspci ..

00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 02) (prog-if 00 [Normal decode])

# cat /proc/interrupts

          CPU0       CPU1
  0:     416869          0   IO-APIC-edge      timer
  1:       2921          0   IO-APIC-edge      i8042
  8:         48          0   IO-APIC-edge      rtc0
  9:      36105          0   IO-APIC-fasteoi   acpi
 12:      23294          0   IO-APIC-edge      i8042
 14:      19922          0   IO-APIC-edge      ata_piix
 15:          0          0   IO-APIC-edge      ata_piix
 16:          0          0   IO-APIC-fasteoi   uhci_hcd:usb2, yenta, i915@pci:0000:00:02.0
 17:      51665          0   IO-APIC-fasteoi   uhci_hcd:usb3, ohci1394, HDA Intel
 18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4, mmc0
 19:         28          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5
 28:       2669          0   PCI-MSI-edge      eth1
 29:        449          0   PCI-MSI-edge      iwl3945
NMI:          0          0   Non-maskable interrupts
LOC:      79098     252004   Local timer interrupts
SPU:          0          0   Spurious interrupts
CNT:          0          0   Performance counter interrupts
RES:      86816     141218   Rescheduling interrupts
CAL:       6852       5954   Function call interrupts
TLB:        268        266   TLB shootdowns
TRM:          0          0   Thermal event interrupts
ERR:          0
MIS:          0


* Re: 2.6.29-rt1+ irqbalance = OOPS
  2009-03-30 13:43         ` Robin Gareus
@ 2009-04-01  6:43           ` Thomas Gleixner
  2009-04-04 11:20             ` Robin Gareus
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2009-04-01  6:43 UTC (permalink / raw)
  To: Robin Gareus; +Cc: Gregory Haskins, linux-rt-users


On Mon, 30 Mar 2009, Robin Gareus wrote:

> Robin Gareus wrote:
> 
> > The corresponding .config is available at
> > http://rg42.org/_media/wiki/kernel/config-2.6.29-rt1.txt
> 
> BTW. This is a vanilla 2.6.29 with only the 2.6.29-rt1 patch applied.
> 
> > I just tested Fernando's CCRMA 2.6.29-rt1 kernel which is x686 plus
> > generic x86-compat (mine is MCORE2) and irqbalance does not produce an
> > OOPS there.

Ah, so there is some delta in the .configs which exposes this
problem. I got your .config and will try to reproduce.

> > I'll see if can get a dump of the OOPS on Monday using netconsole.
> 
> If the netconsole module is loaded, the OOPS is produced without the
> Call and Stack Trace (both on local terminal and remote) and the cause
> is no longer the keyboard-interrupt but ethernet (IRQ 28)
> 
> Starting `irqbalance` still reliably hangs this system (Thinkpad X60s).
>
>
> ------------[ cut here ]------------
> kernel BUG at kernel/sched_rt.c:1602!
> invalid opcode: 0000 [#1] PREEMPT SMP kernel BUG at kernel/sched_rt.c:1602!

Unfortunately this does not tell us much. It tells us that something
went wrong, but not where and when :)

Thanks,

	tglx
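
The truncated report still names one concrete thing: the source file and
line of the BUG() that fired. A small sketch for pulling that location out
of a captured log, so it can be looked up in the matching source tree, is
shown below. The function name is made up for illustration, and as noted
above the location alone does not reveal the call path that led there.

```python
import re

# Matches lines such as "kernel BUG at kernel/sched_rt.c:1602!"
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")


def bug_sites(log_text):
    """Return the unique (file, line) pairs reported by BUG lines, in order."""
    seen = []
    for match in BUG_RE.finditer(log_text):
        site = (match.group("file"), int(match.group("line")))
        if site not in seen:
            seen.append(site)
    return seen


if __name__ == "__main__":
    # The two lines quoted in this thread, including the fused duplicate.
    log = (
        "kernel BUG at kernel/sched_rt.c:1602!\n"
        "invalid opcode: 0000 [#1] PREEMPT SMP "
        "kernel BUG at kernel/sched_rt.c:1602!\n"
    )
    for path, line in bug_sites(log):
        print(path, line)
```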


* Re: 2.6.29-rt1+ irqbalance = OOPS
  2009-04-01  6:43           ` Thomas Gleixner
@ 2009-04-04 11:20             ` Robin Gareus
  0 siblings, 0 replies; 8+ messages in thread
From: Robin Gareus @ 2009-04-04 11:20 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Gregory Haskins, linux-rt-users

I'm just testing 2.6.29-rt3 and irqbalance works fine again. Many thanks to
Peter Zijlstra.

cheers,
robin

