linux-rt-users.vger.kernel.org archive mirror
* Hard lockup with 2.6.24.7-rt26 on x86
@ 2009-02-02 20:55 m.luescher
  2009-02-03 15:00 ` Clark Williams
  2009-02-03 19:20 ` Remy Bohmer
  0 siblings, 2 replies; 7+ messages in thread
From: m.luescher @ 2009-02-02 20:55 UTC (permalink / raw)
  To: linux-rt-users

Dear Linux rt users

On two different desktop PCs (both Core 2 Duo) and on a notebook (Core Duo), the kernel (2.6.24.7-rt26, CONFIG_X86_32) locks up hard under a certain RT load. The failure can be reproduced reliably with the following command:

sudo ./cyclictest -p99 -t10 -n -i250

During the test I made some additional observations:
- The lockup is really hard: there is no output on a netconsole, and even the magic SysRq reboot key (Alt+SysRq+b) no longer works.
- With a lower realtime priority the system seems to run more stably (e.g. sudo ./cyclictest -p95 -t10 -n -i250).
- With the vanilla 2.6.26.6-rt11 kernel I observed what is probably the same crash.
- The problem also appears on Ubuntu Hardy Heron with the latest official kernel (2.6.24.23.25 with the rt21 patch).
- The problem did not appear on Ubuntu Hardy Heron with older rt-kernel versions (also 2.6.24, but based on the rt3 patch).

So at the moment I am really clueless, and I would be grateful for any hint on how to debug this crash further.

Best regards
Matthias

* Re: Hard lockup with 2.6.24.7-rt26 on x86
  2009-02-02 20:55 Hard lockup with 2.6.24.7-rt26 on x86 m.luescher
@ 2009-02-03 15:00 ` Clark Williams
  2009-02-03 15:44   ` Carsten Emde
  2009-02-03 19:20 ` Remy Bohmer
  1 sibling, 1 reply; 7+ messages in thread
From: Clark Williams @ 2009-02-03 15:00 UTC (permalink / raw)
  To: m.luescher; +Cc: linux-rt-users

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, 2 Feb 2009 21:55:36 +0100
m.luescher@vtxmail.ch wrote:

> Dear Linux rt users
> 
> On two different desktop PC (both Core 2 Duo) and on a notebook (Core Duo) the kernel (2.6.24.7-rt26, CONFIG_X86_32) locks up hard under a certain rt load. The failure can be reproduced reliably with the following command:
> 
> sudo ./cyclictest -p99 -t10 -n -i250
> 
> During the test I made some additional observations:
> - The lockup is really hard: there is no output on a netconsole and even the magic sysrequest keys to reboot the system (Alt+SysRQ+b) do not work anymore.
> - with a lower realtime priority the system seems to run at least more stable (e.g. sudo ./cyclictest -p95 -t10 -n -i250)
> - with the vanilla 2.6.26.6-rt11 kernel I observed probably the same crash
> - the problem also appears on Ubuntu Hardy Heron with the latest official kernel (2.6.24.23.25 with rt21 patch)
> - the problem did not appear on Ubuntu Hardy Heron with older rt-kernel versions (also 2.6.24 but based on rt3 patch)
> 
> So at the moment I am really clueless and I would be happy for any hint on how I can further debug this crash.
> 
> Best regards
> Matthias

My guess would be that since you're running 10 threads on two
processors at the highest available priority, you're starving all the
hard IRQ threads (as well as soft irq and other kernel threads).

This is one of those power-tool moments where you can lock up the
system with the wrong workload/priority combination.

Can you lock up the system running at -p49 (I suspect all your [IRQ-X]
threads are running at priority 50)?

Clark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.10 (GNU/Linux)

iEYEARECAAYFAkmIXCcACgkQHyuj/+TTEp0nqgCfdoddZvpxUwcpN5EXRAE90kmf
hegAn18D1Ji5cGqeJwgHNk3Xb8fxMrCe
=F04E
-----END PGP SIGNATURE-----
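
One way to check the priority assumption in the message above is to list the realtime tasks and compare their priorities with the one passed to cyclictest. The following is a minimal sketch, not a command taken from the thread: the function name rt_tasks_at_or_above is made up for illustration, and it assumes a procps-style ps that supports the cls and rtprio output columns.

```shell
# Print the tasks whose realtime priority is at or above a threshold.
# Reads "PID CLS RTPRIO COMMAND" lines on stdin, which is the layout produced
# by `ps -eo pid,cls,rtprio,comm`, so it can also be tried on canned input.
# Tasks without a realtime priority show "-" in that column and are skipped,
# as is the header line.
rt_tasks_at_or_above() {
    awk -v min="$1" '$3 ~ /^[0-9]+$/ && $3 + 0 >= min + 0 { print }'
}

# Usage on a live system (hypothetical threshold of 50):
#   ps -eo pid,cls,rtprio,comm | rt_tasks_at_or_above 50
```

If the IRQ threads show up at priority 50, a cyclictest run at -p49 stays below them, while -p99 runs above every one of them.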

* Re: Hard lockup with 2.6.24.7-rt26 on x86
  2009-02-03 15:00 ` Clark Williams
@ 2009-02-03 15:44   ` Carsten Emde
  2009-02-03 16:26     ` Clark Williams
  0 siblings, 1 reply; 7+ messages in thread
From: Carsten Emde @ 2009-02-03 15:44 UTC (permalink / raw)
  To: Clark Williams
  Cc: m.luescher, linux-rt-users, Thomas Gleixner, Steven Rostedt

Clark,

>> On two different desktop PC (both Core 2 Duo) and on a notebook
>> (Core Duo) the kernel (2.6.24.7-rt26, CONFIG_X86_32) locks up hard
>> under a certain rt load. The failure can be reproduced reliably
>> with the following command:
>> sudo ./cyclictest -p99 -t10 -n -i250
> My guess would be that since you're running 10 threads on two 
> processors at the highest available priority, you're starving all the
> hard IRQ threads (as well as soft irq and other kernel threads).
> This is one of those power-tool moments where you can lock up the 
> system with the wrong workload/priority combination.
Hmm, I would agree immediately if the tasks were using all of the CPU
power. But this is not the case. In the present case, running

# cyclictest -p99 -t10 -n -i250

results in about 10% CPU load. So there is plenty of time in between to
let the system respond. I can reproduce the problem here (2.6.24-rt,
2.6.26-rt, various configurations). The system does not crash every time
cyclictest is started, only about once in 5 trials. If it
crashes, it does so immediately after being started. Whenever
cyclictest survives for several seconds, it never crashes at a
later time. I tend to believe that we have a bug here, not regular
behavior.

	Carsten.
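
Because the lockup reportedly hits only about one start in five, and then right at startup, a loop of short, bounded runs is a natural way to reproduce it. The sketch below is one possible approach under stated assumptions: the function name repeat_logged, the log path and the attempt and loop counts are all hypothetical, and it relies on cyclictest's -l (loops) option to make each run terminate.

```shell
# Run a command repeatedly, recording each attempt on disk *before* it starts,
# so the log survives a hard lockup and shows how many runs preceded it.
# Arguments: log file, number of attempts, then the command to run.
repeat_logged() {
    log=$1
    attempts=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        echo "attempt $i: $*" >> "$log"
        sync    # flush the log line to disk before the risky run
        "$@" || echo "attempt $i: exit status $?" >> "$log"
        i=$((i + 1))
    done
}

# Usage (assumed values: 20 attempts, 10000 loops per thread and run):
#   repeat_logged /var/tmp/cyclictest-repro.log 20 \
#       sudo ./cyclictest -p99 -t10 -n -i250 -l10000
```

After a forced reboot, the last "attempt" line in the log tells how many starts it took to trigger the hang.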

* Re: Hard lockup with 2.6.24.7-rt26 on x86
  2009-02-03 15:44   ` Carsten Emde
@ 2009-02-03 16:26     ` Clark Williams
  0 siblings, 0 replies; 7+ messages in thread
From: Clark Williams @ 2009-02-03 16:26 UTC (permalink / raw)
  To: Carsten Emde; +Cc: m.luescher, linux-rt-users, Thomas Gleixner, Steven Rostedt

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 03 Feb 2009 16:44:10 +0100
Carsten Emde <Carsten.Emde@osadl.org> wrote:

> Clark,
> 
> >> On two different desktop PC (both Core 2 Duo) and on a notebook
> >> (Core Duo) the kernel (2.6.24.7-rt26, CONFIG_X86_32) locks up hard
> >> under a certain rt load. The failure can be reproduced reliably
> >> with the following command:
> >> sudo ./cyclictest -p99 -t10 -n -i250
> > My guess would be that since you're running 10 threads on two 
> > processors at the highest available priority, you're starving all the
> > hard IRQ threads (as well as soft irq and other kernel threads).
> > This is one of those power-tool moments where you can lock up the 
> > system with the wrong workload/priority combination.
> Hmm, I would agree immediately, if the tasks were using all of the CPU
> power. But this is not the case. In the present case, running
> 
> # cyclictest -p99 -t10 -n -i250
> 
> results in about 10% CPU load. So there is plenty of time in between to
> let the system respond. I can reproduce the problem here (2.6.24-rt,
> 2.6.26-rt, various configurations). The system does not crash every time
> when cyclictest is started, only once in about 5 trials or so. If it
> crashes, then it does so immediately after being started. Whenever
> cyclictest survives for several seconds, then it never crashes at a
> later time. I tend to believe that we have a bug here, not a regular
> behavior.
> 

That's what I get for firing off a response without running it first :)

I don't have a 2.6.26 test box handy, but I'll set that up this
afternoon. I agree with your results that this is a bug, since I
haven't been able to kill my 2.6.24.7-based RT test system and if it
was just starvation I should be able to do that easily.

Clark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.10 (GNU/Linux)

iEYEARECAAYFAkmIcDMACgkQHyuj/+TTEp21DgCfUihH/6BUUS5o3SwlrnK57dJC
/HQAoMEZp8bYuOEBiS6wVtbGgA0vi0MS
=X1jB
-----END PGP SIGNATURE-----

* Re: Hard lockup with 2.6.24.7-rt26 on x86
  2009-02-02 20:55 Hard lockup with 2.6.24.7-rt26 on x86 m.luescher
  2009-02-03 15:00 ` Clark Williams
@ 2009-02-03 19:20 ` Remy Bohmer
       [not found]   ` <902d99b50902040046u3257591fxe29cef4e62315b1a@mail.gmail.com>
  1 sibling, 1 reply; 7+ messages in thread
From: Remy Bohmer @ 2009-02-03 19:20 UTC (permalink / raw)
  To: m.luescher; +Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner

Hello Matthias,

2009/2/2  <m.luescher@vtxmail.ch>:
> Dear Linux rt users
>
> On two different desktop PC (both Core 2 Duo) and on a notebook (Core Duo) the kernel (2.6.24.7-rt26, CONFIG_X86_32)
> locks up hard under a certain rt load. The failure can be reproduced reliably with the following command:

I noticed some hangs too while using 2.6.24.7-rt26, while
2.6.24.7-rt20 was running fine.
I can reproduce it here on an ARM11 core: when
CONFIG_DEBUG_LOCKING_API_SELFTESTS is set, the system hangs during boot
on rt26, while it boots properly on rt20. So there is a regression
here.
I have not yet found the time to bisect it to the offending RT patch.

Kind regards,

Remy
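
With rt20 reported good and rt26 reported bad, the search for the offending change can be narrowed by a binary search over the patch levels in between. The sketch below illustrates only that search logic: the function name first_bad_level and the test-command convention are made up here, it assumes every intermediate level (rt21 to rt25) is available to build and boot, and it works at the granularity of whole -rtNN releases rather than the individual patches inside one.

```shell
# Binary search for the first bad -rt patch level between a known-good and a
# known-bad level. The test command receives the level as its last argument
# and must exit 0 for "good" and non-zero for "bad".
first_bad_level() {
    good=$1
    bad=$2
    shift 2
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if "$@" "$mid"; then
            good=$mid    # the regression was introduced after this level
        else
            bad=$mid     # this level already shows the regression
        fi
    done
    echo "$bad"
}

# Usage with a hypothetical helper that builds, boots and tests one level:
#   first_bad_level 20 26 ./boots_ok_with_rt_level
```

For the six candidate levels between rt20 and rt26 this needs at most three test boots.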

* Re: Hard lockup with 2.6.24.7-rt26 on x86
       [not found]   ` <902d99b50902040046u3257591fxe29cef4e62315b1a@mail.gmail.com>
@ 2009-02-04 11:56     ` Remy Bohmer
  2009-02-04 12:31       ` Claus Gindhart
  0 siblings, 1 reply; 7+ messages in thread
From: Remy Bohmer @ 2009-02-04 11:56 UTC (permalink / raw)
  To: Denis Borisevich; +Cc: linux-rt-users

Hello Denis,

Please keep the mailing list on CC.

Kind Regards,

Remy

2009/2/4 Denis Borisevich <dennisfen@gmail.com>:
> 2009/2/3 Remy Bohmer <linux@bohmer.net>:
>> In noticed some hangups too while using 2.6.24.7-rt26, while
>> 2.6.24.7-rt20 was running fine.
>> I can reproduce it here on an ARM11 core, when
>> CONFIG_DEBUG_LOCKING_API_SELFTESTS is set the system hangs during boot
>> on rt26, while it boots properly on rt20. So, there is a regression
>> here.
>
> I second that. With CONFIG_DEBUG_LOCKING_API_SELFTEST system hangs on
> boot with -rt26.
>
> --
> Denis
>

* Re: Hard lockup with 2.6.24.7-rt26 on x86
  2009-02-04 11:56     ` Remy Bohmer
@ 2009-02-04 12:31       ` Claus Gindhart
  0 siblings, 0 replies; 7+ messages in thread
From: Claus Gindhart @ 2009-02-04 12:31 UTC (permalink / raw)
  To: Remy Bohmer; +Cc: Denis Borisevich, linux-rt-users

Hi all,

On Wednesday 04 February 2009 12:56:26 pm Remy Bohmer wrote:
> Hello Denis,
>
> Please keep the mailinglist on the CC.
>
> Kind Regards,
>
> Remy
>
> 2009/2/4 Denis Borisevich <dennisfen@gmail.com>:
> > 2009/2/3 Remy Bohmer <linux@bohmer.net>:
> >> In noticed some hangups too while using 2.6.24.7-rt26, while
> >> 2.6.24.7-rt20 was running fine.
> >> I can reproduce it here on an ARM11 core, when
> >> CONFIG_DEBUG_LOCKING_API_SELFTESTS is set the system hangs during boot
> >> on rt26, while it boots properly on rt20. So, there is a regression
> >> here.
> >
> > I second that. With CONFIG_DEBUG_LOCKING_API_SELFTEST system hangs on
> > boot with -rt26.
> >
> > --
> > Denis

For your info:
I tried to reproduce this with 2.6.24.7-rt21, and I can't reproduce the problem
here.

-- 
Mit freundlichen Gruessen / Best regards

Claus Gindhart
SW R&D
Kontron Modular Computers
phone :++49 (0)8341-803-374
mailto:claus.gindhart@kontron-modular.com
http://www.kontron.com
