From: Darren Hart <dvhart@linux.intel.com>
To: Nathan Grennan <linux-rt-users@cygnusx-1.org>
Cc: linux-rt-users@vger.kernel.org
Subject: Re: Soft lock issue with 2.6.33.7-rt29
Date: Fri, 19 Nov 2010 10:46:51 -0800
Message-ID: <4CE6C61B.6090608@linux.intel.com>
In-Reply-To: <4CE5A4A3.4040202@cygnusx-1.org>

On 11/18/2010 02:11 PM, Nathan Grennan wrote:
> On 11/17/2010 05:26 PM, Darren Hart wrote:
>> On 11/17/2010 11:11 AM, Nathan Grennan wrote:
>>> I have been working for weeks to get a stable rt kernel. I had been
>>> focusing on 2.6.31.6-rt19. It is stable for about four days under stress
>>> testing before it soft locks. I am using rt19 instead of rt21, because
>>> rt19 seems to be more stable. The rtmutex issue that seems to still be
>>> in rt29 is in rt21. I also had to backport the iptables fix to rt19.
>>>
>>> I just started looking at 2.6.33.7-rt29 again, since I can reproduce a
>>> soft lock with it in 10-15 minutes. I have yet to get sysrq output for
>>> rt19, since it takes four days. The soft lock with rt29 as far as I can
>>> tell seems to relate to disk i/o.
>>>
>>> There are links to two logs of rt29 from a serial console below. They
>>> include sysrq output like "Show Blocked State" and "Show State". The
> >>> level7 file is with nfsd enabled, and level9 is with it disabled. So nfsd
> >>> doesn't seem to be the issue.
>>>
>>> If any other debugging information is useful or needed, just say the
>>> word.
>>
>> A reproducible test-case is always the first thing we ask for :-) What
>> is your stress test?
>
> I have been able to boil it down to the script below. If I just run yes,
> it is fine; if I just run dd, it is fine. If I just run octave, it is
> fine. Running yes+dd gets it most of the way there, but it will wake up
> sometimes, off and on. Do all three together and it soft locks. It takes
> 5-15 minutes. I did it on our main example hardware, which is a server.
> I have also reproduced it on a desktop. Sometimes sysrq-n, to renice
> realtime processes, brings it out of it enough you can kill processes off.

Interesting, so you're locking up a preempt-rt kernel with SCHED_OTHER
tasks running at the least favorable priority.

Note: 19 is actually the highest valid nice value, so nice -n 19 is what
you want (20 and higher seem to be accepted, but have the same effect as
19). See NICE(1).

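A minimal sketch of the clamping described above, assuming the Linux
niceness range of -20..19. The names clamp_niceness and
niceness_after_increment are hypothetical helpers made up for this
illustration, not part of any tool:

```python
import os
import subprocess
import sys

# Assumption: Linux niceness range; other Unix systems may differ.
NICE_MIN, NICE_MAX = -20, 19


def clamp_niceness(requested: int) -> int:
    """Model of the clamp: out-of-range requests land on the nearest limit."""
    return max(NICE_MIN, min(NICE_MAX, requested))


def niceness_after_increment(increment: int) -> int:
    """Start a child interpreter, raise its niceness by `increment`,
    and return the niceness the child reports (the parent is untouched)."""
    code = f"import os; print(os.nice({increment}))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    for requested in (19, 20, 25):
        print(requested, "->", clamp_niceness(requested))
```

On a Linux machine where the parent runs at niceness 0, I would expect
niceness_after_increment(25) to agree with clamp_niceness(25), which is
the "20 and higher behave like 19" effect.
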
How many CPUs on your test machine?
--
Darren Hart
Yocto Linux Kernel