From: Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU>
To: john stultz <johnstul@us.ibm.com>
Cc: nando@ccrma.Stanford.EDU, Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
rt-users <linux-rt-users@vger.kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Nick Piggin <npiggin@suse.de>
Subject: Re: 2.6.33.5 rt23: machine lockup (nfs/autofs related?)
Date: Fri, 09 Jul 2010 12:13:02 -0700
Message-ID: <1278702782.7122.1.camel@localhost.localdomain>
In-Reply-To: <1278702134.5102.9.camel@localhost.localdomain>

On Fri, 2010-07-09 at 12:02 -0700, Fernando Lopez-Lezcano wrote:
> On Thu, 2010-07-08 at 16:00 -0700, john stultz wrote:
> > On Thu, 2010-07-08 at 15:44 -0700, Fernando Lopez-Lezcano wrote:
> > > On Thu, 2010-07-08 at 15:33 -0700, john stultz wrote:
> > > > On Thu, 2010-07-08 at 10:19 -0700, Fernando Lopez-Lezcano wrote:
> > > > > We are having problems with 2.6.33.5+rt23, at least in our configuration
> > > > > while accessing an nfs automounted directory. This causes a complete
> > > > > machine lockup (press reset to exit as the only option).
> > > > >
> > > > > I simply use the Nautilus file manager (in Fedora 12) to navigate to an
> > > > > autofs mounted directory and the process monitor goes to 100% on one
> > > > > core (or maybe two), the mouse jerks a bit and the whole thing goes
> > > > > catatonic almost immediately.
> > > > >
> > > > > I get this in any open terminal at the time of the crash:
> > > > >
> > > > > --------
> > > > > Message from syslogd@localhost at Jul 8 10:13:54 ...
> > > > > kernel:------------[ cut here ]------------
> > > > >
> > > > > Message from syslogd@localhost at Jul 8 10:13:54 ...
> > > > > kernel:invalid opcode: 0000 [#1] PREEMPT SMP
> > > > >
> > > > > Message from syslogd@localhost at Jul 8 10:13:54 ...
> > > > > kernel:last sysfs
> > > > > file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
> > > > >
> > > > > Message from syslogd@localhost at Jul 8 10:13:54 ...
> > > > > kernel:Process nautilus (pid: 2874, ti=f0204000 task=f17dd1f0
> > > > > task.ti=f0204000)
> > > > >
> > > > > Message from syslogd@localhost at Jul 8 10:13:54 ...
> > > > > kernel:Stack:
> > > > >
> > > > > Message from syslogd@localhost at Jul 8 10:13:54 ...
> > > > > kernel:Call Trace:
> > > > >
> > > > > Message from syslogd@localhost at Jul 8 10:13:54 ...
> > > > > kernel:Code: 7b 08 00 89 45 b8 75 12 8d 43 04 89 43 04 89 43 08 8d 43
> > > > > 0c 89 43 0c 89 43 10 8b 43 14 64 8b 15 2c d1 a5 c0 83 e0 fc 39 c2 75 04
> > > > > <0f> 0b eb fe 8b 3a 81 ff 08 01 00 00 74 0a 83 ff 02 b8 04 00 00
> > > > >
> > > > > Message from syslogd@localhost at Jul 8 10:13:54 ...
> > > > > kernel:EIP: [<c0792c0f>] rt_spin_lock_slowlock+0x43/0x1bb SS:ESP
> > > > > 0068:f0205cbc
> > > > > --------
> > > > >
> > > > > And that's it... nothing else in the logs.
> > > >
> > > > Hrm. Not too much to go on there, but thanks for the report.
> > > >
> > > >
> > > > > For now we are booting into the normal Fedora kernel (this is on Fedora
> > > > > 12) as this makes the rt kernel not usable in our setup.
> > > > >
> > > > > Let me know if there is anything else I can do to help debug this...
> > > >
> > > > Had you done any testing with earlier 2.6.33-rt kernels where this
> > > > didn't occur? If so what version?
> > >
> > > I have been working with the whole series but my main usage case does
> > > not use nfs/autofs (see next paragraphs).
> > >
> > > I have noticed that the problem does not appear to happen when I cd into
> > > an nfs automounted directory directly. It appears to happen only when
> > > listing the contents of a mount point (ie: when "/whatever/" is an
> > > autofs mount point where several directories are mounted, not
> > > necessarily from the same server).
> > >
> > > Before switching to Fedora 12 users were normally running 2.6.29 rt and
> > > I had been running 2.6.31.x and 2.6.33.x rt, but I don't think it ever
> > > happened to me personally (I'm always using the command line - this is
> > > completely reproducible with nautilus). After the switch it started
> > > happening almost immediately to regular users (using nautilus mostly).
> > >
> > > How could I try to get more debugging information?
> >
> > Any chance you have a serial port on the machine in question? If so, it's
> > likely any oops messages could be collected over that.
>
> No response from the network or the keyboard or
> mouse at this point, reset is the only way out.

Not quite true: it does respond to the SysRq key (a sync command got an
immediate dump in the terminal). But the SysRq boot ('b') command does
not reboot the machine.

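
As a side note on the partial oops quoted above: the byte the kernel marks
with <..> in the "Code:" line is where EIP pointed when the trap hit. The
sketch below (Python, purely illustrative; the helper name faulting_bytes
is made up for this example) pulls out that byte and the one after it.
Here they are 0f 0b, which is the x86 ud2 instruction. That would be
consistent with a BUG() check firing in rt_spin_lock_slowlock and with the
"invalid opcode" line, but that reading is an inference from the bytes, not
something confirmed anywhere in this thread.

```python
import re

# "Code:" line copied from the oops quoted above, with the wrapped
# lines rejoined into a single string.
CODE_LINE = (
    "7b 08 00 89 45 b8 75 12 8d 43 04 89 43 04 89 43 08 8d 43 "
    "0c 89 43 0c 89 43 10 8b 43 14 64 8b 15 2c d1 a5 c0 83 e0 fc 39 c2 75 04 "
    "<0f> 0b eb fe 8b 3a 81 ff 08 01 00 00 74 0a 83 ff 02 b8 04 00 00"
)


def faulting_bytes(code_line, count=2):
    """Return (offset, bytes) starting at the byte marked with <..>.

    Hypothetical helper for illustration only; 'count' is how many
    bytes to return, including the marked one.
    """
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        m = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if m:
            raw = [m.group(1)] + tokens[i + 1 : i + count]
            return i, bytes(int(b, 16) for b in raw)
    raise ValueError("no <..> marker found in Code: line")


offset, insn = faulting_bytes(CODE_LINE)
is_ud2 = insn == b"\x0f\x0b"  # 0f 0b encodes ud2 on x86
print(f"marked byte at offset {offset}: {insn.hex()} (ud2: {is_ud2})")
```

The full register dump and call trace are still missing from what syslogd
broadcast to the terminals, so this only narrows down the kind of trap, not
its cause.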
-- Fernando