From: Brice Figureau <brice+lklm@daysofwonder.com>
To: linux-kernel@vger.kernel.org
Subject: Strange freeze on 2.6.22 (deadlock?)
Date: Mon, 07 Jan 2008 16:48:54 +0100 [thread overview]
Message-ID: <1199720934.11173.49.camel@localhost.localdomain> (raw)
Hi,
I'm seeing a strange complete server freeze/lock-up on an bi-Xeon HT
amd64 server running standard debian 2.6.22 (and before that vanilla
2.6.19.x and 2.6.20.x which exhibited the same issue).
I'm only reporting it now, since I could get a full sysrq-t only this
morning.
The symptoms are that every 5 to 7 days, the server (which acts as a MX
along with a few low traffic websites) locks-up. The ipmi watchdog is
unable to reboot the server (and doesn't even trigger, since there is no
evidence in the esmlog), the machine is still pingable. I can't ssh to
it, but I can enter my login & password on a serial console, but no
shell is started.
Pressing sysrq-t produced the trace hosted here:
http://www.daysofwonder.com/bug/crash-server1.txt.gz
It happened one time when I was connected to the server through ssh and
I could see that the load started to increase well above 100. It was
then impossible to launch new process from the command-line (and I had
to reboot manually).
It happened also last week, and the server was stuck for about 6 hours.
When I started investigating what was wrong, it slowly came back to life
(with an avg 1-min load of more than 1500, and tons of cron processes
running in parallel).
I'm not really familiar with kernel development so I can't really find
the issue in the aforementioned trace output.
What I think is that for some reason there is a race/deadlock that
finally prevents new processes to really start (which in turns produces
the high load).
What seems suspect in the aforementioned trace is:
*) lot of processes stacktrace ends in __mod_timer+0xc3/0xd3
which seems to be this line from kernel/timer.c
415 timer->expires = expires;
416 internal_add_timer(base, timer);
--> spin_unlock_irqrestore(&base->lock, flags);
419 return ret;
420 }
*) lot of processes stacktrace ends in __mutex_lock_slowpath and/or zone_statistics
Anyway, I will soon reboot to a 2.6.23.x to see if that symptom
persists.
More information (config, server specs) are available on request.
I'm not subscribed to the list, so please CC: me for any anwser.
Many thanks,
--
Brice Figureau <brice+lklm@daysofwonder.com>
next reply other threads:[~2008-01-07 16:18 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-07 15:48 Brice Figureau [this message]
2008-01-07 17:20 ` Strange freeze on 2.6.22 (deadlock?) Randy Dunlap
-- strict thread matches above, loose matches on Subject: below --
2008-01-07 19:06 Brice Figureau
2008-01-08 23:16 ` Chuck Ebbert
2008-01-09 16:02 ` Brice Figureau
2008-01-07 21:14 Brice Figureau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1199720934.11173.49.camel@localhost.localdomain \
--to=brice+lklm@daysofwonder.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.