* Question about system lock-ups
@ 2008-01-05 3:36 Eddie Atherton
2008-01-06 23:23 ` Edward Shishkin
0 siblings, 1 reply; 3+ messages in thread
From: Eddie Atherton @ 2008-01-05 3:36 UTC
To: reiserfs-devel
Hi,
I don't think this issue is reiser related, but I thought I'd bounce it
here, in case anyone might be able to shed light on it, based on the
symptoms.
Yesterday, for the 3rd time, my system "kind of" locked up. Each time
this has happened, it's been after around a month of up-time, and all
times were at odd hours, so the system wasn't being used much, so I
don't think it's temperature related, otherwise I'd get them during the
day, when I am using the system heavily.
OK, the symptoms are these. On the monitor I can see my screensaver,
but it's not "in motion". The keyboard and mouse don't work. I can't
connect by ssh or VNC, but neither actually times out, they just hang.
Requests to DHCP hang. BUT, the system is still running, because it acts
as a server/gateway for the rest of the machines in the house, and they
can connect to the internet without problems. I can surf the web, I can
send and receive e-mails, anything but connect directly to the server.
The only way to get the system to respond is the "big red button". On
restart, and here's why I'm trying this list for ideas, I noticed,
again each time, hundreds and hundreds, if not thousands, of messages
showing reiser replaying transactions. It sometimes takes nearly 5
minutes for this to complete. What kind of issue could cause all of
these transactions to be replayed?
Examining the logs doesn't show anything, except that all logging
appears to stop at the same point in time, which I think is when the
"issue", whatever it is, occurs, because the time was around 4 hours
before I noticed the lock-up.
I thought about a drive issue, but smartd isn't reporting any problems,
normally, or via short/long self-tests. Plus, I'd expect some "failure"
messages at some point, not just a lock-up.
Lastly, here's the only relevant message I could find in dmesg, from the
resultant boot:
ReiserFS: hda2: checking transaction log (hda2)
ReiserFS: hda2: replayed 1125 transactions in 67 seconds
Now, here's the really weird part. hda2 is an UNUSED partition. It
isn't mounted on the system:
Filesystem      Size  Used Avail Use% Mounted on
/dev/hda1       9.3G  3.5G  5.9G  37% /
/dev/hdb2       187G  164G   24G  88% /usr/local/NotYet
/dev/hdb3        47G   18G   30G  37% /usr/local/Apps
/dev/sda5       233G  140G   93G  61% /usr/local/Music
/dev/sda6       233G  173G   60G  75% /usr/local/Backup
tmpfs           256M   68K  256M   1% /tmp
Any thoughts on what might be causing this and, more importantly, how to
track down the culprit?
Cheers,
Eddie
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Question about system lock-ups
From: Edward Shishkin @ 2008-01-06 23:23 UTC
To: stunnel; +Cc: reiserfs-devel
Eddie Atherton wrote:
> Hi,
Hello.
>
> I don't think this issue is reiser related, but I thought I'd bounce
> it here, in case anyone might be able to shed light on it, based on
> the symptoms.
>
> Yesterday, for the 3rd time, my system "kind of" locked up. Each time
> this has happened, it's been after around a month of up-time, and all
> times were at odd hours, so the system wasn't being used much, so I
> don't think it's temperature related, otherwise I'd get them during
> the day, when I am using the system heavily.
>
I had something similar on my dual Athlon 1700+ box.
Getting rid of the case fixed the problem, so I didn't
investigate this further ;)
> OK, the symptoms are these. On the monitor I can see my screensaver,
> but it's not "in motion". The keyboard and mouse don't work. I can't
> connect by ssh or VNC, but neither actually times out, they just hang.
> Requests to DHCP hang. BUT, the system is still running, because it
> acts as a server/gateway for the rest of the machines in the house,
> and they can connect to the internet without problems. I can surf the
> web, I can send and receive e-mails, anything but connect directly to
> the server.
>
> The only way to get the system to respond is the "big red button". On
> restart, and here's why I'm trying this list for ideas, I noticed,
> again each time, hundreds and hundreds, if not thousands, of messages
> showing reiser replaying transactions. It sometimes takes nearly 5
> minutes for this to complete. What kind of issue could cause all of
> these transactions to be replayed?
For example, pushing the "big red button" while a write is in progress.
>
> Examining the logs doesn't show anything, except that all logging
> appears to stop at the same point in time, which I think is when the
> "issue", whatever it is, occurs, because the time was around 4 hours
> before I noticed the lock-up.
>
> I thought about a drive issue, but smartd isn't reporting any problems,
> normally, or via short/long self-tests. Plus, I'd expect some
> "failure" messages at some point, not just a lock-up.
>
> Lastly, here's the only relevant message I could find in dmesg, from
> the resultant boot:
>
> ReiserFS: hda2: checking transaction log (hda2)
> ReiserFS: hda2: replayed 1125 transactions in 67 seconds
>
> Now, here's the really weird part. hda2 is an UNUSED partition. It
> isn't mounted on the system:
>
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/hda1       9.3G  3.5G  5.9G  37% /
> /dev/hdb2       187G  164G   24G  88% /usr/local/NotYet
> /dev/hdb3        47G   18G   30G  37% /usr/local/Apps
> /dev/sda5       233G  140G   93G  61% /usr/local/Music
> /dev/sda6       233G  173G   60G  75% /usr/local/Backup
> tmpfs           256M   68K  256M   1% /tmp
>
> Any thoughts on what might be causing this and, more importantly, how
> to track down the culprit?
>
> Cheers,
> Eddie
* Re: Question about system lock-ups
From: Eddie Atherton @ 2008-01-07 0:48 UTC
To: reiserfs-devel
> I had something similar on my dual Athlon 1700+ box.
> Getting rid of the case fixed the problem, so I didn't
> investigate this further ;)
>
Hmmmm. This is a dual Opteron setup.
> For example, pushing the "big red button" while a write is in progress.
>
Except the last writes to the disk were hours before the button was
pushed. Plus, I doubt there would be that many writes active at that
moment.
>> Now, here's the really weird part. hda2 is an UNUSED partition. It
>> isn't mounted on the system:
>>
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/hda1       9.3G  3.5G  5.9G  37% /
>> /dev/hdb2       187G  164G   24G  88% /usr/local/NotYet
>> /dev/hdb3        47G   18G   30G  37% /usr/local/Apps
>> /dev/sda5       233G  140G   93G  61% /usr/local/Music
>> /dev/sda6       233G  173G   60G  75% /usr/local/Backup
>> tmpfs           256M   68K  256M   1% /tmp
Any thoughts on this part?
Cheers,
Eddie
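
One way to narrow down the hda2 question, as a rough sketch rather than a diagnosis: ReiserFS replays its journal when a filesystem is mounted, so a replay message for hda2 suggests something mounted it during boot, even if it no longer shows up in df. The Python below cross-checks replay lines from dmesg against a /proc/mounts-style listing. The function names and the sample mount entries (including the filesystem types) are made up for illustration; only the two dmesg lines come from this thread. Note that some kernels list the root device as /dev/root in /proc/mounts, which this simple comparison does not resolve.

```python
import re

# Matches lines such as:
#   ReiserFS: hda2: replayed 1125 transactions in 67 seconds
REPLAY_RE = re.compile(
    r"ReiserFS: (\w+): replayed (\d+) transactions in (\d+) seconds"
)


def replayed_devices(dmesg_text):
    """Map each device with a replay line to (transactions, seconds)."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)))
        for m in REPLAY_RE.finditer(dmesg_text)
    }


def mounted_devices(mounts_text):
    """Collect short names (e.g. 'hda1') of /dev entries in mounts text."""
    devices = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("/dev/"):
            devices.add(fields[0][len("/dev/"):])
    return devices


def replayed_but_unmounted(dmesg_text, mounts_text):
    """Devices whose journal was replayed but which are not mounted now."""
    replayed = set(replayed_devices(dmesg_text))
    return sorted(replayed - mounted_devices(mounts_text))


if __name__ == "__main__":
    # dmesg lines quoted from the thread.
    sample_dmesg = (
        "ReiserFS: hda2: checking transaction log (hda2)\n"
        "ReiserFS: hda2: replayed 1125 transactions in 67 seconds\n"
    )
    # Hypothetical /proc/mounts content; filesystem types are assumed.
    sample_mounts = (
        "/dev/hda1 / reiserfs rw 0 0\n"
        "/dev/hdb2 /usr/local/NotYet reiserfs rw 0 0\n"
        "tmpfs /tmp tmpfs rw 0 0\n"
    )
    print(replayed_but_unmounted(sample_dmesg, sample_mounts))
```

On a live system the two inputs could come from the output of dmesg and the contents of /proc/mounts. If a device turns up in the result, /etc/fstab and the boot scripts are reasonable places to look for whatever mounts it.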