reiserfs-devel.vger.kernel.org archive mirror
* Question about system lock-ups
@ 2008-01-05  3:36 Eddie Atherton
  2008-01-06 23:23 ` Edward Shishkin
  0 siblings, 1 reply; 3+ messages in thread
From: Eddie Atherton @ 2008-01-05  3:36 UTC
  To: reiserfs-devel

Hi,

I don't think this issue is reiser related, but I thought I'd bounce it 
here, in case anyone might be able to shed light on it, based on the 
symptoms.

Yesterday, for the 3rd time, my system "kind of" locked up. Each time
this has happened, it's been after around a month of up-time, and always
at odd hours, when the system wasn't being used much. So I don't think
it's temperature related; otherwise I'd get lock-ups during the day,
when I am using the system heavily.

OK, the symptoms are these. On the monitor I can see my screensaver,
but it's not "in motion". The keyboard and mouse don't work. I can't
connect by ssh or VNC, but neither actually times out, they just hang.
Requests to DHCP hang. BUT, the system is still running, because it
acts as a server/gateway for the rest of the machines in the house, and
they can connect to the internet without problems. I can surf the web,
I can send and receive e-mails, anything but connect directly to the
server.

The only way to get the system to respond is the "big red button". On
restart, and here's why I'm trying this list for ideas, I noticed,
again each time, hundreds and hundreds, if not thousands, of messages
showing reiser replaying transactions. It sometimes takes nearly 5
minutes for this to complete. What kind of issue could cause all of
these transactions to be replayed?

Examining the logs doesn't show anything, except that all logging 
appears to stop at the same point in time, which I think is when the 
"issue", whatever it is, occurs, because the time was around 4 hours 
before I noticed the lock-up.
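
To pin down the moment logging stopped, the last timestamp in each log
file can be compared across files. A minimal Python sketch follows,
assuming classic syslog-style lines that begin with "Mon DD HH:MM:SS"
and English month names; the function name `last_log_time` is
hypothetical, and the year has to be supplied by the caller because
that log format does not record it.

```python
from datetime import datetime


def last_log_time(lines, year):
    """Return the latest timestamp found in syslog-style lines, or None.

    Assumes each log line starts with a 15-character stamp such as
    "Jan  4 23:41:07". Lines that do not parse are skipped.
    """
    latest = None
    for line in lines:
        stamp = line[:15]
        try:
            when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped line
        if latest is None or when > latest:
            latest = when
    return latest
```

Running this over each file under the log directory and comparing the
results would show whether every logger went quiet at the same instant
or whether some kept writing longer than others.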

I thought about a drive issue, but smartd isn't reporting any problems,
either normally or via short/long self-tests. Plus, I'd expect some
"failure" messages at some point, not just a lock-up.

Lastly, here's the only relevant message I could find in dmesg, from the 
resultant boot:

ReiserFS: hda2: checking transaction log (hda2)
ReiserFS: hda2: replayed 1125 transactions in 67 seconds

Now, here's the really weird part.  hda2 is an UNUSED partition.  It 
isn't mounted on the system:

Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1             9.3G  3.5G  5.9G  37% /
/dev/hdb2             187G  164G   24G  88% /usr/local/NotYet
/dev/hdb3              47G   18G   30G  37% /usr/local/Apps
/dev/sda5             233G  140G   93G  61% /usr/local/Music
/dev/sda6             233G  173G   60G  75% /usr/local/Backup
tmpfs                 256M   68K  256M   1% /tmp
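
The mismatch above can be checked mechanically. The Python sketch
below is only an illustration: it pulls the device names out of the
"replayed N transactions" messages and reports any that do not appear
in a mount listing. The function names are hypothetical, and the sample
text is copied from the dmesg and df output in this message. It assumes
devices are listed as /dev/NAME, which may not hold for a root
filesystem mounted by label or shown as /dev/root.

```python
import re

REPLAY_RE = re.compile(
    r"ReiserFS: (\S+): replayed (\d+) transactions in (\d+) seconds"
)


def replayed_devices(dmesg_text):
    """Map device name -> (transactions, seconds) for each replay message."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)))
        for m in REPLAY_RE.finditer(dmesg_text)
    }


def mounted_devices(mounts_text):
    """Return short device names (e.g. 'hda1') from df or /proc/mounts text."""
    names = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("/dev/"):
            names.add(fields[0][len("/dev/"):])
    return names


def replayed_but_unmounted(dmesg_text, mounts_text):
    """Return replay info for devices that are absent from the mount list."""
    mounted = mounted_devices(mounts_text)
    return {
        dev: info
        for dev, info in replayed_devices(dmesg_text).items()
        if dev not in mounted
    }


# Sample input taken from the output quoted in this message.
DMESG = """ReiserFS: hda2: checking transaction log (hda2)
ReiserFS: hda2: replayed 1125 transactions in 67 seconds
"""
DF = """Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1             9.3G  3.5G  5.9G  37% /
/dev/hdb2             187G  164G   24G  88% /usr/local/NotYet
/dev/hdb3              47G   18G   30G  37% /usr/local/Apps
/dev/sda5             233G  140G   93G  61% /usr/local/Music
/dev/sda6             233G  173G   60G  75% /usr/local/Backup
tmpfs                 256M   68K  256M   1% /tmp
"""

print(replayed_but_unmounted(DMESG, DF))  # {'hda2': (1125, 67)}
```

On a live system the same functions could be fed the output of dmesg
and the contents of /proc/mounts instead of the pasted text.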

Any thoughts on what might be causing this and, more importantly, how
to track down the culprit?

Cheers,
Eddie



* Re: Question about system lock-ups
  2008-01-05  3:36 Question about system lock-ups Eddie Atherton
@ 2008-01-06 23:23 ` Edward Shishkin
  2008-01-07  0:48   ` Eddie Atherton
  0 siblings, 1 reply; 3+ messages in thread
From: Edward Shishkin @ 2008-01-06 23:23 UTC
  To: stunnel; +Cc: reiserfs-devel

Eddie Atherton wrote:

> Hi,


Hello.

>
> I don't think this issue is reiser related, but I thought I'd bounce 
> it here, in case anyone might be able to shed light on it, based on 
> the symptoms.
>
> Yesterday, for the 3rd time, my system "kind of" locked up. Each time 
> this has happened, it's been after around a month of up-time, and all 
> times were at odd hours, so the system wasn't being used much, so I 
> don't think it's temperature related, otherwise I'd get them during 
> the day, when I am using the system heavily.
>

I had something similar on my dual Athlon 1700+ box.
Getting rid of the case fixed the problem, so I didn't
investigate this further ;)

> OK, the symptoms are this. On the monitor I can see my screensaver,
> but it's not "in motion". The keyboard and mouse don't work. I can't
> connect by ssh or VNC, but neither actually time out, they just hang.
> Requests to DHCP hang. BUT, the system is still running, because it
> acts as a server/gateway for the rest of the machines in the house,
> and they can connect to the internet without problems. I can surf the
> web, I can send and receive e-mails, anything but connect directly to
> the server.
>
> The only way to get the system to respond, is the "big red button". On 
> restart, and here's why I'm trying this list for ideas, is that I 
> noticed, again each time, is hundreds and hundreds, if not thousands 
> of messages showing reiser replaying transactions. It takes, 
> sometimes, nearly 5 minutes for this to complete.  What kind of issue 
> could cause all of these transactions to be replayed.


For example, pushing the "big red button" while a write is in progress.

>
> Examining the logs doesn't show anything, except that all logging 
> appears to stop at the same point in time, which I think is when the 
> "issue", whatever it is, occurs, because the time was around 4 hours 
> before I noticed the lock-up.
>
> thought about a drive issue, but smartd isn't reporting any problems, 
> normally, or via short/long self-tests. Plus, I'd expect some 
> "failure" messages at some point, not just a lock-up.
>
> Lastly, here's the only relevant message I could find in dmesg, from 
> the resultant boot:
>
> ReiserFS: hda2: checking transaction log (hda2)
> ReiserFS: hda2: replayed 1125 transactions in 67 seconds
>
> Now, here's the really weird part.  hda2 is an UNUSED partition.  It 
> isn't mounted on the system:
>
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/hda1             9.3G  3.5G  5.9G  37% /
> /dev/hdb2             187G  164G   24G  88% /usr/local/NotYet
> /dev/hdb3              47G   18G   30G  37% /usr/local/Apps
> /dev/sda5             233G  140G   93G  61% /usr/local/Music
> /dev/sda6             233G  173G   60G  75% /usr/local/Backup
> tmpfs                 256M   68K  256M   1% /tmp
>
> Any thoughts on what might be causing this, and more importantly, how 
> to track down the culprit.
>
> Cheers,
> Eddie
>



* Re: Question about system lock-ups
  2008-01-06 23:23 ` Edward Shishkin
@ 2008-01-07  0:48   ` Eddie Atherton
  0 siblings, 0 replies; 3+ messages in thread
From: Eddie Atherton @ 2008-01-07  0:48 UTC
  To: reiserfs-devel


> I had something similar on my dual athlon 1700+ box.
> Getting rid of case has fixed the problem, so I didn't
> investigate this further ;)
>
Hmmmm.  This is a dual Opteron setup.

> For example, pushing the "big red button" when write is in progress..
>
Except the last writes to the disk were hours before the button was 
pushed.  Plus, I doubt there would be that many writes active at that 
moment.
>> Now, here's the really weird part.  hda2 is an UNUSED partition.  It 
>> isn't mounted on the system:
>>
>> Filesystem            Size  Used Avail Use% Mounted on
>> /dev/hda1             9.3G  3.5G  5.9G  37% /
>> /dev/hdb2             187G  164G   24G  88% /usr/local/NotYet
>> /dev/hdb3              47G   18G   30G  37% /usr/local/Apps
>> /dev/sda5             233G  140G   93G  61% /usr/local/Music
>> /dev/sda6             233G  173G   60G  75% /usr/local/Backup
>> tmpfs                 256M   68K  256M   1% /tmp
Any thoughts on this part?

Cheers,
Eddie
