From: Kalin KOZHUHAROV <kalin@thinrope.net>
To: linux-kernel@vger.kernel.org
Subject: Re: ata errors -> read-only root partition. Hardware issue?
Date: Sat, 14 Jan 2006 02:04:31 +0900
Message-ID: <dq8mj0$leb$1@sea.gmane.org>
In-Reply-To: <1137001442.27255.53.camel@mindpipe>

Lee Revell wrote:
> On Wed, 2006-01-11 at 16:26 +0100, jerome lacoste wrote:
> 
>>Could something else (bad cable or disk controller ) trigger these
>>issues?
>>
>>It would be great if we users had a quick way to decipher these
>>messages.
>>
>>E.g.
>>
>>"Buffer I/O error on device xxxx, logical block yyyyyyy"
>>
>>Usualy a disk failure, may also be caused by.... 
> 
> 
> This is not a bad idea, "status=0x51 { DriveReady SeekComplete Error }"
> in my experience always indicates a failing hard drive.  Maybe a
> "Possible drive or media failure" could be added?

I posted this in another thread, but reposting here. This is an Asus P5GDC-V
MB with WD740GD harddisk.

The machine was locking hard (no KBD, video, network) after a few hours of
uptime with kernels 2.6.12 ... 2.6.14.4; it is now running 2.6.15 with a patched
sk98lin. After some random time (up to 2 days), the dmesg output is full of these:

[snip, see below as it is identical]

The fs is mounted ro, and most I/O is dead (e.g. trying to run
/sbin/shutdown results in an I/O error). I checked the disk with WD Data
LifeGuard Tools and no errors were reported. smartctl says this:

Oops, the machine is borked again and not here; I will post the smartctl
output tomorrow, but basically no errors are reported after extended tests.

Now dmesg says:

[17225533.452000] ata1: port reset, p_is 40000001 is 1 pis 0 cmd c017 tf 471 ss 113 se 0
[17225533.452000] ata1: translated ATA stat/err 0x71/04 to SCSI SK/ASC/ASCQ 0xb/00/00
[17225533.452000] ata1: status=0x71 { DriveReady DeviceFault SeekComplete Error }
[17225533.452000] ata1: error=0x04 { DriveStatusError }
[17225533.452000] ata1: port reset, p_is 40000001 is 1 pis 0 cmd c017 tf 471 ss 113 se 0
[17225533.452000] ata1: translated ATA stat/err 0x71/04 to SCSI SK/ASC/ASCQ 0xb/00/00
[17225533.452000] ata1: status=0x71 { DriveReady DeviceFault SeekComplete Error }
[17225533.452000] ata1: error=0x04 { DriveStatusError }
[17225533.452000] ata1: port reset, p_is 40000001 is 1 pis 0 cmd c017 tf 471 ss 113 se 0
[17225533.452000] ata1: translated ATA stat/err 0x71/04 to SCSI SK/ASC/ASCQ 0xb/00/00
[17225533.452000] ata1: status=0x71 { DriveReady DeviceFault SeekComplete Error }
[17225533.452000] ata1: error=0x04 { DriveStatusError }
[17225533.452000] ata1: port reset, p_is 40000001 is 1 pis 0 cmd c017 tf 471 ss 113 se 0
[17225533.452000] ata1: translated ATA stat/err 0x71/04 to SCSI SK/ASC/ASCQ 0xb/00/00
[17225533.452000] ata1: status=0x71 { DriveReady DeviceFault SeekComplete Error }
[17225533.452000] ata1: error=0x04 { DriveStatusError }
[17225533.452000] ata1: port reset, p_is 40000001 is 1 pis 0 cmd c017 tf 471 ss 113 se 0
[17225533.452000] ata1: translated ATA stat/err 0x71/04 to SCSI SK/ASC/ASCQ 0xb/00/00
[17225533.452000] ata1: status=0x71 { DriveReady DeviceFault SeekComplete Error }
[17225533.452000] ata1: error=0x04 { DriveStatusError }
[17225533.452000] sd 0:0:0:0: SCSI error: return code = 0x8000002
[17225533.452000] sda: Current: sense key=0xb
[17225533.452000]     ASC=0x0 ASCQ=0x0
[17225533.452000] end_request: I/O error, dev sda, sector 17632540
[17225533.452000] Buffer I/O error on device sda3, logical block 1216070
[17225533.452000] lost page write due to I/O error on sda3
[17225677.824000] ReiserFS: sda3: warning: clm-6006: writing inode 10055 on readonly FS
[17225677.824000] ReiserFS: sda3: warning: clm-6006: writing inode 10055 on readonly FS
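
For anyone wanting a quick way to decipher the status byte in these
messages, here is a minimal sketch in Python. It assumes the standard ATA
status register bit layout and reuses the labels the kernel prints in the
log above; the function name decode_ata_status is made up for illustration
and is not a kernel or library API.

```python
# Standard ATA status register bits, highest bit first.
# The names mirror the labels seen in the dmesg output; the table itself
# is an illustrative assumption, not copied from the kernel source.
ATA_STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "DeviceFault",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",
    0x02: "Index",
    0x01: "Error",
}


def decode_ata_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS.items() if status & mask]


# 0x71 is the value in this log, 0x51 the one mentioned in the quoted mail.
print(hex(0x71), decode_ata_status(0x71))
print(hex(0x51), decode_ata_status(0x51))
```

The only difference between 0x51 and 0x71 is bit 0x20, which is the
DeviceFault flag shown in the log here.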


At least the good thing is that I can ssh now.

After a soft reboot (for i in s u s b; do echo $i >/proc/sysrq-trigger; sleep
1;done ) from a borked state (like now), the BIOS fails to detect the
harddisk and hangs indefinitely...

Kalin.

P.S. Will try sky2 tomorrow instead of sk98lin.
P.P.S. Also asked in the "2.6.15 and CONFIG_PRINTK_TIME" thread, but any idea
why such a strange time since boot is printed?
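
To show how far off the printed time is, here is a small Python sketch. It
assumes the bracketed value is seconds since boot, which is how
CONFIG_PRINTK_TIME formats it; the helper name printk_time_to_days is
hypothetical.

```python
SECONDS_PER_DAY = 86400


def printk_time_to_days(stamp):
    """Convert a printk timestamp string (seconds.microseconds) to days."""
    return float(stamp) / SECONDS_PER_DAY


# Timestamp taken from the first log line above.
days = printk_time_to_days("17225533.452000")
print(f"{days:.1f} days")
```

That works out to roughly 199 days, while the machine had been up for at
most about 2 days.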

-- 
|[ ~~~~~~~~~~~~~~~~~~~~~~ ]|
+-> http://ThinRope.net/ <-+
|[ ______________________ ]|

