Re: Ninth(?) Velociraptor replacement or md(RAID)/smartmontools(?) bug?

From: Peter Rabbitson <rabbit+list@rabbit.us>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: linux-raid <linux-raid@vger.kernel.org>,
	linux-kernel@vger.kernel.org, alan@lxorguk.ukuu.org.uk,
	smartmontools-support@lists.sourceforge.net,
	Bruce Allen <ballen@gravity.phys.uwm.edu>
Subject: Re: Ninth(?) Velociraptor replacement or md(RAID)/smartmontools(?) bug?
Date: Fri, 21 Nov 2008 12:30:23 +0100
Message-ID: <49269BCF.8060300@rabbit.us>
In-Reply-To: <alpine.DEB.1.10.0811210600310.5577@p34.internal.lan>

Justin Piszcz wrote:
> Comment 1: From Alan Cox:
> 
> ================================================================================
> 
> Alan Cox <alan@lxorguk.ukuu.org.uk>
> 
>> Error 1 occurred at disk power-on lifetime: 818 hours (34 days + 2 hours)
>>    When the command that caused the error occurred, the device was
>>    doing SMART Offline or Self-test.
>>
>>    After command completion occurred, registers were:
>>    ER ST SC SN CL CH DH
>>    -- -- -- -- -- -- --
>>    04 51 00 34 cf f3 a3
> 
> So Error 0x04 (ABRT)
> Status 0x51 (DRDY N/A ERR)      Error occurred, and at the point data
> transfer was expected
> 
> Which the spec says means the device errored the command because it does
> not support it.
> 
> Seems odd that this then tripped a raid failover
> ================================================================================
> 
> 
> Comment 1 Response: Should this have tripped a raid fail-over?  I have
> been having raid failures like this ever since I replaced all my
> raptor150s with velociraptor300 disks, what can be done so this does not
> occur?  Is this a WD/firmware bug or a bug in the md/raid code?
> 
> ================================================================================
> 
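
For reference, the register decoding Alan describes above (ER 0x04,
ST 0x51) can be reproduced with a short sketch. The bit names below are
my reading of the ATA Status and Error register layouts, with obsolete
bits given their historical names; the tables and the decode() helper
are illustrative assumptions, not anything taken from smartmontools or
the kernel.

```python
# Bit masks for the ATA Status and Error registers. Names are assumed
# from the ATA command set; bit 4 of Status was historically DSC and is
# what Alan labels "N/A" above.
STATUS_BITS = {
    0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
    0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR",
}
ERROR_BITS = {
    0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
    0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "AMNF",
}


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in sorted(table.items(), reverse=True)
            if value & mask]


# Values from the SMART error log quoted above.
print("ER:", decode(0x04, ERROR_BITS))
print("ST:", decode(0x51, STATUS_BITS))
```

With the quoted values this reports ABRT for the Error register and
DRDY, DSC and ERR for the Status register, matching Alan's reading.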

It might very well be a WD bug. I had three (3) identical WDC
WD2500AAJS-08B4A0 drives fail on me with the same _identical_ error
(same sector number to the last digit):

Oct 27 11:33:41 Arzamas kernel: ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x80000 action 0xe frozen
Oct 27 11:33:41 Arzamas kernel: ata6.00: irq_stat 0x01100010, PHY RDY changed
Oct 27 11:33:41 Arzamas kernel: ata6: SError: { 10B8B }
Oct 27 11:33:41 Arzamas kernel: ata6.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Oct 27 11:33:41 Arzamas kernel: res 06/37:00:00:00:00/00:00:00:00:06/00 Emask 0x12 (ATA bus error)
Oct 27 11:33:41 Arzamas kernel: ata6.00: error: { IDNF ABRT }
Oct 27 11:33:41 Arzamas kernel: ata6: hard resetting link
Oct 27 11:33:46 Arzamas kernel: ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Oct 27 11:33:46 Arzamas kernel: ata6.00: configured for UDMA/100
Oct 27 11:33:46 Arzamas kernel: ata6: EH complete
Oct 27 11:33:46 Arzamas kernel: sd 6:0:0:0: [sde] 488397168 512-byte hardware sectors (250059 MB)
Oct 27 11:33:46 Arzamas kernel: sd 6:0:0:0: [sde] Write Protect is off
Oct 27 11:33:46 Arzamas kernel: sd 6:0:0:0: [sde] Mode Sense: 00 3a 00 00
Oct 27 11:33:46 Arzamas kernel: sd 6:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 27 11:33:46 Arzamas kernel: end_request: I/O error, dev sde, sector 488166955
Oct 27 11:33:46 Arzamas kernel: md: super_written gets error=-5, uptodate=0
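
To make the "cmd" and "res" lines in the log above easier to read, here
is a minimal sketch that splits a libata taskfile dump into named
bytes. The field order is my understanding of how libata prints these
lines (first byte is the command for "cmd" and the status for "res",
second byte is the feature or error register); treat the FIELDS tuple
and the parse_taskfile() helper as assumptions, not a kernel interface.

```python
import re

# Assumed field order of libata's "cmd"/"res" taskfile dump.
# For a "res" line, "command" holds Status and "feature" holds Error.
FIELDS = (
    "command", "feature", "nsect", "lbal", "lbam", "lbah",
    "hob_feature", "hob_nsect", "hob_lbal", "hob_lbam", "hob_lbah",
    "device",
)


def parse_taskfile(text):
    """Split 'xx/xx:xx:xx:xx:xx/xx:xx:xx:xx:xx/xx' into named integers."""
    values = [int(byte, 16) for byte in re.split(r"[/:]", text.strip())]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} bytes, got {len(values)}")
    return dict(zip(FIELDS, values))


# Taskfiles copied from the log above.
cmd = parse_taskfile("ea/00:00:00:00:00/00:00:00:00:00/a0")
res = parse_taskfile("06/37:00:00:00:00/00:00:00:00:06/00")

print("command opcode:", hex(cmd["command"]))
print("result status: ", hex(res["command"]))
print("result error:  ", hex(res["feature"]))
```

Opcode 0xea is FLUSH CACHE EXT in the ATA command set, so the failing
command here was a cache flush rather than a read or write of a
particular sector.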


All 3 drives survived repeated rewrites of the sector in question, as
well as multiple SMART self-tests. I am currently in the process of
replacing these two drives with Seagates (the other 2 in the 4 member
array are Maxtors). Will see what happens.

Peter

P.S. See threads http://marc.info/?l=linux-raid&m=122523835815697 and
http://marc.info/?l=linux-raid&m=122669103213041 for more info on my
setup and hardware.

