Re: Ninth(?) Velociraptor replacement or md(RAID)/smartmontools(?) bug?

From: Peter Rabbitson <rabbit+list@rabbit.us>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: linux-raid <linux-raid@vger.kernel.org>,
	linux-kernel@vger.kernel.org, alan@lxorguk.ukuu.org.uk,
	martmontools-support@lists.sourceforge.net,
	Bruce Allen <ballen@gravity.phys.uwm.edu>
Subject: Re: Ninth(?) Velociraptor replacement or md(RAID)/smartmontools(?) bug?
Date: Fri, 21 Nov 2008 12:30:23 +0100
Message-ID: <49269BCF.8060300@rabbit.us>
In-Reply-To: <alpine.DEB.1.10.0811210600310.5577@p34.internal.lan>

Justin Piszcz wrote:
> Comment 1: From Alan Cox:
> 
> ================================================================================
> 
> Alan Cox <alan@lxorguk.ukuu.org.uk>
> 
>> Error 1 occurred at disk power-on lifetime: 818 hours (34 days + 2 hours)
>>    When the command that caused the error occurred, the device was
>>    doing SMART Offline or Self-test.
>>
>>    After command completion occurred, registers were:
>>    ER ST SC SN CL CH DH
>>    -- -- -- -- -- -- --
>>    04 51 00 34 cf f3 a3
> 
> So Error 0x04 (ABRT)
> Status 0x51 (DRDY N/A ERR)      Error occurred, and at the point data
> transfer was expected
> 
> Which the spec says means the device errored the command because it does
> not support it.
> 
> Seems odd that this then tripped a raid failover
> ================================================================================
> 
> 
> Comment 1 Response: Should this have tripped a raid fail-over?  I have
> been having raid failures like this ever since I replaced all my
> raptor150s with velociraptor300 disks, what can be done so this does not
> occur?  Is this a WD/firmware bug or a bug in the md/raid code?
> 
> ================================================================================
> 
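
For anyone following the register decode quoted above, here is a minimal
sketch of how the two bytes break down into flags. The bit names are the
conventional ATA task-file names as I remember them and are an assumption
on my part, not a quote from the spec; the decode() helper is hypothetical.

```python
# Conventional ATA status/error bit names (assumed for illustration;
# check the ATA spec for the authoritative definitions).
STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "DSC",   # shown as "N/A" in the decode quoted above
    0x08: "DRQ",
    0x04: "CORR",
    0x02: "IDX",
    0x01: "ERR",
}
ERROR_BITS = {
    0x80: "ICRC",
    0x40: "UNC",
    0x20: "MC",
    0x10: "IDNF",
    0x08: "MCR",
    0x04: "ABRT",
    0x02: "NM",
    0x01: "AMNF",
}


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in table.items() if value & mask]


print(decode(0x51, STATUS_BITS))  # ST register from the SMART error log
print(decode(0x04, ERROR_BITS))   # ER register from the SMART error log
```

With the ST and ER values from the SMART log this gives DRDY, bit 4 and
ERR for the status, and ABRT alone for the error, which matches the
reading quoted above.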

It might very well be a WD bug. I had three (3) identical WDC
WD2500AAJS-08B4A0 drives fail on me with the same _identical_ error
(same sector number to the last digit):

Oct 27 11:33:41 Arzamas kernel: ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x80000 action 0xe frozen
Oct 27 11:33:41 Arzamas kernel: ata6.00: irq_stat 0x01100010, PHY RDY changed
Oct 27 11:33:41 Arzamas kernel: ata6: SError: { 10B8B }
Oct 27 11:33:41 Arzamas kernel: ata6.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Oct 27 11:33:41 Arzamas kernel: res 06/37:00:00:00:00/00:00:00:00:06/00 Emask 0x12 (ATA bus error)
Oct 27 11:33:41 Arzamas kernel: ata6.00: error: { IDNF ABRT }
Oct 27 11:33:41 Arzamas kernel: ata6: hard resetting link
Oct 27 11:33:46 Arzamas kernel: ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Oct 27 11:33:46 Arzamas kernel: ata6.00: configured for UDMA/100
Oct 27 11:33:46 Arzamas kernel: ata6: EH complete
Oct 27 11:33:46 Arzamas kernel: sd 6:0:0:0: [sde] 488397168 512-byte hardware sectors (250059 MB)
Oct 27 11:33:46 Arzamas kernel: sd 6:0:0:0: [sde] Write Protect is off
Oct 27 11:33:46 Arzamas kernel: sd 6:0:0:0: [sde] Mode Sense: 00 3a 00 00
Oct 27 11:33:46 Arzamas kernel: sd 6:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 27 11:33:46 Arzamas kernel: end_request: I/O error, dev sde, sector 488166955
Oct 27 11:33:46 Arzamas kernel: md: super_written gets error=-5, uptodate=0
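
A quick sanity check on the numbers in that log, as a rough sketch. It
assumes the sector reported by end_request is counted from the start of
the whole device (sde) and that error=-5 is a negated errno; the variable
names are only for illustration.

```python
import errno

SECTOR_BYTES = 512            # "512-byte hardware sectors" per the log
total_sectors = 488397168     # device size reported for sde
failed_sector = 488166955     # sector from the end_request line

# Device size in decimal megabytes, the unit the kernel prints.
size_mb = total_sectors * SECTOR_BYTES // 10**6

# Distance of the failing sector from the end of the device.
sectors_from_end = total_sectors - failed_sector
mib_from_end = sectors_from_end * SECTOR_BYTES / 2**20

# Symbolic name for errno 5 (md logs it negated, as -5).
err_name = errno.errorcode[5]

print(size_mb, sectors_from_end, round(mib_from_end, 1), err_name)
```

The size agrees with the 250059 MB the kernel prints, the failing sector
sits 230213 sectors (about 112 MiB) before the end of the device, and
errno 5 is EIO, i.e. a plain I/O error reported back to md.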


All 3 drives went through the same repeated rewrites of the sector in
question, as well as multiple SMART self-tests. I am currently in the
process of replacing these two drives with Seagates (the other 2 in the
4-member array are Maxtors). Will see what happens.

Peter

P.S. See threads http://marc.info/?l=linux-raid&m=122523835815697 and
http://marc.info/?l=linux-raid&m=122669103213041 for more info on my
setup and hardware.

Thread overview: 11+ messages
2008-11-21 11:12 Ninth(?) Velociraptor replacement or md(RAID)/smartmontools(?) bug? Justin Piszcz
2008-11-21 11:28 ` Justin Piszcz
2008-11-21 11:30 ` Peter Rabbitson [this message]
2008-11-21 11:33   ` Justin Piszcz
2008-11-21 11:37     ` Justin Piszcz
2008-11-21 11:37     ` Peter Rabbitson
2008-11-21 11:54   ` Alan Cox
2008-11-21 12:08     ` Justin Piszcz
2008-11-21 17:55       ` Bill Davidsen
2008-11-21 18:18         ` Richard Scobie
2008-11-21 21:30           ` Justin Piszcz
