linux-raid.vger.kernel.org archive mirror
* raid5:bad sectors after lost power
@ 2012-09-13  8:43 vincent
  2012-09-13  8:57 ` Mathias Buren
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: vincent @ 2012-09-13  8:43 UTC (permalink / raw)
  To: linux-raid


Hi everyone:
    I have a problem with RAID5.
    I created a RAID5 with the command "mdadm -Cv --chunk=128 /dev/md0
-l5 -n4 sd[abcd]" for a write performance test; the capacity of each disk
is 2TB. After the array was created, I changed its stripe_cache_size to
2048.
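    (In full, what I ran was roughly the following; the /dev/ prefixes on
the member disks and the sysfs path are the standard ones:

        mdadm -Cv /dev/md0 --chunk=128 -l5 -n4 /dev/sd[abcd]
        echo 2048 > /sys/block/md0/md/stripe_cache_size
    )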
    Then I used a program to write 150 files to the array in parallel,
each at a speed of 1MB/s. Unfortunately, the power suddenly went off
during the test. When I turned the device on again, I found the RAID5 was
in recovery. When the recovery progress reached 98%, a write error
occurred. I used "smartctl -A /dev/sd*" to check the health status of the
disks, and found that the RAW_VALUE of the Current_Pending_Sector
attribute was 1 on both sda and sdb.
    Then I used "HDD_Regenerator" to check whether there were bad blocks
on the disks. Its output indicated that sda and sdb each did have a bad
sector.
    These disks were being used for the first time since purchase. Is it
normal to have bad sectors? Could you please help me?




* Re: raid5:bad sectors after lost power
  2012-09-13  8:43 raid5:bad sectors after lost power vincent
@ 2012-09-13  8:57 ` Mathias Buren
  2012-09-13 10:48 ` Mikael Abrahamsson
  2012-09-13 20:00 ` Peter Grandi
  2 siblings, 0 replies; 4+ messages in thread
From: Mathias Buren @ 2012-09-13  8:57 UTC (permalink / raw)
  To: vincent; +Cc: linux-raid

On 13/09/12 16:43, vincent wrote:
> Hi everyone:
>      I have a problem with RAID5.
>      I created a RAID5 with the command "mdadm -Cv --chunk=128 /dev/md0
> -l5 -n4 sd[abcd]" for a write performance test; the capacity of each disk
> is 2TB. After the array was created, I changed its stripe_cache_size to
> 2048.
>      Then I used a program to write 150 files to the array in parallel,
> each at a speed of 1MB/s. Unfortunately, the power suddenly went off
> during the test. When I turned the device on again, I found the RAID5 was
> in recovery. When the recovery progress reached 98%, a write error
> occurred. I used "smartctl -A /dev/sd*" to check the health status of the
> disks, and found that the RAW_VALUE of the Current_Pending_Sector
> attribute was 1 on both sda and sdb.
>      Then I used "HDD_Regenerator" to check whether there were bad blocks
> on the disks. Its output indicated that sda and sdb each did have a bad
> sector.
>      These disks were being used for the first time since purchase. Is it
> normal to have bad sectors? Could you please help me?
>
>


RMA the HDDs (or return them to the vendor); they are not healthy.


// Mathias


* Re: raid5:bad sectors after lost power
  2012-09-13  8:43 raid5:bad sectors after lost power vincent
  2012-09-13  8:57 ` Mathias Buren
@ 2012-09-13 10:48 ` Mikael Abrahamsson
  2012-09-13 20:00 ` Peter Grandi
  2 siblings, 0 replies; 4+ messages in thread
From: Mikael Abrahamsson @ 2012-09-13 10:48 UTC (permalink / raw)
  To: vincent; +Cc: linux-raid

On Thu, 13 Sep 2012, vincent wrote:

>    These disks were being used for the first time since purchase. Is it
> normal to have bad sectors? Could you please help me?

If a drive gives *write* errors, then it's bad and should be returned.

Read errors might be less weird if you turned off power in the middle of a 
write.
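
One way to tell which it was is the kernel log from the rebuild; the
failing command and sector are reported there, e.g. (device names as in
your report):

    dmesg | grep -iE 'sda|sdb'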

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: raid5:bad sectors after lost power
  2012-09-13  8:43 raid5:bad sectors after lost power vincent
  2012-09-13  8:57 ` Mathias Buren
  2012-09-13 10:48 ` Mikael Abrahamsson
@ 2012-09-13 20:00 ` Peter Grandi
  2 siblings, 0 replies; 4+ messages in thread
From: Peter Grandi @ 2012-09-13 20:00 UTC (permalink / raw)
  To: Linux RAID

> I have a problem with RAID5.  I created a RAID5 [ 3+1 2TB
> with 128KiB chunks ... ] write 150 files to the array in
> parallel, each at a speed of 1MB/s.

Your problem with RAID5 is that even writing at just 1MB/s per
file, writing 150 streams in parallel can mean a lot of arm
movement, and a RAID5 of just 4 consumer-class drives probably
can't deliver that many IOPS, unless your writes are pretty
large (e.g. every stream writes 1MB in one go every second).

It probably will mostly work, but with pretty tight margins, both because
4 disks are not that many for 150 even relatively slow streams, and
because RAID5 writes are correlated: to avoid RMW they have to be done as
full stripe writes (in your case 384KiB at least).
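
As a rough back-of-envelope, assuming the writes do coalesce into full
stripes:

    150 streams x 1 MB/s  ~= 150 MB/s aggregate
    150 MB/s / 384 KiB    ~= 400 full-stripe writes per second

and every full-stripe write touches all four drives, so each drive sees
about 400 writes/s of 128KiB each; with 150 files laid out in different
regions of the disks many of those writes involve a seek, and a consumer
7200rpm drive only manages on the order of 100 random IOPS.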

> Unfortunately, the electricity went off suddenly at the
> time. when I turn on the device again, I found the raid5 is in
> recovery.  When the progress of the recovery went up to 98%,
> there was a write error occurred. [ ... ]

That's a bad situation for those disks, because a *write* error means
there are no spare sectors left anywhere on the disk: on a write, the
firmware, on finding a bad sector, can always substitute a spare
transparently, as long as spare sectors are available. That no spares
remain means the firmware has previously found a lot of bad sectors.

From this point onwards you no longer have a RAID issue; the MD RAID has
attempted its rebuild after finding the drives out of sync, and it is now
purely a hardware issue. It is a bit off-topic, but let's go over it
without too many details, making the obligatory references to MD RAID
aspects where appropriate.

The first one is a vastly misunderstood point about base RAID systems
like MD: they are not supposed to detect errors; they are deliberately
designed under the assumption that any and every storage issue is
discovered and reported to MD by the block device layer or the layers
beneath it.

So, for example, the purpose of parity is the *reconstruction* of data
once the block device layer has reported a device issue, not the
*detection* of corrupted data. It can incidentally be used for detection
as well, but a number of optimizations in parity RAID depend on not using
parity to detect issues.

> used "HDD_Regenerator" to check whether there were bad blocks on the
> disks. Its output indicated that sda and sdb each did have a bad
> sector.

They have many, but many/most are remapped to spares. That number will be
in the SMART attribute 'Reallocated_Sector_Ct'. What the output tells you
is that they have at least one *unspared* bad sector each.
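
For example, something like this shows the relevant counters (the exact
attribute names can vary slightly between vendors):

    smartctl -A /dev/sda | grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'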

> These disks were being used for the first time since purchase. Is
> it normal to have bad sectors?

It is quite normal to have bad sectors: a 2TB drive has 4
billion 512B sectors, or 0.5 billion 4KiB sectors, and *some*
percentage of that very large number must be defective.

MD note: since a small number of sectors (hundreds) usually flips to bad
over time, it is convenient to use MD sync-checking to *detect* issues.
Since this is an incidental convenience, it has to be requested
explicitly (and if it is used, it is usually VERY IMPORTANT to ensure
that SMART ERC is set to a short timeout).

Also it may be useful to run periodic SMART selftests. But both
MD sync-checking and SMART selftests consume IOPS and bandwidth.
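
For reference, roughly like this (assuming the array is 'md0'; SCT ERC
support and selftest behaviour vary by drive):

    echo check > /sys/block/md0/md/sync_action   # start an MD check; progress in /proc/mdstat
    smartctl -l scterc,70,70 /dev/sda            # set read/write ERC to 7 seconds
    smartctl -t long /dev/sda                    # queue a long SMART selftest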

In your case, though, the sudden power loss, perhaps accompanied by a
power surge, probably damaged some significant chunk of the recording
surface in one way or another, depending on the disk mechanics,
electronics and firmware.

> Could you please help me?

If you want to use 'sda' and 'sdb' in production systems with any degree
of criticality, I would say don't do it. If you are using them purely for
testing, I would suggest some steps that *might* make them more useful
again:

  * If available, run SECURITY ERASE on the drives using a recent
    version of 'hdparm' (rough command sketch after this list). Many
    drive firmwares seem to combine SECURITY ERASE with refreshing and
    rebuilding the spared and spare sector lists.

  * Map the areas where there are unspared sectors using
    'badblocks' or 'dd_rescue', and then partition the disks
    (using GPT labelling) and create not-to-use partitions on
    those areas. You may have even 10% or more of the disk in
    bad sectors, but as long as the partition(s) you actually
    use don't cross a bad area, it is relatively safe to use
    them.  Some older filesystems can be given bad-sector lists
    and will not use them, but with RAID5 that becomes a bit
    complicated.
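
Very roughly, and only on disks whose contents you can afford to lose
(the password and device name below are placeholders; the drive must not
be security-frozen):

    hdparm --user-master u --security-set-pass pass /dev/sdX
    hdparm --user-master u --security-erase pass /dev/sdX

    badblocks -sv -b 4096 /dev/sdX > sdX-bad.txt   # read-only scan to map the bad areas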

Note that drives with many *spared* sectors can often perform badly,
because the spare sectors that substitute for bad ones can be rather far
away from them, causing sudden long seeks.

