Linux RAID subsystem development

* Re: HBA Adaptor advice
From: John Robinson @ 2011-05-23 10:44 UTC (permalink / raw)
  To: Ed W; +Cc: linux-raid
In-Reply-To: <4DDA2D77.1050604@wildgooses.com>

On 23/05/2011 10:48, Ed W wrote:
[...]
> Pardon what is probably a very ignorant question, but someone earlier in
> this thread claimed that some adaptors report the size of the disk
> slightly differently?  Wouldn't this potentially cause problems if you
> needed to move the disks to a different controller?

Yup. RAID cards will use some of the disc for their own metadata. The 
amount used, and the location of it, is probably different for different 
controllers. This would be one reason why using a RAID controller with 
BBWC and exporting the drives as single-drive RAID0 volumes is a bit 
icky, and liable to tie you to one manufacturer.

There is a possibility (handwaving here) that using a RAID controller in 
JBOD mode would be similar. You may need to flash your controller to 
non-RAID firmware to avoid it, at which point you probably ought to have 
bought an HBA in the first place.

There is a similar problem on some OEMs' BIOSes that will set a 
"host-protected area" that will reduce the visible size of drives.

> Additionally if you needed to replace the disk then some new batch might
> be some few sectors smaller?  This seems to be the biggest reason for
> wanting to add a partition table and then deliberately partition some
> 10s MB smaller? (Think I saw this exact problem come up several times in
> the last few weeks alone?)

For spinning rust discs this hasn't been the case for several years 
since we passed about 160GB; all the manufacturers signed up to an 
industry standard[1] making all their discs a consistent number of 
sectors for any given marketing size.

It's probably a problem again now with SSDs, though.
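
A minimal Python sketch of the "partition a little smaller" idea discussed above. It assumes you have already collected the byte size each controller or BIOS reports for every disk (for example with `blockdev --getsize64`); the function name and the 64 MiB default margin are illustrative choices, not something from this thread.

```python
MIB = 1024 * 1024


def safe_member_bytes(reported_sizes, margin_mib=64):
    """Largest member size that fits every disk, minus a safety margin.

    reported_sizes: byte sizes as seen through each controller or BIOS
                    (hypothetical inputs supplied by the caller).
    margin_mib:     slack left unused at the end of each disk (assumed value).
    """
    smallest = min(reported_sizes)
    usable = smallest - margin_mib * MIB
    if usable <= 0:
        raise ValueError("margin is larger than the smallest disk")
    # Round down to a whole MiB so the end of the partition stays aligned.
    return usable // MIB * MIB
```

Sizing every member to this value means a replacement disk, or the same disk behind a controller that hides a few sectors, should still be large enough to join the array.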

Cheers,

John.

[1] I can't remember what the standard or standards group is, and I 
can't be bothered looking it up. But of course it's a standard. We love 
standards, that's why we have so many of them![2]

[2] Sorry if I'm a bit grumpy this morning. Too many standards and not 
enough coffee make John a grumpy boy.


* Re: HBA Adaptor advice
From: Stan Hoeppner @ 2011-05-23 10:42 UTC (permalink / raw)
  To: Brad Campbell; +Cc: Roman Mamedov, linux-raid
In-Reply-To: <4DD9F6B2.4070708@fnarfbargle.com>

On 5/23/2011 12:54 AM, Brad Campbell wrote:

> Most sane operating systems use cluster sizes of 4k or larger and have
> done for years, so I really don't see what all the fuss is about.
> 
> People's inability to properly align the data on their disks can be read
> either as a failing in the technology (the partitioning applications
> have not caught up yet) or simply a lack of understanding on how to
> apply the technology.
> 
> Don't blame the drive manufacturers, this should have happened _years_ ago.

I don't think anyone has an issue w/native 4KB sectors and operating
system support for it.  That would have been the big win.  What folks
have issue with is the hybrid 512/4096 drives, which have created the
alignment offset problems.

The industry (BIOS/firmware), commercial and FOSS OSes, should have
worked together to migrate directly to 4KB native sectors.  I don't know
why this didn't happen, usual suspects I guess.  It seems, from my
limited POV, that the Linux partition tool people and kernel folks
simply don't care at this point.

I've not paid recent attention.  Have fdisk, cfdisk, parted, etc, all
come up to speed now, and automatically handle offsets correctly for
hybrid sector size disks?
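
For reference, the alignment test itself is simple arithmetic. The sketch below is only an illustration: it assumes 512-byte logical and 4096-byte physical sectors (on Linux the real values can be read from `/sys/block/<dev>/queue/logical_block_size` and `physical_block_size`), and `is_aligned` is a made-up helper name.

```python
def is_aligned(start_lba, logical=512, physical=4096):
    """True if a partition starting at start_lba begins on a
    physical-sector boundary."""
    return (start_lba * logical) % physical == 0


# 63 is the traditional DOS-style first partition start; 2048 is a 1 MiB start.
for start in (63, 64, 2048):
    print(start, is_aligned(start))
```

A start at sector 63 fails the check, which is the offset problem described above; any start that is a multiple of 8 logical sectors passes.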

-- 
Stan


* Re: HBA Adaptor advice
From: Ed W @ 2011-05-23 10:42 UTC (permalink / raw)
  To: Tobias McNulty; +Cc: Jim Schatzman, linux-raid
In-Reply-To: <BANLkTindNRCxRUpfpONnKPwoE8fg7SbgZw@mail.gmail.com>

On 23/05/2011 04:39, Tobias McNulty wrote:
> One odd statistical fluke regarding quality control on the large Green
> drives:  I ordered 3 of the drives on Amazon and 3 on Newegg, and all
> 3 of the Newegg drives failed very quickly (within a couple weeks),
> while all 3 of the Amazon drives are still going strong (5 months old
> and on 24/7).  I'm not sure if it was a packaging issue or an issue
> with that particular set of drives that Newegg had in stock, but it
> left me wondering.

I had some similar experience with some Samsung F3 drives recently.  Not
as clear cut as your example, but I have found other examples in the
archives here which suggest that drive failure might correlate with
batch number?  I'm sure I have seen others suggest trying to build
arrays out of mixed batch (or perhaps brand?) drives.

Going back some 10-15 years when I used to put paired raid1 drives into
our office servers, every failure I ever had appeared to affect both
drives within some few hours of each other... (less than 48 hours say).
 There are plenty of external reasons to explain that (besides drives
reaching end of life), such as power fluctuations, temperature, etc, but
punchline remains that mirroring didn't buy me much protection...

Tricky to make this stuff reliable.. Small probabilities and
catastrophic scenarios are hard to value...

Ed W


* Re: disable raid autodetect at boot
From: Michael Tokarev @ 2011-05-23 10:35 UTC (permalink / raw)
  To: Alexander Lyakas; +Cc: linux-raid
In-Reply-To: <BANLkTim=K8b_3jD5RAdMT6+0TYQXNAHBtg@mail.gmail.com>

23.05.2011 13:36, Alexander Lyakas wrote:
> Hello,
> 
> I have a simple raid1 created on top of /dev/sda and /dev/sdb. After I
> reboot, I would like to always manually assemble the raid. However,
> every time the machine reboots, it looks like md tries to
> automatically reassemble the raid, but usually binds only one of the
> source devices:
> 
> May 23 11:47:36 vc kernel: [    3.472926] md: linear personality registered for level -1
..
> May 23 11:47:36 vc kernel: [    6.059194] md: bind<sdb>

This is not kernel autodetection, this is your initramfs/initrd
and mdadm.  Or maybe mdadm in the regular root filesystem.

/mjt


* Re: HBA Adaptor advice
From: Ed W @ 2011-05-23 10:33 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: Brad Campbell, linux-raid
In-Reply-To: <4DDA2FD9.3020303@hardwarefreak.com>

On 23/05/2011 10:58, Stan Hoeppner wrote:
> Multiple drives definitely went offline, but I doubt
> it's due to a real RAID ASIC with custom firmware and a TLER issue.
> More likely, given the price, it was a backplane signal quality problem,
> for which cheap backplanes are notorious.

I think we shouldn't bang this one to death any further, but your
statement above could be interpreted that whilst the drives may not be
ideal, likely they weren't the issue in this case?

If the cheap WD drives weren't the main issue then perhaps at least this
example shouldn't be used as an example of why NOT to use those drives?

> Either way, cheap
> not-fit-for-RAID drives were stuffed into a cheap RAID box and disaster
> was the result.

But your guess seems to be that the cause likely boils down to "cables
falling out"?

> WDC
> itself says not to use the Green drives in RAID arrays. 

The problem with taking the manufacturer's word on this is that they
provide two products and claim one is "good enough" and that the other
"lasts way longer", and then price them quite significantly differently.

Now, without even looking inside the two identical metal chassis, you
have to admit: a) there is incentive for them to tell fibs here in order
to gain a price premium and b) given the "reliable" drives are roughly
twice the cost then there should be sufficient extra engineering in
there that we can look for third party documentation, patents and other
supplemental information to learn more about what that engineering is
and gain confidence that the money is well spent?

I guess you have two near equal priced options:

a) 12 disk RAID6 using "enterprise" drives
b) 12x 2 disk RAID 1, plus 12x RAID6 on the top of that (some variant of
RAID61 basically)

Does having twice the number of "cheap" drives make the thing more or
less reliable?  (More drives = higher probability of individual drive
failing, but additional redundancy decreases chance of total loss).  I
need to crank some numbers in excel to try and get my head around which
is better for a given failure probability
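
A rough Python sketch of that calculation, in place of a spreadsheet. It rests on strong assumptions that are not from the thread: every drive fails independently with the same probability `p` over the period of interest, and rebuilds are ignored. Correlated batch failures, as described elsewhere in this thread, would make both options worse than these numbers suggest. The function names and the value of `p` are illustrative.

```python
from math import comb


def p_at_least(k, n, p):
    """P(at least k of n independent units fail), each with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def raid6_loss(n, p_unit):
    """RAID6 survives two failed members; the third one loses the array."""
    return p_at_least(3, n, p_unit)


p = 0.05  # assumed per-drive failure probability, purely a placeholder

plain = raid6_loss(12, p)         # option (a): 12 single drives
mirrored = raid6_loss(12, p * p)  # option (b): each member is a 2-disk RAID1,
                                  # which is lost only if both halves fail
print(plain, mirrored)
```

Under these assumptions the mirrored layout comes out far more reliable, because a member is only lost when both of its drives fail. How much of that advantage survives real-world correlated failures is exactly the open question.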

Cheers

Ed W


* Re: HBA Adaptor advice
From: Ed W @ 2011-05-23 10:18 UTC (permalink / raw)
  To: stefan.huebner; +Cc: linux-raid
In-Reply-To: <4DD9F0EB.5040801@stud.tu-ilmenau.de>

On 23/05/2011 06:30, Stefan /*St0fF*/ Hübner wrote:

> We sell 200+ drives a week from our "at that time preferred"
> Manufacturer.  That was WD from 2008 till the beginning of 2010.  

Just to clarify - are all those "failures" basically attributable to
drives with unreadable sectors which then drop out of arrays due to lack
of TLER?

ie if the drive DID have TLER then it would likely not have been
reported as a failed drive? (But presumably smart might report a
re-allocated sector and you might lose a sector's worth of data?)

> But
> since the climb of wd failure rates we're at "Hitachi" and have
> astounding failure rates of less than one percent.  I hope this will
> stay the case even after WD bought Hitachi GST...

Likewise is this because the Hitachi drives appear more reliable or
because they incorporate some kind of TLER which keeps them running in
the face of reallocated sectors?

Can you extend your conclusion to the desktop Hitachi drives also? Do
these also suffer lower failure rates? Can they be made "TLER" compatible?

> Conclusion about this university-storage-failure: wrong drives for this
> scenario.  It would've been OK to use the cheap WDs for backup (if the
> backup was at least RAID6 and sends error-mails to the admin).  But the
> primary storage was a big fail.  You do not use this kind of storage for
> data which is worth much time (and by that much money).

How do folks here react to Google's paper stating that largely they find
little difference in reliability between "raid drives" and consumer drives?

Granted it's a problem if a drive pops out of an array because it has a
reallocated sector, but a) do folks with TLER drives immediately replace
the drive when they see a reallocated sector? b) do those without TLER run
badblocks and put the drive back into the array? c) can MD raid work
around the limitations of consumer drives that lack TLER?

Thanks

Ed W


* Re: HBA Adaptor advice
From: Stan Hoeppner @ 2011-05-23  9:58 UTC (permalink / raw)
  To: Brad Campbell; +Cc: linux-raid
In-Reply-To: <4DD99FF2.2030609@fnarfbargle.com>

On 5/22/2011 6:44 PM, Brad Campbell wrote:

> He used WD commodity drives on a "hardware" RAID enclosure that needed
> TLER. The RAID-5 kicked out 4 drives in a short period of time, so he
> power cycled it and re-initialised the array and it came up fine, but
> blank (as it would as he re-initialised it).

It was/is an Excel Meridian Data SecurStor Astra ES with 4 expansion
chassis.  Low end, many would call it junk (myself included).  Excel
Meridian Data, formerly known as Excel CD-ROM, has been re-badging cheap
Taiwanese, and now Chinese, junk since their inception in 1992.

Eli stated this gear was sold to UCSC as a packaged, warrantied, storage
system, by an unnamed local SoCal vendor.  I took the man at his word.
We can speculate that he lied about it and tossed it together himself
from a catalog, which seems plausible given EMD's business model, but
that makes no material difference in this discussion, which is that WD
Green drives are not appropriate for RAID arrays, regardless of who
selected the components and whose hands assembled the hardware.

> Sorry Stan, that's not a failure of the drives. He lost the data due to
> limitations in his RAID configuration and bad management.

On the contrary.  It *is* a failure of the drives.  They failed to
perform properly in the chosen application environment, because the
vendor/end user put them in an unsupported environment.  That's the
whole theme of this thread, and precisely why I encouraged people to
read about this fiasco and the potential costs of using these drives in
RAIDs.

Whether the spindle motors quit, the PCBs failed, or they were merely
kicked offline for any of a half dozen reasons, these are all drive
failures.  When a drive goes offline do you call it "success"?  No.
What's the opposite of success?  Failure.

The Astra ES is almost certainly running embedded Linux + md RAID due to
its price point.  I can't locate the EMD website nor the PDF for this
Astra unit because every Excel Meridian domain Google'ing returned is
currently squatted.  They may have gone belly up.

If indeed that box uses embedded Linux + md RAID, TLER wouldn't have
been the problem.  Multiple drives definitely went offline, but I doubt
it's due to a real RAID ASIC with custom firmware and a TLER issue.
More likely, given the price, it was a backplane signal quality problem,
for which cheap backplanes are notorious.  Either way, cheap
not-fit-for-RAID drives were stuffed into a cheap RAID box and disaster
was the result.  People buying these drives for array use aren't
dropping them into quality backplanes, but cheap ones.  The entire
ecosystem of components used to build a WD Green drive array is
typically of much lower quality than the drives themselves.  Cheap
backplane + cheap drives + cheap HBAs = high probability of disaster.

In summary, very few people are going to successfully build reliable
arrays from these drives.  I've seen too many horror stories, the UCSC
fiasco being the most severe.  I'm simply trying to prevent others
from suffering similar disasters.  I think that's a worthy cause.  WDC
itself says not to use the Green drives in RAID arrays.  I'm supplying
examples of real world disasters to support WDC's disclaimer, and
prevent some heartache.

-- 
Stan


* Re: HBA Adaptor advice
From: Ed W @ 2011-05-23  9:48 UTC (permalink / raw)
  To: Johannes Truschnigg; +Cc: Tobias McNulty, linux-raid
In-Reply-To: <4DD97C83.4050907@truschnigg.info>

On 22/05/2011 22:13, Johannes Truschnigg wrote:
> On 05/22/2011 10:57 PM, Tobias McNulty wrote:
>> Case in point: I have 4 of these 2TB Green drives in a RAID5 array.  I
>> assembled them from the raw devices (no partition table) without any
>> special precautions.  Am I in trouble?  The array seems to be working
>> fine...
> 
> No, you aren't. If you don't create a partition table in the first
> place, there's no possibility for partition boundaries to be mis-aligned
> in regard to the physical sector or erase block size of the underlying
> blockdevice. You could probably still get it wrong if you chose (if

Pardon what is probably a very ignorant question, but someone earlier in
this thread claimed that some adaptors report the size of the disk
slightly differently?  Wouldn't this potentially cause problems if you
needed to move the disks to a different controller?

Additionally if you needed to replace the disk then some new batch might
be some few sectors smaller?  This seems to be the biggest reason for
wanting to add a partition table and then deliberately partition some
10s MB smaller? (Think I saw this exact problem come up several times in
the last few weeks alone?)

Cheers

Ed W


* disable raid autodetect at boot
From: Alexander Lyakas @ 2011-05-23  9:36 UTC (permalink / raw)
  To: linux-raid

Hello,

I have a simple raid1 created on top of /dev/sda and /dev/sdb. After I
reboot, I would like to always manually assemble the raid. However,
every time the machine reboots, it looks like md tries to
automatically reassemble the raid, but usually binds only one of the
source devices:

May 23 11:47:36 vc kernel: [    3.472926] md: linear personality registered for level -1
May 23 11:47:36 vc kernel: [    3.509352] md: multipath personality registered for level -4
May 23 11:47:36 vc kernel: [    3.551948] md: raid0 personality registered for level 0
May 23 11:47:36 vc kernel: [    3.683874] md: raid1 personality registered for level 1
May 23 11:47:36 vc kernel: [    4.944362] md: raid6 personality registered for level 6
May 23 11:47:36 vc kernel: [    4.944364] md: raid5 personality registered for level 5
May 23 11:47:36 vc kernel: [    4.944365] md: raid4 personality registered for level 4
May 23 11:47:36 vc kernel: [    4.949193] md: raid10 personality registered for level 10
May 23 11:47:36 vc kernel: [    6.059194] md: bind<sdb>

And /proc/mdstat has:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md_d127 : inactive sdd[1](S)
      2097140 blocks super 1.2
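
When checking this repeatedly across reboots, a small script can pull the half-assembled arrays out of the `/proc/mdstat` text. This is only a sketch: `inactive_arrays` is a made-up helper, and the regular expression assumes the line layout shown above rather than every format the kernel may emit.

```python
import re


def inactive_arrays(mdstat_text):
    """Return {array_name: [member, ...]} for arrays listed as inactive."""
    result = {}
    for line in mdstat_text.splitlines():
        m = re.match(r"^(md\S*)\s*:\s*inactive\s+(.*)$", line)
        if m:
            # Members look like "sdd[1](S)"; keep only the device name.
            members = [re.sub(r"\[.*$", "", tok) for tok in m.group(2).split()]
            result[m.group(1)] = members
    return result


sample = """Personalities : [linear] [multipath] [raid0] [raid1]
md_d127 : inactive sdd[1](S)
      2097140 blocks super 1.2
"""
print(inactive_arrays(sample))
```

On a live system the input would be the contents of `/proc/mdstat` instead of the `sample` string.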

According to md.txt, "only drives with a type 0 superblock can be
autodetected and run at boot time". During raid creation, I have
specified metadata version to be 1.2; this is confirmed by --examine.
Also, I have tried to use raid=noautodetect kernel parameter, but the
auto-detection still keeps happening.

The only way I have found so far to prevent this is by specifying
something like
DEVICE /dev/null
in /etc/mdadm/mdadm.conf

What am I missing?

Thanks,
  Alex.


* Re: HBA Adaptor advice
From: Ed W @ 2011-05-23  9:32 UTC (permalink / raw)
  To: Rudy Zijlstra; +Cc: linux-raid
In-Reply-To: <4DD8DF8A.30304@grumpydevil.homelinux.org>

On 22/05/2011 11:03, Rudy Zijlstra wrote:
> The amount of money that his time has cost discussing this & thinking
> about it, is most likely already noticeably more than the cost of a
> mid-range RAID card.

But hopefully that cost is shared across plenty of people who are now
more educated about the state of linux raid?

Please don't get bogged down with this - what is obvious to someone who
hangs around here plenty is not necessarily obvious to others of us
with less experience.

Kind regards

Ed W


* Re: HBA Adaptor advice
From: Brad Campbell @ 2011-05-23  7:23 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: linux-raid
In-Reply-To: <4DDA04BE.9000300@hardwarefreak.com>

On 23/05/11 14:54, Stan Hoeppner wrote:
> On 5/22/2011 6:19 PM, Brad Campbell wrote:
>
>> They're not variable. Or to put it another way, if they _can_ vary the
>> spindle speed none of mine ever do.
>
> There's too little official info from WD on what exactly the variable
> speed IntelliPower actually means.
>

I can actually answer that one with certainty.

Intellipower is a combination of a fixed (but mostly non-standard) spin 
speed, clever cache usage and a variable speed seek.

It's the variable speed seek that trips people up. The drive knows where 
it is in its rotational cycle, and it knows how far it has to move to 
read the next block. If its rotational latency (and remember it's 
_slow_) is going to be greater than its seek time, it slows the seek 
down to save power. No point snapping the head to the next block if it 
is just going to have to wait for the data to arrive.
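
The decision described above can be sketched in a few lines of Python. This is a toy model of the description, not WD's firmware: the function names, the 5400 rpm figure and the millisecond values are all assumptions chosen for illustration.

```python
def rotational_wait_ms(rpm, fraction_of_turn):
    """Time for the platter to turn the given fraction of a revolution."""
    return 60_000.0 / rpm * fraction_of_turn


def seek_time_ms(min_seek_ms, wait_ms):
    """Pick a seek duration that is no faster than it needs to be.

    min_seek_ms: fastest the actuator could finish this seek (assumed known).
    wait_ms:     time until the target sector rotates under the head.
    """
    # If the platter is the bottleneck, stretch the seek to fill the wait;
    # otherwise seek at full speed.
    return max(min_seek_ms, wait_ms)


wait = rotational_wait_ms(5400, 0.5)  # target sector is half a turn away
print(seek_time_ms(2.0, wait))        # slow seek: arrives just in time
print(seek_time_ms(8.0, wait))        # full-speed seek: platter is not the limit
```

The model ignores the case where a full-speed seek still misses the sector and has to wait another revolution.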

Unfortunately early on a couple of "benchmark" sites got it horribly 
wrong, it then got picked up by sites that are normally pretty reliable 
(Anandtech for example) and it just propagated from there.

If anyone would like concrete proof, I'm willing to sacrifice a 1TB 
drive I have here by popping the top and putting an optical tacho on it. 
Just give me a couple of weeks to get it out of service and get an 
optical tacho approved by the war office ;)

> And btw, HDDs pull the bulk of their power from the 5 volt rail, not the
> 12 volt rail.  This is the main differentiating factor between a server
> PSU and a PC PSU--much more current available on the 5v rail.  Hop on
> NewEgg and compare the 12v and 5v rail current of a PC SLI PSU and a
> server PSU.

Have another look at the drive data sheets. The bulk of the load on the 
+12V rail is the drive motor while spinning up. Even my Cheetahs detail 
this accurately in the data sheet. The logic is +5V but the magnetics 
tend to lean on the +12v rail pretty heavily.

> I think a lot of people fell into this 'trap' due to the super low price/GB
> of the Green drives, and simply not realizing we now have boutique hard
> drives and a variety of application tailored drives, just as we have 31
> flavors of ice cream.

I kinda knew what I was getting myself into. Here's my reasoning.

The bulk of my storage is what you would call nearline. Write once, read 
lots but not very often. Media, movies, source trees, backups.

My storage used to be spread across 2 servers (30 drives in total) and 
one of those was comprised of 15 250G Maxtors. I did the power 
calculations and figured I could justify replacing 10 1TB drives in 
Server A with 10 WD GP 2T drives and decommission server B.

This will pay for itself in 14 months.
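
The shape of such a payback calculation, as a hedged Python sketch. The actual drive prices, wattages and electricity tariff behind the 14-month figure are not given in this message, so every number in the example call is a placeholder and `payback_months` is a made-up name.

```python
def payback_months(upfront_cost, watts_saved, price_per_kwh):
    """Months until electricity savings cover the upfront cost,
    assuming the hardware runs 24/7."""
    kwh_per_month = watts_saved / 1000.0 * 24 * 365 / 12
    return upfront_cost / (kwh_per_month * price_per_kwh)


# Placeholder inputs only: cost of the new drives, watts saved by retiring
# the second server, and the local price per kWh.
print(payback_months(upfront_cost=1000.0, watts_saved=150.0, price_per_kwh=0.20))
```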

I did a lot of research on the GP drives and figured for my use pattern 
they'd be ok. Remember, I keep the server up 24/7. It's on an APC Smart 
UPS (boost/buck + UPS) and it gets rebooted for kernel updates every 200 
days or so, but never powered down.

My experience has been that even with the cheapest consumer drives, if 
you keep them spinning and keep them warm they'll go the distance after 
weeding out the early life failures. Now, I might have a time bomb 
sitting there and suffer massive drive failure next week totalling my 
array, but then I knew what I was getting into before I started.

Realizing that for an extra $150 overall I could have had Hitachi 7200 
RPM drives was a bit of a DOH! moment, but then the power savings did 
not stack up as well.

You pays your money, you takes your chances.


* Re: HBA Adaptor advice
From: Stan Hoeppner @ 2011-05-23  6:54 UTC (permalink / raw)
  To: Brad Campbell; +Cc: linux-raid
In-Reply-To: <4DD99A16.5010108@fnarfbargle.com>

On 5/22/2011 6:19 PM, Brad Campbell wrote:

> They're not variable. Or to put it another way, if they _can_ vary the
> spindle speed none of mine ever do.

There's too little official info from WD on what exactly the variable
speed IntelliPower actually means.

> Can you imagine the potential vibration nightmare as 10 drives vary
> their spindle speed up and down? Not to mention the extra load on the
> +12V rail and the delays while waiting for the platters to reach servo
> lock?

WD sells this drive for *consumer* use, meaning 1 to a few drives, where
multi-drive oscillation isn't going to be an issue.  Given that fact,
it's not hard for a WD or Seagate et al in 2011 to build such a drive
with a variable spindle speed.  Apparently WD has done just that.

And btw, HDDs pull the bulk of their power from the 5 volt rail, not the
12 volt rail.  This is the main differentiating factor between a server
PSU and a PC PSU--much more current available on the 5v rail.  Hop on
NewEgg and compare the 12v and 5v rail current of a PC SLI PSU and a
server PSU.

>> IIRC from discussions here, mdadm has alignment issues with hybrid
>> sector size drives when assembling raw disks.  Not everyone assembles
>> their md devices from partitions.  Many assemble raw devices.
> 
> Which means the data starts at sector 0. That's an even multiple of 8.
> Job done. (Mine are all assembled raw also).

I had it slightly backwards.  Thanks for the correction.  The problem
case is building arrays from partitions created using defaults of
many/most partitioning tools.

>> You must boot your server with MS-DOS or FreeDOS and run wdidle3 once
>> for each Green drive in the system.  But, IIRC, if the drives are
>> connected via SAS expander or SATA PMP, this will not work.  A direct
>> connection to the HBA is required.
> 
> Indeed. In my workshop I have an old machine with 3 SATA hotswap bays
> that allowed me to do 3 at once, booting off a USB key into DOS.
> 
>> Once one accounts for all the necessary labor and configuration
>> contortions one must put himself through to make a Green drive into a
>> 'regular' drive, it is often far more cost effective to buy 'regular'
>> drives to begin with.  This saves on labor $$ which is usually greater,
>> from a total life cycle perspective, than the drive acquisition savings.
>>  The drives you end up with are already designed and tuned for the
>> application.  Reiterating Rudy's earlier point, using the Green drives
>> in arrays is "penny wise, pound foolish".
> 
> I agree with you. If I were doing it again I'd spend some extra $$$ on
> better drives, but I've already outlaid the cash and have a working array.

I think a lot of people fell into this 'trap' due to the super low price/GB
of the Green drives, and simply not realizing we now have boutique hard
drives and a variety of application tailored drives, just as we have 31
flavors of ice cream.

>> Google WD20EARS and you'll find a 100:1 or more post ratio of problems
>> vs praise for this drive.  This is the original 2TB model which has
>> shipped in much greater numbers into the marketplace than all other
>> Green drives.  Heck, simply search the archives of this list.
> 
> Indeed, but the same follows for almost any drive. People are quick to
> voice their discontent but not so quick to praise something that does
> what it says on the tin.

The WD20EARS was far worse than the typical scenario you describe.
Interestingly, though, the drive itself is not at fault.  The two
problems associated with the drive are:

1.  Deploying it in the wrong application--primary RAID arrays
2.  The Linux partitioning tools lack(ed) support for 512B/4KB hybrids

Desktop MS Windows users seem to love these drives.  They're using them
as intended, go figure...

>> And that backup array may fail you when you need it most:  during a
>> restore.  Search the XFS archives for the horrific tale at University of
>> California Santa Cruz.  The SA lost ~7TB of doctoral student research
>> data due to multiple WD20EARS drives in his primary storage arrays *and*
>> his D2D backup array dying in quick succession.  IIRC multiple grad
>> students were forced to attend another semester to redo their
>> experiments and field work to recreate the lost data, so they could then
>> submit their theses.
> 
> Perhaps. Mine get a SMART short test every morning, a LONG every Sunday
> and a complete array scrub every other Sunday. My critical backups are
> also replicated to a WD World Edition Mybook that lives in another
> building.

I don't like disparaging other SAs, so I didn't go into that aspect of
the tale.  In summary, the SA tasked with managing that system had zero
monitoring in place, no proactive testing, nothing.  He was flying
blind.  When XFS "dropped" 12TB of the 60TB filesystem it took this SA
over a day to realize an entire RAID chassis had gone offline due to
multiple drive failures.  It took him almost a week, with lots of XFS
mailing list expertise, to save the intact 4/5ths of the filesystem.  If
he had used LVM or md striping instead of concatenation he'd have
lost the entire 60TB filesystem.  He had a backup on a D2D server which
was also built of the 2TB green drives.  Turns out that system already
had 2 of its RAID6 drives down, and a 3rd failed while he was
troubleshooting the file server problem.  He discovered this fact when he
decided to attempt a restore of the lost 12TB.
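
By contrast, the proactive routine quoted above (a short SMART test daily, a long test every Sunday, an array scrub every other Sunday) is easy to automate. The Python sketch below only decides which commands are due on a given date. `/dev/sdX` and `mdX` are placeholders, `tasks_for` is a made-up helper, and treating even ISO week numbers as the "every other Sunday" is an assumption that slips in years with 53 weeks.

```python
import datetime


def tasks_for(day):
    """Return the maintenance commands due on a given date."""
    tasks = ["smartctl -t short /dev/sdX"]  # every day
    if day.weekday() == 6:  # Sunday
        tasks.append("smartctl -t long /dev/sdX")
        # Every other Sunday: assumed here to mean even ISO week numbers.
        if day.isocalendar()[1] % 2 == 0:
            tasks.append("echo check > /sys/block/mdX/md/sync_action")
    return tasks


for task in tasks_for(datetime.date(2011, 5, 22)):
    print(task)
```

Run from cron with mail delivery configured, something along these lines would have flagged the degraded backup array long before a restore was needed.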

> I've had quite a few large arrays over the years, all comprised of the
> cheapest available storage at the time. I've had drives fail, but aside
> from a Sil3124 controller induced array failure I've never lost data
> because of a cheap hard disk and I've saved many, many, many $$$ on drives.

The problem I see most often, and have experienced first hand, isn't
losing data due to drive failure once in production.  The problem is
usually getting arrays stable when 'pounding them on the test bench'.
I've used hardware RAID far more often than md RAID over the years, and
some/many hardware RAID cards are just really damn picky about which
drives they'll work reliably with.  md RAID is more forgiving in this
regard, one of its many benefits.

> I'm not arguing the penny wise, pound foolish sentiment. I'm just
> stating my personal experience has been otherwise with drives.

One shoe won't fit every foot.  Going the cheap route is typically more
labor intensive.  If proper procedures are used to monitor and replace
before the sky falls, this solution can work in many environments.  In
other environments, drive failure notification must be automatic,
management software and light path diagnostics must clearly show which
drive has failed, all so a $15/hour low skilled datacenter technician
can walk down the rack aisle, find the dead drive, pull and replace it,
without system administrator intervention.  The SA will simply launch
his management console and make sure the array is auto-rebuilding.

-- 
Stan


* Re: HBA Adaptor advice
From: Roman Mamedov @ 2011-05-23  6:08 UTC (permalink / raw)
  To: Brad Campbell; +Cc: Stan Hoeppner, linux-raid
In-Reply-To: <4DD9F6B2.4070708@fnarfbargle.com>

On Mon, 23 May 2011 13:54:58 +0800
Brad Campbell <lists2009@fnarfbargle.com> wrote:

> I think the term "Advanced Format" crap is a bit harsh.
> The reality is that for drives > 2TB it is simply inevitable that bigger 
> sectors will be required.

For >2TB maybe, but not for 2TB.

> Most sane operating systems use cluster sizes of 4k or larger and have 
> done for years, so I really don't see what all the fuss is about.

4K drives shouldn't lie that they have 512 byte sectors, pretending all is
fine but doing that horrendous r-m-w translation under the hood; at least they
should be switchable (perhaps by a jumper?) into the 4K-native mode. No way to
improperly align anything if a drive honestly tells it has 4K/4K
logical/physical sector.
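
To illustrate the read-modify-write cost, here is a small Python sketch that counts how many physical sectors a write touches and whether either end lands inside a sector. It assumes 4096-byte physical sectors, and `physical_sectors_touched` is a made-up helper, not any real tool.

```python
def physical_sectors_touched(offset_bytes, length_bytes, physical=4096):
    """Return (count, needs_rmw) for a write on a drive with
    `physical`-byte sectors."""
    first = offset_bytes // physical
    last = (offset_bytes + length_bytes - 1) // physical
    # A read-modify-write is needed when either end of the write falls
    # inside a physical sector instead of on its boundary.
    partial = (offset_bytes % physical != 0
               or (offset_bytes + length_bytes) % physical != 0)
    return last - first + 1, partial


# A 4 KiB filesystem block at the start of a partition that begins at
# logical sector 63, then one that begins at logical sector 2048.
print(physical_sectors_touched(63 * 512, 4096))
print(physical_sectors_touched(2048 * 512, 4096))
```

In the misaligned case a single 4 KiB write straddles two physical sectors and both have to be read, patched and rewritten; in the aligned case it maps onto exactly one.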

-- 
With respect,
Roman


* Re: HBA Adaptor advice
From: Brad Campbell @ 2011-05-23  5:54 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: Stan Hoeppner, linux-raid
In-Reply-To: <20110523100939.0c019ea2@natsu>

On 23/05/11 12:09, Roman Mamedov wrote:

> Personally so far I have been successful in avoiding the "Advanced Format" crap
> that WD and others are pushing down customers' throats; I have none of such
> drives. It *is* possible to make a non-AF 2TB drive, even a 3-platter one. And
> this one is in my opinion the ideal Green drive to buy today, which has the
> advantages of being non-AF and at the same time still in production (maybe
> not for too long, with WD buying Hitachi :-/ ):

I think the term "Advanced Format" crap is a bit harsh.
The reality is that for drives > 2TB it is simply inevitable that bigger 
sectors will be required.

Most sane operating systems use cluster sizes of 4k or larger and have 
done for years, so I really don't see what all the fuss is about.

People's inability to properly align the data on their disks can be read 
either as a failing in the technology (the partitioning applications 
have not caught up yet) or simply a lack of understanding on how to 
apply the technology.

Don't blame the drive manufacturers, this should have happened _years_ ago.


* Re: HBA Adaptor advice
From: Stefan /*St0fF*/ Hübner @ 2011-05-23  5:30 UTC (permalink / raw)
  To: Brad Campbell; +Cc: Stan Hoeppner, linux-raid
In-Reply-To: <4DD9A558.7000802@fnarfbargle.com>

Am 23.05.2011 02:07, schrieb Brad Campbell:
> On 23/05/11 07:44, Brad Campbell wrote:
>> He used WD commodity drives on a "hardware" RAID enclosure that needed
>> TLER. The RAID-5 kicked out 4
>> drives in a short period of time, so he power cycled it and
>> re-initialised the array and it came up
>> fine, but blank (as it would as he re-initialised it).
>>
> 
> Just to clarify that as it was somewhat muddled. The initial failure was
> on an unspecified array with unspecified drives and resulted in a blank
> array. The backup failure was TLER related using WD GP drives on a
> hardware array and was left unresolved.
> 
> That's still not concrete evidence of those drives failing, it's just
> using the wrong tool for the wrong job.

Just to clarify a bit more: the older WD20EADS (notice the 'D') worked
very well and until Nov 2009 they were TLER capable (which means: ERC
timeouts could be set to non-zero, and the setting was preserved over
power-cycles).  Shortly after the firmware patch that removed the
TLER ability, the WD20EARS (notice the 'R') appeared.  From that moment on
our WD failure rates started to climb noticeably and have not fallen
again since.
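
On drives that still accept it, the ERC timeout described above is usually set
through smartmontools' SCT ERC log option. What follows is only a minimal
sketch: it assumes smartctl is installed and that the drive supports SCT ERC,
and the device path and the 7-second values are illustrative assumptions, not
settings taken from this thread.

```python
import subprocess


def scterc_command(device, read_ds=70, write_ds=70):
    """Build a smartctl call that sets the SCT ERC read/write timeouts.

    Timeouts are given in tenths of a second, so 70 means 7.0 seconds.
    The default of 70 is an illustrative assumption.
    """
    if read_ds < 0 or write_ds < 0:
        raise ValueError("timeouts must be non-negative")
    return ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device]


def apply_scterc(device, read_ds=70, write_ds=70):
    """Run the command; needs root and a drive that supports SCT ERC."""
    return subprocess.run(
        scterc_command(device, read_ds, write_ds),
        check=False,
        capture_output=True,
        text=True,
    )


# Only builds the argument list; nothing is executed here.
print(scterc_command("/dev/sdX"))
```

On drives where the setting is not preserved over power-cycles, it would have
to be reapplied at every boot.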

We also noticed that WD started to make the same mistake (customer-wise)
as Seagate.  More than 50% of their "certified repaired" disks (those
which you get back after sending in defective drives for RMA) died soon
after being put back to work.  I will not comment further on this
statement.

We sell 200+ drives a week from our "at that time preferred"
manufacturer.  That was WD from 2008 till the beginning of 2010.  But
since the climb in WD failure rates we have been with Hitachi and have
astounding failure rates of less than one percent.  I hope this will
stay the case even after WD bought Hitachi GST...

Conclusion about this university storage failure: wrong drives for this
scenario.  It would've been OK to use the cheap WDs for backup (if the
backup was at least RAID6 and sent error mails to the admin).  But the
primary storage was a big fail.  You do not use this kind of storage for
data which is worth much time (and by that much money).

Just a few pro-cents,
Stefan

^ permalink raw reply

* Re: HBA Adaptor advice
From: Roman Mamedov @ 2011-05-23  4:09 UTC (permalink / raw)
  To: Brad Campbell; +Cc: Stan Hoeppner, linux-raid
In-Reply-To: <4DD99A16.5010108@fnarfbargle.com>

On Mon, 23 May 2011 07:19:50 +0800
Brad Campbell <lists2009@fnarfbargle.com> wrote:

> On 23/05/11 03:25, Stan Hoeppner wrote:
> >> Actually, I'm pretty sure the WD drives have a 5400 rpm spindle speed
> >> period. I've got 15 of them here and I have no evidence of any form of
> >> spindle speed variation. They say the drives have spindle speed :
> >> "intellipower" which is marketspeak for slow enough to save a few watts,
> >> but fast enough to do the job.
> >
> > From:  http://www.anandtech.com/show/2385/2
> >
> > The Western Digital drive's IntelliPower algorithm, which varies the
> > rotational speed between 5400RPM and 7200RPM, dictates the Western
> > Digital's rotational speed.

You can find 100s or 1000s of articles reiterating the manufacturer's
marketing materials without much thought or experimentation.

On the other hand there have been some tests of these drives, using a
microphone and a rotational-noise-frequency-to-RPM calculation, which show
that the speed never varies. I can't dig those up off-hand though, so for the
'next-best' proof see the HddRpmEst results at the Japanese link below.
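
The frequency-to-RPM conversion behind such a test is simple arithmetic. Here
is a rough sketch, assuming the measured tone is the fundamental (one cycle
per platter revolution) and not a harmonic; the function names and the list of
nominal speeds are made up for illustration.

```python
def rpm_from_rotation_hz(fundamental_hz):
    """Convert revolutions per second (Hz) to revolutions per minute."""
    return fundamental_hz * 60.0


def closest_nominal_rpm(fundamental_hz, candidates=(5000, 5400, 7200)):
    """Pick the nominal spindle speed nearest to the measured one."""
    rpm = rpm_from_rotation_hz(fundamental_hz)
    return min(candidates, key=lambda c: abs(c - rpm))


# A 90 Hz fundamental corresponds to 5400 RPM, 120 Hz to 7200 RPM.
print(rpm_from_rotation_hz(90), rpm_from_rotation_hz(120))
```

A drive that really switched between 5400 and 7200 RPM would show its
fundamental moving between about 90 Hz and 120 Hz.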

> "In 2007 Western Digital announced the WD GP drive touting rotational 
> speed "between 7200 and 5400 rpm", which, if potentially misleading, is 
> technically correct; the drive spins at 5405 rpm, and the Green Power 
> spin speed is not variable.[citation needed]"

And here are a couple of [citations]:
http://www.ciol.com/News/News-Reports/Seagate-targets-Western-Digitals-IntelliPower/131009126262/0/
http://www.storagereview.com/1000.sr

By the way, 5400-7200 isn't even true in any sense of the word. 
There are some models of WD20EARS (e.g. 00MVWB0, maybe others) which spin at
constant 5000 RPM instead: http://club.coneco.net/user/10682/review/37049/

> > IIRC from discussions here, mdadm has alignment issues with hybrid
> > sector size drives when assembling raw disks.  Not everyone assembles
> > their md devices from partitions.  Many assemble raw devices.
> 
> Which means the data starts at sector 0. That's an even multiple of 8. 
> Job done. (Mine are all assembled raw also).

The data does not necessarily start at sector 0. However, it is still most
likely to be fine:

$ sudo mdadm --examine /dev/sdb3 | grep Offset
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
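
The same check can be scripted. This is a small sketch that parses the two
lines shown above and tests the offsets against a 4 KiB boundary; it assumes
the offsets are reported in 512-byte sectors, and the helper names are
hypothetical.

```python
import re

SECTOR_BYTES = 512  # assumption: offsets are counted in 512-byte sectors


def parse_offsets(examine_text):
    """Extract the 'Data Offset' and 'Super Offset' values, in sectors."""
    offsets = {}
    for line in examine_text.splitlines():
        m = re.match(r"\s*(Data|Super) Offset\s*:\s*(\d+)\s+sectors", line)
        if m:
            offsets[m.group(1).lower()] = int(m.group(2))
    return offsets


def is_aligned(sectors, boundary_bytes=4096):
    """True if the byte offset is a whole multiple of the boundary."""
    return (sectors * SECTOR_BYTES) % boundary_bytes == 0


sample = """\
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
"""
offsets = parse_offsets(sample)
print(offsets, is_aligned(offsets["data"]), is_aligned(offsets["super"]))
```

The offset is relative to the start of the member device, so when the member
is a partition (as with /dev/sdb3 here) the partition start has to be aligned
as well.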

> > Google WD20EARS and you'll find a 100:1 or more post ratio of problems
> > vs praise for this drive.  This is the original 2TB model which has
> > shipped in much greater numbers into the marketplace than all other
> > Green drives.  Heck, simply search the archives of this list.
> 
> Indeed, but the same follows for almost any drive. People are quick to 
> voice their discontent but not so quick to praise something that does 
> what it says on the tin.

Personally so far I have been successful in avoiding the "Advanced Format" crap
that WD and others are pushing down customers' throats; I have none of such
drives. It *is* possible to make a non-AF 2TB drive, even a 3-platter one. And
this one is, in my opinion, the ideal Green drive to buy today, which has the
advantages of being non-AF and at the same time still in production (maybe
not for too long, with WD buying Hitachi :-/ ):

  Hitachi 5K3000 HDS5C3020ALA632
  http://www.newegg.com/Product/Product.aspx?Item=N82E16822145475

I also had only the best experiences with Hitachi HDDs, and it looks like I am
not alone:
http://www.tomshardware.com/reviews/hdd-reliability-storelab,2681-2.html

-- 
With respect,
Roman


^ permalink raw reply

* Re: HBA Adaptor advice
From: Tobias McNulty @ 2011-05-23  3:39 UTC (permalink / raw)
  To: Jim Schatzman; +Cc: linux-raid
In-Reply-To: <20110523021234.1BAEAF9C03C@mail.fulab.com>

On Sun, May 22, 2011 at 10:11 PM, Jim Schatzman
<James.Schatzman@fulab.com> wrote:
> One more input re. using cheap drives.
>
> I have been running about 20 Western Digital "Green" and "Enterprise" drives (half are 1.5 TB and half are 2 TB) for several years in Raid-5 and Raid-6 configurations (all linux md). They are up 24x7. When they first came on the market, about 30% of my new drives failed within 3 months of operation (about equal fractions of Green and Enterprise). Overall, 50% of the drives eventually failed - 35% of the Green drives and 100% of the Enterprise drives. In the past 18 months, one has failed (an Enterprise drive). That drive was a warranty replacement for an earlier Enterprise drive failure.
>
> My impression is that Western Digital was having some quality control issues with the 2TB drives - both Green and Enterprise. This was very annoying. It appears that quality has improved. I never lost any data nor ever had to restore from backup because I was always able to replace the bad drive and rebuild the raid without any difficulty that could not be solved through this forum.
>
> My experience suggests that the WD Enterprise class drives were an unnecessary expense, at least as far as reliability is concerned.
>
> Would I recommend cheap SATA drives for mission critical data?  Absolutely not. I wouldn't recommend any SATA drives. Go with the most expensive SAS drives available. For that matter, I have loads of SCSI drives that are still going fine after 5 to 10 years of 365x24x7 operation.
>
> If you are going to build a RAID from cheap drives, expect that part of your hardware savings will be offset by labor costs. Run SMART checks often. Also, and I cannot emphasize this enough, make certain that everything attached to the RAID is plugged into a high quality UPS. Otherwise, you are just asking for a power spike to take out multiple drives and/or the controller and to lose data.

One odd statistical fluke regarding quality control on the large Green
drives:  I ordered 3 of the drives on Amazon and 3 on Newegg, and all
3 of the Newegg drives failed very quickly (within a couple weeks),
while all 3 of the Amazon drives are still going strong (5 months old
and on 24/7).  I'm not sure if it was a packaging issue or an issue
with that particular set of drives that Newegg had in stock, but it
left me wondering.

Tobias
-- 
Tobias McNulty, Managing Member
Caktus Consulting Group, LLC
http://www.caktusgroup.com
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: HBA Adaptor advice
From: Jim Schatzman @ 2011-05-23  2:11 UTC (permalink / raw)
  To: linux-raid

One more input re. using cheap drives. 

I have been running about 20 Western Digital "Green" and "Enterprise" drives (half are 1.5 TB and half are 2 TB) for several years in Raid-5 and Raid-6 configurations (all linux md). They are up 24x7. When they first came on the market, about 30% of my new drives failed within 3 months of operation (about equal fractions of Green and Enterprise). Overall, 50% of the drives eventually failed - 35% of the Green drives and 100% of the Enterprise drives. In the past 18 months, one has failed (an Enterprise drive). That drive was a warranty replacement for an earlier Enterprise drive failure.
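
The percentages above pin down the Green/Enterprise mix, which is not stated
directly. This is a back-of-the-envelope sketch only, treating the quoted
figures (about 20 drives, 50% overall, 35% Green, 100% Enterprise) as exact
even though they are clearly rounded; the function name is made up.

```python
def implied_split(total, overall_rate, rate_a, rate_b):
    """Solve a + b = total and rate_a*a + rate_b*b = overall_rate*total."""
    if rate_a == rate_b:
        raise ValueError("rates must differ to determine the split")
    a = total * (rate_b - overall_rate) / (rate_b - rate_a)
    return a, total - a


green, enterprise = implied_split(20, 0.50, 0.35, 1.00)
print(round(green, 1), round(enterprise, 1))  # 15.4 4.6
```

So the figures are consistent with roughly 15 Green and 5 Enterprise drives,
which is a small sample for drawing conclusions about the Enterprise line.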

My impression is that Western Digital was having some quality control issues with the 2TB drives - both Green and Enterprise. This was very annoying. It appears that quality has improved. I never lost any data nor ever had to restore from backup because I was always able to replace the bad drive and rebuild the raid without any difficulty that could not be solved through this forum.

My experience suggests that the WD Enterprise class drives were an unnecessary expense, at least as far as reliability is concerned.

Would I recommend cheap SATA drives for mission critical data?  Absolutely not. I wouldn't recommend any SATA drives. Go with the most expensive SAS drives available. For that matter, I have loads of SCSI drives that are still going fine after 5 to 10 years of 365x24x7 operation.

If you are going to build a RAID from cheap drives, expect that part of your hardware savings will be offset by labor costs. Run SMART checks often. Also, and I cannot emphasize this enough, make certain that everything attached to the RAID is plugged into a high quality UPS. Otherwise, you are just asking for a power spike to take out multiple drives and/or the controller and to lose data.

Jim

^ permalink raw reply

* Re: HBA Adaptor advice
From: Brad Campbell @ 2011-05-23  0:07 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: linux-raid
In-Reply-To: <4DD99FF2.2030609@fnarfbargle.com>

On 23/05/11 07:44, Brad Campbell wrote:
> He used WD commodity drives on a "hardware" RAID enclosure that needed TLER. The RAID-5 kicked out 4
> drives in a short period of time, so he power cycled it and re-initialised the array and it came up
> fine, but blank (as it would as he re-initialised it).
>

Just to clarify that as it was somewhat muddled. The initial failure was on an unspecified array 
with unspecified drives and resulted in a blank array. The backup failure was TLER related using WD 
GP drives on a hardware array and was left unresolved.

That's still not concrete evidence of those drives failing, it's just using the wrong tool for the 
wrong job.

^ permalink raw reply

* Re: HBA Adaptor advice
From: Brad Campbell @ 2011-05-22 23:44 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: Brad Campbell, linux-raid
In-Reply-To: <4DD9633E.5000101@hardwarefreak.com>

On 23/05/11 03:25, Stan Hoeppner wrote:
>
> And that backup array may fail you when you need it most:  during a
> restore.  Search the XFS archives for the horrific tale at University of
> California Santa Cruz.  The SA lost ~7TB of doctoral student research
> data due to multiple WD20EARS drives in his primary storage arrays *and*
> his D2D backup array dying in quick succession.  IIRC multiple grad
> students were forced to attend another semester to redo their
> experiments and field work to recreate the lost data, so they could then
> submit their theses.

So I "googled" that thread, and after I picked my way past all the top rating hits which appear to 
be you telling people to google that thread I found the real problem.

He used WD commodity drives on a "hardware" RAID enclosure that needed TLER. The RAID-5 kicked out 4 
drives in a short period of time, so he power cycled it and re-initialised the array and it came up 
fine, but blank (as it would as he re-initialised it).

Sorry Stan, that's not a failure of the drives. He lost the data due to limitations in his RAID 
configuration and bad management.

^ permalink raw reply

* Re: HBA Adaptor advice
From: Brad Campbell @ 2011-05-22 23:19 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: linux-raid
In-Reply-To: <4DD9633E.5000101@hardwarefreak.com>

On 23/05/11 03:25, Stan Hoeppner wrote:
> On 5/22/2011 5:09 AM, Brad Campbell wrote:
>> On 22/05/11 17:04, Stan Hoeppner wrote:
>>
>>> WD's Green drives have a 5400 rpm 'variable' spindle speed.  The Seagate
>>> 2.5" SAS drive has a 7.2k spindle speed.
>>
>> Actually, I'm pretty sure the WD drives have a 5400 rpm spindle speed
>> period. I've got 15 of them here and I have no evidence of any form of
>> spindle speed variation. They say the drives have spindle speed :
>> "intellipower" which is marketspeak for slow enough to save a few watts,
>> but fast enough to do the job.
>
> From:  http://www.anandtech.com/show/2385/2
>
> The Western Digital drive's IntelliPower algorithm, which varies the
> rotational speed between 5400RPM and 7200RPM, dictates the Western
> Digital's rotational speed.

"In 2007 Western Digital announced the WD GP drive touting rotational 
speed "between 7200 and 5400 rpm", which, if potentially misleading, is 
technically correct; the drive spins at 5405 rpm, and the Green Power 
spin speed is not variable.[citation needed]"

http://en.wikipedia.org/wiki/Western_Digital

They're not variable. Or to put it another way, if they _can_ vary the 
spindle speed none of mine ever do.

Can you imagine the potential vibration nightmare as 10 drives vary 
their spindle speed up and down? Not to mention the extra load on the 
+12V rail and the delays while waiting for the platters to reach servo lock?

> IIRC from discussions here, mdadm has alignment issues with hybrid
> sector size drives when assembling raw disks.  Not everyone assembles
> their md devices from partitions.  Many assemble raw devices.

Which means the data starts at sector 0. That's an even multiple of 8. 
Job done. (Mine are all assembled raw also).

> You must boot your server with MS-DOS or FreeDOS and run wdidle3 once
> for each Green drive in the system.  But, IIRC, if the drives are
> connected via SAS expander or SATA PMP, this will not work.  A direct
> connection to the HBA is required.

Indeed. In my workshop I have an old machine with 3 SATA hotswap bays 
that allowed me to do 3 at once, booting off a USB key into DOS.

> Once one accounts for all the necessary labor and configuration
> contortions one must put himself through to make a Green drive into a
> 'regular' drive, it is often far more cost effective to buy 'regular'
> drives to begin with.  This saves on labor $$ which is usually greater,
> from a total life cycle perspective, than the drive acquisition savings.
>  The drives you end up with are already designed and tuned for the
> application.  Reiterating Rudy's earlier point, using the Green drives
> in arrays is "penny wise, pound foolish".
>

I agree with you. If I were doing it again I'd spend some extra $$$ on 
better drives, but I've already outlaid the cash and have a working array.

> Google WD20EARS and you'll find a 100:1 or more post ratio of problems
> vs praise for this drive.  This is the original 2TB model which has
> shipped in much greater numbers into the marketplace than all other
> Green drives.  Heck, simply search the archives of this list.

Indeed, but the same follows for almost any drive. People are quick to 
voice their discontent but not so quick to praise something that does 
what it says on the tin.

> And that backup array may fail you when you need it most:  during a
> restore.  Search the XFS archives for the horrific tale at University of
> California Santa Cruz.  The SA lost ~7TB of doctoral student research
> data due to multiple WD20EARS drives in his primary storage arrays *and*
> his D2D backup array dying in quick succession.  IIRC multiple grad
> students were forced to attend another semester to redo their
> experiments and field work to recreate the lost data, so they could then
> submit their theses.
>

Perhaps. Mine get a SMART short test every morning, a LONG every Sunday 
and a complete array scrub every other Sunday. My critical backups are 
also replicated to a WD World Edition Mybook that lives in another building.
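
A schedule like this is easy to express in code. The sketch below only decides
which jobs are due on a given date; the task names are placeholders, and
picking "every other Sunday" by the parity of a week counter is an assumption
about which Sundays are meant.

```python
from datetime import date


def tasks_for(day):
    """Return the maintenance jobs due on the given date."""
    tasks = ["smart-short"]              # short SMART test every morning
    if day.weekday() == 6:               # Monday is 0, so Sunday is 6
        tasks.append("smart-long")       # long SMART test every Sunday
        # Alternate weeks using a running week counter, which keeps the
        # two-week rhythm intact across year boundaries.
        if (day.toordinal() // 7) % 2 == 0:
            tasks.append("scrub")        # array scrub every other Sunday
    return tasks


for d in (date(2011, 5, 22), date(2011, 5, 23), date(2011, 5, 29)):
    print(d.isoformat(), tasks_for(d))
```

The function could be called from a daily cron job that then starts the
matching SMART test or array check.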

I've had quite a few large arrays over the years, all comprised of the 
cheapest available storage at the time. I've had drives fail, but aside 
from a Sil3124 controller induced array failure I've never lost data 
because of a cheap hard disk and I've saved many, many, many $$$ on drives.

I'm not arguing the penny wise, pound foolish sentiment. I'm just 
stating my personal experience has been otherwise with drives.


^ permalink raw reply

* Re: HBA Adaptor advice
From: Johannes Truschnigg @ 2011-05-22 21:13 UTC (permalink / raw)
  To: Tobias McNulty; +Cc: linux-raid
In-Reply-To: <BANLkTi=az3ny5m7Mf20t5fnFj79CsowJWg@mail.gmail.com>

On 05/22/2011 10:57 PM, Tobias McNulty wrote:
> Case in point: I have 4 of these 2TB Green drives in a RAID5 array.  I
> assembled them from the raw devices (no partition table) without any
> special precautions.  Am I in trouble?  The array seems to be working
> fine...

No, you aren't. If you don't create a partition table in the first
place, there's no possibility for partition boundaries to be mis-aligned
in regard to the physical sector or erase block size of the underlying
blockdevice. You could probably still get it wrong if you chose (if
that's even possible, I don't know for sure off-hand) a very weird
non-power-of-two chunk size that happens to interfere with the sector
size of your disks in a bad way, but since md's default chunk sizes are
rather large powers of two, you'd have to put some effort into screwing
up (if that is at all possible, as I mentioned before) ;)
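
The chunk-size concern can be checked with one line of arithmetic. This is a
minimal sketch, assuming 4096-byte physical sectors and chunk sizes given in
KiB; the values are illustrative, and whether md accepts a
non-power-of-two chunk at all is left open here just as it is above.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... and False for everything else."""
    return n > 0 and (n & (n - 1)) == 0


def chunk_fits_sectors(chunk_kib, physical_sector_bytes=4096):
    """True if every chunk boundary falls on a physical sector boundary."""
    return (chunk_kib * 1024) % physical_sector_bytes == 0


for chunk in (4, 64, 512, 6):
    print(chunk, is_power_of_two(chunk), chunk_fits_sectors(chunk))
```

Any power-of-two chunk of 4 KiB or more is a whole multiple of 4096 bytes, so
only an odd size such as the hypothetical 6 KiB would straddle physical
sectors.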

-- 
with best regards:
- Johannes Truschnigg ( johannes@truschnigg.info )

www: http://johannes.truschnigg.info/
phone: +43 650 2 133337
xmpp: johannes@truschnigg.info

Please do not bother me with HTML-eMail or attachments. Thank you.


^ permalink raw reply

* Re: HBA Adaptor advice
From: Tobias McNulty @ 2011-05-22 20:57 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: Brad Campbell, linux-raid
In-Reply-To: <4DD9633E.5000101@hardwarefreak.com>

On Sun, May 22, 2011 at 3:25 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
>
> On 5/22/2011 5:09 AM, Brad Campbell wrote:
> > On 22/05/11 17:04, Stan Hoeppner wrote:
> >> It's difficult to align partitions properly on the Green drives due to
> >> native 4K sectors translated by drive firmware to 512B sectors.  The
> >> Seagate SAS drive has native 512B sectors.
> >
> > Actually it's not difficult at all. You just make sure all your
> > partitions start on an even multiple of 8 sectors. No magic in it. Just
> > the same as all my SSD partitions start on 512k boundaries.
>
> IIRC from discussions here, mdadm has alignment issues with hybrid
> sector size drives when assembling raw disks.  Not everyone assembles
> their md devices from partitions.  Many assemble raw devices.

Case in point: I have 4 of these 2TB Green drives in a RAID5 array.  I
assembled them from the raw devices (no partition table) without any
special precautions.  Am I in trouble?  The array seems to be working
fine...

Tobias
--
Tobias McNulty, Managing Member
Caktus Consulting Group, LLC
http://www.caktusgroup.com

^ permalink raw reply

* Re: HBA Adaptor advice
From: Stan Hoeppner @ 2011-05-22 19:25 UTC (permalink / raw)
  To: Brad Campbell; +Cc: linux-raid
In-Reply-To: <4DD8E0D3.1030905@fnarfbargle.com>

On 5/22/2011 5:09 AM, Brad Campbell wrote:
> On 22/05/11 17:04, Stan Hoeppner wrote:
> 
>> WD's Green drives have a 5400 rpm 'variable' spindle speed.  The Seagate
>> 2.5" SAS drive has a 7.2k spindle speed.
> 
> Actually, I'm pretty sure the WD drives have a 5400 rpm spindle speed
> period. I've got 15 of them here and I have no evidence of any form of
> spindle speed variation. They say the drives have spindle speed :
> "intellipower" which is marketspeak for slow enough to save a few watts,
> but fast enough to do the job.

From:  http://www.anandtech.com/show/2385/2

The Western Digital drive's IntelliPower algorithm, which varies the
rotational speed between 5400RPM and 7200RPM, dictates the Western
Digital's rotational speed.

>> It's difficult to align partitions properly on the Green drives due to
>> native 4K sectors translated by drive firmware to 512B sectors.  The
>> Seagate SAS drive has native 512B sectors.
> 
> Actually it's not difficult at all. You just make sure all your
> partitions start on an even multiple of 8 sectors. No magic in it. Just
> the same as all my SSD partitions start on 512k boundaries.

IIRC from discussions here, mdadm has alignment issues with hybrid
sector size drives when assembling raw disks.  Not everyone assembles
their md devices from partitions.  Many assemble raw devices.

>> The Green drives have aggressive power saving firmware not suitable for
>> business use as the heads are auto parked every 8 seconds or so.  IIRC
>> the drive goes into sleep mode after a short period of inactivity on the
>> host interface.  In short, these drives are designed optimally for the
>> "is not running" case rather than the "running" case.  Hence the name
>> "Green".  How do you save power?  Turn off the drive.  And that's
>> exactly what these drives are designed to do.
> 
> You can turn off the aggressive head parking with a little DOS utility,
> and they don't go to sleep at all unless you tell them to. They will
> happily keep spinning just the same as any other disk.

You must boot your server with MS-DOS or FreeDOS and run wdidle3 once
for each Green drive in the system.  But, IIRC, if the drives are
connected via SAS expander or SATA PMP, this will not work.  A direct
connection to the HBA is required.

Once one accounts for all the necessary labor and configuration
contortions one must put himself through to make a Green drive into a
'regular' drive, it is often far more cost effective to buy 'regular'
drives to begin with.  This saves on labor $$ which is usually greater,
from a total life cycle perspective, than the drive acquisition savings.
 The drives you end up with are already designed and tuned for the
application.  Reiterating Rudy's earlier point, using the Green drives
in arrays is "penny wise, pound foolish".

Google WD20EARS and you'll find a 100:1 or more post ratio of problems
vs praise for this drive.  This is the original 2TB model which has
shipped in much greater numbers into the marketplace than all other
Green drives.  Heck, simply search the archives of this list.

> I'm running them in a couple of large(ish) RAID arrays. I'm not saying
> it's a good idea, it's just been my experience with ultra-cheap drives
> that if you burn in the drives to weed out the early failures, and you
> keep them running 24/7 in a nice environment they tend to last long
> enough to do the job. I tend to replace my drives at around ~30,000
> hours, so these have a long way to go yet.

You're one out of 100.  Congratulations. :)

> On the other hand, I have my company data on Seagate Cheetah SAS drives
> in RAID-10, but I back up to the large WD Green arrays.

And that backup array may fail you when you need it most:  during a
restore.  Search the XFS archives for the horrific tale at University of
California Santa Cruz.  The SA lost ~7TB of doctoral student research
data due to multiple WD20EARS drives in his primary storage arrays *and*
his D2D backup array dying in quick succession.  IIRC multiple grad
students were forced to attend another semester to redo their
experiments and field work to recreate the lost data, so they could then
submit their theses.

How much did this incident cost the university and the Ph. D. students
in real money and lost time?  I'm sure some actuaries might be able to
tell you, and the real cost is likely hundreds of thousands of times the
cost savings of using these crap drives, especially when you figure in
the lost salaries for 6 months of these Ph. D. students.  Depending on
their field this could be over $100k per student.  If 10 such students
were affected that's potentially $1 million in lost earnings alone.

Spending an additional $10-20K on proper disk drives would have saved an
enormous amount in this case, and not just purely money.  If you were
one of the students who was told you had to repeat a semester because a
computer lost all of your research data, how would you digest and cope
with that?  I'd bet at least one, if not more, lawsuits/settlements will
result from this.

Given that things like this can, and DO happen when banking on cheap
consumer drives in a production environment, why would anyone ever take
such a chance?

-- 
Stan

^ permalink raw reply

* Re: Performarce raid6 degraded
From: Pol Hallen @ 2011-05-22 12:57 UTC (permalink / raw)
  To: Linux-RAID

> What was it when the array was not degraded?

Ehm... I never wrote down the results :-O

I discovered another bad disk... problems, problems...

thanks :-)

Pol

^ permalink raw reply
