* Re: Software raid, booting and bios
From: Roberto Spadim @ 2011-05-20 21:27 UTC
To: Phil Turmel; +Cc: Paul van der Vlis, linux-raid
In-Reply-To: <4DD6C1EB.1030907@turmel.org>
I've been using USB 'pendrives' for 3 years without problems, though only
with Damn Small Linux or SliTaz running on a desktop machine.
2011/5/20 Phil Turmel <philip@turmel.org>:
> {do use reply-to-all on kernel.org lists... not everyone is subscribed}
>
> On 05/20/2011 11:53 AM, Paul van der Vlis wrote:
>> On 20-05-11 14:11, Phil Turmel wrote:
>>> (Just to show what's out there.) The embedded boards I use
>>> occasionally have the equivalent of this soldered to their
>>> motherboards.
>>>
>>> The best DMA capable CF cards are usually found in markets that cater
>>> to industrial designers or to professional photographers.
>>
>> Do you think the risk of a problem with a CF card (or something like
>> that) is much lower than the risk of a problem with a harddisk?
>
> The big deal is the lack of moving parts: No spindle bearing, no head positioner gear train. On top of that, when set up to support your boot tasks only, there's no write activity to wear it out.
>
>> And what about booting from a USB stick?
>
> Just as good, technically, IMHO. If mounted internally, just as good, period. Plugged into an external port, I'd be wary of some uninformed soul pulling it out. CF cards look like they "belong".
>
> I like Ed W's suggestions, as well, with the caveat that their usefulness would make them more likely to be "borrowed". Even by yourself, in a pinch.
>
> Phil
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
Roberto Spadim
Spadim Technology / SPAEmpresarial
* Re: HBA Adaptor advice
From: Brad Campbell @ 2011-05-20 21:23 UTC
To: RAID Linux
In-Reply-To: <BANLkTimBX-oCoOT2+GSSzAbs0U_MNv2Mpw@mail.gmail.com>
On 21/05/11 04:58, Tobias McNulty wrote:
>
> * SYBA SD-PEX40031 (Pericom PI7C9X111 + Silicon Image Sil3124
> chipset) - lots of errors during heavy I/O such as resync'ing
Oh dear.. Gee, another item of useless anecdotal evidence directly attributable to cheaply
manufactured cards from a third world country and no direct correlation to a flaky chipset design.
> * 3ware 9650SE-8LPML-Sgl - I thought money would solve the problem,
> but I didn't realize that you can't use an expensive RAID card to
> access existing data on the disk
Ahh..
> * Supermicro AOC-SASLP-MV8 - I thought it would be a perfect match
> given my Supermicro motherboard, but also gave me lots of errors
> during heavy I/O and this experience seems to be confirmed by other
> users in this thread
Oh dear.. Yes, the mvsas driver has been noted to be somewhat problematic still.
I hope you're not a betting man. 3 for 3 is not a great record thus far. On the up-side the 7042's
have been as solid as a rock.. and _fast_ for the last couple of years. Marvell worked with Mark
Lord and the result was a workable version of the sata_mv driver. Shame they don't do the same with
the mvsas code.
Additionally, I migrated two arrays from the Marvell7042 controllers onto the LSI based "IBM"
controllers configured up as JBOD and they just worked. No initialisation or reconfiguration
required at all.
Brad
* Re: HBA Adaptor advice
From: Tobias McNulty @ 2011-05-20 20:58 UTC
To: Brad Campbell; +Cc: RAID Linux
In-Reply-To: <4DD5867A.2090102@fnarfbargle.com>
On Thu, May 19, 2011 at 5:07 PM, Brad Campbell
<lists2009@fnarfbargle.com> wrote:
>
> On 19/05/11 20:26, Ed W wrote:
>
>> Please add suggestions for good value, reliable controllers known to
>> work well with linux
>
> I have three of these :
>
> http://www.startech.com/product/PEXSATA24E-2-Port-eSATA-4-Port-SATA-PCI-Express-x4-SATA-Controller-Adapter-Card-PCIe
>
> and 4 of these :
>
> http://www.ebay.com.au/itm/IBM-M1015-46M0861-ServeRAID-M1015-SAS-SATA-Controller-/280655527117?pt=AU_Server_Accessories_Parts&hash=item41585f7ccd
>
> All of which I can't recommend highly enough.
>
> I got the Startech ones cheap from a dodgy shop about 4 years ago. They cost me about $30 each.
> I got the IBM (really LSI) ones cheap from ebay at about $110 each at Christmas.
>
> The Startech cards use the sata_mv driver and are solid, the LSI cards use the megaraid_sas driver and are solid. As a bonus of having SAS ports, I picked up 4 Seagate Cheetah 15k.5 SAS drives for a wicked fast RAID10 array.
So, I've been through 3 cards in my current NAS, none of which
fit my needs for one reason or another, and I had given up until this
thread reignited my interest in having more than 6 available SATA
ports in the box. The cards I've tried are:
* SYBA SD-PEX40031 (Pericom PI7C9X111 + Silicon Image Sil3124
chipset) - lots of errors during heavy I/O such as resync'ing
* 3ware 9650SE-8LPML-Sgl - I thought money would solve the problem,
but I didn't realize that you can't use an expensive RAID card to
access existing data on the disk
* Supermicro AOC-SASLP-MV8 - I thought it would be a perfect match
given my Supermicro motherboard, but also gave me lots of errors
during heavy I/O and this experience seems to be confirmed by other
users in this thread
In all cases switching back to the onboard SATA ports resulted in
seamless operation (same drives, cables, etc.).
In light of what I've learned in this thread I just ordered the
Rosewill RC-218 SATA card, which has the same Marvell 88SX7042 chipset
as the Startech link above, but runs only $80 on Newegg [1] and seems
to have good reviews from a few Linux users. I'll report back after I
get it installed next week.
Cheers,
Tobias
[1] http://www.newegg.com/Product/Product.aspx?Item=N82E16816132018
--
Tobias McNulty, Managing Member
Caktus Consulting Group, LLC
http://www.caktusgroup.com
* Re: HBA Adaptor advice
From: Stan Hoeppner @ 2011-05-20 20:58 UTC
To: Drew; +Cc: linux-raid
In-Reply-To: <BANLkTimP_5HHAq3Fq2sAkasZKH85gJFtYw@mail.gmail.com>
On 5/20/2011 3:24 PM, Drew wrote:
>> It's a shame; maybe there will be disks with battery-backed cache
>> one day.
>
> There's already hybrid drives which pack a small SSD onboard to act as
> a large cache.
These hybrid drives still have a small 32-64MB cache DRAM, in front of
the SSD. The DRAM loses its contents when the power goes out. The on
board SSD doesn't prevent this cache data loss.
It may be worth noting that most, if not all, pure SSDs also have cache
DRAM in front of the flash array, and thus will lose data in the cache
when the power fails. Some models have what has been termed "super
capacitors" on board to power the device long enough to flush pending
writes in cache to the flash cells, but few, if any, of the
manufacturers advertise that their drives have this feature, or even
bother to put it on the spec sheet. So there's no easy/consistent way,
at present, to really know if your SSD has this feature or not.
As always, a good data persistence strategy starts with a good UPS.
Laptop users have an advantage as they get a free built-in UPS and,
typically, good software integration to automatically and safely
shut down when the battery is about out of juice.
--
Stan
* Re: HBA Adaptor advice
From: Drew @ 2011-05-20 20:24 UTC
To: linux-raid
In-Reply-To: <20110520200100.GF4759@bitfolk.com>
> It's a shame; maybe there will be disks with battery-backed cache
> one day.
There's already hybrid drives which pack a small SSD onboard to act as
a large cache.
--
Drew
* Re: HBA Adaptor advice
From: Stan Hoeppner @ 2011-05-20 20:12 UTC
To: linux-raid
In-Reply-To: <20110520200100.GF4759@bitfolk.com>
On 5/20/2011 3:01 PM, Andy Smith wrote:
> It's a shame; maybe there will be disks with battery-backed cache
> one day.
You'll never see a cache DRAM BBU built into a drive. If this *concept*
were to be implemented it would be done with flash and a capacitor
instead of a BBU. The capacitor would be sized to hold just enough
juice to power the ASIC, flash chip, and related circuitry, and write
the cache DRAM contents to the flash chip after sensing power to the
card has been lost.
Many higher end RAID cards already have flash backup of the cache DRAM
in addition to, or instead of, a BBU.
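As a rough illustration of the sizing argument above, the following Python sketch estimates the capacitance needed to keep the flush circuitry powered. Every number in it (power draw, flush time, voltage window) is a made-up placeholder rather than a figure from any drive or RAID card datasheet, and `holdup_capacitance` is a name invented for this example. The only physics used is the stored-energy relation E = C * (V1^2 - V2^2) / 2, with converter losses ignored.

```python
def holdup_capacitance(power_w, flush_s, v_start, v_min):
    """Capacitance (farads) needed to supply power_w for flush_s seconds
    while the capacitor discharges from v_start down to v_min."""
    if not (v_start > v_min > 0):
        raise ValueError("need v_start > v_min > 0")
    energy_j = power_w * flush_s              # E = P * t
    # Usable energy between two voltages: E = C * (v_start^2 - v_min^2) / 2
    return 2.0 * energy_j / (v_start ** 2 - v_min ** 2)


# Hypothetical numbers: 2 W for the ASIC and flash, 1 s to flush,
# capacitor usable from 5.0 V down to 3.0 V.
c = holdup_capacitance(power_w=2.0, flush_s=1.0, v_start=5.0, v_min=3.0)
print(f"{c:.2f} F")  # prints 0.25 F
```

With these assumed values the answer lands in the fraction-of-a-farad range, which is why such parts are usually described as "super capacitors".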
--
Stan
* Re: HBA Adaptor advice
From: Andy Smith @ 2011-05-20 20:01 UTC
To: linux-raid
In-Reply-To: <4DD65C18.5090804@gmail.com>
Hi Joe,
On Fri, May 20, 2011 at 08:18:32AM -0400, Joe Landman wrote:
>> On 20/05/2011 03:08, Andy Smith wrote:
>>> Are there actually any HBAs that have BBU without using their RAID
>>> features?
>>>
>>> I'd like to stop using hardware RAID but I can't give up the BBU and
>>> write cache.
>
> HBAs don't have BBU or write cache. Only RAIDs do. While you can run
> the RAID in JBOD mode, you effectively lose the cache (and BBU) aspect by
> doing so.
That's what I thought, thanks.
It's a shame; maybe there will be disks with battery-backed cache
one day.
Cheers,
Andy
* Re: Software raid, booting and bios
From: Phil Turmel @ 2011-05-20 19:32 UTC
To: Paul van der Vlis; +Cc: linux-raid
In-Reply-To: <ir62p0$b5o$1@dough.gmane.org>
{do use reply-to-all on kernel.org lists... not everyone is subscribed}
On 05/20/2011 11:53 AM, Paul van der Vlis wrote:
> On 20-05-11 14:11, Phil Turmel wrote:
>> (Just to show what's out there.) The embedded boards I use
>> occasionally have the equivalent of this soldered to their
>> motherboards.
>>
>> The best DMA capable CF cards are usually found in markets that cater
>> to industrial designers or to professional photographers.
>
> Do you think the risk of a problem with a CF card (or something like
> that) is much lower than the risk of a problem with a harddisk?
The big deal is the lack of moving parts: No spindle bearing, no head positioner gear train. On top of that, when set up to support your boot tasks only, there's no write activity to wear it out.
> And what about booting from a USB stick?
Just as good, technically, IMHO. If mounted internally, just as good, period. Plugged into an external port, I'd be wary of some uninformed soul pulling it out. CF cards look like they "belong".
I like Ed W's suggestions, as well, with the caveat that their usefulness would make them more likely to be "borrowed". Even by yourself, in a pinch.
Phil
* Re: Software raid, booting and bios
From: Ed W @ 2011-05-20 19:13 UTC
To: Paul van der Vlis; +Cc: linux-raid
In-Reply-To: <ir58vs$9qp$1@dough.gmane.org>
On 20/05/2011 09:33, Paul van der Vlis wrote:
> You can select the "boot device priority", where you can choose between
> device types (DVD, harddisk, USB, network), but you can choose only one
> SATA disk. Study it, and you will see I am right. I asked my
> rackserver vendor, and they say: "that's always the case".
Hi, what I have done with all my supermicro servers is to buy a tiny USB
flash drive (physically small, not capacity small) - I think what I
bought might be one of the tiny PNY devices, not sure though
The Supermicro boards have internal USB headers mounted on the
motherboard, even with a 1U server I have plenty of room to install my
USB on the MB (could stick them out the back of the server and cable tie
them (or superglue them))
Then I put SysrescueCD on my stick and setup GRUB with a bunch of boot
options.
In my case I'm under the possibly misguided impression that my boot
will fail over to the spare disks if one fails. However, I can set the
subsequent failover to be my USB stick also. I think I have them set at
the moment that the USB stick boots the main drives as normal, but has a
boot menu where I can also boot the sysrescueimage if I need to (I use
this (over IPMI) for initial system installation and serious
maintenance, eg failed grub upgrade or similar).
The only other option that I think the big hosting guys use is to have a
netboot setup which boots everything and can also offer rescue images,
etc. Beyond my skills to setup for my meagre number of servers, but if
you have more than a couple of machines this could be a very good solution?
For my needs the USB stick option is perfect
Sysrescuecd suits me because all my servers are gentoo based - clearly
it will work for other distros also, but you might want to evaluate
other rescue distros before choosing one?
Good luck
Ed W
* Re: why should I "initialize" JBOD disks?
From: Alexander @ 2011-05-20 19:08 UTC
To: Louis-David Mitterrand; +Cc: linux-raid
In-Reply-To: <20110520133244.GA13330@apartia.fr>
----- Message from vindex+lists-linux-raid@apartia.org ---------
Date: Fri, 20 May 2011 15:32:44 +0200
From: Louis-David Mitterrand <vindex+lists-linux-raid@apartia.org>
Subject: why should I "initialize" JBOD disks?
To: linux-raid@vger.kernel.org
> Hi,
>
> I just installed an Adaptec SAS 4805 and plugged 8 disks previously used
> "as is" (no raid) on an LSI adapter but none are recognized as boot
> disk ("no bios installed").
>
> It seems I need to "initialize" (read: wipe) them in the Adaptec bios
> menu before being able to use them as JBOD.
>
> What the fuck?!
The controller puts the configuration details for the disk on the disk
itself. That way you can plug the disk into another port or controller
of the same brand and it will simply work.
> By my definition JBOD should mean "just get out of the way and let me
> use my disks as they are".
>
> Does that "initialization" mean that disks become unusable when plugged
> into an adapter of any other brand or model?
It will show as uninitialized.
> What is the underlying format of JBOD disks? Can they be read by a
> straight, non-raid adapter?
The configuration needs to be stored somewhere on the disk. Either at
the beginning or at the end (or a combination of both).
Hence I would expect the start of the disk to be shifted a bit (what
used to be block 0 on the disk while plugged into the RAID controller
will be some higher block) or the disk will "end" a bit before the
actual end of the disk.
So I'm pretty sure you can dd such a disk onto another one to rescue
the data given the right seek offset.
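One way to look for that offset is sketched below in Python. It rests on assumptions that are not confirmed for the Adaptec format: that the disk originally carried an MBR-style boot sector (ending in the bytes 0x55 0xAA), that sectors are 512 bytes, and that the controller metadata shifted the old sector 0 by a whole number of sectors. `find_boot_sector` is a name made up for this example, and the "disk" is a fake in-memory image.

```python
SECTOR = 512
BOOT_SIG = b"\x55\xaa"


def find_boot_sector(image, max_sectors=4096):
    """Return the index of the first sector that ends with the MBR
    boot signature, or None if none is found in the scanned range."""
    limit = min(len(image) // SECTOR, max_sectors)
    for n in range(limit):
        sector = image[n * SECTOR:(n + 1) * SECTOR]
        if sector[-2:] == BOOT_SIG:
            return n
    return None


# Fake "disk": 3 sectors of controller metadata, then the old sector 0.
metadata = b"\xff" * (3 * SECTOR)
old_mbr = b"\x00" * (SECTOR - 2) + BOOT_SIG
image = metadata + old_mbr + b"\x00" * SECTOR
print(find_boot_sector(image))  # prints 3
```

With dd, an offset found this way would go in the `skip=` operand (input blocks to skip) together with `bs=512`. A signature match is only a hint, so the result should be checked read-only before anything is written.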
Alex.
----- End message from vindex+lists-linux-raid@apartia.org -----
========================================================================
# _ __ _ __ http://www.nagilum.org/ \n icq://69646724 #
# / |/ /__ ____ _(_) /_ ____ _ nagilum@nagilum.org \n +491776461165 #
# / / _ `/ _ `/ / / // / ' \ Amiga (68k/PPC): AOS/NetBSD/Linux #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/ Mac (PPC): MacOS-X / Linux / MacOS-X #
# /___/ x86: FreeBSD/Linux/Solaris/Win2k ARM9: EPOC EV6 #
========================================================================
----------------------------------------------------------------
cakebox.homeunix.net - all the machine one needs..
* md resync looping
From: Schmidt, Annemarie @ 2011-05-20 18:51 UTC
To: linux-raid
Hi,
On a RHEL 6.1 system, I have a 2-disk raid1 array:
>>[root@typhon ~]# mdadm --detail /dev/md21
/dev/md21:
Version : 1.2
Creation Time : Thu May 19 09:15:56 2011
Raid Level : raid1
Array Size : 5241844 (5.00 GiB 5.37 GB)
Used Dev Size : 5241844 (5.00 GiB 5.37 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
...
Number Major Minor RaidDevice State
0 65 18 0 active sync /dev/sdc2
1 65 50 1 active sync /dev/sdk2
After starting I/O to the array, I pulled one of the disks. After
getting an error from the lower level scsi driver regarding an aborted
I/O, the array then went into a tight loop claiming to be resyncing:
05-20 11:01:57 end_request: I/O error, dev sdt, sector 11457968
05-20 11:01:57 md/raid1:md21: Disk failure on sdt2, disabling device.
05-20 11:01:57 md/raid1:md21: Operation continuing on 1 devices.
05-20 11:01:57 md: recovery of RAID array md21
05-20 11:01:57 md: minimum _guaranteed_ speed: 200000 KB/sec/disk.
05-20 11:01:57 md: using maximum available idle IO bandwidth (but not
more than 200000 KB/sec) for recovery.
05-20 11:01:57 md: using 128k window, over a total of 5241844 blocks.
05-20 11:01:57 md: resuming recovery of md21 from checkpoint.
05-20 11:01:57 md: md21: recovery done.
05-20 11:01:57 md: recovery of RAID array md21
05-20 11:01:57 md: minimum _guaranteed_ speed: 200000 KB/sec/disk.
05-20 11:01:57 md: using maximum available idle IO bandwidth (but not
more than 200000 KB/sec) for recovery.
05-20 11:01:57 md: using 128k window, over a total of 5241844 blocks.
05-20 11:01:57 md: resuming recovery of md21 from checkpoint.
05-20 11:01:57 md: md21: recovery done.
05-20 11:01:57 md: recovery of RAID array md21
05-20 11:01:57 md: minimum _guaranteed_ speed: 200000 KB/sec/disk.
05-20 11:01:57 md: using maximum available idle IO bandwidth (but not
more than 200000 KB/sec) for recovery.
05-20 11:01:57 md: using 128k window, over a total of 5241844 blocks.
05-20 11:01:57 md: resuming recovery of md21 from checkpoint.
05-20 11:01:57 md: md21: recovery done.
05-20 11:01:57 md: recovery of RAID array md21
05-20 11:01:57 md: minimum _guaranteed_ speed: 200000 KB/sec/disk.
05-20 11:01:57 md: using maximum available idle IO bandwidth (but not
more than 200000 KB/sec) for recovery.
05-20 11:01:57 md: using 128k window, over a total of 5241844 blocks.
05-20 11:01:57 md: resuming recovery of md21 from checkpoint.
05-20 11:01:57 md: md21: recovery done.
05-20 11:01:57 md: recovery of RAID array md21
...
And on and on.
Has anyone else run into this?
I see that there were changes made to the remove_and_add_spares function
in md.c in RHEL 6. I believe that one of these changes may be causing
the loop, specifically the first "if" statement. The disk that was
pulled has been marked 'faulty' in the rdev->flags and its raid_disk
value is >= 0. Since it is neither In-sync nor Blocked, spares gets
incremented and so md thinks there is a spare when in fact there is not.
In previous revs of md.c, the only way spares got incremented was
through the 2nd "if" statement which would not have been true in my
case:
remove_and_add_spares:
        list_for_each_entry(rdev, &mddev->disks, same_set) {
***********************************
                if (rdev->raid_disk >= 0 &&
                    !test_bit(In_sync, &rdev->flags) &&
                    !test_bit(Blocked, &rdev->flags))
                        spares++;
***********************************
                if (rdev->raid_disk < 0
                    && !test_bit(Faulty, &rdev->flags)) {
                        rdev->recovery_offset = 0;
                        if (mddev->pers->
                            hot_add_disk(mddev, rdev) == 0) {
                                char nm[20];
                                sprintf(nm, "rd%d", rdev->raid_disk);
                                if (sysfs_create_link(&mddev->kobj,
                                                      &rdev->kobj, nm))
                                        /* failure here is OK */;
                                spares++;
                                md_new_event(mddev);
                                set_bit(MD_CHANGE_DEVS, &mddev->flags);
                        } else
                                break;
                }
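The effect of that first test can be reproduced outside the kernel. The Python sketch below is only a model of the first condition quoted above, not kernel code: `Rdev` and both counting functions are names invented here, the second "if" (hot-adding unassigned devices) is not modelled, and the variant that also checks Faulty is one possible change rather than a confirmed fix.

```python
from dataclasses import dataclass


@dataclass
class Rdev:
    name: str
    raid_disk: int          # slot in the array, negative if not assigned
    in_sync: bool = False
    blocked: bool = False
    faulty: bool = False


def spares_as_quoted(rdevs):
    """Count devices the way the first 'if' in the excerpt does."""
    return sum(1 for r in rdevs
               if r.raid_disk >= 0 and not r.in_sync and not r.blocked)


def spares_ignoring_faulty(rdevs):
    """Hypothetical variant: a Faulty device is never counted as a spare."""
    return sum(1 for r in rdevs
               if r.raid_disk >= 0 and not r.in_sync
               and not r.blocked and not r.faulty)


# State after pulling one disk: one member still in sync, the pulled one
# marked Faulty but still holding slot 1.
array = [Rdev("kept", 0, in_sync=True), Rdev("pulled", 1, faulty=True)]
print(spares_as_quoted(array), spares_ignoring_faulty(array))  # prints 1 0
```

In this model the pulled disk is reported as one spare by the quoted condition, which matches the behaviour described: md believes there is something to recover onto, starts recovery, finds nothing to do, and starts again.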
Any comments on this?
Thanks,
Annemarie
* RE: Mdadm re-add fails
From: Schmidt, Annemarie @ 2011-05-20 17:16 UTC
To: NeilBrown; +Cc: linux-raid, Dailey, Nate
In-Reply-To: <20110520095133.54b44dd4@notabene.brown>
Neil,
Yes, that worked:
>> [root@typhon ~]# mdadm --detail /dev/md24
/dev/md24:
Version : 1.2
Creation Time : Fri May 20 11:42:17 2011
Raid Level : raid1
Array Size : 5241844 (5.00 GiB 5.37 GB)
Used Dev Size : 5241844 (5.00 GiB 5.37 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Fri May 20 12:47:09 2011
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Name : typhon.mno.stratus.com:24 (local to host typhon.mno.stratus.com)
UUID : 562323d9:9a7b2979:a734abf0:b3fb8f0b
Events : 155
Number Major Minor RaidDevice State
3 65 22 0 active sync /dev/sdc6
2 65 54 1 active sync /dev/sdk6
>> [root@typhon sbin]# mdadm /dev/md24 -f /dev/sdk6 -r /dev/sdk6
mdadm: set /dev/sdk6 faulty in /dev/md24
mdadm: hot removed /dev/sdk6 from /dev/md24
Without the fix:
---------------------
>> [root@typhon sbin]# mdadm /dev/md24 -a /dev/sdk6
mdadm: /dev/sdk6 reports being an active member for /dev/md24, but a --re-add fails.
mdadm: not performing --add as that would convert /dev/sdk6 in to a spare.
mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdk6" first.
With the fix:
-----------------
>> [root@typhon ~]# ./mdadm /dev/md24 -a /dev/sdk6
mdadm: re-added /dev/sdk6
Thanks very much for the assistance.
Regards,
Annemarie
-----Original Message-----
From: NeilBrown [mailto:neilb@suse.de]
Sent: Thursday, May 19, 2011 7:52 PM
To: Schmidt, Annemarie
Cc: linux-raid@vger.kernel.org
Subject: Re: Mdadm re-add fails
On Wed, 18 May 2011 10:43:47 -0400 "Schmidt, Annemarie"
<Annemarie.Schmidt@stratus.com> wrote:
> Hi!
>
> I have a 2 disk raid1 data array. As a result of other testing, the device info
> in the superblock for one of the partners, /dev/sdc2, ended up being in slot 3
> of the device info array:
>
> [root@typhon ~]# mdadm --detail /dev/md21
> /dev/md21:
> Version : 1.2
> Creation Time : Mon May 9 11:19:43 2011
> Raid Level : raid1
> Array Size : 5241844 (5.00 GiB 5.37 GB)
> Used Dev Size : 5241844 (5.00 GiB 5.37 GB)
> Raid Devices : 2
> Total Devices : 2
> Persistence : Superblock is persistent
>
> Intent Bitmap : Internal
>
> Update Time : Thu May 12 15:51:50 2011
> State : active
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
>
> Name : typhon.mno.stratus.com:21 (local to host typhon.mno.stratus.com)
> UUID : 996d993f:baac367a:8b154ba9:43e56cff
> Events : 687
>
> Number Major Minor RaidDevice State
> --> 3 65 34 0 active sync /dev/sdc2
> 2 65 82 1 active sync /dev/sdk2
>
> When I remove /dev/sdk2 and then re-add it, the re-add fails:
>
> >> [root@typhon ~]# mdadm /dev/md21 -f /dev/sdk2 -r /dev/sdk2
> mdadm: set /dev/sdk2 faulty in /dev/md21
> mdadm: hot removed /dev/sdk2 from /dev/md21
>
> >> [root@typhon ~]# mdadm /dev/md21 -a /dev/sdk2
> mdadm: /dev/sdk2 reports being an active member for /dev/md21, but a --re-add
> fails.
> mdadm: not performing --add as that would convert /dev/sdk2 in to a spare.
> mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdk2" first.
>
> I believe the re-add fails because the enough_fd function (util.c) is not searching deep enough into the
> dev_info array with this line of code:
> for (i=0; i<array.raid_disks + array.nr_disks; i++)
>
> array.raid_disks = 2 and array.nr_disks = 1, and so for this particular md device, it is only looking at slots 0-2.
> I believe the code needs to be changed to look at all possible dev_info array slots, taking into account the
> version of the superblock (like the Detail function does in Detail.c).
>
> Do folks agree?
>
I do - largely. I think there might be a better more general way to control
the loop though.
Could you try this please?
Thanks,
NeilBrown
diff --git a/util.c b/util.c
index 1056ae4..d005e0a 100644
--- a/util.c
+++ b/util.c
@@ -370,10 +370,14 @@ int enough_fd(int fd)
             array.raid_disks <= 0)
                 return 0;
         avail = calloc(array.raid_disks, 1);
-        for (i=0; i<array.raid_disks + array.nr_disks; i++) {
+        for (i=0; i < 1024 && array.nr_disks > 0; i++) {
                 disk.number = i;
                 if (ioctl(fd, GET_DISK_INFO, &disk) != 0)
                         continue;
+                if (disk.major == 0 && disk.minor == 0)
+                        continue;
+                array.nr_disks--;
+
                 if (! (disk.state & (1<<MD_DISK_SYNC)))
                         continue;
                 if (disk.raid_disk < 0 || disk.raid_disk >= array.raid_disks)
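To see why the loop bound matters, here is a small Python model of the slot scan. It is a sketch, not mdadm code: the `slots` dictionary stands in for the GET_DISK_INFO ioctl, `scan_old` and `scan_new` are invented names, and `scan_new` keeps a separate count of devices still to be found (an assumption of this sketch) so that the role check always sees the full raid_disks value.

```python
MAX_SLOTS = 1024


def usable(disk, raid_disks):
    """True if the slot holds an in-sync device with a valid role."""
    return disk["in_sync"] and 0 <= disk["raid_disk"] < raid_disks


def scan_old(slots, raid_disks, nr_disks):
    """Original bound: only slots 0 .. raid_disks + nr_disks - 1."""
    return {slots[i]["raid_disk"]
            for i in range(raid_disks + nr_disks)
            if i in slots and usable(slots[i], raid_disks)}


def scan_new(slots, raid_disks, nr_disks):
    """Walk up to MAX_SLOTS, stopping once every device has been seen."""
    found, remaining = set(), nr_disks
    for i in range(MAX_SLOTS):
        if remaining <= 0:
            break
        disk = slots.get(i)
        if disk is None:          # empty slot (major == minor == 0)
            continue
        remaining -= 1
        if usable(disk, raid_disks):
            found.add(disk["raid_disk"])
    return found


# After removing the second member, the survivor sits in slot 3.
slots = {3: {"raid_disk": 0, "in_sync": True}}
print(scan_old(slots, 2, 1), scan_new(slots, 2, 1))  # prints set() {0}
```

With raid_disks = 2 and nr_disks = 1 the old bound stops at slot 2 and finds no working member, while the open-ended scan reaches slot 3 and finds role 0.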
* Re: Software raid, booting and bios
From: Paul van der Vlis @ 2011-05-20 15:53 UTC
To: linux-raid
In-Reply-To: <4DD65A66.5030904@turmel.org>
On 20-05-11 14:11, Phil Turmel wrote:
> On 05/20/2011 05:33 AM, Paul van der Vlis wrote:
>>
>> The problem is about detected disks with a defect in the MBR.
>
> This is a crucial point. A BIOS that supports multiple drives in the
> boot order should skip to the next if the MBR cannot be read. But
> the BIOS loses control once the MBR code is executed. If an error is
> encountered in later sectors of the bootloader, there's no way to
> switch to the next drive. This is also true when the BIOS only
> supports dissimilar devices in the boot order.
I think you are right.
> If I had to minimize the chance of this ever biting me, I'd use a CF
> <==> IDE adapter with a DMA capable CF card, and set it up as my boot
> device. And I wouldn't use it for anything but boot. A quick google
> turned up this:
>
> http://www.addonics.com/products/flash_memory_reader/adidecf.asp
In the servers I use, there is normally no place for such a cardreader.
But a low profile PCI card with a CF card on it could do it.
> (Just to show what's out there.) The embedded boards I use
> occasionally have the equivalent of this soldered to their
> motherboards.
>
> The best DMA capable CF cards are usually found in markets that cater
> to industrial designers or to professional photographers.
Do you think the risk of a problem with a CF card (or something like
that) is much lower than the risk of a problem with a harddisk?
And what about booting from a USB stick?
With regards,
Paul van der Vlis.
--
http://www.vandervlis.nl
* Re: why should I "initialize" JBOD disks?
From: Gordon Henderson @ 2011-05-20 14:34 UTC
To: linux-raid
In-Reply-To: <20110520133244.GA13330@apartia.fr>
On Fri, 20 May 2011, Louis-David Mitterrand wrote:
> Hi,
>
> I just installed an Adaptec SAS 4805 and plugged 8 disks previously used
> "as is" (no raid) on an LSI adapter but none are recognized as boot
> disk ("no bios installed").
>
> It seems I need to "initialize" (read: wipe) them in the Adaptec bios
> menu before being able to use them as JBOD.
>
> What the fuck?!
>
> By my definition JBOD should mean "just get out of the way and let me
> use my disks as they are".
>
> Does that "initialization" mean that disks become unusable when plugged
> into an adapter of any other brand or model?
>
> What is the underlying format of JBOD disks? Can they be read by a
> straight, non-raid adapter?
>
> These hardware raid cards suck on so many levels...
Yup.
I don't know about that, but I had something similar with a Dell
controller - I wanted a box of 15 x 500GB drives - but the only way to get
it was to buy it with their PERC RAID controller cards, however I was
assured that I didn't need to use the card RAID functions and I could use
them "just as a box of disks" ...
It wasn't having it and in the end, I had to create 15 RAID-0 arrays of
one disk each, which I could then assemble as a RAID-6 array under Linux
MD.
It worked fine, but would have been a PITA to anyone else changing a drive
(fortunately it never broke in its lifetime) but the down-side was that
the controller hid all the SMART information from the host )-:
Bah!
Gordon
* Re: HBA Adaptor advice
From: Joe Landman @ 2011-05-20 14:23 UTC
To: Ed W; +Cc: linux-raid
In-Reply-To: <4DD66AC8.60804@wildgooses.com>
On 05/20/2011 09:21 AM, Ed W wrote:
> Hi
>
>> If you absolutely insist on using a large expensive RAID card as a JBOD
>> card, yeah, there are things you *can* do to keep access to the cache
>> and BBU, though they are counter-intuitive.
>
> The main issue with hardware cards is that really you need at least two
> of them... At the most inopportune moment the only single one you own
> will break and then your entire dataset becomes unavailable...
That is a risk with any proprietary design (a point we refer to in our
marketing, relative to completely closed designs). This said, the issue
on the RAID side isn't all that terrible. RAID cards, individually,
aren't that expensive. You can buy replacements on ebay, or from
various used machine resellers. That is, your data really isn't at an
unmitigable risk, but it is at risk.
Put another way, yeah, having a spare RAID card around isn't a bad idea.
In most cases they don't burn out (we've seen 4 failed RAID cards in
our time in the field, 2 of which were ... er ... customer initiated
burnouts ... due to bad grounding).
> For sure, anyone with moderate or larger budgets, or a pool of similar
> hardware, this becomes a case of simply buying an extra one and stashing
> it. Or at least keeping an eye on when it becomes end of line and
> unavailable to buy a new one...
And in the case of the businesses/researchers, the cost of the
additional card in spares stock locally is (in most cases) in the noise
level as compared to the actual cost of the gear.
That is, it's not a terrible thing to do this. If you are a home user,
it's another issue entirely. A 1000 EUR card might cost as much as the rest
of your system. So you want to mitigate that risk, and not have to pay
that cost. That decision to mitigate, by using MD raid, will come at
some cost, though we see MD raid very much as the future of RAID
systems. It's all about refresh rates and economies of scale.
>> First off, the LSI 920x series has a 16 port HBA. You can look it up on
>> their site. SAS+SATA HBA I think. LSI likes adorning some of their
>> HBAs with some inherent RAID capability (their IR mode). I personally
>> prefer the IT mode, but it's sometimes hard/impossible to make the switch
>> (this is usually for motherboard mounted 'RAID' units). HBAs can be used
>> as RAIDs, though the performance is abysmal (c.f. PERC*, lower end LSI
>> ... which PERC are rebranded versions of, ...)
>
> This sounds helpful, but I'm not understanding it?
The 16 port card is mostly HBA, with a little onboard logic for RAID0,
RAID1, RAID10.
>
> Are you describing the reverse, ie taking a straight HBA card and asking
> it to do "hardware raid" of multiple disks?
LSI's HBAs have some of this capability, though we do not recommend
using this. We prefer to use them as straight HBAs.
>
> Or do you mean that performance is dismal even if you make X arrays of 1
> disk each in order to access their BB cache?
No ... we haven't looked into that performance as much, as this is a
very difficult-to-use model, and honestly, there are no real benefits to
this.
>
> Or to be really clear - can I take a cheapo PERC6 from ebay, and make it
> run 8x disks completely under linux MD Raid, with smartctl access to the
> individual disks and BB cache on the card - *with* high performance...
> (phew...)
I am going to pull a Clinton here, and ask you to define "high
performance" :) More seriously, performance is in the eye of the
beholder ... what does it mean to you, and where do you need to be in
performance ... and from that, you can see if MD RAID will get you there.
>> When you do this, then use mdadm atop this. We've found, generally, by
>> doing this, we can build much faster RAIDs than the LSI 8888 units, and
>> comparable to the 9260's in terms of performance across the same number
>> of disks, at a lower price. E.g. mdadm and the MD RAID stack are quite
>> good.
>
> What do you think stops the MD Stack being *better* than a 9260? Also
> in very round terms what kind of performance drop do you see from going
> to linux MD raid versus a 9260?
Very little on the read side. MD RAID is as fast as, if not faster than,
the 9260 on reads. The 9260 isn't a bad card, mind you; it is roughly
midrange in LSI's lineup. The write side ... I think the 9260 has a
deeply pipelined XOR engine, which you need for the GF(256) calculations.
So we see about 2x better write performance on the 9260 than we do on
MD RAID.
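For readers wondering what those GF(256) calculations look like, here is a minimal sketch. It assumes the remark refers to RAID6-style dual parity as Linux MD computes it: P is a plain XOR, and Q is a weighted sum over GF(2^8) using the polynomial 0x11d and generator 2. The function names are made up for illustration, and the code handles one byte per data disk rather than whole stripes.

```python
def gf_mul2(x):
    """Multiply one byte by 2 in GF(2^8), reducing by the polynomial 0x11d."""
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
    return x & 0xFF


def pq_syndromes(data):
    """Return (P, Q) for one byte taken from each data disk.

    P is the XOR of all bytes. Q is the sum of 2^i * D_i in GF(2^8),
    evaluated with Horner's rule from the highest disk index down.
    """
    p = 0
    q = 0
    for d in reversed(data):
        p ^= d
        q = gf_mul2(q) ^ d
    return p, q


def rebuild_from_p(surviving, p):
    """Recover a single lost data byte from P and the surviving data bytes."""
    lost = p
    for d in surviving:
        lost ^= d
    return lost
```

The multiply in the Q path is the extra per-byte work beyond plain XOR; whether that alone explains the write gap is not something this sketch can show.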
>> The additional cache doesn't buy you much for this arrangement. Might
>> work against you if the card CPU is slow (as most of the hardware RAID
>> chips are).
>
> Hopefully not a silly question, but surely the CPU would have to be
> extremely slow indeed not to keep up with a sorted bunch of writes that
> are being issued to spinning rust drives with multi-ms seek latencies?
> Are they really that slow..?
Many of the low end cards run processors at 200-800 MHz. Yeah ... some
of them are really ... really ... slow. MD RAID runs circles around
them. And soon, I think it will be running circles around the midrange
(and probably higher end cards as well).
Regards,
Joe
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman@scalableinformatics.com
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615
* why should I "initialize" JBOD disks?
From: Louis-David Mitterrand @ 2011-05-20 13:32 UTC (permalink / raw)
To: linux-raid
Hi,
I just installed an Adaptec SAS 4805 and plugged in 8 disks previously used
"as is" (no raid) on an LSI adapter, but none are recognized as a boot
disk ("no bios installed").
It seems I need to "initialize" (read: wipe) them in the Adaptec bios
menu before being able to use them as JBOD.
What the fuck?!
By my definition JBOD should mean "just get out of the way and let me
use my disks as they are".
Does that "initialization" mean that disks become unusable when plugged
into an adapter of any other brand or model?
What is the underlying format of JBOD disks? Can they be read by a
straight, non-raid adapter?
These hardware raid cards suck on so many levels...
</rant>
* Re: Software raid, booting and bios
From: Gordon Henderson @ 2011-05-20 13:22 UTC (permalink / raw)
To: linux-raid
In-Reply-To: <4DD65A66.5030904@turmel.org>
On Fri, 20 May 2011, Phil Turmel wrote:
> http://www.addonics.com/products/flash_memory_reader/adidecf.asp
>
> (Just to show what's out there.) The embedded boards I use occasionally
> have the equivalent of this soldered to their motherboards.
>
> The best DMA capable CF cards are usually found in markets that cater to
> industrial designers or to professional photographers.
FWIW: I've built a few systems which boot off flash, then run from SATA
drives - but it all depends on the motherboards. These days there are many
flash-based IDE (and now SATA) drives - maybe the forerunner to SSDs... e.g.
I use this sort of thing:
http://linitx.com/viewcategory.php?catid=129
It's easy to have root on a small flash IDE drive then the rest of the
system on RAID'd SATA drives. (swap, /usr, /var, /home, etc.)
Is that good? I don't know - but I also build lots of small embedded
systems (no drives) for other purposes which boot off these types of
devices and I've not had an issue with one failing in 5+ years...
Motherboards are increasingly not coming with IDE ports though, but things
move on!
Gordon
* Re: HBA Adaptor advice
From: Ed W @ 2011-05-20 13:21 UTC (permalink / raw)
To: Joe Landman; +Cc: linux-raid
In-Reply-To: <4DD65C18.5090804@gmail.com>
Hi
> If you absolutely insist on using a large expensive RAID card as a JBOD
> card, yeah, there are things you *can* do to keep access to the cache
> and BBU, though they are counter-intuitive.
The main issue with hardware cards is that really you need at least two
of them... At the most inopportune moment the only single one you own
will break and then your entire dataset becomes unavailable...
For sure, for anyone with a moderate or larger budget, or a pool of similar
hardware, this becomes a case of simply buying an extra one and stashing
it. Or at least keeping an eye on when it reaches end of line and
can no longer be bought new...
> First off, the LSI 920x series has a 16 port HBA. You can look it up on
> their site. SAS+SATA HBA I think. LSI likes adorning some of their
> HBAs with some inherent RAID capability (their IR mode). I personally
> prefer the IT mode, but it's sometimes hard/impossible to make the switch
> (this is usually for motherboard mounted 'RAID' units). HBAs can be used
> as RAIDs, though the performance is abysmal (c.f. PERC*, lower end LSI
> ... which PERC are rebranded versions of, ...)
This sounds helpful, but I'm not understanding it?
Are you describing the reverse, ie taking a straight HBA card and asking
it to do "hardware raid" of multiple disks?
Or do you mean that performance is dismal even if you make X arrays of 1
disk each in order to access their BB cache?
Or to be really clear - can I take a cheapo PERC6 from ebay, and make it
run 8x disks completely under linux MD Raid, with smartctl access to the
individual disks and BB cache on the card - *with* high performance...
(phew...)
> When you do this, then use mdadm atop this. We've found, generally, by
> doing this, we can build much faster RAIDs than the LSI 8888 units, and
> comparable to the 9260's in terms of performance across the same number
> of disks, at a lower price. E.g. mdadm and the MD RAID stack are quite
> good.
What do you think stops the MD Stack being *better* than a 9260? Also
in very round terms what kind of performance drop do you see from going
to linux MD raid versus a 9260?
> The additional cache doesn't buy you much for this arrangement. Might
> work against you if the card CPU is slow (as most of the hardware RAID
> chips are).
Hopefully not a silly question, but surely the CPU would have to be
extremely slow indeed not to keep up with a sorted bunch of writes that
are being issued to spinning rust drives with multi-ms seek latencies?
Are they really that slow..?
Thanks for your very helpful feedback - much appreciated
Ed W
* Re: HBA Adaptor advice
From: Joe Landman @ 2011-05-20 12:48 UTC (permalink / raw)
To: Roman Mamedov; +Cc: Ed W, linux-raid
In-Reply-To: <20110520183413.64fe3ccc@natsu>
On 05/20/2011 08:34 AM, Roman Mamedov wrote:
> On Fri, 20 May 2011 08:18:32 -0400
> Joe Landman<joe.landman@gmail.com> wrote:
>
>> Second off, you can turn any of the expensive RAID cards into a 'JBOD'
>> by doing something like this:
>>
>> 1) have the unit configured in RAID mode
>>
>> 2) build virtual disks out of single drives, as RAID0.
>>
>> 3) iterate 2 until you exhaust your drives.
>>
>> 4) make sure you prevent these drives from messing with your boot drive
>> order ... some bioses "helpfully" reorganize new drives for you by
>> messing with this list.
>>
>> Once the drive is a 1 disk RAID0, you get the cache, and the BBU for the
>> cache. Yeah, it's a little weird. But it does work (we've done this
>> with some LSI8888's).
>
> But can you then access SMART of the individual drives?
I don't view the loss of direct SMART access as a bad thing ... most of
the RAID cards will give you CLI access to this data, if in a convoluted
manner. SMART's utility is generally pretty questionable (see the
Google paper for a discussion on the profound lack of correlation of
SMART parameters with actual failure rates). But it's there if you want it.
> Or will you see only some bogus block devices which do not accept SMART
> commands, do not return real drive identity, and present themselves as RAID0
> #1, RAID0 #2 etc. instead?
The RAID will provide you an abstraction (e.g. a layer you have to walk
through) to your disks. Seeing what composes the RAID is generally not
hard, though you might need to write a quick and dirty parser for this.
The block devices are not bogus. They are logical block devices.
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman@scalableinformatics.com
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615
* Re: HBA Adaptor advice
From: Mathias Burén @ 2011-05-20 12:36 UTC (permalink / raw)
To: Roman Mamedov; +Cc: Joe Landman, Ed W, linux-raid
In-Reply-To: <20110520183413.64fe3ccc@natsu>
On 20 May 2011 13:34, Roman Mamedov <rm@romanrm.ru> wrote:
> On Fri, 20 May 2011 08:18:32 -0400
> Joe Landman <joe.landman@gmail.com> wrote:
>
>> Second off, you can turn any of the expensive RAID cards into a 'JBOD'
>> by doing something like this:
>>
>> 1) have the unit configured in RAID mode
>>
>> 2) build virtual disks out of single drives, as RAID0.
>>
>> 3) iterate 2 until you exhaust your drives.
>>
>> 4) make sure you prevent these drives from messing with your boot drive
>> order ... some bioses "helpfully" reorganize new drives for you by
>> messing with this list.
>>
>> Once the drive is a 1 disk RAID0, you get the cache, and the BBU for the
>> cache. Yeah, it's a little weird. But it does work (we've done this
>> with some LSI8888's).
>
> But can you then access SMART of the individual drives?
> Or will you see only some bogus block devices which do not accept SMART
> commands, do not return real drive identity, and present themselves as RAID0
> #1, RAID0 #2 etc. instead?
>
> --
> With respect,
> Roman
>
Depends on the controller; e.g.
smartctl -A -d 3ware,$I /dev/twa0
smartctl -A -d megaraid,$I /dev/sda
(where $I is the port on the controller)
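To query every port in one go, the two command forms above can be generated in a loop. A minimal sketch, assuming only the `-d 3ware,N` and `-d megaraid,N` device types shown above; the helper names are made up for illustration, and the code only builds the command lines without running them.

```python
def smartctl_argv(controller, port, device):
    """Build the smartctl command line for one physical drive behind a RAID card."""
    if controller not in ("3ware", "megaraid"):
        raise ValueError("unsupported controller type: " + controller)
    return ["smartctl", "-A", "-d", "%s,%d" % (controller, port), device]


def smartctl_for_ports(controller, ports, device):
    """Build one command line per controller port."""
    return [smartctl_argv(controller, port, device) for port in ports]


# Example: ports 0..7 on a MegaRAID-style card that exposes /dev/sda.
for argv in smartctl_for_ports("megaraid", range(8), "/dev/sda"):
    print(" ".join(argv))
```

Each list can be handed to `subprocess.run()` as-is if you want to execute it; how port numbers map to physical slots depends on the controller.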
/M
* Re: HBA Adaptor advice
From: Roman Mamedov @ 2011-05-20 12:34 UTC (permalink / raw)
To: Joe Landman; +Cc: Ed W, linux-raid
In-Reply-To: <4DD65C18.5090804@gmail.com>
On Fri, 20 May 2011 08:18:32 -0400
Joe Landman <joe.landman@gmail.com> wrote:
> Second off, you can turn any of the expensive RAID cards into a 'JBOD'
> by doing something like this:
>
> 1) have the unit configured in RAID mode
>
> 2) build virtual disks out of single drives, as RAID0.
>
> 3) iterate 2 until you exhaust your drives.
>
> 4) make sure you prevent these drives from messing with your boot drive
> order ... some bioses "helpfully" reorganize new drives for you by
> messing with this list.
>
> Once the drive is a 1 disk RAID0, you get the cache, and the BBU for the
> cache. Yeah, it's a little weird. But it does work (we've done this
> with some LSI8888's).
But can you then access SMART of the individual drives?
Or will you see only some bogus block devices which do not accept SMART
commands, do not return real drive identity, and present themselves as RAID0
#1, RAID0 #2 etc. instead?
--
With respect,
Roman
* Re: HBA Adaptor advice
From: Joe Landman @ 2011-05-20 12:18 UTC (permalink / raw)
To: Ed W; +Cc: linux-raid
In-Reply-To: <4DD61948.8050302@wildgooses.com>
On 05/20/2011 03:33 AM, Ed W wrote:
> On 20/05/2011 03:08, Andy Smith wrote:
>> Are there actually any HBAs that have BBU without using their RAID
>> features?
>>
>> I'd like to stop using hardware RAID but I can't give up the BBU and
>> write cache.
HBAs don't have BBU or write cache. Only RAIDs do. While you can run
the RAID in JBOD mode, you effectively lose the cache (and BBU) aspect
by doing so.
More in a moment.
> This is a very interesting question. Does anyone know if say the Areca
> ARC-1880ix-24 can be used in the same way, ie battery backed JBOD type mode?
If you absolutely insist on using a large expensive RAID card as a JBOD
card, yeah, there are things you *can* do to keep access to the cache
and BBU, though they are counter-intuitive.
First off, the LSI 920x series has a 16 port HBA. You can look it up on
their site. SAS+SATA HBA I think. LSI likes adorning some of their
HBAs with some inherent RAID capability (their IR mode). I personally
prefer the IT mode, but it's sometimes hard/impossible to make the switch
(this is usually for motherboard mounted 'RAID' units). HBAs can be used
as RAIDs, though the performance is abysmal (c.f. PERC*, lower end LSI
... which PERC are rebranded versions of, ...)
Second off, you can turn any of the expensive RAID cards into a 'JBOD'
by doing something like this:
1) have the unit configured in RAID mode
2) build virtual disks out of single drives, as RAID0.
3) iterate 2 until you exhaust your drives.
4) make sure you prevent these drives from messing with your boot drive
order ... some bioses "helpfully" reorganize new drives for you by
messing with this list.
Once the drive is a 1 disk RAID0, you get the cache, and the BBU for the
cache. Yeah, it's a little weird. But it does work (we've done this
with some LSI8888's).
When you do this, then use mdadm atop this. We've found, generally, by
doing this, we can build much faster RAIDs than the LSI 8888 units, and
comparable to the 9260's in terms of performance across the same number
of disks, at a lower price. E.g. mdadm and the MD RAID stack are quite
good.
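As a rough illustration of the "mdadm atop this" step: once each drive shows up as its own single-disk RAID0 logical device, the MD array is created over those devices like any other. A minimal sketch that only assembles the command line; the device names, the md device, and the RAID level are assumptions picked for the example, not something stated above.

```python
def mdadm_create_argv(md_device, level, members):
    """Build an `mdadm --create` command line over the given member devices."""
    if len(members) < 2:
        raise ValueError("need at least two member devices")
    return [
        "mdadm", "--create", md_device,
        "--level=%s" % level,
        "--raid-devices=%d" % len(members),
    ] + list(members)


# Hypothetical: eight single-disk RAID0 logical drives seen as /dev/sdb../dev/sdi.
members = ["/dev/sd" + letter for letter in "bcdefghi"]
print(" ".join(mdadm_create_argv("/dev/md0", "6", members)))
```

Check the device list carefully before running anything like this for real, since creating an array overwrites metadata on every member.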
[...]
> I guess the limitation is that some of these cards can only create a
> small number of arrays and/or they don't use their writeback cache
> efficiently in the case of multiple arrays?
These are the issues. Most RAID cards aren't thinking they'll be used
on more than a few LUNs/RAIDs at a time, so they might not scale well
here, with 16 or 24 single drive RAID0's.
The additional cache doesn't buy you much for this arrangement. Might
work against you if the card CPU is slow (as most of the hardware RAID
chips are).
* Re: Software raid, booting and bios
From: Phil Turmel @ 2011-05-20 12:11 UTC (permalink / raw)
To: Paul van der Vlis; +Cc: linux-raid
In-Reply-To: <ir5ci1$uc6$1@dough.gmane.org>
On 05/20/2011 05:33 AM, Paul van der Vlis wrote:
>
> The problem is about detected disks with a defect in the MBR.
This is a crucial point. A BIOS that supports multiple drives in the boot order should skip to the next if the MBR cannot be read. But the BIOS loses control once the MBR code is executed. If an error is encountered in later sectors of the bootloader, there's no way to switch to the next drive. This is also true when the BIOS only supports dissimilar devices in the boot order.
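A toy model of that fallback, to make the boundary explicit. It assumes a BIOS that walks the boot order and accepts the first drive whose first sector can be read and ends in the 0x55 0xAA boot signature; real firmware varies, and the function names are made up for illustration. An unreadable sector is represented as None.

```python
SECTOR_SIZE = 512


def has_boot_signature(sector):
    """True if the sector was read in full and ends with the 0x55 0xAA signature."""
    return (
        sector is not None
        and len(sector) == SECTOR_SIZE
        and sector[510:512] == b"\x55\xaa"
    )


def pick_boot_drive(first_sectors):
    """Return the index of the first bootable drive in boot order, or None.

    This is the only point where falling back to another drive is possible.
    Once the chosen MBR code runs, later read errors are outside this logic.
    """
    for index, sector in enumerate(first_sectors):
        if has_boot_signature(sector):
            return index
    return None


good = bytes(510) + b"\x55\xaa"
print(pick_boot_drive([None, good]))  # first drive unreadable, second is used
```

A drive whose MBR reads fine but whose later bootloader sectors are bad would still be picked here, which is exactly the failure case described above.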
If I had to minimize the chance of this ever biting me, I'd use a CF <==> IDE adapter with a DMA capable CF card, and set it up as my boot device. And I wouldn't use it for anything but boot. A quick google turned up this:
http://www.addonics.com/products/flash_memory_reader/adidecf.asp
(Just to show what's out there.) The embedded boards I use occasionally have the equivalent of this soldered to their motherboards.
The best DMA capable CF cards are usually found in markets that cater to industrial designers or to professional photographers.
HTH,
Phil
* Re: HBA Adaptor advice
From: Stan Hoeppner @ 2011-05-20 10:21 UTC (permalink / raw)
To: Ed W; +Cc: linux-raid
In-Reply-To: <4DD61948.8050302@wildgooses.com>
On 5/20/2011 2:33 AM, Ed W wrote:
> On 20/05/2011 03:08, Andy Smith wrote:
>> Are there actually any HBAs that have BBU without using their RAID
>> features?
>>
>> I'd like to stop using hardware RAID but I can't give up the BBU and
>> write cache.
I'm curious why you are convinced that you need BBWC, or even simply WC,
on an HBA used for md RAID. I'm also curious as to why you are so
adamant about _not_ using the RAID ASIC on an HBA, given that it will
take much greater advantage of the BBWC than md RAID will. You may be
interested to know:
1. When BBWC is enabled, all internal drive caches must be disabled.
Otherwise you eliminate the design benefit of the BBU, and may as
well not have one.
2. w/md RAID on an HBA, if you have a good UPS and don't suffer
kernel panics, crashes, etc, you can disable barrier support in
your FS and you can use the drive caches.
3. The elevator will perform well directly on drives with large cache.
Most good higher end RAID cards have 512MB to 1GB of cache. w/12 2TB
drives you'll have a combined cache of 768MB, as most drives of this
size have a 64MB cache. So there's not much difference in total cache
size. And the drive firmware will usually make better decisions WRT
cache use optimization than an upstream RAID card BIOS that has disabled
the drive caches.
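The cache comparison is simple arithmetic, spelled out here so you can plug in your own drive count. The figures are the ones quoted above (12 drives, 64MB each, cards with 512MB to 1GB); the function name is made up for illustration.

```python
def combined_drive_cache_mb(drive_count, cache_per_drive_mb):
    """Total on-drive cache across all member drives, in MB."""
    return drive_count * cache_per_drive_mb


drives_total_mb = combined_drive_cache_mb(12, 64)
card_low_mb, card_high_mb = 512, 1024

print("combined drive cache: %d MB" % drives_total_mb)
print("typical card cache:   %d-%d MB" % (card_low_mb, card_high_mb))
```

Note this compares capacity only; it says nothing about what survives a power loss.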
For a stable system with good UPS and auto shutdown configured, BBWC is
totally overrated. If the system never takes a nose dive from power
drop, and doesn't crash due to software or hardware failure, then BBWC
is a useless $200-1000 option. Some hardware RAID cards require a
functional BBU before they will allow you to enable write caching. In
that case BBU is needed. In most other cases it's not.
If your current reasoning for wanting write cache on the HBA is
performance, then forget about the write cache as you don't need it with
md RAID. If you want the BBWC combo for safety as your system isn't
stable or you have a crappy or no UPS, then forgo md RAID and use the
hardware RAID and BBWC combo.
One last point: If you're bargain hunting, especially if looking at
used gear on Ebay, that mindset is antithetical to proper system
integration, especially when talking about a RAID card BBU. If you buy
a used card, the first thing you must do is chuck the BBU and order a new
one, because the used battery can't be trusted--you have no idea how
much life is left in it. For your data to be safe, you need a new
battery. Buying a brand new card w/bundled BBU may cost you the same or
less than a used card and a new battery from the manufacturer.
The following would be a darn good fit for your md RAID office server
setup, given your criteria, WRT the HBA, hot swap cages, drives, and
cables. Drop the LSI SAS HBA into a PCIe 2.0 x8 slot. Drop the Intel
24 port SAS expander into an x4/x8 slot, or mount it to the side or
floor of the chassis and power it via the 4 pin Molex plug. Connect the
8087/8087 cable from the LSI card to the first port on the Intel SAS
Expander. Mount the 5 IcyDock 4 x 2.5" SAS hot swap backplane cages in
5 x 5.25" externally accessible drive bays. Connect each of the five
8087 breakout cables from the remaining 5 ports on the Intel Expander to
each of the hot swap backplanes--one cable per backplane--label which
drive connects to which port on the Intel expander so you can properly
identify failed drives! Mount each Seagate Enterprise 2.5" 1TB drive in
a tray and insert the trays into the backplanes--fill each quad bay
before putting drives in the next bay. After booting the machine hop
into the LSI BIOS and configure for JBOD. You should know how to do the
rest.
This setup gives you 12 enterprise 2.5" SAS 7.2K RPM 1TB drives--not
cheap SATA drives not fit for RAID--12TB raw total, in only three 5.25"
bays, and drawing much less power than equivalent 3.5" drives. You will
have 8 free hot swap bays for future expansion, 20TB total if acquiring
the same drives. Controller to drive aggregate bandwidth is 2.4GB/s,
4.8GB/s full duplex, HBA to host b/w is 4/8 GB/s, likely far more than
you need.
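The bandwidth and bay figures can be reproduced as follows. A sketch under stated assumptions: a single x4 link of 6 Gb/s SAS lanes between HBA and expander, a PCIe 2.0 x8 slot at 5 GT/s per lane, and 8b/10b encoding on both. The lane count for the SAS link is inferred from the quoted 2.4GB/s figure, and the function name is made up for illustration.

```python
def lane_mb_per_s(line_rate_mbit):
    """Usable MB/s per lane, one direction, for an 8b/10b encoded link."""
    return line_rate_mbit * 8 // 10 // 8


sas_lanes = 4    # assumption: one x4 SFF-8087 link from HBA to expander
pcie_lanes = 8   # PCIe 2.0 x8 slot

sas_mb = sas_lanes * lane_mb_per_s(6000)    # 6 Gb/s SAS lanes
pcie_mb = pcie_lanes * lane_mb_per_s(5000)  # 5 GT/s PCIe 2.0 lanes

total_bays = 5 * 4           # five cages, four drives each
free_bays = total_bays - 12  # twelve drives installed

print("HBA to drives: %d MB/s each way, %d MB/s full duplex" % (sas_mb, 2 * sas_mb))
print("HBA to host:   %d MB/s each way, %d MB/s full duplex" % (pcie_mb, 2 * pcie_mb))
print("bays: %d total, %d free" % (total_bays, free_bays))
```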
The parts list. Total cost from NewEgg in the US is ~$3800 with ~$3000
of that being the 12 drives at $250 each. The HBA + expander are only $470.
Buy 1:
http://www.lsi.com/channel/products/megaraid/sassata/9240-4i/index.html
Buy 1:
http://www.intel.com/Products/Server/RAID-controllers/re-res2sv240/RES2SV240-Overview.htm
Buy 5:
http://www.icydock.com/goods.php?id=114
Buy 12:
http://www.seagate.com/ww/v/index.jsp?name=st91000640ss-constellation2-6gbs-sas-1-tb-hd&vgnextoid=ff13c5b2933d9210VgnVCM1000001a48090aRCRD&vgnextchannel=f424072516d8c010VgnVCM100000dd04090aRCRD&locale=en-US&reqPage=Support#tTabContentSpecifications
Buy 5 (or local equivalent):
http://www.newegg.com/Product/Product.aspx?Item=N82E16816116098&cm_re=cable-_-16-116-098-_-Product
Buy 1 (or local equivalent):
http://www.newegg.com/Product/Product.aspx?Item=N82E16816116093&cm_re=cable-_-16-116-093-_-Product
Food for thought. Hope it's useful as I killed over an hour putting
this together for you. :)
--
Stan
* Re: Software raid, booting and bios
From: CoolCold @ 2011-05-20 10:04 UTC (permalink / raw)
To: Paul van der Vlis; +Cc: linux-raid
In-Reply-To: <ir58vs$9qp$1@dough.gmane.org>
Hello, Paul.
You wrote on 20 May 2011, 12:33:00:
> Op 20-05-11 09:19, Simon Mcnair schreef:
>> I have not come across a pc which does not allow you to boot a
>> secondary drive before... Please can you read the manual and triple
>> check this ? The only possible reason I can think this would happen
>> Is that you're using an add on board and you would configure this from
>> a secondary bios. My Dell, Asus, and all other motherboards I've had
>> over the past 10 years all allow a second device.
> You can select the "boot device priority", where you can choose among
> device types (DVD, harddisk, USB, network), but you can choose only one
> SATA disk. Study it, and you will see I am right. I asked my
> rackserver vendor, and they say: "that's always the case".
I've seen Supermicro servers whose BIOS allows several drives to be set
as boot disks, and others without that option (as in your case).
One of those which could do this was
http://www.supermicro.com/products/motherboard/Xeon1333/5000P/X7DB3.cfm
if I remember right.
> But I think I have had systems in the past that could do it. An
> interesting question is then: how well is it tested? What happens when
> e.g. a disk boots, and then gives an I/O error? I am looking for a
> well-tested way to solve this, and I am willing to pay for it or choose
> another hardware vendor for it.
> With regards,
> Paul van der Vlis.
>> Cheers
>> Simon
>>
>> On 20 May 2011, at 08:15, Paul van der Vlis <paul@vandervlis.nl> wrote:
>>
>>> Op 20-05-11 09:03, Simon Mcnair schreef:
>>>> Please can you further define what you mean by 'it can become a
>>>> problem to boot' ?
>>>> Generally this is resolved by having a mbr and boot partition on each
>>>> of your mirrored drives so that whichever you use to boot has the
>>>> pertinent information to boot the kernel and construct the raid array.
>>>> If you have raid 5 with 3 disks you'd have a 3 drive mirror partition
>>>> on each disk and a raid 5 set across all three too.
>>>
>>> In the bios of my machines (Supermicro, Dell) I can select only one
>>> drive to boot. When the drive fails, no other disk is tried.
>>>
>>> I can go into the bios and change the drive when it fails, or I can
>>> exchange the disks. But I would like the machine to simply
>>> boot even when the first disk is corrupt.
>>>
>>> With regards,
>>> Paul van der Vlis.
>>>
>>>
>>>> I'm not a guru on this and can't provide much knowledge past the
>>>> theory and high level ;-)
>>>> Simon
>>>>
>>>> On 20 May 2011, at 07:55, Paul van der Vlis <paul@vandervlis.nl> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I use software raid (mdadm). The main problem for me is that when the
>>>>> drive with the MBR fails, it can become a problem to boot.
>>>>>
>>>>> If the bios would use another drive to boot when the first drive
>>>>> fails, this problem would be gone. But I don't know of rackservers
>>>>> that do that. Do you?
>>>>>
>>>>> Or is there maybe some kind of fake-raid card that uses mdadm to solve
>>>>> this problem?
>>>>>
>>>>> Another way would be to use e.g. a USB device to boot to solve this
>>>>> problem. Any experiences with that?
>>>>>
>>>>> (hmm, I realize that netboot is an option too).
>>>>>
>>>>> With regards,
>>>>> Paul van der Vlis.
>>>>>
>>>>>
>>>>> --
>>>>> http://www.vandervlis.nl
--
Best regards,
[COOLCOLD-RIPN]