* Confusion with setting up new RAID6 with mdadm
@ 2010-11-14 15:36 Zoltan Szecsei
2010-11-14 16:48 ` Mikael Abrahamsson
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-14 15:36 UTC (permalink / raw)
To: linux-raid
Hi,
I hope this is the correct list to address this on - I've done a lot of
typing for nothing, if not :-)
I have done days of research, including reading
https://raid.wiki.kernel.org/index.php/Linux_Raid, but all I am doing is
getting confused in the detail.
My goal is to set up an 8*2TB SiI3132 based RAID6 on Ubuntu 10.04LTS,
with LVM and ext4.
The setup will mostly hold thousands of 400MB image files, and they will
not be accessed regularly - they mostly just need to be online. The
entire space on all 8 drives can be used, and I want 1 massive
filesystem, when I finally mount this RAID device. No boot, root or swap.
I have gone quite far with the help of the local linux group, but after
I had completed the 27 hour mdadm --create run, further tidbits were
thrown at me, and I am trying to get an opinion on whether it is worth
scrapping this effort and starting again.
Please can someone provide clarity on:
*If I have to reformat the drives and redo mdadm --create, other than
mdadm stop, how can I get rid of all the /dev/md* etc etc so that when I
restart this exercise, the original bad RAID does not interfere with
this new attempt?
*Partition alignment?
Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
None of the mdadm guides I've googled or received say how to correctly
prepare the drives before running mdadm --create.
All the benchmarks & performance tests I've found do not say whether
the partitions on the HD were aligned.
*What is the correct fdisk or parted method to get rid of the DOS & GPT
flags and create a correctly aligned partition, and should this be a
0xda partition (& then I use metadata 1.2 for mdadm)?
*Chunk size:
After reading MANY different opinions, I'm guessing staying at the
default chunk size is optimal? Anyone want to add to this argument?
*After partitioning the 8 drives, is this the correct sequence?
mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64
--level=raid6 --raid-devices=8 /dev/sd[abcdefgh]1
mdadm --detail --scan >> /etc/mdadm.conf
mdadm --assemble /dev/md0 /dev/sd[abcdefgh]1
*After this, do I mkfs ext4 first, or LVM first?
*What stride and stripe values should I use?
If you've read this far: Wow! - big thanks.
If you're going to venture some help or affirmation - BIGGER thanks! :=)
Kind regards to all,
Zoltan
This is where I am, but I'd like to get it right, so I am happy to delete
& restart if the current state is not fixable:
**************************************************
root@gs0:/home/geograph# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
[raid4] [raid10]
md_d0 : active raid6 sde1[4] sdg1[6] sdh1[7] sdc1[2] sda1[0] sdb1[1]
sdd1[3] sdf1[5]
11721071616 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]
unused devices: <none>
root@gs0:/home/geograph#
**************************************************
root@gs0:/home/geograph# mdadm -E /dev/md_d0
mdadm: No md superblock detected on /dev/md_d0.
**************************************************
root@gs0:/dev# ls -la /dev/md*
brw-rw---- 1 root disk 254, 0 2010-11-13 16:41 /dev/md_d0
lrwxrwxrwx 1 root root 7 2010-11-13 16:41 /dev/md_d0p1 -> md/d0p1
lrwxrwxrwx 1 root root 7 2010-11-13 16:41 /dev/md_d0p2 -> md/d0p2
lrwxrwxrwx 1 root root 7 2010-11-13 16:41 /dev/md_d0p3 -> md/d0p3
lrwxrwxrwx 1 root root 7 2010-11-13 16:41 /dev/md_d0p4 -> md/d0p4
/dev/md:
total 0
drwxrwx--- 2 root disk 140 2010-11-13 16:41 .
drwxr-xr-x 18 root root 4520 2010-11-14 11:42 ..
brw------- 1 root root 254, 0 2010-11-13 16:41 d0
brw------- 1 root root 254, 1 2010-11-13 16:41 d0p1
brw------- 1 root root 254, 2 2010-11-13 16:41 d0p2
brw------- 1 root root 254, 3 2010-11-13 16:41 d0p3
brw------- 1 root root 254, 4 2010-11-13 16:41 d0p4
root@gs0:/dev#
***********************************************
root@gs0:/home/geograph# fdisk -lu
WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util
fdisk doesn't support GPT. Use GNU Parted.
Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
256 heads, 63 sectors/track, 242251 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sda1 63 3907024127 1953512032+ fd Linux raid autodetect
(All 8 disks are as above)
************************************************
root@gs0:/home/geograph# parted /dev/sde
GNU Parted 2.2
Using /dev/sde
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Warning: /dev/sde contains GPT signatures, indicating that it has a GPT
table. However, it does not have a valid fake msdos partition table, as it
should. Perhaps it was corrupted -- possibly by a program that doesn't
understand GPT partition tables. Or perhaps you deleted the GPT table, and
are now using an msdos partition table. Is this a GPT partition table?
Yes/No? yes
Model: ATA ST32000542AS (scsi)
Disk /dev/sde: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number Start End Size File system Name Flags
1 17.4kB 134MB 134MB Microsoft reserved partition msftres
(parted)
****************************************************
--
===========================================
Zoltan Szecsei PrGISc [PGP0031]
Geograph (Pty) Ltd.
P.O. Box 7, Muizenberg 7950, South Africa.
65 Main Road, Muizenberg 7945
Western Cape, South Africa.
34° 6'16.35"S 18°28'5.62"E
Tel: +27-21-7884897 Mobile: +27-83-6004028
Fax: +27-86-6115323 www.geograph.co.za
===========================================
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-14 15:36 Confusion with setting up new RAID6 with mdadm Zoltan Szecsei
@ 2010-11-14 16:48 ` Mikael Abrahamsson
2010-11-15 12:27 ` Zoltan Szecsei
2010-11-14 19:50 ` Luca Berra
2010-11-14 22:13 ` Neil Brown
2 siblings, 1 reply; 19+ messages in thread
From: Mikael Abrahamsson @ 2010-11-14 16:48 UTC (permalink / raw)
To: Zoltan Szecsei; +Cc: linux-raid
On Sun, 14 Nov 2010, Zoltan Szecsei wrote:
> *If I have to reformat the drives and redo mdadm --create, other than mdadm
> stop, how can I get rid of all the /dev/md* etc etc so that when I restart
> this exercise, the original bad RAID does not interfere with this new
> attempt?
Look into "--zero-superblock" for all drives.
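A minimal teardown sketch, assuming the array came up as /dev/md_d0 and the
members are /dev/sd[a-h]1 as in your mdstat output:

    mdadm --stop /dev/md_d0              # stop the running array
    for d in /dev/sd[a-h]1; do
        mdadm --zero-superblock "$d"     # wipe the md metadata on each member
    done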
> *Partition alignment?
> Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
> None of the mdadm guides I've googled or received say how to correctly
> prepare the drives before running mdadm --create.
> All the benchmarks & performance tests I've found do not say whether
> the partitions on the HD were aligned.
My recommendation is not to use partitions at all; just use the whole
device (/dev/sdX).
> *What is the correct fdisk or parted method to get rid of the DOS & GPT flags,
> and create a correctly aligned partition, and should this be a 0xda partition
> (& then I use metadata 1.2 for mdadm)?
I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the first
megabyte of the drive to get rid of the partition table (you get rid of
the v1.2 metadata at the same time actually). Then you know for sure
you're correctly aligned as well as md is 4k aligned.
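For all eight drives that could be a short loop - double-check the device
names first, since this destroys the partition tables:

    for d in /dev/sd[a-h]; do
        dd if=/dev/zero of="$d" bs=1024000 count=1   # zero the first ~1MB
    done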
> *Chunk size:
> After reading MANY different opinions, I'm guessing staying at the default
> chunk size is optimal? Anyone want to add to this argument?
Default should be fine.
> *After this, do I mkfs ext4 first, or LVM first?
LVM if you want to use LVM. In the LVM model, filesystems live in LVs
(logical volumes).
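A possible order, with hypothetical volume group and LV names (vg0, lv0):

    pvcreate /dev/md0                   # make the array an LVM physical volume
    vgcreate vg0 /dev/md0               # create a volume group on it
    lvcreate -l 100%FREE -n lv0 vg0     # one LV spanning all the space
    mkfs.ext4 /dev/vg0/lv0              # the filesystem then goes on the LV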
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-14 15:36 Confusion with setting up new RAID6 with mdadm Zoltan Szecsei
2010-11-14 16:48 ` Mikael Abrahamsson
@ 2010-11-14 19:50 ` Luca Berra
2010-11-15 6:52 ` Zoltan Szecsei
2011-07-22 1:08 ` Tanguy Herrmann
2010-11-14 22:13 ` Neil Brown
2 siblings, 2 replies; 19+ messages in thread
From: Luca Berra @ 2010-11-14 19:50 UTC (permalink / raw)
To: linux-raid
On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
> *If I have to reformat the drives and redo mdadm --create, other than mdadm
> stop, how can I get rid of all the /dev/md* etc etc so that when I restart
> this exercise, the original bad RAID does not interfere with this new
> attempt?
mdadm -Ss
mdadm --zero-superblock on each partition
>
>
> *Partition alignment?
> Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
For modern HDDs with 4k sectors it is.
New fdisk and/or parted should already know how to align.
In any case, since you want to use the whole space for RAID, why create
partitions at all? md works nicely without them.
> *Chunk size:
> After reading MANY different opinions, I'm guessing staying at the default
> chunk size is optimal? Anyone want to add to this argument?
I believe the default in newer mdadm is fine.
>
> *After partitioning the 8 drives, is this the correct sequence?
> mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64 --level=raid6
Why do you state the chunk size here? I thought you wanted to stay with the
default.
> --raid-devices=8 /dev/sd[abcdefgh]1
> mdadm --detail --scan >> /etc/mdadm.conf
> mdadm --assemble /dev/md0 /dev/sd[abcdefgh]1
It should already be assembled after create, and once you have appended the
info to mdadm.conf you just need mdadm --assemble /dev/md0 or mdadm
--assemble --scan.
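In other words, something like this sketch, assuming the array is /dev/md0:

    mdadm --detail --scan >> /etc/mdadm.conf   # record the array definition
    mdadm --stop /dev/md0                      # only if you stop it later
    mdadm --assemble --scan                    # re-assemble from mdadm.conf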
> *After this, do I mkfs ext4 first, or LVM first?
If you want to use LVM it would be LVM first, but... do you want to?
There is no point if the aim is allocating the whole space to a single
filesystem.
> *What stride and stripe values should I use?
A new toolstack should already find the correct stripe/stride for you.
One more note:
for such a big array I would suggest creating a bitmap, so that in case of
an unclean shutdown you do not have to wait 27 hours for it to
resync. An internal bitmap will do.
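A sketch, assuming the array is /dev/md0 - the bitmap can be requested at
create time or added to an existing array afterwards:

    mdadm --create ... --bitmap=internal ...   # at creation time, or:
    mdadm --grow --bitmap=internal /dev/md0    # add to an existing array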
L.
--
Luca Berra -- bluca@comedia.it
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-14 15:36 Confusion with setting up new RAID6 with mdadm Zoltan Szecsei
2010-11-14 16:48 ` Mikael Abrahamsson
2010-11-14 19:50 ` Luca Berra
@ 2010-11-14 22:13 ` Neil Brown
2010-11-15 5:30 ` Roman Mamedov
` (2 more replies)
2 siblings, 3 replies; 19+ messages in thread
From: Neil Brown @ 2010-11-14 22:13 UTC (permalink / raw)
To: Zoltan Szecsei; +Cc: linux-raid
On Sun, 14 Nov 2010 17:36:38 +0200
Zoltan Szecsei <zoltans@geograph.co.za> wrote:
> Hi,
> I hope this is the correct list to address this on - I've done a lot of
> typing for nothing, if not :-)
>
> I have done days of research, including reading
> https://raid.wiki.kernel.org/index.php/Linux_Raid, but all I am doing is
> getting confused in the detail.
>
> My goal is to set up an 8*2TB SiI3132 based RAID6 on Ubuntu 10.04LTS,
> with LVM and ext4.
> The setup will mostly hold thousands of 400MB image files, and they will
> not be accessed regularly - they mostly just need to be online. The
> entire space on all 8 drives can be used, and I want 1 massive
> filesystem, when I finally mount this RAID device. No boot, root or swap.
>
> I have gone quite far with the help of the local linux group, but after
> I had completed the 27 hour mdadm --create run, further tidbits were
> thrown at me, and I am trying to get an opinion on if it is worth
> scrapping this effort, and starting again.
>
>
>
> Please can someone provide clarity on:
>
> *If I have to reformat the drives and redo mdadm --create, other than
> mdadm stop, how can I get rid of all the /dev/md* etc etc so that when I
> restart this exercise, the original bad RAID does not interfere with
> this new attempt?
>
>
>
> *Partition alignment?
> Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
> None of the mdadm guides I've googled or received say how to correctly
> prepare the drives before running mdadm --create.
> All the benchmarks & performance tests I've found do not say whether
> the partitions on the HD were aligned.
>
> *What is the correct fdisk or parted method to get rid of the DOS & GPT
> flags and create a correctly aligned partition, and should this be a
> 0xda partition (& then I use metadata 1.2 for mdadm)?
>
>
> *Chunk size:
> After reading MANY different opinions, I'm guessing staying at the
> default chunk size is optimal? Anyone want to add to this argument?
Depending on which version of mdadm you are using, the default chunk size
will be 64K or 512K. I would recommend using 512K even if you have an older
mdadm. 64K appears to be too small for modern hardware, particularly if you
are storing large files.
For raid6 with the current implementation it is safe to use "--assume-clean"
to avoid the long recovery time. It is certainly safe to use that if you
want to build a test array, do some performance measurement, and then scrap
it and try again. If some time later you want to be sure that the array is
entirely in sync you can
echo repair > /sys/block/md0/md/sync_action
and wait a while.
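A sketch of checking on it afterwards, assuming the array is md0:

    cat /proc/mdstat                     # shows the repair progress
    cat /sys/block/md0/md/mismatch_cnt   # non-zero means mismatches were found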
I agree with what Mikael and Luca suggested - particularly the suggestion of
"--bitmap internal". You really want that.
NeilBrown
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-14 22:13 ` Neil Brown
@ 2010-11-15 5:30 ` Roman Mamedov
2010-11-15 6:58 ` Zoltan Szecsei
2010-11-15 18:01 ` Zoltan Szecsei
2 siblings, 0 replies; 19+ messages in thread
From: Roman Mamedov @ 2010-11-15 5:30 UTC (permalink / raw)
To: Neil Brown; +Cc: Zoltan Szecsei, linux-raid
On Mon, 15 Nov 2010 09:13:26 +1100
Neil Brown <neilb@suse.de> wrote:
> Depending on which version of mdadm you are using, the default chunk size
> will be 64K or 512K. I would recommend using 512K even if you have an older
> mdadm. 64K appears to be too small for modern hardware, particularly if you
> are storing large files.
According to some benchmarks I found, 64 or 128K still provide the sweet-spot
performance on RAID5 and RAID6, especially on writes.
http://louwrentius.blogspot.com/2010/05/raid-level-and-chunk-size-benchmarks.html
http://blog.jamponi.net/2008/07/raid56-and-10-benchmarks-on-26255_10.html#raid-5-performance
http://alephnull.com/benchmarks/sata2009/chunksize.html
> I agree with what Mikael and Luca suggested - particularly the suggested for
> "--bitmap internal". You really want that.
It will also help to increase the --bitmap-chunk value to reduce its
performance impact; I suggest using 131072 or more.
--
With respect,
Roman
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-14 19:50 ` Luca Berra
@ 2010-11-15 6:52 ` Zoltan Szecsei
2010-11-15 7:41 ` Luca Berra
2011-07-22 1:08 ` Tanguy Herrmann
1 sibling, 1 reply; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15 6:52 UTC (permalink / raw)
To: linux-raid
On 2010-11-14 21:50, Luca Berra wrote:
> On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
>>
>> *Partition alignment?
>> Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
> for modern hdds with 4k sectors it is
> new fdisk and/or parted should already know how to align
fdisk reports 512b sectors:
root@gs0:/home/geograph# fdisk -lu
WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util
fdisk doesn't support GPT. Use GNU Parted.
Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
256 heads, 63 sectors/track, 242251 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sda1 63 3907024127 1953512032+ fd Linux raid autodetect
(All 8 disks are as above)
> in any case, since you want to use the whole space for raid, why create
> partitions at all, md works nicely without
OK
>
>> *Chunk size:
>> After reading MANY different opinions, I'm guessing staying at the
>> default chunk size is optimal? Anyone want to add to this argument?
> i believe the default in newer mdadm is fine
>
>>
>> *After partitioning the 8 drives, is this the correct sequence?
>> mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64
>> --level=raid6
> why you state the chunk size here, i tought you wanted to stay with the
> defauklt
because on my mdadm, 64 is the default - I was just reinforcing that
for the reader.
root@gs0:/home/geograph# mdadm -V
mdadm - v2.6.7.1 - 15th October 2008
root@gs0:/home/geograph#
>> *After this, do I mkfs ext4 first, or LVM first?
> if you want to use lvm it would be lvm first, but... do you want to?
> there is no point if the aim is allocating the whole space to a single
> filesystem.
Because I might want to join this array to another one at a later stage
- I would then have 2 boxes each with 8 drives, and each with a SiI3132
card on the same motherboard.
I might use the second box to mirror the first, or to extend it - not
sure of my needs yet.
>
>> *What stride and stripe values should I use?
> new toolstack should already find the correct stripe/stride for you
How would I check?
root@gs0:/home/geograph# mkfs.ext4 -V
mke2fs 1.41.11 (14-Mar-2010)
Using EXT2FS Library version 1.41.11
root@gs0:/home/geograph#
>
> one more note:
> for such a big array i would suggest to create a bitmap, so in case of
> an unclean shutdown you do not have to wait for 27 hours for it to
> rebuild. an internal bitmap will do.
>
Nice tip - I'll look into it - Thanks,
Zoltan
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-14 22:13 ` Neil Brown
2010-11-15 5:30 ` Roman Mamedov
@ 2010-11-15 6:58 ` Zoltan Szecsei
2010-11-15 7:43 ` Mikael Abrahamsson
2010-11-15 18:01 ` Zoltan Szecsei
2 siblings, 1 reply; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15 6:58 UTC (permalink / raw)
Cc: linux-raid
On 2010-11-15 00:13, Neil Brown wrote:
>
> Depending on which version of mdadm you are using, the default chunk size
> will be 64K or 512K. I would recommend using 512K even if you have an older
> mdadm. 64K appears to be too small for modern hardware, particularly if you
> are storing large files.
>
root@gs0:/home/geograph# mdadm -V
mdadm - v2.6.7.1 - 15th October 2008
root@gs0:/home/geograph#
This was what apt-get install got for me, from Ubuntu 10.04 64bit Desktop.
Should I download & compile a newer one?
(Where from? - I haven't found the mdadm developer page yet.)
> For raid6 with the current implementation it is safe to use "--assume-clean"
>
Is my above version "current" enough?
> to avoid the long recovery time. It is certainly safe to use that if you
> want to build a test array, do some performance measurement, and then scrap
> it and try again. If some time later you want to be sure that the array is
> entirely in sync you can
> echo repair > /sys/block/md0/md/sync_action
> and wait a while.
>
> I agree with what Mikael and Luca suggested - particularly the suggestion of
> "--bitmap internal". You really want that.
>
>
> N
>
Regards & thanks,
Zoltan
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-15 6:52 ` Zoltan Szecsei
@ 2010-11-15 7:41 ` Luca Berra
2010-11-15 11:06 ` Zoltan Szecsei
0 siblings, 1 reply; 19+ messages in thread
From: Luca Berra @ 2010-11-15 7:41 UTC (permalink / raw)
To: linux-raid
On Mon, Nov 15, 2010 at 08:52:32AM +0200, Zoltan Szecsei wrote:
> On 2010-11-14 21:50, Luca Berra wrote:
>> On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
>>>
>>> *Partition alignment?
>>> Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
>> for modern hdds with 4k sectors it is
>> new fdisk and/or parted should already know how to align
> fdisk reports 512b sectors:
> root@gs0:/home/geograph# fdisk -lu
I believe your fdisk does not support getting geometry from blkid, but I
am not an Ubuntu user.
You could try checking with something like 'strings /sbin/fdisk | grep io_size',
but since we are going without partitions you can ignore all this.
BTW, to check the sector size of a disk on a fairly recent kernel, you can
check the files under /sys/block/*/queue:
hw_sector_size
minimum_io_size
optimal_io_size
- except for disks that lie about their sector size, but that is a
different story.
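For example:

    cat /sys/block/sda/queue/hw_sector_size   # e.g. 512 or 4096
    cat /sys/block/sda/queue/minimum_io_size
    cat /sys/block/sda/queue/optimal_io_size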
>>> *After partitioning the 8 drives, is this the correct sequence?
>>> mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64 --level=raid6
>> Why do you state the chunk size here? I thought you wanted to stay with the
>> default.
> because on my mdadm, 64 is the default - I was just reinforcing that
> for the reader.
> root@gs0:/home/geograph# mdadm -V
> mdadm - v2.6.7.1 - 15th October 2008
this mdadm release is a tad old (about two years); it will work, but
some things may be different from the current 3.1.x.
> root@gs0:/home/geograph#
>>> *After this, do I mkfs ext4 first, or LVM first?
>> if you want to use lvm it would be lvm first, but... do you want to?
>> there is no point if the aim is allocating the whole space to a single
>> filesystem.
> Because I might want to join this array to another one at a later stage - I
> would then have 2 boxes each with 8 drives, and each with a SiI3132 card on
> the same motherboard.
> I might use the second box to mirror the first, or to extend it - not sure
> of my needs yet.
OK, then you need to align LVM as well.
Check if you have these parameters in /etc/lvm/lvm.conf:
md_chunk_alignment = 1
data_alignment_detection = 1
If you don't have those at all, check if lvm supports them:
strings /sbin/lvm | grep io_size
If not, you have to align manually, using the --dataalignment option to
pvcreate; align to a full stripe (chunk_size * 6, see below).
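With the 512K chunk suggested earlier in this thread, a full stripe is
512K * 6 = 3072K, so the manual route would look something like this sketch:

    pvcreate --dataalignment 3072k /dev/md0   # align PV data to one full stripe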
>>> *What stride and stripe values should I use?
stride = chunk_size / filesystem block size
stripe-width = stride * num_data_disks
num_data_disks in your case is 6 (8 total disks - 2 parity disks).
On a fairly recent kernel:
/sys/block/md?/queue/minimum_io_size would be the chunk_size of the array
/sys/block/md?/queue/optimal_io_size would be the stripe size
This should be exported on LVM devices also:
/sys/block/dm-*/queue/...
so you can check the data in /sys/block at each step for the value to feed
into the tools.
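Plugging in the numbers from this thread (512K chunk, 4K ext4 block, 6 data
disks), a sketch - run it on the LV instead of /dev/md0 if LVM sits in between:

    # stride = 512K / 4K = 128; stripe-width = 128 * 6 = 768
    mkfs.ext4 -b 4096 -E stride=128,stripe-width=768 /dev/md0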
>> new toolstack should already find the correct stripe/stride for you
> How would I check ?
strings /sbin/lvm | grep io_size
--
Luca Berra -- bluca@comedia.it
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-15 6:58 ` Zoltan Szecsei
@ 2010-11-15 7:43 ` Mikael Abrahamsson
2010-11-15 9:18 ` Neil Brown
0 siblings, 1 reply; 19+ messages in thread
From: Mikael Abrahamsson @ 2010-11-15 7:43 UTC (permalink / raw)
To: Zoltan Szecsei; +Cc: linux-raid
On Mon, 15 Nov 2010, Zoltan Szecsei wrote:
> (Where from? - haven't found the mdadm developer page yet))
<http://neil.brown.name/blog/20040607123837>
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-15 7:43 ` Mikael Abrahamsson
@ 2010-11-15 9:18 ` Neil Brown
0 siblings, 0 replies; 19+ messages in thread
From: Neil Brown @ 2010-11-15 9:18 UTC (permalink / raw)
To: Mikael Abrahamsson; +Cc: Zoltan Szecsei, linux-raid
On Mon, 15 Nov 2010 08:43:03 +0100 (CET)
Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> On Mon, 15 Nov 2010, Zoltan Szecsei wrote:
>
> > (Where from? - haven't found the mdadm developer page yet))
>
> <http://neil.brown.name/blog/20040607123837>
>
aka http://neil.brown.name/blog/mdadm
NeilBrown
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-15 7:41 ` Luca Berra
@ 2010-11-15 11:06 ` Zoltan Szecsei
0 siblings, 0 replies; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15 11:06 UTC (permalink / raw)
To: linux-raid
On 2010-11-15 09:41, Luca Berra wrote:
> btw to check sector size on a disk, on a fairly recent kernel you can
> check the files under /sys/block/*/queue,
> hw_sector_size
512
> minimum_io_size
512
> optimal_io_size
0
>
> except for disks that lie about their sector size, but this is a
> different story.
>> root@gs0:/home/geograph# mdadm -V
>> mdadm - v2.6.7.1 - 15th October 2008
> this mdadm release is a tad old (about two years); it will work, but
> some things may be different from the current 3.1.x.
>
I've just downloaded the tarball for 3.1.4 and will have a crack at compiling it.
Thanks !
Z
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-14 16:48 ` Mikael Abrahamsson
@ 2010-11-15 12:27 ` Zoltan Szecsei
2010-11-15 12:47 ` Michal Soltys
0 siblings, 1 reply; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15 12:27 UTC (permalink / raw)
To: Mikael Abrahamsson; +Cc: linux-raid
On 2010-11-14 18:48, Mikael Abrahamsson wrote:
>
>> *What is the correct fdisk or parted method to get rid of the DOS & GPT
>> flags and create a correctly aligned partition, and should this be a
>> 0xda partition (& then I use metadata 1.2 for mdadm)?
>
> I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the
> first megabyte of the drive to get rid of the partition table (you get
> rid of the v1.2 metadata at the same time actually). Then you know for
> sure you're correctly aligned as well as md is 4k aligned.
I did this on all 8 drives (/dev/sd[a-h])
root@gs0:/etc# dd if=/dev/zero of=/dev/sdb bs=1024000 count=1
1+0 records in
1+0 records out
1024000 bytes (1.0 MB) copied, 0.0103209 s, 99.2 MB/s
root@gs0:/etc# dd if=/dev/zero of=/dev/sdc bs=1024000 count=1
But the GPT signature has not disappeared. I am going to use these drives
unpartitioned, so is this a problem?
Thanks,
Zoltan
root@gs0:/etc# fdisk -lu
WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util
fdisk doesn't support GPT. Use GNU Parted.
Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sda doesn't contain a valid partition table
WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util
fdisk doesn't support GPT. Use GNU Parted.
Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdb doesn't contain a valid partition table
WARNING: GPT (GUID Partition Table) detected on '/dev/sdc'! The util
fdisk doesn't support GPT. Use GNU Parted.
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdc doesn't contain a valid partition table
WARNING: GPT (GUID Partition Table) detected on '/dev/sdd'! The util
fdisk doesn't support GPT. Use GNU Parted.
Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdd doesn't contain a valid partition table
WARNING: GPT (GUID Partition Table) detected on '/dev/sde'! The util
fdisk doesn't support GPT. Use GNU Parted.
Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sde doesn't contain a valid partition table
WARNING: GPT (GUID Partition Table) detected on '/dev/sdf'! The util
fdisk doesn't support GPT. Use GNU Parted.
Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdf doesn't contain a valid partition table
WARNING: GPT (GUID Partition Table) detected on '/dev/sdg'! The util
fdisk doesn't support GPT. Use GNU Parted.
Disk /dev/sdg: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdg doesn't contain a valid partition table
WARNING: GPT (GUID Partition Table) detected on '/dev/sdh'! The util
fdisk doesn't support GPT. Use GNU Parted.
Disk /dev/sdh: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdh doesn't contain a valid partition table
Disk /dev/sdi: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e30c7
Device Boot Start End Blocks Id System
/dev/sdi1 * 2048 391167 194560 83 Linux
Partition 1 does not end on cylinder boundary.
/dev/sdi2 393214 976771071 488188929 5 Extended
/dev/sdi5 393216 98047999 48827392 83 Linux
/dev/sdi6 98050048 110047231 5998592 82 Linux swap / Solaris
/dev/sdi7 110049280 976771071 433360896 83 Linux
root@gs0:/etc#
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-15 12:27 ` Zoltan Szecsei
@ 2010-11-15 12:47 ` Michal Soltys
2010-11-15 13:23 ` Zoltan Szecsei
0 siblings, 1 reply; 19+ messages in thread
From: Michal Soltys @ 2010-11-15 12:47 UTC (permalink / raw)
To: Zoltan Szecsei; +Cc: Mikael Abrahamsson, linux-raid
On 15.11.2010 13:27, Zoltan Szecsei wrote:
> On 2010-11-14 18:48, Mikael Abrahamsson wrote:
>>
>>> *What is the correct fdisk or parted method to get rid of the DOS & GPT
>>> flags and create a correctly aligned partition, and should this be a
>>> 0xda partition (& then I use metadata 1.2 for mdadm)?
>>
>> I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the
>> first megabyte of the drive to get rid of the partition table (you get
>> rid of the v1.2 metadata at the same time actually). Then you know for
>> sure you're correctly aligned as well as md is 4k aligned.
> I did this on all 8 drives (/dev/sd[a-h])
> root@gs0:/etc# dd if=/dev/zero of=/dev/sdb bs=1024000 count=1
> 1+0 records in
> 1+0 records out
> 1024000 bytes (1.0 MB) copied, 0.0103209 s, 99.2 MB/s
> root@gs0:/etc# dd if=/dev/zero of=/dev/sdc bs=1024000 count=1
>
> But the GPT id has not disappeared.
You might want to do blockdev --rereadpt /dev/sd[a-h] to make sure the
kernel registers the new situation (or do the same with sfdisk -R).
Also, GPT stores a backup partition table + GPT header at the end of the
disk. The kernel might be clever enough to rely on it if you destroy only
the data at the beginning of the disk.
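A sketch of wiping that backup copy too, assuming the standard GPT layout
where the backup occupies the last 33 sectors - verify the device name first:

    SECTORS=$(blockdev --getsz /dev/sdb)   # size in 512-byte sectors
    dd if=/dev/zero of=/dev/sdb bs=512 seek=$((SECTORS - 33)) count=33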
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-15 12:47 ` Michal Soltys
@ 2010-11-15 13:23 ` Zoltan Szecsei
0 siblings, 0 replies; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15 13:23 UTC (permalink / raw)
To: Michal Soltys; +Cc: Mikael Abrahamsson, linux-raid
On 2010-11-15 14:47, Michal Soltys wrote:
> On 15.11.2010 13:27, Zoltan Szecsei wrote:
>> On 2010-11-14 18:48, Mikael Abrahamsson wrote:
>>>
>>>> *What is the correct fdisk or parted method to get rid of the DOS & GPT
>>>> flags and create a correctly aligned partition, and should this be a
>>>> 0xda partition (& then I use metadata 1.2 for mdadm)?
>>>
>>> I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the
>>> first megabyte of the drive to get rid of the partition table (you get
>>> rid of the v1.2 metadata at the same time actually). Then you know for
>>> sure you're correctly aligned as well as md is 4k aligned.
>> I did this on all 8 drives (/dev/sd[a-h])
>> root@gs0:/etc# dd if=/dev/zero of=/dev/sdb bs=1024000 count=1
>> 1+0 records in
>> 1+0 records out
>> 1024000 bytes (1.0 MB) copied, 0.0103209 s, 99.2 MB/s
>> root@gs0:/etc# dd if=/dev/zero of=/dev/sdc bs=1024000 count=1
>>
>> But the GPT id has not disappeared.
>
> You might want to do blockdev --rereadpt /dev/sd[a-h] to make sure
> kernel registers new situation (or do the same with sfdisk -R)
>
> Also, GPT stores backup partition table + gpt header at the end of the
> disk. Kernel might be clever enough to rely on it if you destroy the
> data at the beginning of the disk.
>
OK, just done this on all 8 drives:
root@gs0:/sys/block# dd if=/dev/zero of=/dev/sdb bs=512 seek=3907029166
dd: writing `/dev/sdb': No space left on device
3+0 records in
2+0 records out
1024 bytes (1.0 kB) copied, 0.000913281 s, 1.1 MB/s
root@gs0:/sys/block#
fdisk -lu produces the results below - so presumably the drives are now
clean & ready for mdadm?
BTW: I've just downloaded and compiled the latest mdadm too:
root@gs0:/sys/block# mdadm -V
mdadm - v3.1.4 - 31st August 2010
root@gs0:/sys/block#
Thanks for your (collective) help...
Zoltan
root@gs0:/sys/block# fdisk -lu
Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sda doesn't contain a valid partition table
Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdb doesn't contain a valid partition table
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdc doesn't contain a valid partition table
Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdd doesn't contain a valid partition table
Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sde doesn't contain a valid partition table
Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdf doesn't contain a valid partition table
Disk /dev/sdg: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdg doesn't contain a valid partition table
Disk /dev/sdh: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdh doesn't contain a valid partition table
Disk /dev/sdi: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e30c7
Device Boot Start End Blocks Id System
/dev/sdi1 * 2048 391167 194560 83 Linux
Partition 1 does not end on cylinder boundary.
/dev/sdi2 393214 976771071 488188929 5 Extended
/dev/sdi5 393216 98047999 48827392 83 Linux
/dev/sdi6 98050048 110047231 5998592 82 Linux swap / Solaris
/dev/sdi7 110049280 976771071 433360896 83 Linux
root@gs0:/sys/block#
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-14 22:13 ` Neil Brown
2010-11-15 5:30 ` Roman Mamedov
2010-11-15 6:58 ` Zoltan Szecsei
@ 2010-11-15 18:01 ` Zoltan Szecsei
2010-11-15 19:53 ` Neil Brown
2 siblings, 1 reply; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15 18:01 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
Hi,
One last quick question:
Neil Brown <neilb@suse.de> wrote:
> Depending on which version of mdadm you are using, the default chunk size
> will be 64K or 512K. I would recommend using 512K even if you have an older
> mdadm. 64K appears to be too small for modern hardware, particularly if you
> are storing large files.
>
> For raid6 with the current implementation it is safe to use "--assume-clean"
> to avoid the long recovery time. It is certainly safe to use that if you
> want to build a test array, do some performance measurement, and then scrap
> it and try again. If some time later you want to be sure that the array is
> entirely in sync you can
> echo repair> /sys/block/md0/md/sync_action
> and wait a while.
>
****************************************************
I have compiled the following mdadm on my Ubuntu 64 bit 10.04 Desktop
system:
root@gs0:/home/geograph# uname -a
Linux gs0 2.6.32-25-generic #45-Ubuntu SMP Sat Oct 16 19:52:42 UTC 2010
x86_64 GNU/Linux
root@gs0:/home/geograph# mdadm -V
mdadm - v3.1.4 - 31st August 2010
root@gs0:/home/geograph#
****************************************************
I have deleted the partitions on all 8 drives, and done a mdadm -Ss
root@gs0:/home/geograph# fdisk -lu
Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sda doesn't contain a valid partition table
Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
******************************************************
Based on the above "assume-clean" comment, plus all the help you guys
have offered, I have just run:
mdadm --create /dev/md0 --metadata=1.2 --auto=md --assume-clean
--bitmap=internal --bitmap-chunk=131072 --chunk=512 --level=6
--raid-devices=8 /dev/sd[abcdefgh]
It took a nano-second to complete!
The man-pages for assume-clean say that "the array pre-existed". Surely
as I have erased the HDs, and now have no partitions on them, this is
not true?
Do I need to re-run the above mdadm command, or is it safe to proceed
with LVM then mkfs ext4?
Thanks for all,
Zoltan
******************************************************
root@gs0:/home/geograph# mdadm -E /dev/md0
mdadm: No md superblock detected on /dev/md0.
root@gs0:/home/geograph# ls -la /dev/md*
brw-rw---- 1 root disk 9, 0 2010-11-15 19:53 /dev/md0
/dev/md:
total 0
drwxr-xr-x 2 root root 60 2010-11-15 19:53 .
drwxr-xr-x 19 root root 4260 2010-11-15 19:53 ..
lrwxrwxrwx 1 root root 6 2010-11-15 19:53 0 -> ../md0
root@gs0:/home/geograph# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
[raid4] [raid10]
md0 : active raid6 sdc[2] sdf[5] sdh[7] sdd[3] sdb[1] sdg[6] sda[0] sde[4]
11721077760 blocks super 1.2 level 6, 512k chunk, algorithm 2
[8/8] [UUUUUUUU]
bitmap: 0/8 pages [0KB], 131072KB chunk
unused devices: <none>
*******************************************************
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-15 18:01 ` Zoltan Szecsei
@ 2010-11-15 19:53 ` Neil Brown
2010-11-16 6:48 ` Zoltan Szecsei
0 siblings, 1 reply; 19+ messages in thread
From: Neil Brown @ 2010-11-15 19:53 UTC (permalink / raw)
To: Zoltan Szecsei; +Cc: linux-raid
On Mon, 15 Nov 2010 20:01:48 +0200
Zoltan Szecsei <zoltans@geograph.co.za> wrote:
> Hi,
> One last quick question:
>
> Neil Brown <neilb@suse.de> wrote:
> > Depending on which version of mdadm you are using, the default chunk size
> > will be 64K or 512K. I would recommend using 512K even if you have an older
> > mdadm. 64K appears to be too small for modern hardware, particularly if you
> > are storing large files.
> >
> > For raid6 with the current implementation it is safe to use "--assume-clean"
> > to avoid the long recovery time. It is certainly safe to use that if you
> > want to build a test array, do some performance measurement, and then scrap
> > it and try again. If some time later you want to be sure that the array is
> > entirely in sync you can
> > echo repair> /sys/block/md0/md/sync_action
> > and wait a while.
> >
> ****************************************************
> I have compiled the following mdadm on my Ubuntu 64 bit 10.04 Desktop
> system:
> root@gs0:/home/geograph# uname -a
> Linux gs0 2.6.32-25-generic #45-Ubuntu SMP Sat Oct 16 19:52:42 UTC 2010
> x86_64 GNU/Linux
> root@gs0:/home/geograph# mdadm -V
> mdadm - v3.1.4 - 31st August 2010
> root@gs0:/home/geograph#
>
> ****************************************************
> I have deleted the partitions on all 8 drives, and done a mdadm -Ss
>
> root@gs0:/home/geograph# fdisk -lu
>
> Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
> 255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00000000
>
> Disk /dev/sda doesn't contain a valid partition table
>
> Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
>
> ******************************************************
> Based on the above "assume-clean" comment, plus all the help you guys
> have offered, I have just run:
> mdadm --create /dev/md0 --metadata=1.2 --auto=md --assume-clean
> --bitmap=internal --bitmap-chunk=131072 --chunk=512 --level=6
> --raid-devices=8 /dev/sd[abcdefgh]
>
> It took a nano-second to complete!
>
> The man-pages for assume-clean say that "the array pre-existed". Surely
> as I have erased the HDs, and now have no partitions on them, this is
> not true?
> Do I need to re-run the above mdadm command, or is it safe to proceed
> with LVM then mkfs ext4?
It is safe to proceed.
The situation is that the two parity blocks are probably not correct on most
(or even any) stripes. But you have no live data on them to protect, so it
doesn't really matter.
With the current implementation of RAID6, every time you write, the correct
parity blocks are computed and written. So any live data that is written
will be accompanied by correct parity blocks to protect it.
This does *not* apply to RAID5 as it sometimes uses the old parity block to
compute the new parity block. If the old was wrong, the new will be wrong
too.
It is conceivable that one day we might change the raid6 code to perform
similar updates if it ever turns out to be faster to do it that way, but it
seems unlikely at the moment.
NeilBrown
>
> Thanks for all,
> Zoltan
>
> ******************************************************
> root@gs0:/home/geograph# mdadm -E /dev/md0
> mdadm: No md superblock detected on /dev/md0.
>
>
>
> root@gs0:/home/geograph# ls -la /dev/md*
> brw-rw---- 1 root disk 9, 0 2010-11-15 19:53 /dev/md0
> /dev/md:
> total 0
> drwxr-xr-x 2 root root 60 2010-11-15 19:53 .
> drwxr-xr-x 19 root root 4260 2010-11-15 19:53 ..
> lrwxrwxrwx 1 root root 6 2010-11-15 19:53 0 -> ../md0
>
>
> root@gs0:/home/geograph# cat /proc/mdstat
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> [raid4] [raid10]
> md0 : active raid6 sdc[2] sdf[5] sdh[7] sdd[3] sdb[1] sdg[6] sda[0] sde[4]
> 11721077760 blocks super 1.2 level 6, 512k chunk, algorithm 2
> [8/8] [UUUUUUUU]
> bitmap: 0/8 pages [0KB], 131072KB chunk
>
> unused devices: <none>
>
>
>
>
> *******************************************************
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-15 19:53 ` Neil Brown
@ 2010-11-16 6:48 ` Zoltan Szecsei
0 siblings, 0 replies; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-16 6:48 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
On 2010-11-15 21:53, Neil Brown wrote:
> On Mon, 15 Nov 2010 20:01:48 +0200
> Zoltan Szecsei<zoltans@geograph.co.za> wrote:
>
>
>> Hi,
>> One last quick question:
>> ****************************************************
>> I have compiled the following mdadm on my Ubuntu 64 bit 10.04 Desktop
>> system:
>> root@gs0:/home/geograph# uname -a
>> Linux gs0 2.6.32-25-generic #45-Ubuntu SMP Sat Oct 16 19:52:42 UTC 2010
>> x86_64 GNU/Linux
>> root@gs0:/home/geograph# mdadm -V
>> mdadm - v3.1.4 - 31st August 2010
>> root@gs0:/home/geograph#
>>
>> ****************************************************
>> I have deleted the partitions on all 8 drives, and done a mdadm -Ss
>>
>> root@gs0:/home/geograph# fdisk -lu
>>
>> Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
>> 255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
>> Units = sectors of 1 * 512 = 512 bytes
>> Sector size (logical/physical): 512 bytes / 512 bytes
>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>> Disk identifier: 0x00000000
>>
>> Disk /dev/sda doesn't contain a valid partition table
>>
>> Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
>>
>> ******************************************************
>> Based on the above "assume-clean" comment, plus all the help you guys
>> have offered, I have just run:
>> mdadm --create /dev/md0 --metadata=1.2 --auto=md --assume-clean
>> --bitmap=internal --bitmap-chunk=131072 --chunk=512 --level=6
>> --raid-devices=8 /dev/sd[abcdefgh]
>>
>> It took a nano-second to complete!
>>
>> The man-pages for assume-clean say that "the array pre-existed". Surely
>> as I have erased the HDs, and now have no partitions on them, this is
>> not true?
>> Do I need to re-run the above mdadm command, or is it safe to proceed
>> with LVM then mkfs ext4?
>>
> It is safe to proceed.
>
Too cool (A for away at last :-) )
Neil: Big thanks to you and the others on this list for all the patience
& help you guys have given.
Kind regards,
Zoltan
> The situation is that the two parity block are probably not correct on most
> (or even any) stripes. But you have no live data on them to protect, so it
> doesn't really matter.
>
> With the current implementation of RAID6, every time you write, the correct
> parity blocks are computed and written. So any live data that is written
> will be accompanies by correct parity blocks to protect it.
>
> This does *not* apply to RAID5 as it sometimes uses the old parity block to
> compute the new parity block. If the old was wrong, the new will be wrong
> too.
>
> It is conceivable that one day we might change the raid6 code to perform
> similar updates if it ever turns out to be faster to do it that way, but it
> seems unlikely at the moment.
>
> NeilBrown
>
>
>
* Re: Confusion with setting up new RAID6 with mdadm
2010-11-14 19:50 ` Luca Berra
2010-11-15 6:52 ` Zoltan Szecsei
@ 2011-07-22 1:08 ` Tanguy Herrmann
2011-07-22 5:17 ` Mikael Abrahamsson
1 sibling, 1 reply; 19+ messages in thread
From: Tanguy Herrmann @ 2011-07-22 1:08 UTC (permalink / raw)
To: linux-raid
Luca Berra <bluca <at> comedia.it> writes:
>
> On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
> > *If I have to reformat the drives and redo mdadm --create, other than mdadm
> > stop, how can I get rid of all the /dev/md* etc etc so that when I restart
> > this exercise, the original bad RAID does not interfere with this new
> > attempt?
>
> mdadm -Ss
> mdadm --zero-superblock on each partition
> >
> >
> > *Partition alignment?
> > Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
> for modern hdds with 4k sectors it is
> new fdisk and/or parted should already know how to align
> in any case, since you want to use the whole space for raid, why create
> partitions at all, md works nicely without
Hello,
First, thank you for the interesting topic (it fits my questions ^^) and
for all of this community's participation in it!
I've read somewhere (sorry, I can't recall where) that the RAID could still be
unaligned when using the whole disk, and that we therefore had to create an
aligned partition (using fdisk -u, then creating a partition beginning at LBA
64 at least, spanning a length that is a multiple of 8 sectors).
Was that totally wrong?
Tanguy
* Re: Confusion with setting up new RAID6 with mdadm
2011-07-22 1:08 ` Tanguy Herrmann
@ 2011-07-22 5:17 ` Mikael Abrahamsson
0 siblings, 0 replies; 19+ messages in thread
From: Mikael Abrahamsson @ 2011-07-22 5:17 UTC (permalink / raw)
To: Tanguy Herrmann; +Cc: linux-raid
On Fri, 22 Jul 2011, Tanguy Herrmann wrote:
> I've read somewhere (sorry, I can't recall where) that the RAID could still be
> unaligned when using the whole disk, and that we therefore had to create an
> aligned partition (using fdisk -u, then creating a partition beginning at LBA
> 64 at least, spanning a length that is a multiple of 8 sectors).
>
> Was that totally wrong?
No, but I don't see how using the whole disk can end up not being 4k-aligned
in a case where doing it your way would be. The only way this would end up
unaligned is if the offset jumper on the WDxxEARS drives was set, and then
you'd be misaligned regardless of method.
--
Mikael Abrahamsson email: swmike@swm.pp.se