linux-raid.vger.kernel.org archive mirror
* Confusion with setting up new RAID6 with mdadm
@ 2010-11-14 15:36 Zoltan Szecsei
  2010-11-14 16:48 ` Mikael Abrahamsson
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-14 15:36 UTC (permalink / raw)
  To: linux-raid

Hi,
I hope this is the correct list to address this on - I've done a lot of 
typing for nothing, if not :-)

I have done days of research, including reading 
https://raid.wiki.kernel.org/index.php/Linux_Raid, but all I am doing is 
getting confused in the details.

My goal is to set up an 8*2TB SiI3132 based RAID6 on Ubuntu 10.04LTS, 
with LVM and ext4.
The setup will mostly hold thousands of 400MB image files, and they will 
not be accessed regularly - they mostly just need to be online. The 
entire space on all 8 drives can be used, and I want 1 massive 
filesystem, when I finally mount this RAID device. No boot, root or swap.

I have gone quite far with the help of the local linux group, but after 
I had completed the 27 hour mdadm --create run, further tidbits were 
thrown at me, and I am trying to get an opinion on whether it is worth 
scrapping this effort and starting again.



Please can someone provide clarity on:

*If I have to reformat the drives and redo mdadm --create, other than 
mdadm stop, how can I get rid of all the /dev/md* etc etc so that when I 
restart this exercise, the original bad RAID does not interfere with 
this new attempt?



*Partition alignment?
Is this relevant for modern HDs? (I'm using 5900rpm Seagate 2TB drives.)
None of the mdadm guides I've googled or received speak about how to 
correctly format the drives before running mdadm --create.
All the benchmarks & performance tests I've found do not bother to say 
whether they have aligned the partitions on the HD.

*What is the correct fdisk or parted method to get rid of the DOS & GPT 
flags, and create a correctly aligned partition, and should this be a 
0xda partition (& then I use metadata 1.2 for mdadm)?


*Chunk size:
After reading MANY different opinions, I'm guessing staying at the 
default chunk size is optimal? Anyone want to add to this argument?


*After partitioning the 8 drives, is this the correct sequence?
mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64 
--level=raid6 --raid-devices=8 /dev/sd[abcdefgh]1
mdadm --detail --scan >> /etc/mdadm.conf
mdadm --assemble /dev/md0 /dev/sd[abcdefgh]1


*After this, do I mkfs ext4 first, or LVM first?

*What stride and stripe values should I use?


If you've read this far: Wow! - big thanks.
If you're going to venture some help or affirmation - BIGGER thanks! :=)

Kind regards to all,
Zoltan



This is where I am, but I'd like to get it right, so I am happy to delete 
& restart if the current state is not fixable:

**************************************************

root@gs0:/home/geograph# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] 
[raid4] [raid10]
md_d0 : active raid6 sde1[4] sdg1[6] sdh1[7] sdc1[2] sda1[0] sdb1[1] 
sdd1[3] sdf1[5]
       11721071616 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

unused devices: <none>
root@gs0:/home/geograph#
**************************************************

root@gs0:/home/geograph# mdadm -E /dev/md_d0
mdadm: No md superblock detected on /dev/md_d0.

**************************************************
root@gs0:/dev# ls -la /dev/md*
brw-rw---- 1 root disk 254, 0 2010-11-13 16:41 /dev/md_d0
lrwxrwxrwx 1 root root      7 2010-11-13 16:41 /dev/md_d0p1 -> md/d0p1
lrwxrwxrwx 1 root root      7 2010-11-13 16:41 /dev/md_d0p2 -> md/d0p2
lrwxrwxrwx 1 root root      7 2010-11-13 16:41 /dev/md_d0p3 -> md/d0p3
lrwxrwxrwx 1 root root      7 2010-11-13 16:41 /dev/md_d0p4 -> md/d0p4

/dev/md:
total 0
drwxrwx---  2 root disk    140 2010-11-13 16:41 .
drwxr-xr-x 18 root root   4520 2010-11-14 11:42 ..
brw-------  1 root root 254, 0 2010-11-13 16:41 d0
brw-------  1 root root 254, 1 2010-11-13 16:41 d0p1
brw-------  1 root root 254, 2 2010-11-13 16:41 d0p2
brw-------  1 root root 254, 3 2010-11-13 16:41 d0p3
brw-------  1 root root 254, 4 2010-11-13 16:41 d0p4
root@gs0:/dev#

***********************************************
root@gs0:/home/geograph# fdisk -lu

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util 
fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
256 heads, 63 sectors/track, 242251 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

    Device Boot      Start         End      Blocks   Id  System
/dev/sda1              63  3907024127  1953512032+  fd  Linux raid 
autodetect

(All 8 disks are as above)
************************************************

root@gs0:/home/geograph# parted /dev/sde
GNU Parted 2.2
Using /dev/sde
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Warning: /dev/sde contains GPT signatures, indicating that it has a GPT 
table.  However, it does not have a valid fake msdos partition table, as 
it should.  Perhaps it was
corrupted -- possibly by a program that doesn't understand GPT partition 
tables.  Or perhaps you deleted the GPT table, and are now using an 
msdos partition table.  Is this a
GPT partition table?
Yes/No? yes
Model: ATA ST32000542AS (scsi)
Disk /dev/sde: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End    Size   File system  Name                          
Flags
  1      17.4kB  134MB  134MB               Microsoft reserved 
partition  msftres

(parted)

****************************************************




-- 

===========================================
Zoltan Szecsei PrGISc [PGP0031]
Geograph (Pty) Ltd.
P.O. Box 7, Muizenberg 7950, South Africa.

65 Main Road, Muizenberg 7945
Western Cape, South Africa.

34° 6'16.35"S 18°28'5.62"E

Tel: +27-21-7884897  Mobile: +27-83-6004028
Fax: +27-86-6115323     www.geograph.co.za
===========================================





* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-14 15:36 Confusion with setting up new RAID6 with mdadm Zoltan Szecsei
@ 2010-11-14 16:48 ` Mikael Abrahamsson
  2010-11-15 12:27   ` Zoltan Szecsei
  2010-11-14 19:50 ` Luca Berra
  2010-11-14 22:13 ` Neil Brown
  2 siblings, 1 reply; 19+ messages in thread
From: Mikael Abrahamsson @ 2010-11-14 16:48 UTC (permalink / raw)
  To: Zoltan Szecsei; +Cc: linux-raid

On Sun, 14 Nov 2010, Zoltan Szecsei wrote:

> *If I have to reformat the drives and redo mdadm --create, other than mdadm 
> stop, how can I get rid of all the /dev/md* etc etc so that when I restart 
> this exercise, the original bad RAID does not interfere with this new 
> attempt?

Look into "--zero-superblock" for all drives.
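
For example, something like this should clean things up (device names 
assumed to match the /proc/mdstat output you posted):

mdadm --stop /dev/md_d0
mdadm --zero-superblock /dev/sd[a-h]1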

> *Partition alignment?
> Is this relevant for modern HDs? (I'm using 5900rpm Seagate 2TB drives.)
> None of the mdadm guides I've googled or received speak about how to correctly 
> format the drives before running mdadm --create.
> All the benchmarks & performance tests I've found do not bother to say 
> whether they have aligned the partitions on the HD.

My recommendation is to not use partitions at all; just use the whole 
device (/dev/sdX).

> *What is the correct fdisk or parted method to get rid of the DOS & GPT flags, 
> and create a correctly aligned partition, and should this be a 0xda partition 
> (& then I use metadata 1.2 for mdadm)?

I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the first 
megabyte of the drive to get rid of the partition table (you get rid of 
the v1.2 metadata at the same time actually). Then you know for sure 
you're correctly aligned as well as md is 4k aligned.

> *Chunk size:
> After reading MANY different opinions, I'm guessing staying at the default 
> chunk size is optimal? Anyone want to add to this argument?

Default should be fine.

> *After this, do I mkfs ext4 first, or LVM first?

LVM if you want to use LVM. Filesystems live in LVs (logical volumes) in the LVM model.
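
If you do go the LVM route, a minimal sketch (the vg/lv names are just 
placeholders):

pvcreate /dev/md0
vgcreate vg0 /dev/md0
lvcreate -l 100%FREE -n data vg0
mkfs.ext4 /dev/vg0/data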

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-14 15:36 Confusion with setting up new RAID6 with mdadm Zoltan Szecsei
  2010-11-14 16:48 ` Mikael Abrahamsson
@ 2010-11-14 19:50 ` Luca Berra
  2010-11-15  6:52   ` Zoltan Szecsei
  2011-07-22  1:08   ` Tanguy Herrmann
  2010-11-14 22:13 ` Neil Brown
  2 siblings, 2 replies; 19+ messages in thread
From: Luca Berra @ 2010-11-14 19:50 UTC (permalink / raw)
  To: linux-raid

On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
> *If I have to reformat the drives and redo mdadm --create, other than mdadm 
> stop, how can I get rid of all the /dev/md* etc etc so that when I restart 
> this exercise, the original bad RAID does not interfere with this new 
> attempt?

mdadm -Ss
mdadm --zero-superblock on each partition
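e.g., assuming the members are /dev/sd[a-h]1 as in your mdstat:
for d in /dev/sd[a-h]1; do mdadm --zero-superblock $d; done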
>
>
> *Partition alignment?
> Is this relevant for modern HDs? (I'm using 5900rpm Seagate 2TB drives.)
for modern hdds with 4k sectors it is
new fdisk and/or parted should already know how to align
in any case, since you want to use the whole space for raid, why create
partitions at all? md works nicely without

> *Chunk size:
> After reading MANY different opinions, I'm guessing staying at the default 
> chunk size is optimal? Anyone want to add to this argument?
i believe the default in newer mdadm is fine

>
> *After partitioning the 8 drives, is this the correct sequence?
> mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64 --level=raid6 
why do you state the chunk size here? i thought you wanted to stay with
the default
> --raid-devices=8 /dev/sd[abcdefgh]1
> mdadm --detail --scan >> /etc/mdadm.conf
> mdadm --assemble /dev/md0 /dev/sd[abcdefgh]1
it should already be assembled after create, and after you have appended
the info to mdadm.conf you just need mdadm --assemble /dev/md0 or mdadm
--assemble --scan.

> *After this, do I mkfs ext4 first, or LVM first?
if you want to use lvm it would be lvm first, but... do you want to?
there is no point if the aim is allocating the whole space to a single
filesystem.

> *What stride and stripe values should I use?
new toolstack should already find the correct stripe/stride for you
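(a way to sanity-check after mkfs - /dev/md0 here is an assumption:
tune2fs -l /dev/md0 | grep -iE 'stride|stripe'
will print whatever values mke2fs picked, if any)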

one more note:
for such a big array i would suggest creating a bitmap, so in case of
an unclean shutdown you do not have to wait 27 hours for it to
resync. an internal bitmap will do.
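a bitmap can also be added to an already-created array later, e.g.:
mdadm --grow --bitmap=internal /dev/md0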

L.

-- 
Luca Berra -- bluca@comedia.it


* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-14 15:36 Confusion with setting up new RAID6 with mdadm Zoltan Szecsei
  2010-11-14 16:48 ` Mikael Abrahamsson
  2010-11-14 19:50 ` Luca Berra
@ 2010-11-14 22:13 ` Neil Brown
  2010-11-15  5:30   ` Roman Mamedov
                     ` (2 more replies)
  2 siblings, 3 replies; 19+ messages in thread
From: Neil Brown @ 2010-11-14 22:13 UTC (permalink / raw)
  To: Zoltan Szecsei; +Cc: linux-raid

On Sun, 14 Nov 2010 17:36:38 +0200
Zoltan Szecsei <zoltans@geograph.co.za> wrote:

> Hi,
> I hope this is the correct list to address this on - I've done a lot of 
> typing for nothing, if not :-)
> 
> I have done days of research, including reading 
> https://raid.wiki.kernel.org/index.php/Linux_Raid, but all I am doing is 
> getting confused in the detail.
> 
> My goal is to set up an 8*2TB SiI3132 based RAID6 on Ubuntu 10.04LTS, 
> with LVM and ext4.
> The setup will mostly hold thousands of 400MB image files, and they will 
> not be accessed regularly - they mostly just need to be online. The 
> entire space on all 8 drives can be used, and I want 1 massive 
> filesystem, when I finally mount this RAID device. No boot, root or swap.
> 
> I have gone quite far with the help of the local linux group, but after 
> I had completed the 27 hour mdadm --create run, further tidbits were 
> thrown at me, and I am trying to get an opinion on if it is worth 
> scrapping this effort, and starting again.
> 
> 
> 
> Please can someone provide clarity on:
> 
> *If I have to reformat the drives and redo mdadm --create, other than 
> mdadm stop, how can I get rid of all the /dev/md* etc etc so that when I 
> restart this exercise, the original bad RAID does not interfere with 
> this new attempt?
> 
> 
> 
> *Partition alignment?
> Is this relevant for modern HDs? (I'm using 5900rpm Seagate 2TB drives.)
> None of the mdadm guides I've googled or received speak about how to 
> correctly format the drives before running mdadm --create.
> All the benchmarks & performance tests I've found do not bother to say 
> whether they have aligned the partitions on the HD.
> 
> *What is the correct fdisk or parted method to get rid of the DOS & GPT 
> flags, and create a correctly aligned partition, and should this be a 
> 0xda partition (& then I use metadata 1.2 for mdadm)?
> 
> 
> *Chunk size:
> After reading MANY different opinions, I'm guessing staying at the 
> default chunk size is optimal? Anyone want to add to this argument?

Depending on which version of mdadm you are using, the default chunk size
will be 64K or 512K.  I would recommend using 512K even if you have an older
mdadm.  64K appears to be too small for modern hardware, particularly if you
are storing large files.

For raid6 with the current implementation it is safe to use "--assume-clean"
to avoid the long recovery time.  It is certainly safe to use that if you
want to build a test array, do some performance measurement, and then scrap
it and try again.  If some time later you want to be sure that the array is
entirely in sync you can
  echo repair > /sys/block/md0/md/sync_action
and wait a while.
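
You can watch progress in /proc/mdstat, and afterwards see how many
mismatches the repair found with
  cat /sys/block/md0/md/mismatch_cnt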

I agree with what Mikael and Luca suggested - particularly the suggestion
of "--bitmap internal".  You really want that.


NeilBrown




* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-14 22:13 ` Neil Brown
@ 2010-11-15  5:30   ` Roman Mamedov
  2010-11-15  6:58   ` Zoltan Szecsei
  2010-11-15 18:01   ` Zoltan Szecsei
  2 siblings, 0 replies; 19+ messages in thread
From: Roman Mamedov @ 2010-11-15  5:30 UTC (permalink / raw)
  To: Neil Brown; +Cc: Zoltan Szecsei, linux-raid


On Mon, 15 Nov 2010 09:13:26 +1100
Neil Brown <neilb@suse.de> wrote:

> Depending on which version of mdadm you are using, the default chunk size
> will be 64K or 512K.  I would recommend using 512K even if you have an older
> mdadm.  64K appears to be too small for modern hardware, particularly if you
> are storing large files.

According to some benchmarks I found, 64 or 128K still provide the sweet-spot
performance on RAID5 and RAID6, especially on writes.
http://louwrentius.blogspot.com/2010/05/raid-level-and-chunk-size-benchmarks.html
http://blog.jamponi.net/2008/07/raid56-and-10-benchmarks-on-26255_10.html#raid-5-performance
http://alephnull.com/benchmarks/sata2009/chunksize.html

> I agree with what Mikael and Luca suggested - particularly the suggestion
> of "--bitmap internal".  You really want that.

It will also help to increase the --bitmap-chunk value to reduce its
performance impact; I suggest using 131072 or more.

-- 
With respect,
Roman



* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-14 19:50 ` Luca Berra
@ 2010-11-15  6:52   ` Zoltan Szecsei
  2010-11-15  7:41     ` Luca Berra
  2011-07-22  1:08   ` Tanguy Herrmann
  1 sibling, 1 reply; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15  6:52 UTC (permalink / raw)
  To: linux-raid

On 2010-11-14 21:50, Luca Berra wrote:
> On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
>>
>> *Partition alignment?
>> Is this relevant for modern HDs? (I'm using 5900rpm Seagate 2TB drives.)
> for modern hdds with 4k sectors it is
> new fdisk and/or parted should already know how to align
fdisk reports 512b sectors:
root@gs0:/home/geograph# fdisk -lu

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util 
fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
256 heads, 63 sectors/track, 242251 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

    Device Boot      Start         End      Blocks   Id  System
/dev/sda1              63  3907024127  1953512032+  fd  Linux raid 
autodetect

(All 8 disks are as above)


> in any case, since you want to use the whole space for raid, why create
> partitions at all? md works nicely without
OK

>
>> *Chunk size:
>> After reading MANY different opinions, I'm guessing staying at the 
>> default chunk size is optimal? Anyone want to add to this argument?
> i believe the default in newer mdadm is fine
>
>>
>> *After partitioning the 8 drives, is this the correct sequence?
>> mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64 
>> --level=raid6 
> why do you state the chunk size here? i thought you wanted to stay with
> the default
because on my mdadm, 64 is the default - and I was just reinforcing 
that for the reader.
root@gs0:/home/geograph# mdadm -V
mdadm - v2.6.7.1 - 15th October 2008
root@gs0:/home/geograph#
>> *After this, do I mkfs ext4 first, or LVM first?
> if you want to use lvm it would be lvm first, but... do you want to?
> there is no point if the aim is allocating the whole space to a single
> filesystem.
Because I might want to join this array to another one at a later stage 
- I would then have 2 boxes each with 8 drives, and each with a SiI3132 
card on the same motherboard.
I might use the second box to mirror the first, or to extend it - not 
sure of my needs yet.


>
>> *What stride and stripe values should I use?
> new toolstack should already find the correct stripe/stride for you
How would I check?

root@gs0:/home/geograph# mkfs.ext4 -V
mke2fs 1.41.11 (14-Mar-2010)
         Using EXT2FS Library version 1.41.11
root@gs0:/home/geograph#
>
> one more note:
> for such a big array i would suggest to create a bitmap, so in case of
> an unclean shutdown you do not have to wait for 27 hours for it to
> rebuild. an internal bitmap will do.
>
Nice tip - I'll look into it - Thanks,
Zoltan




* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-14 22:13 ` Neil Brown
  2010-11-15  5:30   ` Roman Mamedov
@ 2010-11-15  6:58   ` Zoltan Szecsei
  2010-11-15  7:43     ` Mikael Abrahamsson
  2010-11-15 18:01   ` Zoltan Szecsei
  2 siblings, 1 reply; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15  6:58 UTC (permalink / raw)
  Cc: linux-raid

On 2010-11-15 00:13, Neil Brown wrote:
>
> Depending on which version of mdadm you are using, the default chunk size
> will be 64K or 512K.  I would recommend using 512K even if you have an older
> mdadm.  64K appears to be too small for modern hardware, particularly if you
> are storing large files.
>    
root@gs0:/home/geograph# mdadm -V
mdadm - v2.6.7.1 - 15th October 2008
root@gs0:/home/geograph#

This was what apt-get install got for me, from Ubuntu 10.04 64bit Desktop.
Should I download & compile a newer one?
(Where from? I haven't found the mdadm developer page yet.)


> For raid6 with the current implementation it is safe to use "--assume-clean"
>    
Is my above version "current" enough?
> to avoid the long recovery time.  It is certainly safe to use that if you
> want to build a test array, do some performance measurement, and then scrap
> it and try again.  If some time later you want to be sure that the array is
> entirely in sync you can
>    echo repair > /sys/block/md0/md/sync_action
> and wait a while.
>
> I agree with what Mikael and Luca suggested - particularly the suggestion
> of "--bitmap internal".  You really want that.
>
>
> N
>    

Regards & thanks,
Zoltan




* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-15  6:52   ` Zoltan Szecsei
@ 2010-11-15  7:41     ` Luca Berra
  2010-11-15 11:06       ` Zoltan Szecsei
  0 siblings, 1 reply; 19+ messages in thread
From: Luca Berra @ 2010-11-15  7:41 UTC (permalink / raw)
  To: linux-raid

On Mon, Nov 15, 2010 at 08:52:32AM +0200, Zoltan Szecsei wrote:
> On 2010-11-14 21:50, Luca Berra wrote:
>> On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
>>>
>>> *Partition alignment?
>>> Is this relevant for modern HDs? (I'm using 5900rpm Seagate 2TB drives.)
>> for modern hdds with 4k sectors it is
>> new fdisk and/or parted should already know how to align
> fdisk reports 512b sectors:
> root@gs0:/home/geograph# fdisk -lu

i believe your fdisk does not support getting geometry from blkid, but i
am not an ubuntu user.
you could try checking with something like 'strings /sbin/fdisk | grep io_size',
but since we are going without partitions you can ignore all this.

btw to check sector size on a disk, on a fairly recent kernel you can
check the files under /sys/block/*/queue,
hw_sector_size
minimum_io_size
optimal_io_size

except for disks that lie about their sector size, but this is a
different story.
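
e.g. something like this dumps all three in one go (sda is just an
example device):
grep . /sys/block/sda/queue/{hw_sector_size,minimum_io_size,optimal_io_size}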

>>> *After partitioning the 8 drives, is this the correct sequence?
>>> mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64 --level=raid6
>> why do you state the chunk size here? i thought you wanted to stay with
>> the default
> because on my mdadm, 64 is the default - and I was just re-enforcing that 
> for the reader.
> root@gs0:/home/geograph# mdadm -V
> mdadm - v2.6.7.1 - 15th October 2008
this mdadm release is a tad old (about two years); it will work, but
some things may be different from current 3.1.x

> root@gs0:/home/geograph#
>>> *After this, do I mkfs ext4 first, or LVM first?
>> if you want to use lvm it would be lvm first, but... do you want to?
>> there is no point if the aim is allocating the whole space to a single
>> filesystem.
> Because I might want to join this array to another one at a later stage - I 
> would then have 2 boxes each with 8 drives, and each with a SiI3132 card on 
> the same motherboard.
> I might use the second box to mirror the first, or to extend it - not sure 
> of my needs yet.
ok, then you need to align lvm as well.
check if you have these parameters in /etc/lvm/lvm.conf:
md_chunk_alignment = 1
data_alignment_detection = 1

if you don't have those at all, check if lvm supports them
strings /sbin/lvm|grep io_size

if not, you have to align manually using the --dataalignment option to
pvcreate; align to a full stripe (chunk_size * 6, see below).
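
e.g., assuming the 512k chunk suggested elsewhere in this thread:
pvcreate --dataalignment 3072k /dev/md0    # 512k chunk * 6 data disks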


>>> *What stride and stripe values should I use?
stride = chunk_size / fs_block_size
stripe-width = stride * num_data_disks
num_data_disks in your case is 6 (8 total disks - 2 parity disks)

on a fairly recent kernel:
/sys/block/md?/queue/minimum_io_size would be the chunk_size of the array
/sys/block/md?/queue/optimal_io_size would be the stripe size

this should be exported on lvm devices also
/sys/block/dm-*/queue/...

so you can check with data in /sys/block at each step which is the value
to feed into tools.
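
as a worked example, with a 512k chunk (an assumption - use whatever your
array actually has) and ext4's default 4k block size:
stride = 512 / 4 = 128
stripe-width = 128 * 6 = 768
mkfs.ext4 -E stride=128,stripe-width=768 /dev/md0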

>> new toolstack should already find the correct stripe/stride for you
> How would I check ?
strings /sbin/lvm|grep io_size

-- 
Luca Berra -- bluca@comedia.it


* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-15  6:58   ` Zoltan Szecsei
@ 2010-11-15  7:43     ` Mikael Abrahamsson
  2010-11-15  9:18       ` Neil Brown
  0 siblings, 1 reply; 19+ messages in thread
From: Mikael Abrahamsson @ 2010-11-15  7:43 UTC (permalink / raw)
  To: Zoltan Szecsei; +Cc: linux-raid

On Mon, 15 Nov 2010, Zoltan Szecsei wrote:

> (Where from? - haven't found the mdadm developer page yet))

<http://neil.brown.name/blog/20040607123837>

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-15  7:43     ` Mikael Abrahamsson
@ 2010-11-15  9:18       ` Neil Brown
  0 siblings, 0 replies; 19+ messages in thread
From: Neil Brown @ 2010-11-15  9:18 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: Zoltan Szecsei, linux-raid

On Mon, 15 Nov 2010 08:43:03 +0100 (CET)
Mikael Abrahamsson <swmike@swm.pp.se> wrote:

> On Mon, 15 Nov 2010, Zoltan Szecsei wrote:
> 
> > (Where from? - haven't found the mdadm developer page yet))
> 
> <http://neil.brown.name/blog/20040607123837>
> 

aka http://neil.brown.name/blog/mdadm

NeilBrown


* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-15  7:41     ` Luca Berra
@ 2010-11-15 11:06       ` Zoltan Szecsei
  0 siblings, 0 replies; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15 11:06 UTC (permalink / raw)
  To: linux-raid

On 2010-11-15 09:41, Luca Berra wrote:
> btw to check sector size on a disk, on a fairly recent kernel you can
> check the files under /sys/block/*/queue,
> hw_sector_size
512
> minimum_io_size
512
> optimal_io_size
0
>
> except for disks that lie about their sector size, but this is a
> different story.
>> root@gs0:/home/geograph# mdadm -V
>> mdadm - v2.6.7.1 - 15th October 2008
> this mdadm release is a tad old (about two years); it will work, but
> some things may be different from current 3.1.x
>
just downloaded the tarball for 3.1.4 and will have a crack at compiling it.


Thanks !

Z




* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-14 16:48 ` Mikael Abrahamsson
@ 2010-11-15 12:27   ` Zoltan Szecsei
  2010-11-15 12:47     ` Michal Soltys
  0 siblings, 1 reply; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15 12:27 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: linux-raid

On 2010-11-14 18:48, Mikael Abrahamsson wrote:
>
>> *What is the correct fdisk or parted method to get rid of the DOS & GPT 
>> flags, and create a correctly aligned partition, and should this be a 
>> 0xda partition (& then I use metadata 1.2 for mdadm)?
>
> I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the 
> first megabyte of the drive to get rid of the partition table (you get 
> rid of the v1.2 metadata at the same time, actually). Then you know for 
> sure you're correctly aligned, and md is 4k aligned.
I did this on all 8 drives (/dev/sd[a-h])
root@gs0:/etc# dd if=/dev/zero of=/dev/sdb bs=1024000 count=1
1+0 records in
1+0 records out
1024000 bytes (1.0 MB) copied, 0.0103209 s, 99.2 MB/s
root@gs0:/etc# dd if=/dev/zero of=/dev/sdc bs=1024000 count=1

But the GPT id has not disappeared. I am going to use these drives 
unpartitioned, so is this a problem?

Thanks,
Zoltan

root@gs0:/etc# fdisk -lu

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util 
fdisk doesn't support GPT. Use GNU Parted.


Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sda doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util 
fdisk doesn't support GPT. Use GNU Parted.


Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdc'! The util 
fdisk doesn't support GPT. Use GNU Parted.


Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdc doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdd'! The util 
fdisk doesn't support GPT. Use GNU Parted.


Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdd doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sde'! The util 
fdisk doesn't support GPT. Use GNU Parted.


Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sde doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdf'! The util 
fdisk doesn't support GPT. Use GNU Parted.


Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdf doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdg'! The util 
fdisk doesn't support GPT. Use GNU Parted.


Disk /dev/sdg: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdg doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdh'! The util 
fdisk doesn't support GPT. Use GNU Parted.


Disk /dev/sdh: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdh doesn't contain a valid partition table

Disk /dev/sdi: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e30c7

    Device Boot      Start         End      Blocks   Id  System
/dev/sdi1   *        2048      391167      194560   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sdi2          393214   976771071   488188929    5  Extended
/dev/sdi5          393216    98047999    48827392   83  Linux
/dev/sdi6        98050048   110047231     5998592   82  Linux swap / Solaris
/dev/sdi7       110049280   976771071   433360896   83  Linux
root@gs0:/etc#






* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-15 12:27   ` Zoltan Szecsei
@ 2010-11-15 12:47     ` Michal Soltys
  2010-11-15 13:23       ` Zoltan Szecsei
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Soltys @ 2010-11-15 12:47 UTC (permalink / raw)
  To: Zoltan Szecsei; +Cc: Mikael Abrahamsson, linux-raid

On 15.11.2010 13:27, Zoltan Szecsei wrote:
> On 2010-11-14 18:48, Mikael Abrahamsson wrote:
>>
>>> *What is the correct fdisk or parted method to get rid of the DOS & GPT
>>> flags, and create a correctly aligned partition, and should this be a
>>> 0xda partition (& then I use metadata 1.2 for mdadm)?
>>
>> I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the
>> first megabyte of the drive to get rid of the partition table (you get
>> rid of the v1.2 metadata at the same time, actually). Then you know for
>> sure you're correctly aligned, and md is 4k aligned.
> I did this on all 8 drives (/dev/sd[a-h])
> root@gs0:/etc# dd if=/dev/zero of=/dev/sdb bs=1024000 count=1
> 1+0 records in
> 1+0 records out
> 1024000 bytes (1.0 MB) copied, 0.0103209 s, 99.2 MB/s
> root@gs0:/etc# dd if=/dev/zero of=/dev/sdc bs=1024000 count=1
>
> But the GPT id has not disappeared.

You might want to do blockdev --rereadpt /dev/sd[a-h] to make sure the 
kernel registers the new situation (or do the same with sfdisk -R).

Also, GPT stores a backup partition table + GPT header at the end of the 
disk. The kernel might be clever enough to rely on it if you destroy the 
data at the beginning of the disk.
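
A sketch for clearing the backup copy as well (the numbers are per-disk 
assumptions: your fdisk output shows 3907029168-sector drives, and the 
backup GPT occupies the last 33 sectors - 32 of partition entries plus 
the backup header):

dd if=/dev/zero of=/dev/sdX bs=512 seek=3907029135 count=33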



* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-15 12:47     ` Michal Soltys
@ 2010-11-15 13:23       ` Zoltan Szecsei
  0 siblings, 0 replies; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15 13:23 UTC (permalink / raw)
  To: Michal Soltys; +Cc: Mikael Abrahamsson, linux-raid

On 2010-11-15 14:47, Michal Soltys wrote:
> On 15.11.2010 13:27, Zoltan Szecsei wrote:
>> On 2010-11-14 18:48, Mikael Abrahamsson wrote:
>>>
>>>> *What is the correct fdisk or parted method to get rid of the DOS & GPT
>>>> flags, and create a correctly aligned partition, and should this be a
>>>> 0xda partition (& then I use metadata 1.2 for mdadm)?
>>>
>>> I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the
>>> first megabyte of the drive to get rid of the partition table (you get
>>> rid of the v1.2 metadata at the same time, actually). Then you know for
>>> sure you're correctly aligned, and md is 4k aligned.
>> I did this on all 8 drives (/dev/sd[a-h])
>> root@gs0:/etc# dd if=/dev/zero of=/dev/sdb bs=1024000 count=1
>> 1+0 records in
>> 1+0 records out
>> 1024000 bytes (1.0 MB) copied, 0.0103209 s, 99.2 MB/s
>> root@gs0:/etc# dd if=/dev/zero of=/dev/sdc bs=1024000 count=1
>>
>> But the GPT id has not disappeared.
>
> You might want to do blockdev --rereadpt /dev/sd[a-h] to make sure 
> kernel registers new situation (or do the same with sfdisk -R)
>
> Also, GPT stores backup partition table + gpt header at the end of the 
> disk. Kernel might be clever enough to rely on it if you destroy the 
> data at the beginning of the disk.
>

OK, just done this on all 8 drives:
root@gs0:/sys/block# dd if=/dev/zero of=/dev/sdb bs=512 seek=3907029166
dd: writing `/dev/sdb': No space left on device
3+0 records in
2+0 records out
1024 bytes (1.0 kB) copied, 0.000913281 s, 1.1 MB/s
root@gs0:/sys/block#


fdisk -lu produces the results below - so presumably the drives are now 
clean & ready for mdadm?

BTW: I've just downloaded and compiled the latest mdadm too:
root@gs0:/sys/block# mdadm -V
mdadm - v3.1.4 - 31st August 2010
root@gs0:/sys/block#


Thanks for your (collective) helps...
Zoltan



root@gs0:/sys/block# fdisk -lu

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sda doesn't contain a valid partition table

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sde doesn't contain a valid partition table

Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/sdg: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdg doesn't contain a valid partition table

Disk /dev/sdh: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdh doesn't contain a valid partition table

Disk /dev/sdi: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e30c7

    Device Boot      Start         End      Blocks   Id  System
/dev/sdi1   *        2048      391167      194560   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sdi2          393214   976771071   488188929    5  Extended
/dev/sdi5          393216    98047999    48827392   83  Linux
/dev/sdi6        98050048   110047231     5998592   82  Linux swap / Solaris
/dev/sdi7       110049280   976771071   433360896   83  Linux
root@gs0:/sys/block#









* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-14 22:13 ` Neil Brown
  2010-11-15  5:30   ` Roman Mamedov
  2010-11-15  6:58   ` Zoltan Szecsei
@ 2010-11-15 18:01   ` Zoltan Szecsei
  2010-11-15 19:53     ` Neil Brown
  2 siblings, 1 reply; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-15 18:01 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

Hi,
One last quick question:

Neil Brown <neilb@suse.de> wrote:
> Depending on which version of mdadm you are using, the default chunk size
> will be 64K or 512K.  I would recommend using 512K even if you have an older
> mdadm.  64K appears to be too small for modern hardware, particularly if you
> are storing large files.
>
> For raid6 with the current implementation it is safe to use "--assume-clean"
> to avoid the long recovery time.  It is certainly safe to use that if you
> want to build a test array, do some performance measurement, and then scrap
> it and try again.  If some time later you want to be sure that the array is
> entirely in sync you can
>    echo repair > /sys/block/md0/md/sync_action
> and wait a while.
>    
****************************************************
I have compiled the following mdadm on my Ubuntu 64 bit 10.04 Desktop 
system:
root@gs0:/home/geograph# uname -a
Linux gs0 2.6.32-25-generic #45-Ubuntu SMP Sat Oct 16 19:52:42 UTC 2010 
x86_64 GNU/Linux
root@gs0:/home/geograph# mdadm -V
mdadm - v3.1.4 - 31st August 2010
root@gs0:/home/geograph#

****************************************************
I have deleted the partitions on all 8 drives, and done a mdadm -Ss

root@gs0:/home/geograph# fdisk -lu

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sda doesn't contain a valid partition table

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes

******************************************************
Based on the above "assume-clean" comment, plus all the help you guys 
have offered, I have just run:
mdadm --create /dev/md0 --metadata=1.2 --auto=md --assume-clean 
--bitmap=internal --bitmap-chunk=131072 --chunk=512 --level=6 
--raid-devices=8 /dev/sd[abcdefgh]

It took a nano-second to complete!

The man-pages for assume-clean say that "the array pre-existed". Surely 
as I have erased the HDs, and now have no partitions on them, this is 
not true?
Do I need to re-run the above mdadm command, or is it safe to proceed 
with LVM then mkfs ext4?

Thanks for all,
Zoltan

******************************************************
root@gs0:/home/geograph# mdadm -E /dev/md0
mdadm: No md superblock detected on /dev/md0.



root@gs0:/home/geograph# ls -la /dev/md*
brw-rw---- 1 root disk 9, 0 2010-11-15 19:53 /dev/md0
/dev/md:
total 0
drwxr-xr-x  2 root root   60 2010-11-15 19:53 .
drwxr-xr-x 19 root root 4260 2010-11-15 19:53 ..
lrwxrwxrwx  1 root root    6 2010-11-15 19:53 0 -> ../md0


root@gs0:/home/geograph# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] 
[raid4] [raid10]
md0 : active raid6 sdc[2] sdf[5] sdh[7] sdd[3] sdb[1] sdg[6] sda[0] sde[4]
       11721077760 blocks super 1.2 level 6, 512k chunk, algorithm 2 
[8/8] [UUUUUUUU]
       bitmap: 0/8 pages [0KB], 131072KB chunk

unused devices: <none>




*******************************************************














* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-15 18:01   ` Zoltan Szecsei
@ 2010-11-15 19:53     ` Neil Brown
  2010-11-16  6:48       ` Zoltan Szecsei
  0 siblings, 1 reply; 19+ messages in thread
From: Neil Brown @ 2010-11-15 19:53 UTC (permalink / raw)
  To: Zoltan Szecsei; +Cc: linux-raid

On Mon, 15 Nov 2010 20:01:48 +0200
Zoltan Szecsei <zoltans@geograph.co.za> wrote:

> Hi,
> One last quick question:
> 
> Neil Brown <neilb@suse.de> wrote:
> > Depending on which version of mdadm you are using, the default chunk size
> > will be 64K or 512K.  I would recommend using 512K even if you have an older
> > mdadm.  64K appears to be too small for modern hardware, particularly if you
> > are storing large files.
> >
> > For raid6 with the current implementation it is safe to use "--assume-clean"
> > to avoid the long recovery time.  It is certainly safe to use that if you
> > want to build a test array, do some performance measurement, and then scrap
> > it and try again.  If some time later you want to be sure that the array is
> > entirely in sync you can
> >    echo repair > /sys/block/md0/md/sync_action
> > and wait a while.
> >    
> ****************************************************
> I have compiled the following mdadm on my Ubuntu 64 bit 10.04 Desktop 
> system:
> root@gs0:/home/geograph# uname -a
> Linux gs0 2.6.32-25-generic #45-Ubuntu SMP Sat Oct 16 19:52:42 UTC 2010 
> x86_64 GNU/Linux
> root@gs0:/home/geograph# mdadm -V
> mdadm - v3.1.4 - 31st August 2010
> root@gs0:/home/geograph#
> 
> ****************************************************
> I have deleted the partitions on all 8 drives, and done a mdadm -Ss
> 
> root@gs0:/home/geograph# fdisk -lu
> 
> Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
> 255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00000000
> 
> Disk /dev/sda doesn't contain a valid partition table
> 
> Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
> 
> ******************************************************
> Based on the above "assume-clean" comment, plus all the help you guys 
> have offered, I have just run:
> mdadm --create /dev/md0 --metadata=1.2 --auto=md --assume-clean 
> --bitmap=internal --bitmap-chunk=131072 --chunk=512 --level=6 
> --raid-devices=8 /dev/sd[abcdefgh]
> 
> It took a nano-second to complete!
> 
> The man-pages for assume-clean say that "the array pre-existed". Surely 
> as I have erased the HDs, and now have no partitions on them, this is 
> not true?
> Do I need to re-run the above mdadm command, or is it safe to proceed 
> with LVM then mkfs ext4?

It is safe to proceed.

The situation is that the two parity blocks are probably not correct on most
(or even any) stripes.  But you have no live data on them to protect, so it
doesn't really matter.

With the current implementation of RAID6, every time you write, the correct
parity blocks are computed and written.  So any live data that is written
will be accompanied by correct parity blocks to protect it.

This does *not* apply to RAID5 as it sometimes uses the old parity block to
compute the new parity block.  If the old was wrong, the new will be wrong
too.

It is conceivable that one day we might change the raid6 code to perform
similar updates if it ever turns out to be faster to do it that way, but it
seems unlikely at the moment.

NeilBrown


> 
> Thanks for all,
> Zoltan
> 
> ******************************************************
> root@gs0:/home/geograph# mdadm -E /dev/md0
> mdadm: No md superblock detected on /dev/md0.
> 
> 
> 
> root@gs0:/home/geograph# ls -la /dev/md*
> brw-rw---- 1 root disk 9, 0 2010-11-15 19:53 /dev/md0
> /dev/md:
> total 0
> drwxr-xr-x  2 root root   60 2010-11-15 19:53 .
> drwxr-xr-x 19 root root 4260 2010-11-15 19:53 ..
> lrwxrwxrwx  1 root root    6 2010-11-15 19:53 0 -> ../md0
> 
> 
> root@gs0:/home/geograph# cat /proc/mdstat
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] 
> [raid4] [raid10]
> md0 : active raid6 sdc[2] sdf[5] sdh[7] sdd[3] sdb[1] sdg[6] sda[0] sde[4]
>        11721077760 blocks super 1.2 level 6, 512k chunk, algorithm 2 
> [8/8] [UUUUUUUU]
>        bitmap: 0/8 pages [0KB], 131072KB chunk
> 
> unused devices: <none>
> 
> 
> 
> 
> *******************************************************
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 



* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-15 19:53     ` Neil Brown
@ 2010-11-16  6:48       ` Zoltan Szecsei
  0 siblings, 0 replies; 19+ messages in thread
From: Zoltan Szecsei @ 2010-11-16  6:48 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

On 2010-11-15 21:53, Neil Brown wrote:
> On Mon, 15 Nov 2010 20:01:48 +0200
> Zoltan Szecsei<zoltans@geograph.co.za>  wrote:
>
>    
>> Hi,
>> One last quick question:
>> ****************************************************
>> I have compiled the following mdadm on my Ubuntu 64 bit 10.04 Desktop
>> system:
>> root@gs0:/home/geograph# uname -a
>> Linux gs0 2.6.32-25-generic #45-Ubuntu SMP Sat Oct 16 19:52:42 UTC 2010
>> x86_64 GNU/Linux
>> root@gs0:/home/geograph# mdadm -V
>> mdadm - v3.1.4 - 31st August 2010
>> root@gs0:/home/geograph#
>>
>> ****************************************************
>> I have deleted the partitions on all 8 drives, and done a mdadm -Ss
>>
>> root@gs0:/home/geograph# fdisk -lu
>>
>> Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
>> 255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
>> Units = sectors of 1 * 512 = 512 bytes
>> Sector size (logical/physical): 512 bytes / 512 bytes
>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>> Disk identifier: 0x00000000
>>
>> Disk /dev/sda doesn't contain a valid partition table
>>
>> Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
>>
>> ******************************************************
>> Based on the above "assume-clean" comment, plus all the help you guys
>> have offered, I have just run:
>> mdadm --create /dev/md0 --metadata=1.2 --auto=md --assume-clean
>> --bitmap=internal --bitmap-chunk=131072 --chunk=512 --level=6
>> --raid-devices=8 /dev/sd[abcdefgh]
>>
>> It took a nano-second to complete!
>>
>> The man-pages for assume-clean say that "the array pre-existed". Surely
>> as I have erased the HDs, and now have no partitions on them, this is
>> not true?
>> Do I need to re-run the above mdadm command, or is it safe to proceed
>> with LVM then mkfs ext4?
>>      
> It is safe to proceed.
>    

Too cool (A for away at last :-) )
Neil: Big thanks to you and the others on this list for all the patience 
& help you guys have given.
Kind regards,
Zoltan
> The situation is that the two parity blocks are probably not correct on most
> (or even any) stripes.  But you have no live data on them to protect, so it
> doesn't really matter.
>
> With the current implementation of RAID6, every time you write, the correct
> parity blocks are computed and written.  So any live data that is written
> will be accompanied by correct parity blocks to protect it.
>
> This does *not* apply to RAID5 as it sometimes uses the old parity block to
> compute the new parity block.  If the old was wrong, the new will be wrong
> too.
>
> It is conceivable that one day we might change the raid6 code to perform
> similar updates if it ever turns out to be faster to do it that way, but it
> seems unlikely at the moment.
>
> NeilBrown
>
>
>    






* Re: Confusion with setting up new RAID6 with mdadm
  2010-11-14 19:50 ` Luca Berra
  2010-11-15  6:52   ` Zoltan Szecsei
@ 2011-07-22  1:08   ` Tanguy Herrmann
  2011-07-22  5:17     ` Mikael Abrahamsson
  1 sibling, 1 reply; 19+ messages in thread
From: Tanguy Herrmann @ 2011-07-22  1:08 UTC (permalink / raw)
  To: linux-raid

Luca Berra <bluca@comedia.it> writes:

> 
> On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
> > *If I have to reformat the drives and redo mdadm --create, other than mdadm 
> > stop, how can I get rid of all the /dev/md* etc etc so that when I restart 
> > this exercise, the original bad RAID does not interfere with this new 
> > attempt?
> 
> mdadm -Ss
> mdadm --zero-superblock on each partition
> >
> >
> > *Partition alignment?
> > Is this relevant for modern HDs? (I'm using 5900rpm Seagate 2TB drives.)
> for modern hdds with 4k sectors it is
> new fdisk and/or parted should already know how to align
> in any case, since you want to use the whole space for raid, why create
> partitions at all? md works nicely without

Hello,
first, thank you for the interesting topic (it fits my questions ^^) and 
for all this community's participation in it!
 
I've read somewhere (sorry, I can't remember where) that the RAID could 
still be unaligned when using the whole disk, and so we had to create an 
aligned partition (by using fdisk -u, then creating a partition beginning 
at LBA 64 at least, and spanning a length that is a multiple of 8 sectors).

Was that totally wrong?

Tanguy






* Re: Confusion with setting up new RAID6 with mdadm
  2011-07-22  1:08   ` Tanguy Herrmann
@ 2011-07-22  5:17     ` Mikael Abrahamsson
  0 siblings, 0 replies; 19+ messages in thread
From: Mikael Abrahamsson @ 2011-07-22  5:17 UTC (permalink / raw)
  To: Tanguy Herrmann; +Cc: linux-raid

On Fri, 22 Jul 2011, Tanguy Herrmann wrote:

> I've read somewhere (sorry, I can't remember where) that the RAID could still
> be unaligned when using the whole disk, and so we had to create an aligned
> partition (by using fdisk -u, then creating a partition beginning at LBA 64 at
> least, and spanning a length that is a multiple of 8 sectors).
>
> Was that totally wrong?

No, but I don't see how using the whole disk can end up not 4k aligned 
when doing it your way would be. The only way this would end up unaligned 
would be if the offset jumper on the WDxxEARS drives was set, and then 
you'd be misaligned regardless of method.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se



Thread overview: 19 messages
2010-11-14 15:36 Confusion with setting up new RAID6 with mdadm Zoltan Szecsei
2010-11-14 16:48 ` Mikael Abrahamsson
2010-11-15 12:27   ` Zoltan Szecsei
2010-11-15 12:47     ` Michal Soltys
2010-11-15 13:23       ` Zoltan Szecsei
2010-11-14 19:50 ` Luca Berra
2010-11-15  6:52   ` Zoltan Szecsei
2010-11-15  7:41     ` Luca Berra
2010-11-15 11:06       ` Zoltan Szecsei
2011-07-22  1:08   ` Tanguy Herrmann
2011-07-22  5:17     ` Mikael Abrahamsson
2010-11-14 22:13 ` Neil Brown
2010-11-15  5:30   ` Roman Mamedov
2010-11-15  6:58   ` Zoltan Szecsei
2010-11-15  7:43     ` Mikael Abrahamsson
2010-11-15  9:18       ` Neil Brown
2010-11-15 18:01   ` Zoltan Szecsei
2010-11-15 19:53     ` Neil Brown
2010-11-16  6:48       ` Zoltan Szecsei
