linux-raid.vger.kernel.org archive mirror
* Array created by mdadm 3.2 & 3.3 have different array size, why?
@ 2014-03-26 10:31 Tide
  2014-03-26 18:01 ` Larry Fenske
  2014-03-26 18:38 ` Mikael Abrahamsson
  0 siblings, 2 replies; 13+ messages in thread
From: Tide @ 2014-03-26 10:31 UTC (permalink / raw)
  To: linux-raid

I created a software RAID 5 array about a year ago using mdadm v3.2.x
(CentOS 6.3) with three 3 TB disks (Seagate ST3000DM001). After a few months
I moved the array to (assembled it in) Fedora 19 (now Fedora 20). Recently I
added two more disks and grew the array to 4 disks + 1 hot spare; its size
is now 8383.55 GiB.

Then I created another array (RAID 6) using mdadm v3.3 (Fedora 20) with
five 3 TB disks (Toshiba DT01ACA300), but its array size is 8383.18 GiB,
slightly smaller than 8383.55 GiB.

The partition sizes of the disks in the two arrays are identical (all
partitions have 5860531087 logical sectors), so why do the array sizes
differ? Is it caused by the different mdadm versions (my guess), the
different RAID levels, or something else?

Same question on unix.stackexchange.com:
http://unix.stackexchange.com/questions/121310/two-arrays-have-slightly-different-array-size-with-same-size-disks-partitions-w


===================
== RAID 5 detail ==
===================
# mdadm -D /dev/md127 
/dev/md127:
        Version : 1.2
  Creation Time : Fri Jan 11 17:56:18 2013
     Raid Level : raid5
     Array Size : 8790792192 (8383.55 GiB 9001.77 GB)
  Used Dev Size : 2930264064 (2794.52 GiB 3000.59 GB)
   Raid Devices : 4
  Total Devices : 5
    Persistence : Superblock is persistent

    Update Time : Tue Mar 25 11:04:15 2014
          State : clean 
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

           Name : RecordBackup01:127  (local to host RecordBackup01)
           UUID : dfd3bbe7:4b0231fe:9007bc4a:e106acac
         Events : 7264

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       3       8       49        2      active sync   /dev/sdd1
       5       8       81        3      active sync   /dev/sdf1

       4       8       65        -      spare   /dev/sde1

===================
== RAID 6 detail ==
===================
# mdadm -D /dev/md127 
/dev/md127:
        Version : 1.2
  Creation Time : Fri Mar 21 18:12:00 2014
     Raid Level : raid6
     Array Size : 8790402048 (8383.18 GiB 9001.37 GB)
  Used Dev Size : 2930134016 (2794.39 GiB 3000.46 GB)
   Raid Devices : 5
  Total Devices : 5
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Tue Mar 25 11:18:51 2014
          State : active 
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : RecordBackup02:127  (local to host RecordBackup02)
           UUID : 923c9658:12739258:506fc8b0:f8c5edf3
         Events : 8172

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       3       8       65        3      active sync   /dev/sde1
       4       8       81        4      active sync   /dev/sdf1



===================================
== RAID 5 partitions information ==
===================================
Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sdb: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name  Flags
 1      2048s  5860533134s  5860531087s               pri


Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sdc: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name     Flags
 1      2048s  5860533134s  5860531087s  ext4         primary


Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sdd: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name     Flags
 1      2048s  5860533134s  5860531087s  ext4         primary


Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sde: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name     Flags
 1      2048s  5860533134s  5860531087s               primary


Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sdf: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name     Flags
 1      2048s  5860533134s  5860531087s               primary




Model: Linux Software RAID Array (md)
Disk /dev/md127: 17581584384s
Sector size (logical/physical): 512B/4096B
Partition Table: loop
Disk Flags: 

Number  Start  End           Size          File system  Flags
 1      0s     17581584383s  17581584384s  xfs


===================================
== RAID 6 partitions information ==
===================================
Model: ATA TOSHIBA DT01ACA3 (scsi)
Disk /dev/sdb: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name     Flags
 1      2048s  5860533134s  5860531087s               primary


Model: ATA TOSHIBA DT01ACA3 (scsi)
Disk /dev/sdc: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name     Flags
 1      2048s  5860533134s  5860531087s               primary


Model: ATA TOSHIBA DT01ACA3 (scsi)
Disk /dev/sdd: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name     Flags
 1      2048s  5860533134s  5860531087s               primary


Model: ATA TOSHIBA DT01ACA3 (scsi)
Disk /dev/sde: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name     Flags
 1      2048s  5860533134s  5860531087s               primary


Model: ATA TOSHIBA DT01ACA3 (scsi)
Disk /dev/sdf: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name     Flags
 1      2048s  5860533134s  5860531087s               primary






Model: Linux Software RAID Array (md)
Disk /dev/md127: 17580804096s
Sector size (logical/physical): 512B/4096B
Partition Table: loop
Disk Flags: 

Number  Start  End           Size          File system  Flags
 1      0s     17580804095s  17580804096s  xfs


* Re: Array created by mdadm 3.2 & 3.3 have different array size, why?
  2014-03-26 10:31 Array created by mdadm 3.2 & 3.3 have different array size, why? Tide
@ 2014-03-26 18:01 ` Larry Fenske
  2014-03-26 19:47   ` Tide
  2014-03-26 18:38 ` Mikael Abrahamsson
  1 sibling, 1 reply; 13+ messages in thread
From: Larry Fenske @ 2014-03-26 18:01 UTC (permalink / raw)
  To: Tide, linux-raid

Your RAID 6 array has an internal bitmap, while your RAID 5 array does not.
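
(For what it's worth, a rough sketch using the device name from your -D
output; an internal bitmap can normally be checked and toggled after the
array has been created:

mdadm -D /dev/md127 | grep -i bitmap       # prints "Intent Bitmap : Internal" if one exists
mdadm --grow /dev/md127 --bitmap=internal  # add an internal write-intent bitmap
mdadm --grow /dev/md127 --bitmap=none      # remove it
)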

- Larry Fenske

Senior Software Engineer
SGI



* Re: Array created by mdadm 3.2 & 3.3 have different array size, why?
  2014-03-26 10:31 Array created by mdadm 3.2 & 3.3 have different array size, why? Tide
  2014-03-26 18:01 ` Larry Fenske
@ 2014-03-26 18:38 ` Mikael Abrahamsson
  2014-03-26 20:00   ` Tide
  1 sibling, 1 reply; 13+ messages in thread
From: Mikael Abrahamsson @ 2014-03-26 18:38 UTC (permalink / raw)
  To: Tide; +Cc: linux-raid

On Wed, 26 Mar 2014, Tide wrote:

> The partition size of each disk in the two arrays are identical (all 
> partitions have 5860531087 logical sectors), so why the array size are 
> different? Is it caused by different mdadm version (I guess so) or 
> different array level or something else?

Do mdadm -E on one component device from each md volume. My first guess 
would be different data offset.
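
For example (a sketch; substitute one real member device from each array):

mdadm -E /dev/sdb1 | grep -E 'Data Offset|Used Dev Size'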

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: Array created by mdadm 3.2 & 3.3 have different array size, why?
  2014-03-26 18:01 ` Larry Fenske
@ 2014-03-26 19:47   ` Tide
  0 siblings, 0 replies; 13+ messages in thread
From: Tide @ 2014-03-26 19:47 UTC (permalink / raw)
  To: linux-raid

Larry Fenske <LFenske <at> SGI.com> writes:

> 
> Your RAID 6 array has an internal bitmap, while your RAID 5 array does not.

When creating the arrays I didn't specify any bitmap options (I don't know
what they are), so is this a behaviour difference between v3.2 and v3.3, or
are the compile parameters simply different between CentOS 6 and Fedora
19/20?

> 
> - Larry Fenske
> SGI signature
> 
> Senior Software Engineer
> SGI
> 

* Re: Array created by mdadm 3.2 & 3.3 have different array size, why?
  2014-03-26 18:38 ` Mikael Abrahamsson
@ 2014-03-26 20:00   ` Tide
  2014-03-26 21:14     ` Stan Hoeppner
  0 siblings, 1 reply; 13+ messages in thread
From: Tide @ 2014-03-26 20:00 UTC (permalink / raw)
  To: linux-raid

Mikael Abrahamsson <swmike <at> swm.pp.se> writes:

> 
> On Wed, 26 Mar 2014, Tide wrote:
> 
> > The partition size of each disk in the two arrays are identical (all 
> > partitions have 5860531087 logical sectors), so why the array size are 
> > different? Is it caused by different mdadm version (I guess so) or 
> > different array level or something else?
> 
> Do mdadm -E on one component device from each md volume. My first guess 
> would be different data offset.
> 

Yes, the data offsets are different. But why? When creating the arrays I
didn't specify any other options:
mdadm -C /dev/md127 -l 5 -n 3 /dev/sd[bcd]1
mdadm --add /dev/md127 /dev/sd[ef]1
mdadm --grow /dev/md127 -n 4

mdadm -C /dev/md127 -l 6 -n 5 /dev/sd[bcdef]1

=================
Array 1 (RAID 5):
=================
# mdadm --examine /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dfd3bbe7:4b0231fe:9007bc4a:e106acac
           Name : RecordBackup01:127  (local to host RecordBackup01)
  Creation Time : Fri Jan 11 17:56:18 2013
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 5860529039 (2794.52 GiB 3000.59 GB)
     Array Size : 8790792192 (8383.55 GiB 9001.77 GB)
  Used Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=911 sectors
          State : active
    Device UUID : 85fa01f1:b8e7875c:5f19039f:b198c0bb

    Update Time : Thu Mar 27 03:35:06 2014
       Checksum : 16ff8ff5 - correct
         Events : 7266

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)


=================
Array 2 (RAID 6):
=================
# mdadm --examine /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 923c9658:12739258:506fc8b0:f8c5edf3
           Name : RecordBackup02:127  (local to host RecordBackup02)
  Creation Time : Fri Mar 21 18:12:00 2014
     Raid Level : raid6
   Raid Devices : 5

 Avail Dev Size : 5860268943 (2794.39 GiB 3000.46 GB)
     Array Size : 8790402048 (8383.18 GiB 9001.37 GB)
  Used Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=911 sectors
          State : clean
    Device UUID : 94a83f3f:33991f7a:9545d12f:87bd169f

Internal Bitmap : 8 sectors from superblock
    Update Time : Thu Mar 27 01:55:49 2014
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 5a7094a4 - correct
         Events : 8178

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)





* Re: Array created by mdadm 3.2 & 3.3 have different array size, why?
  2014-03-26 20:00   ` Tide
@ 2014-03-26 21:14     ` Stan Hoeppner
  2014-03-27  2:44       ` Array created by mdadm 3.2 & 3.3 have different array size Tide
  2014-03-27 15:23       ` Array created by mdadm 3.2 & 3.3 have different array size, why? Bernd Schubert
  0 siblings, 2 replies; 13+ messages in thread
From: Stan Hoeppner @ 2014-03-26 21:14 UTC (permalink / raw)
  To: Tide, linux-raid

On 3/26/2014 3:00 PM, Tide wrote:
...
> =================
> Array 2 (RAID 6):
> =================
> # mdadm --examine /dev/sdb1
...
>      Raid Level : raid6
>    Raid Devices : 5
...
>   Bad Block Log : 512 entries available at offset 72 sectors

The RAID6 array has sectors on each drive reserved for bad block
reassignment.  The RAID5 array does not.

This is the answer to your mystery.

Cheers,

Stan



* Re: Array created by mdadm 3.2 & 3.3 have different array size
  2014-03-26 21:14     ` Stan Hoeppner
@ 2014-03-27  2:44       ` Tide
  2014-03-27  5:52         ` Stan Hoeppner
  2014-03-27 15:23       ` Array created by mdadm 3.2 & 3.3 have different array size, why? Bernd Schubert
  1 sibling, 1 reply; 13+ messages in thread
From: Tide @ 2014-03-27  2:44 UTC (permalink / raw)
  To: linux-raid

Stan Hoeppner <stan <at> hardwarefreak.com> writes:

> 
> On 3/26/2014 3:00 PM, Tide wrote:
> ...
> > =================
> > Array 2 (RAID 6):
> > =================
> > # mdadm --examine /dev/sdb1
> ...
> >      Raid Level : raid6
> >    Raid Devices : 5
> ...
> >   Bad Block Log : 512 entries available at offset 72 sectors
> 
> The RAID6 array has sectors on each drive reserved for bad block
> reassignment.  The RAID5 array does not.
> 
> This is the answer to your mystery.

"The RAID6/RAID5 array", do you mean my RAID array (just this instance), or
you mean "all RAID6/RAID5 arrays created by mdadm" ?

I also created test RAID 5 & RAID 6 arrays using loopback devices in both
CentOS 6.5 and Fedora 20, and found that the RAID 5 and RAID 6 array sizes
are identical within the same OS (whether CentOS or Fedora), but differ
between the two OSes. That's the part that still feels mysterious to me.

So is it a behaviour difference between mdadm v3.2 and v3.3, or are the
compilation parameters of mdadm simply different between CentOS 6.x and
Fedora 19/20?

Testing script:

MAKEDEV /dev/loop
truncate -s 10M hdd5{1..5} hdd6{1..5}
for hdd in {1..5}; do
    losetup /dev/loop5$hdd hdd5$hdd
    losetup /dev/loop6$hdd hdd6$hdd
done
mdadm -C /dev/md5 -l 5 -n 4 -x 1 /dev/loop5{1..5}
mdadm -C /dev/md6 -l 6 -n 5 /dev/loop6{1..5}
mdadm -D /dev/md5
mdadm -D /dev/md6

> 
> Cheers,
> 
> Stan
> 

* Re: Array created by mdadm 3.2 & 3.3 have different array size
  2014-03-27  2:44       ` Array created by mdadm 3.2 & 3.3 have different array size Tide
@ 2014-03-27  5:52         ` Stan Hoeppner
  2014-03-27  6:41           ` Tide
  0 siblings, 1 reply; 13+ messages in thread
From: Stan Hoeppner @ 2014-03-27  5:52 UTC (permalink / raw)
  To: Tide, linux-raid

On 3/26/2014 9:44 PM, Tide wrote:
> Stan Hoeppner <stan <at> hardwarefreak.com> writes:
>>
>> On 3/26/2014 3:00 PM, Tide wrote:
>> ...
>>> =================
>>> Array 2 (RAID 6):
>>> =================
>>> # mdadm --examine /dev/sdb1
>> ...
>>>      Raid Level : raid6
>>>    Raid Devices : 5
>> ...
>>>   Bad Block Log : 512 entries available at offset 72 sectors
>>
>> The RAID6 array has sectors on each drive reserved for bad block
>> reassignment.  The RAID5 array does not.
>>
>> This is the answer to your mystery.


> "The RAID6/RAID5 array", do you mean my RAID array (just this instance), or
> you mean "all RAID6/RAID5 arrays created by mdadm" ?

My reply above is unambiguous.  I quoted your array data and gave an
answer that applies to your provided array data.

WRT your other questions, I do not have time to research the answer to
those, and wouldn't spend it on that if I did.  I have never used CentOS
nor Fedora, and don't plan to ever use either.  For those answers,
either wait for someone to answer, or research it yourself.

Cheers,

Stan


* Re: Array created by mdadm 3.2 & 3.3 have different array size
  2014-03-27  5:52         ` Stan Hoeppner
@ 2014-03-27  6:41           ` Tide
  2014-03-27 13:04             ` Wilson Jonathan
  0 siblings, 1 reply; 13+ messages in thread
From: Tide @ 2014-03-27  6:41 UTC (permalink / raw)
  To: linux-raid

Stan Hoeppner <stan <at> hardwarefreak.com> writes:

> 
> > "The RAID6/RAID5 array", do you mean my RAID array (just this instance), or
> > you mean "all RAID6/RAID5 arrays created by mdadm" ?
> 
> My reply above is unambiguous.  I quoted your array data and gave an
> answer that applies to your provided array data.
> 
> WRT your other questions, I do not have time to research the answer to
> those, and wouldn't spend it on that if I did.  I have never used CentOS
> nor Fedora, and don't plan to ever use either.  For those answers,
> either wait for someone to answer, or research it yourself.
> 
> Cheers,
> 
> Stan

Thank you Stan!

Now I'm going to try removing the write-intent bitmap and the bad block log
on the RAID 6 array, to see whether I can reshape the two arrays to the same
size.
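
Roughly what I intend to try (a sketch only; I'm not yet sure my mdadm
supports every option here):

mdadm --grow /dev/md127 --bitmap=none    # drop the internal write-intent bitmap
mdadm --stop /dev/md127
mdadm --assemble /dev/md127 --update=no-bbl /dev/sd[bcdef]1    # drop the bad block log, if supported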



* Re: Array created by mdadm 3.2 & 3.3 have different array size
  2014-03-27  6:41           ` Tide
@ 2014-03-27 13:04             ` Wilson Jonathan
  0 siblings, 0 replies; 13+ messages in thread
From: Wilson Jonathan @ 2014-03-27 13:04 UTC (permalink / raw)
  To: Tide; +Cc: linux-raid

On Thu, 2014-03-27 at 06:41 +0000, Tide wrote:
> Stan Hoeppner <stan <at> hardwarefreak.com> writes:
> 
> > 
> > > "The RAID6/RAID5 array", do you mean my RAID array (just this instance), or
> > > you mean "all RAID6/RAID5 arrays created by mdadm" ?
> > 
> > My reply above is unambiguous.  I quoted your array data and gave an
> > answer that applies to your provided array data.
> > 
> > WRT your other questions, I do not have time to research the answer to
> > those, and wouldn't spend it on that if I did.  I have never used CentOS
> > nor Fedora, and don't plan to ever use either.  For those answers,
> > either wait for someone to answer, or research it yourself.
> > 
> > Cheers,
> > 
> > Stan
> 
> Thank you Stan!
> 
> Now I'm going to try remove the write-intent bitmap and bad block log in the
> RAID 6 array to reshape two arrays to same size.

I'm not 100% certain, but I doubt that will have any effect. Once an array
is created its basic structural layout on the individual disks is set, and
from what I can tell it usually depends on which mdadm version was used to
create the array. I do recall that after updating my Debian (and therefore
mdadm) through various releases, the basic on-disk structure, such as the
data offset, also changed on new devices that were added to existing
arrays... hence why there is (was?) a special version of mdadm that allows
entering offsets for individual members in some recovery situations.

I think that even if, when creating a new array, you tell it to use no
write-intent bitmap and no bad block log (I'm not sure the bad block log can
be removed), it will still reserve the space in case you want to enable them
at another time.

If you want exactly the same space/size then you would need to create the
array with the original OS/mdadm versions previously used... but even so,
I'm not sure that disk/partition size plays a part in the initial layout (I
don't think it does, but I may be wrong), so creating a new array on, say,
3 GB disks as opposed to 1 GB disks "might" produce a different size/layout.

Personally I wouldn't worry about minor differences in overall size,
especially considering that the only time it would matter is if you were
copying one array to the other with a regular transfer (cp/rsync/etc.) and
the original array was 100% full while the second array was slightly
smaller; adjusting the filesystem "root" reserved space on ext3/4 on the new
array might mitigate that problem, as in the example below.
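
For example (hypothetical device name, and only applicable if the new array
actually carries ext3/4 rather than the XFS shown earlier):

tune2fs -m 1 /dev/md127    # shrink the ext3/4 reserved-blocks percentage to 1%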

Jon.



* Re: Array created by mdadm 3.2 & 3.3 have different array size, why?
  2014-03-26 21:14     ` Stan Hoeppner
  2014-03-27  2:44       ` Array created by mdadm 3.2 & 3.3 have different array size Tide
@ 2014-03-27 15:23       ` Bernd Schubert
  2014-03-27 20:39         ` Stan Hoeppner
  1 sibling, 1 reply; 13+ messages in thread
From: Bernd Schubert @ 2014-03-27 15:23 UTC (permalink / raw)
  To: stan, Tide, linux-raid

On 03/26/2014 10:14 PM, Stan Hoeppner wrote:
> On 3/26/2014 3:00 PM, Tide wrote:
> ...
>> =================
>> Array 2 (RAID 6):
>> =================
>> # mdadm --examine /dev/sdb1
> ...
>>       Raid Level : raid6
>>     Raid Devices : 5
> ...
>>    Bad Block Log : 512 entries available at offset 72 sectors
>
> The RAID6 array has sectors on each drive reserved for bad block
> reassignment.  The RAID5 array does not.
>
> This is the answer to your mystery.

Commits and code do not confirm this assumption.

>         __u16   bblog_size;     /* number of sectors reserved for badblocklist */

...

>                 printf("  Bad Block Log : %d entries available at offset %ld sectors",
>                        __le16_to_cpu(sb->bblog_size)*512/8,


So 512 bad-block-log entries only need 8 sectors, and that would still fit
into the data offset of 2048 sectors (array 1). The write-intent bitmap is
also not that big. But this commit log gives the correct answer:

> commit 508a7f16b242d6c3353e15aab46ac8ca8dc7cd08
> Author: NeilBrown <neilb@suse.de>
> Date:   Wed Apr 4 14:00:40 2012 +1000
>
>     super1: leave more space in front of data by default.
>
>     The kernel is growing the ability to avoid the need for a
>     backup file during reshape by being able to change the data offset.
>
>     For this to be useful we need plenty of free space before the
>     data so the data offset can be reduced.
>
>     So for v1.1 and v1.2 metadata make the default data_offset much
>     larger.  Aim for 128Meg, but keep a power of 2 and don't use more
>     than 0.1% of each device.
>
>     Don't change v1.0 as that is used when the data_offset is required to
>     be zero.



Bernd



* Re: Array created by mdadm 3.2 & 3.3 have different array size, why?
  2014-03-27 15:23       ` Array created by mdadm 3.2 & 3.3 have different array size, why? Bernd Schubert
@ 2014-03-27 20:39         ` Stan Hoeppner
  2014-03-28  2:44           ` Tide
  0 siblings, 1 reply; 13+ messages in thread
From: Stan Hoeppner @ 2014-03-27 20:39 UTC (permalink / raw)
  To: Bernd Schubert, Tide, linux-raid

On 3/27/2014 10:23 AM, Bernd Schubert wrote:
> On 03/26/2014 10:14 PM, Stan Hoeppner wrote:
>> On 3/26/2014 3:00 PM, Tide wrote:
>> ...
>>> =================
>>> Array 2 (RAID 6):
>>> =================
>>> # mdadm --examine /dev/sdb1
>> ...
>>>       Raid Level : raid6
>>>     Raid Devices : 5
>> ...
>>>    Bad Block Log : 512 entries available at offset 72 sectors
>>
>> The RAID6 array has sectors on each drive reserved for bad block
>> reassignment.  The RAID5 array does not.
>>
>> This is the answer to your mystery.
> 
> Commits and code do not confirm this assumption.
> 
>>         __u16   bblog_size;     /* number of sectors reserved for
>> badblocklist */
> 
> ...
> 
>>                 printf("  Bad Block Log : %d entries available at
>> offset %ld sectors",
>>                        __le16_to_cpu(sb->bblog_size)*512/8,
> 
> 
> So 512 bad-block-log entries only need 8 sectors and that would still
> fit into the data offset of 2048 bytes (array 1). The
> write-intent-bitmap is also not that big. But this commit log gives the
> correct answer
> 
>> commit 508a7f16b242d6c3353e15aab46ac8ca8dc7cd08
>> Author: NeilBrown <neilb@suse.de>
>> Date:   Wed Apr 4 14:00:40 2012 +1000
>>
>>     super1: leave more space in front of data by default.
>>
>>     The kernel is growing the ability to avoid the need for a
>>     backup file during reshape by being able to change the data offset.
>>
>>     For this to be useful we need plenty of free space before the
>>     data so the data offset can be reduced.
>>
>>     So for v1.1 and v1.2 metadata make the default data_offset much
>>     larger.  Aim for 128Meg, but keep a power of 2 and don't use more
>>     than 0.1% of each device.
>>
>>     Don't change v1.0 as that is used when the data_offset is required to
>>     be zero.

This is a good match because the discrepancy on his RAID6 is pretty
close to 128MB per drive.  However, both his RAID5 and RAID6 arrays are
metadata 1.2.  So this commit alone may not fully explain the capacity
difference he's seeing between RAID5 and RAID6.  Or is this commit RAID6
specific?  I don't see that in the comments above.

Cheers,

Stan


* Re: Array created by mdadm 3.2 & 3.3 have different array size, why?
  2014-03-27 20:39         ` Stan Hoeppner
@ 2014-03-28  2:44           ` Tide
  0 siblings, 0 replies; 13+ messages in thread
From: Tide @ 2014-03-28  2:44 UTC (permalink / raw)
  To: linux-raid

Stan Hoeppner <stan <at> hardwarefreak.com> writes:

> 
> On 3/27/2014 10:23 AM, Bernd Schubert wrote:
> ...
> >> commit 508a7f16b242d6c3353e15aab46ac8ca8dc7cd08
> >> Author: NeilBrown <neilb <at> suse.de>
> >> Date:   Wed Apr 4 14:00:40 2012 +1000
> >>
> >>     super1: leave more space in front of data by default.
> >>
> >>     The kernel is growing the ability to avoid the need for a
> >>     backup file during reshape by being able to change the data offset.
> >>
> >>     For this to be useful we need plenty of free space before the
> >>     data so the data offset can be reduced.
> >>
> >>     So for v1.1 and v1.2 metadata make the default data_offset much
> >>     larger.  Aim for 128Meg, but keep a power of 2 and don't use more
> >>     than 0.1% of each device.
> >>
> >>     Don't change v1.0 as that is used when the data_offset is required to
> >>     be zero.
> 
> This is a good match because the discrepancy on his RAID6 is pretty
> close to 128MB per drive.  However, both his RAID5 and RAID6 arrays are
> metadata 1.2.  So this commit alone may not fully explain the capacity
> difference he's seeing between RAID5 and RAID6.  Or is this commit RAID6
> specific?  I don't see that in the comments above.
> 
> Cheers,
> 
> Stan

Yes, it doesn't fully explain the difference. Here's another test with
mdadm v3.2.6 (which already has commit
508a7f16b242d6c3353e15aab46ac8ca8dc7cd08) on CentOS 6.5.

I created a RAID 5 array with five 1 GiB loop devices, and the data offset
of this new array is only 1024 sectors (0.5 MiB).

# MAKEDEV /dev/loop

# for hdd in {1..5}; do truncate -s 1G hdd5$hdd; losetup /dev/loop5$hdd
hdd5$hdd; done

# mdadm -C /dev/md5 -l5 -n4 -x1 /dev/loop5[1-5]

# mdadm --version
mdadm - v3.2.6 - 25th October 2012

# mdadm --examine /dev/loop51
/dev/loop51:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 5fb7e243:01760f20:c773f451:09928fa0
           Name : RecordFileServer:5  (local to host RecordFileServer)
  Creation Time : Fri Mar 28 10:14:36 2014
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 2096128 (1023.67 MiB 1073.22 MB)
     Array Size : 3142656 (3.00 GiB 3.22 GB)
  Used Dev Size : 2095104 (1023.17 MiB 1072.69 MB)
    Data Offset : 1024 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : d662c632:b2aafbac:406c84a6:ae519102

    Update Time : Fri Mar 28 10:15:07 2014
       Checksum : ab92e515 - correct
         Events : 20

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)

# mdadm -D /dev/md5
/dev/md5:
        Version : 1.2
  Creation Time : Fri Mar 28 10:14:36 2014
     Raid Level : raid5
     Array Size : 3142656 (3.00 GiB 3.22 GB)
  Used Dev Size : 1047552 (1023.17 MiB 1072.69 MB)
   Raid Devices : 4
  Total Devices : 5
    Persistence : Superblock is persistent

    Update Time : Fri Mar 28 10:14:56 2014
          State : clean 
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

           Name : RecordFileServer:5  (local to host RecordFileServer)
           UUID : 5fb7e243:01760f20:c773f451:09928fa0
         Events : 20

    Number   Major   Minor   RaidDevice State
       0       7       51        0      active sync   /dev/loop51
       1       7       52        1      active sync   /dev/loop52
       2       7       53        2      active sync   /dev/loop53
       5       7       54        3      active sync   /dev/loop54

       4       7       55        -      spare   /dev/loop55



