public inbox for linux-xfs@vger.kernel.org
* xfs hardware RAID alignment over linear lvm
@ 2013-09-25 12:56 Stewart Webb
  2013-09-25 21:18 ` Stan Hoeppner
  0 siblings, 1 reply; 17+ messages in thread
From: Stewart Webb @ 2013-09-25 12:56 UTC (permalink / raw)
  To: xfs


Hi All,

I am trying to do the following:
3 x Hardware RAID Cards each with a raid 6 volume of 12 disks presented to
the OS
all raid units have a "stripe size" of 512 KB

so given the info on the xfs.org wiki - I should give each filesystem a
sunit of 512 KB and a swidth of 10 (because RAID 6 has 2 parity disks)

all well and good

But - I would like to use Linear LVM to bring all 3 cards into 1 logical
volume -
here is where my question crops up:
Does this affect how I need to align the filesystem?

Regards

-- 
Stewart Webb


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs hardware RAID alignment over linear lvm
  2013-09-25 12:56 xfs hardware RAID alignment over linear lvm Stewart Webb
@ 2013-09-25 21:18 ` Stan Hoeppner
  2013-09-25 21:34   ` Chris Murphy
  0 siblings, 1 reply; 17+ messages in thread
From: Stan Hoeppner @ 2013-09-25 21:18 UTC (permalink / raw)
  To: Stewart Webb; +Cc: xfs

On 9/25/2013 7:56 AM, Stewart Webb wrote:
> Hi All,

Hi Stewart,

> I am trying to do the following:
> 3 x Hardware RAID Cards each with a raid 6 volume of 12 disks presented to
> the OS
> all raid units have a "stripe size" of 512 KB

Just for future reference so you're using correct terminology, a value
of 512KB is surely your XFS su value, also called a "strip" in LSI
terminology, or a "chunk" in Linux software md/RAID terminology.  This
is the amount of data written to each data spindle (excluding parity) in
the array.

"Stripe size" is a synonym of XFS sw, which is su * #disks.  This is the
amount of data written across the full RAID stripe (excluding parity).

> so given the info on the xfs.org wiki - I should give each filesystem a
> sunit of 512 KB and a swidth of 10 (because RAID 6 has 2 parity disks)

Partially correct.  If you format each /dev/[device] presented by the
RAID controller with an XFS filesystem, 3 filesystems total, then your
values above are correct.  EXCEPT you must use the su/sw parameters in
mkfs.xfs if using BYTE values.  See mkfs.xfs(8)
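
For a concrete illustration of that arithmetic, here is a hypothetical sketch for one 12-disk RAID 6 LUN with 512 KB strips; the device name is only a placeholder:

```shell
# su/sw arithmetic for one 12-disk RAID 6 array with 512 KB strips.
su_kb=512                      # per-disk strip ("chunk") size, in KB
disks=12                       # total disks in the array
data_disks=$((disks - 2))      # RAID 6 reserves 2 disks' worth for parity
stripe_kb=$((su_kb * data_disks))
echo "su=${su_kb}k sw=${data_disks} (full stripe width = ${stripe_kb} KB)"
# The mkfs.xfs call this implies (device name is illustrative only):
echo "mkfs.xfs -d su=${su_kb}k,sw=${data_disks} /dev/sdX"
```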

> all well and good
> 
> But - I would like to use Linear LVM to bring all 3 cards into 1 logical
> volume -
> here is where my question crops up:
> Does this affect how I need to align the filesystem?

In the case of a concatenation, which is what LVM linear is, you should
use an XFS alignment identical to that for a single array as above.
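
A sketch of the whole stack as described, assuming three identical 12-disk RAID 6 LUNs; all device, VG, and LV names are invented for illustration, and the commands are only echoed here, not executed:

```shell
su=512k; sw=10                          # per the 12-disk RAID 6 geometry above
luns="/dev/sda /dev/sdb /dev/sdc"       # hypothetical LUNs from the 3 cards
echo "pvcreate $luns"
echo "vgcreate vg_big $luns"
echo "lvcreate -n lv_big -l 100%FREE vg_big   # linear (concat) is LVM's default"
echo "mkfs.xfs -d su=$su,sw=$sw /dev/vg_big/lv_big"
```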

-- 
Stan



* Re: xfs hardware RAID alignment over linear lvm
  2013-09-25 21:18 ` Stan Hoeppner
@ 2013-09-25 21:34   ` Chris Murphy
  2013-09-25 21:48     ` Stan Hoeppner
  2013-09-25 21:57     ` Dave Chinner
  0 siblings, 2 replies; 17+ messages in thread
From: Chris Murphy @ 2013-09-25 21:34 UTC (permalink / raw)
  To: xfs@oss.sgi.com


On Sep 25, 2013, at 3:18 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:

> On 9/25/2013 7:56 AM, Stewart Webb wrote:
>> Hi All,
> 
> Hi Stewart,
> 
>> I am trying to do the following:
>> 3 x Hardware RAID Cards each with a raid 6 volume of 12 disks presented to
>> the OS
>> all raid units have a "stripe size" of 512 KB
> 
> Just for future reference so you're using correct terminology, a value
> of 512KB is surely your XFS su value, also called a "strip" in LSI
> terminology, or a "chunk" in Linux software md/RAID terminology.  This
> is the amount of data written to each data spindle (excluding parity) in
> the array.
> 
> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is the
> amount of data written across the full RAID stripe (excluding parity).
> 
>> so given the info on the xfs.org wiki - I should give each filesystem a
>> sunit of 512 KB and a swidth of 10 (because RAID 6 has 2 parity disks)
> 
> Partially correct.  If you format each /dev/[device] presented by the
> RAID controller with an XFS filesystem, 3 filesystems total, then your
> values above are correct.  EXCEPT you must use the su/sw parameters in
> mkfs.xfs if using BYTE values.  See mkfs.xfs(8)
> 
>> all well and good
>> 
>> But - I would like to use Linear LVM to bring all 3 cards into 1 logical
>> volume -
>> here is where my question crops up:
>> Does this affect how I need to align the filesystem?
> 
> In the case of a concatenation, which is what LVM linear is, you should
> use an XFS alignment identical to that for a single array as above.

So keeping the example, 3 arrays x 10 data disks, would this be su=512k and sw=30?


Chris Murphy


* Re: xfs hardware RAID alignment over linear lvm
  2013-09-25 21:34   ` Chris Murphy
@ 2013-09-25 21:48     ` Stan Hoeppner
  2013-09-25 21:53       ` Chris Murphy
  2013-09-25 21:57     ` Dave Chinner
  1 sibling, 1 reply; 17+ messages in thread
From: Stan Hoeppner @ 2013-09-25 21:48 UTC (permalink / raw)
  To: Chris Murphy; +Cc: xfs@oss.sgi.com

On 9/25/2013 4:34 PM, Chris Murphy wrote:
> 
> On Sep 25, 2013, at 3:18 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
> 
>> On 9/25/2013 7:56 AM, Stewart Webb wrote:
>>> Hi All,
>>
>> Hi Stewart,
>>
>>> I am trying to do the following:
>>> 3 x Hardware RAID Cards each with a raid 6 volume of 12 disks presented to
>>> the OS
>>> all raid units have a "stripe size" of 512 KB
>>
>> Just for future reference so you're using correct terminology, a value
>> of 512KB is surely your XFS su value, also called a "strip" in LSI
>> terminology, or a "chunk" in Linux software md/RAID terminology.  This
>> is the amount of data written to each data spindle (excluding parity) in
>> the array.
>>
>> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is the
>> amount of data written across the full RAID stripe (excluding parity).
>>
>>> so given the info on the xfs.org wiki - I should give each filesystem a
>>> sunit of 512 KB and a swidth of 10 (because RAID 6 has 2 parity disks)
>>
>> Partially correct.  If you format each /dev/[device] presented by the
>> RAID controller with an XFS filesystem, 3 filesystems total, then your
>> values above are correct.  EXCEPT you must use the su/sw parameters in
>> mkfs.xfs if using BYTE values.  See mkfs.xfs(8)

Small correction:  su is a byte value.  sw is an integer representing
the number of data spindles.

>>> all well and good
>>>
>>> But - I would like to use Linear LVM to bring all 3 cards into 1 logical
>>> volume -
>>> here is where my question crops up:
>>> Does this affect how I need to align the filesystem?
>>
>> In the case of a concatenation, which is what LVM linear is, you should
>> use an XFS alignment identical to that for a single array as above.
> 
> So keeping the example, 3 arrays x 10 data disks, would this be su=512k and sw=30?

No.  In this configuration, as far as XFS is concerned LVM doesn't exist
in the stack because it doesn't change the RAID geometry, so you ignore it.

-- 
Stan




* Re: xfs hardware RAID alignment over linear lvm
  2013-09-25 21:48     ` Stan Hoeppner
@ 2013-09-25 21:53       ` Chris Murphy
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Murphy @ 2013-09-25 21:53 UTC (permalink / raw)
  To: xfs@oss.sgi.com


On Sep 25, 2013, at 3:48 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
>> 
>> So keeping the example, 3 arrays x 10 data disks, would this be su=512k and sw=30?
> 
> No.  In this configuration, as far as XFS is concerned LVM doesn't exist
> in the stack because it doesn't change the RAID geometry, so you ignore it.

OK and if this were md linear where the file system definitely would be created across all disks?


Chris Murphy


* Re: xfs hardware RAID alignment over linear lvm
  2013-09-25 21:34   ` Chris Murphy
  2013-09-25 21:48     ` Stan Hoeppner
@ 2013-09-25 21:57     ` Dave Chinner
  2013-09-26  8:44       ` Stan Hoeppner
  2013-09-26  8:55       ` Stewart Webb
  1 sibling, 2 replies; 17+ messages in thread
From: Dave Chinner @ 2013-09-25 21:57 UTC (permalink / raw)
  To: Chris Murphy; +Cc: xfs@oss.sgi.com

On Wed, Sep 25, 2013 at 03:34:01PM -0600, Chris Murphy wrote:
> 
> On Sep 25, 2013, at 3:18 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
> 
> > On 9/25/2013 7:56 AM, Stewart Webb wrote:
> >> Hi All,
> > 
> > Hi Stewart,
> > 
> >> I am trying to do the following:
> >> 3 x Hardware RAID Cards each with a raid 6 volume of 12 disks presented to
> >> the OS
> >> all raid units have a "stripe size" of 512 KB
> > 
> > Just for future reference so you're using correct terminology, a value
> > of 512KB is surely your XFS su value, also called a "strip" in LSI
> > terminology, or a "chunk" in Linux software md/RAID terminology.  This
> > is the amount of data written to each data spindle (excluding parity) in
> > the array.
> > 
> > "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is the
> > amount of data written across the full RAID stripe (excluding parity).
> > 
> > >> so given the info on the xfs.org wiki - I should give each filesystem
> >> sunit of 512 KB and a swidth of 10 (because RAID 6 has 2 parity disks)
> > 
> > Partially correct.  If you format each /dev/[device] presented by the
> > RAID controller with an XFS filesystem, 3 filesystems total, then your
> > values above are correct.  EXCEPT you must use the su/sw parameters in
> > mkfs.xfs if using BYTE values.  See mkfs.xfs(8)
> > 
> >> all well and good
> >> 
> >> But - I would like to use Linear LVM to bring all 3 cards into 1 logical
> >> volume -
> >> here is where my question crops up:
> > >> Does this affect how I need to align the filesystem?
> > 
> > In the case of a concatenation, which is what LVM linear is, you should
> > use an XFS alignment identical to that for a single array as above.
                                                 ^^^^^^
> So keeping the example, 3 arrays x 10 data disks, would this be su=512k and sw=30?

No, the alignment should match that of a *single* 10 disk array,
so su=512k,sw=10.

Linear concatenation looks like this:

offset		volume				array
0		+-D1-+-D2-+.....+-Dn-+		0	# first sw
.....
X-sw		+-D1-+-D2-+.....+-Dn-+		0
X		+-E1-+-E2-+.....+-En-+		1	# first sw
.....
2X-sw		+-E1-+-E2-+.....+-En-+		1
2X		+-F1-+-F2-+.....+-Fn-+		2	# first sw
.....
3X-sw		+-F1-+-F2-+.....+-Fn-+		2

Where:
	D1...Dn are the disks in the first array
	E1...En are the disks in the second array
	F1...Fn are the disks in the third array
	X is the size of each array
	sw = su * number of data disks in array

As you can see, all the volumes are arranged in a single column -
identical to a larger single array of the same size.  Hence the
exposed alignment of a single array is what the filesystem should be
aligned to, as that is how the linear concat behaves.

You also might note here that if you want the second and subsequent
arrays to be correctly aligned to the initial array in the linear
concat (and you do want that), the arrays must be sized to be an
exact multiple of the stripe width.
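
That last condition is easy to check mechanically. A small sketch with made-up sizes:

```shell
# Each array must be an exact multiple of the stripe width (su * data disks)
# or the second and later arrays in the concat start misaligned.
su_kb=512
data_disks=10
stripe_kb=$((su_kb * data_disks))        # 5120 KB stripe width
array_kb=$((stripe_kb * 2000000))        # hypothetical array size, chosen aligned
if [ $((array_kb % stripe_kb)) -eq 0 ]; then
    echo "array size is an exact multiple of the stripe width"
else
    echo "array size is NOT stripe-width aligned"
fi
```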

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: xfs hardware RAID alignment over linear lvm
  2013-09-25 21:57     ` Dave Chinner
@ 2013-09-26  8:44       ` Stan Hoeppner
  2013-09-26  8:55       ` Stewart Webb
  1 sibling, 0 replies; 17+ messages in thread
From: Stan Hoeppner @ 2013-09-26  8:44 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Chris Murphy, xfs@oss.sgi.com

On 9/25/2013 4:57 PM, Dave Chinner wrote:
...
> Linear concatenation looks like this:
> 
> offset		volume				array
> 0		+-D1-+-D2-+.....+-Dn-+		0	# first sw
> .....
> X-sw		+-D1-+-D2-+.....+-Dn-+		0
> X		+-E1-+-E2-+.....+-En-+		1	# first sw
> .....
> 2X-sw		+-E1-+-E2-+.....+-En-+		1
> 2X		+-F1-+-F2-+.....+-Fn-+		2	# first sw
> .....
> 3X-sw		+-F1-+-F2-+.....+-Fn-+		2
> 
> Where:
> 	D1...Dn are the disks in the first array
> 	E1...En are the disks in the second array
> 	F1...Fn are the disks in the third array
> 	X is the size of each array
> 	sw = su * number of data disks in array
> 
> As you can see, all the volumes are arranged in a single column -
> identical to a larger single array of the same size.  Hence the
> exposed alignment of a single array is what the filesystem should be
> aligned to, as that is how the linear concat behaves.
> 
> You also might note here that if you want the second and subsequent
> arrays to be correctly aligned to the initial array in the linear
> concat (and you do want that), the arrays must be sized to be an
> exact multiple of the stripe width.

On a similar note, if I do a concat like this I specify agsize/agcount
during mkfs.xfs so no AGs straddle array boundaries.  I do this to keep
per AG throughput consistent, among other concerns.  This may or may not
be of benefit to the OP.  mkfs.xfs using defaults is not aware of the
array boundaries within the concat, so it may well create AGs across
array boundaries.
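
As a sketch of that AG sizing (all sizes invented): pick an agsize that divides each array evenly, so every AG falls wholly inside one array, then derive agcount from it:

```shell
# AG sizing so no allocation group straddles an array boundary.
array_gb=8000                 # hypothetical size of each identical array
arrays=3
ags_per_array=16              # chosen so each AG divides the array evenly
agsize_gb=$((array_gb / ags_per_array))   # 500 GB per AG (under XFS's 1 TB cap)
agcount=$((arrays * ags_per_array))       # 48 AGs across the whole concat
echo "mkfs.xfs -d su=512k,sw=10,agcount=${agcount} /dev/vg_big/lv_big"
```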

-- 
Stan




* Re: xfs hardware RAID alignment over linear lvm
  2013-09-25 21:57     ` Dave Chinner
  2013-09-26  8:44       ` Stan Hoeppner
@ 2013-09-26  8:55       ` Stewart Webb
  2013-09-26  9:22         ` Stan Hoeppner
  1 sibling, 1 reply; 17+ messages in thread
From: Stewart Webb @ 2013-09-26  8:55 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Chris Murphy, xfs@oss.sgi.com


Thanks for all this info Stan and Dave,

> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is the
> amount of data written across the full RAID stripe (excluding parity).

The reason I stated Stripe size is because in this instance, I have 3ware
RAID controllers, which refer to
this value as "Stripe" in their tw_cli software (god bless manufacturers
renaming everything)

I do, however, have a follow-on question:
On other systems, I have similar hardware:
3x Raid Controllers
1 of them has 10 disks as RAID 6 that I would like to add to a logical
volume
2 of them have 12 disks as a RAID 6 that I would like to add to the same
logical volume

All have the same "Stripe" or "Strip Size" of 512 KB

So if I were going to make 3 separate xfs volumes, I would do the
following:
mkfs.xfs -d su=512k sw=8 /dev/sda
mkfs.xfs -d su=512k sw=10 /dev/sdb
mkfs.xfs -d su=512k sw=10 /dev/sdc

I assume, if I were going to bring them all into 1 logical volume, it
would be best placed to have the sw value set
to a value that is divisible by both 8 and 10 - in this case 2?

Obviously, this is not an ideal situation, and I will most likely modify
the hardware to better suit.
But I'd really like to fully understand this.

Thanks for any insight you are able to give

Regards


On 25 September 2013 22:57, Dave Chinner <david@fromorbit.com> wrote:

> On Wed, Sep 25, 2013 at 03:34:01PM -0600, Chris Murphy wrote:
> >
> > On Sep 25, 2013, at 3:18 PM, Stan Hoeppner <stan@hardwarefreak.com>
> wrote:
> >
> > > On 9/25/2013 7:56 AM, Stewart Webb wrote:
> > >> Hi All,
> > >
> > > Hi Stewart,
> > >
> > >> I am trying to do the following:
> > >> 3 x Hardware RAID Cards each with a raid 6 volume of 12 disks
> presented to
> > >> the OS
> > >> all raid units have a "stripe size" of 512 KB
> > >
> > > Just for future reference so you're using correct terminology, a value
> > > of 512KB is surely your XFS su value, also called a "strip" in LSI
> > > terminology, or a "chunk" in Linux software md/RAID terminology.  This
> > > is the amount of data written to each data spindle (excluding parity)
> in
> > > the array.
> > >
> > > "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is
> the
> > > amount of data written across the full RAID stripe (excluding parity).
> > >
> > >> so given the info on the xfs.org wiki - I should give each filesystem
> a
> > >> sunit of 512 KB and a swidth of 10 (because RAID 6 has 2 parity disks)
> > >
> > > Partially correct.  If you format each /dev/[device] presented by the
> > > RAID controller with an XFS filesystem, 3 filesystems total, then your
> > > values above are correct.  EXCEPT you must use the su/sw parameters in
> > > mkfs.xfs if using BYTE values.  See mkfs.xfs(8)
> > >
> > >> all well and good
> > >>
> > >> But - I would like to use Linear LVM to bring all 3 cards into 1
> logical
> > >> volume -
> > >> here is where my question crops up:
> > >> Does this affect how I need to align the filesystem?
> > >
> > > In the case of a concatenation, which is what LVM linear is, you should
> > > use an XFS alignment identical to that for a single array as above.
>                                                  ^^^^^^
> > So keeping the example, 3 arrays x 10 data disks, would this be su=512k
> and sw=30?
>
> No, the alignment should match that of a *single* 10 disk array,
> so su=512k,sw=10.
>
> Linear concatenation looks like this:
>
> offset          volume                          array
> 0               +-D1-+-D2-+.....+-Dn-+          0       # first sw
> .....
> X-sw            +-D1-+-D2-+.....+-Dn-+          0
> X               +-E1-+-E2-+.....+-En-+          1       # first sw
> .....
> 2X-sw           +-E1-+-E2-+.....+-En-+          1
> 2X              +-F1-+-F2-+.....+-Fn-+          2       # first sw
> .....
> 3X-sw           +-F1-+-F2-+.....+-Fn-+          2
>
> Where:
>         D1...Dn are the disks in the first array
>         E1...En are the disks in the second array
>         F1...Fn are the disks in the third array
>         X is the size of each array
>         sw = su * number of data disks in array
>
> As you can see, all the volumes are arranged in a single column -
> identical to a larger single array of the same size.  Hence the
> exposed alignment of a single array is what the filesystem should be
> aligned to, as that is how the linear concat behaves.
>
> You also might note here that if you want the second and subsequent
> arrays to be correctly aligned to the initial array in the linear
> concat (and you do want that), the arrays must be sized to be an
> exact multiple of the stripe width.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>



-- 
Stewart Webb




* Re: xfs hardware RAID alignment over linear lvm
  2013-09-26  8:55       ` Stewart Webb
@ 2013-09-26  9:22         ` Stan Hoeppner
  2013-09-26  9:28           ` Stewart Webb
  2013-09-26 21:58           ` Dave Chinner
  0 siblings, 2 replies; 17+ messages in thread
From: Stan Hoeppner @ 2013-09-26  9:22 UTC (permalink / raw)
  To: Stewart Webb; +Cc: Chris Murphy, xfs@oss.sgi.com

On 9/26/2013 3:55 AM, Stewart Webb wrote:
> Thanks for all this info Stan and Dave,
> 
>> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is the
>> amount of data written across the full RAID stripe (excluding parity).
> 
> The reason I stated Stripe size is because in this instance, I have 3ware
> RAID controllers, which refer to
> this value as "Stripe" in their tw_cli software (god bless manufacturers
> renaming everything)
> 
> I do, however, have a follow-on question:
> On other systems, I have similar hardware:
> 3x Raid Controllers
> 1 of them has 10 disks as RAID 6 that I would like to add to a logical
> volume
> 2 of them have 12 disks as a RAID 6 that I would like to add to the same
> logical volume
> 
> All have the same "Stripe" or "Strip Size" of 512 KB
> 
> So if I were going to make 3 separate xfs volumes, I would do the
> following:
> mkfs.xfs -d su=512k sw=8 /dev/sda
> mkfs.xfs -d su=512k sw=10 /dev/sdb
> mkfs.xfs -d su=512k sw=10 /dev/sdc
> 
> I assume, if I were going to bring them all into 1 logical volume, it
> would be best placed to have the sw value set
> to a value that is divisible by both 8 and 10 - in this case 2?

No.  In this case you do NOT stripe align XFS to the storage, because
it's impossible--the RAID stripes are dissimilar.  In this case you use
the default 4KB write out, as if this is a single disk drive.

As Dave stated, if you format a concatenated device with XFS and you
desire to align XFS, then all constituent arrays must have the same
geometry.
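
A trivial mechanical check of that precondition, using the data-disk counts from the three mkfs commands above:

```shell
# Alignment across a concat only makes sense if every member array has
# the same geometry; here the widths come from the example's three arrays.
widths="8 10 10"
first=${widths%% *}           # first value in the list
same=yes
for w in $widths; do
    [ "$w" = "$first" ] || same=no
done
echo "uniform geometry: $same"   # prints "uniform geometry: no"
```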

Three things to be aware of here:

1.  With a decent hardware write caching RAID controller, having XFS
aligned to the RAID geometry is a small optimization WRT overall write
performance, because the controller is going to be doing the optimizing
of final writeback to the drives.

2. Alignment does not affect read performance.

3.  XFS only performs aligned writes during allocation.  I.e. this only
occurs when creating a new file, new inode, etc.  For append and
modify-in-place operations, there is no write alignment.  So again,
stripe alignment to the hardware geometry is merely an optimization, and
only affects some types of writes.

What really makes a difference as to whether alignment will be of
benefit to you, and how often, is your workload.  So at this point, you
need to describe the primary workload(s) of your systems we're discussing.

-- 
Stan



* Re: xfs hardware RAID alignment over linear lvm
  2013-09-26  9:22         ` Stan Hoeppner
@ 2013-09-26  9:28           ` Stewart Webb
  2013-09-26 21:58           ` Dave Chinner
  1 sibling, 0 replies; 17+ messages in thread
From: Stewart Webb @ 2013-09-26  9:28 UTC (permalink / raw)
  To: stan; +Cc: Chris Murphy, xfs@oss.sgi.com


Understood,

My workload is primarily reads (about 80%+ read operations) - so defaults
will most likely
be best suited on this occasion.

I was simply trying to follow the guidelines on the XFS wiki to
the best of my ability, and felt I didn't understand the impact of using
this via LVM.

Now I feel I understand enough to continue in what I need to do.

Thanks again


On 26 September 2013 10:22, Stan Hoeppner <stan@hardwarefreak.com> wrote:

> On 9/26/2013 3:55 AM, Stewart Webb wrote:
> > Thanks for all this info Stan and Dave,
> >
> >> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is the
> >> amount of data written across the full RAID stripe (excluding parity).
> >
> > The reason I stated Stripe size is because in this instance, I have 3ware
> > RAID controllers, which refer to
> > this value as "Stripe" in their tw_cli software (god bless manufacturers
> > renaming everything)
> >
> > I do, however, have a follow-on question:
> > On other systems, I have similar hardware:
> > 3x Raid Controllers
> > 1 of them has 10 disks as RAID 6 that I would like to add to a logical
> > volume
> > 2 of them have 12 disks as a RAID 6 that I would like to add to the same
> > logical volume
> >
> > All have the same "Stripe" or "Strip Size" of 512 KB
> >
> > So if I were going to make 3 separate xfs volumes, I would do the
> > following:
> > mkfs.xfs -d su=512k sw=8 /dev/sda
> > mkfs.xfs -d su=512k sw=10 /dev/sdb
> > mkfs.xfs -d su=512k sw=10 /dev/sdc
> >
> > I assume, if I were going to bring them all into 1 logical volume, it
> > would be best placed to have the sw value set
> > to a value that is divisible by both 8 and 10 - in this case 2?
>
> No.  In this case you do NOT stripe align XFS to the storage, because
> it's impossible--the RAID stripes are dissimilar.  In this case you use
> the default 4KB write out, as if this is a single disk drive.
>
> As Dave stated, if you format a concatenated device with XFS and you
> desire to align XFS, then all constituent arrays must have the same
> geometry.
>
> Three things to be aware of here:
>
> 1.  With a decent hardware write caching RAID controller, having XFS
> aligned to the RAID geometry is a small optimization WRT overall write
> performance, because the controller is going to be doing the optimizing
> of final writeback to the drives.
>
> 2. Alignment does not affect read performance.
>
> 3.  XFS only performs aligned writes during allocation.  I.e. this only
> occurs when creating a new file, new inode, etc.  For append and
> modify-in-place operations, there is no write alignment.  So again,
> stripe alignment to the hardware geometry is merely an optimization, and
> only affects some types of writes.
>
> What really makes a difference as to whether alignment will be of
> benefit to you, and how often, is your workload.  So at this point, you
> need to describe the primary workload(s) of your systems we're discussing.
>
> --
> Stan
>
>


-- 
Stewart Webb




* Re: xfs hardware RAID alignment over linear lvm
  2013-09-26  9:22         ` Stan Hoeppner
  2013-09-26  9:28           ` Stewart Webb
@ 2013-09-26 21:58           ` Dave Chinner
  2013-09-27  1:10             ` Stan Hoeppner
  1 sibling, 1 reply; 17+ messages in thread
From: Dave Chinner @ 2013-09-26 21:58 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: Stewart Webb, Chris Murphy, xfs@oss.sgi.com

On Thu, Sep 26, 2013 at 04:22:30AM -0500, Stan Hoeppner wrote:
> On 9/26/2013 3:55 AM, Stewart Webb wrote:
> > Thanks for all this info Stan and Dave,
> > 
> >> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is the
> >> amount of data written across the full RAID stripe (excluding parity).
> > 
> > The reason I stated Stripe size is because in this instance, I have 3ware
> > RAID controllers, which refer to
> > this value as "Stripe" in their tw_cli software (god bless manufacturers
> > renaming everything)
> > 
> > I do, however, have a follow-on question:
> > On other systems, I have similar hardware:
> > 3x Raid Controllers
> > 1 of them has 10 disks as RAID 6 that I would like to add to a logical
> > volume
> > 2 of them have 12 disks as a RAID 6 that I would like to add to the same
> > logical volume
> > 
> > All have the same "Stripe" or "Strip Size" of 512 KB
> > 
> > So if I were going to make 3 separate xfs volumes, I would do the
> > following:
> > mkfs.xfs -d su=512k sw=8 /dev/sda
> > mkfs.xfs -d su=512k sw=10 /dev/sdb
> > mkfs.xfs -d su=512k sw=10 /dev/sdc
> > 
> > I assume, if I were going to bring them all into 1 logical volume, it
> > would be best placed to have the sw value set
> > to a value that is divisible by both 8 and 10 - in this case 2?
> 
> No.  In this case you do NOT stripe align XFS to the storage, because
> it's impossible--the RAID stripes are dissimilar.  In this case you use
> the default 4KB write out, as if this is a single disk drive.
> 
> As Dave stated, if you format a concatenated device with XFS and you
> desire to align XFS, then all constituent arrays must have the same
> geometry.
> 
> > Three things to be aware of here:
> 
> 1.  With a decent hardware write caching RAID controller, having XFS
> > aligned to the RAID geometry is a small optimization WRT overall write
> performance, because the controller is going to be doing the optimizing
> of final writeback to the drives.
> 
> 2. Alignment does not affect read performance.

Ah, but it does...

> 3.  XFS only performs aligned writes during allocation.

Right, and it does so not only to improve write performance, but to
also maximise sequential read performance of the data that is
written, especially when multiple files are being read
simultaneously and IO latency is important to keep low (e.g.
realtime video ingest and playout).

> What really makes a difference as to whether alignment will be of
> benefit to you, and how often, is your workload.  So at this point, you
> need to describe the primary workload(s) of your systems we're discussing.

Yup, my thoughts exactly...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: xfs hardware RAID alignment over linear lvm
  2013-09-26 21:58           ` Dave Chinner
@ 2013-09-27  1:10             ` Stan Hoeppner
  2013-09-27 12:23               ` Stewart Webb
  0 siblings, 1 reply; 17+ messages in thread
From: Stan Hoeppner @ 2013-09-27  1:10 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Stewart Webb, Chris Murphy, xfs@oss.sgi.com

On 9/26/2013 4:58 PM, Dave Chinner wrote:
> On Thu, Sep 26, 2013 at 04:22:30AM -0500, Stan Hoeppner wrote:
>> On 9/26/2013 3:55 AM, Stewart Webb wrote:
>>> Thanks for all this info Stan and Dave,
>>>
>>>> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is the
>>>> amount of data written across the full RAID stripe (excluding parity).
>>>
>>> The reason I stated Stripe size is because in this instance, I have 3ware
>>> RAID controllers, which refer to
>>> this value as "Stripe" in their tw_cli software (god bless manufacturers
>>> renaming everything)
>>>
>>> I do, however, have a follow-on question:
>>> On other systems, I have similar hardware:
>>> 3x Raid Controllers
>>> 1 of them has 10 disks as RAID 6 that I would like to add to a logical
>>> volume
>>> 2 of them have 12 disks as a RAID 6 that I would like to add to the same
>>> logical volume
>>>
>>> All have the same "Stripe" or "Strip Size" of 512 KB
>>>
>>> So if I were going to make 3 separate xfs volumes, I would do the
>>> following:
>>> mkfs.xfs -d su=512k sw=8 /dev/sda
>>> mkfs.xfs -d su=512k sw=10 /dev/sdb
>>> mkfs.xfs -d su=512k sw=10 /dev/sdc
>>>
>>> I assume, if I were going to bring them all into 1 logical volume, it
>>> would be best placed to have the sw value set
>>> to a value that is divisible by both 8 and 10 - in this case 2?
>>
>> No.  In this case you do NOT stripe align XFS to the storage, because
>> it's impossible--the RAID stripes are dissimilar.  In this case you use
>> the default 4KB write out, as if this is a single disk drive.
>>
>> As Dave stated, if you format a concatenated device with XFS and you
>> desire to align XFS, then all constituent arrays must have the same
>> geometry.
>>
>> Three things to be aware of here:
>>
>> 1.  With a decent hardware write caching RAID controller, having XFS
>> aligned to the RAID geometry is a small optimization WRT overall write
>> performance, because the controller is going to be doing the optimizing
>> of final writeback to the drives.
>>
>> 2. Alignment does not affect read performance.
> 
> Ah, but it does...
> 
>> 3.  XFS only performs aligned writes during allocation.
> 
> Right, and it does so not only to improve write performance, but to
> also maximise sequential read performance of the data that is
> written, especially when multiple files are being read
> simultaneously and IO latency is important to keep low (e.g.
> realtime video ingest and playout).

Absolutely correct, as Dave always is.  As my workloads are mostly
random, as are those of others I consult in other fora, I sometimes
forget the [multi]streaming case.  Which is not good, as many folks
choose XFS specifically for [multi]streaming workloads.  My remarks to
this audience should always reflect that.  Apologies for my oversight on
this occasion.

>> What really makes a difference as to whether alignment will be of
>> benefit to you, and how often, is your workload.  So at this point, you
>> need to describe the primary workload(s) of your systems we're discussing.
> 
> Yup, my thoughts exactly...
> 
> Cheers,
> 
> Dave.
> 

-- 
Stan


* Re: xfs hardware RAID alignment over linear lvm
  2013-09-27  1:10             ` Stan Hoeppner
@ 2013-09-27 12:23               ` Stewart Webb
  2013-09-27 13:09                 ` Stan Hoeppner
  0 siblings, 1 reply; 17+ messages in thread
From: Stewart Webb @ 2013-09-27 12:23 UTC (permalink / raw)
  To: stan; +Cc: Chris Murphy, xfs@oss.sgi.com



>Right, and it does so not only to improve write performance, but to
>also maximise sequential read performance of the data that is
>written, especially when multiple files are being read
>simultaneously and IO latency is important to keep low (e.g.
>realtime video ingest and playout).

So does this mean I should avoid mixing RAID devices with differing
numbers of spindles (or non-parity disks) if I would like to use linear
LVM concatenation? Or is there a best practice for when this cannot be
avoided?

Regards


On 27 September 2013 02:10, Stan Hoeppner <stan@hardwarefreak.com> wrote:

> On 9/26/2013 4:58 PM, Dave Chinner wrote:
> > On Thu, Sep 26, 2013 at 04:22:30AM -0500, Stan Hoeppner wrote:
> >> On 9/26/2013 3:55 AM, Stewart Webb wrote:
> >>> Thanks for all this info Stan and Dave,
> >>>
> >>>> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is
> the
> >>>> amount of data written across the full RAID stripe (excluding parity).
> >>>
> >>> The reason I stated Stripe size is because in this instance, I have
> 3ware
> >>> RAID controllers, which refer to
> >>> this value as "Stripe" in their tw_cli software (god bless
> manufacturers
> >>> renaming everything)
> >>>
> >>> I do, however, have a follow-on question:
> >>> On other systems, I have similar hardware:
> >>> 3x Raid Controllers
> >>> 1 of them has 10 disks as RAID 6 that I would like to add to a logical
> >>> volume
> >>> 2 of them have 12 disks as a RAID 6 that I would like to add to the
> same
> >>> logical volume
> >>>
> >>> All have the same "Stripe" or "Strip Size" of 512 KB
> >>>
> >>> So if I where going to make 3 seperate xfs volumes, I would do the
> >>> following:
> >>> mkfs.xfs -d su=512k sw=8 /dev/sda
> >>> mkfs.xfs -d su=512k sw=10 /dev/sdb
> >>> mkfs.xfs -d su=512k sw=10 /dev/sdc
> >>>
> >>> I assume, If I where going to bring them all into 1 logical volume, it
> >>> would be best placed to have the sw value set
> >>> to a value that is divisible by both 8 and 10 - in this case 2?
> >>
> >> No.  In this case you do NOT stripe align XFS to the storage, because
> >> it's impossible--the RAID stripes are dissimilar.  In this case you use
> >> the default 4KB write out, as if this is a single disk drive.
> >>
> >> As Dave stated, if you format a concatenated device with XFS and you
> >> desire to align XFS, then all constituent arrays must have the same
> >> geometry.
> >>
> >> Two things to be aware of here:
> >>
> >> 1.  With a decent hardware write caching RAID controller, having XFS
> >> alined to the RAID geometry is a small optimization WRT overall write
> >> performance, because the controller is going to be doing the optimizing
> >> of final writeback to the drives.
> >>
> >> 2. Alignment does not affect read performance.
> >
> > Ah, but it does...
> >
> >> 3.  XFS only performs aligned writes during allocation.
> >
> > Right, and it does so not only to improve write performance, but to
> > also maximise sequential read performance of the data that is
> > written, especially when multiple files are being read
> > simultaneously and IO latency is important to keep low (e.g.
> > realtime video ingest and playout).
>
> Absolutely correct, as Dave always is.  As my workloads are mostly
> random, as are those of others I consult in other fora, I sometimes
> forget the [multi]streaming case.  Which is not good, as many folks
> choose XFS specifically for [multi]streaming workloads.  My remarks to
> this audience should always reflect that.  Apologies for my oversight on
> this occasion.
>
> >> What really makes a difference as to whether alignment will be of
> >> benefit to you, and how often, is your workload.  So at this point, you
> >> need to describe the primary workload(s) of your systems we're
> discussing.
> >
> > Yup, my thoughts exactly...
> >
> > Cheers,
> >
> > Dave.
> >
>
> --
> Stan
>
>


-- 
Stewart Webb


* Re: xfs hardware RAID alignment over linear lvm
  2013-09-27 12:23               ` Stewart Webb
@ 2013-09-27 13:09                 ` Stan Hoeppner
  2013-09-27 13:29                   ` Stewart Webb
  0 siblings, 1 reply; 17+ messages in thread
From: Stan Hoeppner @ 2013-09-27 13:09 UTC (permalink / raw)
  To: Stewart Webb; +Cc: Chris Murphy, xfs@oss.sgi.com

On 9/27/2013 7:23 AM, Stewart Webb wrote:
>> Right, and it does so not only to improve write performance, but to
>> also maximise sequential read performance of the data that is
>> written, especially when multiple files are being read
>> simultaneously and IO latency is important to keep low (e.g.
>> realtime video ingest and playout).
> 
> So does this mean that I should avoid having devices in RAID with a
> differing amount of spindles (or non-parity disks)
> If I would like to use Linear concatenation LVM? Or is there a best
> practice if this instance is not
> avoidable?

Above, Dave was correcting my oversight, not necessarily informing you,
per se.  It seems clear from your follow-up question that you didn't
really grasp what he was saying.  Let's back up a little bit.

What you need to concentrate on right now is the following which we
stated previously in the thread, but which you did not reply to:

>>>> What really makes a difference as to whether alignment will be of
>>>> benefit to you, and how often, is your workload.  So at this point, you
>>>> need to describe the primary workload(s) of your systems we're
>> discussing.
>>>
>>> Yup, my thoughts exactly...

This means you need to describe in detail how you are writing your
files, and how you are reading them back.  I.e. what application are you
using, what does it do, etc.  You stated IIRC that your workload is 80%
read.  What types of files is it reading?  Small, large?  Is it reading
multiple files in parallel?  How are these files originally written
before being read?  Etc, etc.

You may not understand why this is relevant, but it is the only thing
that is relevant, at this point.  Spindles, RAID level, alignment, no
alignment...none of this matters if it doesn't match up with how your
application(s) do their IO.

Rule #1 of storage architecture:  Always build your storage stack (i.e.
disks, controller, driver, filesystem, etc) to fit the workload(s), not
the other way around.

> 
> On 27 September 2013 02:10, Stan Hoeppner <stan@hardwarefreak.com> wrote:
> 
>> On 9/26/2013 4:58 PM, Dave Chinner wrote:
>>> On Thu, Sep 26, 2013 at 04:22:30AM -0500, Stan Hoeppner wrote:
>>>> On 9/26/2013 3:55 AM, Stewart Webb wrote:
>>>>> Thanks for all this info Stan and Dave,
>>>>>
>>>>>> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is
>> the
>>>>>> amount of data written across the full RAID stripe (excluding parity).
>>>>>
>>>>> The reason I stated Stripe size is because in this instance, I have
>> 3ware
>>>>> RAID controllers, which refer to
>>>>> this value as "Stripe" in their tw_cli software (god bless
>> manufacturers
>>>>> renaming everything)
>>>>>
>>>>> I do, however, have a follow-on question:
>>>>> On other systems, I have similar hardware:
>>>>> 3x Raid Controllers
>>>>> 1 of them has 10 disks as RAID 6 that I would like to add to a logical
>>>>> volume
>>>>> 2 of them have 12 disks as a RAID 6 that I would like to add to the
>> same
>>>>> logical volume
>>>>>
>>>>> All have the same "Stripe" or "Strip Size" of 512 KB
>>>>>
>>>>> So if I where going to make 3 seperate xfs volumes, I would do the
>>>>> following:
>>>>> mkfs.xfs -d su=512k sw=8 /dev/sda
>>>>> mkfs.xfs -d su=512k sw=10 /dev/sdb
>>>>> mkfs.xfs -d su=512k sw=10 /dev/sdc
>>>>>
>>>>> I assume, If I where going to bring them all into 1 logical volume, it
>>>>> would be best placed to have the sw value set
>>>>> to a value that is divisible by both 8 and 10 - in this case 2?
>>>>
>>>> No.  In this case you do NOT stripe align XFS to the storage, because
>>>> it's impossible--the RAID stripes are dissimilar.  In this case you use
>>>> the default 4KB write out, as if this is a single disk drive.
>>>>
>>>> As Dave stated, if you format a concatenated device with XFS and you
>>>> desire to align XFS, then all constituent arrays must have the same
>>>> geometry.
>>>>
>>>> Two things to be aware of here:
>>>>
>>>> 1.  With a decent hardware write caching RAID controller, having XFS
>>>> alined to the RAID geometry is a small optimization WRT overall write
>>>> performance, because the controller is going to be doing the optimizing
>>>> of final writeback to the drives.
>>>>
>>>> 2. Alignment does not affect read performance.
>>>
>>> Ah, but it does...
>>>
>>>> 3.  XFS only performs aligned writes during allocation.
>>>
>>> Right, and it does so not only to improve write performance, but to
>>> also maximise sequential read performance of the data that is
>>> written, especially when multiple files are being read
>>> simultaneously and IO latency is important to keep low (e.g.
>>> realtime video ingest and playout).
>>
>> Absolutely correct, as Dave always is.  As my workloads are mostly
>> random, as are those of others I consult in other fora, I sometimes
>> forget the [multi]streaming case.  Which is not good, as many folks
>> choose XFS specifically for [multi]streaming workloads.  My remarks to
>> this audience should always reflect that.  Apologies for my oversight on
>> this occasion.
>>
>>>> What really makes a difference as to whether alignment will be of
>>>> benefit to you, and how often, is your workload.  So at this point, you
>>>> need to describe the primary workload(s) of your systems we're
>> discussing.
>>>
>>> Yup, my thoughts exactly...
>>>
>>> Cheers,
>>>
>>> Dave.
>>>
>>
>> --
>> Stan
>>
>>
> 
> 


* Re: xfs hardware RAID alignment over linear lvm
  2013-09-27 13:09                 ` Stan Hoeppner
@ 2013-09-27 13:29                   ` Stewart Webb
  2013-09-28 14:54                     ` Stan Hoeppner
  0 siblings, 1 reply; 17+ messages in thread
From: Stewart Webb @ 2013-09-27 13:29 UTC (permalink / raw)
  To: stan; +Cc: Chris Murphy, xfs@oss.sgi.com



Hi Stan,

Apologies for not directly answering -
I was aiming at filling gaps in my knowledge that I could not find in the
xfs.org wiki.

My workload for the storage is mainly reads of single large files
(ranging from 20 GB to 100 GB each).
These reads are mostly linear (video playback), though not always, as the
end user may jump to different points in the video.
Concurrent reads are required, estimated at 2 to 8; any more would be a
bonus.
The challenge is that the reads are "real-time" operations, interacted
with by a person, so each read must consistently have low latency and
sustain speeds of over 50Mb/s.

Disk write speeds are not *as* important for me, as these files are
copied into place before they are required (in this case using rsync or
scp), and those operations do not need to be as "real-time".


On 27 September 2013 14:09, Stan Hoeppner <stan@hardwarefreak.com> wrote:

> On 9/27/2013 7:23 AM, Stewart Webb wrote:
> >> Right, and it does so not only to improve write performance, but to
> >> also maximise sequential read performance of the data that is
> >> written, especially when multiple files are being read
> >> simultaneously and IO latency is important to keep low (e.g.
> >> realtime video ingest and playout).
> >
> > So does this mean that I should avoid having devices in RAID with a
> > differing amount of spindles (or non-parity disks)
> > If I would like to use Linear concatenation LVM? Or is there a best
> > practice if this instance is not
> > avoidable?
>
> Above, Dave was correcting my oversight, not necessarily informing you,
> per se.  It seems clear from your follow up question that you didn't
> really grasp what he was saying.  Let's back up a little bit.
>
> What you need to concentrate on right now is the following which we
> stated previously in the thread, but which you did not reply to:
>
> >>>> What really makes a difference as to whether alignment will be of
> >>>> benefit to you, and how often, is your workload.  So at this point,
> you
> >>>> need to describe the primary workload(s) of your systems we're
> >> discussing.
> >>>
> >>> Yup, my thoughts exactly...
>
> This means you need to describe in detail how you are writing your
> files, and how you are reading them back.  I.e. what application are you
> using, what does it do, etc.  You stated IIRC that your workload is 80%
> read.  What types of files is it reading?  Small, large?  Is it reading
> multiple files in parallel?  How are these files originally written
> before being read?  Etc, etc.
>
> You may not understand why this is relevant, but it is the only thing
> that is relevant, at this point.  Spindles, RAID level, alignment, no
> alignment...none of this matters if it doesn't match up with how your
> application(s) do their IO.
>
> Rule #1 of storage architecture:  Always build your storage stack (i.e.
> disks, controller, driver, filesystem, etc) to fit the workload(s), not
> the other way around.
>
> >
> > On 27 September 2013 02:10, Stan Hoeppner <stan@hardwarefreak.com>
> wrote:
> >
> >> On 9/26/2013 4:58 PM, Dave Chinner wrote:
> >>> On Thu, Sep 26, 2013 at 04:22:30AM -0500, Stan Hoeppner wrote:
> >>>> On 9/26/2013 3:55 AM, Stewart Webb wrote:
> >>>>> Thanks for all this info Stan and Dave,
> >>>>>
> >>>>>> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is
> >> the
> >>>>>> amount of data written across the full RAID stripe (excluding
> parity).
> >>>>>
> >>>>> The reason I stated Stripe size is because in this instance, I have
> >> 3ware
> >>>>> RAID controllers, which refer to
> >>>>> this value as "Stripe" in their tw_cli software (god bless
> >> manufacturers
> >>>>> renaming everything)
> >>>>>
> >>>>> I do, however, have a follow-on question:
> >>>>> On other systems, I have similar hardware:
> >>>>> 3x Raid Controllers
> >>>>> 1 of them has 10 disks as RAID 6 that I would like to add to a
> logical
> >>>>> volume
> >>>>> 2 of them have 12 disks as a RAID 6 that I would like to add to the
> >> same
> >>>>> logical volume
> >>>>>
> >>>>> All have the same "Stripe" or "Strip Size" of 512 KB
> >>>>>
> >>>>> So if I where going to make 3 seperate xfs volumes, I would do the
> >>>>> following:
> >>>>> mkfs.xfs -d su=512k sw=8 /dev/sda
> >>>>> mkfs.xfs -d su=512k sw=10 /dev/sdb
> >>>>> mkfs.xfs -d su=512k sw=10 /dev/sdc
> >>>>>
> >>>>> I assume, If I where going to bring them all into 1 logical volume,
> it
> >>>>> would be best placed to have the sw value set
> >>>>> to a value that is divisible by both 8 and 10 - in this case 2?
> >>>>
> >>>> No.  In this case you do NOT stripe align XFS to the storage, because
> >>>> it's impossible--the RAID stripes are dissimilar.  In this case you
> use
> >>>> the default 4KB write out, as if this is a single disk drive.
> >>>>
> >>>> As Dave stated, if you format a concatenated device with XFS and you
> >>>> desire to align XFS, then all constituent arrays must have the same
> >>>> geometry.
> >>>>
> >>>> Two things to be aware of here:
> >>>>
> >>>> 1.  With a decent hardware write caching RAID controller, having XFS
> >>>> alined to the RAID geometry is a small optimization WRT overall write
> >>>> performance, because the controller is going to be doing the
> optimizing
> >>>> of final writeback to the drives.
> >>>>
> >>>> 2. Alignment does not affect read performance.
> >>>
> >>> Ah, but it does...
> >>>
> >>>> 3.  XFS only performs aligned writes during allocation.
> >>>
> >>> Right, and it does so not only to improve write performance, but to
> >>> also maximise sequential read performance of the data that is
> >>> written, especially when multiple files are being read
> >>> simultaneously and IO latency is important to keep low (e.g.
> >>> realtime video ingest and playout).
> >>
> >> Absolutely correct, as Dave always is.  As my workloads are mostly
> >> random, as are those of others I consult in other fora, I sometimes
> >> forget the [multi]streaming case.  Which is not good, as many folks
> >> choose XFS specifically for [multi]streaming workloads.  My remarks to
> >> this audience should always reflect that.  Apologies for my oversight on
> >> this occasion.
> >>
> >>>> What really makes a difference as to whether alignment will be of
> >>>> benefit to you, and how often, is your workload.  So at this point,
> you
> >>>> need to describe the primary workload(s) of your systems we're
> >> discussing.
> >>>
> >>> Yup, my thoughts exactly...
> >>>
> >>> Cheers,
> >>>
> >>> Dave.
> >>>
> >>
> >> --
> >> Stan
> >>
> >>
> >
> >
>
>


-- 
Stewart Webb


* Re: xfs hardware RAID alignment over linear lvm
  2013-09-27 13:29                   ` Stewart Webb
@ 2013-09-28 14:54                     ` Stan Hoeppner
  2013-09-30  8:48                       ` Stewart Webb
  0 siblings, 1 reply; 17+ messages in thread
From: Stan Hoeppner @ 2013-09-28 14:54 UTC (permalink / raw)
  To: Stewart Webb; +Cc: Chris Murphy, xfs@oss.sgi.com

On 9/27/2013 8:29 AM, Stewart Webb wrote:
> Hi Stan,
> 
> Apologies for not directly answering -

No problem, sorry for the late reply.

> I was aiming at filling gaps in my knowledge that I could not find in the
> xfs.org wiki.

Hopefully this is occurring. :)

> My workload for the storage is mainly reads of single large files (ranging
> for 20GB to 100GB each)
> These reads are mainly linear (video playback, although not always as the
> end user may be jumping to different points in the video)
> There are concurrent reads required, estimated at 2 to 8, any more would be
> a bonus.

This is the type of workload Dave described previously that should
exhibit an increase in read performance if the files are written with
alignment, especially with concurrent readers, which you estimate at
2-8, maybe more.  How many "more" you can serve depends largely on
whether you're aligned: with alignment your odds of successfully serving
additional readers are much greater.

Thus, if you need to stitch arrays together with LVM concatenation,
you'd definitely benefit from making the geometry of all arrays
identical, and aligning the filesystem to that geometry.  I.e. same
number of disks, same RAID level, same stripe unit (data per non-parity
disk), and same stripe width (number of non-parity disks).
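For the identical-geometry case, the steps above can be sketched as a
short script.  This is a sketch only, not a tested recipe: the device
names (/dev/sda..sdc) and volume group name (vg0) are hypothetical, and
it assumes three identical 12-disk RAID6 arrays (10 data spindles each)
with a 512 KiB strip.  The LVM and mkfs commands are printed rather than
executed.  Note that mkfs.xfs takes su and sw comma-joined in a single
-d option:

```shell
#!/bin/sh
# Assumed geometry: three identical 12-disk RAID6 arrays, each with
# 10 data spindles and a 512 KiB per-spindle strip.
SU_KB=512   # strip per data spindle ("su")
SW=10       # number of data (non-parity) spindles ("sw")

# Full stripe per array = su * sw.
echo "full stripe: $((SU_KB * SW)) KiB"

# Linear (concatenated) LVM across the three arrays -- printed here,
# not executed (device/VG names are hypothetical):
echo "pvcreate /dev/sda /dev/sdb /dev/sdc"
echo "vgcreate vg0 /dev/sda /dev/sdb /dev/sdc"
echo "lvcreate -l 100%FREE -n media vg0"

# Align XFS to a single array's geometry (the concat does not change
# su/sw; every constituent array has the same geometry):
echo "mkfs.xfs -d su=${SU_KB}k,sw=${SW} /dev/vg0/media"
```

If the arrays cannot be made identical (e.g. the 8- and 10-data-disk mix
discussed earlier in the thread), skip su/sw entirely and accept the
mkfs.xfs defaults, as already noted above.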

> The challenge of this would be that the reads need to be "real-time"
> operations as they are interacted with by a person, and each
> read operation would have to consistently have a low latency and obtain
> speeds of over 50Mb/s
> 
> Disk write speeds are not *as* important for me - as they these files are
> copied to location before they are required (in this case
> using rsync or scp) and these operations do not require as much "real-time"
> interaction.
> 
> 
> On 27 September 2013 14:09, Stan Hoeppner <stan@hardwarefreak.com> wrote:
> 
>> On 9/27/2013 7:23 AM, Stewart Webb wrote:
>>>> Right, and it does so not only to improve write performance, but to
>>>> also maximise sequential read performance of the data that is
>>>> written, especially when multiple files are being read
>>>> simultaneously and IO latency is important to keep low (e.g.
>>>> realtime video ingest and playout).
>>>
>>> So does this mean that I should avoid having devices in RAID with a
>>> differing amount of spindles (or non-parity disks)
>>> If I would like to use Linear concatenation LVM? Or is there a best
>>> practice if this instance is not
>>> avoidable?
>>
>> Above, Dave was correcting my oversight, not necessarily informing you,
>> per se.  It seems clear from your follow up question that you didn't
>> really grasp what he was saying.  Let's back up a little bit.
>>
>> What you need to concentrate on right now is the following which we
>> stated previously in the thread, but which you did not reply to:
>>
>>>>>> What really makes a difference as to whether alignment will be of
>>>>>> benefit to you, and how often, is your workload.  So at this point,
>> you
>>>>>> need to describe the primary workload(s) of your systems we're
>>>> discussing.
>>>>>
>>>>> Yup, my thoughts exactly...
>>
>> This means you need to describe in detail how you are writing your
>> files, and how you are reading them back.  I.e. what application are you
>> using, what does it do, etc.  You stated IIRC that your workload is 80%
>> read.  What types of files is it reading?  Small, large?  Is it reading
>> multiple files in parallel?  How are these files originally written
>> before being read?  Etc, etc.
>>
>> You may not understand why this is relevant, but it is the only thing
>> that is relevant, at this point.  Spindles, RAID level, alignment, no
>> alignment...none of this matters if it doesn't match up with how your
>> application(s) do their IO.
>>
>> Rule #1 of storage architecture:  Always build your storage stack (i.e.
>> disks, controller, driver, filesystem, etc) to fit the workload(s), not
>> the other way around.
>>
>>>
>>> On 27 September 2013 02:10, Stan Hoeppner <stan@hardwarefreak.com>
>> wrote:
>>>
>>>> On 9/26/2013 4:58 PM, Dave Chinner wrote:
>>>>> On Thu, Sep 26, 2013 at 04:22:30AM -0500, Stan Hoeppner wrote:
>>>>>> On 9/26/2013 3:55 AM, Stewart Webb wrote:
>>>>>>> Thanks for all this info Stan and Dave,
>>>>>>>
>>>>>>>> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This is
>>>> the
>>>>>>>> amount of data written across the full RAID stripe (excluding
>> parity).
>>>>>>>
>>>>>>> The reason I stated Stripe size is because in this instance, I have
>>>> 3ware
>>>>>>> RAID controllers, which refer to
>>>>>>> this value as "Stripe" in their tw_cli software (god bless
>>>> manufacturers
>>>>>>> renaming everything)
>>>>>>>
>>>>>>> I do, however, have a follow-on question:
>>>>>>> On other systems, I have similar hardware:
>>>>>>> 3x Raid Controllers
>>>>>>> 1 of them has 10 disks as RAID 6 that I would like to add to a
>> logical
>>>>>>> volume
>>>>>>> 2 of them have 12 disks as a RAID 6 that I would like to add to the
>>>> same
>>>>>>> logical volume
>>>>>>>
>>>>>>> All have the same "Stripe" or "Strip Size" of 512 KB
>>>>>>>
>>>>>>> So if I where going to make 3 seperate xfs volumes, I would do the
>>>>>>> following:
>>>>>>> mkfs.xfs -d su=512k sw=8 /dev/sda
>>>>>>> mkfs.xfs -d su=512k sw=10 /dev/sdb
>>>>>>> mkfs.xfs -d su=512k sw=10 /dev/sdc
>>>>>>>
>>>>>>> I assume, If I where going to bring them all into 1 logical volume,
>> it
>>>>>>> would be best placed to have the sw value set
>>>>>>> to a value that is divisible by both 8 and 10 - in this case 2?
>>>>>>
>>>>>> No.  In this case you do NOT stripe align XFS to the storage, because
>>>>>> it's impossible--the RAID stripes are dissimilar.  In this case you
>> use
>>>>>> the default 4KB write out, as if this is a single disk drive.
>>>>>>
>>>>>> As Dave stated, if you format a concatenated device with XFS and you
>>>>>> desire to align XFS, then all constituent arrays must have the same
>>>>>> geometry.
>>>>>>
>>>>>> Two things to be aware of here:
>>>>>>
>>>>>> 1.  With a decent hardware write caching RAID controller, having XFS
>>>>>> alined to the RAID geometry is a small optimization WRT overall write
>>>>>> performance, because the controller is going to be doing the
>> optimizing
>>>>>> of final writeback to the drives.
>>>>>>
>>>>>> 2. Alignment does not affect read performance.
>>>>>
>>>>> Ah, but it does...
>>>>>
>>>>>> 3.  XFS only performs aligned writes during allocation.
>>>>>
>>>>> Right, and it does so not only to improve write performance, but to
>>>>> also maximise sequential read performance of the data that is
>>>>> written, especially when multiple files are being read
>>>>> simultaneously and IO latency is important to keep low (e.g.
>>>>> realtime video ingest and playout).
>>>>
>>>> Absolutely correct, as Dave always is.  As my workloads are mostly
>>>> random, as are those of others I consult in other fora, I sometimes
>>>> forget the [multi]streaming case.  Which is not good, as many folks
>>>> choose XFS specifically for [multi]streaming workloads.  My remarks to
>>>> this audience should always reflect that.  Apologies for my oversight on
>>>> this occasion.
>>>>
>>>>>> What really makes a difference as to whether alignment will be of
>>>>>> benefit to you, and how often, is your workload.  So at this point,
>> you
>>>>>> need to describe the primary workload(s) of your systems we're
>>>> discussing.
>>>>>
>>>>> Yup, my thoughts exactly...
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Dave.
>>>>>
>>>>
>>>> --
>>>> Stan


* Re: xfs hardware RAID alignment over linear lvm
  2013-09-28 14:54                     ` Stan Hoeppner
@ 2013-09-30  8:48                       ` Stewart Webb
  0 siblings, 0 replies; 17+ messages in thread
From: Stewart Webb @ 2013-09-30  8:48 UTC (permalink / raw)
  To: stan; +Cc: Chris Murphy, xfs@oss.sgi.com



Ok,

Thanks Stan
Much appreciated


On 28 September 2013 15:54, Stan Hoeppner <stan@hardwarefreak.com> wrote:

> On 9/27/2013 8:29 AM, Stewart Webb wrote:
> > Hi Stan,
> >
> > Apologies for not directly answering -
>
> No problem, sorry for the late reply.
>
> > I was aiming at filling gaps in my knowledge that I could not find in the
> > xfs.org wiki.
>
> Hopefully this is occurring. :)
>
> > My workload for the storage is mainly reads of single large files
> (ranging
> > for 20GB to 100GB each)
> > These reads are mainly linear (video playback, although not always as the
> > end user may be jumping to different points in the video)
> > There are concurrent reads required, estimated at 2 to 8, any more would
> be
> > a bonus.
>
> This is the type of workload Dave described previously that should
> exhibit an increase in read performance if the files are written with
> alignment, especially with concurrent readers, which you describe as
> 2-8, maybe more.  The number of "maybe more" is dictated by whether
> you're aligned.  I.e. with alignment your odds of successfully serving
> more readers is much greater.
>
> Thus, if you need to stitch arrays together with LVM concatenation,
> you'd definitely benefit from making the geometry of all arrays
> identical, and aligning the filesystem to that geometry.  I.e. same
> number of disks, same RAID level, same RAID stripe unit (data per non
> parity disk), and stripe width (#non parity disks).
>
> > The challenge of this would be that the reads need to be "real-time"
> > operations as they are interacted with by a person, and each
> > read operation would have to consistently have a low latency and obtain
> > speeds of over 50Mb/s
> >
> > Disk write speeds are not *as* important for me - as they these files are
> > copied to location before they are required (in this case
> > using rsync or scp) and these operations do not require as much
> "real-time"
> > interaction.
> >
> >
> > On 27 September 2013 14:09, Stan Hoeppner <stan@hardwarefreak.com>
> wrote:
> >
> >> On 9/27/2013 7:23 AM, Stewart Webb wrote:
> >>>> Right, and it does so not only to improve write performance, but to
> >>>> also maximise sequential read performance of the data that is
> >>>> written, especially when multiple files are being read
> >>>> simultaneously and IO latency is important to keep low (e.g.
> >>>> realtime video ingest and playout).
> >>>
> >>> So does this mean that I should avoid having devices in RAID with a
> >>> differing amount of spindles (or non-parity disks)
> >>> If I would like to use Linear concatenation LVM? Or is there a best
> >>> practice if this instance is not
> >>> avoidable?
> >>
> >> Above, Dave was correcting my oversight, not necessarily informing you,
> >> per se.  It seems clear from your follow up question that you didn't
> >> really grasp what he was saying.  Let's back up a little bit.
> >>
> >> What you need to concentrate on right now is the following which we
> >> stated previously in the thread, but which you did not reply to:
> >>
> >>>>>> What really makes a difference as to whether alignment will be of
> >>>>>> benefit to you, and how often, is your workload.  So at this point,
> >> you
> >>>>>> need to describe the primary workload(s) of your systems we're
> >>>> discussing.
> >>>>>
> >>>>> Yup, my thoughts exactly...
> >>
> >> This means you need to describe in detail how you are writing your
> >> files, and how you are reading them back.  I.e. what application are you
> >> using, what does it do, etc.  You stated IIRC that your workload is 80%
> >> read.  What types of files is it reading?  Small, large?  Is it reading
> >> multiple files in parallel?  How are these files originally written
> >> before being read?  Etc, etc.
> >>
> >> You may not understand why this is relevant, but it is the only thing
> >> that is relevant, at this point.  Spindles, RAID level, alignment, no
> >> alignment...none of this matters if it doesn't match up with how your
> >> application(s) do their IO.
> >>
> >> Rule #1 of storage architecture:  Always build your storage stack (i.e.
> >> disks, controller, driver, filesystem, etc) to fit the workload(s), not
> >> the other way around.
> >>
> >>>
> >>> On 27 September 2013 02:10, Stan Hoeppner <stan@hardwarefreak.com>
> >> wrote:
> >>>
> >>>> On 9/26/2013 4:58 PM, Dave Chinner wrote:
> >>>>> On Thu, Sep 26, 2013 at 04:22:30AM -0500, Stan Hoeppner wrote:
> >>>>>> On 9/26/2013 3:55 AM, Stewart Webb wrote:
> >>>>>>> Thanks for all this info Stan and Dave,
> >>>>>>>
> >>>>>>>> "Stripe size" is a synonym of XFS sw, which is su * #disks.  This
> is
> >>>> the
> >>>>>>>> amount of data written across the full RAID stripe (excluding
> >> parity).
> >>>>>>>
> >>>>>>> The reason I stated Stripe size is because in this instance, I have
> >>>> 3ware
> >>>>>>> RAID controllers, which refer to
> >>>>>>> this value as "Stripe" in their tw_cli software (god bless
> >>>> manufacturers
> >>>>>>> renaming everything)
> >>>>>>>
> >>>>>>> I do, however, have a follow-on question:
> >>>>>>> On other systems, I have similar hardware:
> >>>>>>> 3x Raid Controllers
> >>>>>>> 1 of them has 10 disks as RAID 6 that I would like to add to a
> >> logical
> >>>>>>> volume
> >>>>>>> 2 of them have 12 disks as a RAID 6 that I would like to add to the
> >>>> same
> >>>>>>> logical volume
> >>>>>>>
> >>>>>>> All have the same "Stripe" or "Strip Size" of 512 KB
> >>>>>>>
> >>>>>>> So if I were going to make 3 separate xfs volumes, I would do the
> >>>>>>> following:
> >>>>>>> mkfs.xfs -d su=512k,sw=8 /dev/sda
> >>>>>>> mkfs.xfs -d su=512k,sw=10 /dev/sdb
> >>>>>>> mkfs.xfs -d su=512k,sw=10 /dev/sdc
> >>>>>>>
> >>>>>>> I assume, if I were going to bring them all into 1 logical volume, it
> >>>>>>> would be best placed to have the sw value set
> >>>>>>> to a value that is divisible by both 8 and 10 - in this case 2?
> >>>>>>
> >>>>>> No.  In this case you do NOT stripe align XFS to the storage,
> because
> >>>>>> it's impossible--the RAID stripes are dissimilar.  In this case you
> >> use
> >>>>>> the default 4KB write out, as if this is a single disk drive.
> >>>>>>
> >>>>>> As Dave stated, if you format a concatenated device with XFS and you
> >>>>>> desire to align XFS, then all constituent arrays must have the same
> >>>>>> geometry.
> >>>>>>
> >>>>>> Two things to be aware of here:
> >>>>>>
> >>>>>> 1.  With a decent hardware write caching RAID controller, having XFS
> >>>>>> aligned to the RAID geometry is a small optimization WRT overall
> write
> >>>>>> performance, because the controller is going to be doing the
> >> optimizing
> >>>>>> of final writeback to the drives.
> >>>>>>
> >>>>>> 2. Alignment does not affect read performance.
> >>>>>
> >>>>> Ah, but it does...
> >>>>>
> >>>>>> 3.  XFS only performs aligned writes during allocation.
> >>>>>
> >>>>> Right, and it does so not only to improve write performance, but to
> >>>>> also maximise sequential read performance of the data that is
> >>>>> written, especially when multiple files are being read
> >>>>> simultaneously and IO latency is important to keep low (e.g.
> >>>>> realtime video ingest and playout).
> >>>>
> >>>> Absolutely correct, as Dave always is.  As my workloads are mostly
> >>>> random, as are those of others I consult in other fora, I sometimes
> >>>> forget the [multi]streaming case.  Which is not good, as many folks
> >>>> choose XFS specifically for [multi]streaming workloads.  My remarks to
> >>>> this audience should always reflect that.  Apologies for my oversight
> on
> >>>> this occasion.
> >>>>
> >>>>>> What really makes a difference as to whether alignment will be of
> >>>>>> benefit to you, and how often, is your workload.  So at this point,
> >> you
> >>>>>> need to describe the primary workload(s) of your systems we're
> >>>> discussing.
> >>>>>
> >>>>> Yup, my thoughts exactly...
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> Dave.
> >>>>>
> >>>>
> >>>> --
> >>>> Stan
>
>


-- 
Stewart Webb
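For reference, the su/sw arithmetic Stan and Dave walk through above can be sketched in a few lines of shell. This only derives the values for one of the 12-disk RAID6 arrays discussed in the thread; it does not run mkfs.xfs, and the variable names are illustrative, not part of any tool:

```shell
# Derive XFS alignment values for a RAID6 array, per the thread above.
su_kb=512                    # controller "stripe"/strip size -> XFS su
disks=12                     # disks in the RAID6 array
parity=2                     # RAID6 dedicates two spindles' worth to parity
sw=$((disks - parity))       # data spindles -> XFS sw
stripe_kb=$((su_kb * sw))    # data written across one full stripe
echo "mkfs.xfs -d su=${su_kb}k,sw=${sw}  # full stripe = ${stripe_kb} KB"
```

For the 10-disk array the same arithmetic gives sw=8, which is exactly why no single su/sw pair fits a linear LV concatenating 8-wide and 10-wide arrays, and why the thread recommends the unaligned default in that case.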


Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-25 12:56 xfs hardware RAID alignment over linear lvm Stewart Webb
2013-09-25 21:18 ` Stan Hoeppner
2013-09-25 21:34   ` Chris Murphy
2013-09-25 21:48     ` Stan Hoeppner
2013-09-25 21:53       ` Chris Murphy
2013-09-25 21:57     ` Dave Chinner
2013-09-26  8:44       ` Stan Hoeppner
2013-09-26  8:55       ` Stewart Webb
2013-09-26  9:22         ` Stan Hoeppner
2013-09-26  9:28           ` Stewart Webb
2013-09-26 21:58           ` Dave Chinner
2013-09-27  1:10             ` Stan Hoeppner
2013-09-27 12:23               ` Stewart Webb
2013-09-27 13:09                 ` Stan Hoeppner
2013-09-27 13:29                   ` Stewart Webb
2013-09-28 14:54                     ` Stan Hoeppner
2013-09-30  8:48                       ` Stewart Webb
