linux-lvm.redhat.com archive mirror
* [linux-lvm] Alignment: XFS + LVM2
@ 2014-05-06 15:54 Marc Caubet
  2014-05-07 15:27 ` Mike Snitzer
  0 siblings, 1 reply; 4+ messages in thread
From: Marc Caubet @ 2014-05-06 15:54 UTC (permalink / raw)
  To: linux-lvm

Hi all,

I am trying to set up a storage pool with correct disk alignment, and I hope
somebody can help me understand some parts that are unclear to me when
configuring XFS on top of LVM2.

We currently have a few storage pools, each with the following settings:

- LSI controller with 3x RAID6
- Each RAID6 is configured with 10 data disks + 2 for double parity.
- Each disk has a capacity of 4TB, 512e, with a physical sector size of 4K.
- The 3x(10+2) configuration was chosen for the best balance of performance
and data safety (fewer disks per RAID means a lower probability of data
corruption).

From the O.S. side we see:

[root@stgpool01 ~]# fdisk -l /dev/sda /dev/sdb /dev/sdc

Disk /dev/sda: 40000.0 GB, 39999997214720 bytes
255 heads, 63 sectors/track, 4863055 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Disk /dev/sdb: 40000.0 GB, 39999997214720 bytes
255 heads, 63 sectors/track, 4863055 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Disk /dev/sdc: 40000.0 GB, 39999997214720 bytes
255 heads, 63 sectors/track, 4863055 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

The idea is to aggregate the above devices and present them as a single
storage space. We did this as follows:

vgcreate dcvg_a /dev/sda /dev/sdb /dev/sdc
lvcreate -i 3 -I 4096 -n dcpool -l 100%FREE -v dcvg_a
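
For reference, this is roughly what I have looked at so far to see what the
striped LV ended up with (just a sketch; I am not sure these are the right
things to check):

lvs --segments dcvg_a/dcpool                        # stripe count (#Str) and stripe size
blockdev --getiomin --getioopt /dev/dcvg_a/dcpool   # minimum/optimal I/O hints the LV exposes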

Hence, a stripe across the 3 RAID6 devices in a single LV.

And here is my first question: How can I check if the storage and the LV
are correctly aligned?

On the other hand, I have formatted XFS as follows:

mkfs.xfs -d su=256k,sw=10 -l size=128m,lazy-count=1 /dev/dcvg_a/dcpool

So my second question is: are the above 'su' and 'sw' parameters correct for
the current LV configuration? If not, which values should I use and why?
AFAIK 'su' is the stripe size configured on the controller side, but in this
case we have an LV. Also, 'sw' is the number of data disks in a RAID, but
again, we have an LV with 3 stripes, and I am not sure whether the number of
data disks should be 30 instead.

Thanks a lot,
-- 
Marc Caubet Serrabou
PIC (Port d'Informació Científica)
Campus UAB, Edificio D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 22
Fax: +34 93 581 41 10
http://www.pic.es
Avis - Aviso - Legal Notice: http://www.ifae.es/legal.html

* Re: [linux-lvm] Alignment: XFS + LVM2
  2014-05-06 15:54 [linux-lvm] Alignment: XFS + LVM2 Marc Caubet
@ 2014-05-07 15:27 ` Mike Snitzer
  2014-05-08  9:29   ` Marc Caubet
  2014-06-04  5:39   ` Linda A. Walsh
  0 siblings, 2 replies; 4+ messages in thread
From: Mike Snitzer @ 2014-05-07 15:27 UTC (permalink / raw)
  To: Marc Caubet; +Cc: linux-lvm

On Tue, May 06 2014 at 11:54am -0400,
Marc Caubet <mcaubet@pic.es> wrote:

> Hi all,
> 
> I am trying to setup a storage pool with correct disk alignment and I hope
> somebody can help me to understand some unclear parts to me when
> configuring XFS over LVM2.
> 
> Actually we have few storage pools with the following settings each:
> 
> - LSI Controller with 3xRAID6
> - Each RAID6 is configured with 10 data disks + 2 for double-parity.
> - Each disk has a capacity of 4TB, 512e and physical sector size of 4K.
> - 3x(10+2) configuration was considered in order to gain best performance
> and data safety (less disks per RAID less probability of data corruption)

What is the chunk size used for these RAID6 devices?
Say it is 256K, you have 10 data devices, so the full stripe would be
2560K.

Which version of lvm2 and kernel are you using?  Newer versions support
a striped LV stripesize that is not a power-of-2.
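
Something like this should show that (a sketch):

lvm version      # lvm2 tool, library and driver versions
uname -r         # running kernel version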

> From the O.S. side we see:
> 
> [root@stgpool01 ~]# fdisk -l /dev/sda /dev/sdb /dev/sdc
> 
> Disk /dev/sda: 40000.0 GB, 39999997214720 bytes
> 255 heads, 63 sectors/track, 4863055 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Sector size (logical/physical): 512 bytes / 4096 bytes
> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> Disk identifier: 0x00000000
> 
> Disk /dev/sdb: 40000.0 GB, 39999997214720 bytes
> 255 heads, 63 sectors/track, 4863055 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Sector size (logical/physical): 512 bytes / 4096 bytes
> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> Disk identifier: 0x00000000
> 
> Disk /dev/sdc: 40000.0 GB, 39999997214720 bytes
> 255 heads, 63 sectors/track, 4863055 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Sector size (logical/physical): 512 bytes / 4096 bytes
> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> Disk identifier: 0x00000000
> 
> The idea is to aggregate the above devices and show only 1 storage space.
> We did as follows:
> 
> vgcreate dcvg_a /dev/sda /dev/sdb /dev/sdc
> lvcreate -i 3 -I 4096 -n dcpool -l 100%FREE -v dcvg_a

I'd imagine you'd want the stripesize of this striped LV to match the
underlying RAID6 full stripe, no?  So 2560K, e.g. -i 3 -I 2560

That makes for a very large full stripe, though...
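
Roughly like this, if you were to recreate the LV (a sketch, assuming the
256K chunk size from above; -I takes the value in KiB):

# full stripe of one RAID6 leg: 256K chunk x 10 data disks = 2560K
lvcreate -i 3 -I 2560 -n dcpool -l 100%FREE dcvg_a
# then see what the new LV advertises as minimum/optimal I/O sizes
blockdev --getiomin --getioopt /dev/dcvg_a/dcpool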

> Hence, stripe of the 3 RAID6 in a LV.
> 
> And here is my first question: How can I check if the storage and the LV
> are correctly aligned?
> 
> On the other hand, I have formatted XFS as follows:
> 
> mkfs.xfs -d su=256k,sw=10 -l size=128m,lazy-count=1 /dev/dcvg_a/dcpool
> 
> So my second question is, are the above 'su' and 'sw' parameters correct on
> the current LV configuration? If not, which values should I have and why?
> AFAIK su is the stripe size configured in the controller side, but in this
> case we have a LV. Also, sw is the number of data disks in a RAID, but
> again, we have a LV with 3 stripes, and I am not sure if the number of data
> disks should be 30 instead.

Newer versions of mkfs.xfs _should_ pick up the hints exposed (as
minimum_io_size and optimal_io_size) by the striped LV.

But if not, you definitely don't want to be trying to pierce through the
striped LV config to establish settings of the underlying RAID6.  Each
layer in the stack should respect the layer beneath it.  So, if the
striped LV is configured how you'd like, you should only concern
yourself with the limits that have been established for the topmost
striped LV that you're layering XFS on.
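
Concretely, something along these lines (a sketch; the device path is your
striped LV):

# I/O hints the striped LV exposes (these are what mkfs.xfs should read):
blockdev --getiomin --getioopt /dev/dcvg_a/dcpool
# let mkfs.xfs derive su/sw from those hints instead of hand-setting them;
# it prints the sunit/swidth it chose, so the result is easy to verify
mkfs.xfs -l size=128m,lazy-count=1 /dev/dcvg_a/dcpool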

* Re: [linux-lvm] Alignment: XFS + LVM2
  2014-05-07 15:27 ` Mike Snitzer
@ 2014-05-08  9:29   ` Marc Caubet
  2014-06-04  5:39   ` Linda A. Walsh
  1 sibling, 0 replies; 4+ messages in thread
From: Marc Caubet @ 2014-05-08  9:29 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: linux-lvm

Hi Mike,

thanks a lot for your answer.

> > Hi all,
> >
> > I am trying to setup a storage pool with correct disk alignment and I hope
> > somebody can help me to understand some unclear parts to me when
> > configuring XFS over LVM2.
> >
> > Actually we have few storage pools with the following settings each:
> >
> > - LSI Controller with 3xRAID6
> > - Each RAID6 is configured with 10 data disks + 2 for double-parity.
> > - Each disk has a capacity of 4TB, 512e and physical sector size of 4K.
> > - 3x(10+2) configuration was considered in order to gain best performance
> > and data safety (less disks per RAID less probability of data corruption)
>
> What is the chunk size used for these RAID6 devices?
> Say it is 256K, you have 10 data devices, so the full stripe would be
> 2560K.
>

The chunk size is currently 256KB (in the near future we will try 1MB, since
we mostly manage large files, but for now we want to keep the current 256KB
configuration).

> Which version of lvm2 and kernel are you using?  Newer versions support
> a striped LV stripesize that is not a power-of-2.
>

The current LVM2 version is lvm2-2.02.100-8.el6.x86_64.

> > From the O.S. side we see:
> >
> > [root@stgpool01 ~]# fdisk -l /dev/sda /dev/sdb /dev/sdc
> >
> > Disk /dev/sda: 40000.0 GB, 39999997214720 bytes
> > 255 heads, 63 sectors/track, 4863055 cylinders
> > Units = cylinders of 16065 * 512 = 8225280 bytes
> > Sector size (logical/physical): 512 bytes / 4096 bytes
> > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > Disk identifier: 0x00000000
> >
> > Disk /dev/sdb: 40000.0 GB, 39999997214720 bytes
> > 255 heads, 63 sectors/track, 4863055 cylinders
> > Units = cylinders of 16065 * 512 = 8225280 bytes
> > Sector size (logical/physical): 512 bytes / 4096 bytes
> > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > Disk identifier: 0x00000000
> >
> > Disk /dev/sdc: 40000.0 GB, 39999997214720 bytes
> > 255 heads, 63 sectors/track, 4863055 cylinders
> > Units = cylinders of 16065 * 512 = 8225280 bytes
> > Sector size (logical/physical): 512 bytes / 4096 bytes
> > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > Disk identifier: 0x00000000
> >
> > The idea is to aggregate the above devices and show only 1 storage space.
> > We did as follows:
> >
> > vgcreate dcvg_a /dev/sda /dev/sdb /dev/sdc
> > lvcreate -i 3 -I 4096 -n dcpool -l 100%FREE -v dcvg_a
>
> I'd imagine you'd want the stripesize of this striped LV to match the
> underlying RAID6 stripesize no?  So 2560K, e.g. -i 3 -I 2560
>
> That makes for a very large full stripe through...
>

So for a RAID6 with a 256KB chunk size, "-I" should be 2560. Does that mean
the "-I" parameter is chunk_size * number_of_data_disks? I mean, if I had 16
data disks in a RAID6 and a 1MB chunk size, what should the "-I" value be?

On the other hand, yes, 2560 is a large full stripe, but we are mostly
managing large files (hundreds of MBs and a few GBs), so I guess this is ok.
Is it possible to determine the minimum recommended file size for a
configuration like this? I would like to know because we also have a few
storage pools (less than 3% of the total) with a small-file profile, and I
would like to fit the disk configuration to their workload type.

> > Hence, stripe of the 3 RAID6 in a LV.
> >
> > And here is my first question: How can I check if the storage and the LV
> > are correctly aligned?
> >
> > On the other hand, I have formatted XFS as follows:
> >
> > mkfs.xfs -d su=256k,sw=10 -l size=128m,lazy-count=1 /dev/dcvg_a/dcpool
> >
> > So my second question is, are the above 'su' and 'sw' parameters correct on
> > the current LV configuration? If not, which values should I have and why?
> > AFAIK su is the stripe size configured in the controller side, but in this
> > case we have a LV. Also, sw is the number of data disks in a RAID, but
> > again, we have a LV with 3 stripes, and I am not sure if the number of data
> > disks should be 30 instead.
>
> Newer versions of mkfs.xfs _should_ pick up the hints exposed (as
> minimum_io_size and optimal_io_size) by the striped LV.
>
> But if not you definitely don't want to be trying to pierce through the
> striped LV config to establish settings of the underlying RAID6.  Each
> layer in the stack should respect the layer beneath it.  So, if the
> striped LV is configured how you'd like, you should only concern
> yourself with the limits that have been established for the topmost
> striped LV that you're layering XFS on.
>

The current XFS package is xfsprogs-3.1.1-14.el6.x86_64, which ships with
Scientific Linux 6. Given that, how should I derive the XFS 'su' and 'sw'
parameters from the LVM2 configuration in order to ensure disk alignment and
get the best performance?

Once again, thanks a lot for your help,
-- 
Marc Caubet Serrabou
PIC (Port d'Informació Científica)
Campus UAB, Edificio D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 22
Fax: +34 93 581 41 10
http://www.pic.es
Avis - Aviso - Legal Notice: http://www.ifae.es/legal.html

* Re: [linux-lvm] Alignment: XFS + LVM2
  2014-05-07 15:27 ` Mike Snitzer
  2014-05-08  9:29   ` Marc Caubet
@ 2014-06-04  5:39   ` Linda A. Walsh
  1 sibling, 0 replies; 4+ messages in thread
From: Linda A. Walsh @ 2014-06-04  5:39 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Marc Caubet

Mike Snitzer wrote:
> On Tue, May 06 2014 at 11:54am -0400,
> Marc Caubet <mcaubet@pic.es> wrote:
>
>   
>> Hi all,
>>
>> I am trying to setup a storage pool with correct disk alignment and I hope
>> somebody can help me to understand some unclear parts to me when
>> configuring XFS over LVM2.
>>
>> Actually we have few storage pools with the following settings each:
>>
>> - LSI Controller with 3xRAID6
>> - Each RAID6 is configured with 10 data disks + 2 for double-parity.
>> - Each disk has a capacity of 4TB, 512e and physical sector size of 4K.
>> - 3x(10+2) configuration was considered in order to gain best performance
>> and data safety (less disks per RAID less probability of data corruption)
>>     
----
I have a similar setup, and am almost certain 2 of them are aligned wrong,
as shown below:


Model: LSI MR9280DE-8e (scsi)
Disk /dev/sda: 24.0TB
Sector size (logical/physical): 512B/512B
Partition Table: gpt_sync_mbr

Number  Start   End     Size    File system  Name       Flags
 1      17.4kB  24.0TB  24.0TB               home+shar  lvm

Model: LSI MR9280DE-8e (scsi)
Disk /dev/sdb: 12.0TB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      1049kB  12.0TB  12.0TB               Backups  lvm


Model: DELL PERC 6/i (scsi)
Disk /dev/sdd: 7999GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt_sync_mbr

Number  Start   End     Size    File system  Name      Flags
 1      17.4kB  7999GB  7999GB               Media      lvm

pvs says:
# pvs
  PV         VG      Fmt  Attr PSize  PFree
  /dev/sda1  HnS     lvm2 a--  21.83t 2.73t
  /dev/sdb1  Backups lvm2 a--  10.91t 3.15g
  /dev/sdd1  Media   lvm2 a--   7.28t    0
-----

Notice how each of them starts at some weird offset.

I thought I started /dev/sdb @ 1MB, which comes out to 1048576 bytes, so sdb
might be aligned on a sector boundary... but it has 6 data disks x a 64K
stripe = a 384K full stripe, which doesn't divide evenly into 1MB.

/dev/sda has a full stripe size of 768K, BUT since it is a RAID50 (3 RAID5s
in a RAID0 config), I can use 256K as an effective stripe size for writes, as
a write of any aligned 256K chunk will only affect 4 data disks (+ 1 parity).
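
A quick way to sanity-check those start offsets against the full stripe
(just a sketch with my numbers; I am assuming the 17.4kB start reported by
parted is 17408 bytes, i.e. sector 34):

echo $((1048576 % (6 * 65536)))   # /dev/sdb: 1MB start vs 384K full stripe -> 262144, not 0
echo $((17408 % (768 * 1024)))    # /dev/sda: 17.4kB start vs 768K full stripe -> 17408, not 0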
>
>   
>>
>> And here is my first question: How can I check if the storage and the LV
>> are correctly aligned?
>>
>> On the other hand, I have formatted XFS as follows:
>>
>> mkfs.xfs -d su=256k,sw=10 -l size=128m,lazy-count=1 /dev/dcvg_a/dcpool
>>
>> So my second question is, are the above 'su' and 'sw' parameters correct on
>> the current LV configuration? If not, which values should I have and why?
>> AFAIK su is the stripe size configured in the controller side, but in this
>> case we have a LV. Also, sw is the number of data disks in a RAID, but
>> again, we have a LV with 3 stripes, and I am not sure if the number of data
>> disks should be 30 instead.
>>     
>
> Newer versions of mkfs.xfs _should_ pick up the hints exposed (as
> minimum_io_size and optimal_io_size) by the striped LV.
>   
----
But mkfs.xfs won't pick up the optimal io_size inside the LSI controller,
which is what underlies all of this.  LVM didn't try to align its space to
some even amount given the partition starting at 17.4k (i.e. it would have to
round up to the nearest 256K, 384K or 768K depending on the subsystem).
> But if not you definitely don't want to be trying to pierce through the
> striped LV config to establish settings of the underlying RAID6.
----
You have to. 
>   Each
> layer in the stack should respect the layer beneath it.
They don't.  LVM doesn't determine the optimal start based on the partition
start, so all of its alignments are off.
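
For what it's worth, this is roughly what I look at to see where things
actually start (a sketch; the device names are mine, and --dataalignment is
only honoured at pvcreate time):

parted /dev/sda unit B print    # where the partition itself starts, in bytes
pvs -o +pe_start --units b      # where LVM puts the first physical extent inside the PV
# alignment can also be requested explicitly when creating the PV, e.g.:
# pvcreate --dataalignment 768k /dev/sda1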

My writes are noticeably slower than my reads, sometimes by close to 10x
(5x in the more general case).


I hope to get another disk subsystem so I can dump those partitions and
align them, but also to follow Stan Hoeppner's advice from the xfs list -- go
with a RAID 1+0... Then each RAID1 pair is independent of every other pair.
The worst has to be that 768K: it triggers a bug in the GNU database format,
which assumes the optimal I/O size will be a power of 2 (which it is not, in
my case).
