From: Phil Turmel <philip@turmel.org>
To: Adam Goryachev <mailinglists@websitemanagers.com.au>
Cc: stan@hardwarefreak.com, Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: RAID performance - new kernel results - 5x SSD RAID5
Date: Mon, 04 Mar 2013 07:41:39 -0500
Message-ID: <51349683.4040903@turmel.org>
In-Reply-To: <51346BB5.2010403@websitemanagers.com.au>

On 03/04/2013 04:39 AM, Adam Goryachev wrote:

> Probably a silly question, but how do you convert from the
> information:
>   Fast-Root: 0 314572800 linear 9:3 3072
>   vg0-hostname: 0 204808192 linear 147:2 512
> 
> My example was 512 sectors, assuming a sector size of 512 bytes,
> that provides 256kB (as you advised above).
> 
> Your example was 3072 sectors, again assuming a sector size of 512 
> bytes, that becomes 1.5MB (as you stated above).
> 
> So how come my system has an alignment of 256k (1 x offset) while
> yours has 512k (1.5M/3) ?

Alignment is the largest power of two that evenly divides the starting
offset of the data area.  In practice, that means finding the lowest "1"
bit in the binary representation of the starting offset.  I personally
switch to hex, then mentally pick the lowest "1" bit in the first
non-zero digit from the right.  Your 512 sectors is 0x200, so the lowest
"1" bit is 512 sectors, or 256k.  My 3072 sectors is 0xC00, so the lowest
"1" bit is 1024 sectors, or 512k.
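
The lowest-set-bit rule can be sketched in a few lines of Python.  This
is only an illustration: the function name is made up for the example,
and the 512-byte sector size is the same assumption used in the quoted
question.

```python
SECTOR_BYTES = 512  # assumed sector size, as in the quoted question


def alignment_sectors(offset_sectors: int) -> int:
    """Return the largest power of two that divides the offset.

    In two's complement, x & -x isolates the lowest set bit of x.
    """
    if offset_sectors <= 0:
        raise ValueError("offset must be a positive sector count")
    return offset_sectors & -offset_sectors


for offset in (512, 3072):
    align_kib = alignment_sectors(offset) * SECTOR_BYTES // 1024
    print(f"{offset} sectors = {offset:#x}: aligned to {align_kib}k")
```

For the two offsets in this thread it reports 256k for 512 sectors
(0x200) and 512k for 3072 sectors (0xc00).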

> I'm assuming that in any case, as you suggested, the main thing is
> that the answer I got (256kB offset) was equal to the MD chunk size
> (64kB) multiplied by the number of data drives (4), or 256kB.

Yes.

>>> Also, pvdisplay tells me the PE Size is 4M, so I'm assuming that
>>> regardless of how the LV's are arranged, they will always be 512k
>>> aligned?
>> 
>> 256k, but yeah.
> 
> So LVM will not allocate any LV a block of space smaller than 4M,
> and I'm assuming will always be on a 4M boundary from the beginning
> of the device. Since 4MB is a multiple of 256kB, then alignment is
> OK?

It will not be on a 4M boundary.  The first PE is at a 256k offset, so
any multiple of 4M added to that will also be 256k aligned.  For you,
that's fine.
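
A quick way to convince yourself is to enumerate the extent start
offsets.  The sketch below assumes the numbers from this thread (first
PE at 256k, 4M extents, 64k chunk x 4 data drives); the names are
illustrative only.

```python
KIB = 1024
FIRST_PE = 256 * KIB     # offset of the first physical extent
PE_SIZE = 4 * KIB * KIB  # 4M physical extent size
STRIPE = 4 * 64 * KIB    # 4 data drives x 64k chunk = 256k


def pe_start(n: int) -> int:
    """Byte offset of physical extent n from the start of the PV."""
    return FIRST_PE + n * PE_SIZE


starts = [pe_start(n) for n in range(1000)]

# Every extent starts on a 256k stripe boundary ...
all_on_stripe = all(s % STRIPE == 0 for s in starts)
# ... but none of them starts on a 4M boundary.
any_on_4m = any(s % PE_SIZE == 0 for s in starts)

print(all_on_stripe, any_on_4m)
```

Both checks follow directly from the arithmetic: 4M is a multiple of
256k, so adding it never disturbs 256k alignment, while the 256k offset
keeps every extent off the 4M grid.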

> If the MD stripe size was larger, eg, if I added 2 more drives it
> would become 64kB chunk x 6 data drives = 384kB. This would mean my
> LVM is no longer properly aligned. The first block of 4MB would start
> at 256kB which is smaller than the stripe size, and each 4MB block
> would most likely not line up since 4MB is not divisible by 384kB?

Then it gets complicated.  When the # of data drives in parity raid
isn't a power of two, you generally cannot make higher layers
consistently align with the stripe boundaries.  The best you can do is
align to the largest power of two that divides the stripe size.  For
your example, that would be 128k (384k = 3 x 128k).
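
To make the 6-data-drive case concrete, here is a small sketch using the
numbers from the question (64k chunk, 4M extents, first PE at 256k).
The helper name is hypothetical and nothing here queries a real array.

```python
KIB = 1024
CHUNK = 64 * KIB
DATA_DRIVES = 6
STRIPE = CHUNK * DATA_DRIVES  # 384k, not a power of two
FIRST_PE = 256 * KIB
PE_SIZE = 4 * KIB * KIB


def largest_pow2_divisor(x: int) -> int:
    """Largest power of two that divides x (its lowest set bit)."""
    return x & -x


best_align = largest_pow2_divisor(STRIPE)
print(f"best achievable alignment: {best_align // KIB}k")

# Which of the first nine extents start exactly on a stripe boundary?
starts = [FIRST_PE + n * PE_SIZE for n in range(9)]
on_stripe = [n for n, s in enumerate(starts) if s % STRIPE == 0]
on_pow2 = all(s % best_align == 0 for s in starts)
print(on_stripe, on_pow2)
```

With these numbers only every third extent (2, 5, 8) lands on a 384k
stripe boundary, while every extent is still 128k aligned.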

> So, if I ever choose to expand the array to include a larger number
> of devices (as opposed to replacing all members with larger drives),
> what would I need to do to fix all this up?
>
> Re-partition to start the partition at a higher starting sector
> (such that 4M / start sector * 512 produces an integer)?

pvcreate can be told what alignment to use.  It will round up to its
requirements, though.  vgcreate can be told what physical extent size to
use.  So you have a great deal of control over these behaviors.  LVM
can't deal with an odd stripe size, though.

> That resolves the first LVM block, but to ensure all other blocks
> are properly aligned, is the best answer to upgrade to 8 x data
> drives (512kB stripe size)? Or is there some other magic solution I'm
> missing here?

You want data alignment to be greater than or equal to stripe alignment.
Going to 8 data drives would make the stripe 512k, and your existing
PV's data area starts at only 256k, so that would break alignment.

>>> So, is that enough to be sure that this is not an issue?
>> It looks to me like you are good on alignment.
> 
> Thanks.
> 
> On 04/03/13 16:25, Stan Hoeppner wrote:
>> If you have no gaps between this one and your other LVs, and each
>> of them is evenly divisible by 512 sectors, then they should all
>> be aligned.
> 
> Given the 4MB size of the LVM blocks, does that automatically make
> this true? I thought it did, but given your above comment, I'm
> unsure.

Alignment is the smallest power-of-two alignment among all of the
offsets and sizing multiples.  The 4M PE size is larger than any of the
other alignment factors, so it drops out of the analysis.
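
As a rough sketch of that rule: take the power-of-two alignment of each
offset and sizing multiple, then keep the smallest.  The function names
and the list of factors are assumptions made for the example, using the
256k data offset and 4M PE size discussed above.

```python
KIB = 1024
MIB = 1024 * KIB


def pow2_alignment(x: int) -> int:
    """Largest power of two that divides x (its lowest set bit)."""
    if x <= 0:
        raise ValueError("factor must be positive")
    return x & -x


def overall_alignment(factors: list[int]) -> int:
    """Smallest power-of-two alignment among all offsets and sizes."""
    return min(pow2_alignment(f) for f in factors)


# 256k offset of the first PE, 4M physical extent size.
factors = [256 * KIB, 4 * MIB]
result = overall_alignment(factors)
print(f"overall alignment: {result // KIB}k")

# Dropping the 4M PE size from the list changes nothing.
assert overall_alignment([256 * KIB]) == result
```

The 4M term never wins the minimum here, which is why it can be ignored.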

HTH,

Phil
