linux-lvm.redhat.com archive mirror
From: Spelic <spelic@shiftmail.org>
To: linux-lvm@redhat.com
Subject: Re: [linux-lvm] pvmove painfully slow on parity RAID
Date: Thu, 30 Dec 2010 04:13:35 +0100
Message-ID: <4D1BF8DF.20908@shiftmail.org>
In-Reply-To: <Pine.LNX.4.64.1012292136500.19697@bmsred.bmsi.com>

On 12/30/2010 03:42 AM, Stuart D. Gathman wrote:
> On Wed, 29 Dec 2010, Spelic wrote:
>    
>> I tried multiple times for every device with consistent results, so I'm pretty
>> sure these are actual numbers.
>> What's happening?
>> Apart from the amazing difference of parity raid vs nonparity raid, with
>> parity raid it seems to vary randomly with the number of devices and the
>> chunksize..?
>>      
> This is pretty much my experience with parity raid all around.  Which
> is why I stick with raid1 and raid10.
>    

Parity RAID is fast for me for normal filesystem operations; that's
why I suspect some strict sequentiality is being enforced here.

> That said, the sequential writes of pvmove should be fast for raid5 *if*
> the chunks are aligned so that there is no read/modify/write cycle.
>
> 1) Perhaps your test targets are not properly aligned?
>    

Aligned to zero, yes (the arrays are empty now), but all the RAIDs have
different chunk sizes and stripe sizes, as I reported, and all of them
are bigger than the LVM chunk size, which is 1M for the VG.
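
For reference, this is roughly how one can check the geometry involved;
/dev/md0 and vgtest below are just placeholder names:

   mdadm --detail /dev/md0 | grep -i chunk    # md chunk size
   pvs -o +pe_start /dev/md0                  # where the PV data area starts
   vgdisplay vgtest | grep "PE Size"          # physical extent size for the VG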

> 2) Perhaps the raid5 implementation (hardware? linux md?
>     experimental lvm raid5?) does a read modify write even when it
>     doesn't have to.
>
> Your numbers sure look like read/modify/write is happening for some reason.
>    

OK, but strict sequentiality is probably being enforced too aggressively.
There must be some barrier or flush-and-wait going on at each tiny bit
of data (at each LVM chunk, maybe?). Are you an LVM developer?
Consider that a sequential dd write runs at hundreds of megabytes per
second on my arrays, not hundreds of... kilobytes!
Even random I/O goes *much* faster than this, as long as one stripe does
not have to wait for another stripe to be fully updated (i.e.
sequentiality is not enforced from the application layer).
If pvmove wrote out 100MB before every sync or flush, I'm pretty sure I
would see speeds almost 100 times higher.
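
Something like this shows the kind of gap I mean (/dev/mdX is just a
placeholder, and of course such a write destroys data on the device):

   # plain streaming sequential write: hundreds of MB/s on these arrays
   dd if=/dev/zero of=/dev/mdX bs=1M count=1000 oflag=direct

   # forcing a sync after every 1MB write: throughput collapses, which
   # looks a lot like what pvmove seems to be doing per chunk
   dd if=/dev/zero of=/dev/mdX bs=1M count=100 oflag=direct,dsync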


Also, there is still the mystery of why the times appear *randomly*
related to the number of devices, chunk sizes, and stripe sizes! If the
RMW cycle were the culprit, how come I see:
raid5, 4 devices, 16384K chunk: 41 sec (4.9 MB/sec)
raid5, 6 devices, 4096K chunk: 2m18s ?!?! (1.44 MB/sec!?)
The first has a much larger stripe size of 49152K; the second has 20480K!
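
(Stripe size here meaning data per full stripe, i.e. (devices - 1) x
chunk: (4-1) x 16384K = 49152K versus (6-1) x 4096K = 20480K.)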

Thank you

Thread overview: 8+ messages
2010-12-29  2:40 [linux-lvm] pvmove painfully slow on parity RAID Spelic
2010-12-29 14:02 ` Spelic
2010-12-30  2:42   ` Stuart D. Gathman
2010-12-30  3:13     ` Spelic [this message]
2010-12-30 19:12       ` Stuart D. Gathman
2010-12-31  3:41         ` Spelic
2010-12-31 15:36           ` Stuart D. Gathman
2010-12-31 17:23             ` Stuart D. Gathman
