Re: Speeding up chunk size change?

From: Steven Haigh <netwiz@crc.id.au>
To: stan@hardwarefreak.com
Cc: linux-raid@vger.kernel.org
Subject: Re: Speeding up chunk size change?
Date: Sun, 04 Mar 2012 11:56:39 +1100
Message-ID: <4F52BDC7.5070805@crc.id.au>
In-Reply-To: <4F52904E.10203@hardwarefreak.com>

On 4/03/2012 8:42 AM, Stan Hoeppner wrote:
> On 3/3/2012 1:36 PM, Steven Haigh wrote:
>> Hi all,
>>
>> I just wanted to run this past a few folk here as I want to make sure
>> I'm doing it the Right Way(tm).
>>
>> I've decided to experiment with using a 128KB chunk size on my RAID6
>> instead of a 64KB chunk.
>
> Why?  Does your target application(s) perform better with a larger
> chunk, and therefore larger total stripe size?  If you're strictly after
> larger dd copy numbers then you're wasting everyone's time, including
> yours, as such has almost zero bearing on real world performance, as
> most workloads are far more random than sequential.

Purely experimental, for fun and education. I had assumed that a
reshape would run at close to the resync speeds I get of
~60-90MB/sec. I guess this shows I was wrong ;)
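
A rough sketch of how the remaining reshape time can be estimated from
sysfs, assuming md's documented units (sync_completed as "done / total"
in 512-byte sectors, sync_speed in KiB/s). The helper names eta_seconds
and md_eta are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: estimate seconds remaining for an md sync/reshape.
# Usage: eta_seconds DONE_SECTORS TOTAL_SECTORS SPEED_KIB_PER_SEC
eta_seconds() {
    if [ "$3" -le 0 ]; then
        echo "unknown"
        return 1
    fi
    # Two 512-byte sectors per KiB, so sectors / 2 = KiB left to process.
    echo $(( ($2 - $1) / 2 / $3 ))
}

# Hypothetical wrapper: read the live values for one array, e.g. "md_eta md2".
md_eta() {
    md_dir="/sys/block/$1/md"
    # sync_completed reads "none" when idle, otherwise "done / total".
    read -r md_done _ md_total < "$md_dir/sync_completed" || return 1
    [ -n "$md_total" ] || { echo "idle"; return 0; }
    read -r md_speed < "$md_dir/sync_speed" || return 1
    # sync_speed can also be non-numeric when nothing is running.
    case "$md_speed" in
        ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac
    eta_seconds "$md_done" "$md_total" "$md_speed"
}
```

The system-wide floor and ceiling for this speed live in
/proc/sys/dev/raid/speed_limit_min and speed_limit_max (KiB/s per
device). Whether raising the minimum helps this particular reshape is
untested.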

> And apparently you're not using XFS.  This reshape will screw up your
> alignment, and you'll need to change your fstab mount to reflect the new
> RAID geometry.  But my guess is you're not using it.  If you were you'd
> probably be experienced enough to know that doubling your chunk size
> isn't going to make much difference, if any, in real world system usage.

I do use XFS - but this machine's role is a Xen Dom0 - so md2 holds the
filesystems for the guest VMs in LVs. One of those guest filesystems is
an LV of the VG on md2 formatted as XFS. It will be interesting to see
how this affects things :)
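
For an XFS filesystem sitting directly on the array, the mount options
that would need updating can be worked out as below. This is a sketch
only: the XFS sunit and swidth mount options are given in 512-byte
units, the device count of 7 is an assumption (substitute the real
number of devices in md2), and xfs_stripe_opts is a made-up helper name.
With XFS inside a guest on top of LVM, as here, the mapping to the array
geometry is less direct.

```shell
#!/bin/sh
# Hypothetical helper: XFS sunit/swidth mount options for a striped md array.
# Usage: xfs_stripe_opts CHUNK_KIB TOTAL_DEVICES PARITY_DEVICES
xfs_stripe_opts() {
    # sunit is the chunk size expressed in 512-byte units.
    xso_sunit=$(( $1 * 2 ))
    # swidth covers only the data-bearing devices in each stripe.
    xso_swidth=$(( xso_sunit * ($2 - $3) ))
    echo "sunit=${xso_sunit},swidth=${xso_swidth}"
}

# Assumed example: 128 KiB chunk, 7-device RAID6 (2 parity devices).
xfs_stripe_opts 128 7 2
```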

>> I set a few 'optimisations' that I believe should help:
>> ## Tweak the RAIDs
>> blockdev --setra 8192 /dev/sd[abcdefg]
>
> Read-ahead is per file descriptor, and occurs at the filesystem level.
> The read-ahead value used is that of the device immediately underlying
> the filesystem.  So don't bother setting these above.

Interesting - I didn't think that was the case for whole disk arrays - 
but there you go... Learnt something else :)

>> blockdev --setra 8192 /dev/md0
>> blockdev --setra 8192 /dev/md1
>> blockdev --setra 16384 /dev/md2
>
> This is fine.  You could theoretically set this to 1GB or more if you
> always read entire files, with no ill effects, as read-ahead doesn't go
> past EOF.  However if you do any mmap reads (many apps do) of portions
> of large files, this will hammer performance, obviously, as you're
> reading entire large files speculatively when not needed.  Play with
> this at your own risk.

The workloads of the array (having LVM on top) for the VMs would 
probably make it quite random. This is part of the reason I am playing 
here - pure experimentation. I am very curious to see if it works better 
or worse after the reshape. I honestly don't know :)

>> echo 16384 > /sys/block/md2/md/stripe_cache_size
>>
>> for i in sda sdb sdc sdd sde sdf; do
>>          echo "Setting options for $i"
>>          echo 256 > /sys/block/$i/queue/nr_requests
>>          echo 4096 > /sys/block/$i/queue/read_ahead_kb
> Eliminate this line ^^^^

Any insight into why? I would have thought that this would help - 
however I'm not quite sure as to the values - as this is much less than 
one chunk... That also being said, wouldn't it be a good idea to have 
*some* readahead?
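
To keep the units straight: blockdev --setra takes a count of 512-byte
sectors, while read_ahead_kb takes KiB, so the two settings in my script
describe the same per-disk value in different units. A small sketch of
the conversion (setra_to_kib and kib_to_setra are just illustrative
names):

```shell
#!/bin/sh
# blockdev --setra takes 512-byte sectors; read_ahead_kb takes KiB.
setra_to_kib() {
    echo $(( $1 / 2 ))
}

kib_to_setra() {
    echo $(( $1 * 2 ))
}

# The two --setra values used in the script above.
for sectors in 8192 16384; do
    echo "--setra $sectors = $(setra_to_kib "$sectors") KiB"
done
```

So --setra 8192 on a component disk and read_ahead_kb=4096 on the same
disk both ask for 4 MiB of read-ahead.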

>>          echo 1 > /sys/block/$i/device/queue_depth
>>          echo deadline > /sys/block/$i/queue/scheduler
>> done
>>
>> Just wondering if anyone knows of any possible way to speed up the
>> reshape a little, or if (like I suspect) it will take ~2 days to
>> complete the reshape.
>
> Considering how expensive such operations are in both time and wear on
> the disk drives, it's better to read everything available to you on the
> subject and ask questions *before* performing expensive experiments on
> your array.  If you currently have a performance problem you're trying
> to solve, the cause lies somewhere other than your chunk size.

As I said above, there really is no 'problem' I'm trying to solve. The 
whole reason is experimentation and education - really to see a 'what 
if' case. The last reshape I did on this array was a RAID5->RAID6 grow
which went very well - however I have never experimented with chunk size
on an mdadm RAID.
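
On the stripe_cache_size value in my script: the kernel md
documentation gives the memory cost as page size x number of devices x
stripe_cache_size. A quick sketch of that arithmetic, assuming 4 KiB
pages and 7 devices in md2 (both assumptions, and stripe_cache_mib is a
made-up helper name):

```shell
#!/bin/sh
# Hypothetical helper: approximate stripe cache memory in MiB.
# Usage: stripe_cache_mib ENTRIES DEVICES [PAGE_BYTES]
stripe_cache_mib() {
    # Page size defaults to 4096 bytes (assumed; check with getconf PAGESIZE).
    scm_page=${3:-4096}
    echo $(( $1 * $2 * scm_page / 1048576 ))
}

# Assumed example: stripe_cache_size of 16384 on a 7-device array.
stripe_cache_mib 16384 7
```

Under those assumptions the setting costs roughly 448 MiB of RAM.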

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: http://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
Fax: (03) 8338 0299
