From: Bill Davidsen <davidsen@tmr.com>
To: linux-raid@vger.kernel.org
Subject: Re: Linux RAID Partition Offset 63 cylinders / 30% performance hit?
Date: Tue, 25 Dec 2007 14:06:41 -0500
Message-ID: <477154C1.6000503@tmr.com>
In-Reply-To: <20071219215948.GA7129@cthulhu.home.robinhill.me.uk>

Robin Hill wrote:
> On Wed Dec 19, 2007 at 09:50:16AM -0500, Justin Piszcz wrote:
>
>   
>> The (up to) 30% percent figure is mentioned here:
>> http://insights.oetiker.ch/linux/raidoptimization.html
>>
>>     
> That looks to be referring to partitioning a RAID device - this'll only
> apply to hardware RAID or partitionable software RAID, not to the normal
> use case.  When you're creating an array out of standard partitions then
> you know the array stripe size will align with the disks (there's no way
> it cannot), and you can set the filesystem stripe size to align as well
> (XFS will do this automatically).
>
> I've actually done tests on this with hardware RAID to try to find the
> correct partition offset, but wasn't able to see any difference (using
> bonnie++ and moving the partition start by one sector at a time).
>
>   
>> # fdisk -l /dev/sdc
>>
>> Disk /dev/sdc: 150.0 GB, 150039945216 bytes
>> 255 heads, 63 sectors/track, 18241 cylinders
>> Units = cylinders of 16065 * 512 = 8225280 bytes
>> Disk identifier: 0x5667c24a
>>
>>    Device Boot      Start         End      Blocks   Id  System
>> /dev/sdc1               1       18241   146520801   fd  Linux raid autodetect
>>
>>     
> This looks to be a normal disk - the partition offsets shouldn't be
> relevant here (barring any knowledge of the actual physical disk layout
> anyway, and block remapping may well make that rather irrelevant).
>   
The issue I'm thinking about is hardware sector size, which on modern
drives may be larger than 512 bytes and therefore entail a
read-alter-rewrite (RAR) cycle when writing a 512-byte block. With
larger writes, if the alignment is poor and the write size is some
multiple of 512 bytes, it's possible to have a RAR at each end of the
write. The only way to have a hope of controlling the alignment is to
write to a raw device, or to use a filesystem which can be configured to
have blocks which are a multiple of the sector size and to do all I/O in
block-sized units, starting each file on a block boundary. That may be
possible with ext[234] set up properly.
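
As a rough sketch of the alignment arithmetic above (not a measurement
of any real drive), the following Python counts how many physical
sectors a write covers only partially, each of which would need a RAR.
The 4096-byte physical sector size, the function name rar_sectors, and
the partition starting at LBA 63 are assumptions for illustration only.

```python
def rar_sectors(offset, length, phys=4096):
    """Count physical sectors that a write covers only partially.

    offset and length are in bytes from the start of the disk.
    phys is the assumed physical sector size (hypothetical: 4096).
    Each partially covered sector needs a read-alter-rewrite.
    """
    if length <= 0:
        return 0
    end = offset + length
    first = offset // phys          # physical sector holding the first byte
    last = (end - 1) // phys        # physical sector holding the last byte
    head_partial = offset % phys != 0
    tail_partial = end % phys != 0
    if first == last:
        # The whole write sits inside one physical sector.
        return 1 if (head_partial or tail_partial) else 0
    return int(head_partial) + int(tail_partial)


LOGICAL = 512  # logical sector size in bytes

# One 4096-byte filesystem block at the start of a partition that
# begins at LBA 63 (assumed) versus LBA 64.
unaligned = rar_sectors(63 * LOGICAL, 4096)
aligned = rar_sectors(64 * LOGICAL, 4096)
print("start at LBA 63:", unaligned, "RAR sector(s)")
print("start at LBA 64:", aligned, "RAR sector(s)")
```

Under these assumptions the same 4096-byte block straddles two physical
sectors when the partition starts at LBA 63, and none when it starts at
LBA 64.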

Why this is important: knowing the physical layout of the drive is
useful, but a large write will still have to step from one cylinder to
another some number of times. By carefully choosing the starting point,
the best you can do is eliminate two track-to-track (t2t) seeks, one at
the start and one at the end. If the writes are small, only one t2t
saving is possible.

Now consider a RAR process. The drive is typically spinning at 7200 rpm,
or 8.333 ms/rev. A read might take 0.5 rev on average, and a RAR will
take 1.5 rev, because it takes a full revolution after the original data
is read before the altered data can be rewritten. Larger sectors give
more capacity, but reduced write performance. And doing small writes
can result in paying the RAR penalty on every write. So there may be a
measurable benefit to getting that alignment right at the drive level.
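
The timing above can be worked through in a few lines of Python. This
only restates the rotational arithmetic (0.5 rev for an average read,
1.5 rev for a RAR) and ignores seek and transfer time; the function
names are made up for the example.

```python
def ms_per_rev(rpm):
    """Time for one platter revolution, in milliseconds."""
    return 60_000.0 / rpm


def rotational_cost_ms(rpm, revolutions):
    """Rotational delay for the given number of revolutions."""
    return ms_per_rev(rpm) * revolutions


RPM = 7200
read_ms = rotational_cost_ms(RPM, 0.5)  # average wait for the sector
rar_ms = rotational_cost_ms(RPM, 1.5)   # read, then a full turn, then write

print(f"one revolution: {ms_per_rev(RPM):.3f} ms")
print(f"average read:   {read_ms:.3f} ms")
print(f"RAR:            {rar_ms:.3f} ms")
```

At 7200 rpm this works out to roughly 4.2 ms of rotational delay for an
average read against 12.5 ms for a RAR, about three times as long.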

-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismarck


