linux-fsdevel.vger.kernel.org archive mirror
From: Bill Davidsen <davidsen@tmr.com>
To: Neil Brown <neilb@suse.de>
Cc: Mike Snitzer <snitzer@redhat.com>,
	linux-scsi@vger.kernel.org, jens.axboe@oracle.com,
	linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	linux-ide@vger.kernel.org,
	device-mapper development <dm-devel@redhat.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	linux-fsdevel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Alasdair G Kergon <agk@redhat.com>
Subject: Re: REQUEST for new 'topology' metrics to be moved out of the 'queue' sysfs directory.
Date: Tue, 07 Jul 2009 18:06:53 -0400
Message-ID: <4A53C6FD.3010603@tmr.com>
In-Reply-To: <19026.43290.340555.774690@notabene.brown>

Neil Brown wrote:
> On Friday June 26, martin.petersen@oracle.com wrote:
>   
>> As far as making the application of these values more obvious I propose
>> the following:
>>
>> What:		/sys/block/<disk>/queue/minimum_io_size
>> Date:		April 2009
>> Contact:	Martin K. Petersen <martin.petersen@oracle.com>
>> Description:
>> 		Storage devices may report a granularity or minimum I/O
>> 		size which is the device's preferred unit of I/O.
>> 		Requests smaller than this may incur a significant
>> 		performance penalty.
>>
>> 		For disk drives this value corresponds to the physical
>> 		block size. For RAID devices it is usually the stripe
>> 		chunk size.
>>     
>
> These two paragraphs are contradictory.  There is no sense in which a
> RAID chunk size is a preferred minimum I/O size.
>
> To some degree it is actually a 'maximum' preferred size for random
> IO.  If you do random IO in blocks larger than the chunk size then you
> risk causing more 'head contention' (at least with RAID0 - with RAID5
> the tradeoff is more complex).
>
>   
Actually this is the allocation unit, and the array can be assumed to be
a series of contiguous byte ranges of this size. Given LBA addressing,
array members which are not simple whole devices, etc., this doesn't
(can't) promise much about the physical layout. Any read which resides
entirely within a chunk would not incur a performance penalty, although
a write might if it were not a multiple of the sector size of the array
member(s) involved.
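
To make that concrete, here is a minimal sketch of the two checks I have
in mind. The function names and the 512-byte default sector size are my
own illustration, not anything the kernel exports.

```python
def within_one_chunk(offset, length, chunk_size):
    """Return True if bytes [offset, offset + length) lie in a single chunk."""
    if length <= 0:
        raise ValueError("length must be positive")
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    return first_chunk == last_chunk


def is_sector_multiple(offset, length, sector_size=512):
    """Return True if the request starts and ends on sector boundaries.

    sector_size=512 is an assumed default; the real value depends on the
    array member involved.
    """
    return offset % sector_size == 0 and length % sector_size == 0
```

A read for which within_one_chunk() is true should see no penalty under
the assumption above; a write for which is_sector_multiple() is false is
the case that might.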

> If you are talking about "alignment", then yes - the chunk size is an
> appropriate size to align on.  But so are the block size and the
> stripe size and none is, in general, any better than any other.
>
>   
I would assume that a chunk, aligned on a chunk boundary, would be
allocated as a contiguous series of bytes on the underlying array
member, and that any I/O not aligned on a chunk boundary would be more
likely to access multiple array members.
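
As a rough sketch of that assumption: if chunk k of the array lives on
member k mod N (a simple RAID0-style striping layout that I am assuming
purely for illustration, with a hypothetical function name), the members
touched by a request can be computed like this.

```python
def raid0_members_touched(offset, length, chunk_size, n_members):
    """Return the sorted member indices touched by [offset, offset + length).

    Assumes chunk k is stored on member k % n_members. Real arrays may
    use a different layout, so treat this as an illustration only.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    # A request spanning at least n_members chunks touches every member.
    if last_chunk - first_chunk + 1 >= n_members:
        return list(range(n_members))
    return sorted({chunk % n_members
                   for chunk in range(first_chunk, last_chunk + 1)})
```

With 64 KiB chunks on four members, a 64 KiB request at offset 0 stays on
one member, while the same request shifted by 32 KiB straddles a chunk
boundary and lands on two.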

Feel free to clarify my assumptions.

> Also, you say "may" report.  If a device does not report, what happens
> to this file?  Is it not present, or empty, or does it contain a special
> "undefined" value?
> I think the answer is that "512" is reported.  It might be good to
> explicitly document that.
>   
> I'd really like to see an example of how you expect filesystems to use
> this.
> I can well imagine the VM or elevator using this to assemble IO
> requests into properly aligned requests.  But I cannot imagine how
> e.g. mkfs would use it.
> Or am I misunderstanding and this is for programs that use O_DIRECT on
> the block device so they can optimise their request stream?
>   
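
Until the behaviour for a non-reporting device is documented, a
userspace reader probably has to allow for all three cases raised above.
A minimal sketch, assuming only the path from the proposed documentation;
the function name and the sysfs_root parameter are hypothetical, the
latter added so the code can be tried against a scratch directory.

```python
from pathlib import Path


def read_minimum_io_size(disk, sysfs_root="/sys/block"):
    """Return minimum_io_size for a disk in bytes, or None if unavailable.

    Reads <sysfs_root>/<disk>/queue/minimum_io_size. A missing or empty
    file yields None rather than a guessed default.
    """
    path = Path(sysfs_root) / disk / "queue" / "minimum_io_size"
    try:
        text = path.read_text().strip()
    except FileNotFoundError:
        return None
    return int(text) if text else None
```

If 512 really is what gets reported when the device says nothing, the
caller cannot tell that apart from a device whose preferred unit is
actually 512 bytes, which is one more reason to document it.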

-- 
Bill Davidsen <davidsen@tmr.com>
  Obscure bug of 2004: BASH BUFFER OVERFLOW - if bash is being run by a
normal user and is setuid root, with the "vi" line edit mode selected,
and the character set is "big5," an off-by-one error occurs during
wildcard (glob) expansion.
