From: Andreas Dilger <adilger@sun.com>
To: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 0 of 3] [RFC] I/O Hints
Date: Thu, 05 Jun 2008 00:27:54 -0600
Message-ID: <20080605062754.GQ2961@webber.adilger.int>
In-Reply-To: <patchbomb.1212643369@sermon.lab.mkp.net>

On Jun 05, 2008 01:22 -0400, Martin K. Petersen wrote:
> This is just a proof of concept set of patches. I'd like some
> feedback before I spend more time on them.
>
> At the Filesystem & Storage Workshop there was lots of discussion
> about how to communicate I/O alignment, stripe width, etc. to the
> filesystems so they could lay out things properly.

Thanks for looking into this, Martin. We could use this pretty
immediately in ext4 to help the block allocator make good
decisions.

> An addition to the up-and-coming version of the SCSI block protocol
> features an inquiry page that hardware RAIDs can use to indicate
> preferred I/O sizes for a given LUN.
>
> This patch kit implements support for exporting those values in
> /sys/block/. I have implemented support for it in sd.c using the
> Block Limits VPD and in MD using chunk size and stripe width.
>
> The physical sector offset for the start of the "virtual" block device
> is also exported. This includes partitions so you can get the actual
> physical start sector offset for - say - an MD device sitting on a
> partitioned set of drives.

The kernel part of the code seems pretty reasonable (nicely stackable,
as your MD examples show) and useful for filesystems. Having this
information available in the kernel removes much of the need to find
this information in userspace, but unfortunately not all of it (e.g.
some mkfs-time layout decisions need to be made before the filesystem
is mounted, even if the allocator can use the kernel-supplied hints).

To be honest, however, having the information exported only via sysfs
is a bit ugly IMHO. I've had all sorts of grief with settings there
because there isn't always a match between the device that is being
specified by the user and what appears in sysfs (e.g. /dev/disk/by-id/foo
doesn't match /sys/block/sda) and hoops have to be jumped through to
find this mapping, before parsing a text value in C.
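
To make the hoops concrete, here is a minimal sketch of that lookup in C. It
assumes a kernel that provides the /sys/dev/block/MAJ:MIN links (they are not
part of the patch set under discussion), and the helper names
blkdev_number(), sysfs_attr_path() and read_ulong_file() are made up for
illustration. The attribute name is left to the caller because the names the
patches export are not given in this message.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Resolve any name for a block device to its device number.
 * stat() follows symlinks, so /dev/disk/by-id/... names work too.
 * Returns 0 on success, -1 if the path is missing or not a block device. */
int blkdev_number(const char *devname, dev_t *dev)
{
    struct stat st;

    if (stat(devname, &st) != 0 || !S_ISBLK(st.st_mode))
        return -1;
    *dev = st.st_rdev;
    return 0;
}

/* Build "/sys/dev/block/MAJ:MIN/<attr>" (assumed sysfs layout).
 * Returns 0 on success, -1 if the buffer is too small. */
int sysfs_attr_path(dev_t dev, const char *attr, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/sys/dev/block/%u:%u/%s",
                     major(dev), minor(dev), attr);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Parse a text file that holds a single unsigned decimal value. */
int read_ulong_file(const char *path, unsigned long *val)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%lu", val) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

A caller would chain the three helpers: stat the user-supplied name, format
the sysfs path, then parse the text value. That is three failure points for
what is conceptually a single integer query.
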
Having an ioctl() that can be called on the block device (getting
the right device regardless of its name) seems a lot more useful to
applications in my experience, unless you are using a script.
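
For comparison, a rough sketch of what the ioctl() route could look like from
an application. It uses the BLKIOMIN and BLKIOOPT requests that later kernels
define in <linux/fs.h>. They are an assumption here and are not provided by
the patches in this thread. The struct io_hints type and get_io_hints() name
are hypothetical.

```c
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical container for the two hint values, both in bytes. */
struct io_hints {
    unsigned int min_io;   /* filled by BLKIOMIN */
    unsigned int opt_io;   /* filled by BLKIOOPT */
};

/* Query the hints on whatever device node the caller names.
 * Returns 0 on success, -1 if the open or either ioctl fails. */
int get_io_hints(const char *devname, struct io_hints *h)
{
    int fd = open(devname, O_RDONLY);
    int rc = -1;

    if (fd < 0)
        return -1;
    if (ioctl(fd, BLKIOMIN, &h->min_io) == 0 &&
        ioctl(fd, BLKIOOPT, &h->opt_io) == 0)
        rc = 0;
    close(fd);
    return rc;
}
```

Because the query goes through the open file descriptor, the name used to
reach the device does not matter, and there is no text parsing step.
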

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.