linux-xfs.vger.kernel.org archive mirror
From: Ric Wheeler <ricwheeler@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Eric Sandeen <sandeen@sandeen.net>,
	Ilya Dryomov <idryomov@gmail.com>,
	xfs <linux-xfs@vger.kernel.org>, Mark Nelson <mnelson@redhat.com>,
	Eric Sandeen <sandeen@redhat.com>,
	Mike Snitzer <snitzer@redhat.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	IDE/ATA development list <linux-ide@vger.kernel.org>,
	device-mapper development <dm-devel@redhat.com>,
	Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org
Subject: Re: block layer API for file system creation - when to use multidisk mode
Date: Sat, 1 Dec 2018 15:52:31 -0500
Message-ID: <80505ddf-8c6f-50d7-1e6d-2e50e7349c6f@gmail.com>
In-Reply-To: <20181201043509.GZ6311@dastard>

On 11/30/18 11:35 PM, Dave Chinner wrote:
> On Fri, Nov 30, 2018 at 01:00:52PM -0500, Ric Wheeler wrote:
>> On 11/30/18 7:55 AM, Dave Chinner wrote:
>>> On Thu, Nov 29, 2018 at 06:53:14PM -0500, Ric Wheeler wrote:
>>>> Other file systems also need to
>>>> accommodate/probe behind the fictitious visible storage device
>>>> layer... Specifically, is there something we can add per block
>>>> device to help here? Number of independent devices
>>> That's how mkfs.xfs used to do stripe unit/stripe width calculations
>>> automatically on MD devices back in the 2000s. We got rid of that
>>> for more generally applicable configuration information such as
>>> minimum/optimal IO sizes so we could expose equivalent alignment
>>> information from lots of different types of storage device....
>>>
>>>> or a map of
>>>> those regions?
>>> Not sure what this means or how we'd use it.
>>> Dave.
>> What I was thinking of was a way of giving us a good outline of how
>> many independent regions are behind one "virtual" block device
>> like a ceph rbd or device mapper device. My assumption is that we
>> are trying to lay down (at least one) allocation group per region.
>>
>> What we need to optimize for includes:
>>
>>      * how many independent regions are there?
>>
>>      * what are the boundaries of those regions?
>>
>>      * optimal IO size/alignment/etc
>>
>> Some of that we have, but the current assumptions don't work well
>> for all device types.
> Oh, so essentially "independent regions" of the storage device. I
> wrote this in 2008:
>
> http://xfs.org/index.php/Reliable_Detection_and_Repair_of_Metadata_Corruption#Failure_Domains
>
> This was derived from the ideas in prototype code I wrote in ~2007
> to try to optimise file layout and load distribution across linear
> concats of multi-TB RAID6 luns. Some of that work was published
> long after I left SGI:
>
> https://marc.info/?l=linux-xfs&m=123441191222714&w=2
>
> Essentially, these are independent regions - called "Logical
> Extension Groups", or "legs" of the filesystem - and each would
> be an aggregation of the AGs in that region. The
> concept was that we'd move the geometry information from the
> superblock into the legs, and so we could have different AG
> geometry optimised for each independent leg of the filesystem.
>
> eg the SSD region could have numerous small AGs, the large,
> contiguous RAID6 part could have maximally sized AGs or even make use
> of the RT allocator for free space management instead of the
> AG/btree allocator. Basically it was seen as a mechanism for getting
> rid of needing to specify block devices as command line or mount
> options.
>
> Fundamentally, though, it was based on the concept that Linux would
> eventually grow an interface for the block device/volume manager to
> tell the filesystem where the independent regions in the device
> were(*), but that's not something that has ever appeared. If you can
> provide an independent region map in an easy to digest format (e.g. a
> set of {offset, len, geometry} tuples), then we can obviously make
> use of it in XFS....
>
> Cheers,
>
> Dave.
>
> (*) Basically provide a linux version of the functionality Irix
> volume managers had provided filesystems since the late 80s....
>
Hi Dave,

This is exactly the kind of thing I think would be useful. We might want to
have a distinct value (like the rotational flag) that indicates this is a
device with multiple "legs", so that normally we query only that value and
don't have to look for the more complicated information.
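
To make the two-step query concrete, here is a rough sketch of how a mkfs-style
tool might consume it: check a cheap per-device flag first, and only parse the
{offset, len, geometry} map when the flag is set. This is an illustration under
stated assumptions, not an existing interface. The "multi_leg" and "region_map"
attribute names and the three-column text format are hypothetical; only the
/sys/block/<dev>/queue/ directory (where the rotational flag lives) exists
today. "geometry" is reduced to a single optimal IO size to keep it short.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass(frozen=True)
class Region:
    """One independent region ("leg") behind a virtual block device."""
    offset: int      # start of the region, in bytes
    length: int      # size of the region, in bytes
    optimal_io: int  # stand-in for the "geometry" field, in bytes


def read_queue_flag(dev: str, name: str,
                    sysfs: Path = Path("/sys/block")) -> Optional[int]:
    """Return an integer queue attribute, or None if it is absent or invalid."""
    try:
        return int((sysfs / dev / "queue" / name).read_text().strip())
    except (OSError, ValueError):
        return None


def parse_region_map(text: str) -> List[Region]:
    """Parse lines of 'offset len optimal_io' (hypothetical format)."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        regions.append(Region(*(int(f) for f in fields)))
    return regions


def probe_regions(dev: str, sysfs: Path = Path("/sys/block")) -> List[Region]:
    """Query the cheap flag first; read the detailed map only if it is set."""
    # "multi_leg" is a hypothetical attribute, analogous to "rotational".
    if read_queue_flag(dev, "multi_leg", sysfs) != 1:
        return []
    try:
        # "region_map" is also hypothetical.
        text = (sysfs / dev / "queue" / "region_map").read_text()
    except OSError:
        return []
    return parse_region_map(text)
```

An empty list means "treat the device as a single region", which is what every
device would report today since neither attribute exists. A caller could then
lay down at least one allocation group per returned Region.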

Regards,

Ric

