linux-fsdevel.vger.kernel.org archive mirror
From: Andreas Dilger <adilger@sun.com>
To: Sage Weil <sage@newdream.net>
Cc: linux-fsdevel@vger.kernel.org
Subject: Re: common layout xattr
Date: Thu, 16 Jul 2009 01:13:31 -0400
Message-ID: <20090716051331.GL14175@webber.adilger.int>
In-Reply-To: <Pine.LNX.4.64.0907151512250.5776@cobra.newdream.net>

On Jul 15, 2009  15:19 -0700, Sage Weil wrote:
> On Wed, 15 Jul 2009, Andreas Dilger wrote:
> > I'm thinking of using simple ASCII key=value pairs to store basic
> > layout information like chunk size, stripe count, mirror count,
> > RAID type, etc.  Some of them may not be applicable/usable by all
> > filesystems, but having a handful of "well known" keys and values
> > for a common xattr name would at least be better than what we have
> > now (which is nothing).
> > 
> > Something like (not necessarily a firm proposal yet):
> > 
> > trusted.common_layout:
> > chunk_bytes=65536
> > stripe_count=32
> > mirror_count=3
> > raid_type=1+0
> > 
> > Is this something you would be interested to pursue?  I've also discussed
> > this with Panasas, and they had some interest in this as well.  Any GPFS
> > developers watching?
> 
> This sounds like a good idea to me.  I think the main hurdle is going to
> be defining a generalized layout description that captures the full
> space of layouts for each file system, and also translates gracefully
> between them.  IIRC Lustre, for instance, will stripe over $stripe_count 
> objects, while Ceph (and Panasas?) will stripe up to some $max_object_size 
> and then move on to a new set of objects.  Or stagger chunk order in 
> successive stripes, etc.

Well, I don't think we can capture all of the details for every
filesystem, but I'm hoping we can get some of the main parameters
working.  Having additional attributes that are more filesystem
specific is fine too (to a reasonable extent of course).
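
As a rough illustration of how a consumer might separate the "well
known" keys from filesystem-specific ones, here is a minimal Python
sketch.  It assumes the pairs are newline-separated, as in the example
quoted above; the key sets and the name parse_layout() are hypothetical
and not part of any agreed format.

```python
# Hypothetical "well known" keys, taken from the example in this thread.
# The final list (and the separator between pairs) is an assumption.
WELL_KNOWN_KEYS = {"chunk_bytes", "stripe_count", "mirror_count", "raid_type"}
INTEGER_KEYS = {"chunk_bytes", "stripe_count", "mirror_count"}


def parse_layout(text):
    """Split a newline-separated key=value blob into (common, extra) dicts.

    Keys in WELL_KNOWN_KEYS go into `common` (integers are converted);
    anything else is kept verbatim in `extra` as a filesystem-specific value.
    """
    common, extra = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value pair: {line!r}")
        if key in INTEGER_KEYS:
            common[key] = int(value)
        elif key in WELL_KNOWN_KEYS:
            common[key] = value
        else:
            extra[key] = value
    return common, extra


example = "chunk_bytes=65536\nstripe_count=32\nmirror_count=3\nraid_type=1+0\n"
common, extra = parse_layout(example)
```

A key that is not in the common set simply lands in `extra`, so adding
filesystem-specific attributes does not break existing readers.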

For parts of the layout that are generated programmatically, like the
Ceph/Panasas striping order, I don't think that has to be encoded
explicitly into the layout xattr, since I'd assume the pattern is
always the same between files (e.g. use $stripe_count objects until
$max_object_size bytes, then a different set of $stripe_count objects
for $max_object_size bytes).  The fact that Lustre uses the same
$stripe_count objects for the whole file, and would therefore ignore
$max_object_size, is below the level of detail that I'm currently
interested in.  In the reverse direction, I'd assume that Ceph/Panasas
would fill in the value for $max_object_size from a default, as if no
layout had been specified.
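
To make the two patterns concrete, the following Python sketch maps a
file offset to an object under the simplified model described above.
It is an illustration only, not the actual Lustre, Ceph, or Panasas
placement code: the function name locate() is hypothetical, and it
assumes $max_object_size is a multiple of the chunk size.

```python
def locate(offset, chunk_bytes, stripe_count, max_object_size=None):
    """Return (object_number, offset_within_object) for a file byte offset.

    max_object_size=None models striping over the same stripe_count
    objects for the whole file.  Otherwise each set of stripe_count
    objects is filled to max_object_size bytes per object, after which
    a new set of stripe_count objects is used.
    """
    if max_object_size is None:
        set_index, offset_in_set = 0, offset
    else:
        if max_object_size % chunk_bytes:
            raise ValueError("max_object_size must be a multiple of chunk_bytes")
        set_bytes = stripe_count * max_object_size
        set_index, offset_in_set = divmod(offset, set_bytes)

    # Plain round-robin striping within the current set of objects.
    chunk, within_chunk = divmod(offset_in_set, chunk_bytes)
    stripe, obj_in_set = divmod(chunk, stripe_count)
    object_number = set_index * stripe_count + obj_in_set
    return object_number, stripe * chunk_bytes + within_chunk


# Same offset, same chunk_bytes and stripe_count, two behaviours:
whole_file = locate(524288, 65536, 4)                          # one object set
object_sets = locate(524288, 65536, 4, max_object_size=131072)  # moves to a new set
```

Both calls share chunk_bytes and stripe_count; only the treatment of
$max_object_size differs, which is why it need not be spelled out in
the common part of the layout.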


Filesystems are free to ignore parameters they don't like, and/or save them
and return them again when asked (probably with a flag that indicates they
are not currently in use), basically treating them as an opaque user xattr.
This will preserve the settings across an fsX -> fsY -> fsX transfer.
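
A minimal Python sketch of that behaviour, under stated assumptions:
the class LayoutStore, the helper transfer(), and the representation
of the "not in use" flag as a boolean are all hypothetical, since how
the flag would be encoded in the xattr has not been decided.

```python
class LayoutStore:
    """Toy model of one filesystem's handling of the common layout xattr."""

    def __init__(self, supported_keys):
        self.supported = set(supported_keys)
        self.pairs = {}

    def set_layout(self, pairs):
        # Keep every pair, including ones this filesystem cannot use.
        self.pairs = dict(pairs)

    def active(self):
        # Only the parameters this filesystem actually applies.
        return {k: v for k, v in self.pairs.items() if k in self.supported}

    def get_layout(self):
        # Return everything, with a flag saying whether each pair is in use.
        return [(k, v, k in self.supported) for k, v in self.pairs.items()]


def transfer(src, dst):
    """Copy the layout from one filesystem to another, dropping the flag."""
    dst.set_layout((k, v) for k, v, _in_use in src.get_layout())


layout = {"chunk_bytes": "65536", "stripe_count": "32", "max_object_size": "4194304"}
fs_x = LayoutStore({"chunk_bytes", "stripe_count", "max_object_size"})
fs_y = LayoutStore({"chunk_bytes", "stripe_count"})
fs_x2 = LayoutStore({"chunk_bytes", "stripe_count", "max_object_size"})

fs_x.set_layout(layout)
transfer(fs_x, fs_y)   # fs_y ignores max_object_size but keeps it
transfer(fs_y, fs_x2)  # the original setting is back in use
```

Here fs_y stores max_object_size without applying it, so the value is
still available when the file returns to a filesystem that uses it.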

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


