CEPH filesystem development
From: Mark Nelson <mark.nelson@inktank.com>
To: Sage Weil <sage@inktank.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: parent xattrs on file objects
Date: Tue, 16 Oct 2012 16:32:59 -0500
Message-ID: <507DD28B.4030605@inktank.com>
In-Reply-To: <alpine.DEB.2.00.1210161410490.5344@cobra.newdream.net>

On 10/16/2012 04:17 PM, Sage Weil wrote:
> Hey-
>
> One of the design goals of the ceph fs was to keep metadata separate from
> data.  This means, among other things, that when a client is creating a
> bunch of files, it creates the inode via the mds and writes the file data
> to the OSD, but no mds->osd interaction is necessary.
>
> One of the challenges we currently have is that it is difficult to look up
> an inode by ino.  Normally clients traverse the hierarchy to get there, so
> things are fine for native ceph clients, but when reexporting via NFS we
> can get ESTALE because an ancient NFS file handle can be presented and
> the ceph MDS won't know where to find it.  We have a similar problem with
> the fsck design in that it is not always possible to discover orphaned
> children of a directory that was somehow lost.
>
> One option is to put an ancestor xattr on the first object for each file,
> similar to what we do for directories.  This basically means that each
> file creation will be followed (eventually) by a setxattr osd operation.
> This used to scare me, but now it's seeming like a pretty small price to
> pay for robust NFS reexport and additional information for fsck to
> utilize.
>

Seems like a small price to pay, especially for large writes.  How much
later does the setxattr happen?  For small writes, any idea whether a
delayed setxattr is going to cause an additional seek?
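
To make the proposal quoted above concrete, here is a minimal sketch of
what an ancestor xattr on a file's first object could look like, and how
it would let something find a path from a bare ino. Everything in it is
an assumption for illustration: the in-memory dict standing in for the
OSD, the "parent" xattr name, the JSON encoding, and the object naming
(hex ino plus a zero-padded object index) are not taken from this thread
or from the actual implementation.

```python
import json

# Hypothetical stand-in for an object store:
# object name -> {xattr name: raw bytes}.
object_xattrs = {}


def first_object_name(ino):
    # Assumed naming scheme: hex inode number, then object index 0.
    return f"{ino:x}.{0:08x}"


def set_ancestor_xattr(ino, ancestors):
    """Record the ancestor chain on the file's first object.

    ancestors is a list of (parent_dir_ino, dentry_name) pairs,
    nearest parent first. The encoding here is illustrative only.
    """
    payload = json.dumps(ancestors).encode()
    object_xattrs.setdefault(first_object_name(ino), {})["parent"] = payload


def path_hint_for_ino(ino):
    """Return path components root-first, or None if no xattr exists."""
    xattrs = object_xattrs.get(first_object_name(ino))
    if not xattrs or "parent" not in xattrs:
        return None
    ancestors = json.loads(xattrs["parent"].decode())
    # Stored nearest-parent-first, so reverse to get a root-first path.
    return [name for _parent_ino, name in reversed(ancestors)]


# Usage: a file "file.txt" under /home/docs, looked up by ino alone.
set_ancestor_xattr(
    0x10000000005,
    [(0x10000000002, "file.txt"), (0x10000000001, "docs"), (1, "home")],
)
print(path_hint_for_ino(0x10000000005))
```

Because the setxattr follows the create only "eventually", a result like
this would be a hint to verify against the live hierarchy rather than an
authoritative path.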

> It's also nice because it means we could get rid of the anchor table (used
> for locating files with multiple hard links) entirely and use the
> ancestor xattrs instead.  That means one less thing to fsck, and avoids
> having to invest any time in making the anchor table effectively scale (it
> currently doesn't).
>
> Anyone feel like we shouldn't go ahead and do this?
>
> sage

