From: "J. Bruce Fields" <bfields@fieldses.org>
To: Sage Weil <sage@newdream.net>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 16/20] ceph: nfs re-export support
Date: Fri, 17 Jul 2009 12:57:24 -0400
Message-ID: <20090717165724.GA13834@fieldses.org>
In-Reply-To: <Pine.LNX.4.64.0907170714500.25768@cobra.newdream.net>

On Fri, Jul 17, 2009 at 09:49:06AM -0700, Sage Weil wrote:
> On Fri, 17 Jul 2009, J. Bruce Fields wrote:
> > On Thu, Jul 16, 2009 at 03:07:35PM -0700, Sage Weil wrote:
> > > On Thu, 16 Jul 2009, Trond Myklebust wrote:
> > > > On Thu, 2009-07-16 at 12:50 -0700, Sage Weil wrote:
> > > > > On Thu, 16 Jul 2009, J. Bruce Fields wrote:
> > > > > > On Wed, Jul 15, 2009 at 02:24:46PM -0700, Sage Weil wrote:
> > > > > > > Basic NFS re-export support is included. This mostly works. However,
> > > > > > > Ceph's MDS design precludes the ability to generate a (small)
> > > > > > > filehandle that will be valid forever, so this is of limited utility.
> > > > > >
> > > > > > Is there any hope of fixing that?
> > > > >
> > > > > Yes, but it requires some additional ondisk metadata the MDS isn't
> > > > > maintaining yet (a parent directory backpointer on file objects).
> > > > >
> > > > > The MDS changes will mean more random IO for rename intensive workloads,
> > > > > but the backpointers would also be useful for rebuilding the directory
> > > > > tree in the event of some catastrophic metadata loss or corruption.
> > > > > (Currently they're only there for directories, not all files.)
> > > >
> > > > Note that a filehandle that contains parent directory information is
> > > > still not one that is valid forever. It will change in the case of a
> > > > cross-directory rename, and so isn't a filehandle in the NFSv2/v3 sense.
> > > > Even in the NFSv4 case, it would have to be labelled as 'volatile'.
> > >
> > > Right. The parent directory information in the fh is used as a hint, but
> > > can't be relied on because of the rename problem. That's exactly why the
> > > Ceph MDS will need to be changed to maintain backpointers on all files,
> > > not just directories. When that happens, reexporting via NFS will work
> > > reliably. Until then, old and idle filehandles for renamed files will
> > > eventually go stale.
> >
> > Maybe I should look again at the patch instead of continuing to ask,
> > but.... I'm confused: how will having backpointers from inodes to
> > directories help do the filehandle-to-inode lookup? (If you can't look
> > up the inode in the first place, what use is any pointer stored in that
> > inode?)
>
> The backpointers will be on the first file data or directory metadata
> object in the object store, which is random access. The inode itself is
> embedded in the containing directory's metadata object, so no backpointer
> is needed there. (The MDS' embedded inodes trade random access to inodes
> for directory prefetching/locality.)

Oh, OK, I think what I didn't understand was the distinction between
"file data object" and "inode". So the filehandle will be some kind of
pointer to the file data object, from which you'll then be able to look
up the directory and the inode that's stored in it. And rename will
require modifying that file data object, not the filehandle, so
filehandles will remain stable over rename.

Thanks for the explanation!
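
To make sure I have the lookup path right, here is a minimal sketch of it
as a toy Python model. This is not code from the patch or the MDS: every
name in it (ToyStore, Backpointer, lookup_by_filehandle, and so on) is
hypothetical, and it assumes the filehandle is simply the inode number
used to locate the first file data object.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    ino: int


@dataclass
class Backpointer:
    parent_ino: int  # directory whose metadata object embeds the inode
    name: str        # entry name within that directory


@dataclass
class ToyStore:
    # ino -> backpointer stored on the file's first data object
    # (the object store is random access, so this lookup needs no path)
    first_objects: dict = field(default_factory=dict)
    # directory ino -> {name: Inode}; inodes live inside the directory object
    directories: dict = field(default_factory=dict)

    def create(self, parent_ino, name, ino):
        self.directories.setdefault(parent_ino, {})[name] = Inode(ino)
        self.first_objects[ino] = Backpointer(parent_ino, name)
        return ino  # assumption: the filehandle is just the inode number

    def rename(self, ino, new_parent_ino, new_name):
        bp = self.first_objects[ino]
        inode = self.directories[bp.parent_ino].pop(bp.name)
        self.directories.setdefault(new_parent_ino, {})[new_name] = inode
        # Only the backpointer on the data object changes, not the filehandle.
        self.first_objects[ino] = Backpointer(new_parent_ino, new_name)

    def lookup_by_filehandle(self, fh):
        bp = self.first_objects.get(fh)
        if bp is None:
            raise LookupError("stale filehandle")
        # Follow the backpointer to the directory, then to the embedded inode.
        return self.directories[bp.parent_ino][bp.name]


store = ToyStore()
fh = store.create(parent_ino=1, name="a.txt", ino=100)
store.rename(fh, new_parent_ino=2, new_name="b.txt")
inode = store.lookup_by_filehandle(fh)  # same fh still resolves after rename
```

The point of the model is only that a cross-directory rename rewrites the
backpointer and the directory entries, while the filehandle handed out
earlier is left untouched.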
> But this is all outside the scope of the client, so you wouldn't find it
> in the patch. The result is just that there is no reliable lookup-by-ino
> operation, but it's only needed for NFS reexport. (Hard links are
> resolved by the server using a slightly different mechanism.)

Got it.

--b.