CEPH filesystem development
* parent xattrs on file objects
@ 2012-10-16 21:17 Sage Weil
  2012-10-16 21:26 ` Gregory Farnum
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Sage Weil @ 2012-10-16 21:17 UTC (permalink / raw)
  To: ceph-devel

Hey-

One of the design goals of the ceph fs was to keep metadata separate from 
data.  This means, among other things, that when a client is creating a 
bunch of files, it creates the inode via the mds and writes the file data 
to the OSD, but no mds->osd interaction is necessary.

One of the challenges we currently have is that it is difficult to look up 
an inode by ino.  Normally clients traverse the hierarchy to get there, so 
things are fine for native ceph clients, but when reexporting via NFS we 
can get ESTALE because an ancient NFS file handle can be presented and 
the ceph MDS won't know where to find it.  We have a similar problem with 
the fsck design in that it is not always possible to discover orphaned 
children of a directory that was somehow lost.

One option is to put an ancestor xattr on the first object for each file, 
similar to what we do for directories.  This basically means that each 
file creation will be followed (eventually) by a setxattr osd operation.  
This used to scare me, but now it's seeming like a pretty small price to 
pay for robust NFS reexport and additional information for fsck to 
utilize.
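
To make the idea concrete, here is a rough sketch in Python, not actual 
ceph code.  Everything in it is an assumption for illustration: the 
FakeObjectStore class, the "ancestors" xattr key, the JSON encoding, and 
the object naming in first_object_name().  It only shows that an ino can 
be mapped back to a path hint by reading one xattr from the file's first 
object, without walking the hierarchy.

```python
import json


class FakeObjectStore:
    """Hypothetical in-memory stand-in for the OSDs: object name -> xattrs."""

    def __init__(self):
        self._xattrs = {}

    def setxattr(self, obj, key, value):
        self._xattrs.setdefault(obj, {})[key] = value

    def getxattr(self, obj, key):
        return self._xattrs[obj][key]


def first_object_name(ino):
    # Assumed naming scheme: hex inode number plus object index 0.
    return f"{ino:x}.{0:08x}"


def record_ancestors(store, ino, ancestors):
    """Store the ancestor chain on the file's first object.

    ancestors is a list of (parent_ino, dentry_name) pairs, nearest
    parent first, ending at the root directory.
    """
    payload = json.dumps(ancestors).encode()
    store.setxattr(first_object_name(ino), "ancestors", payload)


def path_hint_for_ino(store, ino):
    """Rebuild a path from the stored chain, given only the ino."""
    raw = store.getxattr(first_object_name(ino), "ancestors")
    ancestors = json.loads(raw.decode())
    # The chain is stored leaf-to-root, so reverse it to read root-to-leaf.
    names = [name for _parent_ino, name in reversed(ancestors)]
    return "/" + "/".join(names)


store = FakeObjectStore()
# File 0x10000000005 is "report.txt" in directory 0x10000000002,
# which is "docs" in the root directory (ino 1).
record_ancestors(store, 0x10000000005,
                 [(0x10000000002, "report.txt"), (1, "docs")])
hint = path_hint_for_ino(store, 0x10000000005)
```

Because the setxattr only follows the create eventually, whatever comes 
back should be treated as a hint for where to start looking rather than 
an authoritative location.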

It's also nice because it means we could get rid of the anchor table (used 
for locating files with multiple hard links) entirely and use the 
ancestor xattrs instead.  That means one less thing to fsck, and avoids 
having to invest any time in making the anchor table effectively scale (it 
currently doesn't).

Anyone feel like we shouldn't go ahead and do this?

sage


end of thread, other threads:[~2012-10-22 21:28 UTC | newest]

Thread overview: 16+ messages
     [not found] <937776470.145.1350510476081.JavaMail.root@thunderbeast.private.linuxbox.com>
2012-10-17 21:51 ` parent xattrs on file objects Casey Bodley
2012-10-17 22:04   ` Gregory Farnum
2012-10-17 22:15     ` Adam C. Emerson
2012-10-19 21:17       ` Sage Weil
     [not found] <1743327214.12.1350731614461.JavaMail.root@thunderbeast.private.linuxbox.com>
2012-10-20 12:09 ` Matt W. Benjamin
2012-10-22 21:27   ` Sage Weil
     [not found] <2054435269.116.1350502651797.JavaMail.root@thunderbeast.private.linuxbox.com>
2012-10-17 19:40 ` Casey Bodley
2012-10-17 19:53   ` Sage Weil
2012-10-17 20:18   ` Gregory Farnum
2012-10-16 21:17 Sage Weil
2012-10-16 21:26 ` Gregory Farnum
2012-10-16 21:35   ` Sage Weil
2012-10-16 21:47     ` Yehuda Sadeh Weinraub
2012-10-16 21:54       ` Gregory Farnum
2012-10-16 21:32 ` Mark Nelson
2012-10-16 21:35 ` Matt W. Benjamin
