From: Ric Wheeler <rwheeler@redhat.com>
To: "Myklebust, Trond" <Trond.Myklebust@netapp.com>
Cc: Anand Avati <aavati@redhat.com>,
Dr Fields James Bruce <bfields@redhat.com>,
Christoph Anton Mitterer <calestyo@scientia.net>,
Mailing List Linux NFS <linux-nfs@vger.kernel.org>,
Dickson Steve <steved@redhat.com>
Subject: Re: XATTRs in NFS?
Date: Mon, 28 Oct 2013 21:00:03 -0400
Message-ID: <526F0893.5030700@redhat.com>
In-Reply-To: <18F0636D-7CE0-42C1-9249-325DF69516D4@netapp.com>
On 10/28/2013 08:49 PM, Myklebust, Trond wrote:
> On Oct 28, 2013, at 8:22 PM, Anand Avati <aavati@redhat.com> wrote:
>
>> On 10/28/2013 01:07 PM, Ric Wheeler wrote:
>>> On Mon, Oct 28, 2013 at 02:00:58PM -0400, Ric Wheeler wrote:
>>>> On 10/28/2013 01:49 PM, Myklebust, Trond wrote:
>>>>> On Oct 28, 2013, at 12:15 PM, Christoph Anton Mitterer <calestyo@scientia.net> wrote:
>>>>>> On Mon, 2013-10-28 at 11:40 -0400, Ric Wheeler wrote:
>>>>>>> Then you end up with large directories and an extra name per inode that needs to
>>>>>>> be stored and extra lookups for each file when you do a whole file system crawl.
>>>>>>> Certainly not as easy as adding an xattr with that information :)
>>>>>> And I think there's another reason why it wouldn't work...
>>>>>>
>>>>>> Imagine I change my system to encode what should be XATTRs in hardlink
>>>>>> pseudo files...
>>>>>>
>>>>>> If I have such pair locally e.g. on my ext4:
>>>>>> /foo/bar/actual/file
>>>>>> /meta/<SHA512 identifier>.2342348324
>>>>>>
>>>>>> And now move/copy the file via the network to the archive, I'd have to
>>>>>> copy both files (which is really annoying), and I'd guess the inode
>>>>>> coupling would get lost (and at least the name wouldn't fit anymore).
>>>>>>
>>>>>> So the whole thing is IMHO not even a workaround.
>>>>> OK. So you're going to do XATTRs for us?
>>>>>
>>>>> Trond
>>>> Now that pNFS is perfect and labeled NFS has made it upstream, I
>>>> think that Steve D must be looking for something to keep him busy :)
>>> I agree with Trond that we first really need good evidence about exactly
>>> who wants this and why.
>>>
>> Some reasons why XATTRs in NFS could be useful w/ glusterfs:
>>
>> - glusterfs exposes data locality through virtual extended attributes. One could do a getxattr("filename", "glusterfs.pathinfo") and get a parsable response about which servers store what parts and copies of the file. Such a mechanism is already used to implement Hadoop plugins for example (Hadoop plugin internally mounts gluster through FUSE where xattrs work). In some use-cases we really want to use NFS and still retain the ability to expose data locality through virtual xattrs, but lack of xattr support limits that possibility.
>>
>> - gluster implements a "Merkle tree"-like inode attribute called "xtime", which is the recursive max mtime of all files/dirs in a subtree, maintained in real time on all dirs. This is an extremely handy and powerful feature for implementing backups. This xtime is both stored as an xattr and exposed as an xattr. Users who choose to mount gluster through the NFS protocol are giving up access to this feature, which is available only through xattrs.
>>
>> - A very similar recursive function also provided by gluster is the real-time size of dir subtrees, also exposed as extended attributes. For example, instead of doing "du -hs /mnt/gluster/some/subdir", a user can do "getfattr -n glusterfs.quota.size /mnt/gluster/some/subdir" and get instantaneous results. Again, such a feature is not available to users mounting through NFS because of the lack of generic xattrs.
>>
>> - A lot of our users have asked many times for the ability to use existing NFS servers as "gluster bricks" - because they have paid a ton of money and/or have a lot of data in there and do not want to "move it out". A major roadblock for such a use case is the lack of xattr support. Gluster stores a lot of metadata in xattrs and therefore avoids having a "metadata server" (e.g., it stores details about which of the copies of a file/dir are fresh and which are stale in xattrs of that inode, it stores "hash ranges" of directories as xattrs on the directory inode, etc.). If only NFS mounts supported storing these xattrs, we could support pre-existing NFS volumes as gluster bricks.
>>
>> These are just some examples of how implementing xattrs in NFS can be useful to one project.
>>
>> It would be interesting to see how the server can control the caching behavior of such xattrs. For example, some of the (virtual) xattrs are better never cached by the client.
>>
>> Avati
> ..and here is a perfect example of exactly what is wrong with xattrs. You're describing a private syscall interface, not a data storage format.
>
> Trond
What Avati described is having an application store user-defined attributes on a
file in a standard way - pretty much every local file system does this. I don't
get the private syscall interface comment or the need to re-argue a battle that
was waged and lost effectively *years* ago :)
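
For anyone who has not used the local interface, here is a minimal sketch of
storing and reading back a user-defined attribute. It assumes Linux and
Python 3; the helper name roundtrip_user_xattr and the attribute name
"user.comment" are made up for illustration and are not anything gluster or
NFS defines. Whether it succeeds depends on the file system the path lives on.

```python
import errno
import os


def roundtrip_user_xattr(path, name, value):
    """Store `value` under xattr `name` on `path` and read it back.

    Returns the bytes read back, or None when the platform or the
    underlying file system does not allow user xattrs on this file.
    """
    if not hasattr(os, "setxattr"):
        # os.setxattr/os.getxattr are only provided on Linux.
        return None
    try:
        os.setxattr(path, name, value)
        return os.getxattr(path, name)
    except OSError as exc:
        # ENOTSUP: no user xattr support on this file system;
        # EPERM/EACCES: not permitted on this particular file.
        if exc.errno in (errno.ENOTSUP, errno.EPERM, errno.EACCES):
            return None
        raise
```

Calling roundtrip_user_xattr("/some/file", "user.comment", b"archived") on a
local file system with user xattr support should hand back the same bytes; a
None result means the attribute could not be stored on that mount.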
Ric