From: Neil Brown <neilb@suse.de>
To: Andreas Dilger <andreas.dilger@oracle.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
	Miklos Szeredi <miklos@szeredi.hu>,
	david@fromorbit.com, aneesh.kumar@linux.vnet.ibm.com,
	hch@infradead.org, viro@zeniv.linux.org.uk, adilger@sun.com,
	corbet@lwn.net, serue@us.ibm.com, hooanon05@yahoo.co.jp,
	linux-fsdevel@vger.kernel.org, sfrench@us.ibm.com,
	philippe.deniel@CEA.FR, linux-kernel@vger.kernel.org
Subject: Re: [PATCH -V14 0/11] Generic name to handle and open by handle syscalls
Date: Thu, 8 Jul 2010 15:03:31 +1000
Message-ID: <20100708150331.10c6ef52@notabene.brown>
In-Reply-To: <B5A5C6E8-0F5C-4649-8D72-D0E8989B67A8@oracle.com>

On Wed, 7 Jul 2010 18:03:36 -0600
Andreas Dilger <andreas.dilger@oracle.com> wrote:

> On 2010-07-07, at 16:21, Neil Brown wrote:
> > It doesn't matter if there is an underlying block device, or if it is shared
> > among subvolumes.  st_dev is *the* primary key for filesystems.  Every "struct super_block" has a unique s_dev and that is returned in st_dev.
> > 
> > For "traditional" filesystems, this is the major/minor number of the block
> > device.
> > For NFS and btrfs and other filesystems which don't have exclusive use of a
> > block device, 'set_anon_super' is used to get a unique s_dev based on a major
> > number of '0'.
> 
> But the major/minor number returned is essentially random between different clients, so there is no way to use it on another node that is accessing the same filesystem.  Conversely, the UUID will be the same on all of the clients.
> 
> > So you can *always* use st_dev as an identifier for the filesystem which is
> > stable and unique as long as you hold an active reference to the filesystem
> > (open file descriptor, cwd in fs, etc).
> 
> Only on a single system.

Well, the system call only runs on a single system.
If you want a cluster-unique name, get the cluster software to generate it
or enforce it.
Performing a mapping is not hard.
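
As a rough illustration of such a mapping, here is a minimal sketch that
builds a device-number -> UUID table from the udev symlinks. It assumes
/dev/disk/by-uuid exists and only covers filesystems that own a block
device; anonymous devices (NFS, btrfs subvolumes) will not appear in it.
The names build_dev_to_uuid and uuid_for_path are made up for this example.

```python
import os

BY_UUID_DIR = "/dev/disk/by-uuid"  # udev-maintained symlinks; assumed present


def build_dev_to_uuid(by_uuid_dir=BY_UUID_DIR):
    """Map block-device numbers (as seen in st_dev) to filesystem UUIDs."""
    mapping = {}
    try:
        names = os.listdir(by_uuid_dir)
    except OSError:
        return mapping  # no udev links on this system
    for uuid in names:
        try:
            # os.stat follows the symlink to the device node itself.
            st = os.stat(os.path.join(by_uuid_dir, uuid))
        except OSError:
            continue  # device vanished between listdir and stat
        # st_rdev of the device node equals st_dev of files on that device.
        mapping[st.st_rdev] = uuid
    return mapping


def uuid_for_path(path, dev_to_uuid):
    """Return the UUID for the filesystem holding path, or None if unknown."""
    return dev_to_uuid.get(os.stat(path).st_dev)
```

Cluster software could build this table on each node and hand out the UUID
(or any other cluster-wide name) while still using st_dev locally.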

> 
> > If you poll(2) /proc/mounts to get notifications of changes to the mount
> > table, then it should be quite easy to cache st_dev -> uuid mappings in a
> > race-free way.
> 
> This sounds unpleasant for any application to implement.  It might be OK for a user-space NFS/CIFS server, but it is complex and error-prone for any normal usage, and doesn't seem like a good API design to me.

Define "normal usage" for filehandle-based lookup ???

Identifying a filesystem by st_dev is completely reliable.  That is a good
start for API design.
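
For what it is worth, the poll(2)-on-/proc/mounts scheme quoted above could
look something like the following. This is only a sketch: it caches
st_dev -> mount point rather than st_dev -> uuid, the helper names are
invented for the example, and it relies on the kernel flagging mount-table
changes as POLLERR|POLLPRI on an open /proc/mounts (see proc(5)). When
several mounts share one st_dev (bind mounts), the last one listed wins.

```python
import os
import re
import select


def parse_mount_points(text):
    """Extract mount points from /proc/mounts-style text, decoding \\ooo escapes."""
    points = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Spaces and similar characters are stored as octal escapes like \040.
        points.append(re.sub(r"\\([0-7]{3})",
                             lambda m: chr(int(m.group(1), 8)), fields[1]))
    return points


def snapshot(mount_points):
    """Map st_dev -> mount point for every mount point we can stat."""
    table = {}
    for mp in mount_points:
        try:
            table[os.stat(mp).st_dev] = mp
        except OSError:
            pass  # unmounted or inaccessible since the table was read
    return table


def watch_mounts(on_change, path="/proc/mounts"):
    """Call on_change(table) once at start and after every mount-table change."""
    with open(path) as f:
        poller = select.poll()
        poller.register(f.fileno(), select.POLLERR | select.POLLPRI)
        while True:
            f.seek(0)
            on_change(snapshot(parse_mount_points(f.read())))
            poller.poll()  # blocks until the kernel reports a change
```

Because the table is re-read after every wakeup, a change that lands between
the read and the next poll() still triggers another pass.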

Identifying by UUID is not unless uniqueness is enforced, and ....

> 
> > There might be value in getting name_to_handle to return the st_dev of the
> > target file to ensure that you haven't unexpectedly crossed into a different
> > filesystem.  I would prefer that to returning a uuid:  st_dev is guaranteed
> > to be unique, a uuid is only supposed to be unique (i.e. that is not
> > enforced).
> 
> UUID duplication (w.r.t. multiple mounts of the same underlying device) doesn't matter at all for regular file opens, where the only interest is getting a handle for the inode.  I wouldn't be against requiring the UUID be unique if that was needed, or failing regular opens in the rare case that there is a non-unique UUID pointing to different devices, or failing directory opens for the case of multiple mountpoints.

It has already been said that requiring uuids to be unique breaks current
practice (involving mounting dm snapshots of active filesystems).
Failing legitimate syscalls in rare circumstances sounds like bad API design
to me.
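
To make the snapshot case concrete, here is a small sketch that takes
(st_dev, uuid) pairs, however they were collected, and reports any UUID that
shows up on more than one st_dev. The function name and the sample values
are hypothetical; the point is only that st_dev still tells the two
filesystems apart when the UUID does not.

```python
from collections import defaultdict


def duplicate_uuids(dev_uuid_pairs):
    """Return {uuid: sorted st_dev list} for UUIDs seen on more than one st_dev."""
    by_uuid = defaultdict(set)
    for st_dev, uuid in dev_uuid_pairs:
        by_uuid[uuid].add(st_dev)
    return {u: sorted(devs) for u, devs in by_uuid.items() if len(devs) > 1}


# Hypothetical data: an origin volume and its dm snapshot share UUID "aaaa".
pairs = [(0x801, "aaaa"), (0xfd00, "aaaa"), (0x802, "bbbb")]
print(duplicate_uuids(pairs))  # {'aaaa': [2049, 64768]}
```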

NeilBrown


> 
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Technical Lead
> Oracle Corporation Canada Inc.
