All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com>
To: Andreas Dilger <adilger@sun.com>
Cc: hch@infradead.org, viro@zeniv.linux.org.uk,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH 2/3] vfs: Add open by file handle support
Date: Mon, 22 Feb 2010 11:43:43 +0530	[thread overview]
Message-ID: <87tyt9n6g8.fsf@linux.vnet.ibm.com> (raw)
In-Reply-To: <AD5A6061-9B7B-4B29-A458-822E781C2588@sun.com>

On Sat, 20 Feb 2010 11:58:34 -0700, Andreas Dilger <adilger@sun.com> wrote:
> On 2010-02-18, at 22:42, Aneesh Kumar K.V wrote:
> > +static struct dentry *handle_to_dentry(struct path *path,
> > +				struct file_handle *fh)
> > +{
> > +	inode = path->dentry->d_inode;
> > +	if (!S_ISREG(inode->i_mode) &&
> > +		!S_ISDIR(inode->i_mode) &&
> > +		!S_ISLNK(inode->i_mode)) {
> 
> Please follow normal indenting rules, like:
> 
> 	if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode) &&
> 	    !S_ISLNK(inode->i_mode)) {
> 
> > +	/* should we do some check on handle size ?*/
> > +	handle = kmalloc(handle_size, GFP_KERNEL);
> 
> Good question.  kmalloc() allows up to 32MB allocations, so a user  
> running 1024 threads calling this with bad handles could OOM most  
> systems today due to unswappable kernel allocations.  Should there be  
> an upper limit on the handle size, like 4096 bytes or so?  Should some  
> filesystem legitimately need a larger size the kernel can always be  
> changed without problems since the size isn't part of the ABI (it  
> should be an internal sanity check), but I can't imagine what needs to  
> be put into a file handle that would exceed this size.

At this point i will limit it to 4096 bytes.


> 
> > +	dentry = exportfs_decode_fh(path->mnt, (struct fid *)handle,
> > +					handle_size, fh->handle_type,
> > +					vfs_dentry_acceptable, NULL);
> 
> One serious problem I can see with this is that the handle only  
> encodes the ino+generation of the inode, and nothing about the device  
> itself.  NFS handles this by using the fsid to identify the  
> filesystem, but it would be highly confusing to applications if a  
> process with a different CWD gets a different file or an error trying  
> to open the file handle, or if a new filesystem gets mounted and the  
> file handle stops working.  Having the file handle able to positively  
> identify a single inode in the system is the most important property  
> it has, I think.
> 
> Probably one of the important use cases of open_by_handle() is to have  
> one process doing pathname resolution and slave threads doing work on  
> those handles.  Usually daemon threads change their working directory  
> to "/" to avoid pinning some directory.  Even without that, it would  
> be impossible for a process to create handles in 2 different  
> filesystems, which is totally broken:
> 
>           (CWD is /home/adilger)
>           name_to_handle("/usr/src/linux/COPYING", &fh);
> 
>           fd = open_by_handle(&fh, O_RDONLY);
>           (fd = -1; errno = ENOENT or EBADF or whatever)
> 

do_sys_open_by_handle actually takes an fd. I was planning to do an openat
style syscall also. That should handle the above problem. In case of the
new syscall dirfd will be a file descriptor for a file in the
particular filesystem.


> Without the handle identifying the filesystem in some way this is  
> mostly useless.  Encoding the device number would be a simple (though  
> non-robust) way to do this, a better way would be with the filesystem  
> UUID and adding an in-kernel UUID->sb mapping function (which is  
> useful for other things as well).
> 

But i understand this is an good requirement. I will see what best I
can do.

> > +long do_sys_open_by_handle(int dfd, struct file_handle *fh, int  
> > flags)
> > +{
> > +	if (!capable(CAP_SYS_ADMIN))
> > +		/* Allow open by handle only by sysadmin */
> > +		return -EPERM;
> 
> Hmm, I guess this avoids some of the security concerns, but makes it a  
> lot less useful.  I was thinking this could be used for e.g. user NFS  
> serving or such, but if it is limited to root only then you might as  
> well just set up the in-kernel NFSd.  By making the handle hard to  
> forge (e.g. generate random key per superblock, sha1(ino+gen+key) and  
> store that into fh; someone with more security experience can think of  
> a better scheme) then you can reasonably safely dispense with the  
> CAP_SYS_ADMIN check because you can be sure that the proper path  
> traversal has been done by a trusted process and there is no more  
> exposure than unix socket fd passing.

If we make the handle depend on the random key, how do we make sure
that after a server crash when get the same inode with the same handle.
That would be a requirement for an NFS server right? We can definitely
encode device details or UUID details the same way nfs server does in
fh_compose. But i guess making it random. 

What are the issues of being able to guess the file handle ? May be
there are other ways to handle those.


> 
> > +	if ((!(flags & O_APPEND) || (flags & O_TRUNC)) &&
> > +		(flags & FMODE_WRITE) && IS_APPEND(inode)) {
> 
> Please use normal indenting style, aligning on the parenthesis.


-aneesh

  parent reply	other threads:[~2010-02-22  6:13 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-02-19  5:42 [RFC PATCH] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
2010-02-19  5:42 ` [RFC PATCH 1/3] vfs: Add name to file handle conversion support Aneesh Kumar K.V
2010-02-20 18:15   ` Andreas Dilger
2010-02-22  5:15     ` Aneesh Kumar K. V
2010-02-19  5:42 ` [RFC PATCH 2/3] vfs: Add open by file handle support Aneesh Kumar K.V
2010-02-20 18:58   ` Andreas Dilger
2010-02-20 20:13     ` Brad Boyer
     [not found]       ` <FB88A140-C2EB-4E62-9769-D2524C874C8C@sun.com>
2010-02-22  2:46         ` Brad Boyer
2010-02-26 19:21         ` J. Bruce Fields
2010-02-28 17:55           ` Andreas Dilger
2010-02-28 19:00             ` J. Bruce Fields
2010-03-01 18:25               ` Oleg Drokin
2010-03-01 21:25                 ` J. Bruce Fields
2010-02-22  6:13     ` Aneesh Kumar K. V [this message]
2010-02-22  6:31       ` Dave Chinner
2010-02-26 19:24     ` J. Bruce Fields
2010-02-19  5:42 ` [RFC PATCH 3/3] x86: Add new syscalls for x86_32 Aneesh Kumar K.V
2010-02-19  9:34 ` [RFC PATCH] Generic name to handle and open by handle syscalls Andreas Dilger
2010-02-19  9:49   ` Aneesh Kumar K. V
2010-02-20 19:01     ` Andreas Dilger
2010-02-22  6:27       ` Aneesh Kumar K. V
2010-02-22 23:06 ` Jonathan Corbet
2010-02-23  0:56   ` James Morris
2010-02-23  8:58   ` Aneesh Kumar K. V
2010-02-23 19:46     ` Jonathan Corbet
2010-02-24  0:49     ` Dave Chinner
2010-02-25  4:53     ` Serge E. Hallyn
2010-02-25 14:30       ` Jonathan Corbet
2010-02-25 15:19         ` Serge E. Hallyn
2010-02-25 17:55           ` Aneesh Kumar K. V
2010-02-25 18:11             ` Serge E. Hallyn
2010-02-25 18:20               ` Aneesh Kumar K. V
2010-02-25 19:05                 ` Serge E. Hallyn
2010-02-26  9:12                   ` Andreas Dilger
2010-02-26 19:56                     ` Serge E. Hallyn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tyt9n6g8.fsf@linux.vnet.ibm.com \
    --to=aneesh.kumar@linux.vnet.ibm.com \
    --cc=adilger@sun.com \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.