linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@poochiereds.net>
To: Oleg Drokin <green@linuxhacker.ru>,
	Trond Myklebust <trond.myklebust@primarydata.com>,
	Al Viro <viro@ZenIV.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: Re: [RFC] [PATCH 0/2] mkdir lookup optimization
Date: Thu, 07 Jul 2016 06:53:04 -0400	[thread overview]
Message-ID: <1467888784.3195.20.camel@poochiereds.net> (raw)
In-Reply-To: <1467870827-2959489-1-git-send-email-green@linuxhacker.ru>

On Thu, 2016-07-07 at 01:53 -0400, Oleg Drokin wrote:
> (sorry for resend, the first go around did not make it to fsdevel and to Al).
> 
> This is inspired by a bug in Lustre that's ATM is shared by NFS
> and used o be shared by CIFS code.
> 
> The problem at hand is: when you try to mkdir in a directory
> where you do not have permissions to create anything, you only
> supposed to get EPERM if the directory you are creatign does not exist.
> Now if the name does exist, you are supposed to get EEXIST instead.
> There are tons of programs that when fed a pathname go and try
> to perform a create of every path component starting from /,
> and ignoring EEXIST, but not other errors. Those programs are broken
> by the above mentioned bug.
> 
> All is fine everywhere by Lustre and NFS at the moment, because
> there's an optimization at hand. e.g. in NFS:
>        /*
>         * If we're doing an exclusive create, optimize away the lookup
>         * but don't hash the dentry.
>         */
>        if (nfs_is_exclusive_create(dir, flags))
>                return NULL;
> 
> Now, this is all fine except when you have no permissions to create
> anything - then vfs_mknod/mkdir/create will do may_create(dir, dentry)
> and we exit spuriously with EPERM.
> 
> [green@fedora1 crash]$ mkdir aaa 
> mkdir: cannot create directory 'aaa': Permission denied
> [green@fedora1 crash]$ mkdir lost+found
> mkdir: cannot create directory 'lost+found': Permission denied
> [green@fedora1 crash]$ ls -ld lost+found
> drwx------ 2 root root 16384 May 25  2013 lost+found
> [green@fedora1 crash]$ mkdir lost+found
> mkdir: cannot create directory 'lost+found': File exists
> 
> cifs had exactly the same code, but it got removed when atomic_open
> was introduced (throwing away a perfectly good optimization for mkdir
> in process) with commit d2c127197dfc0b2bae62a52e1e0d3e3ff493919e
> "cifs: implement i_op->atomic_open()"
> 
> These two patches are the lazy way of fixing the problem -
> "just throw in the extra permission check before bailing out"
> with a bit of complication on the NFS side because there
> the inode permission check is actually circumvented in nfs_permission,
> for MAY_WRITE | !MAY_READ case which is enough to fool
> may_create, but not enough to fool some following check, I guess
> as the problem still exists.
> (I am not sure of the performance implications of just removing that
> thing in nfs_permission).
> 
> Anyway I think instead of resurrecting this optimization for cifs,
> and seeing if ceph and others need it, why not bring it up
> all the way to __lookup_hash() so that we don't do actual lookup
> if the parent is writeable?
> 
> Even for local filesystems like ext4 that's of benefit - we save
> one lookup (even with hashed dirs, that only gives us the last blook
> to lookat and then we still need to check all names to make sure
> the one we want does not exist - so it's not exactly free).
> 
> This should not upset any sort of client-side SELinux/other security
> stuff magic either. If the name exists, we get EEXIST no matter what,
> if it does not exist, parent policy declares if we can create or not
> anyway.
> 
> Something like this (+ whatever nfs_permission fix):
> diff --git a/fs/namei.c b/fs/namei.c
> index 70580ab..b9de645 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1512,6 +1512,10 @@ static struct dentry *__lookup_hash(const struct qstr *name,
> 	if (unlikely(!dentry))
> 		return ERR_PTR(-ENOMEM);
> 
> +	if ((flags & LOOKUP_EXCL|LOOKUP_CREATE) &&
> +	    (may_create(base, dentry) == 0))
> +		return dentry;
> +

That would need to check that LOOKUP_EXCL is actually set. I think you
want something like:

   (flags & (LOOKUP_EXCL|LOOKUP_CREATE)) == (LOOKUP_EXCL|LOOKUP_CREATE)

...and you'd have to figure out how to determine the isdir param for
may_create at that point.

That said, it does seem like a reasonable idea at first glance.

> 	return lookup_real(base->d_inode, dentry, flags);
> }
> 
> Comments?
> 
> Oleg Drokin (2):
>   nfs: Fix spurios EPERM when mkdir of existing dentry
>   staging/lustre: Prevent spurious EPERM on mkdir
> 
>  drivers/staging/lustre/lustre/llite/namei.c | 8 ++++++--
>  fs/nfs/dir.c                                | 4 +++-
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 

-- 
Jeff Layton <jlayton@poochiereds.net>

      parent reply	other threads:[~2016-07-07 10:53 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-07  5:53 [RFC] [PATCH 0/2] mkdir lookup optimization Oleg Drokin
2016-07-07  5:53 ` [PATCH 1/2] nfs: Fix spurios EPERM when mkdir of existing dentry Oleg Drokin
2016-07-07 16:16   ` Trond Myklebust
2016-07-07 16:52     ` Oleg Drokin
2016-07-07 16:59       ` Trond Myklebust
2016-07-07 17:07         ` Oleg Drokin
2016-07-07 17:27           ` Trond Myklebust
2016-07-07 17:32             ` Oleg Drokin
2016-07-07 21:52             ` Oleg Drokin
2016-07-07 23:17               ` Trond Myklebust
2016-07-07  5:53 ` [PATCH 2/2] staging/lustre: Prevent spurious EPERM on mkdir Oleg Drokin
2016-07-07 10:53 ` Jeff Layton [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1467888784.3195.20.camel@poochiereds.net \
    --to=jlayton@poochiereds.net \
    --cc=green@linuxhacker.ru \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trond.myklebust@primarydata.com \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).