All of lore.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
	Gao feng <gaofeng@cn.fujitsu.com>,
	Containers <containers@lists.linux-foundation.org>,
	linux-fsdevel@vger.kernel.org,
	Aditya Kali <adityakali@google.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Eric Dumazet <edumazet@google.com>, Neil Brown <neilb@suse.de>
Subject: Re: [REVIEW][PATCH 1/3] vfs: In d_path don't call d_dname on a mount point
Date: Thu, 16 Jan 2014 19:29:16 -0800	[thread overview]
Message-ID: <874n53gub7.fsf@xmission.com> (raw)
In-Reply-To: 20131130224340.GA10323@ZenIV.linux.org.uk

Al Viro <viro@ZenIV.linux.org.uk> writes:

> On Sat, Nov 30, 2013 at 01:51:07PM -0800, Eric W. Biederman wrote:
>> Al Viro <viro@ZenIV.linux.org.uk> writes:
>> 
>> > Eric, what behaviour do you want there?
>> 
>> Here is the behavior I want.
>> 
>> a) Something reasonable in /proc/self/fd when I just open one of these.
>> 
>> root@unassigned:~# exec 5< /proc/self/ns/net
>> root@unassigned:~# ls -l /proc/self/fd
>> total 0
>> lrwx------ 1 root root 64 Nov 30 13:12 0 -> /dev/ttyS0
>> lrwx------ 1 root root 64 Nov 30 13:12 1 -> /dev/ttyS0
>> lrwx------ 1 root root 64 Nov 30 13:12 2 -> /dev/ttyS0
>> lr-x------ 1 root root 64 Nov 30 13:12 3 -> /proc/718/fd
>> lr-x------ 1 root root 64 Nov 30 13:12 5 -> net:[4026531955]
>> root@unassigned:~# ip netns add foo
>> 
>> b) Something reasonable in /proc/mounts when I bind mount one of these.
>> 
>> root@unassigned:~# ip netns add foo
>> root@unassigned:~# cat /proc/mounts 
>> [...]
>> proc /var/run/netns/foo proc rw,nosuid,nodev,noexec,relatime 0 0
>
> Sigh...  What do you want to see in /proc/self/mountinfo?

With or without my patch I see what I am after right now is:
31 12 0:3 / /var/run/netns/foo rw,nosuid,nodev,noexec,relatime - proc proc rw

Apologies for not being clear on that point.

However it would be nice to see:
28 13 0:3 net:[4026532127] /var/run/netns/foo rw,nosuid,nodev,noexec,relatime - proc proc rw

Not having the path be the magical floating path is not causing
regressions for people so I don't care much at the moment.

>> Any dentry allocated with d_alloc_pseudo will have the same problem if
>> it is ever in a context where it can be bind mounted.  So this is a
>> general issue with d_name.
>
> Yes, ->d_dname() is called in a wrong place.  It should be in prepend_path
> if not deeper.  What you are doing is growing a kludge on top of that
> mistake.

I disagree. I am not growing a kludge on of a mistake.  I am refining
the logic at the call site of d_dname.  The logic to not call d_dname
on a mountpoint should be there wherever we call d_dname.

Which makes my change completely orthogonal to moving the call of
d_dname into the strangeness that is prepend_path.

> Note that your patch does nothing for mountinfo.

And it doesn't need to do anything for mountinfo.  My desire and the
only sensible semantics I can think of is for non-mountpoints to have
d_dname called on them.  By definition /proc/mountinfo only shows mount
points, so there is never a neeed to call d_dpath is /proc/mountinfo.


Looking at what is going on in detail, the implementations of the
.d_dname method are:

arch/ia64/kernel/perfmon.c: pfmfs_dname
    ia64 perfmon files are unlinked character devices whose dentries are
    allocated with d_alloc, and the filesystem is mounted with kern_mount.
fs/aio.c:simple_dname
    aio private files that are non-directories whose dentries are 
    allocated with d_alloc_pseudo, and the filesystem is mounted with kern_mount.
fs/anon_inodes.c:anon_inodefs_dname
    anonymous inodes that are non-directories whose dentries are
    allocated with d_alloc_pseudo, and the filesystem is mounted with
    kern_mount.    
fs/pipe.c:pipefs_dname
    pipes that are non-directories whose dentries are allocated with
    d_alloc_psuedo, and the fs is mounted with kern_mount.
net/socket.c:sockfs_dname
    sockets are non-directories whose dentries are allocated with
    d_alloc_pseudo, and the filesystem is mounted with kern_mount.
mm/shmem.c:simple_dname
    shared memory segments are non-directories whose dentries are
    allocated with d_alloc_pseudo, and the filesystem is mounted with
    kern_mount.
fs/hugetlbfs/inode.c:simple_dname
    hugetlb shared memory segments are non-directories whose dentries
    are allocted with d_alloc_pseudo, and the filesystem is mounted with
    kern_mount_data.
fs/proc/namespaces.c:ns_dname
    namespace files are non-directories whose dentries are allocated
    with d_alloc_pseudo, and the filesystem is mounted normally.


In the worst case prepend_path is safe to use on the dentries that
implement d_dname as their dentry name field has valid data.

The callers of prepend_path that are not d_path are:
__d_path
d_absolute_path
getcwd



getcwd need not be worried about because get_fs_root_and_pwd will always
return directories.  And none of the implementations of d_dname are
directories.



__d_path is has only two callers: fs/seq_file.c:seq_path_root and
security/apparmor/path.c:d_namespace_path.

fs/seq_file.c: seq_path_root which is only used in
fs/proc_namespaces.c:show_mountinfo aka /proc/mountinfo.  Since this is
only dealing with mounted paths there will never be a need to call
d_dname.

security/apparmor/path.c:d_namespace_path in apparmor is almost
interesting, it is called on files that come from exec (brpm->file),
link, rename, open, unlink, create, truncate, chown, chmod, and getattr.
Which with clever usage of magic proc symlinks could conceivably point
to a file descriptor that implements d_dname.  However d_namespace_path
tests the mountpoint for MNT_INTERNAL and calls dentry_path if that is
the case, avoiding __d_path or d_absolute_path, except for the proc
namespace files.  In either case today if mounted a sensible result and
if not mounted an empty path.

The only other caller of d_absolute_path is
security/tomoyo/realpath.c:tomoyo_get_absolute_path.  However
tomyo_get_absolute_path is only called from tomoyo_realpath_from_path
which checks for dentries that implement d_dname and calls them
explicitly, only calling d_absolute_name if there is no such
implementation.  



Which is a long way of saying that right now moving the call of d_dname
into prepend_path would make no practical difference.   

At a practical level there are improvements to be had by calling d_dname
in dentry_path and by adding my avoidance of calling d_dname on a
mountpoint into tomoyo_realpath_from_path.

So while I see what you mean by prepend_path needing cleaning up that is
really orthogonal to the acidental regression that I am fixing.

> As for redesign...  That's exactly what is needed in that area, instead of
> letting it fester.

I can't backport a redesign to fix the regression, and a redesign
results in no practical benefit for me.  So I might be persuaded later
if I can find the time but right now a redesign is the wrong place to
start.

> As for check_mnt(old) in do_loopback()...  I wonder if we shouldn't just
> turn that puppy into (check_mnt(old) || (old->mnt.mnt_flags & MNT_INTERNAL)).
> Note that MS_NOUSER in superblock flags would prevent binding pipefs and
> all mount_pseudo() users, so we wouldn't change existing behaviour...

Strictly speaking behavior would change for the proc namespace files, as
now you could do things by finding a mount of proc in another mount
namespace where the namespace files were available.

I would be a lot more comfortable with changing do_loopback if we could
safely remove the check_mnt(old) test altogether.  Why do we have the
check_mnt(old) test in do_loopback?   If we can't remove the
check_mnt(old) test I need verify that the reasons for that test don't
apply to my namespace file descriptor case.  Otherwise I can't see any
real problems with making that change (time permitting).



After getting this regression fix merged, the windmill that I expect I
will be tilting at are fixing the user space visible races that let a
dentry be renamed while we are mounting something on it, and the messy
fact that check_submounts_and_drop isn't part of d_invalidate.  Both of
those can potentially result in user space visible insantiy.

Eric


  parent reply	other threads:[~2014-01-17  3:29 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-15 16:41 Regression wrt mounting /proc in user namespace in 3.13 Daniel P. Berrange
     [not found] ` <20131115164123.GN28794-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-11-16 16:48   ` Serge E. Hallyn
     [not found]     ` <20131116164840.GA4441-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2013-11-17  3:06       ` Serge E. Hallyn
     [not found]         ` <20131117030653.GA7670-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2013-11-18  3:19           ` Serge E. Hallyn
     [not found]             ` <20131118031932.GA17621-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2013-11-18  4:52               ` Gao feng
     [not found]                 ` <52899D09.5080202-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2013-11-18 14:08                   ` Serge E. Hallyn
     [not found]                     ` <20131118140830.GA22075-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2013-11-18 18:01                       ` Serge E. Hallyn
     [not found]                         ` <20131118180134.GA24156-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2013-11-19  1:51                           ` Eric W. Biederman
     [not found]                             ` <87k3g5gnuv.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-11-19  3:47                               ` Serge E. Hallyn
2013-11-26 18:10                               ` Serge E. Hallyn
     [not found]                                 ` <20131126181043.GA25492-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2013-11-27  0:14                                   ` [REVIEW][PATCH 0/3] userns fixes for v3.13-rc1 Eric W. Biederman
     [not found]                                     ` <87siui1z1g.fsf_-_-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-11-27  0:16                                       ` [REVIEW][PATCH 1/3] vfs: In d_path don't call d_dname on a mount point Eric W. Biederman
2013-11-27  1:58                                         ` Serge E. Hallyn
     [not found]                                         ` <8738mi1yya.fsf_-_-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-11-27  1:58                                           ` Serge E. Hallyn
2013-11-30  6:15                                           ` Al Viro
     [not found]                                             ` <20131130061525.GY10323-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2013-11-30 17:02                                               ` Al Viro
     [not found]                                                 ` <20131130170226.GZ10323-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2013-11-30 21:51                                                   ` Eric W. Biederman
     [not found]                                                     ` <87a9glh838.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-11-30 22:43                                                       ` Al Viro
     [not found]                                                         ` <20131130224340.GA10323-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2013-12-02  7:29                                                           ` Al Viro
2014-01-17  3:29                                                         ` Eric W. Biederman [this message]
     [not found]                                                           ` <874n53gub7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2014-01-17  8:39                                                             ` Al Viro
     [not found]                                                               ` <20140117083901.GA10323-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2014-02-07  2:21                                                                 ` [PATCH 0/4] d_dname cleanups Eric W. Biederman
     [not found]                                                                   ` <87iosrhdc0.fsf_-_-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2014-02-07  2:23                                                                     ` [PATCH 1/4] perfmon: Use d_alloc_pseudo like all of the d_dname callers Eric W. Biederman
2014-02-07  2:23                                                                     ` [PATCH 2/4] vfs: Simply when d_alloc_dname is called Eric W. Biederman
2014-02-07  2:24                                                                     ` [PATCH 3/4] vfs: Move the call of d_op->d_dname from d_path to prepend_path Eric W. Biederman
2014-02-07  2:24                                                                     ` [PATCH 4/4] vfs: Call d_dname from dentry_path Eric W. Biederman
2014-01-17  3:29                                                         ` [REVIEW][PATCH 1/3] vfs: In d_path don't call d_dname on a mount point Eric W. Biederman
2013-12-01  5:09                                                   ` Al Viro
     [not found]                                                     ` <20131201050930.GB10323-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2013-12-01  6:15                                                       ` [REVIEW][PATCH 1/3] vfs: In d_path don't call d_dname on a mountpoint Tetsuo Handa
2013-12-01  6:15                                                     ` Tetsuo Handa
2013-12-02  5:43                                                   ` [REVIEW][PATCH 1/3] vfs: In d_path don't call d_dname on a mount point NeilBrown
2013-12-02  5:43                                                     ` NeilBrown
     [not found]                                                     ` <20131202164359.4f4f2c94-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2013-12-02 16:23                                                       ` J.Bruce Fields
2013-12-02 16:23                                                     ` J.Bruce Fields
2013-12-02 16:23                                                       ` J.Bruce Fields
2013-11-27  0:16                                       ` [REVIEW][PATCH 2/3] fork: Allow CLONE_PARENT after setns(CLONE_NEWPID) Eric W. Biederman
2013-11-27  0:16                                         ` Eric W. Biederman
     [not found]                                         ` <87vbzezojq.fsf_-_-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-11-27  1:58                                           ` Serge E. Hallyn
2013-11-27  0:17                                       ` [REVIEW][PATCH 3/3] vfs: Fix a regression in mounting proc Eric W. Biederman
     [not found]                                         ` <87pppmzoin.fsf_-_-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-11-27  0:21                                           ` Andy Lutomirski
     [not found]                                             ` <CALCETrVp78EfzY3Oa-LV1Hm8A4Y35apehcxrxdyrzvTb5sp=pA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-11-27  0:36                                               ` Eric W. Biederman
2013-11-27  2:00                                           ` Serge E. Hallyn
2013-11-27  3:19                                           ` Gao feng
     [not found]                                             ` <529564AA.8050100-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2013-11-27  5:00                                               ` Eric W. Biederman
2013-11-27  5:00                                             ` Eric W. Biederman
2013-11-27 16:13                                           ` Oleg Nesterov
2013-11-27 16:13                                         ` Oleg Nesterov
     [not found]                                           ` <20131127161300.GA24773-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-11-27 16:29                                             ` Serge E. Hallyn
     [not found]                                               ` <20131127162928.GB7358-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2013-11-27 18:09                                                 ` Oleg Nesterov
2013-11-27 16:41                                             ` Andy Lutomirski
     [not found]                                               ` <CALCETrXFnw63=JoEaQxM+Opj+kCXSL=9XppymzGKhLzOnp3WaQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-11-27 18:10                                                 ` Oleg Nesterov
2013-11-27 18:51                                             ` Eric W. Biederman
2013-11-27 19:47                                               ` Oleg Nesterov
2013-11-27 19:52                                                 ` Eric W. Biederman
     [not found]                                                   ` <87iovdmxl7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-11-27 20:01                                                     ` Oleg Nesterov
2013-11-27 20:07                                                     ` Eric W. Biederman
2013-11-27 20:41                                                       ` Andy Lutomirski
     [not found]                                                         ` <CALCETrUwjK7iLMMJaCvKUbBwEqV58oXY4dWzTGJohYgg4DwjWQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-11-29 14:56                                                           ` Serge E. Hallyn
     [not found]                                                       ` <87wqjtlic3.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-11-27 20:41                                                         ` Andy Lutomirski
2013-11-29 19:53                                                         ` Oleg Nesterov
2013-12-13 22:07                                                           ` Richard Weinberger
     [not found]                                                           ` <20131129195327.GA12974-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-12-13 22:07                                                             ` Richard Weinberger
     [not found]                                                 ` <20131127194722.GA32673-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-11-27 19:52                                                   ` Eric W. Biederman
     [not found]                                               ` <871u21oeyr.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-11-27 19:47                                                 ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=874n53gub7.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=adityakali@google.com \
    --cc=containers@lists.linux-foundation.org \
    --cc=edumazet@google.com \
    --cc=gaofeng@cn.fujitsu.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=neilb@suse.de \
    --cc=oleg@redhat.com \
    --cc=serge@hallyn.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.