Re: Inconsistency when mounting a directory that 'world' cannot access.

From: Steve Dickson <SteveD@redhat.com>
To: NeilBrown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
	"Myklebust, Trond" <Trond.Myklebust@netapp.com>,
	NFS <linux-nfs@vger.kernel.org>
Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.
Date: Mon, 08 Oct 2012 07:42:34 -0400
Message-ID: <5072BC2A.1060100@RedHat.com>
In-Reply-To: <20121008170304.37dc6ae9@notabene.brown>



On 08/10/12 02:03, NeilBrown wrote:
> On Thu, 4 Oct 2012 12:07:39 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
> 
>> On Thu, Oct 04, 2012 at 08:46:59AM +1000, NeilBrown wrote:
>>> On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <bfields@fieldses.org>
>>> wrote:
>>>
>>>> On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
>>>>> On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
>>>>>> On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
>>>>>>> On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I guess you're right.  So it starts to sound more like: "you have a
>>>>>>>> confusing setup.  Your export configuration says one thing, and your
>>>>>>>> filesystem permissions say another.  Under NFSv3 the confusion didn't
>>>>>>>> matter, but now it does--time to fix it."
>>>>>>>>
>>>>>>>
>>>>>>> That's the best I could come to - I'm glad to have it confirmed.  Thanks!
>>>>>>>
>>>>>>> It is unfortunate that Linux NFS uses an anon credential to mount when krb5
>>>>>>> is in use, and uses 'root' when auth_sys is used (which might be anon if
>>>>>>> "root_squash" is active, but might not).
>>>>>>> I wonder if it would work to use auth_none for the mount-time lookup, just
>>>>>>> for consistency..
>>>>>>>
>>>>>>> Is the following appropriate?  Is there somewhere better to put this caveat?
>>>>>>
>>>>>> Unfortunately, it's more complicated than this, as it depends on client
>>>>>> implementation and configuration details.
>>>>>>
>>>>>> Something like this would be more accurate but possibly too long:
>>>>>>
>>>>>> 	Note that under NFSv2 and NFSv3, the mount path is traversed by
>>>>>> 	mountd acting as root, but under NFSv4 the mount path is looked
>>>>>> 	up using the client's credentials.  This means that, for
>>>>>> 	example, if a client mounts using a krb5 credential that the
>>>>>> 	server maps to an "anonymous" user, then the mount will only
>>>>>> 	succeed if that directory and all its parents allow eXecute
>>>>>> 	permissions.
>>>>>
>>>>> So you're listing this as a "feature" rather than a bug? There should be
>>>>> no reason to constrain the pseudofs to use the permission checks from
>>>>> the underlying filesystem.
>>>>
>>>> I'd be fine with that.
>>>>
>>>> (That still leaves some subtle v3/v4 difference in the case of mount
>>>> paths underneath an export?
>>>>
>>>> What *is* the existing mountd behavior there, exactly?  I'm inclined to
>>>> think allowing mounts of arbitrary subdirectories is a bug, but maybe
>>>> there's some historical reason for it or maybe someone already depends
>>>> on it.)
>>>>
>>>> --b.
>>>
>>> The behaviour is simply that you mount a filehandle (typically belonging to a
>>> directory) and that filehandle can be anything inside any exported filesystem.
>>
>> It's not the nfsd behavior that bothers me--there's nothing we can do
>> about the fact that access by filehandle can bypass directory
>> permissions.
>>
>> What bothers me is that mountd will apparently allow anyone to do a lookup
>> anywhere in an exported filesystem.
> 
> Not anyone - it requires a privileged source port from a known host.
> So it is only "anyone who can get 'root'".
> 
>>
>> I don't know--maybe I shouldn't be so concerned about the possibility a
>> rogue user could figure out that my "Music" directory includes an
>> unreasonable number of Miles Davis titles.
>>
>>> Yes, please do depend on being able to mount filehandles that aren't the root
>>> of a filesystem.
>>>
>>> The case that brought this issue to my attention involved the server having
>>> a directory containing hundreds of home directories.  This directory is
>>> exported.
>>>
>>> If they mount that top level directory they get horrible performance.  If
>>> they use an automounter to just mount the homes that are accessed it works
>>> better.  They weren't able to explain why but my guess is that some tools
>>> (GUI filesystem browser) would occasionally do the equivalent of "ls  -l" of
>>> the top level directory which would hammer nfs-idmapd and probably ldap....
>>> though you would think that would get cached and not be a problem for long.
>>> So maybe it is more subtle than that.
>>
>> Getting all the id->name mappings for a 100-entry directory is going to
>> require 100 serialized upcalls to idmapd (and then possibly ldap), and
>> by default it looks like the idmapd cache will go cold after 10
>> minutes....  Not hard to imagine that could be a problem.
>>
>> Running multiple idmapd processes would be easy and might help?  Though
>> not if the client's just giving us the getattrs one at a time.
>>
>> Or maybe the problem's somewhere else entirely, but that's a real bug if
>> we aren't giving good performance on /home.
> 
> I did some experimenting..
> On both 'client' and 'server':
>   for i in `seq 2000 3000`; do echo u$i:x:$i:1000::/nohome:/bin/false; done >> /etc/passwd
> 
> On server in suitable directory
> 
>   for i in `seq 2000 3000`; do mkdir $i ; chown u$i $i ; done
> 
> Mount that directory onto the client with NFSv3 and "time ls -l" takes a
> little under 4 seconds.
> Mount with NFSv4 and it takes about the same.  However:
> 
> .....
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2974
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2975
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2976
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2977
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2978
> drwxr-xr-x 2 u2979      root 4096 Oct  8 16:19 2979
> drwxr-xr-x 2 u2980      root 4096 Oct  8 16:19 2980
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2981
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2982
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2983
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2984
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2985
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2986
> ....
> 
> 
> tcpdump shows the server is returning the right stuff, but something is going
> wrong on the client.  I've tried unmounting/remounting and killing/restarting
> rpc.idmapd.
> I had some config problems previously .. is there any chance that these
> unknown entries are in a cache?  Any easy way to view or flush the cache?
Assuming you are using the keyring-based idmapper, "nfsidmap -cv" will
clear the keyring of user and group ids. See nfsidmap(5).

If you are using rpc.idmapd, I believe
    echo `date +'%s'` > /proc/net/rpc/nfs4.idtoname/flush
will do the trick. The CITI FAQ
    http://www.citi.umich.edu/projects/nfsv4/linux/faq/
has a section on working with this cache.
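
The two suggestions above could be wrapped in one small helper, roughly
as sketched below. This is only an illustration, not something from
nfs-utils: the function name flush_idmap_cache and the DRY_RUN and
IDMAP_FLUSH_FILE variables are made up for the example. The nfsidmap -c
call and the /proc flush path are the ones already mentioned above.

```shell
# Hypothetical helper (name and variables are assumptions for this sketch).
# Usage: flush_idmap_cache keyring|idmapd
#   DRY_RUN=1            print the action instead of performing it
#   IDMAP_FLUSH_FILE=... override the flush file (defaults to the /proc path)
flush_idmap_cache() {
    mode=${1:-}
    flush_file=${IDMAP_FLUSH_FILE:-/proc/net/rpc/nfs4.idtoname/flush}

    case "$mode" in
    keyring)
        # Keyring-based idmapper: clear the cached user and group ids.
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "nfsidmap -c"
        else
            nfsidmap -c
        fi
        ;;
    idmapd)
        # rpc.idmapd: write the current time (seconds since the epoch)
        # to the flush file, as in the echo `date +'%s'` command above.
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "date +%s > $flush_file"
        else
            date +%s > "$flush_file"
        fi
        ;;
    *)
        echo "usage: flush_idmap_cache keyring|idmapd" >&2
        return 2
        ;;
    esac
}
```

Running it with DRY_RUN=1 first shows what would be done without
touching anything. The real actions will most likely need root.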

steved.

> 
> Of course this is with text-file password lookup.  LDAP might be slower but
> I'd be surprised if it was much slower.
> 
> NeilBrown
> 
> 
> 
>>
>> --b.
>>
>>> I've built similar setups before.  There is something attractive about
>>> everyone's home directory being /home/$USERNAME even though they are on
>>> different servers and different filesystems.
>>>
>>> In the particular problem scenario, local policy requires the 'staff'
>>> directory on the server to not be world-accessible, but they still want to
>>> mount the individual home directories from there onto client machines as
>>> required.
>>> I cannot easily justify that policy, but the point is that it works with
>>> NFSv3 and with AUTH_SYS/no_root_squash, but not with NFSv4/kerb5.  I don't
>>> think we can fix this inconsistency but maybe we can explain it.
>>>
>>> I think your text is more accurate than mine, but also a little more vague so
>>> the important point may not be immediately obvious.  That might be a price we have
>>> to pay for accuracy.
>
