From: Steve Dickson <SteveD@redhat.com>
To: NeilBrown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
"Myklebust, Trond" <Trond.Myklebust@netapp.com>,
NFS <linux-nfs@vger.kernel.org>
Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.
Date: Mon, 08 Oct 2012 07:42:34 -0400
Message-ID: <5072BC2A.1060100@RedHat.com>
In-Reply-To: <20121008170304.37dc6ae9@notabene.brown>
On 08/10/12 02:03, NeilBrown wrote:
> On Thu, 4 Oct 2012 12:07:39 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
>
>> On Thu, Oct 04, 2012 at 08:46:59AM +1000, NeilBrown wrote:
>>> On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <bfields@fieldses.org>
>>> wrote:
>>>
>>>> On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
>>>>> On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
>>>>>> On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
>>>>>>> On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I guess you're right. So it starts to sound more like: "you have a
>>>>>>>> confusing setup. Your export configuration says one thing, and your
>>>>>>>> filesystem permissions say another. Under NFSv3 the confusion didn't
>>>>>>>> matter, but now it does--time to fix it."
>>>>>>>>
>>>>>>>
>>>>>>> That's the best I could come to - I'm glad to have it confirmed. Thanks!
>>>>>>>
>>>>>>> It is unfortunate that Linux NFS uses an anon credential to mount when krb5
>>>>>>> is in use, and uses 'root' when auth_sys is used (which might be anon if
>>>>>>> "root_squash" is active, but might not).
>>>>>>> I wonder if it would work to use auth_none for the mount-time lookup, just
>>>>>>> for consistency..
>>>>>>>
>>>>>>> Is the following appropriate? Is there somewhere better to put this caveat?
>>>>>>
>>>>>> Unfortunately, it's more complicated than this, as it depends on client
>>>>>> implementation and configuration details.
>>>>>>
>>>>>> Something like this would be more accurate but possibly too long:
>>>>>>
>>>>>> Note that under NFSv2 and NFSv3, the mount path is traversed by
>>>>>> mountd acting as root, but under NFSv4 the mount path is looked
>>>>>> up using the client's credentials. This means that, for
>>>>>> example, if a client mounts using a krb5 credential that the
>>>>>> server maps to an "anonymous" user, then the mount will only
>>>>>> succeed if that directory and all its parents allow eXecute
>>>>>> permissions.
>>>>>
>>>>> So you're listing this as a "feature" rather than a bug? There should be
>>>>> no reason to constrain the pseudofs to use the permission checks from
>>>>> the underlying filesystem.
>>>>
>>>> I'd be fine with that.
>>>>
>>>> (That still leaves some subtle v3/v4 difference in the case of mount
>>>> paths underneath an export?
>>>>
>>>> What *is* the existing mountd behavior there, exactly? I'm inclined to
>>>> think allowing mounts of arbitrary subdirectories is a bug, but maybe
>>>> there's some historical reason for it or maybe someone already depends
>>>> on it.)
>>>>
>>>> --b.
>>>
>>> The behaviour is simply that you mount a filehandle (typically belonging to a
>>> directory) and that filehandle can be anything inside any exported filesystem.
>>
>> It's not the nfsd behavior that bothers me--there's nothing we can do
>> about the fact that access by filehandle can bypass directory
>> permissions.
>>
>> What bothers me is that mountd will apparently allow anyone to do a lookup
>> anywhere in an exported filesystem.
>
> Not anyone - it requires a privileged source port from a known host.
> So it is only "anyone who can get 'root'".
>
>>
>> I don't know--maybe I shouldn't be so concerned about the possibility a
>> rogue user could figure out that my "Music" directory includes an
>> unreasonable number of Miles Davis titles.
>>
>>> Yes, please do depend on being able to mount filehandles that aren't the root
>>> of a filesystem.
>>>
>>> The case that brought this issue to my attention involved the server having
>>> a directory containing hundreds of home directories. This directory is
>>> exported.
>>>
>>> If they mount that top level directory they get horrible performance. If
>>> they use an automounter to just mount the homes that are accessed it works
>>> better. They weren't able to explain why but my guess is that some tools
>>> (GUI filesystem browser) would occasionally do the equivalent of "ls -l" of
>>> the top level directory which would hammer nfs-idmapd and probably ldap....
>>> though you would think that would get cached and not be a problem for long.
>>> So maybe it is more subtle than that.
>>
>> Getting all the id->name mappings for a 100-entry directory is going to
>> require 100 serialized upcalls to idmapd (and then possibly ldap), and
>> by default it looks like the idmapd cache will go cold after 10
>> minutes.... Not hard to imagine that could be a problem.
>>
>> Running multiple idmapd processes would be easy and might help? Though
>> not if the client's just giving us the getattrs one at a time.
>>
>> Or maybe the problem's somewhere else entirely, but that's a real bug if
>> we aren't giving good performance on /home.
>
> I did some experimenting..
> On both 'client' and 'server':
> for i in `seq 2000 3000`; do echo u$i:x:$i:1000::/nohome:/bin/false; done >> /etc/passwd
>
> On server in suitable directory
>
> for i in `seq 2000 3000`; do mkdir $i ; chown u$i $i ; done
>
> Mount that directory onto the client with NFSv3 and "time ls -l" takes a
> little under 4 seconds.
> Mount with NFSv4 and it takes about the same. However:
>
> .....
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2974
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2975
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2976
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2977
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2978
> drwxr-xr-x 2 u2979 root 4096 Oct 8 16:19 2979
> drwxr-xr-x 2 u2980 root 4096 Oct 8 16:19 2980
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2981
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2982
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2983
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2984
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2985
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2986
> ....
>
>
> tcpdump shows the server is returning the right stuff, but something is going
> wrong on the client. I've tried unmounting/remounting and killing/restarting
> rpc.idmapd.
> I had some config problems previously .. is there any chance that these
> unknown entries are in a cache? Any easy way to view or flush the cache?

Assuming you are using the keyring-based idmapper, "nfsidmap -cv" will
clear the keyring of user and group ids. See nfsidmap(5).

If you are using rpc.idmapd, I believe

    echo `date +'%s'` > /proc/net/rpc/nfs4.idtoname/flush

will do the trick. The CITI FAQ

    http://www.citi.umich.edu/projects/nfsv4/linux/faq/

has a section on working with this cache.

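
For the rpc.idmapd case, here is a minimal sketch of the same flush wrapped
in a shell function. The function name and the optional path argument are
only illustrative; the default path is the one given above, and writing to
it normally requires root.

```shell
# Sketch only: write the current epoch time to an rpc cache "flush" file.
# flush_rpc_cache is a hypothetical helper name, not part of nfs-utils.
# With no argument it targets the idtoname cache mentioned in this mail;
# a different path can be passed in (handy for a dry run on a scratch file).
flush_rpc_cache() {
    flush_file=${1:-/proc/net/rpc/nfs4.idtoname/flush}

    # Bail out early if the file is missing or not writable
    # (for example, not running as root, or the cache does not exist).
    if [ ! -w "$flush_file" ]; then
        echo "cannot write $flush_file" >&2
        return 1
    fi

    # Same effect as: echo `date +'%s'` > "$flush_file"
    date +%s > "$flush_file"
}
```

Running "flush_rpc_cache" as root should be equivalent to the one-liner;
running it against a scratch file first shows what would be written.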
steved.
>
> Of course this is with text-file password lookup. LDAP might be slower but
> I'd be surprised if it was much slower.
>
> NeilBrown
>
>
>
>>
>> --b.
>>
>>> I've built similar setups before. There is something attractive about
>>> everyone's home directory being /home/$USERNAME even though they are on
>>> different servers and different filesystems.
>>>
>>> In the particular problem scenario, local policy requires that the 'staff'
>>> directory on the server not be world-accessible, but they still want to
>>> mount the individual home directories from there onto client machines as
>>> required.
>>> I cannot easily justify that policy, but the point is that it works with
>>> NFSv3 and with AUTH_SYS/no_root_squash, but not with NFSv4/krb5. I don't
>>> think we can fix this inconsistency but maybe we can explain it.
>>>
>>> I think your text is more accurate than mine, but also a little more vague so
>>> the important point may not be immediately obvious. That might be a price we have
>>> to pay for accuracy.
>