From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jim Lieb Subject: Re: Re: Re: [PATCH 1/3] switch_creds: Syscall to switch creds for file server ops Date: Thu, 31 Oct 2013 12:43:33 -0700 Message-ID: <3154208.hemKm7CgkG@jlieb-e6410> References: <1381960919-4542-1-git-send-email-jlieb@panasas.com> <5068709.xK6KPeP0iH@jlieb-e6410> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Linux FS Devel , "Eric W. Biederman" , Al Viro , "Theodore Ts'o" , "linux-kernel@vger.kernel.org" , , Jeff Layton To: Andy Lutomirski Return-path: In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org On Thursday, October 31, 2013 12:09:08 Andy Lutomirski wrote: > On Thu, Oct 24, 2013 at 1:24 PM, Jim Lieb wrote: > > On Thursday, October 24, 2013 12:28:15 Andy Lutomirski wrote: > >> On Wed, Oct 23, 2013 at 10:59 PM, Eric W. Biederman > >>=20 > >> wrote: > >> > Andy Lutomirski writes: > >> >> On 10/16/2013 08:52 PM, Eric W. Biederman wrote: > >> >>> Al Viro writes: > >> >>>> On Wed, Oct 16, 2013 at 06:18:16PM -0700, Eric W. Biederman w= rote: > >> >>>>> That doesn't look bad but it does need capable(CAP_SETUID) &= & > >> >>>>> capable(CAP_SETGID) or possibly something a little more refi= ned. > >> >>>>=20 > >> >>>> D'oh > >> >>>>=20 > >> >>>>> I don't think we want file descriptor passing to all of a su= dden > >> >>>>> become > >> >>>>> a grant of privilege, beyond what the passed fd can do. > >> >>>>=20 > >> >>>> Definitely. And an extra ) to make it compile wouldn't hurt > >> >>>> either... > >> >>>=20 > >> >>> There also appears to need to be a check that we don't gain an= y > >> >>> capabilities. > >> >>>=20 > >> >>> We also need a check so that you don't gain any capabilities, = and > >> >>> possibly a few other things. > >> >>=20 > >> >> Why? I like the user_ns part, but I'm not immediately seeing t= he > >> >> issue > >> >> with capabilities. > >> >=20 > >> > My reasoning was instead of making this syscall as generic as po= ssible > >> > start it out by only allowing the cases Jim cares about and work= ing > >> > with > >> > a model where you can't gain any permissions you couldn't gain > >> > otherwise. > >> >=20 > >> > Although the fd -1 trick to revert to your other existing cred s= eems > >> > reasonable. > >> >=20 > >> >>> So I suspect we want a check something like: > >> >>>=20 > >> >>> if ((new_cred->securebits !=3D current_cred->securebits) || > >> >>>=20 > >> >>> (new_cred->cap_inheritable !=3D current_cred->cap_inherita= ble) || > >> >>> (new_cred->cap_permitted !=3D current_cred->cap_permitted)= || > >> >>> (new_cred->cap_effective !=3D current_cred->cap_effective)= || > >> >>> (new_cred->cap_bset !=3D current_cred->cap_bset) || > >> >>> (new_cred->jit_keyring !=3D current_cred->jit_keyring) || > >> >>> (new_cred->session_keyring !=3D current_cred->session_keyr= ing) || > >> >>> (new_cred->process_keyring !=3D current_cred->process_keyr= ing) || > >> >>> (new_cred->thread_keyring !=3D current_cred->thread_keyrin= g) || > >> >>> (new_cred->request_keyring !=3D current_cred->request_keyr= ing) || > >> >>> (new_cred->security !=3D current_cred->security) || > >> >>> (new_cred->user_ns !=3D current_cred->user_ns)) { > >> >>> =20 > >> >>> return -EPERM; > >> >>>=20 > >> >>> } > >> >>=20 > >> >> I *really* don't like the idea of being able to use any old fil= e > >> >> descriptor. I barely care what rights the caller needs to have= to > >> >> invoke this -- if you're going to pass an fd that grants a capa= bility > >> >> (in the non-Linux sense of the work), please make sure that the= sender > >> >> actually wants that behavior. > >> >>=20 > >> >> IOW, have a syscall to generate a special fd for this purpose. = It's > >> >> only a couple lines of code, and I think we'll really regret it= if we > >> >> fsck this up. > >> >>=20 > >> >> (I will take it as a personal challenge to find at least one > >> >> exploitable > >> >> privilege escalation in this if an arbitrary fd works.) > >> >=20 > >> > If you can't switch to a uid or a gid you couldn't switch to oth= erwise > >> > then the worst that can happen is an information leak. And info= rmation > >> > leaks are rarely directly exploitable. > >>=20 > >> Here's the attack: > >>=20 > >> Suppose there's a daemon that uses this in conjunction with > >> SCM_RIGHTS. The daemon is effectively root (under the current > >> proposal, it has to be). So a client connects, sends a credential= fd, > >> and the daemon impersonates those credentials. > >>=20 > >> Now a malicious user obtains *any* fd opened by root. It sends th= at > >> fd to the daemon. The daemon then impersonates root. We lose. (= It > >> can't possibly be hard to obtain an fd with highly privileged f_cr= ed > >> -- I bet that most console programs have stdin like that, for exam= ple. > >>=20 > >> There are probably many setuid programs that will happily open > >>=20 > >> /dev/null for you, too.) > >=20 > > In my reply to Eric, I note that I need to add a check that the fd = passed > > is one from switch_creds. With that test, not any fd will do and th= e one > > that does has only been able to set fsuid, fsgid, altgroups, and re= duced > > (the nfsd set) caps. They can do no more damage than what the orig= inal > > switch_creds allowed. The any fd by root no longer applies so use > > doesn't get much (no escalation). > >=20 > >> >> Also... real_cred looks confusing. AFAICS it is used *only* fo= r knfsd > >> >> and faccessat. That is, current userspace can't see it. But n= ow > >> >> you'll > >> >> expose various oddities. For example, AFAICS a capability-less > >> >> process > >> >> that's a userns owner can always use setuid. This will *overwr= ite* > >> >> real_cred. Then you're screwed, especially if this happens by > >> >> accident. > >> >=20 > >> > And doing in userland what faccessat, and knfsd do in the kernel= is > >> > exactly what is desired here. But maybe there are issues with t= hat. > >> >=20 > >> >> That being said, Windows has had functions like this for a long= time. > >> >> Processes have a primary token and possibly an impersonation to= ken.=20 > >> >> Any > >> >> process can call ImpersonateLoggedOnUser (no privilege required= ) to > >> >> impersonate the credentials of a token (which is special kind o= f fd). > >> >> Similarly, any process can call RevertToSelf to undo it. > >> >>=20 > >> >> Is there any actual problem with allowing completely unprivileg= ed > >> >> tasks > >> >> to switch to one of these magic cred fds? That would avoid nee= ding a > >> >> "revert" operation. > >> >=20 > >> > If the permission model is this switching of credentials doesn't= get > >> > you > >> > anything you couldn't get some other way. That would seem to to= tally > >> > rules out unprivileged processes switching these things. > >>=20 > >> IMO, there are two reasonable models that involve fds carrying som= e > >> kind of credential. > >>=20 > >> 1. The fd actually carries the ability to use the credentials. Yo= u > >> need to be very careful whom you send these to. The security > >> implications are obvious (which is good) and the receiver doesn't = need > >> privilege (which is good, as long as the receiver is careful). > >=20 > > I test for caps. I think using switch_creds in all forms should re= quire > > privs. I thought of the revert case and although it does simply re= turn to > > "real" creds, you still need CAP_SETUID and CAP_SETGID to do it. T= his > > means, of course, if you have really messed up things, you may not = be > > able to get back home which, although a bad thing for the process, = is a > > good thing for the system as a whole. > >=20 > >> 2. The fd is for identification only. But this means that the fd > >> carries the ability to identify as a user. So you *still* have to= be > >> very careful about whom you send it to. What you need is an opera= tion > >> that allows you to identify using the fd without transitively gran= ting > >> the recipient the same ability. On networks, this is done by sign= ing > >> some kind of challenge. The kernel could work the same way, or th= ere > >> could be a new CMSG_IDENTITY that you need an identity fd to send = but > >> that does not copy that fd to the recipient. > >=20 > > I am not sure I understand this. CMSG only applies to UNIX_DOMAIN = sockets > > which means that the switch_creds fd test still applies here. It i= s > > identification but only for within the same kernel. As for namespa= ces, > > the > > translation was done when the creds fd was created. I suppose if i= t was > > passed across namespace boundaries there could be a problem but wha= t is in > > the creds structure is the translated fduid,fsgid etc., not the > > untranslated one. We are still doing access checks and quota stuff = with > > the translated creds. If one namespace has "fred" as uid 52 and an= other > > has "mary" as 52, the quota will only be credited to whoever "fred" > > really is only if "fred" gets to be "mary" in the second alternate > > universe... >=20 > I'm not talking about namespaces here -- I think we're not really on > the same page. Can you describe what you're envisioning doing with > these fds? I have a new version that is now two syscalls and no multiplexing has m= ore=20 testing etc. to mirror exactly the tests in setfsuid(). I still am tes= ting=20 but plan to send this new patchset next week for review. Ok, I may have missed something. What I meant to say is that I'm usin= g the=20 same namespace functions for switch_creds as all the set*uid syscalls u= se so=20 what I have should work as well. If one were to pass this fd via CMSG,= it=20 could cross a namespace boundary. The changed mappings of this fd woul= d be=20 the same as for any other fd passed across now. Maybe that is irreleva= nt in=20 this case.=20 The purpose of the fd is to create a handle to a creds set that a serve= r can=20 then efficiently switch to prior to filesystem syscalls where fsuid etc= =2E are=20 relevant (mkdir, creat, write, etc.). This is exposing the override_cr= eds()=20 and revert_creds() to these servers. I do not intend in our server to = hand=20 these fds to anyone else. After all, they are useless for anything oth= er than=20 switch_creds/use_creds. The server keeps a cache of its known, currently active client creds al= ong=20 with this fd. The fast path is to lookup the creds and use the fd. On= cache=20 misses, it does a switch_creds() and saves the fd in the cache. >=20 > >> >> Another note: I think that there may be issues if the creator o= f a > >> >> token > >> >> has no_new_privs set and the user doesn't. Imagine a daemon th= at > >> >> accepts one of these fds, impersonates it, and calls exec. Thi= s could > >> >> be used to escape from no_new_privs land. > >> >=20 > >> > Which is why I was suggesting that we don't allow changing any f= ield in > >> > the cred except for uids and gids. > >>=20 > >> If the daemon impersonates and execs, we still lose. > >=20 > > I answered this. You only get to impersonate for purposes of file = access > > with reduced caps to prevent you from being root as well. Also, si= nce > > these are O_CLOEXEC and switch_creds "magic" fds, this can't happen > > because the fd is gone post-exec. >=20 > If a no_new_privs process authenticates to a daemon and that daemon > execs, then there's now a dumpable, ptraceable, non-no-new-privs > process running as the uid/gid of the no_new_privs process. This may > be dangerous. Two things. My new patch set now throws an error if the fd is not one=20 returned by switch_creds(). The new syscall here is use_creds() btw. = That fd=20 is opened O_CLOEXEC so the syscall has two guards. First, it can only = be one=20 that switch_creds() returned and second, it gets closed on an exec so t= he=20 no_new_privs process doesn't get it. In addition, switch_creds() reduc= es the=20 effective caps set to the same minimal set that nfsd_setuser() has. If someone else chooses to pass this fd via CMSG, whoever gets it has r= educed=20 privs and in order to use it for anything, i.e. use_creds(), it still n= eeds to=20 inherit SETUID and SETGID caps from its parent or it will get an EPERM = error.=20 Passing this in "friendly" code defeats the purpose of the cache above = in that=20 we could get fd leaks. If there is another flag that could prevent its= being=20 passed along via CMSG, I can add that too because using CMSG is well ou= tside=20 the use case. Note too, it also cannot pass it on via exec. The use case is and shou= ld be=20 as for knfsd, namely which calls nfsd_setuser() followed by fulfilling= the=20 protocol op followed by a revert of creds. All this is is in the same = thread. Jim >=20 > >> --Andy > >=20 > > -- > > Jim Lieb > > Linux Systems Engineer > > Panasas Inc. > >=20 > > "If ease of use was the only requirement, we would all be riding > > tricycles" > > - Douglas Engelbart 1925=962013 --=20 Jim Lieb Linux Systems Engineer Panasas Inc. "If ease of use was the only requirement, we would all be riding tricyc= les" - Douglas Engelbart 1925=962013