* [RFC] capabilities: Ambient capabilities @ 2015-03-12 18:08 Andy Lutomirski 2015-03-12 21:49 ` Kees Cook 0 siblings, 1 reply; 34+ messages in thread From: Andy Lutomirski @ 2015-03-12 18:08 UTC (permalink / raw) Cc: Andy Lutomirski, Kees Cook, Christoph Lameter, Serge Hallyn, Andy Lutomirski, Jonathan Corbet, Aaron Jones, Ted Ts'o, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-api-u79uwXL29TY76Z2rM5mHXA, akpm-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Andrew G. Morgan, Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen, Michael Kerrisk Credit where credit is due: this idea comes from Christoph Lameter with a lot of valuable input from Serge Hallyn. This patch is heavily based on Christoph's patch. ===== The status quo ===== On Linux, there are a number of capabilities defined by the kernel. To perform various privileged tasks, processes can wield capabilities that they hold. Each task has four capability masks: effective (pE), permitted (pP), inheritable (pI), and a bounding set (X). When the kernel checks for a capability, it checks pE. The other capability masks serve to modify what capabilities can be in pE. Any task can remove capabilities from pE, pP, or pI at any time. If a task has a capability in pP, it can add that capability to pE and/or pI. If a task has CAP_SETPCAP, then it can add any capability to pI, and it can remove capabilities from X. Tasks are not the only things that can have capabilities; files can also have capabilities. A file can have no capabilty information at all [1]. If a file has capability information, then it has a permitted mask (fP) and an inheritable mask (fI) as well as a single effective bit (fE) [2]. File capabilities modify the capabilities of tasks that execve(2) them. A task that successfully calls execve has its capabilities modified for the file ultimately being excecuted (i.e. the binary itself if that binary is ELF or for the interpreter if the binary is a script.) [3] In the capability evolution rules, for each mask Z, pZ represents the old value and pZ' represents the new value. The rules are: pP' = (X & fP) | (pI & fI) pI' = pI pE' = (fE ? pP' : 0) X is unchanged For setuid binaries, fP, fI, and fE are modified by a moderately complicated set of rules that emulate POSIX behavior. Similarly, if euid == 0 or ruid == 0, then fP, fI, and fE are modified differently (primary, fP and fI usually end up being the full set). For nonroot users executing binaries with neither setuid nor file caps, fI and fP are empty and fE is false. As an extra complication, if you execute a process as nonroot and fE is set, then the "secure exec" rules are in effect: AT_SECURE gets set, LD_PRELOAD doesn't work, etc. This is rather messy. We've learned that making any changes is dangerous, though: if a new kernel version allows an unprivileged program to change its security state in a way that persists cross execution of a setuid program or a program with file caps, this persistent state is surprisingly likely to allow setuid or file-capped programs to be exploited for privilege escalation. ===== The problem ===== Capability inheritance is basically useless. If you aren't root and you execute an ordinary binary, fI is zero, so your capabilities have no effect whatsoever on pP'. This means that you can't usefully execute a helper process or a shell command with elevated capabilities if you aren't root. On current kernels, you can sort of work around this by setting fI to the full set for most or all non-setuid executable files. This causes pP' = pI for nonroot, and inheritance works. No one does this because it's a PITA and it isn't even supported on most filesystems. If you try this, you'll discover that every nonroot program ends up with secure exec rules, breaking many things. This is a problem that has bitten many people who have tried to use capabilities for anything useful. ===== The proposed change ===== This patch adds a fifth capability mask called the ambient mask (pA). pA does what pI should have done. pA obeys the invariant that no bit can ever be set in pA if it is not set in both pP and pI. Dropping a bit from pP or pI drops that bit from pA. This ensures that existing programs that try to drop capabilities still do so, with a complication. Because capability inheritance is so broken, setting KEEPCAPS, using setresuid to switch to nonroot uids, and calling execve effectively drops capabilities. Therefore, setresuid from root to nonroot unconditionally clears pA. Processes that don't like this can re-add bits to pA afterwards. The capability evolution rules are changed: pA' = (file caps or setuid or setgid ? 0 : pA) pP' = (X & fP) | (pI & fI) | pA' pI' = pI pE' = (fE ? pP' : pA') X is unchanged If you are nonroot but you have a capability, you can add it to pA. If you do so, your children get that capability in pA, pP, and pE. For example, you can set pA = CAP_NET_BIND_SERVICE, and your children can automatically bind low-numbered ports. Hallelujah! Unprivileged users can create user namespaces, map themselves to a nonzero uid, and create both privileged (relative to their namespace) and unprivileged process trees. This is currently more or less impossible. Hallelujah! You cannot use pA to try to subvert a setuid, setgid, or file-capped program: if you execute any such program, pA gets cleared and the resulting evolution rules are unchanged by this patch. Users with nonzero pA are unlikely to unintentionally leak that capability. If they run programs that try to drop privileges, dropping privileges will still work. It's worth noting that the degree of paranoia in this patch could possibly be relaxed without causing serious problems. Specifically, if we allowed pA to persist across executing non-pA-aware setuid binaries and across setresuid, then, naively, the only capabilities that could leak as a result would be the capabilities in pA, and any attacker *already* has those capabilities. This would make me nervous, though -- setuid binaries that tried to privilege-separate might fail to do so, and putting CAP_DAC_READ_SEARCH or CAP_DAC_OVERRIDE into pA could have unexpected side effects. (Whether these unexpected side effects would be exploitable is an open question.) I've therefore taken the more paranoid route. An alternative would be to either require PR_SET_NO_NEW_PRIVS before setting ambient capabilities. I think that this would be annoying and would make granting otherwise unprivileged users minor ambient capabilities (CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much less useful than it is with this patch. ===== Footnotes ===== [1] Files that are missing the "security.capability" xattr or that have unrecognized values for that xattr end up with has_cap == false. The code that does that appears to be complicated for no good reason. [2] The libcap capability mask parsers and formatters are dangerously misleading and the documentation is flat-out wrong. fE is *not* a mask; it's a single bit. This has probably confused every single person who has tried to use file capabilities. [3] Linux very confusingly processes the script and the interpreter if applicable, for reasons that escape me. The results from thinking about a script's file capabilities and/or setuid bits are mostly discarded. Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> Cc: Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Cc: Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> Cc: Jonathan Corbet <corbet-T1hC0tSOHrs@public.gmane.org> Cc: Aaron Jones <aaronmdjones-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> CC: Ted Ts'o <tytso-3s7WtUTddSA@public.gmane.org> Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA@public.gmane.org Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org Cc: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org Cc: akpm-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org Cc: Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> Cc: Mimi Zohar <zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> Cc: Austin S Hemmelgarn <ahferroin7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Cc: Markku Savela <msa-kXoF896ld44xHbG02/KK1g@public.gmane.org> Cc: Jarkko Sakkinen <jarkko.sakkinen-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> Cc: Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Signed-off-by: Andy Lutomirski <luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> --- Preliminary userspace code is here: https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=860c73ac1acaaae976bdd3bb83b89b0180f0702a fs/proc/array.c | 5 ++- include/linux/cred.h | 15 +++++++++ include/uapi/linux/prctl.h | 6 ++++ kernel/user_namespace.c | 1 + security/commoncap.c | 75 ++++++++++++++++++++++++++++++++++++++------ security/keys/process_keys.c | 1 + 6 files changed, 92 insertions(+), 11 deletions(-) diff --git a/fs/proc/array.c b/fs/proc/array.c index 1295a00ca316..bc15356d6551 100644 --- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -282,7 +282,8 @@ static void render_cap_t(struct seq_file *m, const char *header, static inline void task_cap(struct seq_file *m, struct task_struct *p) { const struct cred *cred; - kernel_cap_t cap_inheritable, cap_permitted, cap_effective, cap_bset; + kernel_cap_t cap_inheritable, cap_permitted, cap_effective, + cap_bset, cap_ambient; rcu_read_lock(); cred = __task_cred(p); @@ -290,12 +291,14 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p) cap_permitted = cred->cap_permitted; cap_effective = cred->cap_effective; cap_bset = cred->cap_bset; + cap_ambient = cred->cap_ambient; rcu_read_unlock(); render_cap_t(m, "CapInh:\t", &cap_inheritable); render_cap_t(m, "CapPrm:\t", &cap_permitted); render_cap_t(m, "CapEff:\t", &cap_effective); render_cap_t(m, "CapBnd:\t", &cap_bset); + render_cap_t(m, "CapAmb:\t", &cap_ambient); } static inline void task_seccomp(struct seq_file *m, struct task_struct *p) diff --git a/include/linux/cred.h b/include/linux/cred.h index 2fb2ca2127ed..a21bcba6ef84 100644 --- a/include/linux/cred.h +++ b/include/linux/cred.h @@ -122,6 +122,7 @@ struct cred { kernel_cap_t cap_permitted; /* caps we're permitted */ kernel_cap_t cap_effective; /* caps we can actually use */ kernel_cap_t cap_bset; /* capability bounding set */ + kernel_cap_t cap_ambient; /* Ambient capability set */ #ifdef CONFIG_KEYS unsigned char jit_keyring; /* default keyring to attach requested * keys to */ @@ -197,6 +198,20 @@ static inline void validate_process_creds(void) } #endif +static inline void cap_enforce_ambient_invariants(struct cred *cred) +{ + cred->cap_ambient = cap_intersect(cred->cap_ambient, + cap_intersect(cred->cap_permitted, + cred->cap_inheritable)); +} + +static inline bool cap_ambient_invariant_ok(const struct cred *cred) +{ + return cap_issubset(cred->cap_ambient, + cap_intersect(cred->cap_permitted, + cred->cap_inheritable)); +} + /** * get_new_cred - Get a reference on a new set of credentials * @cred: The new credentials to reference diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 31891d9535e2..65407f867e82 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -190,4 +190,10 @@ struct prctl_mm_map { # define PR_FP_MODE_FR (1 << 0) /* 64b FP registers */ # define PR_FP_MODE_FRE (1 << 1) /* 32b compatibility */ +/* Control the ambient capability set */ +#define PR_CAP_AMBIENT 47 +# define PR_CAP_AMBIENT_GET 1 +# define PR_CAP_AMBIENT_RAISE 2 +# define PR_CAP_AMBIENT_LOWER 3 + #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c index 4109f8320684..dab0f808235a 100644 --- a/kernel/user_namespace.c +++ b/kernel/user_namespace.c @@ -39,6 +39,7 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns) cred->cap_inheritable = CAP_EMPTY_SET; cred->cap_permitted = CAP_FULL_SET; cred->cap_effective = CAP_FULL_SET; + cred->cap_ambient = CAP_EMPTY_SET; cred->cap_bset = CAP_FULL_SET; #ifdef CONFIG_KEYS key_put(cred->request_key_auth); diff --git a/security/commoncap.c b/security/commoncap.c index f66713bd7450..b3253886ecad 100644 --- a/security/commoncap.c +++ b/security/commoncap.c @@ -272,6 +272,7 @@ int cap_capset(struct cred *new, new->cap_effective = *effective; new->cap_inheritable = *inheritable; new->cap_permitted = *permitted; + cap_enforce_ambient_invariants(new); return 0; } @@ -352,6 +353,7 @@ static inline int bprm_caps_from_vfs_caps(struct cpu_vfs_cap_data *caps, /* * pP' = (X & fP) | (pI & fI) + * The addition of pA' is handled later. */ new->cap_permitted.cap[i] = (new->cap_bset.cap[i] & permitted) | @@ -479,10 +481,12 @@ int cap_bprm_set_creds(struct linux_binprm *bprm) { const struct cred *old = current_cred(); struct cred *new = bprm->cred; - bool effective, has_cap = false; + bool effective, has_cap = false, is_setid; int ret; kuid_t root_uid; + BUG_ON(!cap_ambient_invariant_ok(old)); + effective = false; ret = get_file_caps(bprm, &effective, &has_cap); if (ret < 0) @@ -527,8 +531,9 @@ skip: * * In addition, if NO_NEW_PRIVS, then ensure we get no new privs. */ - if ((!uid_eq(new->euid, old->uid) || - !gid_eq(new->egid, old->gid) || + is_setid = !uid_eq(new->euid, old->uid) || !gid_eq(new->egid, old->gid); + + if ((is_setid || !cap_issubset(new->cap_permitted, old->cap_permitted)) && bprm->unsafe & ~LSM_UNSAFE_PTRACE_CAP) { /* downgrade; they get no more than they had, and maybe less */ @@ -544,10 +549,23 @@ skip: new->suid = new->fsuid = new->euid; new->sgid = new->fsgid = new->egid; + /* File caps or setid cancel ambient. */ + if (has_cap || is_setid) + cap_clear(new->cap_ambient); + + /* + * Now that we've computed pA', update pP' to give: + * pP' = (X & fP) | (pI & fI) | pA' + */ + new->cap_permitted = cap_combine(new->cap_permitted, new->cap_ambient); + if (effective) new->cap_effective = new->cap_permitted; else - cap_clear(new->cap_effective); + new->cap_effective = new->cap_ambient; + + BUG_ON(!cap_ambient_invariant_ok(new)); + bprm->cap_effective = effective; /* @@ -562,7 +580,7 @@ skip: * Number 1 above might fail if you don't have a full bset, but I think * that is interesting information to audit. */ - if (!cap_isclear(new->cap_effective)) { + if (!cap_issubset(new->cap_effective, new->cap_ambient)) { if (!cap_issubset(CAP_FULL_SET, new->cap_effective) || !uid_eq(new->euid, root_uid) || !uid_eq(new->uid, root_uid) || issecure(SECURE_NOROOT)) { @@ -573,6 +591,9 @@ skip: } new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS); + + BUG_ON(!cap_ambient_invariant_ok(new)); + return 0; } @@ -594,7 +615,7 @@ int cap_bprm_secureexec(struct linux_binprm *bprm) if (!uid_eq(cred->uid, root_uid)) { if (bprm->cap_effective) return 1; - if (!cap_isclear(cred->cap_permitted)) + if (!cap_issubset(cred->cap_permitted, cred->cap_ambient)) return 1; } @@ -696,10 +717,18 @@ static inline void cap_emulate_setxuid(struct cred *new, const struct cred *old) uid_eq(old->suid, root_uid)) && (!uid_eq(new->uid, root_uid) && !uid_eq(new->euid, root_uid) && - !uid_eq(new->suid, root_uid)) && - !issecure(SECURE_KEEP_CAPS)) { - cap_clear(new->cap_permitted); - cap_clear(new->cap_effective); + !uid_eq(new->suid, root_uid))) { + if (!issecure(SECURE_KEEP_CAPS)) { + cap_clear(new->cap_permitted); + cap_clear(new->cap_effective); + } + + /* + * Pre-ambient programs except setresuid to nonroot followed + * by exec to drop capabilities. We should make sure that + * this remains the case. + */ + cap_clear(new->cap_ambient); } if (uid_eq(old->euid, root_uid) && !uid_eq(new->euid, root_uid)) cap_clear(new->cap_effective); @@ -929,6 +958,32 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3, new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS); return commit_creds(new); + case PR_CAP_AMBIENT: + if (((!cap_valid(arg3)) | arg4 | arg5)) + return -EINVAL; + + if (arg2 == PR_CAP_AMBIENT_GET) { + return !!cap_raised(current_cred()->cap_ambient, arg3); + } else if (arg2 != PR_CAP_AMBIENT_RAISE && + arg2 != PR_CAP_AMBIENT_LOWER) { + return -EINVAL; + } else { + if (arg2 == PR_CAP_AMBIENT_RAISE && + (!cap_raised(current_cred()->cap_permitted, arg3) || + !cap_raised(current_cred()->cap_inheritable, + arg3))) + return -EPERM; + + new = prepare_creds(); + if (!new) + return -ENOMEM; + if (arg2 == PR_CAP_AMBIENT_RAISE) + cap_raise(new->cap_ambient, arg3); + else + cap_lower(new->cap_ambient, arg3); + return commit_creds(new); + } + default: /* No functionality available - continue with default */ return -ENOSYS; diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c index bd536cb221e2..43b4cddbf2b3 100644 --- a/security/keys/process_keys.c +++ b/security/keys/process_keys.c @@ -848,6 +848,7 @@ void key_change_session_keyring(struct callback_head *twork) new->cap_inheritable = old->cap_inheritable; new->cap_permitted = old->cap_permitted; new->cap_effective = old->cap_effective; + new->cap_ambient = old->cap_ambient; new->cap_bset = old->cap_bset; new->jit_keyring = old->jit_keyring; -- 2.3.0 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-03-12 18:08 [RFC] capabilities: Ambient capabilities Andy Lutomirski @ 2015-03-12 21:49 ` Kees Cook 2015-03-12 22:10 ` Andrew G. Morgan 0 siblings, 1 reply; 34+ messages in thread From: Kees Cook @ 2015-03-12 21:49 UTC (permalink / raw) To: Andy Lutomirski Cc: Christoph Lameter, Serge Hallyn, Andy Lutomirski, Jonathan Corbet, Aaron Jones, Ted Ts'o, linux-security-module, LKML, Linux API, akpm, Andrew G. Morgan, Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen, Michael Kerrisk On Thu, Mar 12, 2015 at 11:08 AM, Andy Lutomirski <luto@kernel.org> wrote: > Credit where credit is due: this idea comes from Christoph Lameter > with a lot of valuable input from Serge Hallyn. This patch is > heavily based on Christoph's patch. > > ===== The status quo ===== > > On Linux, there are a number of capabilities defined by the kernel. > To perform various privileged tasks, processes can wield > capabilities that they hold. > > Each task has four capability masks: effective (pE), permitted (pP), > inheritable (pI), and a bounding set (X). When the kernel checks > for a capability, it checks pE. The other capability masks serve to > modify what capabilities can be in pE. > > Any task can remove capabilities from pE, pP, or pI at any time. If > a task has a capability in pP, it can add that capability to pE > and/or pI. If a task has CAP_SETPCAP, then it can add any > capability to pI, and it can remove capabilities from X. > > Tasks are not the only things that can have capabilities; files can > also have capabilities. A file can have no capabilty information at > all [1]. If a file has capability information, then it has a > permitted mask (fP) and an inheritable mask (fI) as well as a single > effective bit (fE) [2]. File capabilities modify the capabilities > of tasks that execve(2) them. > > A task that successfully calls execve has its capabilities modified > for the file ultimately being excecuted (i.e. the binary itself if > that binary is ELF or for the interpreter if the binary is a > script.) [3] In the capability evolution rules, for each mask Z, pZ > represents the old value and pZ' represents the new value. The > rules are: > > pP' = (X & fP) | (pI & fI) > pI' = pI > pE' = (fE ? pP' : 0) > X is unchanged > > For setuid binaries, fP, fI, and fE are modified by a moderately > complicated set of rules that emulate POSIX behavior. Similarly, if > euid == 0 or ruid == 0, then fP, fI, and fE are modified differently > (primary, fP and fI usually end up being the full set). For nonroot > users executing binaries with neither setuid nor file caps, fI and > fP are empty and fE is false. > > As an extra complication, if you execute a process as nonroot and fE > is set, then the "secure exec" rules are in effect: AT_SECURE gets > set, LD_PRELOAD doesn't work, etc. > > This is rather messy. We've learned that making any changes is > dangerous, though: if a new kernel version allows an unprivileged > program to change its security state in a way that persists cross > execution of a setuid program or a program with file caps, this > persistent state is surprisingly likely to allow setuid or > file-capped programs to be exploited for privilege escalation. > > ===== The problem ===== > > Capability inheritance is basically useless. > > If you aren't root and you execute an ordinary binary, fI is zero, > so your capabilities have no effect whatsoever on pP'. This means > that you can't usefully execute a helper process or a shell command > with elevated capabilities if you aren't root. > > On current kernels, you can sort of work around this by setting fI > to the full set for most or all non-setuid executable files. This > causes pP' = pI for nonroot, and inheritance works. No one does > this because it's a PITA and it isn't even supported on most > filesystems. > > If you try this, you'll discover that every nonroot program ends up > with secure exec rules, breaking many things. > > This is a problem that has bitten many people who have tried to use > capabilities for anything useful. > > ===== The proposed change ===== > > This patch adds a fifth capability mask called the ambient mask > (pA). pA does what pI should have done. > > pA obeys the invariant that no bit can ever be set in pA if it is > not set in both pP and pI. Dropping a bit from pP or pI drops that > bit from pA. This ensures that existing programs that try to drop > capabilities still do so, with a complication. Because capability > inheritance is so broken, setting KEEPCAPS, using setresuid to > switch to nonroot uids, and calling execve effectively drops > capabilities. Therefore, setresuid from root to nonroot > unconditionally clears pA. Processes that don't like this can > re-add bits to pA afterwards. > > The capability evolution rules are changed: > > pA' = (file caps or setuid or setgid ? 0 : pA) > pP' = (X & fP) | (pI & fI) | pA' > pI' = pI > pE' = (fE ? pP' : pA') > X is unchanged > > If you are nonroot but you have a capability, you can add it to pA. > If you do so, your children get that capability in pA, pP, and pE. > For example, you can set pA = CAP_NET_BIND_SERVICE, and your > children can automatically bind low-numbered ports. Hallelujah! > > Unprivileged users can create user namespaces, map themselves to a > nonzero uid, and create both privileged (relative to their > namespace) and unprivileged process trees. This is currently more > or less impossible. Hallelujah! > > You cannot use pA to try to subvert a setuid, setgid, or file-capped > program: if you execute any such program, pA gets cleared and the > resulting evolution rules are unchanged by this patch. > > Users with nonzero pA are unlikely to unintentionally leak that > capability. If they run programs that try to drop privileges, > dropping privileges will still work. > > It's worth noting that the degree of paranoia in this patch could > possibly be relaxed without causing serious problems. Specifically, > if we allowed pA to persist across executing non-pA-aware setuid > binaries and across setresuid, then, naively, the only capabilities > that could leak as a result would be the capabilities in pA, and any > attacker *already* has those capabilities. This would make me > nervous, though -- setuid binaries that tried to privilege-separate > might fail to do so, and putting CAP_DAC_READ_SEARCH or > CAP_DAC_OVERRIDE into pA could have unexpected side effects. > (Whether these unexpected side effects would be exploitable is an > open question.) I've therefore taken the more paranoid route. > > An alternative would be to either require PR_SET_NO_NEW_PRIVS before > setting ambient capabilities. I think that this would be annoying > and would make granting otherwise unprivileged users minor ambient > capabilities (CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much > less useful than it is with this patch. > > ===== Footnotes ===== > > [1] Files that are missing the "security.capability" xattr or that > have unrecognized values for that xattr end up with has_cap == > false. The code that does that appears to be complicated for no > good reason. > > [2] The libcap capability mask parsers and formatters are > dangerously misleading and the documentation is flat-out wrong. fE > is *not* a mask; it's a single bit. This has probably confused > every single person who has tried to use file capabilities. > > [3] Linux very confusingly processes the script and the interpreter if > applicable, for reasons that escape me. The results from thinking > about a script's file capabilities and/or setuid bits are mostly discarded. > > Cc: Kees Cook <keescook@chromium.org> > Cc: Christoph Lameter <cl@linux.com> > Cc: Serge Hallyn <serge.hallyn@canonical.com> > Cc: Andy Lutomirski <luto@amacapital.net> > Cc: Jonathan Corbet <corbet@lwn.net> > Cc: Aaron Jones <aaronmdjones@gmail.com> > CC: Ted Ts'o <tytso@mit.edu> > Cc: linux-security-module@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Cc: linux-api@vger.kernel.org > Cc: akpm@linuxfoundation.org > Cc: Andrew G. Morgan <morgan@kernel.org> > Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> > Cc: Austin S Hemmelgarn <ahferroin7@gmail.com> > Cc: Markku Savela <msa@moth.iki.fi> > Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> > Cc: Michael Kerrisk <mtk.manpages@gmail.com> > Signed-off-by: Andy Lutomirski <luto@kernel.org> This would be quite welcome for things we're doing in Chrome OS. Presently, we're able to use fscaps to keep non-root caps across exec and haven't encountered issues with AT_SECURE (yet), but using pA would be much nicer and exactly matches how we want to use it: a launcher is creating a tree of processes that are non-root but need some capabilities. Right now the tree is very small and we're able to sprinkle our fscaps lightly. :) This would be better. -Kees > --- > > Preliminary userspace code is here: > > https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=860c73ac1acaaae976bdd3bb83b89b0180f0702a > > fs/proc/array.c | 5 ++- > include/linux/cred.h | 15 +++++++++ > include/uapi/linux/prctl.h | 6 ++++ > kernel/user_namespace.c | 1 + > security/commoncap.c | 75 ++++++++++++++++++++++++++++++++++++++------ > security/keys/process_keys.c | 1 + > 6 files changed, 92 insertions(+), 11 deletions(-) > > diff --git a/fs/proc/array.c b/fs/proc/array.c > index 1295a00ca316..bc15356d6551 100644 > --- a/fs/proc/array.c > +++ b/fs/proc/array.c > @@ -282,7 +282,8 @@ static void render_cap_t(struct seq_file *m, const char *header, > static inline void task_cap(struct seq_file *m, struct task_struct *p) > { > const struct cred *cred; > - kernel_cap_t cap_inheritable, cap_permitted, cap_effective, cap_bset; > + kernel_cap_t cap_inheritable, cap_permitted, cap_effective, > + cap_bset, cap_ambient; > > rcu_read_lock(); > cred = __task_cred(p); > @@ -290,12 +291,14 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p) > cap_permitted = cred->cap_permitted; > cap_effective = cred->cap_effective; > cap_bset = cred->cap_bset; > + cap_ambient = cred->cap_ambient; > rcu_read_unlock(); > > render_cap_t(m, "CapInh:\t", &cap_inheritable); > render_cap_t(m, "CapPrm:\t", &cap_permitted); > render_cap_t(m, "CapEff:\t", &cap_effective); > render_cap_t(m, "CapBnd:\t", &cap_bset); > + render_cap_t(m, "CapAmb:\t", &cap_ambient); > } > > static inline void task_seccomp(struct seq_file *m, struct task_struct *p) > diff --git a/include/linux/cred.h b/include/linux/cred.h > index 2fb2ca2127ed..a21bcba6ef84 100644 > --- a/include/linux/cred.h > +++ b/include/linux/cred.h > @@ -122,6 +122,7 @@ struct cred { > kernel_cap_t cap_permitted; /* caps we're permitted */ > kernel_cap_t cap_effective; /* caps we can actually use */ > kernel_cap_t cap_bset; /* capability bounding set */ > + kernel_cap_t cap_ambient; /* Ambient capability set */ > #ifdef CONFIG_KEYS > unsigned char jit_keyring; /* default keyring to attach requested > * keys to */ > @@ -197,6 +198,20 @@ static inline void validate_process_creds(void) > } > #endif > > +static inline void cap_enforce_ambient_invariants(struct cred *cred) > +{ > + cred->cap_ambient = cap_intersect(cred->cap_ambient, > + cap_intersect(cred->cap_permitted, > + cred->cap_inheritable)); > +} > + > +static inline bool cap_ambient_invariant_ok(const struct cred *cred) > +{ > + return cap_issubset(cred->cap_ambient, > + cap_intersect(cred->cap_permitted, > + cred->cap_inheritable)); > +} > + > /** > * get_new_cred - Get a reference on a new set of credentials > * @cred: The new credentials to reference > diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h > index 31891d9535e2..65407f867e82 100644 > --- a/include/uapi/linux/prctl.h > +++ b/include/uapi/linux/prctl.h > @@ -190,4 +190,10 @@ struct prctl_mm_map { > # define PR_FP_MODE_FR (1 << 0) /* 64b FP registers */ > # define PR_FP_MODE_FRE (1 << 1) /* 32b compatibility */ > > +/* Control the ambient capability set */ > +#define PR_CAP_AMBIENT 47 > +# define PR_CAP_AMBIENT_GET 1 > +# define PR_CAP_AMBIENT_RAISE 2 > +# define PR_CAP_AMBIENT_LOWER 3 > + > #endif /* _LINUX_PRCTL_H */ > diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c > index 4109f8320684..dab0f808235a 100644 > --- a/kernel/user_namespace.c > +++ b/kernel/user_namespace.c > @@ -39,6 +39,7 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns) > cred->cap_inheritable = CAP_EMPTY_SET; > cred->cap_permitted = CAP_FULL_SET; > cred->cap_effective = CAP_FULL_SET; > + cred->cap_ambient = CAP_EMPTY_SET; > cred->cap_bset = CAP_FULL_SET; > #ifdef CONFIG_KEYS > key_put(cred->request_key_auth); > diff --git a/security/commoncap.c b/security/commoncap.c > index f66713bd7450..b3253886ecad 100644 > --- a/security/commoncap.c > +++ b/security/commoncap.c > @@ -272,6 +272,7 @@ int cap_capset(struct cred *new, > new->cap_effective = *effective; > new->cap_inheritable = *inheritable; > new->cap_permitted = *permitted; > + cap_enforce_ambient_invariants(new); > return 0; > } > > @@ -352,6 +353,7 @@ static inline int bprm_caps_from_vfs_caps(struct cpu_vfs_cap_data *caps, > > /* > * pP' = (X & fP) | (pI & fI) > + * The addition of pA' is handled later. > */ > new->cap_permitted.cap[i] = > (new->cap_bset.cap[i] & permitted) | > @@ -479,10 +481,12 @@ int cap_bprm_set_creds(struct linux_binprm *bprm) > { > const struct cred *old = current_cred(); > struct cred *new = bprm->cred; > - bool effective, has_cap = false; > + bool effective, has_cap = false, is_setid; > int ret; > kuid_t root_uid; > > + BUG_ON(!cap_ambient_invariant_ok(old)); > + > effective = false; > ret = get_file_caps(bprm, &effective, &has_cap); > if (ret < 0) > @@ -527,8 +531,9 @@ skip: > * > * In addition, if NO_NEW_PRIVS, then ensure we get no new privs. > */ > - if ((!uid_eq(new->euid, old->uid) || > - !gid_eq(new->egid, old->gid) || > + is_setid = !uid_eq(new->euid, old->uid) || !gid_eq(new->egid, old->gid); > + > + if ((is_setid || > !cap_issubset(new->cap_permitted, old->cap_permitted)) && > bprm->unsafe & ~LSM_UNSAFE_PTRACE_CAP) { > /* downgrade; they get no more than they had, and maybe less */ > @@ -544,10 +549,23 @@ skip: > new->suid = new->fsuid = new->euid; > new->sgid = new->fsgid = new->egid; > > + /* File caps or setid cancel ambient. */ > + if (has_cap || is_setid) > + cap_clear(new->cap_ambient); > + > + /* > + * Now that we've computed pA', update pP' to give: > + * pP' = (X & fP) | (pI & fI) | pA' > + */ > + new->cap_permitted = cap_combine(new->cap_permitted, new->cap_ambient); > + > if (effective) > new->cap_effective = new->cap_permitted; > else > - cap_clear(new->cap_effective); > + new->cap_effective = new->cap_ambient; > + > + BUG_ON(!cap_ambient_invariant_ok(new)); > + > bprm->cap_effective = effective; > > /* > @@ -562,7 +580,7 @@ skip: > * Number 1 above might fail if you don't have a full bset, but I think > * that is interesting information to audit. > */ > - if (!cap_isclear(new->cap_effective)) { > + if (!cap_issubset(new->cap_effective, new->cap_ambient)) { > if (!cap_issubset(CAP_FULL_SET, new->cap_effective) || > !uid_eq(new->euid, root_uid) || !uid_eq(new->uid, root_uid) || > issecure(SECURE_NOROOT)) { > @@ -573,6 +591,9 @@ skip: > } > > new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS); > + > + BUG_ON(!cap_ambient_invariant_ok(new)); > + > return 0; > } > > @@ -594,7 +615,7 @@ int cap_bprm_secureexec(struct linux_binprm *bprm) > if (!uid_eq(cred->uid, root_uid)) { > if (bprm->cap_effective) > return 1; > - if (!cap_isclear(cred->cap_permitted)) > + if (!cap_issubset(cred->cap_permitted, cred->cap_ambient)) > return 1; > } > > @@ -696,10 +717,18 @@ static inline void cap_emulate_setxuid(struct cred *new, const struct cred *old) > uid_eq(old->suid, root_uid)) && > (!uid_eq(new->uid, root_uid) && > !uid_eq(new->euid, root_uid) && > - !uid_eq(new->suid, root_uid)) && > - !issecure(SECURE_KEEP_CAPS)) { > - cap_clear(new->cap_permitted); > - cap_clear(new->cap_effective); > + !uid_eq(new->suid, root_uid))) { > + if (!issecure(SECURE_KEEP_CAPS)) { > + cap_clear(new->cap_permitted); > + cap_clear(new->cap_effective); > + } > + > + /* > + * Pre-ambient programs except setresuid to nonroot followed > + * by exec to drop capabilities. We should make sure that > + * this remains the case. > + */ > + cap_clear(new->cap_ambient); > } > if (uid_eq(old->euid, root_uid) && !uid_eq(new->euid, root_uid)) > cap_clear(new->cap_effective); > @@ -929,6 +958,32 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3, > new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS); > return commit_creds(new); > > + case PR_CAP_AMBIENT: > + if (((!cap_valid(arg3)) | arg4 | arg5)) > + return -EINVAL; > + > + if (arg2 == PR_CAP_AMBIENT_GET) { > + return !!cap_raised(current_cred()->cap_ambient, arg3); > + } else if (arg2 != PR_CAP_AMBIENT_RAISE && > + arg2 != PR_CAP_AMBIENT_LOWER) { > + return -EINVAL; > + } else { > + if (arg2 == PR_CAP_AMBIENT_RAISE && > + (!cap_raised(current_cred()->cap_permitted, arg3) || > + !cap_raised(current_cred()->cap_inheritable, > + arg3))) > + return -EPERM; > + > + new = prepare_creds(); > + if (!new) > + return -ENOMEM; > + if (arg2 == PR_CAP_AMBIENT_RAISE) > + cap_raise(new->cap_ambient, arg3); > + else > + cap_lower(new->cap_ambient, arg3); > + return commit_creds(new); > + } > + > default: > /* No functionality available - continue with default */ > return -ENOSYS; > diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c > index bd536cb221e2..43b4cddbf2b3 100644 > --- a/security/keys/process_keys.c > +++ b/security/keys/process_keys.c > @@ -848,6 +848,7 @@ void key_change_session_keyring(struct callback_head *twork) > new->cap_inheritable = old->cap_inheritable; > new->cap_permitted = old->cap_permitted; > new->cap_effective = old->cap_effective; > + new->cap_ambient = old->cap_ambient; > new->cap_bset = old->cap_bset; > > new->jit_keyring = old->jit_keyring; > -- > 2.3.0 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Kees Cook Chrome OS Security ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-03-12 21:49 ` Kees Cook @ 2015-03-12 22:10 ` Andrew G. Morgan [not found] ` <CALQRfL7b8CjYgUnVy3jykNwv48fOc03T385RKo--cfv25YenBg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Andrew G. Morgan @ 2015-03-12 22:10 UTC (permalink / raw) To: Kees Cook Cc: Andy Lutomirski, Christoph Lameter, Serge Hallyn, Andy Lutomirski, Jonathan Corbet, Aaron Jones, Ted Ts'o, linux-security-module, LKML, Linux API, Andrew Morton, Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen, Michael Kerrisk I'm unclear why you refer to the inheritable set in this test: + } else { + if (arg2 == PR_CAP_AMBIENT_RAISE && + (!cap_raised(current_cred()->cap_permitted, arg3) || + !cap_raised(current_cred()->cap_inheritable, + arg3))) + return -EPERM; I'm also unclear how you can turn off this new 'feature' for a process tree? As it is, the code creates an exploit path for a capable (pP != 0) program with an exploitable flaw to create a privilege escalation for an arbitrary child program. While I understand that everyone 'knows what they are doing' in implementing this change, I'm convinced that folk that are up to no good also do... Why not provide a lockable secure bit to selectively disable this support? Nacked-by: Andrew G. Morgan <morgan@kernel.org> Cheers Andrew On Thu, Mar 12, 2015 at 2:49 PM, Kees Cook <keescook@chromium.org> wrote: > On Thu, Mar 12, 2015 at 11:08 AM, Andy Lutomirski <luto@kernel.org> wrote: >> Credit where credit is due: this idea comes from Christoph Lameter >> with a lot of valuable input from Serge Hallyn. This patch is >> heavily based on Christoph's patch. >> >> ===== The status quo ===== >> >> On Linux, there are a number of capabilities defined by the kernel. >> To perform various privileged tasks, processes can wield >> capabilities that they hold. >> >> Each task has four capability masks: effective (pE), permitted (pP), >> inheritable (pI), and a bounding set (X). When the kernel checks >> for a capability, it checks pE. The other capability masks serve to >> modify what capabilities can be in pE. >> >> Any task can remove capabilities from pE, pP, or pI at any time. If >> a task has a capability in pP, it can add that capability to pE >> and/or pI. If a task has CAP_SETPCAP, then it can add any >> capability to pI, and it can remove capabilities from X. >> >> Tasks are not the only things that can have capabilities; files can >> also have capabilities. A file can have no capabilty information at >> all [1]. If a file has capability information, then it has a >> permitted mask (fP) and an inheritable mask (fI) as well as a single >> effective bit (fE) [2]. File capabilities modify the capabilities >> of tasks that execve(2) them. >> >> A task that successfully calls execve has its capabilities modified >> for the file ultimately being excecuted (i.e. the binary itself if >> that binary is ELF or for the interpreter if the binary is a >> script.) [3] In the capability evolution rules, for each mask Z, pZ >> represents the old value and pZ' represents the new value. The >> rules are: >> >> pP' = (X & fP) | (pI & fI) >> pI' = pI >> pE' = (fE ? pP' : 0) >> X is unchanged >> >> For setuid binaries, fP, fI, and fE are modified by a moderately >> complicated set of rules that emulate POSIX behavior. Similarly, if >> euid == 0 or ruid == 0, then fP, fI, and fE are modified differently >> (primary, fP and fI usually end up being the full set). For nonroot >> users executing binaries with neither setuid nor file caps, fI and >> fP are empty and fE is false. >> >> As an extra complication, if you execute a process as nonroot and fE >> is set, then the "secure exec" rules are in effect: AT_SECURE gets >> set, LD_PRELOAD doesn't work, etc. >> >> This is rather messy. We've learned that making any changes is >> dangerous, though: if a new kernel version allows an unprivileged >> program to change its security state in a way that persists cross >> execution of a setuid program or a program with file caps, this >> persistent state is surprisingly likely to allow setuid or >> file-capped programs to be exploited for privilege escalation. >> >> ===== The problem ===== >> >> Capability inheritance is basically useless. >> >> If you aren't root and you execute an ordinary binary, fI is zero, >> so your capabilities have no effect whatsoever on pP'. This means >> that you can't usefully execute a helper process or a shell command >> with elevated capabilities if you aren't root. >> >> On current kernels, you can sort of work around this by setting fI >> to the full set for most or all non-setuid executable files. This >> causes pP' = pI for nonroot, and inheritance works. No one does >> this because it's a PITA and it isn't even supported on most >> filesystems. >> >> If you try this, you'll discover that every nonroot program ends up >> with secure exec rules, breaking many things. >> >> This is a problem that has bitten many people who have tried to use >> capabilities for anything useful. >> >> ===== The proposed change ===== >> >> This patch adds a fifth capability mask called the ambient mask >> (pA). pA does what pI should have done. >> >> pA obeys the invariant that no bit can ever be set in pA if it is >> not set in both pP and pI. Dropping a bit from pP or pI drops that >> bit from pA. This ensures that existing programs that try to drop >> capabilities still do so, with a complication. Because capability >> inheritance is so broken, setting KEEPCAPS, using setresuid to >> switch to nonroot uids, and calling execve effectively drops >> capabilities. Therefore, setresuid from root to nonroot >> unconditionally clears pA. Processes that don't like this can >> re-add bits to pA afterwards. >> >> The capability evolution rules are changed: >> >> pA' = (file caps or setuid or setgid ? 0 : pA) >> pP' = (X & fP) | (pI & fI) | pA' >> pI' = pI >> pE' = (fE ? pP' : pA') >> X is unchanged >> >> If you are nonroot but you have a capability, you can add it to pA. >> If you do so, your children get that capability in pA, pP, and pE. >> For example, you can set pA = CAP_NET_BIND_SERVICE, and your >> children can automatically bind low-numbered ports. Hallelujah! >> >> Unprivileged users can create user namespaces, map themselves to a >> nonzero uid, and create both privileged (relative to their >> namespace) and unprivileged process trees. This is currently more >> or less impossible. Hallelujah! >> >> You cannot use pA to try to subvert a setuid, setgid, or file-capped >> program: if you execute any such program, pA gets cleared and the >> resulting evolution rules are unchanged by this patch. >> >> Users with nonzero pA are unlikely to unintentionally leak that >> capability. If they run programs that try to drop privileges, >> dropping privileges will still work. >> >> It's worth noting that the degree of paranoia in this patch could >> possibly be relaxed without causing serious problems. Specifically, >> if we allowed pA to persist across executing non-pA-aware setuid >> binaries and across setresuid, then, naively, the only capabilities >> that could leak as a result would be the capabilities in pA, and any >> attacker *already* has those capabilities. This would make me >> nervous, though -- setuid binaries that tried to privilege-separate >> might fail to do so, and putting CAP_DAC_READ_SEARCH or >> CAP_DAC_OVERRIDE into pA could have unexpected side effects. >> (Whether these unexpected side effects would be exploitable is an >> open question.) I've therefore taken the more paranoid route. >> >> An alternative would be to either require PR_SET_NO_NEW_PRIVS before >> setting ambient capabilities. I think that this would be annoying >> and would make granting otherwise unprivileged users minor ambient >> capabilities (CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much >> less useful than it is with this patch. >> >> ===== Footnotes ===== >> >> [1] Files that are missing the "security.capability" xattr or that >> have unrecognized values for that xattr end up with has_cap == >> false. The code that does that appears to be complicated for no >> good reason. >> >> [2] The libcap capability mask parsers and formatters are >> dangerously misleading and the documentation is flat-out wrong. fE >> is *not* a mask; it's a single bit. This has probably confused >> every single person who has tried to use file capabilities. >> >> [3] Linux very confusingly processes the script and the interpreter if >> applicable, for reasons that escape me. The results from thinking >> about a script's file capabilities and/or setuid bits are mostly discarded. >> >> Cc: Kees Cook <keescook@chromium.org> >> Cc: Christoph Lameter <cl@linux.com> >> Cc: Serge Hallyn <serge.hallyn@canonical.com> >> Cc: Andy Lutomirski <luto@amacapital.net> >> Cc: Jonathan Corbet <corbet@lwn.net> >> Cc: Aaron Jones <aaronmdjones@gmail.com> >> CC: Ted Ts'o <tytso@mit.edu> >> Cc: linux-security-module@vger.kernel.org >> Cc: linux-kernel@vger.kernel.org >> Cc: linux-api@vger.kernel.org >> Cc: akpm@linuxfoundation.org >> Cc: Andrew G. Morgan <morgan@kernel.org> >> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> >> Cc: Austin S Hemmelgarn <ahferroin7@gmail.com> >> Cc: Markku Savela <msa@moth.iki.fi> >> Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> >> Cc: Michael Kerrisk <mtk.manpages@gmail.com> >> Signed-off-by: Andy Lutomirski <luto@kernel.org> > > This would be quite welcome for things we're doing in Chrome OS. > Presently, we're able to use fscaps to keep non-root caps across exec > and haven't encountered issues with AT_SECURE (yet), but using pA > would be much nicer and exactly matches how we want to use it: a > launcher is creating a tree of processes that are non-root but need > some capabilities. Right now the tree is very small and we're able to > sprinkle our fscaps lightly. :) This would be better. > > -Kees > >> --- >> >> Preliminary userspace code is here: >> >> https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=860c73ac1acaaae976bdd3bb83b89b0180f0702a >> >> fs/proc/array.c | 5 ++- >> include/linux/cred.h | 15 +++++++++ >> include/uapi/linux/prctl.h | 6 ++++ >> kernel/user_namespace.c | 1 + >> security/commoncap.c | 75 ++++++++++++++++++++++++++++++++++++++------ >> security/keys/process_keys.c | 1 + >> 6 files changed, 92 insertions(+), 11 deletions(-) >> >> diff --git a/fs/proc/array.c b/fs/proc/array.c >> index 1295a00ca316..bc15356d6551 100644 >> --- a/fs/proc/array.c >> +++ b/fs/proc/array.c >> @@ -282,7 +282,8 @@ static void render_cap_t(struct seq_file *m, const char *header, >> static inline void task_cap(struct seq_file *m, struct task_struct *p) >> { >> const struct cred *cred; >> - kernel_cap_t cap_inheritable, cap_permitted, cap_effective, cap_bset; >> + kernel_cap_t cap_inheritable, cap_permitted, cap_effective, >> + cap_bset, cap_ambient; >> >> rcu_read_lock(); >> cred = __task_cred(p); >> @@ -290,12 +291,14 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p) >> cap_permitted = cred->cap_permitted; >> cap_effective = cred->cap_effective; >> cap_bset = cred->cap_bset; >> + cap_ambient = cred->cap_ambient; >> rcu_read_unlock(); >> >> render_cap_t(m, "CapInh:\t", &cap_inheritable); >> render_cap_t(m, "CapPrm:\t", &cap_permitted); >> render_cap_t(m, "CapEff:\t", &cap_effective); >> render_cap_t(m, "CapBnd:\t", &cap_bset); >> + render_cap_t(m, "CapAmb:\t", &cap_ambient); >> } >> >> static inline void task_seccomp(struct seq_file *m, struct task_struct *p) >> diff --git a/include/linux/cred.h b/include/linux/cred.h >> index 2fb2ca2127ed..a21bcba6ef84 100644 >> --- a/include/linux/cred.h >> +++ b/include/linux/cred.h >> @@ -122,6 +122,7 @@ struct cred { >> kernel_cap_t cap_permitted; /* caps we're permitted */ >> kernel_cap_t cap_effective; /* caps we can actually use */ >> kernel_cap_t cap_bset; /* capability bounding set */ >> + kernel_cap_t cap_ambient; /* Ambient capability set */ >> #ifdef CONFIG_KEYS >> unsigned char jit_keyring; /* default keyring to attach requested >> * keys to */ >> @@ -197,6 +198,20 @@ static inline void validate_process_creds(void) >> } >> #endif >> >> +static inline void cap_enforce_ambient_invariants(struct cred *cred) >> +{ >> + cred->cap_ambient = cap_intersect(cred->cap_ambient, >> + cap_intersect(cred->cap_permitted, >> + cred->cap_inheritable)); >> +} >> + >> +static inline bool cap_ambient_invariant_ok(const struct cred *cred) >> +{ >> + return cap_issubset(cred->cap_ambient, >> + cap_intersect(cred->cap_permitted, >> + cred->cap_inheritable)); >> +} >> + >> /** >> * get_new_cred - Get a reference on a new set of credentials >> * @cred: The new credentials to reference >> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h >> index 31891d9535e2..65407f867e82 100644 >> --- a/include/uapi/linux/prctl.h >> +++ b/include/uapi/linux/prctl.h >> @@ -190,4 +190,10 @@ struct prctl_mm_map { >> # define PR_FP_MODE_FR (1 << 0) /* 64b FP registers */ >> # define PR_FP_MODE_FRE (1 << 1) /* 32b compatibility */ >> >> +/* Control the ambient capability set */ >> +#define PR_CAP_AMBIENT 47 >> +# define PR_CAP_AMBIENT_GET 1 >> +# define PR_CAP_AMBIENT_RAISE 2 >> +# define PR_CAP_AMBIENT_LOWER 3 >> + >> #endif /* _LINUX_PRCTL_H */ >> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c >> index 4109f8320684..dab0f808235a 100644 >> --- a/kernel/user_namespace.c >> +++ b/kernel/user_namespace.c >> @@ -39,6 +39,7 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns) >> cred->cap_inheritable = CAP_EMPTY_SET; >> cred->cap_permitted = CAP_FULL_SET; >> cred->cap_effective = CAP_FULL_SET; >> + cred->cap_ambient = CAP_EMPTY_SET; >> cred->cap_bset = CAP_FULL_SET; >> #ifdef CONFIG_KEYS >> key_put(cred->request_key_auth); >> diff --git a/security/commoncap.c b/security/commoncap.c >> index f66713bd7450..b3253886ecad 100644 >> --- a/security/commoncap.c >> +++ b/security/commoncap.c >> @@ -272,6 +272,7 @@ int cap_capset(struct cred *new, >> new->cap_effective = *effective; >> new->cap_inheritable = *inheritable; >> new->cap_permitted = *permitted; >> + cap_enforce_ambient_invariants(new); >> return 0; >> } >> >> @@ -352,6 +353,7 @@ static inline int bprm_caps_from_vfs_caps(struct cpu_vfs_cap_data *caps, >> >> /* >> * pP' = (X & fP) | (pI & fI) >> + * The addition of pA' is handled later. >> */ >> new->cap_permitted.cap[i] = >> (new->cap_bset.cap[i] & permitted) | >> @@ -479,10 +481,12 @@ int cap_bprm_set_creds(struct linux_binprm *bprm) >> { >> const struct cred *old = current_cred(); >> struct cred *new = bprm->cred; >> - bool effective, has_cap = false; >> + bool effective, has_cap = false, is_setid; >> int ret; >> kuid_t root_uid; >> >> + BUG_ON(!cap_ambient_invariant_ok(old)); >> + >> effective = false; >> ret = get_file_caps(bprm, &effective, &has_cap); >> if (ret < 0) >> @@ -527,8 +531,9 @@ skip: >> * >> * In addition, if NO_NEW_PRIVS, then ensure we get no new privs. >> */ >> - if ((!uid_eq(new->euid, old->uid) || >> - !gid_eq(new->egid, old->gid) || >> + is_setid = !uid_eq(new->euid, old->uid) || !gid_eq(new->egid, old->gid); >> + >> + if ((is_setid || >> !cap_issubset(new->cap_permitted, old->cap_permitted)) && >> bprm->unsafe & ~LSM_UNSAFE_PTRACE_CAP) { >> /* downgrade; they get no more than they had, and maybe less */ >> @@ -544,10 +549,23 @@ skip: >> new->suid = new->fsuid = new->euid; >> new->sgid = new->fsgid = new->egid; >> >> + /* File caps or setid cancel ambient. */ >> + if (has_cap || is_setid) >> + cap_clear(new->cap_ambient); >> + >> + /* >> + * Now that we've computed pA', update pP' to give: >> + * pP' = (X & fP) | (pI & fI) | pA' >> + */ >> + new->cap_permitted = cap_combine(new->cap_permitted, new->cap_ambient); >> + >> if (effective) >> new->cap_effective = new->cap_permitted; >> else >> - cap_clear(new->cap_effective); >> + new->cap_effective = new->cap_ambient; >> + >> + BUG_ON(!cap_ambient_invariant_ok(new)); >> + >> bprm->cap_effective = effective; >> >> /* >> @@ -562,7 +580,7 @@ skip: >> * Number 1 above might fail if you don't have a full bset, but I think >> * that is interesting information to audit. >> */ >> - if (!cap_isclear(new->cap_effective)) { >> + if (!cap_issubset(new->cap_effective, new->cap_ambient)) { >> if (!cap_issubset(CAP_FULL_SET, new->cap_effective) || >> !uid_eq(new->euid, root_uid) || !uid_eq(new->uid, root_uid) || >> issecure(SECURE_NOROOT)) { >> @@ -573,6 +591,9 @@ skip: >> } >> >> new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS); >> + >> + BUG_ON(!cap_ambient_invariant_ok(new)); >> + >> return 0; >> } >> >> @@ -594,7 +615,7 @@ int cap_bprm_secureexec(struct linux_binprm *bprm) >> if (!uid_eq(cred->uid, root_uid)) { >> if (bprm->cap_effective) >> return 1; >> - if (!cap_isclear(cred->cap_permitted)) >> + if (!cap_issubset(cred->cap_permitted, cred->cap_ambient)) >> return 1; >> } >> >> @@ -696,10 +717,18 @@ static inline void cap_emulate_setxuid(struct cred *new, const struct cred *old) >> uid_eq(old->suid, root_uid)) && >> (!uid_eq(new->uid, root_uid) && >> !uid_eq(new->euid, root_uid) && >> - !uid_eq(new->suid, root_uid)) && >> - !issecure(SECURE_KEEP_CAPS)) { >> - cap_clear(new->cap_permitted); >> - cap_clear(new->cap_effective); >> + !uid_eq(new->suid, root_uid))) { >> + if (!issecure(SECURE_KEEP_CAPS)) { >> + cap_clear(new->cap_permitted); >> + cap_clear(new->cap_effective); >> + } >> + >> + /* >> + * Pre-ambient programs except setresuid to nonroot followed >> + * by exec to drop capabilities. We should make sure that >> + * this remains the case. >> + */ >> + cap_clear(new->cap_ambient); >> } >> if (uid_eq(old->euid, root_uid) && !uid_eq(new->euid, root_uid)) >> cap_clear(new->cap_effective); >> @@ -929,6 +958,32 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3, >> new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS); >> return commit_creds(new); >> >> + case PR_CAP_AMBIENT: >> + if (((!cap_valid(arg3)) | arg4 | arg5)) >> + return -EINVAL; >> + >> + if (arg2 == PR_CAP_AMBIENT_GET) { >> + return !!cap_raised(current_cred()->cap_ambient, arg3); >> + } else if (arg2 != PR_CAP_AMBIENT_RAISE && >> + arg2 != PR_CAP_AMBIENT_LOWER) { >> + return -EINVAL; >> + } else { >> + if (arg2 == PR_CAP_AMBIENT_RAISE && >> + (!cap_raised(current_cred()->cap_permitted, arg3) || >> + !cap_raised(current_cred()->cap_inheritable, >> + arg3))) >> + return -EPERM; >> + >> + new = prepare_creds(); >> + if (!new) >> + return -ENOMEM; >> + if (arg2 == PR_CAP_AMBIENT_RAISE) >> + cap_raise(new->cap_ambient, arg3); >> + else >> + cap_lower(new->cap_ambient, arg3); >> + return commit_creds(new); >> + } >> + >> default: >> /* No functionality available - continue with default */ >> return -ENOSYS; >> diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c >> index bd536cb221e2..43b4cddbf2b3 100644 >> --- a/security/keys/process_keys.c >> +++ b/security/keys/process_keys.c >> @@ -848,6 +848,7 @@ void key_change_session_keyring(struct callback_head *twork) >> new->cap_inheritable = old->cap_inheritable; >> new->cap_permitted = old->cap_permitted; >> new->cap_effective = old->cap_effective; >> + new->cap_ambient = old->cap_ambient; >> new->cap_bset = old->cap_bset; >> >> new->jit_keyring = old->jit_keyring; >> -- >> 2.3.0 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> Please read the FAQ at http://www.tux.org/lkml/ > > > > -- > Kees Cook > Chrome OS Security ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CALQRfL7b8CjYgUnVy3jykNwv48fOc03T385RKo--cfv25YenBg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALQRfL7b8CjYgUnVy3jykNwv48fOc03T385RKo--cfv25YenBg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-03-12 22:27 ` Andrew Lutomirski [not found] ` <CAObL_7HgU-ekb7mJS7C=0=idfrzV6=CSqTrG2LvUgdDtvbJx_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Andrew Lutomirski @ 2015-03-12 22:27 UTC (permalink / raw) To: Andrew G. Morgan Cc: Kees Cook, Christoph Lameter, Serge Hallyn, Andy Lutomirski, Jonathan Corbet, Aaron Jones, Ted Ts'o, linux-security-module, LKML, Linux API, Andrew Morton, Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen, Michael Kerrisk On Thu, Mar 12, 2015 at 3:10 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > I'm unclear why you refer to the inheritable set in this test: > > + } else { > + if (arg2 == PR_CAP_AMBIENT_RAISE && > + (!cap_raised(current_cred()->cap_permitted, arg3) || > + !cap_raised(current_cred()->cap_inheritable, > + arg3))) > + return -EPERM; It's to preserve the invariant that pA is always a subset of pI. > > I'm also unclear how you can turn off this new 'feature' for a process > tree? As it is, the code creates an exploit path for a capable (pP != > 0) program with an exploitable flaw to create a privilege escalation > for an arbitrary child program. Huh? If you exploit the parent, you already win. Yes, if a kiddie injects shellcode that does system("/bin/bash") into some pP != 0 program, they don't actually elevate their privileges. On the other hand, by the time an attacker injected shellcode for: prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN); system("/bin/bash"); into a target, they can already do whatever they want. > While I understand that everyone > 'knows what they are doing' in implementing this change, I'm convinced > that folk that are up to no good also do... Why not provide a lockable > secure bit to selectively disable this support? Show me a legitimate use case and I'll gladly implement a secure bit. In the mean time, I don't even believe that there's a legitimate use for any of the other secure bits (except keepcaps, and I don't know why that's a securebit in the first place). In the mean time, see CVE-2014-3215 for an example of why securebits are probably more trouble than they're worth. --Andy ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CAObL_7HgU-ekb7mJS7C=0=idfrzV6=CSqTrG2LvUgdDtvbJx_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CAObL_7HgU-ekb7mJS7C=0=idfrzV6=CSqTrG2LvUgdDtvbJx_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-03-13 13:24 ` Andrew G. Morgan 2015-03-13 16:06 ` Christoph Lameter 2015-03-13 17:57 ` Andy Lutomirski 0 siblings, 2 replies; 34+ messages in thread From: Andrew G. Morgan @ 2015-03-13 13:24 UTC (permalink / raw) To: Andrew Lutomirski Cc: Kees Cook, Christoph Lameter, Serge Hallyn, Andy Lutomirski, Jonathan Corbet, Aaron Jones, Ted Ts'o, linux-security-module, LKML, Linux API, Andrew Morton, Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen, Michael Kerrisk > It's to preserve the invariant that pA is always a subset of pI. But since a user can always raise a bit in pI if it is present in pP, what does this invariant add to your model other than inconvenience? >> I'm also unclear how you can turn off this new 'feature' for a process >> tree? As it is, the code creates an exploit path for a capable (pP != >> 0) program with an exploitable flaw to create a privilege escalation >> for an arbitrary child program. > > Huh? If you exploit the parent, you already win. Yes, if a kiddie > injects shellcode that does system("/bin/bash") into some pP != 0 > program, they don't actually elevate their privileges. On the other > hand, by the time an attacker injected shellcode for: > > prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN); > system("/bin/bash"); Let's call the above two lines [a] and [b]. With this patch, you are encouraging folk to write programs that contain a line like [a] already. So, yes, I am saying that you are creating an exploitable path in these programs that says if someone can inject system("/bin/bash") into these programs they can get a new (because of this patch) privilege escalation. In the prevailing model, this kind of privilege escalation (resulting from naive inheritance) is designed out. I recognize that you want to add it back in, but I am concerned that you are not allowing for the possibility that some folk might still want to still be able to prevent it. > into a target, they can already do whatever they want. > >> While I understand that everyone >> 'knows what they are doing' in implementing this change, I'm convinced >> that folk that are up to no good also do... Why not provide a lockable >> secure bit to selectively disable this support? > > Show me a legitimate use case and I'll gladly implement a secure bit. Thanks. I was kind of hoping that you would add a lockable secure bit that defaults this support to off, but can be used to turn it on with or without a lock. That way, you can avoid disturbing the legacy defaults (no surprises). > In the mean time, I don't even believe that there's a legitimate use > for any of the other secure bits (except keepcaps, and I don't know > why that's a securebit in the first place). Those bits currently make it possible to run a subsystem with no [set]uid-0 support in its process tree. > In the mean time, see CVE-2014-3215 for an example of why securebits > are probably more trouble than they're worth. I think it is safe to say that naive privilege inheritance has a fair track record of being exploited orders of magnitude more frequently than this. After all, these are the reasons LD_PRELOAD and shell script setuid bits are suppressed. Cheers Andrew ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-03-13 13:24 ` Andrew G. Morgan @ 2015-03-13 16:06 ` Christoph Lameter [not found] ` <alpine.DEB.2.11.1503131049150.13899-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> 2015-03-13 17:57 ` Andy Lutomirski 1 sibling, 1 reply; 34+ messages in thread From: Christoph Lameter @ 2015-03-13 16:06 UTC (permalink / raw) To: Andrew G. Morgan Cc: Andrew Lutomirski, Kees Cook, Serge Hallyn, Andy Lutomirski, Jonathan Corbet, Aaron Jones, Ted Ts'o, linux-security-module, LKML, Linux API, Andrew Morton, Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen, Michael Kerrisk On Fri, 13 Mar 2015, Andrew G. Morgan wrote: > > prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN); > > system("/bin/bash"); > > Let's call the above two lines [a] and [b]. With this patch, you are > encouraging folk to write programs that contain a line like [a] > already. So, yes, I am saying that you are creating an exploitable > path in these programs that says if someone can inject > system("/bin/bash") into these programs they can get a new (because of > this patch) privilege escalation. Well this is what one naively expects capabilities to give you. An ability to avoid full superuser binaries by segmenting off capabilities. Often there is really no other choice. If you do not provide this mode then the system will be even less secure since people run stuff as root. This looks to many like the design of capabilties is inherent flawed since it does not give you what you need. You experience a go around that leads nowhere and just wastes your time. > In the prevailing model, this kind of privilege escalation (resulting > from naive inheritance) is designed out. I recognize that you want to > add it back in, but I am concerned that you are not allowing for the > possibility that some folk might still want to still be able to > prevent it. The functionalty here depends on CAP_SETPCAP. That was intended as some point to be off by default? You can have distros/kernels with that being off. ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <alpine.DEB.2.11.1503131049150.13899-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <alpine.DEB.2.11.1503131049150.13899-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> @ 2015-03-13 17:58 ` Andy Lutomirski [not found] ` <CALCETrXcUfbqfm7av9crrxQ5nCBYpdJO8gRo4ZhRA97g27B2iw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Andy Lutomirski @ 2015-03-13 17:58 UTC (permalink / raw) To: Christoph Lameter Cc: Andrew G. Morgan, Andrew Lutomirski, Kees Cook, Serge Hallyn, Jonathan Corbet, Aaron Jones, Ted Ts'o, linux-security-module, LKML, Linux API, Andrew Morton, Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen, Michael Kerrisk On Fri, Mar 13, 2015 at 9:06 AM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote: > On Fri, 13 Mar 2015, Andrew G. Morgan wrote: > >> > prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN); >> > system("/bin/bash"); >> >> Let's call the above two lines [a] and [b]. With this patch, you are >> encouraging folk to write programs that contain a line like [a] >> already. So, yes, I am saying that you are creating an exploitable >> path in these programs that says if someone can inject >> system("/bin/bash") into these programs they can get a new (because of >> this patch) privilege escalation. > > Well this is what one naively expects capabilities to give you. An ability > to avoid full superuser binaries by segmenting off capabilities. Often > there is really no other choice. If you do not provide this mode then > the system will be even less secure since people run stuff as root. > > This looks to many like the design of capabilties is inherent flawed since > it does not give you what you need. You experience a go around that leads > nowhere and just wastes your time. > >> In the prevailing model, this kind of privilege escalation (resulting >> from naive inheritance) is designed out. I recognize that you want to >> add it back in, but I am concerned that you are not allowing for the >> possibility that some folk might still want to still be able to >> prevent it. > > The functionalty here depends on CAP_SETPCAP. That was intended as some > point to be off by default? You can have distros/kernels with that being > off. Not in my version. I don't want to further encourage people to hand out CAP_SETPCAP. --Andy -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CALCETrXcUfbqfm7av9crrxQ5nCBYpdJO8gRo4ZhRA97g27B2iw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALCETrXcUfbqfm7av9crrxQ5nCBYpdJO8gRo4ZhRA97g27B2iw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-03-13 18:52 ` Christoph Lameter 0 siblings, 0 replies; 34+ messages in thread From: Christoph Lameter @ 2015-03-13 18:52 UTC (permalink / raw) To: Andy Lutomirski Cc: Andrew G. Morgan, Andrew Lutomirski, Kees Cook, Serge Hallyn, Jonathan Corbet, Aaron Jones, Ted Ts'o, linux-security-module, LKML, Linux API, Andrew Morton, Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen, Michael Kerrisk On Fri, 13 Mar 2015, Andy Lutomirski wrote: > > The functionalty here depends on CAP_SETPCAP. That was intended as some > > point to be off by default? You can have distros/kernels with that being > > off. > > Not in my version. I don't want to further encourage people to hand > out CAP_SETPCAP. Owww... But if you leave it in then maybe Andrew will be satisfied with this approach? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-03-13 13:24 ` Andrew G. Morgan 2015-03-13 16:06 ` Christoph Lameter @ 2015-03-13 17:57 ` Andy Lutomirski [not found] ` <CALCETrVz39pJmMAEFodBNA8cOZjd0xkH95EzMazY0MWJDtpYgg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 1 sibling, 1 reply; 34+ messages in thread From: Andy Lutomirski @ 2015-03-13 17:57 UTC (permalink / raw) To: Andrew G. Morgan Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan@kernel.org> wrote: > > > It's to preserve the invariant that pA is always a subset of pI. > > But since a user can always raise a bit in pI if it is present in pP, > what does this invariant add to your model other than inconvenience? The useful part is that dropping a bit from pI also drops it from pA. To keep the model consistent, I also require that you add the bit to pI before adding it to pA. > > >> I'm also unclear how you can turn off this new 'feature' for a process > >> tree? As it is, the code creates an exploit path for a capable (pP != > >> 0) program with an exploitable flaw to create a privilege escalation > >> for an arbitrary child program. > > > > Huh? If you exploit the parent, you already win. Yes, if a kiddie > > injects shellcode that does system("/bin/bash") into some pP != 0 > > program, they don't actually elevate their privileges. On the other > > hand, by the time an attacker injected shellcode for: > > > > prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN); > > system("/bin/bash"); > > Let's call the above two lines [a] and [b]. With this patch, you are > encouraging folk to write programs that contain a line like [a] > already. So, yes, I am saying that you are creating an exploitable > path in these programs that says if someone can inject > system("/bin/bash") into these programs they can get a new (because of > this patch) privilege escalation. > > In the prevailing model, this kind of privilege escalation (resulting > from naive inheritance) is designed out. I recognize that you want to > add it back in, but I am concerned that you are not allowing for the > possibility that some folk might still want to still be able to > prevent it. If you have a program that deliberately uses PR_CAP_AMBIENT, then setting such a securebit will break the program, so it still doesn't buy you anything. > > > into a target, they can already do whatever they want. > > > >> While I understand that everyone > >> 'knows what they are doing' in implementing this change, I'm convinced > >> that folk that are up to no good also do... Why not provide a lockable > >> secure bit to selectively disable this support? > > > > Show me a legitimate use case and I'll gladly implement a secure bit. > > Thanks. I was kind of hoping that you would add a lockable secure bit > that defaults this support to off, but can be used to turn it on with > or without a lock. That way, you can avoid disturbing the legacy > defaults (no surprises). I think this thing needs to default on to be useful. > > > In the mean time, I don't even believe that there's a legitimate use > > for any of the other secure bits (except keepcaps, and I don't know > > why that's a securebit in the first place). > > Those bits currently make it possible to run a subsystem with no > [set]uid-0 support in its process tree. Not usefully, because even with all the securebits set to their non-legacy modes, caps don't inherit, so it doesn't work. I've tried. > > > In the mean time, see CVE-2014-3215 for an example of why securebits > > are probably more trouble than they're worth. > > I think it is safe to say that naive privilege inheritance has a fair > track record of being exploited orders of magnitude more frequently > than this. After all, these are the reasons LD_PRELOAD and shell > script setuid bits are suppressed. I don't know what you mean here by naive privilege inheritance. The examples you're taking about aren't inheritance at all; they're exploring privilege *grants* during execve. My patch deliberately leaves grants like that alone. --Andy > > Cheers > > Andrew ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CALCETrVz39pJmMAEFodBNA8cOZjd0xkH95EzMazY0MWJDtpYgg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALCETrVz39pJmMAEFodBNA8cOZjd0xkH95EzMazY0MWJDtpYgg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-03-13 18:52 ` Kees Cook 2015-03-13 19:03 ` Andy Lutomirski 2015-03-14 21:09 ` Andrew G. Morgan 1 sibling, 1 reply; 34+ messages in thread From: Kees Cook @ 2015-03-13 18:52 UTC (permalink / raw) To: Andy Lutomirski Cc: Andrew G. Morgan, Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Jonathan Corbet On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: >> I think it is safe to say that naive privilege inheritance has a fair >> track record of being exploited orders of magnitude more frequently >> than this. After all, these are the reasons LD_PRELOAD and shell >> script setuid bits are suppressed. > > I don't know what you mean here by naive privilege inheritance. The > examples you're taking about aren't inheritance at all; they're > exploring privilege *grants* during execve. My patch deliberately > leaves grants like that alone. Just to clarify (at least for myself), I think the issue here is the perception of privileges leaking into newly execed processes that may have flaws that allow arbitrary execution control. Case A: daemon is running (with whatever set of privileges), has a flaw and an attacker now has those same privileges. Case B: running as root, runs some child (which stays root) with a flaw and an attacker now has root privileges. Case C: running with pE=CAP_NET_ADMIN, runs some child (which loses the cap) with a flaw and an attacker now has only user privileges. Case D: running with pE=pI=CAP_NET_ADMIN, runs some child that has fI=CAP_NET_ADMIN, but other children don't and lose the cap as in case C. The privileged attack surface is reduced to only the child with fI set. Case E: running with CAP_NET_ADMIN in pA, runs some child (which gets pE=pA=CAP_NET_ADMIN), and has an attack surface larger (all children) than Case D, but with a scope smaller (only CAP_NET_ADMIN) than Case B. We don't need to talk about Case A, since we're not crossing an exec boundary and the attacker already has execution control. We all lose. Case B is the poster-child for "don't run daemons as root", since privileges aren't dropped, and we're basically back to Case A again. Case C is the classic "just give the daemon what caps it needs" case, and if the daemon itself isn't flawed (Case A), then the privileges don't leak out to any children (usually helpers of some kind) since the cap doesn't cross the exec boundary. This lack of leaking is really frustrating for daemons that need to give caps to helpers (e.g. containers). Case D is the standard solution to the frustrations in Case C, but requires filesystem capabilities (and/or SELinux) to pass capabilities, and comes with various limitations as described in Andy's first email. Case E would be possible with Andy's patch. It lacks the limitations and frustrations of C and D, but opens us up to a limited version of the risks in Case B. So, I don't think it's as "bad" as Case B since a process must opt into it (by setting pA), and it passes only the limited set of capabilities. I think this boils down to accepting the elevated risk while recognizing that it may be less than Case B but greater than Case C. All this said, almost half of the capabilities, if passed to flawed children with attacker controlled execution, can be elevated to full root privileges pretty easily[1], so I think any documentation around this feature should include some pretty dire warnings about using this. -Kees [1] https://forums.grsecurity.net/viewtopic.php?f=7&t=2522 -- Kees Cook Chrome OS Security ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-03-13 18:52 ` Kees Cook @ 2015-03-13 19:03 ` Andy Lutomirski [not found] ` <CALCETrV7vDJVhM_AgtGn8ENStcyxZBwCw3zhSn-whArT_XPg8A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Andy Lutomirski @ 2015-03-13 19:03 UTC (permalink / raw) To: Kees Cook Cc: Andrew G. Morgan, Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Jonathan Corbet On Fri, Mar 13, 2015 at 11:52 AM, Kees Cook <keescook@chromium.org> wrote: > > All this said, almost half of the capabilities, if passed to flawed > children with attacker controlled execution, can be elevated to full > root privileges pretty easily[1], so I think any documentation around > this feature should include some pretty dire warnings about using > this. That's a good point. I'll make sure to document that. It's worth noting that, for many applications, that list is overstated. For example, many of the suggested privilege escalations don't work if you're in a sufficiently restrictive mount namespace. For my own use, I plan on adding only CAP_NET_BIND_SERVICE and CAP_NET_RAW to pA, and I'll be layering seccomp on top to the extent possible. --Andy > > -Kees > > [1] https://forums.grsecurity.net/viewtopic.php?f=7&t=2522 > > -- > Kees Cook > Chrome OS Security -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CALCETrV7vDJVhM_AgtGn8ENStcyxZBwCw3zhSn-whArT_XPg8A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALCETrV7vDJVhM_AgtGn8ENStcyxZBwCw3zhSn-whArT_XPg8A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-03-13 19:54 ` Kees Cook 0 siblings, 0 replies; 34+ messages in thread From: Kees Cook @ 2015-03-13 19:54 UTC (permalink / raw) To: Andy Lutomirski Cc: Andrew G. Morgan, Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Jonathan Corbet On Fri, Mar 13, 2015 at 12:03 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > On Fri, Mar 13, 2015 at 11:52 AM, Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >> >> All this said, almost half of the capabilities, if passed to flawed >> children with attacker controlled execution, can be elevated to full >> root privileges pretty easily[1], so I think any documentation around >> this feature should include some pretty dire warnings about using >> this. > > That's a good point. I'll make sure to document that. > > It's worth noting that, for many applications, that list is > overstated. For example, many of the suggested privilege escalations > don't work if you're in a sufficiently restrictive mount namespace. > > For my own use, I plan on adding only CAP_NET_BIND_SERVICE and > CAP_NET_RAW to pA, and I'll be layering seccomp on top to the extent > possible. Right, keeping software authors aware of the fact that their efforts for attack surface reducing may need additional confinement beyond just the capability reduction. -Kees > > --Andy > >> >> -Kees >> >> [1] https://forums.grsecurity.net/viewtopic.php?f=7&t=2522 >> >> -- >> Kees Cook >> Chrome OS Security > > > > -- > Andy Lutomirski > AMA Capital Management, LLC -- Kees Cook Chrome OS Security ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALCETrVz39pJmMAEFodBNA8cOZjd0xkH95EzMazY0MWJDtpYgg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-13 18:52 ` Kees Cook @ 2015-03-14 21:09 ` Andrew G. Morgan 2015-03-14 21:45 ` Andy Lutomirski 1 sibling, 1 reply; 34+ messages in thread From: Andrew G. Morgan @ 2015-03-14 21:09 UTC (permalink / raw) To: Andy Lutomirski Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: >> >> > It's to preserve the invariant that pA is always a subset of pI. >> >> But since a user can always raise a bit in pI if it is present in pP, >> what does this invariant add to your model other than inconvenience? > > The useful part is that dropping a bit from pI also drops it from pA. > To keep the model consistent, I also require that you add the bit to > pI before adding it to pA. So you are saying that pA is always a strict subset of pI (and pP)? Then why not explicitly implement it as: pA' = (file caps or setuid or setgid ? 0 : pA) pP' = (fP & X) | (pI & [fI | (pA' & pP)] ) As it is you have so distributed these constraints that it is hard to be sure it will remain that way. >> >> I'm also unclear how you can turn off this new 'feature' for a process >> >> tree? As it is, the code creates an exploit path for a capable (pP != >> >> 0) program with an exploitable flaw to create a privilege escalation >> >> for an arbitrary child program. >> > >> > Huh? If you exploit the parent, you already win. Yes, if a kiddie >> > injects shellcode that does system("/bin/bash") into some pP != 0 >> > program, they don't actually elevate their privileges. On the other >> > hand, by the time an attacker injected shellcode for: >> > >> > prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN); >> > system("/bin/bash"); >> >> Let's call the above two lines [a] and [b]. With this patch, you are >> encouraging folk to write programs that contain a line like [a] >> already. So, yes, I am saying that you are creating an exploitable >> path in these programs that says if someone can inject >> system("/bin/bash") into these programs they can get a new (because of >> this patch) privilege escalation. >> >> In the prevailing model, this kind of privilege escalation (resulting >> from naive inheritance) is designed out. I recognize that you want to >> add it back in, but I am concerned that you are not allowing for the >> possibility that some folk might still want to still be able to >> prevent it. > > If you have a program that deliberately uses PR_CAP_AMBIENT, then > setting such a securebit will break the program, so it still doesn't > buy you anything. Not if you make the bit lockable (like the other bits). If you want to run with your model in effect, you lock the enable bit on. >> > into a target, they can already do whatever they want. >> > >> >> While I understand that everyone >> >> 'knows what they are doing' in implementing this change, I'm convinced >> >> that folk that are up to no good also do... Why not provide a lockable >> >> secure bit to selectively disable this support? >> > >> > Show me a legitimate use case and I'll gladly implement a secure bit. >> >> Thanks. I was kind of hoping that you would add a lockable secure bit >> that defaults this support to off, but can be used to turn it on with >> or without a lock. That way, you can avoid disturbing the legacy >> defaults (no surprises). > > I think this thing needs to default on to be useful. > >> >> > In the mean time, I don't even believe that there's a legitimate use >> > for any of the other secure bits (except keepcaps, and I don't know >> > why that's a securebit in the first place). >> >> Those bits currently make it possible to run a subsystem with no >> [set]uid-0 support in its process tree. > > Not usefully, because even with all the securebits set to their > non-legacy modes, caps don't inherit, so it doesn't work. I've tried. Not sure I follow. They work for a definition if inheritable that you seem to refuse to accept. >> > In the mean time, see CVE-2014-3215 for an example of why securebits >> > are probably more trouble than they're worth. >> >> I think it is safe to say that naive privilege inheritance has a fair >> track record of being exploited orders of magnitude more frequently >> than this. After all, these are the reasons LD_PRELOAD and shell >> script setuid bits are suppressed. > > I don't know what you mean here by naive privilege inheritance. The > examples you're taking about aren't inheritance at all; they're > exploring privilege *grants* during execve. My patch deliberately > leaves grants like that alone. The pI set is inherited through this exec unmolested. You are adding a pA set that is passed through exec somewhat unmolested but allows all descendants to have privilege. - Naive inheritance is 'if you had a privilege pre-exec any descendant gets it post-exec'. This is what your pA achieves. - Inheritance, as it is currently implemented, is implemented as a handshake between a privileged ancestor (who added a pI bit) and a descendant that is ready for it (has a fI bit to match). - the un-naive aspect (if you prefer 'selective' inheritance) of this is that only binaries with file capabilities can ever wield privilege, and while it can be suppressed by an intermediate descendant the grant of inherited privilege can skip generations. I think we can agree that the only place that inheritance can occur is through an exec. So 'grants' through exec is sort of a semantic bit of smoke masquerading as an argument. Even in the naive inheritance case you have to explicitly grant some privilege - you are just choosing different rules. My Nack remains that you are eliminating the explicit enforcement of selective inheritance. A lockable secure bit protecting access to your prctl() function would address this concern. Thanks Andrew ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-03-14 21:09 ` Andrew G. Morgan @ 2015-03-14 21:45 ` Andy Lutomirski [not found] ` <CALCETrUGASa3Gp6cC1zdAahcS59DOdyLnTtgNX7t7FucMZLmoQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Andy Lutomirski @ 2015-03-14 21:45 UTC (permalink / raw) To: Andrew G. Morgan Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan@kernel.org> wrote: > On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto@amacapital.net> wrote: >> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan@kernel.org> wrote: >>> >>> > It's to preserve the invariant that pA is always a subset of pI. >>> >>> But since a user can always raise a bit in pI if it is present in pP, >>> what does this invariant add to your model other than inconvenience? >> >> The useful part is that dropping a bit from pI also drops it from pA. >> To keep the model consistent, I also require that you add the bit to >> pI before adding it to pA. > > So you are saying that pA is always a strict subset of pI (and pP)? > Then why not explicitly implement it as: > > pA' = (file caps or setuid or setgid ? 0 : pA) > pP' = (fP & X) | (pI & [fI | (pA' & pP)] ) > > As it is you have so distributed these constraints that it is hard to > be sure it will remain that way. That would be insecure. If an attacker had pA = CAP_SYS_ADMIN, pI = 0, pP = 0 (i.e. no privs but pA is set somehow) then, unless that's there's some other protection implemented, they could run some setuid program, and that program could switch back to non-root, set pI = 0, and call execve. Unexpectedly, CAP_SYS_ADMIN would be inherited. So I made the invariant explicit and added an assertion. >> >> If you have a program that deliberately uses PR_CAP_AMBIENT, then >> setting such a securebit will break the program, so it still doesn't >> buy you anything. > > Not if you make the bit lockable (like the other bits). If you want to > run with your model in effect, you lock the enable bit on. I don't see the point. Again, this should be the default. >> >>> >>> > In the mean time, I don't even believe that there's a legitimate use >>> > for any of the other secure bits (except keepcaps, and I don't know >>> > why that's a securebit in the first place). >>> >>> Those bits currently make it possible to run a subsystem with no >>> [set]uid-0 support in its process tree. >> >> Not usefully, because even with all the securebits set to their >> non-legacy modes, caps don't inherit, so it doesn't work. I've tried. > > Not sure I follow. They work for a definition if inheritable that you > seem to refuse to accept. I, and everyone I know who's tried to use inheritable capabilities, has run into the near-complete uselessness of the current model. I understand that a defunct POSIX draft specified it, but it's still nearly useless. You've objected to changing it, but you've never directly addressed any of the reasons why Christoph, Google, and I all believe that we can't usefully use it. >>> I think it is safe to say that naive privilege inheritance has a fair >>> track record of being exploited orders of magnitude more frequently >>> than this. After all, these are the reasons LD_PRELOAD and shell >>> script setuid bits are suppressed. >> >> I don't know what you mean here by naive privilege inheritance. The >> examples you're taking about aren't inheritance at all; they're >> exploring privilege *grants* during execve. My patch deliberately >> leaves grants like that alone. > > The pI set is inherited through this exec unmolested. This is flat-out useless. Having pI = CAP_NET_BIND_SERVICE doesn't let me bind low-numbered ports, full stop. > My Nack remains that you are eliminating the explicit enforcement of > selective inheritance. A lockable secure bit protecting access to your > prctl() function would address this concern. Would a sysctl or securebit that *optionally* allows pA to be disabled satisfy you? I don't understand why lockable is at all useful. You'd need CAP_SETPCAP to flip it regardless. --Andy ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CALCETrUGASa3Gp6cC1zdAahcS59DOdyLnTtgNX7t7FucMZLmoQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALCETrUGASa3Gp6cC1zdAahcS59DOdyLnTtgNX7t7FucMZLmoQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-03-14 22:17 ` Andrew G. Morgan [not found] ` <CALQRfL6z2dEtPkWiuGydT0a+y4iWm2qw-cY0Rk9hA-K4gPw_qA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Andrew G. Morgan @ 2015-03-14 22:17 UTC (permalink / raw) To: Andy Lutomirski Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: >> On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: >>> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: >>>> >>>> > It's to preserve the invariant that pA is always a subset of pI. >>>> >>>> But since a user can always raise a bit in pI if it is present in pP, >>>> what does this invariant add to your model other than inconvenience? >>> >>> The useful part is that dropping a bit from pI also drops it from pA. >>> To keep the model consistent, I also require that you add the bit to >>> pI before adding it to pA. >> >> So you are saying that pA is always a strict subset of pI (and pP)? >> Then why not explicitly implement it as: >> >> pA' = (file caps or setuid or setgid ? 0 : pA) >> pP' = (fP & X) | (pI & [fI | (pA' & pP)] ) >> >> As it is you have so distributed these constraints that it is hard to >> be sure it will remain that way. > > That would be insecure. If an attacker had pA = CAP_SYS_ADMIN, pI = > 0, pP = 0 (i.e. no privs but pA is set somehow) then, unless that's > there's some other protection implemented, they could run some setuid > program, and that program could switch back to non-root, set pI = 0, > and call execve. Unexpectedly, CAP_SYS_ADMIN would be inherited. Forgive me, but which bit of pI & [fI | (pA' & pP)] with pI = 0 makes that so? > So I made the invariant explicit and added an assertion. > >>> >>> If you have a program that deliberately uses PR_CAP_AMBIENT, then >>> setting such a securebit will break the program, so it still doesn't >>> buy you anything. >> >> Not if you make the bit lockable (like the other bits). If you want to >> run with your model in effect, you lock the enable bit on. > > I don't see the point. Again, this should be the default. > > >>> >>>> >>>> > In the mean time, I don't even believe that there's a legitimate use >>>> > for any of the other secure bits (except keepcaps, and I don't know >>>> > why that's a securebit in the first place). >>>> >>>> Those bits currently make it possible to run a subsystem with no >>>> [set]uid-0 support in its process tree. >>> >>> Not usefully, because even with all the securebits set to their >>> non-legacy modes, caps don't inherit, so it doesn't work. I've tried. >> >> Not sure I follow. They work for a definition if inheritable that you >> seem to refuse to accept. > > I, and everyone I know who's tried to use inheritable capabilities, > has run into the near-complete uselessness of the current model. I > understand that a defunct POSIX draft specified it, but it's still > nearly useless. > > You've objected to changing it, but you've never directly addressed I've repeatedly said I am not a fan of naive inheritance. I've not objected to changing it, I've objected to mandating it be changed. I have repeatedly suggested ways to conditionalize this feature addition. > any of the reasons why Christoph, Google, and I all believe that we > can't usefully use it. Working for Google, myself, I sort of find that a curious generalization. >>>> I think it is safe to say that naive privilege inheritance has a fair >>>> track record of being exploited orders of magnitude more frequently >>>> than this. After all, these are the reasons LD_PRELOAD and shell >>>> script setuid bits are suppressed. >>> >>> I don't know what you mean here by naive privilege inheritance. The >>> examples you're taking about aren't inheritance at all; they're >>> exploring privilege *grants* during execve. My patch deliberately >>> leaves grants like that alone. >> >> The pI set is inherited through this exec unmolested. > > This is flat-out useless. Having pI = CAP_NET_BIND_SERVICE doesn't > let me bind low-numbered ports, full stop. Even in your proposed model, neither pI nor pA does this. It is what is in pE that counts. >> My Nack remains that you are eliminating the explicit enforcement of >> selective inheritance. A lockable secure bit protecting access to your >> prctl() function would address this concern. > > Would a sysctl or securebit that *optionally* allows pA to be disabled > satisfy you? > > I don't understand why lockable is at all useful. You'd need > CAP_SETPCAP to flip it regardless. Because it means one can create process trees in which this behavior is guaranteed to be allowed and/or disallowed. Cheers Andrew ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CALQRfL6z2dEtPkWiuGydT0a+y4iWm2qw-cY0Rk9hA-K4gPw_qA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALQRfL6z2dEtPkWiuGydT0a+y4iWm2qw-cY0Rk9hA-K4gPw_qA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-03-14 22:53 ` Andy Lutomirski 2015-03-14 22:55 ` Andy Lutomirski 1 sibling, 0 replies; 34+ messages in thread From: Andy Lutomirski @ 2015-03-14 22:53 UTC (permalink / raw) To: Andrew G. Morgan Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet On Sat, Mar 14, 2015 at 3:17 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: >> On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: >>> On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: >>>> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: >>>>> >>>>> > It's to preserve the invariant that pA is always a subset of pI. >>>>> >>>>> But since a user can always raise a bit in pI if it is present in pP, >>>>> what does this invariant add to your model other than inconvenience? >>>> >>>> The useful part is that dropping a bit from pI also drops it from pA. >>>> To keep the model consistent, I also require that you add the bit to >>>> pI before adding it to pA. >>> >>> So you are saying that pA is always a strict subset of pI (and pP)? >>> Then why not explicitly implement it as: >>> >>> pA' = (file caps or setuid or setgid ? 0 : pA) >>> pP' = (fP & X) | (pI & [fI | (pA' & pP)] ) >>> >>> As it is you have so distributed these constraints that it is hard to >>> be sure it will remain that way. >> >> That would be insecure. If an attacker had pA = CAP_SYS_ADMIN, pI = >> 0, pP = 0 (i.e. no privs but pA is set somehow) then, unless that's >> there's some other protection implemented, they could run some setuid >> program, and that program could switch back to non-root, set pI = 0, >> and call execve. Unexpectedly, CAP_SYS_ADMIN would be inherited. > > Forgive me, but which bit of > > pI & [fI | (pA' & pP)] > > with pI = 0 makes that so? Oh, I misread that. I think it's already unnecessarily confusing that you can inherit nonzero pI without inheriting the corresponding bits in pP, and I don't want to add yet more degrees of freedom in non-permitted caps. >>>> >>>> I don't know what you mean here by naive privilege inheritance. The >>>> examples you're taking about aren't inheritance at all; they're >>>> exploring privilege *grants* during execve. My patch deliberately >>>> leaves grants like that alone. >>> >>> The pI set is inherited through this exec unmolested. >> >> This is flat-out useless. Having pI = CAP_NET_BIND_SERVICE doesn't >> let me bind low-numbered ports, full stop. > > Even in your proposed model, neither pI nor pA does this. It is what > is in pE that counts. Let me state it more precisely. If your parent puts CAP_NET_BIND_SERVICE in pI and execs you, you can't bind low-numbered ports. If your parent puts CAP_NET_BIND_SERVICE in pA, you can. > >>> My Nack remains that you are eliminating the explicit enforcement of >>> selective inheritance. A lockable secure bit protecting access to your >>> prctl() function would address this concern. >> >> Would a sysctl or securebit that *optionally* allows pA to be disabled >> satisfy you? >> >> I don't understand why lockable is at all useful. You'd need >> CAP_SETPCAP to flip it regardless. > > Because it means one can create process trees in which this behavior > is guaranteed to be allowed and/or disallowed. > I don't see why guaranteeing that it's allowed is particularly useful. I also don't see why any of the securebits lock bits are useful. I don't specifically object; I just don't see the point. If you give a subtree CAP_SETPCAP but you don't trust them enough to leave securebits unlocked, then I think your threat model is confused. --Andy ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALQRfL6z2dEtPkWiuGydT0a+y4iWm2qw-cY0Rk9hA-K4gPw_qA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-14 22:53 ` Andy Lutomirski @ 2015-03-14 22:55 ` Andy Lutomirski [not found] ` <CALCETrWRHRVLotFs=Cpdr=7KjE7q52NdT62GxZg=xjv+LFZBqw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 1 sibling, 1 reply; 34+ messages in thread From: Andy Lutomirski @ 2015-03-14 22:55 UTC (permalink / raw) To: Andrew G. Morgan Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet It occurs to me that my previous reply was unnecessarily long and missed the point. Trying again: On Sat, Mar 14, 2015 at 3:17 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: >> On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: >>> My Nack remains that you are eliminating the explicit enforcement of >>> selective inheritance. A lockable secure bit protecting access to your >>> prctl() function would address this concern. >> >> Would a sysctl or securebit that *optionally* allows pA to be disabled >> satisfy you? ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ It would be kind of nice to remove your nack. I think that the above is the relevant question. Could you answer it? --Andy ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CALCETrWRHRVLotFs=Cpdr=7KjE7q52NdT62GxZg=xjv+LFZBqw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALCETrWRHRVLotFs=Cpdr=7KjE7q52NdT62GxZg=xjv+LFZBqw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-03-15 0:04 ` Andrew G. Morgan 2015-03-30 12:55 ` Christoph Lameter 0 siblings, 1 reply; 34+ messages in thread From: Andrew G. Morgan @ 2015-03-15 0:04 UTC (permalink / raw) To: Andy Lutomirski Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet On Sat, Mar 14, 2015 at 3:55 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > It occurs to me that my previous reply was unnecessarily long and > missed the point. Trying again: > > On Sat, Mar 14, 2015 at 3:17 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: >> On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: >>> On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: >>>> My Nack remains that you are eliminating the explicit enforcement of >>>> selective inheritance. A lockable secure bit protecting access to your >>>> prctl() function would address this concern. >>> >>> Would a sysctl or securebit that *optionally* allows pA to be disabled >>> satisfy you? > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > It would be kind of nice to remove your nack. I think that the above > is the relevant question. Could you answer it? I thought I did. Please implement a lockable secure bit and I will remove my Nack. I don't understand your confusion. Perhaps you are defining *optionally* in a way I don't follow? Perhaps you are saying that you want the default behavior of kernels to change to include your new feature, and that you want the secure bit, when set, would put it into pA suppressed mode? Other folk are probably better at anticipating the ramifications of changing default behavior. I'd prefer you default to pA being off, but I shall remove my Nack either way if you implement a lockable secure bit. Clear enough? Thanks Andrew ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-03-15 0:04 ` Andrew G. Morgan @ 2015-03-30 12:55 ` Christoph Lameter [not found] ` <alpine.DEB.2.11.1503300754070.5031-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Christoph Lameter @ 2015-03-30 12:55 UTC (permalink / raw) To: Andrew G. Morgan Cc: Andy Lutomirski, Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, LKML, Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet On Sat, 14 Mar 2015, Andrew G. Morgan wrote: > > I thought I did. Please implement a lockable secure bit and I will Would this suffice? It puts the CAP_SETPCAP limitation back to how it was in my earlier patch. Subject: ambient caps: Allow disabling with SETPCAP Do not allow setting ambient caps if CAP_SETPCAP is not set. Signed-off-by: Christoph Lameter <cl@linux.com> Index: linux/security/commoncap.c =================================================================== --- linux.orig/security/commoncap.c +++ linux/security/commoncap.c @@ -962,6 +962,9 @@ int cap_task_prctl(int option, unsigned if (((!cap_valid(arg3)) | arg4 | arg5)) return -EINVAL; + if (!ns_capable(current_user_ns(), CAP_SETPCAP)) + return -EPERM; + if (arg2 == PR_CAP_AMBIENT_GET) { return !!cap_raised(current_cred()->cap_ambient, arg3); } else if (arg2 != PR_CAP_AMBIENT_RAISE && ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <alpine.DEB.2.11.1503300754070.5031-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <alpine.DEB.2.11.1503300754070.5031-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> @ 2015-03-30 14:31 ` Andy Lutomirski [not found] ` <CALCETrW9ckw=7-1JSyFkenhFu2_1MvVqjA+inOmsuuWemGaU0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Andy Lutomirski @ 2015-03-30 14:31 UTC (permalink / raw) To: Christoph Lameter Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet On Mar 30, 2015 7:55 AM, "Christoph Lameter" <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote: > > On Sat, 14 Mar 2015, Andrew G. Morgan wrote: > > > > > I thought I did. Please implement a lockable secure bit and I will > > Would this suffice? It puts the CAP_SETPCAP limitation back to how it > was in my earlier patch. > I really don't like that variant. CAP_SETPCAP is dangerous and so absurdly powerful that people really shouldn't hand it out. I'll submit a new version this week with the securebits. Sorry for the delay. --Andy > > > Subject: ambient caps: Allow disabling with SETPCAP > > Do not allow setting ambient caps if CAP_SETPCAP is not set. > > Signed-off-by: Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> > > Index: linux/security/commoncap.c > =================================================================== > --- linux.orig/security/commoncap.c > +++ linux/security/commoncap.c > @@ -962,6 +962,9 @@ int cap_task_prctl(int option, unsigned > if (((!cap_valid(arg3)) | arg4 | arg5)) > return -EINVAL; > > + if (!ns_capable(current_user_ns(), CAP_SETPCAP)) > + return -EPERM; > + > if (arg2 == PR_CAP_AMBIENT_GET) { > return !!cap_raised(current_cred()->cap_ambient, arg3); > } else if (arg2 != PR_CAP_AMBIENT_RAISE && ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CALCETrW9ckw=7-1JSyFkenhFu2_1MvVqjA+inOmsuuWemGaU0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALCETrW9ckw=7-1JSyFkenhFu2_1MvVqjA+inOmsuuWemGaU0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-03-30 15:05 ` Christoph Lameter 2015-04-09 15:25 ` Christoph Lameter 1 sibling, 0 replies; 34+ messages in thread From: Christoph Lameter @ 2015-03-30 15:05 UTC (permalink / raw) To: Andy Lutomirski Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet On Mon, 30 Mar 2015, Andy Lutomirski wrote: > > Would this suffice? It puts the CAP_SETPCAP limitation back to how it > > was in my earlier patch. > I really don't like that variant. CAP_SETPCAP is dangerous and so > absurdly powerful that people really shouldn't hand it out. According to man 7 capabilities CAP_SETPCAP is required to setup securebits. This hides the functionality behind yet another stage of security and obscures this ability somewhat more? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALCETrW9ckw=7-1JSyFkenhFu2_1MvVqjA+inOmsuuWemGaU0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-30 15:05 ` Christoph Lameter @ 2015-04-09 15:25 ` Christoph Lameter [not found] ` <alpine.DEB.2.11.1504091024540.13650-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> 1 sibling, 1 reply; 34+ messages in thread From: Christoph Lameter @ 2015-04-09 15:25 UTC (permalink / raw) To: Andy Lutomirski Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet On Mon, 30 Mar 2015, Andy Lutomirski wrote: > I'll submit a new version this week with the securebits. Sorry for the delay. Are we going to get a new version? ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <alpine.DEB.2.11.1504091024540.13650-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <alpine.DEB.2.11.1504091024540.13650-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> @ 2015-04-23 14:01 ` Christoph Lameter [not found] ` <alpine.DEB.2.11.1504230900260.32203-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Christoph Lameter @ 2015-04-23 14:01 UTC (permalink / raw) To: Andy Lutomirski Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet On Thu, 9 Apr 2015, Christoph Lameter wrote: > > I'll submit a new version this week with the securebits. Sorry for the delay. > Are we going to get a new version? Replying to my own here. Cant we simply use the SETPCAP approach as per the patch I posted? ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <alpine.DEB.2.11.1504230900260.32203-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <alpine.DEB.2.11.1504230900260.32203-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> @ 2015-04-24 17:53 ` Serge Hallyn 2015-04-24 18:41 ` Andy Lutomirski 0 siblings, 1 reply; 34+ messages in thread From: Serge Hallyn @ 2015-04-24 17:53 UTC (permalink / raw) To: Christoph Lameter Cc: Andy Lutomirski, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet Quoting Christoph Lameter (cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org): > On Thu, 9 Apr 2015, Christoph Lameter wrote: > > > > I'll submit a new version this week with the securebits. Sorry for the delay. > > Are we going to get a new version? > > Replying to my own here. Cant we simply use the SETPCAP approach as per > the patch I posted? Andy had objections to that, but it seems ok to me. -serge ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-04-24 17:53 ` Serge Hallyn @ 2015-04-24 18:41 ` Andy Lutomirski [not found] ` <CALCETrW5_85mFFTkzCVNA5N1KQeNBrkw2d8FF3H3LYtmz6UAPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Andy Lutomirski @ 2015-04-24 18:41 UTC (permalink / raw) To: Serge Hallyn Cc: Christoph Lameter, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet On Fri, Apr 24, 2015 at 10:53 AM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote: > Quoting Christoph Lameter (cl@linux.com): >> On Thu, 9 Apr 2015, Christoph Lameter wrote: >> >> > > I'll submit a new version this week with the securebits. Sorry for the delay. >> > Are we going to get a new version? >> >> Replying to my own here. Cant we simply use the SETPCAP approach as per >> the patch I posted? > > Andy had objections to that, but it seems ok to me. > I object because CAP_SETPCAP is very powerful whereas CAP_NET_BIND_SERVICE, for example, isn't. I'm fine with having a switch to turn off ambient caps, but requiring the "on" state to give processes superpowers seems unfortunate. Sorry for the huge delay. I got caught up with travel and the merge window. Here's a sneak peek: https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=cap_ambient I need to write the user code to go with it and test it a bit before sending it out for real. --Andy ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CALCETrW5_85mFFTkzCVNA5N1KQeNBrkw2d8FF3H3LYtmz6UAPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALCETrW5_85mFFTkzCVNA5N1KQeNBrkw2d8FF3H3LYtmz6UAPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-04-24 19:09 ` Serge Hallyn 2015-04-24 19:25 ` Christoph Lameter 2015-04-24 19:53 ` Andy Lutomirski 0 siblings, 2 replies; 34+ messages in thread From: Serge Hallyn @ 2015-04-24 19:09 UTC (permalink / raw) To: Andy Lutomirski Cc: Christoph Lameter, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet Quoting Andy Lutomirski (luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org): > On Fri, Apr 24, 2015 at 10:53 AM, Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> wrote: > > Quoting Christoph Lameter (cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org): > >> On Thu, 9 Apr 2015, Christoph Lameter wrote: > >> > >> > > I'll submit a new version this week with the securebits. Sorry for the delay. > >> > Are we going to get a new version? > >> > >> Replying to my own here. Cant we simply use the SETPCAP approach as per > >> the patch I posted? > > > > Andy had objections to that, but it seems ok to me. > > > > I object because CAP_SETPCAP is very powerful whereas > CAP_NET_BIND_SERVICE, for example, isn't. I'm fine with having a > switch to turn off ambient caps, but requiring the "on" state to give Would only really be needed for the initial 'enable ambient caps for this process tree', though. Once that was set, add/remove'ing caps from the ambient set wouldn't need to be required. > processes superpowers seems unfortunate. > > Sorry for the huge delay. I got caught up with travel and the merge > window. Here's a sneak peek: > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=cap_ambient > > I need to write the user code to go with it and test it a bit before > sending it out for real. Ok, thanks -serge ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-04-24 19:09 ` Serge Hallyn @ 2015-04-24 19:25 ` Christoph Lameter 2015-04-24 19:53 ` Andy Lutomirski 1 sibling, 0 replies; 34+ messages in thread From: Christoph Lameter @ 2015-04-24 19:25 UTC (permalink / raw) To: Serge Hallyn Cc: Andy Lutomirski, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet On Fri, 24 Apr 2015, Serge Hallyn wrote: > > I object because CAP_SETPCAP is very powerful whereas > > CAP_NET_BIND_SERVICE, for example, isn't. I'm fine with having a > > switch to turn off ambient caps, but requiring the "on" state to give > > Would only really be needed for the initial 'enable ambient caps for this > process tree', though. Once that was set, add/remove'ing caps from the > ambient set wouldn't need to be required. Exactly. Its much simpler than alternate approaches. And its convoluted enough. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-04-24 19:09 ` Serge Hallyn 2015-04-24 19:25 ` Christoph Lameter @ 2015-04-24 19:53 ` Andy Lutomirski 2015-04-24 20:13 ` Christoph Lameter 1 sibling, 1 reply; 34+ messages in thread From: Andy Lutomirski @ 2015-04-24 19:53 UTC (permalink / raw) To: Serge Hallyn Cc: Christoph Lameter, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet On Fri, Apr 24, 2015 at 12:09 PM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote: > Quoting Andy Lutomirski (luto@amacapital.net): >> On Fri, Apr 24, 2015 at 10:53 AM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote: >> > Quoting Christoph Lameter (cl@linux.com): >> >> On Thu, 9 Apr 2015, Christoph Lameter wrote: >> >> >> >> > > I'll submit a new version this week with the securebits. Sorry for the delay. >> >> > Are we going to get a new version? >> >> >> >> Replying to my own here. Cant we simply use the SETPCAP approach as per >> >> the patch I posted? >> > >> > Andy had objections to that, but it seems ok to me. >> > >> >> I object because CAP_SETPCAP is very powerful whereas >> CAP_NET_BIND_SERVICE, for example, isn't. I'm fine with having a >> switch to turn off ambient caps, but requiring the "on" state to give > > Would only really be needed for the initial 'enable ambient caps for this > process tree', though. Once that was set, add/remove'ing caps from the > ambient set wouldn't need to be required. That's sort of what my patch does -- you need CAP_SETPCAP to switch the securebit. But Christoph's patch required it to add caps to the ambient set, right? --Andy ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-04-24 19:53 ` Andy Lutomirski @ 2015-04-24 20:13 ` Christoph Lameter [not found] ` <alpine.DEB.2.11.1504241512570.11839-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Christoph Lameter @ 2015-04-24 20:13 UTC (permalink / raw) To: Andy Lutomirski Cc: Serge Hallyn, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet On Fri, 24 Apr 2015, Andy Lutomirski wrote: > That's sort of what my patch does -- you need CAP_SETPCAP to switch > the securebit. > > But Christoph's patch required it to add caps to the ambient set, right? Yes but you seem to be just adding one additional step without too much of a benefit because you still need CAP_SETPCAP. ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <alpine.DEB.2.11.1504241512570.11839-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <alpine.DEB.2.11.1504241512570.11839-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> @ 2015-04-24 20:18 ` Andy Lutomirski 2015-04-24 21:15 ` Serge E. Hallyn 2015-04-27 14:55 ` Christoph Lameter 0 siblings, 2 replies; 34+ messages in thread From: Andy Lutomirski @ 2015-04-24 20:18 UTC (permalink / raw) To: Christoph Lameter Cc: Serge Hallyn, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet On Fri, Apr 24, 2015 at 1:13 PM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote: > On Fri, 24 Apr 2015, Andy Lutomirski wrote: > >> That's sort of what my patch does -- you need CAP_SETPCAP to switch >> the securebit. >> >> But Christoph's patch required it to add caps to the ambient set, right? > > Yes but you seem to be just adding one additional step without too much of > a benefit because you still need CAP_SETPCAP. > No, because I set the default to on :) Also, in my model you can do: $ sudo capset cap_whatever=eip something $ ./something and the program can make its cap be ambient and run a helper. In the CAP_SETPCAP model, that doesn't work. --Andy -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-04-24 20:18 ` Andy Lutomirski @ 2015-04-24 21:15 ` Serge E. Hallyn [not found] ` <20150424211511.GB28613-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org> 2015-04-27 14:55 ` Christoph Lameter 1 sibling, 1 reply; 34+ messages in thread From: Serge E. Hallyn @ 2015-04-24 21:15 UTC (permalink / raw) To: Andy Lutomirski Cc: Christoph Lameter, Serge Hallyn, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet On Fri, Apr 24, 2015 at 01:18:44PM -0700, Andy Lutomirski wrote: > On Fri, Apr 24, 2015 at 1:13 PM, Christoph Lameter <cl@linux.com> wrote: > > On Fri, 24 Apr 2015, Andy Lutomirski wrote: > > > >> That's sort of what my patch does -- you need CAP_SETPCAP to switch > >> the securebit. > >> > >> But Christoph's patch required it to add caps to the ambient set, right? > > > > Yes but you seem to be just adding one additional step without too much of > > a benefit because you still need CAP_SETPCAP. > > > > No, because I set the default to on :) Right - I definately prefer . default off . CAP_SETPCAP required to turn it on (for self and children) . once on, anyone can copy bits from (whatever we decided) to pA. > Also, in my model you can do: > > $ sudo capset cap_whatever=eip something > $ ./something > > and the program can make its cap be ambient and run a helper. In the > CAP_SETPCAP model, that doesn't work. > > --Andy > > -- > Andy Lutomirski > AMA Capital Management, LLC > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <20150424211511.GB28613-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <20150424211511.GB28613-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org> @ 2015-04-24 22:55 ` Andy Lutomirski [not found] ` <CALCETrUdNp18ZeOC8pRYn_YPUp8nFyH7aS+dFoBf34XNmER-nA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 34+ messages in thread From: Andy Lutomirski @ 2015-04-24 22:55 UTC (permalink / raw) To: Serge E. Hallyn Cc: Jarkko Sakkinen, Ted Ts'o, Andrew G. Morgan, Andrew Morton, Serge Hallyn, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet On Apr 24, 2015 2:15 PM, "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> wrote: > > On Fri, Apr 24, 2015 at 01:18:44PM -0700, Andy Lutomirski wrote: > > On Fri, Apr 24, 2015 at 1:13 PM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote: > > > On Fri, 24 Apr 2015, Andy Lutomirski wrote: > > > > > >> That's sort of what my patch does -- you need CAP_SETPCAP to switch > > >> the securebit. > > >> > > >> But Christoph's patch required it to add caps to the ambient set, right? > > > > > > Yes but you seem to be just adding one additional step without too much of > > > a benefit because you still need CAP_SETPCAP. > > > > > > > No, because I set the default to on :) > > Right - I definately prefer > > . default off > . CAP_SETPCAP required to turn it on (for self and children) > . once on, anyone can copy bits from (whatever we decided) to pA. > Why default off? If there's some weird reason that switching it on could cause a security problem, then I'd agree, but I haven't spotted a reason yet. --Andy ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CALCETrUdNp18ZeOC8pRYn_YPUp8nFyH7aS+dFoBf34XNmER-nA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC] capabilities: Ambient capabilities [not found] ` <CALCETrUdNp18ZeOC8pRYn_YPUp8nFyH7aS+dFoBf34XNmER-nA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-04-25 2:45 ` Serge Hallyn 0 siblings, 0 replies; 34+ messages in thread From: Serge Hallyn @ 2015-04-25 2:45 UTC (permalink / raw) To: Andy Lutomirski Cc: Serge E. Hallyn, Jarkko Sakkinen, Ted Ts'o, Andrew G. Morgan, Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet Quoting Andy Lutomirski (luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org): > On Apr 24, 2015 2:15 PM, "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> wrote: > > > > On Fri, Apr 24, 2015 at 01:18:44PM -0700, Andy Lutomirski wrote: > > > On Fri, Apr 24, 2015 at 1:13 PM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote: > > > > On Fri, 24 Apr 2015, Andy Lutomirski wrote: > > > > > > > >> That's sort of what my patch does -- you need CAP_SETPCAP to switch > > > >> the securebit. > > > >> > > > >> But Christoph's patch required it to add caps to the ambient set, right? > > > > > > > > Yes but you seem to be just adding one additional step without too much of > > > > a benefit because you still need CAP_SETPCAP. > > > > > > > > > > No, because I set the default to on :) > > > > Right - I definately prefer > > > > . default off > > . CAP_SETPCAP required to turn it on (for self and children) > > . once on, anyone can copy bits from (whatever we decided) to pA. > > > > Why default off? If there's some weird reason that switching it on > could cause a security problem, then I'd agree, but I haven't spotted > a reason yet. Cause it's less scary? I'll just wait for the new patchset :) ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC] capabilities: Ambient capabilities 2015-04-24 20:18 ` Andy Lutomirski 2015-04-24 21:15 ` Serge E. Hallyn @ 2015-04-27 14:55 ` Christoph Lameter 1 sibling, 0 replies; 34+ messages in thread From: Christoph Lameter @ 2015-04-27 14:55 UTC (permalink / raw) To: Andy Lutomirski Cc: Serge Hallyn, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn, linux-security-module, Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet On Fri, 24 Apr 2015, Andy Lutomirski wrote: > Also, in my model you can do: > > $ sudo capset cap_whatever=eip something > $ ./something > > and the program can make its cap be ambient and run a helper. In the > CAP_SETPCAP model, that doesn't work. Dont see too much difference in setting caps and CAP_SETPCAP on "./something" to allow it to set the ambient caps. ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2015-04-27 14:55 UTC | newest] Thread overview: 34+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-03-12 18:08 [RFC] capabilities: Ambient capabilities Andy Lutomirski 2015-03-12 21:49 ` Kees Cook 2015-03-12 22:10 ` Andrew G. Morgan [not found] ` <CALQRfL7b8CjYgUnVy3jykNwv48fOc03T385RKo--cfv25YenBg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-12 22:27 ` Andrew Lutomirski [not found] ` <CAObL_7HgU-ekb7mJS7C=0=idfrzV6=CSqTrG2LvUgdDtvbJx_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-13 13:24 ` Andrew G. Morgan 2015-03-13 16:06 ` Christoph Lameter [not found] ` <alpine.DEB.2.11.1503131049150.13899-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> 2015-03-13 17:58 ` Andy Lutomirski [not found] ` <CALCETrXcUfbqfm7av9crrxQ5nCBYpdJO8gRo4ZhRA97g27B2iw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-13 18:52 ` Christoph Lameter 2015-03-13 17:57 ` Andy Lutomirski [not found] ` <CALCETrVz39pJmMAEFodBNA8cOZjd0xkH95EzMazY0MWJDtpYgg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-13 18:52 ` Kees Cook 2015-03-13 19:03 ` Andy Lutomirski [not found] ` <CALCETrV7vDJVhM_AgtGn8ENStcyxZBwCw3zhSn-whArT_XPg8A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-13 19:54 ` Kees Cook 2015-03-14 21:09 ` Andrew G. Morgan 2015-03-14 21:45 ` Andy Lutomirski [not found] ` <CALCETrUGASa3Gp6cC1zdAahcS59DOdyLnTtgNX7t7FucMZLmoQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-14 22:17 ` Andrew G. Morgan [not found] ` <CALQRfL6z2dEtPkWiuGydT0a+y4iWm2qw-cY0Rk9hA-K4gPw_qA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-14 22:53 ` Andy Lutomirski 2015-03-14 22:55 ` Andy Lutomirski [not found] ` <CALCETrWRHRVLotFs=Cpdr=7KjE7q52NdT62GxZg=xjv+LFZBqw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-15 0:04 ` Andrew G. Morgan 2015-03-30 12:55 ` Christoph Lameter [not found] ` <alpine.DEB.2.11.1503300754070.5031-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> 2015-03-30 14:31 ` Andy Lutomirski [not found] ` <CALCETrW9ckw=7-1JSyFkenhFu2_1MvVqjA+inOmsuuWemGaU0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-03-30 15:05 ` Christoph Lameter 2015-04-09 15:25 ` Christoph Lameter [not found] ` <alpine.DEB.2.11.1504091024540.13650-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> 2015-04-23 14:01 ` Christoph Lameter [not found] ` <alpine.DEB.2.11.1504230900260.32203-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> 2015-04-24 17:53 ` Serge Hallyn 2015-04-24 18:41 ` Andy Lutomirski [not found] ` <CALCETrW5_85mFFTkzCVNA5N1KQeNBrkw2d8FF3H3LYtmz6UAPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-04-24 19:09 ` Serge Hallyn 2015-04-24 19:25 ` Christoph Lameter 2015-04-24 19:53 ` Andy Lutomirski 2015-04-24 20:13 ` Christoph Lameter [not found] ` <alpine.DEB.2.11.1504241512570.11839-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org> 2015-04-24 20:18 ` Andy Lutomirski 2015-04-24 21:15 ` Serge E. Hallyn [not found] ` <20150424211511.GB28613-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org> 2015-04-24 22:55 ` Andy Lutomirski [not found] ` <CALCETrUdNp18ZeOC8pRYn_YPUp8nFyH7aS+dFoBf34XNmER-nA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-04-25 2:45 ` Serge Hallyn 2015-04-27 14:55 ` Christoph Lameter
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).