From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jann Horn Subject: Re: [PATCH] kernel/signal: Signal-based pre-coredump notification Date: Mon, 15 Oct 2018 20:54:27 +0200 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Return-path: In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org To: enkechen@cisco.com Cc: Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H . Peter Anvin" , the arch/x86 maintainers , Peter Zijlstra , Arnd Bergmann , "Eric W. Biederman" , Khalid Aziz , Kate Stewart , deller@gmx.de, Greg Kroah-Hartman , Al Viro , Andrew Morton , christian@brauner.io, Catalin Marinas , Will Deacon , Dave.Martin@arm.com, mchehab+samsung@kernel.org, Michal Hocko , Rik van Riel , "Kirill A . Shutemov" List-Id: linux-arch.vger.kernel.org On Mon, Oct 15, 2018 at 8:36 PM Enke Chen wrote: > On 10/13/18 11:27 AM, Jann Horn wrote: > > On Sat, Oct 13, 2018 at 2:33 AM Enke Chen wrote: > >> For simplicity and consistency, this patch provides an implementation > >> for signal-based fault notification prior to the coredump of a child > >> process. A new prctl command, PR_SET_PREDUMP_SIG, is defined that can > >> be used by an application to express its interest and to specify the > >> signal (SIGCHLD or SIGUSR1 or SIGUSR2) for such a notification. A new > >> signal code (si_code), CLD_PREDUMP, is also defined for SIGCHLD. > > > > Your suggested API looks vaguely similar to PR_SET_PDEATHSIG, but with > > some important differences: > > > > - You don't reset the signal on setuid execution. [...] > > > > For both of these: Are these differences actually necessary, and if > > so, can you provide a specific rationale? From a security perspective, > > I would very much prefer it if this API had semantics closer to > > PR_SET_PDEATHSIG. > [...] > > Regarding the impact of "setuid", this property "PR_SET_PREDUMP_SIG" has to > do with the application/process whether the signal handler is set for receiving > such a notification. If it is set, the "uid" should not matter. If an attacker's process first calls PR_SET_PREDUMP_SIG, then forks off a child, then calls execve() on a setuid binary, the setuid binary calls setuid(0), and the attacker-controlled child then crashes, the privileged process will receive an unexpected signal that the attacker wouldn't have been allowed to send otherwise. For similar reasons, the parent death signal is reset when a setuid binary is executed: void setup_new_exec(struct linux_binprm * bprm) { /* * Once here, prepare_binrpm() will not be called any more, so * the final state of setuid/setgid/fscaps can be merged into the * secureexec flag. */ bprm->secureexec |= bprm->cap_elevated; if (bprm->secureexec) { /* Make sure parent cannot signal privileged process. */ current->pdeath_signal = 0; [...] } [...] } int commit_creds(struct cred *new) { [...] /* dumpability changes */ if (!uid_eq(old->euid, new->euid) || !gid_eq(old->egid, new->egid) || !uid_eq(old->fsuid, new->fsuid) || !gid_eq(old->fsgid, new->fsgid) || !cred_cap_issubset(old, new)) { if (task->mm) set_dumpable(task->mm, suid_dumpable); task->pdeath_signal = 0; smp_wmb(); } [...] } AppArmor and SELinux also do related changes: static void apparmor_bprm_committing_creds(struct linux_binprm *bprm) { [...] /* bail out if unconfined or not changing profile */ if ((new_label->proxy == label->proxy) || (unconfined(new_label))) return; aa_inherit_files(bprm->cred, current->files); current->pdeath_signal = 0; [...] } static void selinux_bprm_committing_creds(struct linux_binprm *bprm) { [...] new_tsec = bprm->cred->security; if (new_tsec->sid == new_tsec->osid) return; /* Close files for which the new task SID is not authorized. */ flush_unauthorized_files(bprm->cred, current->files); /* Always clear parent death signal on SID transitions. */ current->pdeath_signal = 0; [...] } You should probably reset the coredump signal in the same places - or even better, add a new helper for resetting the parent death signal, and then add code for resetting the coredump signal in there. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ot1-f66.google.com ([209.85.210.66]:41223 "EHLO mail-ot1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726713AbeJPClY (ORCPT ); Mon, 15 Oct 2018 22:41:24 -0400 Received: by mail-ot1-f66.google.com with SMTP id c32so19967563otb.8 for ; Mon, 15 Oct 2018 11:54:53 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Jann Horn Date: Mon, 15 Oct 2018 20:54:27 +0200 Message-ID: Subject: Re: [PATCH] kernel/signal: Signal-based pre-coredump notification Content-Type: text/plain; charset="UTF-8" Sender: linux-arch-owner@vger.kernel.org List-ID: To: enkechen@cisco.com Cc: Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H . Peter Anvin" , the arch/x86 maintainers , Peter Zijlstra , Arnd Bergmann , "Eric W. Biederman" , Khalid Aziz , Kate Stewart , deller@gmx.de, Greg Kroah-Hartman , Al Viro , Andrew Morton , christian@brauner.io, Catalin Marinas , Will Deacon , Dave.Martin@arm.com, mchehab+samsung@kernel.org, Michal Hocko , Rik van Riel , "Kirill A . Shutemov" , guro@fb.com, Marcos Souza , Oleg Nesterov , linux@dominikbrodowski.net, Cyrill Gorcunov , yang.shi@linux.alibaba.com, Kees Cook , kernel list , linux-arch , Victor Kamensky , xe-linux-external@cisco.com, sstrogin@cisco.com Message-ID: <20181015185427.q8jN_y6C2qwKZsj0_5ueHkKr1mWFRaz5nwlOYp9CYTI@z> On Mon, Oct 15, 2018 at 8:36 PM Enke Chen wrote: > On 10/13/18 11:27 AM, Jann Horn wrote: > > On Sat, Oct 13, 2018 at 2:33 AM Enke Chen wrote: > >> For simplicity and consistency, this patch provides an implementation > >> for signal-based fault notification prior to the coredump of a child > >> process. A new prctl command, PR_SET_PREDUMP_SIG, is defined that can > >> be used by an application to express its interest and to specify the > >> signal (SIGCHLD or SIGUSR1 or SIGUSR2) for such a notification. A new > >> signal code (si_code), CLD_PREDUMP, is also defined for SIGCHLD. > > > > Your suggested API looks vaguely similar to PR_SET_PDEATHSIG, but with > > some important differences: > > > > - You don't reset the signal on setuid execution. [...] > > > > For both of these: Are these differences actually necessary, and if > > so, can you provide a specific rationale? From a security perspective, > > I would very much prefer it if this API had semantics closer to > > PR_SET_PDEATHSIG. > [...] > > Regarding the impact of "setuid", this property "PR_SET_PREDUMP_SIG" has to > do with the application/process whether the signal handler is set for receiving > such a notification. If it is set, the "uid" should not matter. If an attacker's process first calls PR_SET_PREDUMP_SIG, then forks off a child, then calls execve() on a setuid binary, the setuid binary calls setuid(0), and the attacker-controlled child then crashes, the privileged process will receive an unexpected signal that the attacker wouldn't have been allowed to send otherwise. For similar reasons, the parent death signal is reset when a setuid binary is executed: void setup_new_exec(struct linux_binprm * bprm) { /* * Once here, prepare_binrpm() will not be called any more, so * the final state of setuid/setgid/fscaps can be merged into the * secureexec flag. */ bprm->secureexec |= bprm->cap_elevated; if (bprm->secureexec) { /* Make sure parent cannot signal privileged process. */ current->pdeath_signal = 0; [...] } [...] } int commit_creds(struct cred *new) { [...] /* dumpability changes */ if (!uid_eq(old->euid, new->euid) || !gid_eq(old->egid, new->egid) || !uid_eq(old->fsuid, new->fsuid) || !gid_eq(old->fsgid, new->fsgid) || !cred_cap_issubset(old, new)) { if (task->mm) set_dumpable(task->mm, suid_dumpable); task->pdeath_signal = 0; smp_wmb(); } [...] } AppArmor and SELinux also do related changes: static void apparmor_bprm_committing_creds(struct linux_binprm *bprm) { [...] /* bail out if unconfined or not changing profile */ if ((new_label->proxy == label->proxy) || (unconfined(new_label))) return; aa_inherit_files(bprm->cred, current->files); current->pdeath_signal = 0; [...] } static void selinux_bprm_committing_creds(struct linux_binprm *bprm) { [...] new_tsec = bprm->cred->security; if (new_tsec->sid == new_tsec->osid) return; /* Close files for which the new task SID is not authorized. */ flush_unauthorized_files(bprm->cred, current->files); /* Always clear parent death signal on SID transitions. */ current->pdeath_signal = 0; [...] } You should probably reset the coredump signal in the same places - or even better, add a new helper for resetting the parent death signal, and then add code for resetting the coredump signal in there.