From mboxrd@z Thu Jan 1 00:00:00 1970 From: Yu-cheng Yu Subject: Re: [RFC PATCH v2 17/27] x86/cet/shstk: User-mode shadow stack support Date: Fri, 13 Jul 2018 11:03:04 -0700 Message-ID: <1531504984.11680.21.camel@intel.com> References: <20180710222639.8241-1-yu-cheng.yu@intel.com> <20180710222639.8241-18-yu-cheng.yu@intel.com> <6F5FEFFD-0A9A-4181-8D15-5FC323632BA6@amacapital.net> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit Return-path: In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org To: Andy Lutomirski , Jann Horn Cc: the arch/x86 maintainers , "H . Peter Anvin" , Thomas Gleixner , Ingo Molnar , kernel list , linux-doc@vger.kernel.org, Linux-MM , linux-arch , Linux API , Arnd Bergmann , bsingharora@gmail.com, Cyrill Gorcunov , Dave Hansen , Florian Weimer , hjl.tools@gmail.com, Jonathan Corbet , keescook@chromiun.org, Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , ravi List-Id: linux-arch.vger.kernel.org On Wed, 2018-07-11 at 15:21 -0700, Andy Lutomirski wrote: > > > > On Jul 11, 2018, at 2:51 PM, Jann Horn wrote: > > > > On Wed, Jul 11, 2018 at 2:34 PM Andy Lutomirski wrote: > > > > > > > > > > > On Jul 11, 2018, at 2:10 PM, Jann Horn wrote: > > > > > > > > > > > > > > On Tue, Jul 10, 2018 at 3:31 PM Yu-cheng Yu wrote: > > > > > > > > > > This patch adds basic shadow stack enabling/disabling routines. > > > > > A task's shadow stack is allocated from memory with VM_SHSTK > > > > > flag set and read-only protection.  The shadow stack is > > > > > allocated to a fixed size. > > > > > > > > > > Signed-off-by: Yu-cheng Yu > > > > [...] > > > > > > > > > > diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c > > > > > new file mode 100644 > > > > > index 000000000000..96bf69db7da7 > > > > > --- /dev/null > > > > > +++ b/arch/x86/kernel/cet.c > > > > [...] > > > > > > > > > > +static unsigned long shstk_mmap(unsigned long addr, unsigned long len) > > > > > +{ > > > > > +       struct mm_struct *mm = current->mm; > > > > > +       unsigned long populate; > > > > > + > > > > > +       down_write(&mm->mmap_sem); > > > > > +       addr = do_mmap(NULL, addr, len, PROT_READ, > > > > > +                      MAP_ANONYMOUS | MAP_PRIVATE, VM_SHSTK, > > > > > +                      0, &populate, NULL); > > > > > +       up_write(&mm->mmap_sem); > > > > > + > > > > > +       if (populate) > > > > > +               mm_populate(addr, populate); > > > > > + > > > > > +       return addr; > > > > > +} > > [...] > > > > > > > > > > > Should the kernel enforce that two shadow stacks must have a guard > > > > page between them so that they can not be directly adjacent, so that > > > > if you have too much recursion, you can't end up corrupting an > > > > adjacent shadow stack? > > > I think the answer is a qualified “no”. I would like to instead enforce a general guard page on all mmaps that don’t use MAP_FORCE. We *might* need to exempt any mmap with an address hint for > > > compatibility. > > I like this idea a lot. > > > > > > > > My commercial software has been manually adding guard pages on every single mmap done by tcmalloc for years, and it has caught a couple bugs and costs essentially nothing. > > > > > > Hmm. Linux should maybe add something like Windows’ “reserved” virtual memory. It’s basically a way to ask for a VA range that explicitly contains nothing and can be subsequently be turned into > > > something useful with the equivalent of MAP_FORCE. > > What's the benefit over creating an anonymous PROT_NONE region? That > > the kernel won't have to scan through the corresponding PTEs when > > tearing down the mapping? > Make it more obvious what’s happening and avoid accounting issues?  What I’ve actually used is MAP_NORESERVE | PROT_NONE, but I think this still counts against the VA rlimit. But maybe that’s > actually the desired behavior. We can put a NULL at both ends of a SHSTK to guard against corruption. Yu-cheng  From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com ([192.55.52.93]:4809 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387717AbeGMSW1 (ORCPT ); Fri, 13 Jul 2018 14:22:27 -0400 Message-ID: <1531504984.11680.21.camel@intel.com> Subject: Re: [RFC PATCH v2 17/27] x86/cet/shstk: User-mode shadow stack support From: Yu-cheng Yu Date: Fri, 13 Jul 2018 11:03:04 -0700 In-Reply-To: References: <20180710222639.8241-1-yu-cheng.yu@intel.com> <20180710222639.8241-18-yu-cheng.yu@intel.com> <6F5FEFFD-0A9A-4181-8D15-5FC323632BA6@amacapital.net> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Andy Lutomirski , Jann Horn Cc: the arch/x86 maintainers , "H . Peter Anvin" , Thomas Gleixner , Ingo Molnar , kernel list , linux-doc@vger.kernel.org, Linux-MM , linux-arch , Linux API , Arnd Bergmann , bsingharora@gmail.com, Cyrill Gorcunov , Dave Hansen , Florian Weimer , hjl.tools@gmail.com, Jonathan Corbet , keescook@chromiun.org, Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , ravi.v.shankar@intel.com, vedvyas.shanbhogue@intel.com Message-ID: <20180713180304.gKIqrqNaNH7Fo-GjJgf1GVORxAq42UD3xcc_-Rr--34@z> On Wed, 2018-07-11 at 15:21 -0700, Andy Lutomirski wrote: > > > > On Jul 11, 2018, at 2:51 PM, Jann Horn wrote: > > > > On Wed, Jul 11, 2018 at 2:34 PM Andy Lutomirski wrote: > > > > > > > > > > > On Jul 11, 2018, at 2:10 PM, Jann Horn wrote: > > > > > > > > > > > > > > On Tue, Jul 10, 2018 at 3:31 PM Yu-cheng Yu wrote: > > > > > > > > > > This patch adds basic shadow stack enabling/disabling routines. > > > > > A task's shadow stack is allocated from memory with VM_SHSTK > > > > > flag set and read-only protection.  The shadow stack is > > > > > allocated to a fixed size. > > > > > > > > > > Signed-off-by: Yu-cheng Yu > > > > [...] > > > > > > > > > > diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c > > > > > new file mode 100644 > > > > > index 000000000000..96bf69db7da7 > > > > > --- /dev/null > > > > > +++ b/arch/x86/kernel/cet.c > > > > [...] > > > > > > > > > > +static unsigned long shstk_mmap(unsigned long addr, unsigned long len) > > > > > +{ > > > > > +       struct mm_struct *mm = current->mm; > > > > > +       unsigned long populate; > > > > > + > > > > > +       down_write(&mm->mmap_sem); > > > > > +       addr = do_mmap(NULL, addr, len, PROT_READ, > > > > > +                      MAP_ANONYMOUS | MAP_PRIVATE, VM_SHSTK, > > > > > +                      0, &populate, NULL); > > > > > +       up_write(&mm->mmap_sem); > > > > > + > > > > > +       if (populate) > > > > > +               mm_populate(addr, populate); > > > > > + > > > > > +       return addr; > > > > > +} > > [...] > > > > > > > > > > > Should the kernel enforce that two shadow stacks must have a guard > > > > page between them so that they can not be directly adjacent, so that > > > > if you have too much recursion, you can't end up corrupting an > > > > adjacent shadow stack? > > > I think the answer is a qualified “no”. I would like to instead enforce a general guard page on all mmaps that don’t use MAP_FORCE. We *might* need to exempt any mmap with an address hint for > > > compatibility. > > I like this idea a lot. > > > > > > > > My commercial software has been manually adding guard pages on every single mmap done by tcmalloc for years, and it has caught a couple bugs and costs essentially nothing. > > > > > > Hmm. Linux should maybe add something like Windows’ “reserved” virtual memory. It’s basically a way to ask for a VA range that explicitly contains nothing and can be subsequently be turned into > > > something useful with the equivalent of MAP_FORCE. > > What's the benefit over creating an anonymous PROT_NONE region? That > > the kernel won't have to scan through the corresponding PTEs when > > tearing down the mapping? > Make it more obvious what’s happening and avoid accounting issues?  What I’ve actually used is MAP_NORESERVE | PROT_NONE, but I think this still counts against the VA rlimit. But maybe that’s > actually the desired behavior. We can put a NULL at both ends of a SHSTK to guard against corruption. Yu-cheng