From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E3532C4321E for ; Fri, 4 Nov 2022 22:43:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230348AbiKDWnn (ORCPT ); Fri, 4 Nov 2022 18:43:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49464 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230298AbiKDWnR (ORCPT ); Fri, 4 Nov 2022 18:43:17 -0400 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8553E59FED; Fri, 4 Nov 2022 15:40:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667601601; x=1699137601; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=of60iN5sOAyNYUtQIOruia5Wg5JZGtNeyl3TvdwitXo=; b=R8qzkDLg+mL96ra/ohxGtVrP9H660y+7JpK16wy0neLQFOoQjR+lXmCW 8IgEMyCTiRei8zyF3YuGBy3bqikt6ykThOvKdkzFOxKuPd7iaYK5M4no0 YCwSX/jUeYSX/S9akjoaNAghOqiyEnL9OmkClFaUm3cIpzvplDK3Hn3ec B1rSsEKjIOL3nbffteJjRwvWx1d0W+LqSgSngojgLRVKeMuu9u+cJFDQL Hnr+5+MD5hsCXJtZi2xmfhVlOzR5piMNI0WHLDFsk9E1FErcCy8cKDN8R 4XnC2MY+dM2/01TZxtdBQNWSygYZZnz9YH7VVi7JMeTs7cScpFx8nHXi+ w==; X-IronPort-AV: E=McAfee;i="6500,9779,10521"; a="311840579" X-IronPort-AV: E=Sophos;i="5.96,138,1665471600"; d="scan'208";a="311840579" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Nov 2022 15:39:45 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10521"; a="668514117" X-IronPort-AV: E=Sophos;i="5.96,138,1665471600"; d="scan'208";a="668514117" Received: from adhjerms-mobl1.amr.corp.intel.com (HELO rpedgeco-desk.amr.corp.intel.com) ([10.212.227.68]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Nov 2022 15:39:44 -0700 From: Rick Edgecombe To: x86@kernel.org, "H . Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H . J . Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V . Shankar" , Weijiang Yang , "Kirill A . Shutemov" , John Allen , kcc@google.com, eranian@google.com, rppt@kernel.org, jamorris@linux.microsoft.com, dethoma@microsoft.com, akpm@linux-foundation.org Cc: rick.p.edgecombe@intel.com, Yu-cheng Yu Subject: [PATCH v3 25/37] x86/shstk: Add user-mode shadow stack support Date: Fri, 4 Nov 2022 15:35:52 -0700 Message-Id: <20221104223604.29615-26-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20221104223604.29615-1-rick.p.edgecombe@intel.com> References: <20221104223604.29615-1-rick.p.edgecombe@intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-api@vger.kernel.org From: Yu-cheng Yu Introduce basic shadow stack enabling/disabling/allocation routines. A task's shadow stack is allocated from memory with VM_SHADOW_STACK flag and has a fixed size of min(RLIMIT_STACK, 4GB). Keep the task's shadow stack address and size in thread_struct. This will be copied when cloning new threads, but needs to be cleared during exec, so add a function to do this. Do not support IA32 emulation or x32. Tested-by: Pengfei Xu Tested-by: John Allen Signed-off-by: Yu-cheng Yu Co-developed-by: Rick Edgecombe Signed-off-by: Rick Edgecombe Cc: Kees Cook --- v3: - Use define for set_clr_bits_msrl() (Kees) - Make some functions static (Kees) - Change feature_foo() to features_foo() (Kees) - Centralize shadow stack size rlimit checks (Kees) - Disable x32 support v2: - Get rid of unnessary shstk->base checks - Don't support IA32 emulation v1: - Switch to xsave helpers. - Expand commit log. Yu-cheng v30: - Remove superfluous comments for struct thread_shstk. - Replace 'populate' with 'unused'. arch/x86/include/asm/cet.h | 7 ++ arch/x86/include/asm/msr.h | 11 +++ arch/x86/include/asm/processor.h | 3 + arch/x86/include/uapi/asm/prctl.h | 3 + arch/x86/kernel/shstk.c | 146 ++++++++++++++++++++++++++++++ 5 files changed, 170 insertions(+) diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h index a2f3c6e06ef5..cade110b2ea8 100644 --- a/arch/x86/include/asm/cet.h +++ b/arch/x86/include/asm/cet.h @@ -8,12 +8,19 @@ struct task_struct; #ifdef CONFIG_X86_USER_SHADOW_STACK +struct thread_shstk { + u64 base; + u64 size; +}; + long cet_prctl(struct task_struct *task, int option, unsigned long features); void reset_thread_features(void); +void shstk_free(struct task_struct *p); #else static inline long cet_prctl(struct task_struct *task, int option, unsigned long features) { return -EINVAL; } static inline void reset_thread_features(void) {} +static inline void shstk_free(struct task_struct *p) {} #endif /* CONFIG_X86_USER_SHADOW_STACK */ #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h index 65ec1965cd28..a4b86eb537d6 100644 --- a/arch/x86/include/asm/msr.h +++ b/arch/x86/include/asm/msr.h @@ -310,6 +310,17 @@ void msrs_free(struct msr *msrs); int msr_set_bit(u32 msr, u8 bit); int msr_clear_bit(u32 msr, u8 bit); +/* Helper that can never get accidentally un-inlined. */ +#define set_clr_bits_msrl(msr, set, clear) do { \ + u64 __val, __new_val; \ + \ + rdmsrl(msr, __val); \ + __new_val = (__val & ~(clear)) | (set); \ + \ + if (__new_val != __val) \ + wrmsrl(msr, __new_val); \ +} while (0) + #ifdef CONFIG_SMP int rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h); int wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h); diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index ca66d320a263..a6c414dfd10f 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -27,6 +27,7 @@ struct vm86; #include #include #include +#include #include #include @@ -533,6 +534,8 @@ struct thread_struct { #ifdef CONFIG_X86_USER_SHADOW_STACK unsigned long features; unsigned long features_locked; + + struct thread_shstk shstk; #endif /* Floating point and extended processor state */ diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h index 2dae9997ee17..dad5288bf086 100644 --- a/arch/x86/include/uapi/asm/prctl.h +++ b/arch/x86/include/uapi/asm/prctl.h @@ -26,4 +26,7 @@ #define ARCH_CET_DISABLE 0x5002 #define ARCH_CET_LOCK 0x5003 +/* ARCH_CET_ features bits */ +#define CET_SHSTK (1ULL << 0) + #endif /* _ASM_X86_PRCTL_H */ diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c index ed6f25cc07c5..20da2008e021 100644 --- a/arch/x86/kernel/shstk.c +++ b/arch/x86/kernel/shstk.c @@ -8,14 +8,160 @@ #include #include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include #include +static bool features_enabled(unsigned long features) +{ + return current->thread.features & features; +} + +static void features_set(unsigned long features) +{ + current->thread.features |= features; +} + +static void features_clr(unsigned long features) +{ + current->thread.features &= ~features; +} + +static unsigned long alloc_shstk(unsigned long size) +{ + int flags = MAP_ANONYMOUS | MAP_PRIVATE; + struct mm_struct *mm = current->mm; + unsigned long addr, unused; + + mmap_write_lock(mm); + addr = do_mmap(NULL, addr, size, PROT_READ, flags, + VM_SHADOW_STACK | VM_WRITE, 0, &unused, NULL); + + mmap_write_unlock(mm); + + return addr; +} + +static unsigned long adjust_shstk_size(unsigned long size) +{ + if (size) + return PAGE_ALIGN(size); + + return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G)); +} + +static void unmap_shadow_stack(u64 base, u64 size) +{ + while (1) { + int r; + + r = vm_munmap(base, size); + + /* + * vm_munmap() returns -EINTR when mmap_lock is held by + * something else, and that lock should not be held for a + * long time. Retry it for the case. + */ + if (r == -EINTR) { + cond_resched(); + continue; + } + + /* + * For all other types of vm_munmap() failure, either the + * system is out of memory or there is bug. + */ + WARN_ON_ONCE(r); + break; + } +} + +static int shstk_setup(void) +{ + struct thread_shstk *shstk = ¤t->thread.shstk; + unsigned long addr, size; + + /* Already enabled */ + if (features_enabled(CET_SHSTK)) + return 0; + + /* Also not supported for 32 bit and x32 */ + if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK) || in_32bit_syscall()) + return -EOPNOTSUPP; + + size = adjust_shstk_size(0); + addr = alloc_shstk(size); + if (IS_ERR_VALUE(addr)) + return PTR_ERR((void *)addr); + + fpregs_lock_and_load(); + wrmsrl(MSR_IA32_PL3_SSP, addr + size); + wrmsrl(MSR_IA32_U_CET, CET_SHSTK_EN); + fpregs_unlock(); + + shstk->base = addr; + shstk->size = size; + features_set(CET_SHSTK); + + return 0; +} + void reset_thread_features(void) { + memset(¤t->thread.shstk, 0, sizeof(struct thread_shstk)); current->thread.features = 0; current->thread.features_locked = 0; } +void shstk_free(struct task_struct *tsk) +{ + struct thread_shstk *shstk = &tsk->thread.shstk; + + if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK) || + !features_enabled(CET_SHSTK)) + return; + + if (!tsk->mm) + return; + + unmap_shadow_stack(shstk->base, shstk->size); +} + + +static int shstk_disable(void) +{ + if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK)) + return -EOPNOTSUPP; + + /* Already disabled? */ + if (!features_enabled(CET_SHSTK)) + return 0; + + fpregs_lock_and_load(); + /* Disable WRSS too when disabling shadow stack */ + set_clr_bits_msrl(MSR_IA32_U_CET, 0, CET_SHSTK_EN); + wrmsrl(MSR_IA32_PL3_SSP, 0); + fpregs_unlock(); + + shstk_free(current); + features_clr(CET_SHSTK); + + return 0; +} + long cet_prctl(struct task_struct *task, int option, unsigned long features) { if (option == ARCH_CET_LOCK) { -- 2.17.1