From: Dan Li
To: gcc-patches@gcc.gnu.org, Richard Sandiford, Masahiro Yamada,
    Michal Marek, Nick Desaulniers, Catalin Marinas, Will Deacon,
    Sami Tolvanen, Kees Cook, Nathan Chancellor, Tom Rix,
    Peter Zijlstra, "Paul E. McKenney", Mark Rutland, Josh Poimboeuf,
    Frederic Weisbecker, "Eric W. Biederman", Dan Li, Marco Elver,
    Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
    Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
    Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
    Kalesh Singh, Yuntao Wang, Changbin Du
Cc: linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, llvm@lists.linux.dev,
    linux-hardening@vger.kernel.org
Subject: [RFC/RFT,V2 3/3] [PR102768] aarch64: Add support for Kernel Control Flow Integrity
Date: Sat, 25 Mar 2023 01:11:17 -0700
Message-Id: <20230325081117.93245-4-ashimida.1990@gmail.com>
In-Reply-To: <20230325081117.93245-1-ashimida.1990@gmail.com>
References: <20221219055431.22596-1-ashimida.1990@gmail.com>
    <20230325081117.93245-1-ashimida.1990@gmail.com>

On the AArch64 platform, the typeid can be inserted directly in front
of the function header. The indirect-call check loads it from
patch_area_entry * 4 + 4 bytes before the call target, so
patch_area_entry is assumed to be the same for all functions.

Every function that will not be called indirectly gets the reserved
RESERVED_CFI_TYPEID (0x0) inserted in front of it as its typeid.
Otherwise, an attacker could use the instruction or data that happens
to precede such a function as a typeid to bypass CFI.

All typeids ignore some bits (& AARCH64_UNALLOCATED_INSN_MASK) to avoid
conflicts with the AArch64 instruction set (see AAPCS64 for details).
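The typeid rules described above can be sketched in standalone C. This
is only an illustration and not part of the patch: the three constants
are copied from the patch, raw_hash is assumed to stand in for the
value returned by unified_type_hash (), and the sketch_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RESERVED_CFI_TYPEID 0x0U
#define DEFAULT_CFI_TYPEID 0x00000ADAU
#define AARCH64_UNALLOCATED_INSN_MASK 0xE7FFFFFFU

/* Hypothetical helper: derive the typeid of an indirectly callable
   function from a raw 32-bit type hash.  The mask clears bits 27 and
   28, and a result of zero is remapped so that it never collides with
   RESERVED_CFI_TYPEID.  */
static uint32_t
sketch_calc_typeid (uint32_t raw_hash)
{
  uint32_t id = raw_hash & AARCH64_UNALLOCATED_INSN_MASK;

  if (id == RESERVED_CFI_TYPEID)
    id = DEFAULT_CFI_TYPEID;

  return id;
}

/* Hypothetical helper: signed byte offset of the typeid relative to
   the call target, assuming each patchable entry is one 4-byte
   instruction.  */
static long
sketch_typeid_offset (unsigned long patch_area_entry)
{
  assert (patch_area_entry < 1024);
  return -(long) (patch_area_entry * 4 + 4);
}
```

With no patchable entries the typeid is read from offset -4, which
matches the "ldur w16, [x1, #-4]" shown in the patch; a raw hash that
has only bits 27 and 28 set masks to zero and therefore becomes
DEFAULT_CFI_TYPEID.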
Signed-off-by: Dan Li

gcc/ChangeLog:

        * config/aarch64/aarch64.cc (RESERVED_CFI_TYPEID): Macro definition.
        (DEFAULT_CFI_TYPEID): Likewise.
        (AARCH64_UNALLOCATED_INSN_MASK): Likewise.
        (aarch64_calc_func_cfi_typeid): Platform-dependent CFI function.
        (cgraph_indirectly_callable): Determine whether a function may be
        called indirectly.
        (aarch64_output_func_kcfi_typeid): Platform-dependent CFI function.
        (aarch64_output_icall_kcfi_check): Likewise.
        (TARGET_HAVE_KCFI): New hook.
        (TARGET_CALC_FUNC_CFI_TYPEID): Likewise.
        (TARGET_ASM_OUTPUT_FUNC_KCFI_TYPEID): Likewise.
        (TARGET_ASM_OUTPUT_ICALL_KCFI_CHECK): Likewise.
        * doc/invoke.texi: Document -fsanitize=kcfi.
---
 gcc/config/aarch64/aarch64.cc | 166 ++++++++++++++++++++++++++++++++++
 gcc/doc/invoke.texi           |  36 ++++++++
 2 files changed, 202 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 5c9e7791a12..5b55541d437 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -5450,6 +5450,160 @@ aarch64_output_sve_addvl_addpl (rtx offset)
   return buffer;
 }
 
+/* Reserved for all functions that cannot be called indirectly.  */
+#define RESERVED_CFI_TYPEID 0x0U
+
+/* If the typeid of a function that can be called indirectly is equal to
+   RESERVED_CFI_TYPEID, change it to DEFAULT_CFI_TYPEID.  */
+#define DEFAULT_CFI_TYPEID 0x00000ADAU
+
+/* Mask of reserved and unallocated instructions in AArch64 platform.  */
+#define AARCH64_UNALLOCATED_INSN_MASK 0xE7FFFFFFU
+
+static unsigned int
+aarch64_calc_func_cfi_typeid (const_tree fntype)
+{
+  unsigned int hash;
+
+  /* The value of typeid has a probability of being the same as the encoding
+     of an instruction.  If the attacker can find the same encoding as the
+     typeid in the assembly code, then he has found a usable jump location.
+     So here, a platform-related mask is used when generating a typeid to
+     avoid such conflicts as much as possible.  */
+  hash = unified_type_hash (fntype) & AARCH64_UNALLOCATED_INSN_MASK;
+
+  /* RESERVED_CFI_TYPEID is reserved for functions that cannot
+     be called indirectly.  */
+  if (hash == RESERVED_CFI_TYPEID)
+    hash = DEFAULT_CFI_TYPEID;
+
+  return hash;
+}
+
+static bool
+cgraph_indirectly_callable (struct cgraph_node *node,
+                            void *data ATTRIBUTE_UNUSED)
+{
+  if (node->externally_visible || node->address_taken)
+    return true;
+
+  return false;
+}
+
+static void
+aarch64_output_func_kcfi_typeid (FILE * stream, tree decl)
+{
+  struct cgraph_node *node;
+  unsigned int cur_func_typeid;
+
+  node = cgraph_node::get (decl);
+
+  if (!node->call_for_symbol_thunks_and_aliases (cgraph_indirectly_callable,
+                                                 NULL, true))
+    /* CFI's typeid check always considers that there is a typeid before the
+       target function, so it is also necessary to output typeid for functions
+       that cannot be called indirectly to prevent attackers from bypassing
+       CFI by using instructions/data before those functions.
+       The typeid inserted before such a function is RESERVED_CFI_TYPEID,
+       and the calculation of the typeid must ensure that this value is always
+       reserved.  */
+    cur_func_typeid = RESERVED_CFI_TYPEID;
+  else
+    cur_func_typeid = aarch64_calc_func_cfi_typeid (TREE_TYPE (decl));
+
+  fprintf (stream, "__kcfi_%s:\n", get_name (decl));
+  fprintf (stream, "\t.4byte %#010x\n", cur_func_typeid);
+}
+
+/* This function outputs assembly instructions to check cfi typeid before
+   indirect call (blr Xn), which may destroy x16, x17, x9 registers (according
+   to the AAPCS64 specification, these registers do not need to be restored
+   after the function call).
+   The assembly code output by this function is as follows:
+        ldur w16, [x1, #-4]
+        movk w17, #13570
+        movk w17, #17309, lsl #16
+        cmp w16, w17
+        b.eq .Lkcfi8
+        brk #0x8221
+.Lkcfi8:
+        blr x1
+   */
+
+static void
+aarch64_output_icall_kcfi_check (rtx reg, unsigned int value)
+{
+  unsigned int addr_reg, scratch_reg1, scratch_reg2;
+  unsigned int esr, addr_index, type_index;
+  char label_buf[256];
+  const char *label_ptr;
+  unsigned HOST_WIDE_INT patch_area_entry = crtl->patch_area_entry;
+  rtx_code_label * tmp_label = gen_label_rtx ();
+
+  gcc_assert (GET_CODE (reg) == REG);
+
+  addr_reg = REGNO (reg);
+
+  /* The typeid read from the front of the callee is saved in the
+     register specified by scratch_reg1, the default is R16_REGNUM.  */
+  scratch_reg1 = R16_REGNUM;
+
+  /* The expected typeid of the caller is saved in the register
+     specified by scratch_reg2, which defaults to R17_REGNUM.  */
+  scratch_reg2 = R17_REGNUM;
+
+  gcc_assert (GP_REGNUM_P (addr_reg));
+
+  /* If one of the scratch registers is used for the call target,
+     we can clobber another caller-saved temporary register instead
+     (in this case, R9_REGNUM) as the check is immediately followed
+     by the call instruction.  */
+  if (addr_reg == R16_REGNUM)
+    {
+      scratch_reg1 = R9_REGNUM;
+    }
+  else if (addr_reg == R17_REGNUM)
+    {
+      scratch_reg2 = R9_REGNUM;
+    }
+
+  gcc_assert ((scratch_reg1 != addr_reg) && (scratch_reg2 != addr_reg));
+
+  ASM_GENERATE_INTERNAL_LABEL (label_buf, "Lkcfi",
+                               CODE_LABEL_NUMBER (tmp_label));
+  label_ptr = targetm.strip_name_encoding (label_buf);
+
+  /* The offset of callee's typeid needs to be adjusted according to
+     patch_area_entry.  This assumes that patch_area_entry is the
+     same for all functions.  */
+  fprintf (asm_out_file, "\tldur\tw%d, [x%d, #-%ld]\n",
+           scratch_reg1, addr_reg, patch_area_entry * 4 + 4);
+
+  fprintf (asm_out_file, "\tmovk\tw%d, #%d\n", scratch_reg2, value & 0xFFFF);
+
+  fprintf (asm_out_file, "\tmovk\tw%d, #%d, lsl #16\n",
+           scratch_reg2, (value >> 16) & 0xFFFF);
+
+  fprintf (asm_out_file, "\tcmp\tw%d, w%d\n", scratch_reg1, scratch_reg2);
+
+  fprintf (asm_out_file, "\tb.eq\t%s\n", label_ptr);
+
+  /* The base ESR for brk is 0x8000 and the register information is
+     encoded in bits 0-9 as follows:
+     - 0-4: n, where the register Xn contains the callee address
+     - 5-9: m, where the register Wm contains the expected typeid
+     Where n, m are in [0, 30].
+  */
+  addr_index = addr_reg - R0_REGNUM;
+  type_index = scratch_reg2 - R0_REGNUM;
+  esr = 0x8000 | ((type_index & 31) << 5) | (addr_index & 31);
+  fprintf (asm_out_file, "\tbrk\t#0x%x\n", esr);
+
+  fprintf (asm_out_file, "%s:\n", label_ptr);
+
+  return;
+}
+
 /* Return true if X is a valid immediate for an SVE vector INC or DEC
    instruction.  If it is, store the number of elements in each vector
    quadword in *NELTS_PER_VQ_OUT (if nonnull) and store the multiplication
@@ -27823,6 +27977,18 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_HAVE_SHADOW_CALL_STACK
 #define TARGET_HAVE_SHADOW_CALL_STACK true
 
+#undef TARGET_HAVE_KCFI
+#define TARGET_HAVE_KCFI true
+
+#undef TARGET_CALC_FUNC_CFI_TYPEID
+#define TARGET_CALC_FUNC_CFI_TYPEID aarch64_calc_func_cfi_typeid
+
+#undef TARGET_ASM_OUTPUT_FUNC_KCFI_TYPEID
+#define TARGET_ASM_OUTPUT_FUNC_KCFI_TYPEID aarch64_output_func_kcfi_typeid
+
+#undef TARGET_ASM_OUTPUT_ICALL_KCFI_CHECK
+#define TARGET_ASM_OUTPUT_ICALL_KCFI_CHECK aarch64_output_icall_kcfi_check
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-aarch64.h"
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ff6c338bedb..1b2ba7a0f29 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15736,6 +15736,42 @@ to turn off exceptions.
 See @uref{https://clang.llvm.org/docs/ShadowCallStack.html} for more
 details.
 
+@item -fsanitize=kcfi
+@opindex fsanitize=kcfi
+The KCFI sanitizer, enabled with @option{-fsanitize=kcfi}, implements a
+forward-edge control flow integrity scheme for indirect calls.  It
+attaches a type identifier (@code{typeid}) to each function and injects
+verification code before indirect calls.
+
+A @code{typeid} is a 32-bit constant; its value is mainly related to the
+return value type and all parameter types of the function, and is invariant
+for each compilation.  Since the value of @code{typeid} may conflict with
+the instruction set encoding of the current platform, some bits may be
+ignored on different platforms.
+
+At compile time, the compiler inserts checking code on all indirect calls,
+and at run time, before any indirect call occurs, the code checks that
+the @code{typeid} before the callee function matches the @code{typeid}
+requested by the caller.  If the match fails, an exception instruction
+will be triggered, such as a @code{brk} on aarch64.  This mechanism is
+mainly designed for low-level code, such as operating systems, and the
+system needs to handle those exceptions by itself.
+
+If a program contains indirect calls to assembly functions, they must be
+manually annotated with the expected type identifiers to prevent errors.
+To make this easier, CFI generates a weak SHN_ABS
+@code{__kcfi_typeid_} symbol for each address-taken function
+declaration, which can be used to annotate functions in assembly as long
+as at least one C translation unit linked into the program takes the
+function address.
+
+Currently this feature only supports the aarch64 platform, mainly for
+the Linux kernel.  Users who want to use this feature in other systems
+need to provide their own support for the exception handling.
+
+See @uref{https://clang.llvm.org/docs/ControlFlowIntegrity.html} for
+more details.
+
 @item -fsanitize=thread
 @opindex fsanitize=thread
 Enable ThreadSanitizer, a fast data race detector.
-- 
2.17.1
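
As a cross-check of the brk immediate layout used in
aarch64_output_icall_kcfi_check, the encoding can be reproduced in a
few lines of standalone C. This is a sketch for illustration only: the
sketch_* names are hypothetical and do not exist in GCC or the kernel,
and the decode helpers merely invert the bit layout stated in the
patch comment.

```c
#include <assert.h>

/* Hypothetical helper: build the brk immediate from the index n of the
   register Xn holding the callee address and the index m of the
   register Wm holding the expected typeid (both in [0, 30]).  */
static unsigned int
sketch_kcfi_brk_imm (unsigned int addr_index, unsigned int type_index)
{
  assert (addr_index <= 30 && type_index <= 30);
  return 0x8000 | ((type_index & 31) << 5) | (addr_index & 31);
}

/* Hypothetical helper: recover n (callee address register) from the
   immediate, i.e. bits 0-4.  */
static unsigned int
sketch_kcfi_addr_reg (unsigned int imm)
{
  return imm & 31;
}

/* Hypothetical helper: recover m (expected typeid register) from the
   immediate, i.e. bits 5-9.  */
static unsigned int
sketch_kcfi_type_reg (unsigned int imm)
{
  return (imm >> 5) & 31;
}
```

For the sequence shown in the patch comment (callee address in x1,
expected typeid in w17), sketch_kcfi_brk_imm (1, 17) evaluates to
0x8000 | (17 << 5) | 1 = 0x8221, which agrees with the "brk #0x8221"
in that example. A trap handler could apply the two decode helpers to
the immediate to find which registers to report.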