* [PATCH v2 1/7] mangle: Introduce C typeinfo mangling API
2025-09-05 0:24 [PATCH v2 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
@ 2025-09-05 0:24 ` Kees Cook
2025-09-05 0:50 ` Andrew Pinski
2025-09-05 0:24 ` [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure Kees Cook
` (5 subsequent siblings)
6 siblings, 1 reply; 32+ messages in thread
From: Kees Cook @ 2025-09-05 0:24 UTC (permalink / raw)
To: Qing Zhao
Cc: Kees Cook, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
KCFI needs to convert unique function prototypes into unique 32-bit
type-id values. To support this, add a subset of the Itanium C++
mangling ABI covering C typeinfo for function prototypes, and hash
the resulting mangled string to produce the 32-bit value KCFI needs
for a given function prototype. Optionally report the mangled string
to the dump file.
Extracting only the C portions of the gcc/cp/mangle.cc code proved
infeasible after a few attempts, so this is the minimal subset of the
mangling ABI needed to generate unique KCFI type ids.
I could not find a way to build a sensible selftest infrastructure for
this code. I wanted to do something like this:
#ifdef CHECKING_P
const char code[] = "
typedef struct { int x, y; } xy_t;
extern int func(xy_t *p);
";
ASSERT_MANGLE (code, "_ZTSPFiP4xy_tE");
...
#endif
But I could not find any way to build a localized parser that could
parse the "code" string from which I could extract the "func" fndecl.
It would have been so much nicer to build the selftest directly into
mangle.cc here, but I couldn't figure it out. Instead, later patches
create a "kcfi" dump file, and the large kcfi testsuite validates
expected mangled strings as part of the type-id validation.
gcc/ChangeLog:
* Makefile.in: Add mangle.o to build.
* mangle.cc: New file. Implement C typeinfo mangling for KCFI.
* mangle.h: New file. Export hash_function_type function.
Signed-off-by: Kees Cook <kees@kernel.org>
---
gcc/Makefile.in | 1 +
gcc/mangle.h | 32 +++
gcc/mangle.cc | 512 ++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 545 insertions(+)
create mode 100644 gcc/mangle.h
create mode 100644 gcc/mangle.cc
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index d2744db843d7..4c12ac68d979 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1617,6 +1617,7 @@ OBJS = \
lto-section-out.o \
lto-opts.o \
lto-compress.o \
+ mangle.o \
mcf.o \
mode-switching.o \
modulo-sched.o \
diff --git a/gcc/mangle.h b/gcc/mangle.h
new file mode 100644
index 000000000000..fe7916dd68e0
--- /dev/null
+++ b/gcc/mangle.h
@@ -0,0 +1,32 @@
+/* Itanium C++ ABI type mangling for GCC.
+ Copyright (C) 2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_MANGLE_H
+#define GCC_MANGLE_H
+
+#include "tree.h"
+#include <string>
+
+/* Function type hashing following Itanium C++ ABI conventions.
+ Returns the FNV-1a hash of the mangled type string.
+ Builds the actual string only if dump is active for debugging.
+ Optional fndecl parameter provides function context for error reporting. */
+extern uint32_t hash_function_type (tree fntype, tree fndecl = NULL_TREE);
+
+#endif /* GCC_MANGLE_H */
diff --git a/gcc/mangle.cc b/gcc/mangle.cc
new file mode 100644
index 000000000000..8f177a415e15
--- /dev/null
+++ b/gcc/mangle.cc
@@ -0,0 +1,512 @@
+/* Itanium C++ ABI type mangling for GCC.
+ Copyright (C) 2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "diagnostic-core.h"
+#include "stringpool.h"
+#include "stor-layout.h"
+#include "mangle.h"
+#include "selftest.h"
+#include "dumpfile.h"
+#include "print-tree.h"
+
+/* Current function context for better error reporting. */
+static tree current_function_context = NULL_TREE;
+
+/* Helper to update FNV-1a hash with a single character. */
+static inline void
+fnv1a_hash_char (uint32_t *hash_state, unsigned char c)
+{
+ *hash_state ^= c;
+ *hash_state *= 16777619U; /* FNV-1a 32-bit prime. */
+}
+
+/* Helper to append character to optional string and update hash using FNV-1a. */
+static void
+append_char (char c, std::string *out_str, uint32_t *hash_state)
+{
+ if (out_str)
+ *out_str += c;
+ fnv1a_hash_char (hash_state, (unsigned char) c);
+}
+
+/* Helper to append string to optional string and update hash using FNV-1a. */
+static void
+append_string (const char *str, std::string *out_str, uint32_t *hash_state)
+{
+ if (out_str)
+ *out_str += str;
+ for (const char *p = str; *p; p++)
+ fnv1a_hash_char (hash_state, (unsigned char) *p);
+}
+
+/* Forward declaration for recursive type mangling. */
+static void mangle_type (tree type, std::string *out_str, uint32_t *hash_state);
+
+/* Mangle a builtin type following Itanium C++ ABI for C types. */
+static void
+mangle_builtin_type (tree type, std::string *out_str, uint32_t *hash_state)
+{
+ gcc_assert (type != NULL_TREE);
+
+ switch (TREE_CODE (type))
+ {
+ case VOID_TYPE:
+ append_char ('v', out_str, hash_state);
+ return;
+
+ case BOOLEAN_TYPE:
+ append_char ('b', out_str, hash_state);
+ return;
+
+ case INTEGER_TYPE:
+ /* Handle standard integer types using Itanium ABI codes. */
+ if (type == char_type_node)
+ append_char ('c', out_str, hash_state);
+ else if (type == signed_char_type_node)
+ append_char ('a', out_str, hash_state);
+ else if (type == unsigned_char_type_node)
+ append_char ('h', out_str, hash_state);
+ else if (type == short_integer_type_node)
+ append_char ('s', out_str, hash_state);
+ else if (type == short_unsigned_type_node)
+ append_char ('t', out_str, hash_state);
+ else if (type == integer_type_node)
+ append_char ('i', out_str, hash_state);
+ else if (type == unsigned_type_node)
+ append_char ('j', out_str, hash_state);
+ else if (type == long_integer_type_node)
+ append_char ('l', out_str, hash_state);
+ else if (type == long_unsigned_type_node)
+ append_char ('m', out_str, hash_state);
+ else if (type == long_long_integer_type_node)
+ append_char ('x', out_str, hash_state);
+ else if (type == long_long_unsigned_type_node)
+ append_char ('y', out_str, hash_state);
+ else
+ {
+ /* Fallback for other integer types - use precision-based encoding. */
+ append_char ('i', out_str, hash_state);
+ append_string (std::to_string (TYPE_PRECISION (type)).c_str (), out_str, hash_state);
+ }
+ return;
+
+ case REAL_TYPE:
+ if (type == float_type_node)
+ append_char ('f', out_str, hash_state);
+ else if (type == double_type_node)
+ append_char ('d', out_str, hash_state);
+ else if (type == long_double_type_node)
+ append_char ('e', out_str, hash_state);
+ else
+ {
+ /* Fallback for other real types. */
+ append_char ('f', out_str, hash_state);
+ append_string (std::to_string (TYPE_PRECISION (type)).c_str (), out_str, hash_state);
+ }
+ return;
+
+ case VECTOR_TYPE:
+ {
+ /* Handle vector types following Itanium C++ ABI:
+ Dv<num-elements>_<element-type-encoding>
+ Example: uint8x16_t → Dv16_h (vector of 16 unsigned char) */
+ tree vector_size = TYPE_SIZE_UNIT (type);
+ tree element_type = TREE_TYPE (type);
+ tree element_size = TYPE_SIZE_UNIT (element_type);
+
+ if (vector_size && element_size &&
+ TREE_CODE (vector_size) == INTEGER_CST &&
+ TREE_CODE (element_size) == INTEGER_CST)
+ {
+ append_char ('D', out_str, hash_state);
+ append_char ('v', out_str, hash_state);
+
+ unsigned HOST_WIDE_INT vec_bytes = tree_to_uhwi (vector_size);
+ unsigned HOST_WIDE_INT elem_bytes = tree_to_uhwi (element_size);
+ unsigned HOST_WIDE_INT num_elements = vec_bytes / elem_bytes;
+
+ /* Append number of elements. */
+ append_string (std::to_string (num_elements).c_str (), out_str, hash_state);
+ append_char ('_', out_str, hash_state);
+
+ /* Recursively mangle the element type. */
+ mangle_type (element_type, out_str, hash_state);
+ return;
+ }
+ /* Fail for vectors with unknown size. */
+ }
+ break;
+
+ default:
+ break;
+ }
+
+ /* Unknown builtin type - this should never happen in a well-formed C program. */
+ debug_tree (type);
+ internal_error ("mangle: unknown builtin type in function %qD",
+		  current_function_context);
+}
+
+/* Canonicalize typedef types to their underlying named struct/union types. */
+static tree
+canonicalize_typedef_type (tree type)
+{
+ /* Handle typedef types - canonicalize to named structs when possible. */
+ if (TYPE_NAME (type) && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL)
+ {
+ tree type_decl = TYPE_NAME (type);
+
+ /* Check if this is a typedef (not the original struct declaration) */
+ if (DECL_ORIGINAL_TYPE (type_decl))
+ {
+ tree original_type = DECL_ORIGINAL_TYPE (type_decl);
+
+ /* If the original type is a named struct/union/enum, use that instead. */
+ if ((TREE_CODE (original_type) == RECORD_TYPE
+ || TREE_CODE (original_type) == UNION_TYPE
+ || TREE_CODE (original_type) == ENUMERAL_TYPE)
+ && TYPE_NAME (original_type)
+ && ((TREE_CODE (TYPE_NAME (original_type)) == TYPE_DECL
+ && DECL_NAME (TYPE_NAME (original_type)))
+ || TREE_CODE (TYPE_NAME (original_type)) == IDENTIFIER_NODE))
+ {
+ /* Recursively canonicalize in case the original type is also a typedef. */
+ return canonicalize_typedef_type (original_type);
+ }
+
+ /* For basic type typedefs (e.g., u8 -> unsigned char), canonicalize to original type. */
+ if (TREE_CODE (original_type) == INTEGER_TYPE
+ || TREE_CODE (original_type) == REAL_TYPE
+ || TREE_CODE (original_type) == POINTER_TYPE
+ || TREE_CODE (original_type) == ARRAY_TYPE
+ || TREE_CODE (original_type) == FUNCTION_TYPE
+ || TREE_CODE (original_type) == METHOD_TYPE
+ || TREE_CODE (original_type) == BOOLEAN_TYPE
+ || TREE_CODE (original_type) == COMPLEX_TYPE
+ || TREE_CODE (original_type) == VECTOR_TYPE)
+ {
+ /* Recursively canonicalize in case the original type is also a typedef. */
+ return canonicalize_typedef_type (original_type);
+ }
+ }
+ }
+
+ return type;
+}
+
+/* Recursively mangle a type following Itanium C++ ABI conventions. */
+static void
+mangle_type (tree type, std::string *out_str, uint32_t *hash_state)
+{
+ gcc_assert (type != NULL_TREE);
+
+ /* Canonicalize typedef types to their underlying named struct types. */
+ type = canonicalize_typedef_type (type);
+
+ switch (TREE_CODE (type))
+ {
+ case POINTER_TYPE:
+ {
+ /* Pointer type: 'P' + qualifiers + pointed-to type. */
+ append_char ('P', out_str, hash_state);
+
+ /* Add qualifiers to the pointed-to type following Itanium C++ ABI ordering. */
+ tree pointed_to_type = TREE_TYPE (type);
+ if (TYPE_QUALS (pointed_to_type) != TYPE_UNQUALIFIED)
+ {
+ /* Emit qualifiers in Itanium ABI order: restrict, volatile, const. */
+ if (TYPE_QUALS (pointed_to_type) & TYPE_QUAL_RESTRICT)
+ append_char ('r', out_str, hash_state);
+ if (TYPE_QUALS (pointed_to_type) & TYPE_QUAL_VOLATILE)
+ append_char ('V', out_str, hash_state);
+ if (TYPE_QUALS (pointed_to_type) & TYPE_QUAL_CONST)
+ append_char ('K', out_str, hash_state);
+ }
+
+ /* For KCFI's hybrid type system: preserve typedef names for compound types,
+ but use canonical forms for primitive types. */
+ tree target_type;
+ if (TREE_CODE (pointed_to_type) == RECORD_TYPE
+ || TREE_CODE (pointed_to_type) == UNION_TYPE
+ || TREE_CODE (pointed_to_type) == ENUMERAL_TYPE)
+ {
+ /* Compound type: preserve typedef information by using original type. */
+ target_type = pointed_to_type;
+ }
+ else
+ {
+ /* Primitive type: use canonical form to ensure structural typing. */
+ target_type = TYPE_MAIN_VARIANT (pointed_to_type);
+ }
+ mangle_type (target_type, out_str, hash_state);
+ break;
+ }
+
+ case ARRAY_TYPE:
+ /* Array type: 'A' + size + '_' + element type (simplified). */
+ append_char ('A', out_str, hash_state);
+ if (TYPE_DOMAIN (type) && TYPE_MAX_VALUE (TYPE_DOMAIN (type)))
+ {
+ tree max_val = TYPE_MAX_VALUE (TYPE_DOMAIN (type));
+ /* Check if array size is a compile-time constant to handle VLAs safely. */
+ if (TREE_CODE (max_val) == INTEGER_CST && tree_fits_shwi_p (max_val))
+ {
+ HOST_WIDE_INT size = tree_to_shwi (max_val) + 1;
+ append_string (std::to_string (size).c_str (), out_str, hash_state);
+ }
+ /* For VLAs or non-constant dimensions, emit empty size (A_). */
+ append_char ('_', out_str, hash_state);
+ }
+ else
+ {
+ /* No domain or no max value - emit A_. */
+ append_char ('_', out_str, hash_state);
+ }
+ mangle_type (TREE_TYPE (type), out_str, hash_state);
+ break;
+
+ case REFERENCE_TYPE:
+ /* Reference type: 'R' + referenced type.
+ Note: We must handle references to builtin types including compiler
+ builtins like __builtin_va_list used in functions like va_start. */
+ append_char ('R', out_str, hash_state);
+ mangle_type (TREE_TYPE (type), out_str, hash_state);
+ break;
+
+ case FUNCTION_TYPE:
+ {
+ /* Function type: 'F' + return type + parameter types + 'E' */
+ append_char ('F', out_str, hash_state);
+ mangle_type (TREE_TYPE (type), out_str, hash_state);
+
+ /* Add parameter types. */
+ tree param_types = TYPE_ARG_TYPES (type);
+
+ if (param_types == NULL_TREE)
+ {
+ /* func() - no parameter list (could be variadic). */
+ }
+ else
+ {
+ bool found_real_params = false;
+ for (tree param = param_types; param; param = TREE_CHAIN (param))
+ {
+ tree param_type = TREE_VALUE (param);
+ if (param_type == void_type_node)
+ {
+ /* Check if this is the first parameter (explicit void) or a sentinel */
+ if (!found_real_params)
+ {
+ /* func(void) - explicit empty parameter list.
+ Mangle void to distinguish from variadic func(). */
+ mangle_type (void_type_node, out_str, hash_state);
+ }
+ /* If we found real params before this void, it's a sentinel - stop */
+ break;
+ }
+
+ found_real_params = true;
+
+ /* For value parameters, ignore const/volatile qualifiers as they
+ don't affect the calling convention. const int and int are
+ passed identically by value. */
+ tree canonical_param_type = param_type;
+ if (TREE_CODE (param_type) != POINTER_TYPE
+ && TREE_CODE (param_type) != REFERENCE_TYPE
+ && TREE_CODE (param_type) != ARRAY_TYPE)
+ {
+ /* Strip qualifiers for non-pointer/reference value parameters. */
+ canonical_param_type = TYPE_MAIN_VARIANT (param_type);
+ }
+
+ mangle_type (canonical_param_type, out_str, hash_state);
+ }
+ }
+
+ /* Check if this is a variadic function and add 'z' marker. */
+ if (stdarg_p (type))
+ {
+ append_char ('z', out_str, hash_state);
+ }
+
+ append_char ('E', out_str, hash_state);
+ break;
+ }
+
+ case RECORD_TYPE:
+ case UNION_TYPE:
+ case ENUMERAL_TYPE:
+ {
+ /* Struct/union/enum: use simplified representation for C types. */
+ const char *name = NULL;
+
+ if (TYPE_NAME (type))
+ {
+ if (TREE_CODE (TYPE_NAME (type)) == TYPE_DECL)
+ {
+ /* TYPE_DECL case: both named structs and typedef structs. */
+ tree decl_name = DECL_NAME (TYPE_NAME (type));
+ if (decl_name && TREE_CODE (decl_name) == IDENTIFIER_NODE)
+ {
+ name = IDENTIFIER_POINTER (decl_name);
+ }
+ }
+ else if (TREE_CODE (TYPE_NAME (type)) == IDENTIFIER_NODE)
+ {
+ /* Direct identifier case. */
+ name = IDENTIFIER_POINTER (TYPE_NAME (type));
+ }
+ }
+
+ /* If no name found through normal extraction, handle anonymous types following Itanium C++ ABI. */
+ if (!name && !TYPE_NAME (type))
+ {
+ static char anon_name[128];
+
+ if (TREE_CODE (type) == UNION_TYPE)
+ {
+ /* For anonymous unions, try to find first named field (Itanium ABI approach). */
+ tree field = TYPE_FIELDS (type);
+ while (field && !DECL_NAME (field))
+ field = DECL_CHAIN (field);
+
+ if (field && DECL_NAME (field))
+ {
+ const char *field_name = IDENTIFIER_POINTER (DECL_NAME (field));
+ snprintf (anon_name, sizeof(anon_name), "anon_union_by_%s", field_name);
+ }
+ else
+ {
+ /* No named fields - use Itanium-style Ut encoding. */
+ snprintf (anon_name, sizeof(anon_name), "Ut_unnamed_union");
+ }
+ }
+ else
+ {
+ /* For anonymous structs/enums, use Itanium-style Ut encoding
+ with layout info for discrimination. */
+ const char *type_prefix = "";
+ if (TREE_CODE (type) == RECORD_TYPE)
+ type_prefix = "struct";
+ else if (TREE_CODE (type) == ENUMERAL_TYPE)
+ type_prefix = "enum";
+
+ /* Include size and field layout for better discrimination. */
+ HOST_WIDE_INT size = 0;
+ if (TYPE_SIZE (type) && tree_fits_shwi_p (TYPE_SIZE (type)))
+ size = tree_to_shwi (TYPE_SIZE (type));
+
+ /* Generate a hash based on field layout to distinguish same-sized
+ anonymous types. */
+ unsigned layout_hash = 0;
+ if (TREE_CODE (type) == RECORD_TYPE)
+ {
+ for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+ {
+ if (TREE_CODE (field) == FIELD_DECL)
+ {
+ /* Hash field offset and type. */
+ if (DECL_FIELD_OFFSET (field) && tree_fits_shwi_p (DECL_FIELD_OFFSET (field)))
+ {
+ HOST_WIDE_INT offset = tree_to_shwi (DECL_FIELD_OFFSET (field));
+ layout_hash = layout_hash * 31 + (unsigned)offset;
+ }
+
+ /* Hash field type. */
+ tree field_type = TREE_TYPE (field);
+ if (field_type && TYPE_MODE (field_type) != VOIDmode)
+ layout_hash = layout_hash * 37 + (unsigned)TYPE_MODE (field_type);
+ }
+ }
+ }
+
+ if (layout_hash != 0)
+ snprintf (anon_name, sizeof (anon_name), "Ut_%s_%lld_%x",
+ type_prefix, (long long) size, layout_hash);
+ else
+ snprintf (anon_name, sizeof (anon_name), "Ut_%s_%lld",
+ type_prefix, (long long) size);
+ }
+
+ name = anon_name;
+ }
+
+ if (name)
+ {
+ append_string (std::to_string (strlen (name)).c_str (), out_str, hash_state);
+ append_string (name, out_str, hash_state);
+ }
+ else
+ {
+ /* Always show diagnostic information for missing struct names. */
+ debug_tree (type);
+ internal_error ("mangle: missing case in struct name extraction");
+ }
+ break;
+ }
+
+ default:
+ /* Handle builtin types. */
+ mangle_builtin_type (type, out_str, hash_state);
+ break;
+ }
+}
+
+/* Compute canonical function type hash using Itanium C++ ABI mangling. */
+uint32_t
+hash_function_type (tree fntype, tree fndecl)
+{
+ gcc_assert (fntype);
+ gcc_assert (TREE_CODE (fntype) == FUNCTION_TYPE);
+
+ std::string result;
+ std::string *out_str = nullptr;
+ uint32_t hash_state = 2166136261U; /* FNV-1a 32-bit offset basis. */
+
+ /* Only build string if dump is active. */
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ result.reserve (32);
+ out_str = &result;
+ }
+
+ /* Store function context for error reporting. */
+ current_function_context = fndecl;
+
+ /* Typeinfo for a function prototype. */
+ append_string ("_ZTS", out_str, &hash_state);
+
+ mangle_type (fntype, out_str, &hash_state);
+
+ /* Clear function context. */
+ current_function_context = NULL_TREE;
+
+ /* Output to dump file if enabled. */
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "KCFI type ID: mangled='%s' typeid=0x%08x\n",
+ result.c_str (), hash_state);
+ }
+
+ return hash_state;
+}
--
2.34.1
^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 1/7] mangle: Introduce C typeinfo mangling API
2025-09-05 0:24 ` [PATCH v2 1/7] mangle: Introduce C typeinfo mangling API Kees Cook
@ 2025-09-05 0:50 ` Andrew Pinski
2025-09-05 1:09 ` Kees Cook
0 siblings, 1 reply; 32+ messages in thread
From: Andrew Pinski @ 2025-09-05 0:50 UTC (permalink / raw)
To: Kees Cook
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
On Thu, Sep 4, 2025 at 5:27 PM Kees Cook <kees@kernel.org> wrote:
>
> KCFI needs to convert unique function prototypes into unique 32-bit
> type-id values. To support this, add a subset of the Itanium C++
> mangling ABI covering C typeinfo for function prototypes, and hash
> the resulting mangled string to produce the 32-bit value KCFI needs
> for a given function prototype. Optionally report the mangled string
> to the dump file.
>
> Trying to extract only the C portions of the gcc/cp/mangle.cc code
> seemed infeasible after a few attempts. So this is the minimal subset
> of the mangling ABI needed to generate unique KCFI type ids.
>
> I could not find a way to build a sensible selftest infrastructure for
> this code. I wanted to do something like this:
>
> #ifdef CHECKING_P
> const char code[] = "
> typedef struct { int x, y; } xy_t;
> extern int func(xy_t *p);
> ";
>
> ASSERT_MANGLE (code, "_ZTSPFiP4xy_tE");
> ...
> #endif
>
> But I could not find any way to build a localized parser that could
> parse the "code" string from which I could extract the "func" fndecl.
> It would have been so much nicer to build the selftest directly into
> mangle.cc here, but I couldn't figure it out. Instead, later patches
> create a "kcfi" dump file, and the large kcfi testsuite validates
> expected mangle strings as part of the type-id validation.
>
> gcc/ChangeLog:
>
> * Makefile.in: Add mangle.o to build.
> * mangle.cc: New file. Implement C typeinfo mangling for KCFI.
> * mangle.h: New file. Export hash_function_type function.
>
> Signed-off-by: Kees Cook <kees@kernel.org>
> ---
> gcc/Makefile.in | 1 +
> gcc/mangle.h | 32 +++
> gcc/mangle.cc | 512 ++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 545 insertions(+)
> create mode 100644 gcc/mangle.h
> create mode 100644 gcc/mangle.cc
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index d2744db843d7..4c12ac68d979 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1617,6 +1617,7 @@ OBJS = \
> lto-section-out.o \
> lto-opts.o \
> lto-compress.o \
> + mangle.o \
> mcf.o \
> mode-switching.o \
> modulo-sched.o \
> diff --git a/gcc/mangle.h b/gcc/mangle.h
> new file mode 100644
> index 000000000000..fe7916dd68e0
> --- /dev/null
> +++ b/gcc/mangle.h
> @@ -0,0 +1,32 @@
> +/* Itanium C++ ABI type mangling for GCC.
> + Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3. If not see
> +<http://www.gnu.org/licenses/>. */
> +
> +#ifndef GCC_MANGLE_H
> +#define GCC_MANGLE_H
> +
> +#include "tree.h"
> +#include <string>
> +
> +/* Function type hashing following Itanium C++ ABI conventions.
> + Returns the FNV-1a hash of the mangled type string.
> + Builds the actual string only if dump is active for debugging.
> + Optional fndecl parameter provides function context for error reporting. */
> +extern uint32_t hash_function_type (tree fntype, tree fndecl = NULL_TREE);
> +
> +#endif /* GCC_MANGLE_H */
> diff --git a/gcc/mangle.cc b/gcc/mangle.cc
> new file mode 100644
> index 000000000000..8f177a415e15
> --- /dev/null
> +++ b/gcc/mangle.cc
> @@ -0,0 +1,512 @@
> +/* Itanium C++ ABI type mangling for GCC.
> + Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3. If not see
> +<http://www.gnu.org/licenses/>. */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tree.h"
> +#include "diagnostic-core.h"
> +#include "stringpool.h"
> +#include "stor-layout.h"
> +#include "mangle.h"
> +#include "selftest.h"
> +#include "dumpfile.h"
> +#include "print-tree.h"
> +
> +/* Current function context for better error reporting. */
> +static tree current_function_context = NULL_TREE;
> +
> +/* Helper to update FNV-1a hash with a single character. */
> +static inline void
> +fnv1a_hash_char (uint32_t *hash_state, unsigned char c)
> +{
> + *hash_state ^= c;
> + *hash_state *= 16777619U; /* FNV-1a 32-bit prime. */
> +}
> +
> +/* Helper to append character to optional string and update hash using FNV-1a. */
> +static void
> +append_char (char c, std::string *out_str, uint32_t *hash_state)
> +{
> + if (out_str)
> + *out_str += c;
> + fnv1a_hash_char (hash_state, (unsigned char) c);
> +}
> +
> +/* Helper to append string to optional string and update hash using FNV-1a. */
> +static void
> +append_string (const char *str, std::string *out_str, uint32_t *hash_state)
> +{
> + if (out_str)
> + *out_str += str;
> + for (const char *p = str; *p; p++)
> + fnv1a_hash_char (hash_state, (unsigned char) *p);
> +}
> +
> +/* Forward declaration for recursive type mangling. */
> +static void mangle_type (tree type, std::string *out_str, uint32_t *hash_state);
> +
> +/* Mangle a builtin type following Itanium C++ ABI for C types. */
> +static void
> +mangle_builtin_type (tree type, std::string *out_str, uint32_t *hash_state)
> +{
> + gcc_assert (type != NULL_TREE);
> +
> + switch (TREE_CODE (type))
> + {
> + case VOID_TYPE:
> + append_char ('v', out_str, hash_state);
> + return;
> +
> + case BOOLEAN_TYPE:
> + append_char ('b', out_str, hash_state);
> + return;
> +
> + case INTEGER_TYPE:
> + /* Handle standard integer types using Itanium ABI codes. */
> + if (type == char_type_node)
> + append_char ('c', out_str, hash_state);
> + else if (type == signed_char_type_node)
> + append_char ('a', out_str, hash_state);
> + else if (type == unsigned_char_type_node)
> + append_char ('h', out_str, hash_state);
> + else if (type == short_integer_type_node)
> + append_char ('s', out_str, hash_state);
> + else if (type == short_unsigned_type_node)
> + append_char ('t', out_str, hash_state);
> + else if (type == integer_type_node)
> + append_char ('i', out_str, hash_state);
> + else if (type == unsigned_type_node)
> + append_char ('j', out_str, hash_state);
> + else if (type == long_integer_type_node)
> + append_char ('l', out_str, hash_state);
> + else if (type == long_unsigned_type_node)
> + append_char ('m', out_str, hash_state);
> + else if (type == long_long_integer_type_node)
> + append_char ('x', out_str, hash_state);
> + else if (type == long_long_unsigned_type_node)
> + append_char ('y', out_str, hash_state);
> + else
> + {
> + /* Fallback for other integer types - use precision-based encoding. */
> + append_char ('i', out_str, hash_state);
> + append_string (std::to_string (TYPE_PRECISION (type)).c_str (), out_str, hash_state);
> + }
> + return;
> +
> + case REAL_TYPE:
> + if (type == float_type_node)
> + append_char ('f', out_str, hash_state);
> + else if (type == double_type_node)
> + append_char ('d', out_str, hash_state);
> + else if (type == long_double_type_node)
> + append_char ('e', out_str, hash_state);
> + else
> + {
> + /* Fallback for other real types. */
> + append_char ('f', out_str, hash_state);
> + append_string (std::to_string (TYPE_PRECISION (type)).c_str (), out_str, hash_state);
> + }
> + return;
> +
> + case VECTOR_TYPE:
> + {
> + /* Handle vector types following Itanium C++ ABI:
> + Dv<num-elements>_<element-type-encoding>
> + Example: uint8x16_t → Dv16_h (vector of 16 unsigned char) */
> + tree vector_size = TYPE_SIZE_UNIT (type);
> + tree element_type = TREE_TYPE (type);
> + tree element_size = TYPE_SIZE_UNIT (element_type);
> +
> + if (vector_size && element_size &&
> + TREE_CODE (vector_size) == INTEGER_CST &&
> + TREE_CODE (element_size) == INTEGER_CST)
> + {
> + append_char ('D', out_str, hash_state);
> + append_char ('v', out_str, hash_state);
> +
> + unsigned HOST_WIDE_INT vec_bytes = tree_to_uhwi (vector_size);
> + unsigned HOST_WIDE_INT elem_bytes = tree_to_uhwi (element_size);
> + unsigned HOST_WIDE_INT num_elements = vec_bytes / elem_bytes;
> +
> + /* Append number of elements. */
> + append_string (std::to_string (num_elements).c_str (), out_str, hash_state);
> + append_char ('_', out_str, hash_state);
> +
> + /* Recursively mangle the element type. */
> + mangle_type (element_type, out_str, hash_state);
> + return;
> + }
> + /* Fail for vectors with unknown size. */
> + }
> + break;
> +
> + default:
> + break;
> + }
> +
> + /* Unknown builtin type - this should never happen in a well-formed C program. */
> + debug_tree (type);
> + internal_error ("mangle: Unknown builtin type in function %qD - please report this as a bug",
> + current_function_context);
This should NOT be internal_error but rather sorry.
> +}
> +
> +/* Canonicalize typedef types to their underlying named struct/union types. */
> +static tree
> +canonicalize_typedef_type (tree type)
> +{
> + /* Handle typedef types - canonicalize to named structs when possible. */
> + if (TYPE_NAME (type) && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL)
> + {
> + tree type_decl = TYPE_NAME (type);
> +
> + /* Check if this is a typedef (not the original struct declaration) */
> + if (DECL_ORIGINAL_TYPE (type_decl))
> + {
> + tree original_type = DECL_ORIGINAL_TYPE (type_decl);
> +
> + /* If the original type is a named struct/union/enum, use that instead. */
> + if ((TREE_CODE (original_type) == RECORD_TYPE
> + || TREE_CODE (original_type) == UNION_TYPE
> + || TREE_CODE (original_type) == ENUMERAL_TYPE)
> + && TYPE_NAME (original_type)
> + && ((TREE_CODE (TYPE_NAME (original_type)) == TYPE_DECL
> + && DECL_NAME (TYPE_NAME (original_type)))
> + || TREE_CODE (TYPE_NAME (original_type)) == IDENTIFIER_NODE))
> + {
> + /* Recursively canonicalize in case the original type is also a typedef. */
> + return canonicalize_typedef_type (original_type);
> + }
> +
> + /* For basic type typedefs (e.g., u8 -> unsigned char), canonicalize to original type. */
> + if (TREE_CODE (original_type) == INTEGER_TYPE
> + || TREE_CODE (original_type) == REAL_TYPE
> + || TREE_CODE (original_type) == POINTER_TYPE
> + || TREE_CODE (original_type) == ARRAY_TYPE
> + || TREE_CODE (original_type) == FUNCTION_TYPE
> + || TREE_CODE (original_type) == METHOD_TYPE
> + || TREE_CODE (original_type) == BOOLEAN_TYPE
> + || TREE_CODE (original_type) == COMPLEX_TYPE
> + || TREE_CODE (original_type) == VECTOR_TYPE)
> + {
> + /* Recursively canonicalize in case the original type is also a typedef. */
> + return canonicalize_typedef_type (original_type);
> + }
> + }
> + }
> +
> + return type;
> +}
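[Editorial note: the recursion here walks chained typedefs down to a canonical name. As a rough standalone model of that chain-walking (a toy name table, not GCC's tree representation; the typedef names are illustrative):]

```c
#include <string.h>

/* Toy model of typedef chains: each entry maps a typedef name to the
   type it was declared from.  Anything not in the table is treated as
   already-canonical (a named struct or a builtin type).  */
struct typedef_entry { const char *name; const char *underlying; };

static const struct typedef_entry table[] = {
  { "u8",   "__u8" },          /* typedef __u8 u8; */
  { "__u8", "unsigned char" }, /* typedef unsigned char __u8; */
  { "xy_t", "struct xy" },     /* typedef struct xy xy_t; */
};

/* Follow the typedef chain until a non-typedef is reached, mirroring
   the recursive canonicalize_typedef_type () above.  */
static const char *
canonicalize (const char *name)
{
  for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
    if (strcmp (table[i].name, name) == 0)
      return canonicalize (table[i].underlying);
  return name;
}
```

[So "u8" resolves through "__u8" to "unsigned char", while an already-named struct passes through unchanged.]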
> +
> +/* Recursively mangle a type following Itanium C++ ABI conventions. */
> +static void
> +mangle_type (tree type, std::string *out_str, uint32_t *hash_state)
> +{
> + gcc_assert (type != NULL_TREE);
> +
> + /* Canonicalize typedef types to their underlying named struct types. */
> + type = canonicalize_typedef_type (type);
> +
> + switch (TREE_CODE (type))
> + {
> + case POINTER_TYPE:
> + {
> + /* Pointer type: 'P' + qualifiers + pointed-to type. */
> + append_char ('P', out_str, hash_state);
> +
> + /* Add qualifiers to the pointed-to type following Itanium C++ ABI ordering. */
> + tree pointed_to_type = TREE_TYPE (type);
> + if (TYPE_QUALS (pointed_to_type) != TYPE_UNQUALIFIED)
> + {
> + /* Emit qualifiers in Itanium ABI order: restrict, volatile, const. */
> + if (TYPE_QUALS (pointed_to_type) & TYPE_QUAL_RESTRICT)
> + append_char ('r', out_str, hash_state);
> + if (TYPE_QUALS (pointed_to_type) & TYPE_QUAL_VOLATILE)
> + append_char ('V', out_str, hash_state);
> + if (TYPE_QUALS (pointed_to_type) & TYPE_QUAL_CONST)
> + append_char ('K', out_str, hash_state);
> + }
> +
> + /* For KCFI's hybrid type system: preserve typedef names for compound types,
> + but use canonical forms for primitive types. */
> + tree target_type;
> + if (TREE_CODE (pointed_to_type) == RECORD_TYPE
> + || TREE_CODE (pointed_to_type) == UNION_TYPE
> + || TREE_CODE (pointed_to_type) == ENUMERAL_TYPE)
> + {
> + /* Compound type: preserve typedef information by using original type. */
> + target_type = pointed_to_type;
> + }
> + else
> + {
> + /* Primitive type: use canonical form to ensure structural typing. */
> + target_type = TYPE_MAIN_VARIANT (pointed_to_type);
> + }
> + mangle_type (target_type, out_str, hash_state);
> + break;
> + }
> +
> + case ARRAY_TYPE:
> + /* Array type: 'A' + size + '_' + element type (simplified). */
> + append_char ('A', out_str, hash_state);
> + if (TYPE_DOMAIN (type) && TYPE_MAX_VALUE (TYPE_DOMAIN (type)))
> + {
> + tree max_val = TYPE_MAX_VALUE (TYPE_DOMAIN (type));
> + /* Check if array size is a compile-time constant to handle VLAs safely. */
> + if (TREE_CODE (max_val) == INTEGER_CST && tree_fits_shwi_p (max_val))
> + {
> + HOST_WIDE_INT size = tree_to_shwi (max_val) + 1;
> + append_string (std::to_string ((long) size).c_str (), out_str, hash_state);
> + }
> + /* For VLAs or non-constant dimensions, emit empty size (A_). */
> + append_char ('_', out_str, hash_state);
> + }
> + else
> + {
> + /* No domain or no max value - emit A_. */
> + append_char ('_', out_str, hash_state);
> + }
> + mangle_type (TREE_TYPE (type), out_str, hash_state);
> + break;
> +
> + case REFERENCE_TYPE:
> + /* Reference type: 'R' + referenced type.
> + Note: We must handle references to builtin types including compiler
> + builtins like __builtin_va_list used in functions like va_start. */
> + append_char ('R', out_str, hash_state);
> + mangle_type (TREE_TYPE (type), out_str, hash_state);
> + break;
> +
> + case FUNCTION_TYPE:
> + {
> + /* Function type: 'F' + return type + parameter types + 'E' */
> + append_char ('F', out_str, hash_state);
> + mangle_type (TREE_TYPE (type), out_str, hash_state);
> +
> + /* Add parameter types. */
> + tree param_types = TYPE_ARG_TYPES (type);
> +
> + if (param_types == NULL_TREE)
> + {
> + /* func() - no parameter list (could be variadic). */
> + }
> + else
> + {
> + bool found_real_params = false;
> + for (tree param = param_types; param; param = TREE_CHAIN (param))
> + {
> + tree param_type = TREE_VALUE (param);
> + if (param_type == void_type_node)
> + {
> + /* Check if this is the first parameter (explicit void) or a sentinel */
> + if (!found_real_params)
> + {
> + /* func(void) - explicit empty parameter list.
> + Mangle void to distinguish from variadic func(). */
> + mangle_type (void_type_node, out_str, hash_state);
> + }
> + /* If we found real params before this void, it's a sentinel - stop */
> + break;
> + }
> +
> + found_real_params = true;
> +
> + /* For value parameters, ignore const/volatile qualifiers as they
> + don't affect the calling convention. const int and int are
> + passed identically by value. */
> + tree canonical_param_type = param_type;
> + if (TREE_CODE (param_type) != POINTER_TYPE
> + && TREE_CODE (param_type) != REFERENCE_TYPE
> + && TREE_CODE (param_type) != ARRAY_TYPE)
> + {
> + /* Strip qualifiers for non-pointer/reference value parameters. */
> + canonical_param_type = TYPE_MAIN_VARIANT (param_type);
> + }
> +
> + mangle_type (canonical_param_type, out_str, hash_state);
> + }
> + }
> +
> + /* Check if this is a variadic function and add 'z' marker. */
> + if (stdarg_p (type))
> + {
> + append_char ('z', out_str, hash_state);
> + }
> +
> + append_char ('E', out_str, hash_state);
> + break;
> + }
> +
> + case RECORD_TYPE:
> + case UNION_TYPE:
> + case ENUMERAL_TYPE:
> + {
> + /* Struct/union/enum: use simplified representation for C types. */
> + const char *name = NULL;
> +
> + if (TYPE_NAME (type))
> + {
> + if (TREE_CODE (TYPE_NAME (type)) == TYPE_DECL)
> + {
> + /* TYPE_DECL case: both named structs and typedef structs. */
> + tree decl_name = DECL_NAME (TYPE_NAME (type));
> + if (decl_name && TREE_CODE (decl_name) == IDENTIFIER_NODE)
> + {
> + name = IDENTIFIER_POINTER (decl_name);
> + }
> + }
> + else if (TREE_CODE (TYPE_NAME (type)) == IDENTIFIER_NODE)
> + {
> + /* Direct identifier case. */
> + name = IDENTIFIER_POINTER (TYPE_NAME (type));
> + }
> + }
> +
> + /* If no name found through normal extraction, handle anonymous types following Itanium C++ ABI. */
> + if (!name && !TYPE_NAME (type))
> + {
> + static char anon_name[128];
I am not a fan of a static variable here. Why not just a stack variable?
> +
> + if (TREE_CODE (type) == UNION_TYPE)
> + {
> + /* For anonymous unions, try to find first named field (Itanium ABI approach). */
> + tree field = TYPE_FIELDS (type);
> + while (field && !DECL_NAME (field))
> + field = DECL_CHAIN (field);
> +
> + if (field && DECL_NAME (field))
> + {
> + const char *field_name = IDENTIFIER_POINTER (DECL_NAME (field));
> + snprintf (anon_name, sizeof(anon_name), "anon_union_by_%s", field_name);
> + }
> + else
> + {
> + /* No named fields - use Itanium-style Ut encoding. */
> + snprintf (anon_name, sizeof(anon_name), "Ut_unnamed_union");
> + }
> + }
> + else
> + {
> + /* For anonymous structs/enums, use Itanium-style Ut encoding
> + with layout info for discrimination. */
> + const char *type_prefix = "";
> + if (TREE_CODE (type) == RECORD_TYPE)
> + type_prefix = "struct";
> + else if (TREE_CODE (type) == ENUMERAL_TYPE)
> + type_prefix = "enum";
> +
> + /* Include size and field layout for better discrimination. */
> + HOST_WIDE_INT size = 0;
> + if (TYPE_SIZE (type) && tree_fits_shwi_p (TYPE_SIZE (type)))
> + size = tree_to_shwi (TYPE_SIZE (type));
> +
> + /* Generate a hash based on field layout to distinguish same-sized
> + anonymous types. */
> + unsigned layout_hash = 0;
> + if (TREE_CODE (type) == RECORD_TYPE)
> + {
> + for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
> + {
> + if (TREE_CODE (field) == FIELD_DECL)
> + {
> + /* Hash field offset and type. */
> + if (DECL_FIELD_OFFSET (field))
> + {
> + HOST_WIDE_INT offset = tree_to_shwi (DECL_FIELD_OFFSET (field));
> + layout_hash = layout_hash * 31 + (unsigned)offset;
> + }
> +
> + /* Hash field type. */
> + tree field_type = TREE_TYPE (field);
> + if (field_type && TYPE_MODE (field_type) != VOIDmode)
> + layout_hash = layout_hash * 37 + (unsigned)TYPE_MODE (field_type);
> + }
> + }
> + }
> +
> + if (layout_hash != 0)
> + snprintf (anon_name, sizeof(anon_name), "Ut_%s_%ld_%x",
> + type_prefix, (long)size, layout_hash);
> + else
> + snprintf (anon_name, sizeof(anon_name), "Ut_%s_%ld",
> + type_prefix, (long)size);
> + }
> +
> + name = anon_name;
> + }
> +
> + if (name)
> + {
> + append_string (std::to_string (strlen (name)).c_str (), out_str, hash_state);
> + append_string (name, out_str, hash_state);
> + }
> + else
> + {
> + /* Always show diagnostic information for missing struct names. */
> + debug_tree (type);
> + internal_error ("mangle: Missing case in struct name extraction - please report this as a bug");
Again sorry rather than internal_error.
I still think it would be better if the hashing and mangling were one
step rather than two separate steps, especially since this is only
used for the hashing and will most likely only ever be used there.
Thanks,
Andrew Pinski
> + }
> + break;
> + }
> +
> + default:
> + /* Handle builtin types. */
> + mangle_builtin_type (type, out_str, hash_state);
> + break;
> + }
> +}
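[Editorial note: for concreteness, the cover letter's selftest example (`extern int func(xy_t *p);`, expected mangle `_ZTSPFiP4xy_tE`) decomposes under these cases as follows; the helper name and buffer handling are illustrative only:]

```c
#include <string.h>

/* Hand-assemble the cover letter's expected mangle for
   `extern int func(xy_t *p);` from the cases above.  BUF must hold at
   least 16 bytes.  */
static const char *
mangle_example (char *buf)
{
  strcpy (buf, "_ZTS");    /* typeinfo-name prefix */
  strcat (buf, "P");       /* pointer to ... */
  strcat (buf, "F");       /* ... function type ... */
  strcat (buf, "i");       /* returning int */
  strcat (buf, "P4xy_t");  /* taking pointer to xy_t (length-prefixed name) */
  strcat (buf, "E");       /* end of function type */
  return buf;
}
```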
> +
> +/* Compute canonical function type hash using Itanium C++ ABI mangling. */
> +uint32_t
> +hash_function_type (tree fntype, tree fndecl)
> +{
> + gcc_assert (fntype);
> + gcc_assert (TREE_CODE (fntype) == FUNCTION_TYPE);
> +
> + std::string result;
> + std::string *out_str = nullptr;
> + uint32_t hash_state = 2166136261U; /* FNV-1a 32-bit offset basis. */
> +
> + /* Only build string if dump is active. */
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + {
> + result.reserve (32);
> + out_str = &result;
> + }
> +
> + /* Store function context for error reporting. */
> + current_function_context = fndecl;
> +
> + /* Typeinfo for a function prototype. */
> + append_string ("_ZTS", out_str, &hash_state);
> +
> + mangle_type (fntype, out_str, &hash_state);
> +
> + /* Clear function context. */
> + current_function_context = NULL_TREE;
> +
> + /* Output to dump file if enabled. */
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + {
> + fprintf (dump_file, "KCFI type ID: mangled='%s' typeid=0x%08x\n",
> + result.c_str (), hash_state);
> + }
> +
> + return hash_state;
> +}
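[Editorial note: the incremental hash threaded through append_char/append_string is 32-bit FNV-1a over the mangled bytes, starting from the offset basis seen above (2166136261) and, assuming the standard FNV-1a prime 16777619, a standalone sketch of the same hash is:]

```c
#include <stdint.h>

/* 32-bit FNV-1a over a NUL-terminated mangled string, matching the
   incremental hash_state updates above (offset basis 2166136261,
   standard FNV-1a prime 16777619).  */
static uint32_t
fnv1a_32 (const char *s)
{
  uint32_t h = 2166136261u;
  for (; *s; s++)
    {
      h ^= (uint8_t) *s;   /* xor in the byte first ... */
      h *= 16777619u;      /* ... then multiply: that order is FNV-1a */
    }
  return h;
}
```

[A KCFI type id would then be the hash of the full "_ZTS..." string, possibly masked by a target hook.]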
> --
> 2.34.1
>
* Re: [PATCH v2 1/7] mangle: Introduce C typeinfo mangling API
2025-09-05 0:50 ` Andrew Pinski
@ 2025-09-05 1:09 ` Kees Cook
0 siblings, 0 replies; 32+ messages in thread
From: Kees Cook @ 2025-09-05 1:09 UTC (permalink / raw)
To: Andrew Pinski
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
On Thu, Sep 04, 2025 at 05:50:45PM -0700, Andrew Pinski wrote:
> On Thu, Sep 4, 2025 at 5:27 PM Kees Cook <kees@kernel.org> wrote:
> > +
> > + /* Unknown builtin type - this should never happen in a well-formed C program. */
> > + debug_tree (type);
> > + internal_error ("mangle: Unknown builtin type in function %qD - please report this as a bug",
> > + current_function_context);
>
> This should NOT be internal_error but rather sorry.
Ah, heh. I switched to internal_error because you'd suggested it in
the last version. Maybe I misunderstood where I should be using sorry vs
internal_error.
> > + if (!name && !TYPE_NAME (type))
> > + {
> > + static char anon_name[128];
>
> I am not a fan of a static variable here. Why not just a stack variable?
Oh, whoops. I will fix that.
> > + {
> > + /* Always show diagnostic information for missing struct names. */
> > + debug_tree (type);
> > + internal_error ("mangle: Missing case in struct name extraction - please report this as a bug");
>
> Again sorry rather than internal_error.
>
> I still think it would be better if the hashing and mangling were one
> step rather than two separate steps, especially since this is only
> used for the hashing and will most likely only ever be used there.
In this version the hashing happens immediately. No full string is built
unless the dumpfile is enabled.
-Kees
--
Kees Cook
* [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-05 0:24 [PATCH v2 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
2025-09-05 0:24 ` [PATCH v2 1/7] mangle: Introduce C typeinfo mangling API Kees Cook
@ 2025-09-05 0:24 ` Kees Cook
2025-09-05 8:51 ` Peter Zijlstra
2025-09-09 18:49 ` Qing Zhao
2025-09-05 0:24 ` [PATCH v2 3/7] x86: Add x86_64 Kernel Control Flow Integrity implementation Kees Cook
` (4 subsequent siblings)
6 siblings, 2 replies; 32+ messages in thread
From: Kees Cook @ 2025-09-05 0:24 UTC (permalink / raw)
To: Qing Zhao
Cc: Kees Cook, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
Implements the Linux Kernel Control Flow Integrity ABI, which provides a
function prototype based forward edge control flow integrity protection
by instrumenting every indirect call to check for a hash value before
the target function address. If the hash at the call site and the hash
at the target do not match, execution will trap.
See the start of kcfi.cc for design details.
gcc/ChangeLog:
* kcfi.h: New file with KCFI public interface declarations.
* kcfi.cc: New file implementing Kernel Control Flow Integrity
infrastructure.
* Makefile.in (OBJS): Add kcfi.o.
* flag-types.h (enum sanitize_code): Add SANITIZE_KCFI.
* gimple.h (enum gf_mask): Add GF_CALL_INLINED_FROM_KCFI_NOSANTIZE.
(gimple_call_set_inlined_from_kcfi_nosantize): New function.
(gimple_call_inlined_from_kcfi_nosantize_p): New function.
* tree-pass.h: Add kcfi passes.
* c-family/c-attribs.cc: Include asan.h.
(handle_patchable_function_entry_attribute): Add error for using
patchable_function_entry attribute with -fsanitize=kcfi.
* df-scan.cc (df_uses_record): Add KCFI case to handle KCFI RTL
patterns and process wrapped RTL.
* doc/invoke.texi (fsanitize=kcfi): Add documentation for KCFI
sanitizer option.
* doc/tm.texi.in: Add Kernel Control Flow Integrity section with
TARGET_KCFI_SUPPORTED, TARGET_KCFI_MASK_TYPE_ID,
TARGET_KCFI_EMIT_TYPE_ID hooks.
* doc/tm.texi: Regenerate.
* final.cc (call_from_call_insn): Add KCFI case to handle
KCFI-wrapped calls.
* opts.cc (sanitizer_opts): Add kcfi entry.
* passes.cc: Include kcfi.h.
* passes.def: Add KCFI passes (GIMPLE and IPA).
* rtl.def (KCFI): Add new RTL code for KCFI instrumentation.
* rtlanal.cc (rtx_cost): Add KCFI case.
* target.def: Add KCFI target hooks.
* toplev.cc (process_options): Add KCFI option processing.
* tree-inline.cc: Include kcfi.h and asan.h.
(copy_bb): Handle KCFI no_sanitize attribute propagation during
inlining.
* varasm.cc (assemble_start_function): Emit KCFI preambles.
(assemble_external_real): Emit KCFI typeid symbols.
(default_elf_asm_named_section): Handle .kcfi_traps using
SECTION_LINK_ORDER flag.
Signed-off-by: Kees Cook <kees@kernel.org>
---
gcc/kcfi.h | 47 +++
gcc/kcfi.cc | 764 ++++++++++++++++++++++++++++++++++++++
gcc/Makefile.in | 1 +
gcc/flag-types.h | 2 +
gcc/gimple.h | 21 ++
gcc/tree-pass.h | 3 +
gcc/c-family/c-attribs.cc | 12 +
gcc/df-scan.cc | 6 +
gcc/doc/invoke.texi | 35 ++
gcc/doc/tm.texi | 31 ++
gcc/doc/tm.texi.in | 12 +
gcc/final.cc | 3 +
gcc/opts.cc | 1 +
gcc/passes.cc | 1 +
gcc/passes.def | 3 +
gcc/rtl.def | 6 +
gcc/rtlanal.cc | 5 +
gcc/target.def | 38 ++
gcc/toplev.cc | 11 +
gcc/tree-inline.cc | 10 +
gcc/varasm.cc | 46 ++-
21 files changed, 1048 insertions(+), 10 deletions(-)
create mode 100644 gcc/kcfi.h
create mode 100644 gcc/kcfi.cc
diff --git a/gcc/kcfi.h b/gcc/kcfi.h
new file mode 100644
index 000000000000..17ec59a1a3b8
--- /dev/null
+++ b/gcc/kcfi.h
@@ -0,0 +1,47 @@
+/* Kernel Control Flow Integrity (KCFI) support for GCC.
+ Copyright (C) 2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_KCFI_H
+#define GCC_KCFI_H
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "rtl.h"
+
+/* Common helper for RTL patterns to emit .kcfi_traps section entry.
+ Call after emitting trap label and instruction with the trap symbol
+ reference. */
+extern void kcfi_emit_traps_section (FILE *file, rtx trap_label_sym);
+
+/* Extract KCFI type ID from current GIMPLE statement. */
+extern rtx kcfi_get_call_type_id (void);
+
+/* Emit KCFI type ID symbol for address-taken functions. */
+extern void emit_kcfi_typeid_symbol (FILE *asm_file, tree decl,
+ const char *name);
+
+/* Emit KCFI preamble. */
+extern void kcfi_emit_preamble (FILE *file, tree decl,
+ const char *actual_fname);
+
+/* For calculating callsite offset. */
+extern HOST_WIDE_INT kcfi_patchable_entry_prefix_nops;
+
+#endif /* GCC_KCFI_H */
diff --git a/gcc/kcfi.cc b/gcc/kcfi.cc
new file mode 100644
index 000000000000..1ae0602eac7b
--- /dev/null
+++ b/gcc/kcfi.cc
@@ -0,0 +1,764 @@
+/* Kernel Control Flow Integrity (KCFI) support for GCC.
+ Copyright (C) 2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+/* KCFI ABI Design:
+
+The Linux Kernel Control Flow Integrity ABI provides a function prototype
+based forward edge control flow integrity protection by instrumenting
+every indirect call to check for a hash value before the target function
+address. If the hash at the call site and the hash at the target do not
+match, execution will trap.
+
> +The general CFI ideas are discussed in the paper below, though it
> +focuses more on a CFG analysis to construct valid call destinations,
> +which tends to require LTO:
> +https://users.soe.ucsc.edu/~abadi/Papers/cfi-tissec-revised.pdf
+
+Later refinement for using jump tables (constructed via CFG analysis
+during LTO) was proposed here:
+https://www.usenix.org/system/files/conference/usenixsecurity14/sec14-paper-tice.pdf
+
+Linux used the above implementation from 2018 to 2022:
+https://android-developers.googleblog.com/2018/10/control-flow-integrity-in-android-kernel.html
+but the corner cases for target addresses not being the actual functions
> +(i.e. pointing into the jump table) were a continual source of problems,
+and generating the jump tables required full LTO, which had its own set
+of problems.
+
+Looking at function prototypes as the source of call validity was
+presented here, though still relied on LTO:
+https://www.blackhat.com/docs/asia-17/materials/asia-17-Moreira-Drop-The-Rop-Fine-Grained-Control-Flow-Integrity-For-The-Linux-Kernel-wp.pdf
+
+The KCFI approach built on the function-prototype idea, but avoided
+needing LTO, and could be further updated to deal with CPU errata
+(retpolines, etc):
+https://lpc.events/event/16/contributions/1315/
+
+KCFI has a number of specific constraints. Some are tied to the
+backend architecture, which are covered in arch-specific code.
+The constraints are:
+
+- The KCFI scheme generates a unique 32-bit hash for each unique function
+ prototype, allowing for indirect call sites to verify that they are
+ calling into a matching _type_ of function pointer. This changes the
+ semantics of some optimization logic because now indirect calls to
+ different types cannot be merged. For example:
+
+ if (p->func_type_1)
+ return p->func_type_1();
+ if (p->func_type_2)
+ return p->func_type_2();
+
+ In final asm, the optimizer may collapse the second indirect call
+ into a jump to the first indirect call once it has loaded the function
> +  pointer. KCFI must block cross-type merging; otherwise a single KCFI
> +  check would run for only one type while covering two target types.
> +  The distinguishing characteristic for call merging becomes the
+ type, not the address/register usage.
+
> +- The check-call instruction sequence must be treated as a single unit:
> +  it cannot be rearranged, split, or optimized. The pattern is that
+ indirect calls, "call *$target", get converted into:
+
+ mov $target_expression, %target ; only present if the expression was
+ ; not already %target register
+ load -$offset(%target), %tmp ; load the typeid hash at target
+ cmp $hash, %tmp ; compare expected typeid with loaded
+ je .Lcheck_passed ; jump to the indirect call
+ .Lkcfi_trap$N: ; label of trap insn
+ trap ; trap on failure, but arranged so
+ ; "permissive mode" falls through
+ .Lkcfi_call$N: ; label of call insn
+ call *%target ; actual indirect call
+
+ This pattern of call immediately after trap provides for the
+ "permissive" checking mode automatically: the trap gets handled,
+ a warning emitted, and then execution continues after the trap to
+ the call.
+
+- KCFI check-call instrumentation must survive tail call optimization.
+ If an indirect call is turned into an indirect jump, KCFI checking
+ must still happen (but will still use the jmp).
+
+- Functions that may be called indirectly have a preamble added,
+ __cfi_$original_func_name, which contains the $hash value:
+
+ __cfi_target_func:
+ .word $hash
+ target_func:
+ [regular function entry...]
+
+- The preamble needs to interact with patchable function entry so that
+ the hash appears further away from the actual start of the function
+ (leaving the prefix NOPs of the patchable function entry unchanged).
+ This means only _globally defined_ patchable function entry is supported
> +  with KCFI (indirect call sites must know in advance what the offset is,
+ which may not be possible with extern functions). For example, a "4,4"
+ patchable function entry would end up like:
+
+ __cfi_target_func:
+ .data $hash
+ nop nop nop nop
+ target_func:
+ [regular function entry...]
+
+ Architectures may need to add alignment nops prior to the hash to keep things
+ aligned for function call conventions.
+
+- External functions that are address-taken have a weak __kcfi_typeid_$funcname
+ symbol added with the hash value available so that the hash can be referenced
> +  from assembly linkages, etc., where the hash values cannot be
> +  calculated (i.e. where C type information is missing):
+
+ .weak __kcfi_typeid_$func
+ .set __kcfi_typeid_$func, $hash
+
+- On architectures that do not have a good way to encode additional
+ details in their trap insn (e.g. x86_64 and riscv64), the trap location
+ is identified as a KCFI trap via a relative address offset entry
+ emitted into the .kcfi_traps section for each indirect call site's
+ trap instruction. The previous check-call example's insn sequence has
+ a section push/pop inserted between the trap and call:
+
+ ...
+ .Lkcfi_trap$N:
+ trap
+ .section .kcfi_traps,"ao",@progbits,.text
+ .Lkcfi_entry$N:
+ .long .Lkcfi_trap$N - .Lkcfi_entry$N
+ .text
+ .Lkcfi_call$N:
+ call *%target
+
+ For architectures that can encode immediates in their trap function
+ (e.g. aarch64 and arm32), this isn't needed: they just use immediate
+ codes that indicate a KCFI trap.
+
+- The no_sanitize("kcfi") function attribute means that the marked
+ function must not produce KCFI checking for indirect calls, and this
+ attribute must survive inlining. This is used rarely by Linux, but
+ is required to make BPF JIT trampolines work on older Linux kernel
+ versions.
+
+As a result of these constraints, there are some behavioral aspects
+that need to be preserved across the middle-end and back-end.
+
+For indirect call sites:
+
+- Keeping indirect calls from being merged (see above) by adding a
+ wrapping type so that equality was tested based on type-id.
+
+- Keeping typeid information available through to the RTL expansion
+ phase was done via a new KCFI insn that wraps CALL and the typeid.
+
+- To make sure KCFI expansion is skipped for inline functions, the
+ inlining is marked during GIMPLE with a new flag which is checked
+ during expansion.
+
+For indirect call targets:
+
> +- kcfi_emit_preamble() uses function_needs_kcfi_preamble() to decide
> +  whether to emit the preamble, which interacts with patchable function
> +  entry to add any needed alignment.
+
+- gcc/varasm.cc, assemble_external_real() calls emit_kcfi_typeid_symbol()
+ to add the __kcfi_typeid symbols (see get_function_kcfi_type_id()
+ below).
+
+*/
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "target.h"
+#include "function.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "dumpfile.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "cgraph.h"
+#include "kcfi.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "rtl.h"
+#include "cfg.h"
+#include "cfgrtl.h"
+#include "asan.h"
+#include "diagnostic-core.h"
+#include "memmodel.h"
+#include "print-tree.h"
+#include "emit-rtl.h"
+#include "output.h"
+#include "builtins.h"
+#include "varasm.h"
+#include "opts.h"
+#include "mangle.h"
+#include "target.h"
+#include "flags.h"
+
+HOST_WIDE_INT kcfi_patchable_entry_prefix_nops = 0; /* For callsite offset */
+static HOST_WIDE_INT kcfi_patchable_entry_arch_alignment_nops = 0; /* For preamble alignment */
+
+/* Common helper for RTL patterns to emit .kcfi_traps section entry. */
+void
+kcfi_emit_traps_section (FILE *file, rtx trap_label_sym)
+{
+ /* Generate entry label internally and get its number. */
+ rtx entry_label = gen_label_rtx ();
+ int entry_labelno = CODE_LABEL_NUMBER (entry_label);
+
+ /* Generate entry label name with custom prefix. */
+ char entry_name[32];
+ ASM_GENERATE_INTERNAL_LABEL (entry_name, "Lkcfi_entry", entry_labelno);
+
+ /* Save current section to restore later. */
+ section *saved_section = in_section;
+
+ /* Use varasm infrastructure for section handling. */
+ section *kcfi_traps_section = get_section (".kcfi_traps",
+ SECTION_LINK_ORDER, NULL);
+ switch_to_section (kcfi_traps_section);
+
+ /* Emit entry label. */
+ ASM_OUTPUT_LABEL (file, entry_name);
+
+ /* Generate address difference using RTL infrastructure. */
+ rtx entry_label_sym = gen_rtx_SYMBOL_REF (Pmode, entry_name);
+ rtx addr_diff = gen_rtx_MINUS (Pmode, trap_label_sym, entry_label_sym);
+
+ /* Emit the address difference as a 4-byte value. */
+ assemble_integer (addr_diff, 4, BITS_PER_UNIT, 1);
+
+ /* Restore the previous section. */
+ switch_to_section (saved_section);
+}
+
+/* Compute KCFI type ID for a function declaration or function type (internal) */
+static uint32_t
+compute_kcfi_type_id (tree fntype, tree fndecl = NULL_TREE)
+{
+ gcc_assert (fntype);
+ gcc_assert (TREE_CODE (fntype) == FUNCTION_TYPE);
+
+ uint32_t type_id = hash_function_type (fntype, fndecl);
+
+ /* Apply target-specific masking if supported. */
+ if (targetm.kcfi.mask_type_id)
+ type_id = targetm.kcfi.mask_type_id (type_id);
+
+ return type_id;
+}
+
+/* Check if a function needs KCFI preamble generation.
+ ALL functions get preambles when -fsanitize=kcfi is enabled, regardless
+ of no_sanitize("kcfi") attribute. */
+static bool
+function_needs_kcfi_preamble (tree fndecl)
+{
+ /* Only instrument if KCFI is globally enabled. */
+ if (!(flag_sanitize & SANITIZE_KCFI))
+ return false;
+
+ struct cgraph_node *node = cgraph_node::get (fndecl);
+
+ /* Ignore cold partition functions: not reached via indirect call. */
+ if (node && node->split_part)
+ return false;
+
+ /* Ignore cold partition sections: cold partitions are never indirect call
+ targets. Only skip preambles for cold partitions (has_bb_partition = true)
+ not for entire cold-attributed functions (has_bb_partition = false). */
+ if (in_cold_section_p && crtl && crtl->has_bb_partition)
+ return false;
+
+ /* Check if function is truly address-taken using cgraph node analysis. */
+ bool addr_taken = (node && node->address_taken);
+
+ /* Only instrument functions that can be targets of indirect calls:
+ - Public functions (can be called externally)
+ - External declarations (from other modules)
+ - Functions with true address-taken status from cgraph analysis. */
+ return TREE_PUBLIC (fndecl) || DECL_EXTERNAL (fndecl) || addr_taken;
+}
+
+/* Function attribute to store KCFI type ID. */
+static tree kcfi_type_id_attr = NULL_TREE;
+
+/* Set KCFI type ID for a function declaration during IPA phase.
+ Fatal error if type ID is already set. */
+static void
+set_function_kcfi_type_id (tree fndecl)
+{
+ if (!kcfi_type_id_attr)
+ kcfi_type_id_attr = get_identifier ("kcfi_type_id");
+
+ /* Fatal error if type ID already set - nothing should set it twice. */
+ if (lookup_attribute_by_prefix ("kcfi_type_id",
+ DECL_ATTRIBUTES (fndecl)))
+ internal_error ("KCFI type ID already set for function %qD", fndecl);
+
+ /* Compute type ID using FUNCTION_TYPE to preserve typedef information. */
+ uint32_t type_id = compute_kcfi_type_id (TREE_TYPE (fndecl), fndecl);
+
+ tree type_id_tree = build_int_cst (unsigned_type_node, type_id);
+ tree attr_value = build_tree_list (NULL_TREE, type_id_tree);
+ tree attr = build_tree_list (kcfi_type_id_attr, attr_value);
+
+ DECL_ATTRIBUTES (fndecl) = chainon (DECL_ATTRIBUTES (fndecl), attr);
+}
+
+/* Get KCFI type ID for a function declaration during assembly output phase.
+ Fatal error if type ID was not previously set during IPA phase. */
+static uint32_t
+get_function_kcfi_type_id (tree fndecl)
+{
+ if (!kcfi_type_id_attr)
+ kcfi_type_id_attr = get_identifier ("kcfi_type_id");
+
+ tree attr = lookup_attribute_by_prefix ("kcfi_type_id",
+ DECL_ATTRIBUTES (fndecl));
+ if (attr && TREE_VALUE (attr) && TREE_VALUE (TREE_VALUE (attr)))
+ {
+ tree value = TREE_VALUE (TREE_VALUE (attr));
+ if (TREE_CODE (value) == INTEGER_CST)
+ return (uint32_t) TREE_INT_CST_LOW (value);
+ }
+
+ internal_error ("KCFI type ID not found for function %qD - "
+ "should have been set during GIMPLE phase", fndecl);
+}
+
+/* Prepare the global KCFI alignment NOPs calculation.
+ Called once during IPA pass to set global variable. */
+static void
+kcfi_prepare_alignment_nops (void)
+{
+ /* Only use global patchable-function-entry flag, not function attributes.
+ KCFI callsites cannot know about function-specific attributes. */
+ if (flag_patchable_function_entry)
+ {
+ HOST_WIDE_INT total_nops, prefix_nops = 0;
+ parse_and_check_patch_area (flag_patchable_function_entry, false,
+ &total_nops, &prefix_nops);
+ /* Store value for callsite offset calculation */
+ kcfi_patchable_entry_prefix_nops = prefix_nops;
+ }
+
+ /* Calculate architecture-specific alignment NOPs.
+ KCFI preamble layout:
+ __cfi_func: [alignment_nops][typeid][prefix_nops] func: [entry_nops]
+
+ The alignment NOPs ensure __cfi_func stays at proper function alignment
+ when prefix NOPs are added. */
+ HOST_WIDE_INT arch_alignment = 0;
+
+ /* Calculate alignment NOPs based on function alignment setting.
+ Use explicit -falign-functions if set, otherwise default to 4 bytes. */
+ int alignment_bytes = 4;
+ if (align_functions.levels[0].log > 0)
+ {
+ /* Use explicit -falign-functions setting */
+ alignment_bytes = align_functions.levels[0].get_value();
+ }
+
+ /* Get typeid instruction size from target hook, default to 4 bytes */
+ int typeid_size = targetm.kcfi.emit_type_id
+ ? targetm.kcfi.emit_type_id (NULL, 0) : 4;
+
+ /* Calculate alignment NOPs needed */
> +  arch_alignment = ((alignment_bytes
> +                     - ((kcfi_patchable_entry_prefix_nops + typeid_size)
> +                        % alignment_bytes))
> +                    % alignment_bytes);
+
+ /* Use the calculated alignment NOPs */
+ kcfi_patchable_entry_arch_alignment_nops = arch_alignment;
+}
+
+/* Check if this is an indirect call that needs KCFI instrumentation. */
+static bool
+is_kcfi_indirect_call (tree fn)
+{
+ if (!fn)
+ return false;
+
+ /* Only functions WITHOUT no_sanitize("kcfi") should generate KCFI checks at
+ indirect call sites. */
+ if (!sanitize_flags_p (SANITIZE_KCFI, current_function_decl))
+ return false;
+
+ /* Direct function calls via ADDR_EXPR don't need KCFI checks. */
+ if (TREE_CODE (fn) == ADDR_EXPR)
+ return false;
+
+ /* Everything else must be indirect calls needing KCFI. */
+ return true;
+}
+
+/* Extract KCFI type ID from indirect call GIMPLE statement.
+ Returns RTX constant with type ID, or NULL_RTX if no KCFI needed. */
+rtx
+kcfi_get_call_type_id (void)
+{
+ if (!sanitize_flags_p (SANITIZE_KCFI) || !currently_expanding_gimple_stmt)
+ return NULL_RTX;
+
+ if (!is_gimple_call (currently_expanding_gimple_stmt))
+ return NULL_RTX;
+
+ gcall *call_stmt = as_a <gcall *> (currently_expanding_gimple_stmt);
+
+ /* Only indirect calls need KCFI instrumentation. */
+ if (gimple_call_fndecl (call_stmt))
+ return NULL_RTX;
+
+ tree fn_type = gimple_call_fntype (call_stmt);
+ if (!fn_type)
+ return NULL_RTX;
+
+ tree attr = lookup_attribute ("kcfi_type_id", TYPE_ATTRIBUTES (fn_type));
+ if (!attr || !TREE_VALUE (attr))
+ return NULL_RTX;
+
+ if (gimple_call_inlined_from_kcfi_nosantize_p (call_stmt))
+ return NULL_RTX;
+
+ uint32_t kcfi_type_id = (uint32_t) tree_to_uhwi (TREE_VALUE (attr));
+ return GEN_INT (kcfi_type_id);
+}
+
+/* Emit KCFI type ID symbol for an address-taken function.
+ Centralized emission point to avoid duplication between
+ assemble_external_real() and assemble_start_function(). */
+void
+emit_kcfi_typeid_symbol (FILE *asm_file, tree decl, const char *name)
+{
+ uint32_t type_id = get_function_kcfi_type_id (decl);
+ fprintf (asm_file, "\t.weak\t__kcfi_typeid_%s\n", name);
+ fprintf (asm_file, "\t.set\t__kcfi_typeid_%s, 0x%08x\n", name, type_id);
+}
+
+void
+kcfi_emit_preamble (FILE *file, tree decl, const char *actual_fname)
+{
+ /* Check if KCFI is enabled and function needs preamble. */
+ if (!function_needs_kcfi_preamble (decl))
+ return;
+
+ /* Use the actual function name if provided, otherwise fall back to
+ DECL_ASSEMBLER_NAME.  */
+ const char *fname = actual_fname ? actual_fname
+ : IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
+
+ /* Get type ID. */
+ uint32_t type_id = get_function_kcfi_type_id (decl);
+
+ /* Create symbol name for reuse. */
+ char cfi_symbol_name[256];
+ snprintf (cfi_symbol_name, sizeof (cfi_symbol_name), "__cfi_%s", fname);
+
+ /* Emit __cfi_ symbol with proper visibility. */
+ if (TREE_PUBLIC (decl))
+ {
+ if (DECL_WEAK (decl))
+ ASM_WEAKEN_LABEL (file, cfi_symbol_name);
+ else
+ targetm.asm_out.globalize_label (file, cfi_symbol_name);
+ }
+
+ /* Emit .type directive. */
+ ASM_OUTPUT_TYPE_DIRECTIVE (file, cfi_symbol_name, "function");
+ fprintf (file, "%s:\n", cfi_symbol_name);
+
+ /* Emit alignment NOPs so the __cfi_ symbol keeps function alignment. */
+ for (int i = 0; i < kcfi_patchable_entry_arch_alignment_nops; i++)
+ {
+ fprintf (file, "\tnop\n");
+ }
+
+ /* Emit type ID bytes. */
+ if (targetm.kcfi.emit_type_id)
+ targetm.kcfi.emit_type_id (file, type_id);
+ else
+ fprintf (file, "\t.word\t0x%08x\n", type_id);
+
+ /* Mark end of __cfi_ symbol and emit size directive. */
+ char cfi_end_label[256];
+ snprintf (cfi_end_label, sizeof (cfi_end_label), ".Lcfi_func_end_%s", fname);
+ ASM_OUTPUT_LABEL (file, cfi_end_label);
+
+ ASM_OUTPUT_MEASURED_SIZE (file, cfi_symbol_name);
+}
+
+/* KCFI GIMPLE pass implementation. */
+
+static bool
+gate_kcfi (void)
+{
+ /* Always process functions when KCFI is globally enabled to set type IDs.
+ Individual function processing (call instrumentation) will check no_sanitize("kcfi"). */
+ return sanitize_flags_p (SANITIZE_KCFI);
+}
+
+/* Create a KCFI wrapper function type that embeds the type ID. */
+static tree
+create_kcfi_wrapper_type (tree original_fn_type, uint32_t type_id)
+{
+ /* Create a unique type name incorporating the type ID. */
+ char wrapper_name[32];
+ snprintf (wrapper_name, sizeof (wrapper_name), "__kcfi_wrapper_%x", type_id);
+
+ /* Build a new function type that's structurally identical but nominally different. */
+ tree wrapper_type = build_function_type (TREE_TYPE (original_fn_type),
+ TYPE_ARG_TYPES (original_fn_type));
+
+ /* Set the type name to make it distinct. */
+ TYPE_NAME (wrapper_type) = get_identifier (wrapper_name);
+
+ /* Attach kcfi_type_id attribute to the original function type for
+ cfgexpand.cc.  */
+ tree attr_name = get_identifier ("kcfi_type_id");
+ tree attr_value = build_int_cst (unsigned_type_node, type_id);
+ tree attr = build_tree_list (attr_name, attr_value);
+ TYPE_ATTRIBUTES (original_fn_type) = chainon (TYPE_ATTRIBUTES (original_fn_type), attr);
+
+ return wrapper_type;
+}
+
+/* Wrap indirect calls with KCFI type for anti-merging. */
+static unsigned int
+kcfi_instrument (void)
+{
+ /* Process current function for call instrumentation only.
+ Type ID setting is handled by the separate IPA pass. */
+
+ basic_block bb;
+
+ FOR_EACH_BB_FN (bb, cfun)
+ {
+ gimple_stmt_iterator gsi;
+ for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+ {
+ gimple *stmt = gsi_stmt (gsi);
+
+ if (!is_gimple_call (stmt))
+ continue;
+
+ gcall *call_stmt = as_a <gcall *> (stmt);
+
+ /* Skip internal calls - we only instrument indirect calls.  */
+ if (gimple_call_internal_p (call_stmt))
+ continue;
+
+ tree fndecl = gimple_call_fndecl (call_stmt);
+
+ /* Only process indirect calls (no fndecl).  */
+ if (fndecl)
+ continue;
+
+ tree fn = gimple_call_fn (call_stmt);
+ if (!is_kcfi_indirect_call (fn))
+ continue;
+
+ /* Get the function type to compute the KCFI type ID.  */
+ tree fn_type = gimple_call_fntype (call_stmt);
+ gcc_assert (fn_type);
+ if (TREE_CODE (fn_type) != FUNCTION_TYPE)
+ continue;
+
+ uint32_t type_id = compute_kcfi_type_id (fn_type);
+
+ /* Create KCFI wrapper type for this call.  */
+ tree wrapper_type = create_kcfi_wrapper_type (fn_type, type_id);
+
+ /* Create a temporary variable for the wrapped function pointer.  */
+ tree wrapper_ptr_type = build_pointer_type (wrapper_type);
+ tree wrapper_tmp = create_tmp_var (wrapper_ptr_type, "kcfi_wrapper");
+
+ /* Create assignment: wrapper_tmp = (wrapper_ptr_type) fn.  */
+ tree cast_expr = build1 (NOP_EXPR, wrapper_ptr_type, fn);
+ gimple *cast_stmt = gimple_build_assign (wrapper_tmp, cast_expr);
+ gsi_insert_before (&gsi, cast_stmt, GSI_SAME_STMT);
+
+ /* Update the call to use the wrapped function pointer.  */
+ gimple_call_set_fn (call_stmt, wrapper_tmp);
+ }
+ }
+
+ return 0;
+}
+
+namespace {
+
+const pass_data pass_data_kcfi =
+{
+ GIMPLE_PASS, /* type */
+ "kcfi", /* name */
+ OPTGROUP_NONE, /* optinfo_flags */
+ TV_NONE, /* tv_id */
+ ( PROP_ssa | PROP_cfg | PROP_gimple_leh ), /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ TODO_update_ssa, /* todo_flags_finish */
+};
+
+class pass_kcfi : public gimple_opt_pass
+{
+public:
+ pass_kcfi (gcc::context *ctxt)
+ : gimple_opt_pass (pass_data_kcfi, ctxt)
+ {}
+
+ /* opt_pass methods: */
+ opt_pass * clone () final override { return new pass_kcfi (m_ctxt); }
+ bool gate (function *) final override
+ {
+ return gate_kcfi ();
+ }
+ unsigned int execute (function *) final override
+ {
+ return kcfi_instrument ();
+ }
+
+}; // class pass_kcfi
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_kcfi (gcc::context *ctxt)
+{
+ return new pass_kcfi (ctxt);
+}
+
+namespace {
+
+const pass_data pass_data_kcfi_O0 =
+{
+ GIMPLE_PASS, /* type */
+ "kcfi0", /* name */
+ OPTGROUP_NONE, /* optinfo_flags */
+ TV_NONE, /* tv_id */
+ ( PROP_ssa | PROP_cfg | PROP_gimple_leh ), /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ TODO_update_ssa, /* todo_flags_finish */
+};
+
+class pass_kcfi_O0 : public gimple_opt_pass
+{
+public:
+ pass_kcfi_O0 (gcc::context *ctxt)
+ : gimple_opt_pass (pass_data_kcfi_O0, ctxt)
+ {}
+
+ /* opt_pass methods: */
+ bool gate (function *) final override
+ {
+ return !optimize && gate_kcfi ();
+ }
+ unsigned int execute (function *) final override
+ {
+ return kcfi_instrument ();
+ }
+
+}; // class pass_kcfi_O0
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_kcfi_O0 (gcc::context *ctxt)
+{
+ return new pass_kcfi_O0 (ctxt);
+}
+
+/* IPA pass for KCFI type ID setting - runs once per compilation unit. */
+
+namespace {
+
+const pass_data pass_data_ipa_kcfi =
+{
+ SIMPLE_IPA_PASS, /* type */
+ "ipa_kcfi", /* name */
+ OPTGROUP_NONE, /* optinfo_flags */
+ TV_IPA_OPT, /* tv_id */
+ 0, /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ 0, /* todo_flags_finish */
+};
+
+/* Set KCFI type IDs for all functions in the compilation unit. */
+static unsigned int
+ipa_kcfi_execute (void)
+{
+ struct cgraph_node *node;
+
+ /* Prepare global KCFI alignment NOPs calculation once for all functions. */
+ kcfi_prepare_alignment_nops ();
+
+ /* Process all functions - both local and external.
+ This preserves typedef information using DECL_ARGUMENTS. */
+ FOR_EACH_FUNCTION (node)
+ {
+ tree fndecl = node->decl;
+
+ /* Skip all non-NORMAL builtins (MD, FRONTEND) entirely.
+ For NORMAL builtins, skip those that lack an implicit
+ implementation (closest way to distinguishing DEF_LIB_BUILTIN
+ from others). E.g. we need to have typeids for memset(). */
+ if (fndecl_built_in_p (fndecl))
+ {
+ if (DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
+ continue;
+ if (!builtin_decl_implicit_p (DECL_FUNCTION_CODE (fndecl)))
+ continue;
+ }
+
+ set_function_kcfi_type_id (fndecl);
+ }
+
+ return 0;
+}
+
+class pass_ipa_kcfi : public simple_ipa_opt_pass
+{
+public:
+ pass_ipa_kcfi (gcc::context *ctxt)
+ : simple_ipa_opt_pass (pass_data_ipa_kcfi, ctxt)
+ {}
+
+ /* opt_pass methods: */
+ bool gate (function *) final override
+ {
+ return sanitize_flags_p (SANITIZE_KCFI);
+ }
+
+ unsigned int execute (function *) final override
+ {
+ return ipa_kcfi_execute ();
+ }
+
+}; // class pass_ipa_kcfi
+
+} // anon namespace
+
+simple_ipa_opt_pass *
+make_pass_ipa_kcfi (gcc::context *ctxt)
+{
+ return new pass_ipa_kcfi (ctxt);
+}
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 4c12ac68d979..84bbc4223734 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1591,6 +1591,7 @@ OBJS = \
ira-emit.o \
ira-lives.o \
jump.o \
+ kcfi.o \
langhooks.o \
late-combine.o \
lcm.o \
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index bf681c3e8153..c3c0bc61ee3e 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -337,6 +337,8 @@ enum sanitize_code {
SANITIZE_KERNEL_HWADDRESS = 1UL << 30,
/* Shadow Call Stack. */
SANITIZE_SHADOW_CALL_STACK = 1UL << 31,
+ /* KCFI (Kernel Control Flow Integrity). */
+ SANITIZE_KCFI = 1ULL << 32,
SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
| SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
diff --git a/gcc/gimple.h b/gcc/gimple.h
index da32651ea017..cef915b9164f 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -142,6 +142,7 @@ enum gf_mask {
GF_CALL_ALLOCA_FOR_VAR = 1 << 5,
GF_CALL_INTERNAL = 1 << 6,
GF_CALL_CTRL_ALTERING = 1 << 7,
+ GF_CALL_INLINED_FROM_KCFI_NOSANTIZE = 1 << 8,
GF_CALL_MUST_TAIL_CALL = 1 << 9,
GF_CALL_BY_DESCRIPTOR = 1 << 10,
GF_CALL_NOCF_CHECK = 1 << 11,
@@ -3487,6 +3488,26 @@ gimple_call_from_thunk_p (gcall *s)
return (s->subcode & GF_CALL_FROM_THUNK) != 0;
}
+/* If INLINED_FROM_KCFI_NOSANTIZE_P is true, mark GIMPLE_CALL S as being
+ inlined from a function with no_sanitize("kcfi"). */
+
+inline void
+gimple_call_set_inlined_from_kcfi_nosantize (gcall *s, bool inlined_from_kcfi_nosantize_p)
+{
+ if (inlined_from_kcfi_nosantize_p)
+ s->subcode |= GF_CALL_INLINED_FROM_KCFI_NOSANTIZE;
+ else
+ s->subcode &= ~GF_CALL_INLINED_FROM_KCFI_NOSANTIZE;
+}
+
+/* Return true if GIMPLE_CALL S was inlined from a function with
+ no_sanitize("kcfi"). */
+
+inline bool
+gimple_call_inlined_from_kcfi_nosantize_p (const gcall *s)
+{
+ return (s->subcode & GF_CALL_INLINED_FROM_KCFI_NOSANTIZE) != 0;
+}
/* If FROM_NEW_OR_DELETE_P is true, mark GIMPLE_CALL S as being a call
to operator new or delete created from a new or delete expression. */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 1c68a69350df..fbf235adada3 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -357,6 +357,8 @@ extern gimple_opt_pass *make_pass_tsan (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_tsan_O0 (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_sancov (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_sancov_O0 (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_kcfi (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_kcfi_O0 (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_lower_cf (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_refactor_eh (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_lower_eh (gcc::context *ctxt);
@@ -544,6 +546,7 @@ extern ipa_opt_pass_d *make_pass_ipa_odr (gcc::context *ctxt);
extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
extern simple_ipa_opt_pass *make_pass_ipa_pta (gcc::context *ctxt);
+extern simple_ipa_opt_pass *make_pass_ipa_kcfi (gcc::context *ctxt);
extern simple_ipa_opt_pass *make_pass_ipa_tm (gcc::context *ctxt);
extern simple_ipa_opt_pass *make_pass_target_clone (gcc::context *ctxt);
extern simple_ipa_opt_pass *make_pass_dispatcher_calls (gcc::context *ctxt);
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 1e3a94ed9493..a12cfe48772a 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3. If not see
#include "gimplify.h"
#include "tree-pretty-print.h"
#include "gcc-rich-location.h"
+#include "asan.h"
#include "gcc-urlifier.h"
static tree handle_packed_attribute (tree *, tree, tree, int, bool *);
@@ -6508,6 +6509,17 @@ static tree
handle_patchable_function_entry_attribute (tree *, tree name, tree args,
int, bool *no_add_attrs)
{
+ /* Function-specific patchable_function_entry attribute is incompatible
+ with KCFI because KCFI callsites cannot know about function-specific
+ patchable entry settings on a preamble in a different translation
+ unit. */
+ if (sanitize_flags_p (SANITIZE_KCFI))
+ {
+ error ("%qE attribute cannot be used with %<-fsanitize=kcfi%>", name);
+ *no_add_attrs = true;
+ return NULL_TREE;
+ }
+
for (; args; args = TREE_CHAIN (args))
{
tree val = TREE_VALUE (args);
diff --git a/gcc/df-scan.cc b/gcc/df-scan.cc
index 1e4c6a2a4fb5..0e9c75df48dd 100644
--- a/gcc/df-scan.cc
+++ b/gcc/df-scan.cc
@@ -2851,6 +2851,12 @@ df_uses_record (class df_collection_rec *collection_rec,
/* If we're clobbering a REG then we have a def so ignore. */
return;
+ case KCFI:
+ /* KCFI wraps other RTL - process the wrapped RTL. */
+ df_uses_record (collection_rec, &XEXP (x, 0), ref_type, bb, insn_info, flags);
+ /* The type ID operand (XEXP (x, 1)) doesn't contain register uses. */
+ return;
+
case MEM:
df_uses_record (collection_rec,
&XEXP (x, 0), DF_REF_REG_MEM_LOAD,
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 56c4fa86e346..cd70e6351a4e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18382,6 +18382,41 @@ possible by specifying the command-line options
@option{--param hwasan-instrument-allocas=1} respectively. Using a random frame
tag is not implemented for kernel instrumentation.
+@opindex fsanitize=kcfi
+@item -fsanitize=kcfi
+Enable Kernel Control Flow Integrity (KCFI), a lightweight control
+flow integrity mechanism designed for operating system kernels.
+KCFI instruments indirect function calls to verify that the target
+function has the expected type signature at runtime. Each function
+receives a unique type identifier computed from a hash of its function
+prototype (including parameter types and return type). Before each
+indirect call, the implementation inserts a check to verify that the
+target function's type identifier matches the expected identifier
+for the call site, terminating the program if a mismatch is detected.
+This provides forward-edge control flow protection against attacks that
+attempt to redirect indirect calls to unintended targets.
+
+The implementation adds minimal runtime overhead and does not require
+runtime library support, making it suitable for kernel environments.
+The type identifier is placed before the function entry point,
+allowing runtime verification without additional metadata structures,
+and without changing the entry points of the target functions. Only
+functions whose address is taken receive the KCFI preamble
+instrumentation.
+
+KCFI is intended primarily for kernel code and may not be suitable
+for user-space applications that rely on techniques incompatible
+with strict type checking of indirect calls.
+
+Note that KCFI is incompatible with function-specific
+@code{patchable_function_entry} attributes because KCFI call sites
+cannot know about function-specific patchable entry settings in different
+translation units. Only the global @option{-fpatchable-function-entry}
+command-line option is supported with KCFI.
+
+Use @option{-fdump-tree-kcfi} to examine the computed type identifiers
+and their corresponding mangled type strings during compilation.
+
@opindex fsanitize=pointer-compare
@item -fsanitize=pointer-compare
Instrument comparison operation (<, <=, >, >=) with pointer operands.
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 37642680f423..69603fdad090 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -3166,6 +3166,7 @@ This describes the stack layout and calling conventions.
* Tail Calls::
* Shrink-wrapping separate components::
* Stack Smashing Protection::
+* Kernel Control Flow Integrity::
* Miscellaneous Register Hooks::
@end menu
@@ -5432,6 +5433,36 @@ should be allocated from heap memory and consumers should release them.
The result will be pruned to cases with PREFIX if not NULL.
@end deftypefn
+@node Kernel Control Flow Integrity
+@subsection Kernel Control Flow Integrity
+@cindex kernel control flow integrity
+@cindex KCFI
+
+@deftypefn {Target Hook} bool TARGET_KCFI_SUPPORTED (void)
+Return true if the target supports Kernel Control Flow Integrity (KCFI).
+This hook indicates whether the target has implemented the necessary RTL
+patterns and infrastructure to support KCFI instrumentation. The default
+implementation returns false.
+@end deftypefn
+
+@deftypefn {Target Hook} uint32_t TARGET_KCFI_MASK_TYPE_ID (uint32_t @var{type_id})
+Apply architecture-specific masking to KCFI type ID. This hook allows
+targets to apply bit masks or other transformations to the computed KCFI
+type identifier to match the target's specific requirements. The default
+implementation returns the type ID unchanged.
+@end deftypefn
+
+@deftypefn {Target Hook} int TARGET_KCFI_EMIT_TYPE_ID (FILE *@var{file}, uint32_t @var{type_id})
+Emit architecture-specific type ID instruction for KCFI preambles
+and return the size of the instruction in bytes.
+@var{file} is the assembly output stream and @var{type_id} is the KCFI
+type identifier to emit. If @var{file} is NULL, skip emission and only
+return the size. If not overridden, the default fallback emits a
+@code{.word} directive with the type ID and returns 4 bytes. Targets can
+override this to emit different instruction sequences and return their
+corresponding sizes.
+@end deftypefn
+
@node Miscellaneous Register Hooks
@subsection Miscellaneous register hooks
@cindex miscellaneous register hooks
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index c3ed9a9fd7c2..b2856886194c 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2433,6 +2433,7 @@ This describes the stack layout and calling conventions.
* Tail Calls::
* Shrink-wrapping separate components::
* Stack Smashing Protection::
+* Kernel Control Flow Integrity::
* Miscellaneous Register Hooks::
@end menu
@@ -3807,6 +3808,17 @@ generic code.
@hook TARGET_GET_VALID_OPTION_VALUES
+@node Kernel Control Flow Integrity
+@subsection Kernel Control Flow Integrity
+@cindex kernel control flow integrity
+@cindex KCFI
+
+@hook TARGET_KCFI_SUPPORTED
+
+@hook TARGET_KCFI_MASK_TYPE_ID
+
+@hook TARGET_KCFI_EMIT_TYPE_ID
+
@node Miscellaneous Register Hooks
@subsection Miscellaneous register hooks
@cindex miscellaneous register hooks
diff --git a/gcc/final.cc b/gcc/final.cc
index afcb0bb9efbc..7f6aa9f9e480 100644
--- a/gcc/final.cc
+++ b/gcc/final.cc
@@ -2094,6 +2094,9 @@ call_from_call_insn (const rtx_call_insn *insn)
case SET:
x = XEXP (x, 1);
break;
+ case KCFI:
+ x = XEXP (x, 0);
+ break;
}
}
return x;
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 3ab993aea573..0ee37e01d24a 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -2170,6 +2170,7 @@ const struct sanitizer_opts_s sanitizer_opts[] =
SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true, true),
SANITIZER_OPT (builtin, SANITIZE_BUILTIN, true, true),
SANITIZER_OPT (shadow-call-stack, SANITIZE_SHADOW_CALL_STACK, false, false),
+ SANITIZER_OPT (kcfi, SANITIZE_KCFI, false, true),
SANITIZER_OPT (all, ~sanitize_code_type (0), true, true),
#undef SANITIZER_OPT
{ NULL, sanitize_code_type (0), 0UL, false, false }
diff --git a/gcc/passes.cc b/gcc/passes.cc
index a33c8d924a52..4c6ceac740ff 100644
--- a/gcc/passes.cc
+++ b/gcc/passes.cc
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3. If not see
#include "diagnostic-core.h" /* for fnotice */
#include "stringpool.h"
#include "attribs.h"
+#include "kcfi.h"
/* Reserved TODOs */
#define TODO_verify_il (1u << 31)
diff --git a/gcc/passes.def b/gcc/passes.def
index 68ce53baa0f1..fd1bb0846801 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_ipa_auto_profile_offline);
NEXT_PASS (pass_ipa_free_lang_data);
NEXT_PASS (pass_ipa_function_and_variable_visibility);
+ NEXT_PASS (pass_ipa_kcfi);
NEXT_PASS (pass_ipa_strub_mode);
NEXT_PASS (pass_build_ssa_passes);
PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
@@ -275,6 +276,7 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_sink_code, false /* unsplit edges */);
NEXT_PASS (pass_sancov);
NEXT_PASS (pass_asan);
+ NEXT_PASS (pass_kcfi);
NEXT_PASS (pass_tsan);
NEXT_PASS (pass_dse, true /* use DR analysis */);
NEXT_PASS (pass_dce, false /* update_address_taken_p */, false /* remove_unused_locals */);
@@ -443,6 +445,7 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_sancov_O0);
NEXT_PASS (pass_lower_switch_O0);
NEXT_PASS (pass_asan_O0);
+ NEXT_PASS (pass_kcfi_O0);
NEXT_PASS (pass_tsan_O0);
NEXT_PASS (pass_musttail);
NEXT_PASS (pass_sanopt);
diff --git a/gcc/rtl.def b/gcc/rtl.def
index 15ae7d10fcc1..af643d187b95 100644
--- a/gcc/rtl.def
+++ b/gcc/rtl.def
@@ -318,6 +318,12 @@ DEF_RTL_EXPR(CLOBBER, "clobber", "e", RTX_EXTRA)
DEF_RTL_EXPR(CALL, "call", "ee", RTX_EXTRA)
+/* KCFI wrapper for call expressions.
+ Operand 0 is the call expression.
+ Operand 1 is the KCFI type ID (const_int). */
+
+DEF_RTL_EXPR(KCFI, "kcfi", "ee", RTX_EXTRA)
+
/* Return from a subroutine. */
DEF_RTL_EXPR(RETURN, "return", "", RTX_EXTRA)
diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index 63a1d08c46cf..4baa820b176e 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -1177,6 +1177,11 @@ reg_referenced_p (const_rtx x, const_rtx body)
case IF_THEN_ELSE:
return reg_overlap_mentioned_p (x, body);
+ case KCFI:
+ /* For KCFI wrapper, check both the wrapped call and the type ID.  */
+ return (reg_overlap_mentioned_p (x, XEXP (body, 0))
+ || reg_overlap_mentioned_p (x, XEXP (body, 1)));
+
case TRAP_IF:
return reg_overlap_mentioned_p (x, TRAP_CONDITION (body));
diff --git a/gcc/target.def b/gcc/target.def
index 8e491d838642..47a11c60809a 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -7589,6 +7589,44 @@ DEFHOOKPOD
The default value is NULL.",
const char *, NULL)
+/* Kernel Control Flow Integrity (KCFI) hooks. */
+#undef HOOK_PREFIX
+#define HOOK_PREFIX "TARGET_KCFI_"
+HOOK_VECTOR (TARGET_KCFI, kcfi)
+
+DEFHOOK
+(supported,
+ "Return true if the target supports Kernel Control Flow Integrity (KCFI).\n\
+This hook indicates whether the target has implemented the necessary RTL\n\
+patterns and infrastructure to support KCFI instrumentation. The default\n\
+implementation returns false.",
+ bool, (void),
+ hook_bool_void_false)
+
+DEFHOOK
+(mask_type_id,
+ "Apply architecture-specific masking to KCFI type ID. This hook allows\n\
+targets to apply bit masks or other transformations to the computed KCFI\n\
+type identifier to match the target's specific requirements. The default\n\
+implementation returns the type ID unchanged.",
+ uint32_t, (uint32_t type_id),
+ NULL)
+
+DEFHOOK
+(emit_type_id,
+ "Emit architecture-specific type ID instruction for KCFI preambles\n\
+and return the size of the instruction in bytes.\n\
+@var{file} is the assembly output stream and @var{type_id} is the KCFI\n\
+type identifier to emit. If @var{file} is NULL, skip emission and only\n\
+return the size. If not overridden, the default fallback emits a\n\
+@code{.word} directive with the type ID and returns 4 bytes. Targets can\n\
+override this to emit different instruction sequences and return their\n\
+corresponding sizes.",
+ int, (FILE *file, uint32_t type_id),
+ NULL)
+
+HOOK_VECTOR_END (kcfi)
+
/* Close the 'struct gcc_target' definition. */
HOOK_VECTOR_END (C90_EMPTY_HACK)
diff --git a/gcc/toplev.cc b/gcc/toplev.cc
index d26467450e37..9078bb6318a9 100644
--- a/gcc/toplev.cc
+++ b/gcc/toplev.cc
@@ -67,6 +67,7 @@ along with GCC; see the file COPYING3. If not see
#include "attribs.h"
#include "asan.h"
#include "tsan.h"
+#include "kcfi.h"
#include "plugin.h"
#include "context.h"
#include "pass_manager.h"
@@ -1739,6 +1740,16 @@ process_options ()
"requires %<-fno-exceptions%>");
}
+ if (flag_sanitize & SANITIZE_KCFI)
+ {
+ if (!targetm.kcfi.supported ())
+ sorry ("%<-fsanitize=kcfi%> not supported by this target");
+
+ /* KCFI is supported only for C at this time. */
+ if (!lang_GNU_C ())
+ sorry ("%<-fsanitize=kcfi%> is only supported for C");
+ }
+
HOST_WIDE_INT patch_area_size, patch_area_start;
parse_and_check_patch_area (flag_patchable_function_entry, false,
&patch_area_size, &patch_area_start);
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 08e642178ba5..e674e176f7d3 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -2104,6 +2104,16 @@ copy_bb (copy_body_data *id, basic_block bb,
/* Advance iterator now before stmt is moved to seq_gsi. */
gsi_next (&stmts_gsi);
+ /* If inlining from a function with no_sanitize("kcfi"), mark any
+ call statements in the inlined body with the flag so they skip
+ KCFI instrumentation. */
+ if (is_gimple_call (stmt)
+ && !sanitize_flags_p (SANITIZE_KCFI, id->src_fn))
+ {
+ gcall *call = as_a <gcall *> (stmt);
+ gimple_call_set_inlined_from_kcfi_nosantize (call, true);
+ }
+
if (gimple_nop_p (stmt))
continue;
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 0d78f5b384fb..b897954fd0ea 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3. If not see
#include "attribs.h"
#include "asan.h"
#include "rtl-iter.h"
+#include "kcfi.h"
#include "file-prefix-map.h" /* remap_debug_filename() */
#include "alloc-pool.h"
#include "toplev.h"
@@ -2199,6 +2200,9 @@ assemble_start_function (tree decl, const char *fnname)
unsigned short patch_area_size = crtl->patch_area_size;
unsigned short patch_area_entry = crtl->patch_area_entry;
+ /* Emit KCFI preamble before any patchable areas. */
+ kcfi_emit_preamble (asm_out_file, decl, fnname);
+
/* Emit the patching area before the entry label, if any. */
if (patch_area_entry > 0)
targetm.asm_out.print_patchable_function_entry (asm_out_file,
@@ -2767,6 +2771,19 @@ assemble_external_real (tree decl)
/* Some systems do require some output. */
SYMBOL_REF_USED (XEXP (rtl, 0)) = 1;
ASM_OUTPUT_EXTERNAL (asm_out_file, decl, XSTR (XEXP (rtl, 0), 0));
+
+ /* Emit KCFI type ID symbol for external function declarations that are address-taken. */
+ struct cgraph_node *node = (TREE_CODE (decl) == FUNCTION_DECL) ? cgraph_node::get (decl) : NULL;
+ if (flag_sanitize & SANITIZE_KCFI
+ && TREE_CODE (decl) == FUNCTION_DECL
+ && !DECL_INITIAL (decl) /* Only for external declarations (no function body) */
+ && node && node->address_taken) /* Use direct cgraph analysis for address-taken check. */
+ {
+ const char *name = XSTR (XEXP (rtl, 0), 0);
+ /* Strip any encoding prefixes like '*' from symbol name. */
+ name = targetm.strip_name_encoding (name);
+ emit_kcfi_typeid_symbol (asm_out_file, decl, name);
+ }
}
}
#endif
@@ -7283,16 +7300,25 @@ default_elf_asm_named_section (const char *name, unsigned int flags,
fprintf (asm_out_file, ",%d", flags & SECTION_ENTSIZE);
if (flags & SECTION_LINK_ORDER)
{
- /* For now, only section "__patchable_function_entries"
- adopts flag SECTION_LINK_ORDER, internal label LPFE*
- was emitted in default_print_patchable_function_entry,
- just place it here for linked_to section. */
- gcc_assert (!strcmp (name, "__patchable_function_entries"));
- fprintf (asm_out_file, ",");
- char buf[256];
- ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE",
- current_function_funcdef_no);
- assemble_name_raw (asm_out_file, buf);
+ if (!strcmp (name, "__patchable_function_entries"))
+ {
+ /* For patchable function entries, internal label LPFE*
+ was emitted in default_print_patchable_function_entry,
+ just place it here for linked_to section. */
+ fprintf (asm_out_file, ",");
+ char buf[256];
+ ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE",
+ current_function_funcdef_no);
+ assemble_name_raw (asm_out_file, buf);
+ }
+ else if (!strcmp (name, ".kcfi_traps"))
+ {
+ /* KCFI traps section links to .text section. */
+ fprintf (asm_out_file, ",.text");
+ }
+ else
+ internal_error ("unexpected use of %<SECTION_LINK_ORDER%> by section %qs",
+ name);
}
if (HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE))
{
--
2.34.1
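[Aside, not part of the patch: the preamble padding arithmetic in
kcfi_prepare_alignment_nops above can be sketched in a few lines of
Python. The function name and the 4-byte defaults here are mine, chosen
to mirror the C code; the layout being modeled is
__cfi_func: [alignment_nops][typeid][prefix_nops] func: [entry_nops].]

```python
def kcfi_alignment_nops(alignment_bytes, prefix_nops, typeid_size=4):
    """NOPs inserted before the type ID so that, with __cfi_func placed
    at an alignment_bytes boundary, `func` also lands on one after the
    type ID and the prefix NOPs."""
    return (alignment_bytes
            - ((prefix_nops + typeid_size) % alignment_bytes)) % alignment_bytes

# Default 4-byte alignment, no prefix NOPs: the 4-byte type ID already
# fills the slot exactly, so no padding is needed.
print(kcfi_alignment_nops(4, 0))    # → 0

# 16-byte alignment with 11 prefix NOPs: one padding NOP makes
# 1 + 4 + 11 == 16 bytes between __cfi_func and func.
print(kcfi_alignment_nops(16, 11))  # → 1
```

The invariant is that alignment_nops + typeid_size + prefix_nops is
always a multiple of the function alignment.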
^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-05 0:24 ` [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure Kees Cook
@ 2025-09-05 8:51 ` Peter Zijlstra
2025-09-05 16:19 ` Kees Cook
2025-09-09 18:49 ` Qing Zhao
1 sibling, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2025-09-05 8:51 UTC (permalink / raw)
To: Kees Cook
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches, linux-hardening
On Thu, Sep 04, 2025 at 05:24:10PM -0700, Kees Cook wrote:
> +- The check-call instruction sequence must be treated as a single unit: it
> + cannot be rearranged or split or optimized. The pattern is that
> + indirect calls, "call *$target", get converted into:
> +
> + mov $target_expression, %target ; only present if the expression was
> + ; not already %target register
> + load -$offset(%target), %tmp ; load the typeid hash at target
> + cmp $hash, %tmp ; compare expected typeid with loaded
> + je .Lcheck_passed ; jump to the indirect call
> + .Lkcfi_trap$N: ; label of trap insn
> + trap ; trap on failure, but arranged so
> + ; "permissive mode" falls through
> + .Lkcfi_call$N: ; label of call insn
> + call *%target ; actual indirect call
> +
> + This pattern of call immediately after trap provides for the
> + "permissive" checking mode automatically: the trap gets handled,
> + a warning emitted, and then execution continues after the trap to
> + the call.
I know it is far too late to do anything here. But I've recently dug
through a bunch of optimization manual and the like and that Jcc is
about as bad as it gets :/
The old optimization manual states that forward jumps are assumed
not-taken; while backward jumps are assumed taken.
The new wisdom is that any Jcc must be assumed not-taken; that is, the
fallthrough case has the best throughput.
Here we have a forward branch which is assumed taken :-(
* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-05 8:51 ` Peter Zijlstra
@ 2025-09-05 16:19 ` Kees Cook
2025-09-08 15:32 ` Peter Zijlstra
0 siblings, 1 reply; 32+ messages in thread
From: Kees Cook @ 2025-09-05 16:19 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches, linux-hardening
On Fri, Sep 05, 2025 at 10:51:03AM +0200, Peter Zijlstra wrote:
> On Thu, Sep 04, 2025 at 05:24:10PM -0700, Kees Cook wrote:
> > +- The check-call instruction sequence must be treated as a single unit: it
> > + cannot be rearranged or split or optimized. The pattern is that
> > + indirect calls, "call *$target", get converted into:
> > +
> > + mov $target_expression, %target ; only present if the expression was
> > + ; not already %target register
> > + load -$offset(%target), %tmp ; load the typeid hash at target
> > + cmp $hash, %tmp ; compare expected typeid with loaded
> > + je .Lcheck_passed ; jump to the indirect call
> > + .Lkcfi_trap$N: ; label of trap insn
> > + trap ; trap on failure, but arranged so
> > + ; "permissive mode" falls through
> > + .Lkcfi_call$N: ; label of call insn
> > + call *%target ; actual indirect call
> > +
> > + This pattern of call immediately after trap provides for the
> > + "permissive" checking mode automatically: the trap gets handled,
> > + a warning emitted, and then execution continues after the trap to
> > + the call.
>
> I know it is far too late to do anything here. But I've recently dug
> through a bunch of optimization manuals and the like, and that Jcc is
> about as bad as it gets :/
>
> The old optimization manual states that forward jumps are assumed
> not-taken; while backward jumps are assumed taken.
>
> The new wisdom is that any Jcc must be assumed not-taken; that is, the
> fallthrough case has the best throughput.
I would expect the cmp to be the slowest part of this sequence, and I
figured both the trap and the call to be speculation barriers? I'm
not sure, though. Is changing the sequence actually useful?
> Here we have a forward branch which is assumed taken :-(
The constraints we have are:
- Linux x86 KCFI trap handler decodes the instructions from the trap
backwards, but it uses exact offsets (-12 and -6).
- Control flow following the trap must make the call (for warn-only mode)
If we change this, we'd need to make the insn decoder smarter to likely
look at the insn AFTER the trap ("is it a direct jump?")
And then use this, which is ugly, but matches the second constraint:
cmp $hash, %tmp
jne .Ltrap
.Lcall:
call *%target
jmp .Ldone
.Ltrap:
ud2
jmp .Lcall
.Ldone:
+4 bytes for x86_64
--
Kees Cook
* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-05 16:19 ` Kees Cook
@ 2025-09-08 15:32 ` Peter Zijlstra
2025-09-08 21:55 ` Kees Cook
0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2025-09-08 15:32 UTC (permalink / raw)
To: Kees Cook
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches, linux-hardening
On Fri, Sep 05, 2025 at 09:19:29AM -0700, Kees Cook wrote:
> On Fri, Sep 05, 2025 at 10:51:03AM +0200, Peter Zijlstra wrote:
> > On Thu, Sep 04, 2025 at 05:24:10PM -0700, Kees Cook wrote:
> > > +- The check-call instruction sequence must be treated as a single unit: it
> > > + cannot be rearranged or split or optimized. The pattern is that
> > > + indirect calls, "call *$target", get converted into:
> > > +
> > > + mov $target_expression, %target ; only present if the expression was
> > > + ; not already %target register
> > > + load -$offset(%target), %tmp ; load the typeid hash at target
> > > + cmp $hash, %tmp ; compare expected typeid with loaded
> > > + je .Lcheck_passed ; jump to the indirect call
> > > + .Lkcfi_trap$N: ; label of trap insn
> > > + trap ; trap on failure, but arranged so
> > > + ; "permissive mode" falls through
> > > + .Lkcfi_call$N: ; label of call insn
> > > + call *%target ; actual indirect call
> > > +
> > > + This pattern of call immediately after trap provides for the
> > > + "permissive" checking mode automatically: the trap gets handled,
> > > + a warning emitted, and then execution continues after the trap to
> > > + the call.
> >
> > I know it is far too late to do anything here. But I've recently dug
> > through a bunch of optimization manuals and the like, and that Jcc is
> > about as bad as it gets :/
> >
> > The old optimization manual states that forward jumps are assumed
> > not-taken; while backward jumps are assumed taken.
> >
> > The new wisdom is that any Jcc must be assumed not-taken; that is, the
> > fallthrough case has the best throughput.
>
> I would expect the cmp to be the slowest part of this sequence, and I
> figured both the trap and the call to be speculation barriers? I'm
> not sure, though. Is changing the sequence actually useful?
The load can miss, in which case it is definitely the most expensive
thing around.
> > Here we have a forward branch which is assumed taken :-(
>
> The constraints we have are:
>
> - Linux x86 KCFI trap handler decodes the instructions from the trap
> backwards, but it uses exact offsets (-12 and -6).
> - Control flow following the trap must make the call (for warn-only mode)
>
> If we change this, we'd need to make the insn decoder smarter to likely
> look at the insn AFTER the trap ("is it a direct jump?")
>
> And then use this, which is ugly, but matches the second constraint:
>
> cmp $hash, %tmp
> jne .Ltrap
> .Lcall:
> call *%target
> jmp .Ldone
> .Ltrap:
> ud2
> jmp .Lcall
> .Ldone:
Ah, you can do something like:
cmp $hash, %tmp
jne +3
nopl -42(%rax)
call *%target
which is only 2 bytes longer. Notably, that nopl is 4 bytes and the 4th
byte is 0xd6 (aka UDB). This is an effective UDcc instruction based
around a forward non-taken branch.
But yeah, I don't know if it is worth changing this. It's just that I've
been staring at these things far too much of late :-)
* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-08 15:32 ` Peter Zijlstra
@ 2025-09-08 21:55 ` Kees Cook
0 siblings, 0 replies; 32+ messages in thread
From: Kees Cook @ 2025-09-08 21:55 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches, linux-hardening
On Mon, Sep 08, 2025 at 05:32:58PM +0200, Peter Zijlstra wrote:
> On Fri, Sep 05, 2025 at 09:19:29AM -0700, Kees Cook wrote:
> > On Fri, Sep 05, 2025 at 10:51:03AM +0200, Peter Zijlstra wrote:
> > > On Thu, Sep 04, 2025 at 05:24:10PM -0700, Kees Cook wrote:
> > > > +- The check-call instruction sequence must be treated as a single unit: it
> > > > + cannot be rearranged or split or optimized. The pattern is that
> > > > + indirect calls, "call *$target", get converted into:
> > > > +
> > > > + mov $target_expression, %target ; only present if the expression was
> > > > + ; not already %target register
> > > > + load -$offset(%target), %tmp ; load the typeid hash at target
> > > > + cmp $hash, %tmp ; compare expected typeid with loaded
> > > > + je .Lcheck_passed ; jump to the indirect call
> > > > + .Lkcfi_trap$N: ; label of trap insn
> > > > + trap ; trap on failure, but arranged so
> > > > + ; "permissive mode" falls through
> > > > + .Lkcfi_call$N: ; label of call insn
> > > > + call *%target ; actual indirect call
> > > > +
> > > > + This pattern of call immediately after trap provides for the
> > > > + "permissive" checking mode automatically: the trap gets handled,
> > > > + a warning emitted, and then execution continues after the trap to
> > > > + the call.
> > >
> > > I know it is far too late to do anything here. But I've recently dug
> > > through a bunch of optimization manuals and the like, and that Jcc is
> > > about as bad as it gets :/
> > >
> > > The old optimization manual states that forward jumps are assumed
> > > not-taken; while backward jumps are assumed taken.
> > >
> > > The new wisdom is that any Jcc must be assumed not-taken; that is, the
> > > fallthrough case has the best throughput.
> >
> > I would expect the cmp to be the slowest part of this sequence, and I
> > figured both the trap and the call to be speculation barriers? I'm
> > not sure, though. Is changing the sequence actually useful?
>
> The load can miss, in which case it is definitely the most expensive
> thing around.
>
> > > Here we have a forward branch which is assumed taken :-(
> >
> > The constraints we have are:
> >
> > - Linux x86 KCFI trap handler decodes the instructions from the trap
> > backwards, but it uses exact offsets (-12 and -6).
> > - Control flow following the trap must make the call (for warn-only mode)
> >
> > If we change this, we'd need to make the insn decoder smarter to likely
> > look at the insn AFTER the trap ("is it a direct jump?")
> >
> > And then use this, which is ugly, but matches the second constraint:
> >
> > cmp $hash, %tmp
> > jne .Ltrap
> > .Lcall:
> > call *%target
> > jmp .Ldone
> > .Ltrap:
> > ud2
> > jmp .Lcall
> > .Ldone:
>
> Ah, you can do something like:
>
> cmp $hash, %tmp
> jne +3
> nopl -42(%rax)
> call *%target
>
> which is only 2 bytes longer. Notably, that nopl is 4 bytes and the 4th
> byte is 0xd6 (aka UDB). This is an effective UDcc instruction based
> around a forward non-taken branch.
Oh right, I forgot about the nop encodings.
> But yeah, I don't know if it is worth changing this. It's just that I've
> been staring at these things far too much of late :-)
To do this we'd need to change the Linux trap handler and Clang's
implementation, so yeah, I'm inclined to just leave it as-is until we
have a stronger reason to change it.
--
Kees Cook
* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-05 0:24 ` [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure Kees Cook
2025-09-05 8:51 ` Peter Zijlstra
@ 2025-09-09 18:49 ` Qing Zhao
2025-09-11 3:05 ` Kees Cook
1 sibling, 1 reply; 32+ messages in thread
From: Qing Zhao @ 2025-09-09 18:49 UTC (permalink / raw)
To: Kees Cook
Cc: Andrew Pinski, Richard Biener, Joseph Myers, Jan Hubicka,
Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
Jim Wilson, Peter Zijlstra, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches@gcc.gnu.org, linux-hardening@vger.kernel.org
> On Sep 4, 2025, at 20:24, Kees Cook <kees@kernel.org> wrote:
>
> Implements the Linux Kernel Control Flow Integrity ABI, which provides a
> function prototype based forward edge control flow integrity protection
> by instrumenting every indirect call to check for a hash value before
> the target function address. If the hash at the call site and the hash
> at the target do not match, execution will trap.
>
> See the start of kcfi.cc for design details.
>
> gcc/ChangeLog:
>
> * kcfi.h: New file with KCFI public interface declarations.
> * kcfi.cc: New file implementing Kernel Control Flow Integrity
> infrastructure.
> * Makefile.in (OBJS): Add kcfi.o.
> * flag-types.h (enum sanitize_code): Add SANITIZE_KCFI.
> * gimple.h (enum gf_mask): Add GF_CALL_INLINED_FROM_KCFI_NOSANTIZE.
> (gimple_call_set_inlined_from_kcfi_nosantize): New function.
> (gimple_call_inlined_from_kcfi_nosantize_p): New function.
> * tree-pass.h: Add kcfi passes.
> * c-family/c-attribs.cc: Include asan.h.
> (handle_patchable_function_entry_attribute): Add error for using
> patchable_function_entry attribute with -fsanitize=kcfi.
> * df-scan.cc (df_uses_record): Add KCFI case to handle KCFI RTL
> patterns and process wrapped RTL.
> * doc/invoke.texi (fsanitize=kcfi): Add documentation for KCFI
> sanitizer option.
> * doc/tm.texi.in: Add Kernel Control Flow Integrity section with
> TARGET_KCFI_SUPPORTED, TARGET_KCFI_MASK_TYPE_ID,
> TARGET_KCFI_EMIT_TYPE_ID hooks.
> * doc/tm.texi: Regenerate.
> * final.cc (call_from_call_insn): Add KCFI case to handle
> KCFI-wrapped calls.
> * opts.cc (sanitizer_opts): Add kcfi entry.
> * passes.cc: Include kcfi.h.
> * passes.def: Add KCFI passes (GIMPLE and IPA).
> * rtl.def (KCFI): Add new RTL code for KCFI instrumentation.
> * rtlanal.cc (rtx_cost): Add KCFI case.
> * target.def: Add KCFI target hooks.
> * toplev.cc (process_options): Add KCFI option processing.
> * tree-inline.cc: Include kcfi.h and asan.h.
> (copy_bb): Handle KCFI no_sanitize attribute propagation during
> inlining.
> * varasm.cc (assemble_start_function): Emit KCFI preambles.
> (assemble_external_real): Emit KCFI typeid symbols.
> (default_elf_asm_named_section): Handle .kcfi_traps using
> SECTION_LINK_ORDER flag.
>
> Signed-off-by: Kees Cook <kees@kernel.org>
> ---
> gcc/kcfi.h | 47 +++
> gcc/kcfi.cc | 764 ++++++++++++++++++++++++++++++++++++++
> gcc/Makefile.in | 1 +
> gcc/flag-types.h | 2 +
> gcc/gimple.h | 21 ++
> gcc/tree-pass.h | 3 +
> gcc/c-family/c-attribs.cc | 12 +
> gcc/df-scan.cc | 6 +
> gcc/doc/invoke.texi | 35 ++
> gcc/doc/tm.texi | 31 ++
> gcc/doc/tm.texi.in | 12 +
> gcc/final.cc | 3 +
> gcc/opts.cc | 1 +
> gcc/passes.cc | 1 +
> gcc/passes.def | 3 +
> gcc/rtl.def | 6 +
> gcc/rtlanal.cc | 5 +
> gcc/target.def | 38 ++
> gcc/toplev.cc | 11 +
> gcc/tree-inline.cc | 10 +
> gcc/varasm.cc | 46 ++-
> 21 files changed, 1048 insertions(+), 10 deletions(-)
> create mode 100644 gcc/kcfi.h
> create mode 100644 gcc/kcfi.cc
>
> diff --git a/gcc/kcfi.h b/gcc/kcfi.h
> new file mode 100644
> index 000000000000..17ec59a1a3b8
> --- /dev/null
> +++ b/gcc/kcfi.h
> @@ -0,0 +1,47 @@
> +/* Kernel Control Flow Integrity (KCFI) support for GCC.
> + Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3. If not see
> +<http://www.gnu.org/licenses/>. */
> +
> +#ifndef GCC_KCFI_H
> +#define GCC_KCFI_H
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "rtl.h"
> +
> +/* Common helper for RTL patterns to emit .kcfi_traps section entry.
> + Call after emitting trap label and instruction with the trap symbol
> + reference. */
> +extern void kcfi_emit_traps_section (FILE *file, rtx trap_label_sym);
> +
> +/* Extract KCFI type ID from current GIMPLE statement. */
> +extern rtx kcfi_get_call_type_id (void);
> +
> +/* Emit KCFI type ID symbol for address-taken functions. */
> +extern void emit_kcfi_typeid_symbol (FILE *asm_file, tree decl,
> + const char *name);
> +
> +/* Emit KCFI preamble. */
> +extern void kcfi_emit_preamble (FILE *file, tree decl,
> + const char *actual_fname);
> +
> +/* For calculating callsite offset. */
> +extern HOST_WIDE_INT kcfi_patchable_entry_prefix_nops;
> +
> +#endif /* GCC_KCFI_H */
> diff --git a/gcc/kcfi.cc b/gcc/kcfi.cc
> new file mode 100644
> index 000000000000..1ae0602eac7b
> --- /dev/null
> +++ b/gcc/kcfi.cc
> @@ -0,0 +1,764 @@
> +/* Kernel Control Flow Integrity (KCFI) support for GCC.
> + Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3. If not see
> +<http://www.gnu.org/licenses/>. */
> +
> +/* KCFI ABI Design:
> +
> +The Linux Kernel Control Flow Integrity ABI provides a function prototype
> +based forward edge control flow integrity protection by instrumenting
> +every indirect call to check for a hash value before the target function
> +address. If the hash at the call site and the hash at the target do not
> +match, execution will trap.
> +
> +The general CFI ideas are discussed here, though the paper focuses more
> +on a CFG analysis to construct valid call destinations, which tends to
> +require LTO:
> +https://users.soe.ucsc.edu/~abadi/Papers/cfi-tissec-revised.pdf
> +
> +Later refinement for using jump tables (constructed via CFG analysis
> +during LTO) was proposed here:
> +https://www.usenix.org/system/files/conference/usenixsecurity14/sec14-paper-tice.pdf
> +
> +Linux used the above implementation from 2018 to 2022:
> +https://android-developers.googleblog.com/2018/10/control-flow-integrity-in-android-kernel.html
> +but the corner cases of target addresses not being the actual functions
> +(i.e. pointing into the jump table) were a continual source of problems,
> +and generating the jump tables required full LTO, which had its own set
> +of problems.
> +
> +Looking at function prototypes as the source of call validity was
> +presented here, though it still relied on LTO:
> +https://www.blackhat.com/docs/asia-17/materials/asia-17-Moreira-Drop-The-Rop-Fine-Grained-Control-Flow-Integrity-For-The-Linux-Kernel-wp.pdf
> +
> +The KCFI approach built on the function-prototype idea, but avoided
> +needing LTO, and could be further updated to deal with CPU errata
> +(retpolines, etc):
> +https://lpc.events/event/16/contributions/1315/
> +
> +KCFI has a number of specific constraints. Some are tied to the
> +backend architecture, which are covered in arch-specific code.
> +The constraints are:
> +
> +- The KCFI scheme generates a unique 32-bit hash for each unique function
> + prototype, allowing for indirect call sites to verify that they are
> + calling into a matching _type_ of function pointer. This changes the
> + semantics of some optimization logic because now indirect calls to
> + different types cannot be merged. For example:
> +
> + if (p->func_type_1)
> + return p->func_type_1();
> + if (p->func_type_2)
> + return p->func_type_2();
> +
> + In final asm, the optimizer may collapse the second indirect call
> + into a jump to the first indirect call once it has loaded the function
> + pointer. KCFI must block cross-type merging; otherwise a single KCFI
> + check for one type would cover calls to two target types. The
> + distinguishing characteristic for call merging becomes the type, not
> + the address/register usage.
> +
> +- The check-call instruction sequence must be treated as a single unit: it
> + cannot be rearranged or split or optimized. The pattern is that
> + indirect calls, "call *$target", get converted into:
> +
> + mov $target_expression, %target ; only present if the expression was
> + ; not already %target register
> + load -$offset(%target), %tmp ; load the typeid hash at target
> + cmp $hash, %tmp ; compare expected typeid with loaded
> + je .Lcheck_passed ; jump to the indirect call
> + .Lkcfi_trap$N: ; label of trap insn
> + trap ; trap on failure, but arranged so
> + ; "permissive mode" falls through
> + .Lkcfi_call$N: ; label of call insn
> + call *%target ; actual indirect call
> +
> + This pattern of call immediately after trap provides for the
> + "permissive" checking mode automatically: the trap gets handled,
> + a warning emitted, and then execution continues after the trap to
> + the call.
> +
> +- KCFI check-call instrumentation must survive tail call optimization.
> + If an indirect call is turned into an indirect jump, KCFI checking
> + must still happen (but will still use the jmp).
> +
> +- Functions that may be called indirectly have a preamble added,
> + __cfi_$original_func_name, which contains the $hash value:
> +
> + __cfi_target_func:
> + .word $hash
> + target_func:
> + [regular function entry...]
> +
> +- The preamble needs to interact with patchable function entry so that
> + the hash appears further away from the actual start of the function
> + (leaving the prefix NOPs of the patchable function entry unchanged).
> + This means only _globally defined_ patchable function entry is supported
> + with KCFI (indirect call sites must know in advance what the offset is,
> + which may not be possible with extern functions). For example, a "4,4"
> + patchable function entry would end up like:
> +
> + __cfi_target_func:
> + .data $hash
> + nop nop nop nop
> + target_func:
> + [regular function entry...]
> +
> + Architectures may need to add alignment nops prior to the hash to keep things
> + aligned for function call conventions.
> +
> +- External functions that are address-taken have a weak __kcfi_typeid_$funcname
> + symbol added with the hash value available so that the hash can be referenced
> + from assembly linkages, etc., where the hash values cannot be calculated
> + (i.e. where C type information is missing):
> +
> + .weak __kcfi_typeid_$func
> + .set __kcfi_typeid_$func, $hash
> +
> +- On architectures that do not have a good way to encode additional
> + details in their trap insn (e.g. x86_64 and riscv64), the trap location
> + is identified as a KCFI trap via a relative address offset entry
> + emitted into the .kcfi_traps section for each indirect call site's
> + trap instruction. The previous check-call example's insn sequence has
> + a section push/pop inserted between the trap and call:
> +
> + ...
> + .Lkcfi_trap$N:
> + trap
> + .section .kcfi_traps,"ao",@progbits,.text
> + .Lkcfi_entry$N:
> + .long .Lkcfi_trap$N - .Lkcfi_entry$N
> + .text
> + .Lkcfi_call$N:
> + call *%target
> +
> + For architectures that can encode immediates in their trap function
> + (e.g. aarch64 and arm32), this isn't needed: they just use immediate
> + codes that indicate a KCFI trap.
> +
> +- The no_sanitize("kcfi") function attribute means that the marked
> + function must not produce KCFI checking for indirect calls, and this
> + attribute must survive inlining. This is used rarely by Linux, but
> + is required to make BPF JIT trampolines work on older Linux kernel
> + versions.
> +
> +As a result of these constraints, there are some behavioral aspects
> +that need to be preserved across the middle-end and back-end.
> +
> +For indirect call sites:
> +
> +- Keeping indirect calls from being merged (see above) by adding a
> + wrapping type so that equality is tested based on type-id.
I still think that the additional newly created wrapper type and the new
assignment stmt
wrapper_tmp = (wrapper_ptr_type) fn
are not necessary.
All the information can be obtained from the function type plus the type-id,
which is attached as an attribute to the original_function_type of "fn".
Could you explain why the wrapper type and the new temporary and assignment
are necessary?
> +
> +- Keeping typeid information available through to the RTL expansion
> + phase was done via a new KCFI insn that wraps CALL and the typeid.
Is the new KCFI insn the following:
wrapper_tmp = (wrapper_ptr_type) fn?
Why is the type-id attached as an attribute not enough?
> +
> +- To make sure KCFI expansion is skipped for inline functions, the
> + inlining is marked during GIMPLE with a new flag which is checked
> + during expansion.
> +
> +For indirect call targets:
> +
> +- kcfi_emit_preamble() uses function_needs_kcfi_preamble(),
> + to emit the preablem,
Typo: preamble.
> which interacts with patchable function
> + entry to add any needed alignment.
> +
> +- gcc/varasm.cc, assemble_external_real() calls emit_kcfi_typeid_symbol()
> + to add the __kcfi_typeid symbols (see get_function_kcfi_type_id()
> + below).
> +
> +*/
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "target.h"
> +#include "function.h"
> +#include "tree.h"
> +#include "tree-pass.h"
> +#include "dumpfile.h"
> +#include "basic-block.h"
> +#include "gimple.h"
> +#include "gimple-iterator.h"
> +#include "cgraph.h"
> +#include "kcfi.h"
> +#include "stringpool.h"
> +#include "attribs.h"
> +#include "rtl.h"
> +#include "cfg.h"
> +#include "cfgrtl.h"
> +#include "asan.h"
> +#include "diagnostic-core.h"
> +#include "memmodel.h"
> +#include "print-tree.h"
> +#include "emit-rtl.h"
> +#include "output.h"
> +#include "builtins.h"
> +#include "varasm.h"
> +#include "opts.h"
> +#include "mangle.h"
> +#include "target.h"
> +#include "flags.h"
> +
> +HOST_WIDE_INT kcfi_patchable_entry_prefix_nops = 0; /* For callsite offset */
> +static HOST_WIDE_INT kcfi_patchable_entry_arch_alignment_nops = 0; /* For preamble alignment */
> +
> +/* Common helper for RTL patterns to emit .kcfi_traps section entry. */
There should be one empty line between a function's leading comment and the
start of the function.
I noticed that such an empty line is needed for all the new functions you
added. :-)
> +void
> +kcfi_emit_traps_section (FILE *file, rtx trap_label_sym)
> +{
> + /* Generate entry label internally and get its number. */
> + rtx entry_label = gen_label_rtx ();
> + int entry_labelno = CODE_LABEL_NUMBER (entry_label);
> +
> + /* Generate entry label name with custom prefix. */
> + char entry_name[32];
> + ASM_GENERATE_INTERNAL_LABEL (entry_name, "Lkcfi_entry", entry_labelno);
> +
> + /* Save current section to restore later. */
> + section *saved_section = in_section;
> +
> + /* Use varasm infrastructure for section handling. */
> + section *kcfi_traps_section = get_section (".kcfi_traps",
> + SECTION_LINK_ORDER, NULL);
> + switch_to_section (kcfi_traps_section);
> +
> + /* Emit entry label. */
> + ASM_OUTPUT_LABEL (file, entry_name);
> +
> + /* Generate address difference using RTL infrastructure. */
> + rtx entry_label_sym = gen_rtx_SYMBOL_REF (Pmode, entry_name);
> + rtx addr_diff = gen_rtx_MINUS (Pmode, trap_label_sym, entry_label_sym);
> +
> + /* Emit the address difference as a 4-byte value. */
> + assemble_integer (addr_diff, 4, BITS_PER_UNIT, 1);
> +
> + /* Restore the previous section. */
> + switch_to_section (saved_section);
> +}
> +
> +/* Compute KCFI type ID for a function declaration or function type (internal) */
> +static uint32_t
> +compute_kcfi_type_id (tree fntype, tree fndecl = NULL_TREE)
> +{
> + gcc_assert (fntype);
> + gcc_assert (TREE_CODE (fntype) == FUNCTION_TYPE);
> +
> + uint32_t type_id = hash_function_type (fntype, fndecl);
> +
> + /* Apply target-specific masking if supported. */
> + if (targetm.kcfi.mask_type_id)
> + type_id = targetm.kcfi.mask_type_id (type_id);
> +
> + return type_id;
> +}
> +
> +/* Check if a function needs KCFI preamble generation.
> + ALL functions get preambles when -fsanitize=kcfi is enabled, regardless
> + of no_sanitize("kcfi") attribute. */
Why is no_sanitize("kcfi") not considered here?
> +static bool
> +function_needs_kcfi_preamble (tree fndecl)
> +{
> + /* Only instrument if KCFI is globally enabled. */
> + if (!(flag_sanitize & SANITIZE_KCFI))
> + return false;
> +
> + struct cgraph_node *node = cgraph_node::get (fndecl);
> +
> + /* Ignore cold partition functions: not reached via indirect call. */
> + if (node && node->split_part)
> + return false;
> +
> + /* Ignore cold partition sections: cold partitions are never indirect call
> + targets. Only skip preambles for cold partitions (has_bb_partition = true)
> + not for entire cold-attributed functions (has_bb_partition = false). */
> + if (in_cold_section_p && crtl && crtl->has_bb_partition)
> + return false;
> +
> + /* Check if function is truly address-taken using cgraph node analysis. */
> + bool addr_taken = (node && node->address_taken);
> +
> + /* Only instrument functions that can be targets of indirect calls:
> + - Public functions (can be called externally)
> + - External declarations (from other modules)
> + - Functions with true address-taken status from cgraph analysis. */
> + return TREE_PUBLIC (fndecl) || DECL_EXTERNAL (fndecl) || addr_taken;
> +}
> +
> +/* Function attribute to store KCFI type ID. */
> +static tree kcfi_type_id_attr = NULL_TREE;
> +
> +/* Set KCFI type ID for a function declaration during IPA phase.
> + Fatal error if type ID is already set. */
> +static void
> +set_function_kcfi_type_id (tree fndecl)
> +{
> + if (!kcfi_type_id_attr)
> + kcfi_type_id_attr = get_identifier ("kcfi_type_id");
> +
> + /* Fatal error if type ID already set - nothing should set it twice. */
> + if (lookup_attribute_by_prefix ("kcfi_type_id",
> + DECL_ATTRIBUTES (fndecl)))
> + internal_error ("KCFI type ID already set for function %qD", fndecl);
> +
> + /* Compute type ID using FUNCTION_TYPE to preserve typedef information. */
> + uint32_t type_id = compute_kcfi_type_id (TREE_TYPE (fndecl), fndecl);
> +
> + tree type_id_tree = build_int_cst (unsigned_type_node, type_id);
> + tree attr_value = build_tree_list (NULL_TREE, type_id_tree);
> + tree attr = build_tree_list (kcfi_type_id_attr, attr_value);
> +
> + DECL_ATTRIBUTES (fndecl) = chainon (DECL_ATTRIBUTES (fndecl), attr);
> +}
> +
> +/* Get KCFI type ID for a function declaration during assembly output phase.
> + Fatal error if type ID was not previously set during IPA phase. */
> +static uint32_t
> +get_function_kcfi_type_id (tree fndecl)
> +{
> + if (!kcfi_type_id_attr)
> + kcfi_type_id_attr = get_identifier ("kcfi_type_id");
> +
> + tree attr = lookup_attribute_by_prefix ("kcfi_type_id",
> + DECL_ATTRIBUTES (fndecl));
> + if (attr && TREE_VALUE (attr) && TREE_VALUE (TREE_VALUE (attr)))
> + {
> + tree value = TREE_VALUE (TREE_VALUE (attr));
> + if (TREE_CODE (value) == INTEGER_CST)
> + return (uint32_t) TREE_INT_CST_LOW (value);
The indentation above is off.
> + }
> +
> + internal_error ("KCFI type ID not found for function %qD - "
> + "should have been set during GIMPLE phase", fndecl);
> +}
> +
> +/* Prepare the global KCFI alignment NOPs calculation.
> + Called once during IPA pass to set global variable. */
> +static void
> +kcfi_prepare_alignment_nops (void)
> +{
> + /* Only use global patchable-function-entry flag, not function attributes.
> + KCFI callsites cannot know about function-specific attributes. */
> + if (flag_patchable_function_entry)
> + {
> + HOST_WIDE_INT total_nops, prefix_nops = 0;
> + parse_and_check_patch_area (flag_patchable_function_entry, false,
> + &total_nops, &prefix_nops);
> + /* Store value for callsite offset calculation */
> + kcfi_patchable_entry_prefix_nops = prefix_nops;
> + }
> +
> + /* Calculate architecture-specific alignment NOPs.
> + KCFI preamble layout:
> + __cfi_func: [alignment_nops][typeid][prefix_nops] func: [entry_nops]
> +
> + The alignment NOPs ensure __cfi_func stays at proper function alignment
> + when prefix NOPs are added. */
> + HOST_WIDE_INT arch_alignment = 0;
> +
> + /* Calculate alignment NOPs based on function alignment setting.
> + Use explicit -falign-functions if set, otherwise default to 4 bytes. */
> + int alignment_bytes = 4;
> + if (align_functions.levels[0].log > 0)
> + {
> + /* Use explicit -falign-functions setting */
> + alignment_bytes = align_functions.levels[0].get_value();
> + }
> +
> + /* Get typeid instruction size from target hook, default to 4 bytes */
> + int typeid_size = targetm.kcfi.emit_type_id
> + ? targetm.kcfi.emit_type_id (NULL, 0) : 4;
> +
> + /* Calculate alignment NOPs needed */
> + arch_alignment = (alignment_bytes - ((kcfi_patchable_entry_prefix_nops + typeid_size) % alignment_bytes)) % alignment_bytes;
The above line is too long.
> +
> + /* Use the calculated alignment NOPs */
> + kcfi_patchable_entry_arch_alignment_nops = arch_alignment;
> +}
> +
> +/* Check if this is an indirect call that needs KCFI instrumentation. */
> +static bool
> +is_kcfi_indirect_call (tree fn)
> +{
> + if (!fn)
> + return false;
> +
> + /* Only functions WITHOUT no_sanitize("kcfi") should generate KCFI checks at
> + indirect call sites. */
> + if (!sanitize_flags_p (SANITIZE_KCFI, current_function_decl))
> + return false;
> +
> + /* Direct function calls via ADDR_EXPR don't need KCFI checks. */
> + if (TREE_CODE (fn) == ADDR_EXPR)
> + return false;
> +
> + /* Everything else must be indirect calls needing KCFI. */
> + return true;
> +}
> +
> +/* Extract KCFI type ID from indirect call GIMPLE statement.
> + Returns RTX constant with type ID, or NULL_RTX if no KCFI needed. */
> +rtx
> +kcfi_get_call_type_id (void)
> +{
> + if (!sanitize_flags_p (SANITIZE_KCFI) || !currently_expanding_gimple_stmt)
> + return NULL_RTX;
> +
> + if (!is_gimple_call (currently_expanding_gimple_stmt))
> + return NULL_RTX;
> +
> + gcall *call_stmt = as_a <gcall *> (currently_expanding_gimple_stmt);
> +
> + /* Only indirect calls need KCFI instrumentation. */
> + if (gimple_call_fndecl (call_stmt))
> + return NULL_RTX;
> +
> + tree fn_type = gimple_call_fntype (call_stmt);
> + if (!fn_type)
> + return NULL_RTX;
> +
> + tree attr = lookup_attribute ("kcfi_type_id", TYPE_ATTRIBUTES (fn_type));
> + if (!attr || !TREE_VALUE (attr))
> + return NULL_RTX;
> +
> + if (gimple_call_inlined_from_kcfi_nosantize_p (call_stmt))
> + return NULL_RTX;
> +
> + uint32_t kcfi_type_id = (uint32_t) tree_to_uhwi (TREE_VALUE (attr));
> + return GEN_INT (kcfi_type_id);
> +}
> +
> +/* Emit KCFI type ID symbol for an address-taken function.
> + Centralized emission point to avoid duplication between
> + assemble_external_real() and assemble_start_function(). */
> +void
> +emit_kcfi_typeid_symbol (FILE *asm_file, tree decl, const char *name)
> +{
> + uint32_t type_id = get_function_kcfi_type_id (decl);
> + fprintf (asm_file, "\t.weak\t__kcfi_typeid_%s\n", name);
> + fprintf (asm_file, "\t.set\t__kcfi_typeid_%s, 0x%08x\n", name, type_id);
> +}
> +
> +void
> +kcfi_emit_preamble (FILE *file, tree decl, const char *actual_fname)
> +{
> + /* Check if KCFI is enabled and function needs preamble. */
> + if (!function_needs_kcfi_preamble (decl))
> + return;
> +
> + /* Use actual function name if provided, otherwise fall back to DECL_ASSEMBLER_NAME. */
> + const char *fname = actual_fname ? actual_fname
> + : IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
> +
> + /* Get type ID. */
> + uint32_t type_id = get_function_kcfi_type_id (decl);
> +
> + /* Create symbol name for reuse. */
> + char cfi_symbol_name[256];
> + snprintf (cfi_symbol_name, sizeof(cfi_symbol_name), "__cfi_%s", fname);
> +
> + /* Emit __cfi_ symbol with proper visibility. */
> + if (TREE_PUBLIC (decl))
> + {
> + if (DECL_WEAK (decl))
> + ASM_WEAKEN_LABEL (file, cfi_symbol_name);
> + else
> + targetm.asm_out.globalize_label (file, cfi_symbol_name);
> + }
The indentation is off in the above lines.
> +
> + /* Emit .type directive. */
> + ASM_OUTPUT_TYPE_DIRECTIVE (file, cfi_symbol_name, "function");
> + fprintf (file, "%s:\n", cfi_symbol_name);
> +
> + /* Emit architecture-specific prefix NOPs. */
> + for (int i = 0; i < kcfi_patchable_entry_arch_alignment_nops; i++)
> + {
> + fprintf (file, "\tnop\n");
> + }
> +
> + /* Emit type ID bytes. */
> + if (targetm.kcfi.emit_type_id)
> + targetm.kcfi.emit_type_id (file, type_id);
> + else
> + fprintf (file, "\t.word\t0x%08x\n", type_id);
> +
> + /* Mark end of __cfi_ symbol and emit size directive. */
> + char cfi_end_label[256];
> + snprintf (cfi_end_label, sizeof(cfi_end_label), ".Lcfi_func_end_%s", fname);
> + ASM_OUTPUT_LABEL (file, cfi_end_label);
> +
> + ASM_OUTPUT_MEASURED_SIZE (file, cfi_symbol_name);
> +}
> +
> +/* KCFI GIMPLE pass implementation. */
> +
> +static bool
> +gate_kcfi (void)
> +{
> + /* Always process functions when KCFI is globally enabled to set type IDs.
> + Individual function processing (call instrumentation) will check no_sanitize("kcfi"). */
> + return sanitize_flags_p (SANITIZE_KCFI);
> +}
> +
> +/* Create a KCFI wrapper function type that embeds the type ID. */
> +static tree
> +create_kcfi_wrapper_type (tree original_fn_type, uint32_t type_id)
> +{
> + /* Create a unique type name incorporating the type ID. */
> + char wrapper_name[32];
> + snprintf (wrapper_name, sizeof (wrapper_name), "__kcfi_wrapper_%x", type_id);
> +
> + /* Build a new function type that's structurally identical but nominally different. */
> + tree wrapper_type = build_function_type (TREE_TYPE (original_fn_type),
> + TYPE_ARG_TYPES (original_fn_type));
> +
> + /* Set the type name to make it distinct. */
> + TYPE_NAME (wrapper_type) = get_identifier (wrapper_name);
> +
> + /* Attach kcfi_type_id attribute to the original function type for cfgexpand.cc */
> + tree attr_name = get_identifier ("kcfi_type_id");
> + tree attr_value = build_int_cst (unsigned_type_node, type_id);
> + tree attr = build_tree_list (attr_name, attr_value);
> + TYPE_ATTRIBUTES (original_fn_type) = chainon (TYPE_ATTRIBUTES (original_fn_type), attr);
> +
> + return wrapper_type;
> +}
As I asked previously: in the above routine, the "type_id" is already attached as an attribute to the "original_fn_type", so what additional information is carried by the new "wrapper_type"?
> +
> +/* Wrap indirect calls with KCFI type for anti-merging. */
> +static unsigned int
> +kcfi_instrument (void)
> +{
> + /* Process current function for call instrumentation only.
> + Type ID setting is handled by the separate IPA pass. */
> +
> + basic_block bb;
> +
> + FOR_EACH_BB_FN (bb, cfun)
> + {
> + gimple_stmt_iterator gsi;
> + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> + {
The indentation of the above line is off.
> + gimple *stmt = gsi_stmt (gsi);
> +
> + if (!is_gimple_call (stmt))
> + continue;
> +
> + gcall *call_stmt = as_a <gcall *> (stmt);
> +
> + // Skip internal calls - we only instrument indirect calls
> + if (gimple_call_internal_p (call_stmt))
> + continue;
> +
> + tree fndecl = gimple_call_fndecl (call_stmt);
> +
> + // Only process indirect calls (no fndecl)
> + if (fndecl)
> + continue;
> +
> + tree fn = gimple_call_fn (call_stmt);
> + if (!is_kcfi_indirect_call (fn))
> + continue;
> +
> + // Get the function type to compute KCFI type ID
> + tree fn_type = gimple_call_fntype (call_stmt);
> + gcc_assert (fn_type);
> + if (TREE_CODE (fn_type) != FUNCTION_TYPE)
> + continue;
> +
> + uint32_t type_id = compute_kcfi_type_id (fn_type);
> +
> + // Create KCFI wrapper type for this call
> + tree wrapper_type = create_kcfi_wrapper_type (fn_type, type_id);
Again, the new "type_id" has already been attached as an attribute of "fn_type" here.
> +
> + // Create a temporary variable for the wrapped function pointer
> + tree wrapper_ptr_type = build_pointer_type (wrapper_type);
> + tree wrapper_tmp = create_tmp_var (wrapper_ptr_type, "kcfi_wrapper");
> +
> + // Create assignment: wrapper_tmp = (wrapper_ptr_type) fn
> + tree cast_expr = build1 (NOP_EXPR, wrapper_ptr_type, fn);
> + gimple *cast_stmt = gimple_build_assign (wrapper_tmp, cast_expr);
> + gsi_insert_before (&gsi, cast_stmt, GSI_SAME_STMT);
> +
Why are the additional wrapper_ptr_type, wrapper_tmp, and new assignment stmt needed here?
> + // Update the call to use the wrapped function pointer
> + gimple_call_set_fn (call_stmt, wrapper_tmp);
> + }
> + }
> +
> + return 0;
> +}
> +
> +namespace {
> +
> +const pass_data pass_data_kcfi =
> +{
> + GIMPLE_PASS, /* type */
> + "kcfi", /* name */
> + OPTGROUP_NONE, /* optinfo_flags */
> + TV_NONE, /* tv_id */
> + ( PROP_ssa | PROP_cfg | PROP_gimple_leh ), /* properties_required */
> + 0, /* properties_provided */
> + 0, /* properties_destroyed */
> + 0, /* todo_flags_start */
> + TODO_update_ssa, /* todo_flags_finish */
> +};
> +
> +class pass_kcfi : public gimple_opt_pass
> +{
> +public:
> + pass_kcfi (gcc::context *ctxt)
> + : gimple_opt_pass (pass_data_kcfi, ctxt)
> + {}
> +
> + /* opt_pass methods: */
> + opt_pass * clone () final override { return new pass_kcfi (m_ctxt); }
> + bool gate (function *) final override
> + {
> + return gate_kcfi ();
> + }
> + unsigned int execute (function *) final override
> + {
> + return kcfi_instrument ();
> + }
> +
> +}; // class pass_kcfi
> +
> +} // anon namespace
> +
> +gimple_opt_pass *
> +make_pass_kcfi (gcc::context *ctxt)
> +{
> + return new pass_kcfi (ctxt);
> +}
> +
> +namespace {
> +
> +const pass_data pass_data_kcfi_O0 =
> +{
> + GIMPLE_PASS, /* type */
> + "kcfi0", /* name */
> + OPTGROUP_NONE, /* optinfo_flags */
> + TV_NONE, /* tv_id */
> + ( PROP_ssa | PROP_cfg | PROP_gimple_leh ), /* properties_required */
> + 0, /* properties_provided */
> + 0, /* properties_destroyed */
> + 0, /* todo_flags_start */
> + TODO_update_ssa, /* todo_flags_finish */
> +};
> +
> +class pass_kcfi_O0 : public gimple_opt_pass
> +{
> +public:
> + pass_kcfi_O0 (gcc::context *ctxt)
> + : gimple_opt_pass (pass_data_kcfi_O0, ctxt)
> + {}
> +
> + /* opt_pass methods: */
> + bool gate (function *) final override
> + {
> + return !optimize && gate_kcfi ();
> + }
> + unsigned int execute (function *) final override
> + {
> + return kcfi_instrument ();
> + }
> +
> +}; // class pass_kcfi_O0
> +
> +} // anon namespace
> +
> +gimple_opt_pass *
> +make_pass_kcfi_O0 (gcc::context *ctxt)
> +{
> + return new pass_kcfi_O0 (ctxt);
> +}
> +
> +/* IPA pass for KCFI type ID setting - runs once per compilation unit. */
> +
> +namespace {
> +
> +const pass_data pass_data_ipa_kcfi =
> +{
> + SIMPLE_IPA_PASS, /* type */
> + "ipa_kcfi", /* name */
> + OPTGROUP_NONE, /* optinfo_flags */
> + TV_IPA_OPT, /* tv_id */
> + 0, /* properties_required */
> + 0, /* properties_provided */
> + 0, /* properties_destroyed */
> + 0, /* todo_flags_start */
> + 0, /* todo_flags_finish */
> +};
> +
> +/* Set KCFI type IDs for all functions in the compilation unit. */
> +static unsigned int
> +ipa_kcfi_execute (void)
> +{
> + struct cgraph_node *node;
> +
> + /* Prepare global KCFI alignment NOPs calculation once for all functions. */
> + kcfi_prepare_alignment_nops ();
> +
> + /* Process all functions - both local and external.
> + This preserves typedef information using DECL_ARGUMENTS. */
> + FOR_EACH_FUNCTION (node)
> + {
> + tree fndecl = node->decl;
> +
> + /* Skip all non-NORMAL builtins (MD, FRONTEND) entirely.
> + For NORMAL builtins, skip those that lack an implicit
> + implementation (closest way to distinguishing DEF_LIB_BUILTIN
> + from others). E.g. we need to have typeids for memset(). */
> + if (fndecl_built_in_p (fndecl))
> + {
> + if (DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
> + continue;
> + if (!builtin_decl_implicit_p (DECL_FUNCTION_CODE (fndecl)))
> + continue;
> + }
> +
> + set_function_kcfi_type_id (fndecl);
> + }
> +
> + return 0;
> +}
> +
> +class pass_ipa_kcfi : public simple_ipa_opt_pass
> +{
> +public:
> + pass_ipa_kcfi (gcc::context *ctxt)
> + : simple_ipa_opt_pass (pass_data_ipa_kcfi, ctxt)
> + {}
> +
> + /* opt_pass methods: */
> + bool gate (function *) final override
> + {
> + return sanitize_flags_p (SANITIZE_KCFI);
> + }
> +
> + unsigned int execute (function *) final override
> + {
> + return ipa_kcfi_execute ();
> + }
> +
> +}; // class pass_ipa_kcfi
> +
> +} // anon namespace
> +
> +simple_ipa_opt_pass *
> +make_pass_ipa_kcfi (gcc::context *ctxt)
> +{
> + return new pass_ipa_kcfi (ctxt);
> +}
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 4c12ac68d979..84bbc4223734 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1591,6 +1591,7 @@ OBJS = \
> ira-emit.o \
> ira-lives.o \
> jump.o \
> + kcfi.o \
> langhooks.o \
> late-combine.o \
> lcm.o \
> diff --git a/gcc/flag-types.h b/gcc/flag-types.h
> index bf681c3e8153..c3c0bc61ee3e 100644
> --- a/gcc/flag-types.h
> +++ b/gcc/flag-types.h
> @@ -337,6 +337,8 @@ enum sanitize_code {
> SANITIZE_KERNEL_HWADDRESS = 1UL << 30,
> /* Shadow Call Stack. */
> SANITIZE_SHADOW_CALL_STACK = 1UL << 31,
> + /* KCFI (Kernel Control Flow Integrity) */
> + SANITIZE_KCFI = 1ULL << 32,
> SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
> SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
> | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
> diff --git a/gcc/gimple.h b/gcc/gimple.h
> index da32651ea017..cef915b9164f 100644
> --- a/gcc/gimple.h
> +++ b/gcc/gimple.h
> @@ -142,6 +142,7 @@ enum gf_mask {
> GF_CALL_ALLOCA_FOR_VAR = 1 << 5,
> GF_CALL_INTERNAL = 1 << 6,
> GF_CALL_CTRL_ALTERING = 1 << 7,
> + GF_CALL_INLINED_FROM_KCFI_NOSANTIZE = 1 << 8,
> GF_CALL_MUST_TAIL_CALL = 1 << 9,
> GF_CALL_BY_DESCRIPTOR = 1 << 10,
> GF_CALL_NOCF_CHECK = 1 << 11,
> @@ -3487,6 +3488,26 @@ gimple_call_from_thunk_p (gcall *s)
> return (s->subcode & GF_CALL_FROM_THUNK) != 0;
> }
>
> +/* If INLINED_FROM_KCFI_NOSANTIZE_P is true, mark GIMPLE_CALL S as being
> + inlined from a function with no_sanitize("kcfi"). */
> +
> +inline void
> +gimple_call_set_inlined_from_kcfi_nosantize (gcall *s, bool inlined_from_kcfi_nosantize_p)
> +{
> + if (inlined_from_kcfi_nosantize_p)
> + s->subcode |= GF_CALL_INLINED_FROM_KCFI_NOSANTIZE;
> + else
> + s->subcode &= ~GF_CALL_INLINED_FROM_KCFI_NOSANTIZE;
> +}
> +
> +/* Return true if GIMPLE_CALL S was inlined from a function with
> + no_sanitize("kcfi"). */
> +
> +inline bool
> +gimple_call_inlined_from_kcfi_nosantize_p (const gcall *s)
> +{
> + return (s->subcode & GF_CALL_INLINED_FROM_KCFI_NOSANTIZE) != 0;
> +}
>
> /* If FROM_NEW_OR_DELETE_P is true, mark GIMPLE_CALL S as being a call
> to operator new or delete created from a new or delete expression. */
> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
> index 1c68a69350df..fbf235adada3 100644
> --- a/gcc/tree-pass.h
> +++ b/gcc/tree-pass.h
> @@ -357,6 +357,8 @@ extern gimple_opt_pass *make_pass_tsan (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_tsan_O0 (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_sancov (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_sancov_O0 (gcc::context *ctxt);
> +extern gimple_opt_pass *make_pass_kcfi (gcc::context *ctxt);
> +extern gimple_opt_pass *make_pass_kcfi_O0 (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_lower_cf (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_refactor_eh (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_lower_eh (gcc::context *ctxt);
> @@ -544,6 +546,7 @@ extern ipa_opt_pass_d *make_pass_ipa_odr (gcc::context *ctxt);
> extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
> extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
> extern simple_ipa_opt_pass *make_pass_ipa_pta (gcc::context *ctxt);
> +extern simple_ipa_opt_pass *make_pass_ipa_kcfi (gcc::context *ctxt);
> extern simple_ipa_opt_pass *make_pass_ipa_tm (gcc::context *ctxt);
> extern simple_ipa_opt_pass *make_pass_target_clone (gcc::context *ctxt);
> extern simple_ipa_opt_pass *make_pass_dispatcher_calls (gcc::context *ctxt);
> diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
> index 1e3a94ed9493..a12cfe48772a 100644
> --- a/gcc/c-family/c-attribs.cc
> +++ b/gcc/c-family/c-attribs.cc
> @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3. If not see
> #include "gimplify.h"
> #include "tree-pretty-print.h"
> #include "gcc-rich-location.h"
> +#include "asan.h"
> #include "gcc-urlifier.h"
>
> static tree handle_packed_attribute (tree *, tree, tree, int, bool *);
> @@ -6508,6 +6509,17 @@ static tree
> handle_patchable_function_entry_attribute (tree *, tree name, tree args,
> int, bool *no_add_attrs)
> {
> + /* Function-specific patchable_function_entry attribute is incompatible
> + with KCFI because KCFI callsites cannot know about function-specific
> + patchable entry settings on a preamble in a different translation
> + unit. */
> + if (sanitize_flags_p (SANITIZE_KCFI))
> + {
> + error ("%qE attribute cannot be used with %<-fsanitize=kcfi%>", name);
> + *no_add_attrs = true;
> + return NULL_TREE;
> + }
> +
> for (; args; args = TREE_CHAIN (args))
> {
> tree val = TREE_VALUE (args);
> diff --git a/gcc/df-scan.cc b/gcc/df-scan.cc
> index 1e4c6a2a4fb5..0e9c75df48dd 100644
> --- a/gcc/df-scan.cc
> +++ b/gcc/df-scan.cc
> @@ -2851,6 +2851,12 @@ df_uses_record (class df_collection_rec *collection_rec,
> /* If we're clobbering a REG then we have a def so ignore. */
> return;
>
> + case KCFI:
> + /* KCFI wraps other RTL - process the wrapped RTL. */
> + df_uses_record (collection_rec, &XEXP (x, 0), ref_type, bb, insn_info, flags);
> + /* The type ID operand (XEXP (x, 1)) doesn't contain register uses. */
> + return;
> +
> case MEM:
> df_uses_record (collection_rec,
> &XEXP (x, 0), DF_REF_REG_MEM_LOAD,
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 56c4fa86e346..cd70e6351a4e 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -18382,6 +18382,41 @@ possible by specifying the command-line options
> @option{--param hwasan-instrument-allocas=1} respectively. Using a random frame
> tag is not implemented for kernel instrumentation.
>
> +@opindex fsanitize=kcfi
> +@item -fsanitize=kcfi
> +Enable Kernel Control Flow Integrity (KCFI), a lightweight control
> +flow integrity mechanism designed for operating system kernels.
> +KCFI instruments indirect function calls to verify that the target
> +function has the expected type signature at runtime. Each function
> +receives a unique type identifier computed from a hash of its function
> +prototype (including parameter types and return type). Before each
> +indirect call, the implementation inserts a check to verify that the
> +target function's type identifier matches the expected identifier
> +for the call site, terminating the program if a mismatch is detected.
> +This provides forward-edge control flow protection against attacks that
> +attempt to redirect indirect calls to unintended targets.
> +
> +The implementation adds minimal runtime overhead and does not require
> +runtime library support, making it suitable for kernel environments.
> +The type identifier is placed before the function entry point,
> +allowing runtime verification without additional metadata structures,
> +and without changing the entry points of the target functions. Only
> +functions that are referenced by their address receive the KCFI preamble
> +instrumentation.
> +
> +KCFI is intended primarily for kernel code and may not be suitable
> +for user-space applications that rely on techniques incompatible
> +with strict type checking of indirect calls.
> +
> +Note that KCFI is incompatible with function-specific
> +@code{patchable_function_entry} attributes because KCFI call sites
> +cannot know about function-specific patchable entry settings in different
> +translation units. Only the global @option{-fpatchable-function-entry}
> +command-line option is supported with KCFI.
> +
> +Use @option{-fdump-tree-kcfi} to examine the computed type identifiers
> +and their corresponding mangled type strings during compilation.
> +
> @opindex fsanitize=pointer-compare
> @item -fsanitize=pointer-compare
> Instrument comparison operation (<, <=, >, >=) with pointer operands.
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 37642680f423..69603fdad090 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -3166,6 +3166,7 @@ This describes the stack layout and calling conventions.
> * Tail Calls::
> * Shrink-wrapping separate components::
> * Stack Smashing Protection::
> +* Kernel Control Flow Integrity::
> * Miscellaneous Register Hooks::
> @end menu
>
> @@ -5432,6 +5433,36 @@ should be allocated from heap memory and consumers should release them.
> The result will be pruned to cases with PREFIX if not NULL.
> @end deftypefn
>
> +@node Kernel Control Flow Integrity
> +@subsection Kernel Control Flow Integrity
> +@cindex kernel control flow integrity
> +@cindex KCFI
> +
> +@deftypefn {Target Hook} bool TARGET_KCFI_SUPPORTED (void)
> +Return true if the target supports Kernel Control Flow Integrity (KCFI).
> +This hook indicates whether the target has implemented the necessary RTL
> +patterns and infrastructure to support KCFI instrumentation. The default
> +implementation returns false.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} uint32_t TARGET_KCFI_MASK_TYPE_ID (uint32_t @var{type_id})
> +Apply architecture-specific masking to KCFI type ID. This hook allows
> +targets to apply bit masks or other transformations to the computed KCFI
> +type identifier to match the target's specific requirements. The default
> +implementation returns the type ID unchanged.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} int TARGET_KCFI_EMIT_TYPE_ID (FILE *@var{file}, uint32_t @var{type_id})
> +Emit architecture-specific type ID instruction for KCFI preambles
> +and return the size of the instruction in bytes.
> +@var{file} is the assembly output stream and @var{type_id} is the KCFI
> +type identifier to emit. If @var{file} is NULL, skip emission and only
> +return the size. If not overridden, the default fallback emits a
> +@code{.word} directive with the type ID and returns 4 bytes. Targets can
> +override this to emit different instruction sequences and return their
> +corresponding sizes.
> +@end deftypefn
> +
> @node Miscellaneous Register Hooks
> @subsection Miscellaneous register hooks
> @cindex miscellaneous register hooks
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index c3ed9a9fd7c2..b2856886194c 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -2433,6 +2433,7 @@ This describes the stack layout and calling conventions.
> * Tail Calls::
> * Shrink-wrapping separate components::
> * Stack Smashing Protection::
> +* Kernel Control Flow Integrity::
> * Miscellaneous Register Hooks::
> @end menu
>
> @@ -3807,6 +3808,17 @@ generic code.
>
> @hook TARGET_GET_VALID_OPTION_VALUES
>
> +@node Kernel Control Flow Integrity
> +@subsection Kernel Control Flow Integrity
> +@cindex kernel control flow integrity
> +@cindex KCFI
> +
> +@hook TARGET_KCFI_SUPPORTED
> +
> +@hook TARGET_KCFI_MASK_TYPE_ID
> +
> +@hook TARGET_KCFI_EMIT_TYPE_ID
> +
> @node Miscellaneous Register Hooks
> @subsection Miscellaneous register hooks
> @cindex miscellaneous register hooks
> diff --git a/gcc/final.cc b/gcc/final.cc
> index afcb0bb9efbc..7f6aa9f9e480 100644
> --- a/gcc/final.cc
> +++ b/gcc/final.cc
> @@ -2094,6 +2094,9 @@ call_from_call_insn (const rtx_call_insn *insn)
> case SET:
> x = XEXP (x, 1);
> break;
> + case KCFI:
> + x = XEXP (x, 0);
> + break;
> }
> }
> return x;
> diff --git a/gcc/opts.cc b/gcc/opts.cc
> index 3ab993aea573..0ee37e01d24a 100644
> --- a/gcc/opts.cc
> +++ b/gcc/opts.cc
> @@ -2170,6 +2170,7 @@ const struct sanitizer_opts_s sanitizer_opts[] =
> SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true, true),
> SANITIZER_OPT (builtin, SANITIZE_BUILTIN, true, true),
> SANITIZER_OPT (shadow-call-stack, SANITIZE_SHADOW_CALL_STACK, false, false),
> + SANITIZER_OPT (kcfi, SANITIZE_KCFI, false, true),
> SANITIZER_OPT (all, ~sanitize_code_type (0), true, true),
> #undef SANITIZER_OPT
> { NULL, sanitize_code_type (0), 0UL, false, false }
> diff --git a/gcc/passes.cc b/gcc/passes.cc
> index a33c8d924a52..4c6ceac740ff 100644
> --- a/gcc/passes.cc
> +++ b/gcc/passes.cc
> @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3. If not see
> #include "diagnostic-core.h" /* for fnotice */
> #include "stringpool.h"
> #include "attribs.h"
> +#include "kcfi.h"
>
> /* Reserved TODOs */
> #define TODO_verify_il (1u << 31)
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 68ce53baa0f1..fd1bb0846801 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_ipa_auto_profile_offline);
> NEXT_PASS (pass_ipa_free_lang_data);
> NEXT_PASS (pass_ipa_function_and_variable_visibility);
> + NEXT_PASS (pass_ipa_kcfi);
> NEXT_PASS (pass_ipa_strub_mode);
> NEXT_PASS (pass_build_ssa_passes);
> PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
> @@ -275,6 +276,7 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_sink_code, false /* unsplit edges */);
> NEXT_PASS (pass_sancov);
> NEXT_PASS (pass_asan);
> + NEXT_PASS (pass_kcfi);
> NEXT_PASS (pass_tsan);
> NEXT_PASS (pass_dse, true /* use DR analysis */);
> NEXT_PASS (pass_dce, false /* update_address_taken_p */, false /* remove_unused_locals */);
> @@ -443,6 +445,7 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_sancov_O0);
> NEXT_PASS (pass_lower_switch_O0);
> NEXT_PASS (pass_asan_O0);
> + NEXT_PASS (pass_kcfi_O0);
> NEXT_PASS (pass_tsan_O0);
> NEXT_PASS (pass_musttail);
> NEXT_PASS (pass_sanopt);
> diff --git a/gcc/rtl.def b/gcc/rtl.def
> index 15ae7d10fcc1..af643d187b95 100644
> --- a/gcc/rtl.def
> +++ b/gcc/rtl.def
> @@ -318,6 +318,12 @@ DEF_RTL_EXPR(CLOBBER, "clobber", "e", RTX_EXTRA)
>
> DEF_RTL_EXPR(CALL, "call", "ee", RTX_EXTRA)
>
> +/* KCFI wrapper for call expressions.
> + Operand 0 is the call expression.
> + Operand 1 is the KCFI type ID (const_int). */
> +
> +DEF_RTL_EXPR(KCFI, "kcfi", "ee", RTX_EXTRA)
> +
> /* Return from a subroutine. */
>
> DEF_RTL_EXPR(RETURN, "return", "", RTX_EXTRA)
> diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
> index 63a1d08c46cf..4baa820b176e 100644
> --- a/gcc/rtlanal.cc
> +++ b/gcc/rtlanal.cc
> @@ -1177,6 +1177,11 @@ reg_referenced_p (const_rtx x, const_rtx body)
> case IF_THEN_ELSE:
> return reg_overlap_mentioned_p (x, body);
>
> + case KCFI:
> + /* For KCFI wrapper, check both the wrapped call and the type ID */
> + return (reg_overlap_mentioned_p (x, XEXP (body, 0))
> + || reg_overlap_mentioned_p (x, XEXP (body, 1)));
> +
> case TRAP_IF:
> return reg_overlap_mentioned_p (x, TRAP_CONDITION (body));
>
> diff --git a/gcc/target.def b/gcc/target.def
> index 8e491d838642..47a11c60809a 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -7589,6 +7589,44 @@ DEFHOOKPOD
> The default value is NULL.",
> const char *, NULL)
>
> +/* Kernel Control Flow Integrity (KCFI) hooks. */
> +#undef HOOK_PREFIX
> +#define HOOK_PREFIX "TARGET_KCFI_"
> +HOOK_VECTOR (TARGET_KCFI, kcfi)
> +
> +DEFHOOK
> +(supported,
> + "Return true if the target supports Kernel Control Flow Integrity (KCFI).\n\
> +This hook indicates whether the target has implemented the necessary RTL\n\
> +patterns and infrastructure to support KCFI instrumentation. The default\n\
> +implementation returns false.",
> + bool, (void),
> + hook_bool_void_false)
> +
> +DEFHOOK
> +(mask_type_id,
> + "Apply architecture-specific masking to KCFI type ID. This hook allows\n\
> +targets to apply bit masks or other transformations to the computed KCFI\n\
> +type identifier to match the target's specific requirements. The default\n\
> +implementation returns the type ID unchanged.",
> + uint32_t, (uint32_t type_id),
> + NULL)
> +
> +DEFHOOK
> +(emit_type_id,
> + "Emit architecture-specific type ID instruction for KCFI preambles\n\
> +and return the size of the instruction in bytes.\n\
> +@var{file} is the assembly output stream and @var{type_id} is the KCFI\n\
> +type identifier to emit. If @var{file} is NULL, skip emission and only\n\
> +return the size. If not overridden, the default fallback emits a\n\
> +@code{.word} directive with the type ID and returns 4 bytes. Targets can\n\
> +override this to emit different instruction sequences and return their\n\
> +corresponding sizes.",
> + int, (FILE *file, uint32_t type_id),
> + NULL)
> +
> +HOOK_VECTOR_END (kcfi)
> +
> /* Close the 'struct gcc_target' definition. */
> HOOK_VECTOR_END (C90_EMPTY_HACK)
>
> diff --git a/gcc/toplev.cc b/gcc/toplev.cc
> index d26467450e37..9078bb6318a9 100644
> --- a/gcc/toplev.cc
> +++ b/gcc/toplev.cc
> @@ -67,6 +67,7 @@ along with GCC; see the file COPYING3. If not see
> #include "attribs.h"
> #include "asan.h"
> #include "tsan.h"
> +#include "kcfi.h"
> #include "plugin.h"
> #include "context.h"
> #include "pass_manager.h"
> @@ -1739,6 +1740,16 @@ process_options ()
> "requires %<-fno-exceptions%>");
> }
>
> + if (flag_sanitize & SANITIZE_KCFI)
> + {
> + if (!targetm.kcfi.supported ())
> + sorry ("%<-fsanitize=kcfi%> not supported by this target");
> +
> + /* KCFI is supported only for C at this time. */
> + if (!lang_GNU_C ())
> + sorry ("%<-fsanitize=kcfi%> is only supported for C");
> + }
> +
> HOST_WIDE_INT patch_area_size, patch_area_start;
> parse_and_check_patch_area (flag_patchable_function_entry, false,
> &patch_area_size, &patch_area_start);
> diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
> index 08e642178ba5..e674e176f7d3 100644
> --- a/gcc/tree-inline.cc
> +++ b/gcc/tree-inline.cc
> @@ -2104,6 +2104,16 @@ copy_bb (copy_body_data *id, basic_block bb,
> /* Advance iterator now before stmt is moved to seq_gsi. */
> gsi_next (&stmts_gsi);
>
> + /* If inlining from a function with no_sanitize("kcfi"), mark any
> + call statements in the inlined body with the flag so they skip
> + KCFI instrumentation. */
> + if (is_gimple_call (stmt)
> + && !sanitize_flags_p (SANITIZE_KCFI, id->src_fn))
> + {
> + gcall *call = as_a <gcall *> (stmt);
> + gimple_call_set_inlined_from_kcfi_nosantize (call, true);
> + }
> +
> if (gimple_nop_p (stmt))
> continue;
>
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 0d78f5b384fb..b897954fd0ea 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -57,6 +57,7 @@ along with GCC; see the file COPYING3. If not see
> #include "attribs.h"
> #include "asan.h"
> #include "rtl-iter.h"
> +#include "kcfi.h"
> #include "file-prefix-map.h" /* remap_debug_filename() */
> #include "alloc-pool.h"
> #include "toplev.h"
> @@ -2199,6 +2200,9 @@ assemble_start_function (tree decl, const char *fnname)
> unsigned short patch_area_size = crtl->patch_area_size;
> unsigned short patch_area_entry = crtl->patch_area_entry;
>
> + /* Emit KCFI preamble before any patchable areas. */
> + kcfi_emit_preamble (asm_out_file, decl, fnname);
> +
> /* Emit the patching area before the entry label, if any. */
> if (patch_area_entry > 0)
> targetm.asm_out.print_patchable_function_entry (asm_out_file,
> @@ -2767,6 +2771,19 @@ assemble_external_real (tree decl)
> /* Some systems do require some output. */
> SYMBOL_REF_USED (XEXP (rtl, 0)) = 1;
> ASM_OUTPUT_EXTERNAL (asm_out_file, decl, XSTR (XEXP (rtl, 0), 0));
> +
> + /* Emit KCFI type ID symbol for external function declarations that are address-taken. */
> + struct cgraph_node *node = (TREE_CODE (decl) == FUNCTION_DECL) ? cgraph_node::get (decl) : NULL;
> + if (flag_sanitize & SANITIZE_KCFI
> + && TREE_CODE (decl) == FUNCTION_DECL
> + && !DECL_INITIAL (decl) /* Only for external declarations (no function body) */
> + && node && node->address_taken) /* Use direct cgraph analysis for address-taken check. */
> + {
> + const char *name = XSTR (XEXP (rtl, 0), 0);
> + /* Strip any encoding prefixes like '*' from symbol name. */
> + name = targetm.strip_name_encoding (name);
> + emit_kcfi_typeid_symbol (asm_out_file, decl, name);
> + }
The indentations of the above lines are off.
Thanks.
Qing
> }
> }
> #endif
> @@ -7283,16 +7300,25 @@ default_elf_asm_named_section (const char *name, unsigned int flags,
> fprintf (asm_out_file, ",%d", flags & SECTION_ENTSIZE);
> if (flags & SECTION_LINK_ORDER)
> {
> - /* For now, only section "__patchable_function_entries"
> - adopts flag SECTION_LINK_ORDER, internal label LPFE*
> - was emitted in default_print_patchable_function_entry,
> - just place it here for linked_to section. */
> - gcc_assert (!strcmp (name, "__patchable_function_entries"));
> - fprintf (asm_out_file, ",");
> - char buf[256];
> - ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE",
> - current_function_funcdef_no);
> - assemble_name_raw (asm_out_file, buf);
> + if (!strcmp (name, "__patchable_function_entries"))
> + {
> + /* For patchable function entries, internal label LPFE*
> + was emitted in default_print_patchable_function_entry,
> + just place it here for linked_to section. */
> + fprintf (asm_out_file, ",");
> + char buf[256];
> + ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE",
> + current_function_funcdef_no);
> + assemble_name_raw (asm_out_file, buf);
> + }
> + else if (!strcmp (name, ".kcfi_traps"))
> + {
> + /* KCFI traps section links to .text section. */
> + fprintf (asm_out_file, ",.text");
> + }
> + else
> + internal_error ("unexpected use of %<SECTION_LINK_ORDER%> by section %qs",
> + name);
> }
> if (HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE))
> {
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-09 18:49 ` Qing Zhao
@ 2025-09-11 3:05 ` Kees Cook
2025-09-11 7:29 ` Peter Zijlstra
2025-09-11 15:04 ` Qing Zhao
0 siblings, 2 replies; 32+ messages in thread
From: Kees Cook @ 2025-09-11 3:05 UTC (permalink / raw)
To: Qing Zhao
Cc: Andrew Pinski, Richard Biener, Joseph Myers, Jan Hubicka,
Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
Jim Wilson, Peter Zijlstra, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches@gcc.gnu.org, linux-hardening@vger.kernel.org
On Tue, Sep 09, 2025 at 06:49:22PM +0000, Qing Zhao wrote:
>
> > On Sep 4, 2025, at 20:24, Kees Cook <kees@kernel.org> wrote:
> > +For indirect call sites:
> > +
> > +- Keeping indirect calls from being merged (see above) by adding a
> > + wrapping type so that equality was tested based on type-id.
>
> I still think that the additional new created wrapping type and the new assignment stmt
>
> wrapper_tmp = (wrapper_ptr_type) fn
> is not necessary.
>
> All the information can be get from function type + type-id which is attached as an attribute
> to the original_function_type of “fn”.
> Could you explain why the wrapper type and the new temporary, new assignment is
> necessary?
I couldn't find a way to stop merging just using the attributes. I need
a way to directly associate indirect call sites with the typeid.
> > +
> > +- Keeping typeid information available through to the RTL expansion
> > + phase was done via a new KCFI insn that wraps CALL and the typeid.
>
> Is the new KCFI insn the following:
> wrapper_tmp = (wrapper_ptr_type) fn?
This bullet is speaking about the backend change to support the KCFI
check-call insn sequences:
+/* KCFI wrapper for call expressions.
+ Operand 0 is the call expression.
+ Operand 1 is the KCFI type ID (const_int). */
+
+DEF_RTL_EXPR(KCFI, "kcfi", "ee", RTX_EXTRA)
> Why the type-id attached as the attribute is not enough?
Doing the wrapping avoided needing to update multiple optimization passes
to check for the attribute. I also needed a way to distinguish
between direct and indirect calls, so I wrap only the indirect
calls, whereas the typeid attribute is for all functions for all typeid
needs, like preamble generation, etc.
> > +
> > +- To make sure KCFI expansion is skipped for inline functions, the
> > + inlining is marked during GIMPLE with a new flag which is checked
> > + during expansion.
> > +
> > +For indirect call targets:
> > +
> > +- kcfi_emit_preamble() uses function_needs_kcfi_preamble(),
> > + to emit the preablem,
> Typo: preamble.
Fixed.
> > +HOST_WIDE_INT kcfi_patchable_entry_prefix_nops = 0; /* For callsite offset */
> > +static HOST_WIDE_INT kcfi_patchable_entry_arch_alignment_nops = 0; /* For preamble alignment */
> > +
> > +/* Common helper for RTL patterns to emit .kcfi_traps section entry. */
>
> there should be one empty line between the comments of the function and the start of the function.
> I noticed that you need such empty line for all the new functions you added. -:)
Oh! Wow, yeah, I totally missed this coding style requirement. Fixed.
> > +/* Check if a function needs KCFI preamble generation.
> > + ALL functions get preambles when -fsanitize=kcfi is enabled, regardless
> > + of no_sanitize("kcfi") attribute. */
>
> Why no_sanitize(“kcfi”) is not considered here?
no_sanitize(“kcfi”) is strictly about whether call-site checking
is performed within the function. It is not used to mark a function as
not being the target of a KCFI call.
> > +/* Get KCFI type ID for a function declaration during assembly output phase.
> > + Fatal error if type ID was not previously set during IPA phase. */
> > +static uint32_t
> > +get_function_kcfi_type_id (tree fndecl)
> > +{
> > + if (!kcfi_type_id_attr)
> > + kcfi_type_id_attr = get_identifier ("kcfi_type_id");
> > +
> > + tree attr = lookup_attribute_by_prefix ("kcfi_type_id",
> > + DECL_ATTRIBUTES (fndecl));
> > + if (attr && TREE_VALUE (attr) && TREE_VALUE (TREE_VALUE (attr)))
> > + {
> > + tree value = TREE_VALUE (TREE_VALUE (attr));
> > + if (TREE_CODE (value) == INTEGER_CST)
> > + return (uint32_t) TREE_INT_CST_LOW (value);
> The indentation of above is off.
I understand GCC code style indentation to be "leading spans of 8 spaces
should be replaced with a tab character". This is what I followed:
first line indents to column 6, so 6 spaces. Second line indents to
column 8, so 1 tab:
SSSSSSif (TREE_CODE (value) == INTEGER_CST)
Treturn (uint32_t) TREE_INT_CST_LOW (value);
This seems to match everywhere else? Randomly picking get_section()
from varasm.cc, I see the same:
if (not_existing)
internal_error ("section already exists: %qs", name);
> > + /* Calculate alignment NOPs needed */
> > + arch_alignment = (alignment_bytes - ((kcfi_patchable_entry_prefix_nops + typeid_size) % alignment_bytes)) % alignment_bytes;
> The above line is too long.
Oops, yes, thank you. Fixed.
What is the right tool for me to run to check for these kinds of code
style glitches? contrib/check_GNU_style.py doesn't report anything. Oh!
It takes _patches_ not _files_. The .sh version specifies "patch" in the
help usage. Okay, I will get this all passing cleanly.
> > +/* Wrap indirect calls with KCFI type for anti-merging. */
> > +static unsigned int
> > +kcfi_instrument (void)
> > +{
> > + /* Process current function for call instrumentation only.
> > + Type ID setting is handled by the separate IPA pass. */
> > +
> > + basic_block bb;
> > +
> > + FOR_EACH_BB_FN (bb, cfun)
> > + {
> > + gimple_stmt_iterator gsi;
> > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > + {
> > + gimple *stmt = gsi_stmt (gsi);
> > +
> > + if (!is_gimple_call (stmt))
> > + continue;
> > +
> > + gcall *call_stmt = as_a <gcall *> (stmt);
> > +
> > + // Skip internal calls - we only instrument indirect calls
> > + if (gimple_call_internal_p (call_stmt))
> > + continue;
> > +
> > + tree fndecl = gimple_call_fndecl (call_stmt);
> > +
> > + // Only process indirect calls (no fndecl)
> > + if (fndecl)
> > + continue;
> > +
> > + tree fn = gimple_call_fn (call_stmt);
> > + if (!is_kcfi_indirect_call (fn))
> > + continue;
> > +
> > + // Get the function type to compute KCFI type ID
> > + tree fn_type = gimple_call_fntype (call_stmt);
> > + gcc_assert (fn_type);
> > + if (TREE_CODE (fn_type) != FUNCTION_TYPE)
> > + continue;
> > +
> > + uint32_t type_id = compute_kcfi_type_id (fn_type);
> > +
> > + // Create KCFI wrapper type for this call
> > + tree wrapper_type = create_kcfi_wrapper_type (fn_type, type_id);
> Again, the new “type_id” has been attached as an attribute of “fn_type” here,
The attribute is attached during IPA. This is run before that, but as I
mentioned, this is the call-site handling, and the IPA pass is for
globally associating a type-id to the function for all other uses
(preambles, weak symbols, etc).
> > + // Create a temporary variable for the wrapped function pointer
> > + tree wrapper_ptr_type = build_pointer_type (wrapper_type);
> > + tree wrapper_tmp = create_tmp_var (wrapper_ptr_type, "kcfi_wrapper");
> > +
> > + // Create assignment: wrapper_tmp = (wrapper_ptr_type) fn
> > + tree cast_expr = build1 (NOP_EXPR, wrapper_ptr_type, fn);
> > + gimple *cast_stmt = gimple_build_assign (wrapper_tmp, cast_expr);
> > + gsi_insert_before (&gsi, cast_stmt, GSI_SAME_STMT);
> > +
>
> Why the additional wrapper_ptr_type, wrapper_tmp and new assignment stmt
> Are needed here?
My understanding of the GIMPLE requirements here is that I needed a
cast between the original and the wrapper, and an assignment for SSA.
It's converting the call like:
Original: result = fn(arg1, arg2)
Becomes: result = wrapper(arg1, arg2, type_id)
I'm open to whatever alternative is needed here. I tried to capture the
merging issue with gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
My other test methodology is "does Linux boot?" ;)
--
Kees Cook
* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-11 3:05 ` Kees Cook
@ 2025-09-11 7:29 ` Peter Zijlstra
2025-09-12 6:20 ` Kees Cook
2025-09-11 15:04 ` Qing Zhao
1 sibling, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2025-09-11 7:29 UTC (permalink / raw)
To: Kees Cook
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches@gcc.gnu.org, linux-hardening@vger.kernel.org
On Wed, Sep 10, 2025 at 08:05:11PM -0700, Kees Cook wrote:
> > > +/* Check if a function needs KCFI preamble generation.
> > > + ALL functions get preambles when -fsanitize=kcfi is enabled, regardless
> > > + of no_sanitize("kcfi") attribute. */
> >
> > Why no_sanitize(“kcfi”) is not considered here?
>
> no_sanitize(“kcfi”) is strictly about whether call-site checking
> is performed within the function. It is not used to mark a function as
> not being the target of a KCFI call.
I'll once again argue that __attribute__((nocf_check)) (aka. __noendbr)
should have that effect.
If there is no ENDBR, then no amount of kCFI preamble will make the
function (indirectly) callable.
* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-11 7:29 ` Peter Zijlstra
@ 2025-09-12 6:20 ` Kees Cook
0 siblings, 0 replies; 32+ messages in thread
From: Kees Cook @ 2025-09-12 6:20 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches@gcc.gnu.org, linux-hardening@vger.kernel.org
On Thu, Sep 11, 2025 at 09:29:35AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 10, 2025 at 08:05:11PM -0700, Kees Cook wrote:
>
> > > > +/* Check if a function needs KCFI preamble generation.
> > > > + ALL functions get preambles when -fsanitize=kcfi is enabled, regardless
> > > > + of no_sanitize("kcfi") attribute. */
> > >
> > > Why no_sanitize(“kcfi”) is not considered here?
> >
> > no_sanitize(“kcfi”) is strictly about whether call-site checking
> > is performed within the function. It is not used to mark a function as
> > not being the target of a KCFI call.
>
> I'll once again argue that __attribute__((nocf_check)) (aka. __noendbr)
> should have that effect.
>
> If there is no ENDBR, then no amount of kCFI preamble will make the
> function (indirectly) callable.
Oh yeah, sure! I've added this for v3 now.
--
Kees Cook
* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-11 3:05 ` Kees Cook
2025-09-11 7:29 ` Peter Zijlstra
@ 2025-09-11 15:04 ` Qing Zhao
2025-09-12 7:32 ` Kees Cook
1 sibling, 1 reply; 32+ messages in thread
From: Qing Zhao @ 2025-09-11 15:04 UTC (permalink / raw)
To: Kees Cook
Cc: Andrew Pinski, Richard Biener, Joseph Myers, Jan Hubicka,
Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
Jim Wilson, Peter Zijlstra, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches@gcc.gnu.org, linux-hardening@vger.kernel.org
> On Sep 10, 2025, at 23:05, Kees Cook <kees@kernel.org> wrote:
>
> On Tue, Sep 09, 2025 at 06:49:22PM +0000, Qing Zhao wrote:
>>
>>> On Sep 4, 2025, at 20:24, Kees Cook <kees@kernel.org> wrote:
>>> +For indirect call sites:
>>> +
>>> +- Keeping indirect calls from being merged (see above) by adding a
>>> + wrapping type so that equality was tested based on type-id.
>>
>> I still think that the additional new created wrapping type and the new assignment stmt
>>
>> wrapper_tmp = (wrapper_ptr_type) fn
>> is not necessary.
>>
>> All the information can be get from function type + type-id which is attached as an attribute
>> to the original_function_type of “fn”.
>> Could you explain why the wrapper type and the new temporary, new assignment is
>> necessary?
>
> I couldn't find a way to stop merging just using the attributes. I need
> a way to directly associated indirect call sites with the typeid.
>
When determining whether two callsites should be merged, is it feasible to take the different type_ids from the
attributes into consideration?
>>> +
>>> +- Keeping typeid information available through to the RTL expansion
>>> + phase was done via a new KCFI insn that wraps CALL and the typeid.
>>
>> Is the new KCFI insn the following:
>> wrapper_tmp = (wrapper_ptr_type) fn?
>
> This bullet is speaking about the backend change to support the KCFI
> check-call insn sequences:
>
> +/* KCFI wrapper for call expressions.
> + Operand 0 is the call expression.
> + Operand 1 is the KCFI type ID (const_int). */
> +
> +DEF_RTL_EXPR(KCFI, "kcfi", "ee", RTX_EXTRA)
Okay, I see.
>
>> Why the type-id attached as the attribute is not enough?
>
> Doing the wrapping avoided needing to update multiple optimization passes
> to check for the attribute. And it still needed a way to distinguish
> between direct and indirect calls, so I need to wrap only the indirect
> calls, where as the typeid attribute is for all functions for all typeid
> needs, like preamble generation, etc.
Okay, this sounds like a reasonable justification for these additional temporaries
and assignment stmts.
One more question: are these additional temporaries and assignment stmts
finally eliminated by later optimizations? Any runtime overhead due to them?
>
>>> +
>>> +- To make sure KCFI expansion is skipped for inline functions, the
>>> + inlining is marked during GIMPLE with a new flag which is checked
>>> + during expansion.
>>> +
>>> +For indirect call targets:
>>> +
>>> +- kcfi_emit_preamble() uses function_needs_kcfi_preamble(),
>>> + to emit the preablem,
>> Typo: preamble.
>
> Fixed.
>
>>> +HOST_WIDE_INT kcfi_patchable_entry_prefix_nops = 0; /* For callsite offset */
>>> +static HOST_WIDE_INT kcfi_patchable_entry_arch_alignment_nops = 0; /* For preamble alignment */
>>> +
>>> +/* Common helper for RTL patterns to emit .kcfi_traps section entry. */
>>
>> there should be one empty line between the comments of the function and the start of the function.
>> I noticed that you need such empty line for all the new functions you added. -:)
>
> Oh! Wow, yeah, I totally missed this coding style requirement. Fixed.
>
>>> +/* Check if a function needs KCFI preamble generation.
>>> + ALL functions get preambles when -fsanitize=kcfi is enabled, regardless
>>> + of no_sanitize("kcfi") attribute. */
>>
>> Why no_sanitize(“kcfi”) is not considered here?
>
> no_sanitize(“kcfi”) is strictly about whether call-site checking
> is performed within the function. It is not used to mark a function as
> not being the target of a KCFI call.
Okay, is this documented somewhere?
>
>>> +/* Get KCFI type ID for a function declaration during assembly output phase.
>>> + Fatal error if type ID was not previously set during IPA phase. */
>>> +static uint32_t
>>> +get_function_kcfi_type_id (tree fndecl)
>>> +{
>>> + if (!kcfi_type_id_attr)
>>> + kcfi_type_id_attr = get_identifier ("kcfi_type_id");
>>> +
>>> + tree attr = lookup_attribute_by_prefix ("kcfi_type_id",
>>> + DECL_ATTRIBUTES (fndecl));
>>> + if (attr && TREE_VALUE (attr) && TREE_VALUE (TREE_VALUE (attr)))
>>> + {
>>> + tree value = TREE_VALUE (TREE_VALUE (attr));
>>> + if (TREE_CODE (value) == INTEGER_CST)
>>> + return (uint32_t) TREE_INT_CST_LOW (value);
>> The indentation of above is off.
>
> I understand GCC code style indentation to be "leading spans of 8 spaces
> should be replaced with a tab character". This is what I followed:
>
> first line indents to column 6, so 6 spaces. Second line indents to
> column 8, so 1 tab:
Yes, that’s right. -:)
>
> SSSSSSif (TREE_CODE (value) == INTEGER_CST)
> Treturn (uint32_t) TREE_INT_CST_LOW (value);
>
> This seems to match everywhere else? Randomly picking line get_section()
> from varasm.cc, I see the same:
>
> if (not_existing)
> internal_error ("section already exists: %qs", name);
Okay. (I guess that the mail client has some issues….)
>
>>> + /* Calculate alignment NOPs needed */
>>> + arch_alignment = (alignment_bytes - ((kcfi_patchable_entry_prefix_nops + typeid_size) % alignment_bytes)) % alignment_bytes;
>> The above line is too long.
>
> Oops, yes, thank you. Fixed.
>
> What is the right tool for me to run to check for these kinds of code
> style glitches? contrib/check_GNU_style.py doesn't report anything. Oh!
> It takes _patches_ not _files_. The .sh version specifies "patch" in the
> help usage. Okay, I will get this all passing cleanly.
Yeah, I usually use contrib/check_GNU_style.py to cleanup the code before
submitting the patch.
>
>>> +/* Wrap indirect calls with KCFI type for anti-merging. */
>>> +static unsigned int
>>> +kcfi_instrument (void)
>>> +{
>>> + /* Process current function for call instrumentation only.
>>> + Type ID setting is handled by the separate IPA pass. */
>>> +
>>> + basic_block bb;
>>> +
>>> + FOR_EACH_BB_FN (bb, cfun)
>>> + {
>>> + gimple_stmt_iterator gsi;
>>> + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>> + {
>>> + gimple *stmt = gsi_stmt (gsi);
>>> +
>>> + if (!is_gimple_call (stmt))
>>> + continue;
>>> +
>>> + gcall *call_stmt = as_a <gcall *> (stmt);
>>> +
>>> + // Skip internal calls - we only instrument indirect calls
>>> + if (gimple_call_internal_p (call_stmt))
>>> + continue;
>>> +
>>> + tree fndecl = gimple_call_fndecl (call_stmt);
>>> +
>>> + // Only process indirect calls (no fndecl)
>>> + if (fndecl)
>>> + continue;
>>> +
>>> + tree fn = gimple_call_fn (call_stmt);
>>> + if (!is_kcfi_indirect_call (fn))
>>> + continue;
>>> +
>>> + // Get the function type to compute KCFI type ID
>>> + tree fn_type = gimple_call_fntype (call_stmt);
>>> + gcc_assert (fn_type);
>>> + if (TREE_CODE (fn_type) != FUNCTION_TYPE)
>>> + continue;
>>> +
>>> + uint32_t type_id = compute_kcfi_type_id (fn_type);
>>> +
>>> + // Create KCFI wrapper type for this call
>>> + tree wrapper_type = create_kcfi_wrapper_type (fn_type, type_id);
>> Again, the new “type_id” has been attached as an attribute of “fn_type” here,
>
> The attribute is attached during IPA. This is run before that, but as I
> mentioned, this is the call-site handling, and the IPA pass is for
> globally associating a type-id to the function for all other uses
> (preambles, weak symbols, etc).
During IPA, the typeid is attached to the function type through “set_function_kcfi_type_id” for
each function in the callgraph.
For each indirect callsite in the above, the routine “create_kcfi_wrapper_type” also attaches
the typeid to the original_fn_type, and at the same time creates a new wrapper_type with a type_name
embedding the typeid.
So, I feel the type_id information is carried redundantly here.
>>> + // Create a temporary variable for the wrapped function pointer
>>> + tree wrapper_ptr_type = build_pointer_type (wrapper_type);
>>> + tree wrapper_tmp = create_tmp_var (wrapper_ptr_type, "kcfi_wrapper");
>>> +
>>> + // Create assignment: wrapper_tmp = (wrapper_ptr_type) fn
>>> + tree cast_expr = build1 (NOP_EXPR, wrapper_ptr_type, fn);
>>> + gimple *cast_stmt = gimple_build_assign (wrapper_tmp, cast_expr);
>>> + gsi_insert_before (&gsi, cast_stmt, GSI_SAME_STMT);
>>> +
>>
>> Why the additional wrapper_ptr_type, wrapper_tmp and new assignment stmt
>> Are needed here?
>
> Based on my understanding of the requirements for GIMPLE here is that I
> needed a cast between the original and the wrapper, and an assignment
> for SSA. It's converting the call like:
>
> Original: result = fn(arg1, arg2)
> Becomes: result = wrapper(arg1, arg2, type_id)
Yeah, the gimples are correct.
The concern I have is whether these additional temporaries and stmts impact
optimization and also incur any runtime overhead.
>
> I'm open to whatever alternative is needed here. I tried to capture the
> merging issue with gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
Might need to study a little bit here to see whether a better solution is possible without
these additional temporaries and stmts.
Qing
>
> My other test methodology is "does Linux boot?" ;)
>
> --
> Kees Cook
* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-11 15:04 ` Qing Zhao
@ 2025-09-12 7:32 ` Kees Cook
2025-09-12 14:01 ` Qing Zhao
0 siblings, 1 reply; 32+ messages in thread
From: Kees Cook @ 2025-09-12 7:32 UTC (permalink / raw)
To: Qing Zhao
Cc: Andrew Pinski, Richard Biener, Joseph Myers, Jan Hubicka,
Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
Jim Wilson, Peter Zijlstra, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches@gcc.gnu.org, linux-hardening@vger.kernel.org
On Thu, Sep 11, 2025 at 03:04:01PM +0000, Qing Zhao wrote:
>
>
> > On Sep 10, 2025, at 23:05, Kees Cook <kees@kernel.org> wrote:
> >
> > On Tue, Sep 09, 2025 at 06:49:22PM +0000, Qing Zhao wrote:
> >>
> >>> On Sep 4, 2025, at 20:24, Kees Cook <kees@kernel.org> wrote:
> >>> +For indirect call sites:
> >>> +
> >>> +- Keeping indirect calls from being merged (see above) by adding a
> >>> + wrapping type so that equality was tested based on type-id.
> >>
> >> I still think that the additional new created wrapping type and the new assignment stmt
> >>
> >> wrapper_tmp = (wrapper_ptr_type) fn
> >> is not necessary.
> >>
> >> All the information can be get from function type + type-id which is attached as an attribute
> >> to the original_function_type of “fn”.
> >> Could you explain why the wrapper type and the new temporary, new assignment is
> >> necessary?
> >
> > I couldn't find a way to stop merging just using the attributes. I need
> > a way to directly associated indirect call sites with the typeid.
> >
> When determining whether two callsites should be merged, is it feasible to adding the different type_id from the
> attributes into consideration?
This is basically what was happening in the RFC, but I kept finding new
corner cases in various passes, so it felt like whack-a-mole. Using the
wrapper appeared to solve it across the board with no special casing.
> >> Why the type-id attached as the attribute is not enough?
> >
> > Doing the wrapping avoided needing to update multiple optimization passes
> > to check for the attribute. And it still needed a way to distinguish
> > between direct and indirect calls, so I need to wrap only the indirect
> > calls, where as the typeid attribute is for all functions for all typeid
> > needs, like preamble generation, etc.
>
> Okay, this sounds like a reasonable justification for these additional temporaries
> and assignment stmts.
> One more question, are these additional temporaries and assignment stmts are
> finally eliminated by later optimizations? Any runtime overhead due to them?
Yeah, they totally vanish as far as I've been able to determine.
> >>> +/* Check if a function needs KCFI preamble generation.
> >>> + ALL functions get preambles when -fsanitize=kcfi is enabled, regardless
> >>> + of no_sanitize("kcfi") attribute. */
> >>
> >> Why no_sanitize(“kcfi”) is not considered here?
> >
> > no_sanitize(“kcfi”) is strictly about whether call-site checking
> > is performed within the function. It is not used to mark a function as
> > not being the target of a KCFI call.
>
> Okay, is this documented somewhere?
Ah, whoops, no. I have added a note to the "no_sanitize" function attribute
docs for v3.
> > What is the right tool for me to run to check for these kinds of code
> > style glitches? contrib/check_GNU_style.py doesn't report anything. Oh!
> > It takes _patches_ not _files_. The .sh version specifies "patch" in the
> > help usage. Okay, I will get this all passing cleanly.
>
> Yeah, I usually use contrib/check_GNU_style.py to cleanup the code before
> submitting the patch.
Thanks!
> >>> +/* Wrap indirect calls with KCFI type for anti-merging. */
> >>> +static unsigned int
> >>> +kcfi_instrument (void)
> >>> +{
> >>> + /* Process current function for call instrumentation only.
> >>> + Type ID setting is handled by the separate IPA pass. */
> >>> +
> >>> + basic_block bb;
> >>> +
> >>> + FOR_EACH_BB_FN (bb, cfun)
> >>> + {
> >>> + gimple_stmt_iterator gsi;
> >>> + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> >>> + {
> >>> + gimple *stmt = gsi_stmt (gsi);
> >>> +
> >>> + if (!is_gimple_call (stmt))
> >>> + continue;
> >>> +
> >>> + gcall *call_stmt = as_a <gcall *> (stmt);
> >>> +
> >>> + // Skip internal calls - we only instrument indirect calls
> >>> + if (gimple_call_internal_p (call_stmt))
> >>> + continue;
> >>> +
> >>> + tree fndecl = gimple_call_fndecl (call_stmt);
> >>> +
> >>> + // Only process indirect calls (no fndecl)
> >>> + if (fndecl)
> >>> + continue;
> >>> +
> >>> + tree fn = gimple_call_fn (call_stmt);
> >>> + if (!is_kcfi_indirect_call (fn))
> >>> + continue;
> >>> +
> >>> + // Get the function type to compute KCFI type ID
> >>> + tree fn_type = gimple_call_fntype (call_stmt);
> >>> + gcc_assert (fn_type);
> >>> + if (TREE_CODE (fn_type) != FUNCTION_TYPE)
> >>> + continue;
> >>> +
> >>> + uint32_t type_id = compute_kcfi_type_id (fn_type);
> >>> +
> >>> + // Create KCFI wrapper type for this call
> >>> + tree wrapper_type = create_kcfi_wrapper_type (fn_type, type_id);
> >> Again, the new “type_id” has been attached as an attribute of “fn_type” here,
> >
> > The attribute is attached during IPA. This is run before that, but as I
> > mentioned, this is the call-site handling, and the IPA pass is for
> > globally associating a type-id to the function for all other uses
> > (preambles, weak symbols, etc).
> During IPA, the typeid is attached to the function type through “set_function_kcfi_type_id” for
> each function in the callgraph.
>
> For each indirect callsite in the above, the routine “create_kcfi_wrapper_type” also attaches
> the typeid to the original_fn_type, and at the same time, create a new wrapper_type with a type_name
> embedding the typeid.
>
> So, I feel the type_id information is carried redundantly here.
Ah! Yes, sorry, I see what you mean: the tail portion of
create_kcfi_wrapper_type! Yeah, this is kind of an oversight from
switching to the IPA pass. I was attaching the typeid to DECLs in IPA
and TYPEs in GIMPLE (create_kcfi_wrapper_type). The DECLs were used for
preambles, and the TYPEs were used for RTL expansion. I will attempt to
merge these; they should all be on the TYPE.
> > I'm open to whatever alternative is needed here. I tried to capture the
> > merging issue with gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
>
> Might need to study a little bit here to see whether better solution is possible without
> These additional temporizes and stmts.
Thanks!
--
Kees Cook
* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-12 7:32 ` Kees Cook
@ 2025-09-12 14:01 ` Qing Zhao
2025-09-13 6:29 ` Kees Cook
0 siblings, 1 reply; 32+ messages in thread
From: Qing Zhao @ 2025-09-12 14:01 UTC (permalink / raw)
To: Kees Cook
Cc: Andrew Pinski, Richard Biener, Joseph Myers, Jan Hubicka,
Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
Jim Wilson, Peter Zijlstra, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches@gcc.gnu.org, linux-hardening@vger.kernel.org
> On Sep 12, 2025, at 03:32, Kees Cook <kees@kernel.org> wrote:
>
> On Thu, Sep 11, 2025 at 03:04:01PM +0000, Qing Zhao wrote:
>>
>>
>>> On Sep 10, 2025, at 23:05, Kees Cook <kees@kernel.org> wrote:
>>>
>>> On Tue, Sep 09, 2025 at 06:49:22PM +0000, Qing Zhao wrote:
>>>>
>>>>> On Sep 4, 2025, at 20:24, Kees Cook <kees@kernel.org> wrote:
>>>>> +For indirect call sites:
>>>>> +
>>>>> +- Keeping indirect calls from being merged (see above) by adding a
>>>>> + wrapping type so that equality was tested based on type-id.
>>>>
>>>> I still think that the additional new created wrapping type and the new assignment stmt
>>>>
>>>> wrapper_tmp = (wrapper_ptr_type) fn
>>>> is not necessary.
>>>>
>>>> All the information can be get from function type + type-id which is attached as an attribute
>>>> to the original_function_type of “fn”.
>>>> Could you explain why the wrapper type and the new temporary, new assignment is
>>>> necessary?
>>>
>>> I couldn't find a way to stop merging just using the attributes. I need
>>> a way to directly associated indirect call sites with the typeid.
>>>
>> When determining whether two callsites should be merged, is it feasible to adding the different type_id from the
>> attributes into consideration?
>
> This is basically what was happening in the RFC, but I kept finding new
> corner cases in various passes, so it felt like whack-a-mole. Using the
> wrapper appeared to solve it across the board with no special casing.
Okay, if this is the case, I think it’s better to explain in the design doc why you finally
decided to add the wrapper function instead of directly using the type_id from the attribute for comparison.
What are the issues when directly using the type_id from the attribute, and why does only the new wrapper type and
function work?
>
>>>> Why the type-id attached as the attribute is not enough?
>>>
>>> Doing the wrapping avoided needing to update multiple optimization passes
>>> to check for the attribute.
Do you remember which optimization passes would need to be updated for this purpose?
>>> And it still needed a way to distinguish
>>> between direct and indirect calls, so I need to wrap only the indirect
>>> calls, where as the typeid attribute is for all functions for all typeid
>>> needs, like preamble generation, etc.
>>
>> Okay, this sounds like a reasonable justification for these additional temporaries
>> and assignment stmts.
>> One more question, are these additional temporaries and assignment stmts are
>> finally eliminated by later optimizations? Any runtime overhead due to them?
>
> Yeah, they totally vanish as far as I've been able to determine.
That’s good. Then you might add this too in the design doc as a justification for the
new wrapper type, temporaries, and new assignment stmt.
thanks.
Qing
>
>>>>> +/* Check if a function needs KCFI preamble generation.
>>>>> + ALL functions get preambles when -fsanitize=kcfi is enabled, regardless
>>>>> + of no_sanitize("kcfi") attribute. */
>>>>
>>>> Why is no_sanitize(“kcfi”) not considered here?
>>>
>>> no_sanitize(“kcfi”) is strictly about whether call-site checking
>>> is performed within the function. It is not used to mark a function as
>>> not being the target of a KCFI call.
>>
>> Okay, is this documented somewhere?
>
> Ah, whoops, no. I have added a note to the "no_sanitize" function attribute
> docs for v3.
>
>>> What is the right tool for me to run to check for these kinds of code
>>> style glitches? contrib/check_GNU_style.py doesn't report anything. Oh!
>>> It takes _patches_ not _files_. The .sh version specifies "patch" in the
>>> help usage. Okay, I will get this all passing cleanly.
>>
>> Yeah, I usually use contrib/check_GNU_style.py to cleanup the code before
>> submitting the patch.
>
> Thanks!
>
>>>>> +/* Wrap indirect calls with KCFI type for anti-merging. */
>>>>> +static unsigned int
>>>>> +kcfi_instrument (void)
>>>>> +{
>>>>> + /* Process current function for call instrumentation only.
>>>>> + Type ID setting is handled by the separate IPA pass. */
>>>>> +
>>>>> + basic_block bb;
>>>>> +
>>>>> + FOR_EACH_BB_FN (bb, cfun)
>>>>> + {
>>>>> + gimple_stmt_iterator gsi;
>>>>> + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>>>> + {
>>>>> + gimple *stmt = gsi_stmt (gsi);
>>>>> +
>>>>> + if (!is_gimple_call (stmt))
>>>>> + continue;
>>>>> +
>>>>> + gcall *call_stmt = as_a <gcall *> (stmt);
>>>>> +
>>>>> + // Skip internal calls - we only instrument indirect calls
>>>>> + if (gimple_call_internal_p (call_stmt))
>>>>> + continue;
>>>>> +
>>>>> + tree fndecl = gimple_call_fndecl (call_stmt);
>>>>> +
>>>>> + // Only process indirect calls (no fndecl)
>>>>> + if (fndecl)
>>>>> + continue;
>>>>> +
>>>>> + tree fn = gimple_call_fn (call_stmt);
>>>>> + if (!is_kcfi_indirect_call (fn))
>>>>> + continue;
>>>>> +
>>>>> + // Get the function type to compute KCFI type ID
>>>>> + tree fn_type = gimple_call_fntype (call_stmt);
>>>>> + gcc_assert (fn_type);
>>>>> + if (TREE_CODE (fn_type) != FUNCTION_TYPE)
>>>>> + continue;
>>>>> +
>>>>> + uint32_t type_id = compute_kcfi_type_id (fn_type);
>>>>> +
>>>>> + // Create KCFI wrapper type for this call
>>>>> + tree wrapper_type = create_kcfi_wrapper_type (fn_type, type_id);
>>>> Again, the new “type_id” has been attached as an attribute of “fn_type” here,
>>>
>>> The attribute is attached during IPA. This is run before that, but as I
>>> mentioned, this is the call-site handling, and the IPA pass is for
>>> globally associating a type-id to the function for all other uses
>>> (preambles, weak symbols, etc).
>> During IPA, the typeid is attached to the function type through “set_function_kcfi_type_id” for
>> each function in the callgraph.
>>
>> For each indirect callsite in the above, the routine “create_kcfi_wrapper_type” also attaches
>> the typeid to the original_fn_type, and at the same time, create a new wrapper_type with a type_name
>> embedding the typeid.
>>
>> So, I feel the type_id information is carried redundantly here.
>
> Ah! Yes, sorry, I see what you mean: the tail portion of
>>> create_kcfi_wrapper_type! Yeah, this is kind of an oversight from
> switching to the IPA pass. I was attaching the typeid to DECLs in IPA
> and TYPEs in GIMPLE (create_kcfi_wrapper_type). The DECLs were used for
> preambles, and the TYPEs were used for RTL expansion. I will attempt to
> merge these; they should all be on the TYPE.
>
>>> I'm open to whatever alternative is needed here. I tried to capture the
>>> merging issue with gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
>>
>> Might need to study a little bit here to see whether a better solution is possible without
>> these additional temporaries and stmts.
>
> Thanks!
>
> --
> Kees Cook
^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
2025-09-12 14:01 ` Qing Zhao
@ 2025-09-13 6:29 ` Kees Cook
0 siblings, 0 replies; 32+ messages in thread
From: Kees Cook @ 2025-09-13 6:29 UTC (permalink / raw)
To: Qing Zhao
Cc: Andrew Pinski, Richard Biener, Joseph Myers, Jan Hubicka,
Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
Jim Wilson, Peter Zijlstra, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches@gcc.gnu.org, linux-hardening@vger.kernel.org
On Fri, Sep 12, 2025 at 02:01:57PM +0000, Qing Zhao wrote:
>
> > On Sep 12, 2025, at 03:32, Kees Cook <kees@kernel.org> wrote:
> >
> > On Thu, Sep 11, 2025 at 03:04:01PM +0000, Qing Zhao wrote:
> >>
> >>
> >>> On Sep 10, 2025, at 23:05, Kees Cook <kees@kernel.org> wrote:
> >>>
> >>> On Tue, Sep 09, 2025 at 06:49:22PM +0000, Qing Zhao wrote:
> >>>>
> >>>> Why is the type-id attached as the attribute not enough?
> >>>
> >>> Doing the wrapping avoided needing to update multiple optimization passes
> >>> to check for the attribute.
>
> Do you remember which optimization passes need to be updated for this purpose?
I had patched at least old_insns_match_p:
https://lore.kernel.org/linux-hardening/20250821072708.3109244-3-kees@kernel.org/#Z31gcc:cfgcleanup.cc
The rest that I patched were about dealing with retaining notes, which
aren't used any more now (an attribute is used, not a note).
> >>> And it still needed a way to distinguish
> >>> between direct and indirect calls, so I need to wrap only the indirect
> >>> calls, whereas the typeid attribute is for all functions for all typeid
> >>> needs, like preamble generation, etc.
> >>
> >> Okay, this sounds like a reasonable justification for these additional temporaries
> >> and assignment stmts.
> >> One more question, are these additional temporaries and assignment stmts
> >> finally eliminated by later optimizations? Any runtime overhead due to them?
> >
> > Yeah, they totally vanish as far as I've been able to determine.
>
> That’s good. Then you might add this too in the design doc as justification for the
> new wrapper type, temporaries, and new assignment stmt.
I spent some time today experimenting with annotations and discovered that
the KCFI RTL changes actually ended up solving all the issues I'd found.
Combined with moving the DECL attributes to TYPE attributes, everything
got MUCH simpler. I'll send v3 out soon with all of this redundancy
removed. I want to test it a little more first.
--
Kees Cook
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v2 3/7] x86: Add x86_64 Kernel Control Flow Integrity implementation
2025-09-05 0:24 [PATCH v2 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
2025-09-05 0:24 ` [PATCH v2 1/7] mangle: Introduce C typeinfo mangling API Kees Cook
2025-09-05 0:24 ` [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure Kees Cook
@ 2025-09-05 0:24 ` Kees Cook
2025-09-05 0:24 ` [PATCH v2 4/7] aarch64: Add AArch64 " Kees Cook
` (3 subsequent siblings)
6 siblings, 0 replies; 32+ messages in thread
From: Kees Cook @ 2025-09-05 0:24 UTC (permalink / raw)
To: Qing Zhao
Cc: Kees Cook, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
Implement x86_64-specific KCFI backend:
- Implies -mindirect-branch-register since KCFI needs call target in
a register for typeid hash loading.
- Function preamble generation with type IDs positioned at -(4+prefix_nops)
offset from function entry point.
- Function-aligned KCFI preambles using calculated alignment NOPs:
aligned(prefix_nops + 5, 16) to maintain the ability to call the
__cfi_ preamble directly in the case of Linux's FineIBT alternative
CFI sequences (live-patched into place).
- Type-id hashing avoids generating ENDBR instruction encodings in type IDs
(0xfa1e0ff3/0xfb1e0ff3 are incremented by 1 to prevent execution).
- On-demand scratch register allocation strategy (r11 as needed).
The clobbers are available both early and late.
- Uses the .kcfi_traps section for debugger/runtime metadata.
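As a rough sketch of the alignment arithmetic described above (assuming the type-ID movl is 5 bytes and leading NOPs pad the preamble so the function entry stays 16-byte aligned; the helper name is hypothetical, not from the patch):

```c
#include <stdint.h>

/* Hypothetical sketch of the preamble alignment computation: the
   preamble is the 5-byte type-ID movl plus the patchable-entry prefix
   NOPs, and is padded with additional NOPs so that the function entry
   remains 16-byte aligned.  kcfi_alignment_nops is an illustrative
   name only. */
static unsigned
kcfi_alignment_nops (unsigned prefix_nops)
{
  unsigned preamble_bytes = prefix_nops + 5;  /* movl + prefix NOPs */
  return (16 - (preamble_bytes % 16)) % 16;   /* pad to 16-byte boundary */
}
```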
Assembly Code Pattern layout required by Linux kernel:
movl $inverse_type_id, %r10d ; Load expected type (0 - hash)
addl offset(%target), %r10d ; Add stored type ID from preamble
je .Lkcfi_call ; Branch if types match (sum == 0)
.Lkcfi_trap: ud2 ; Undefined instruction trap on mismatch
.Lkcfi_call: call/jmp *%target ; Execute validated indirect transfer
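The arithmetic behind this check (the negated expected ID plus the stored ID wraps to zero exactly when they match) can be modeled in C; the helper name is illustrative, not part of the patch:

```c
#include <stdint.h>

/* Sketch of the x86_64 KCFI check semantics above: %r10d is loaded
   with the negated expected type ID, the type ID stored in the call
   target's preamble is added, and the call proceeds only when the
   32-bit sum wraps to zero. */
static int
kcfi_check_passes (uint32_t expected_id, uint32_t stored_id)
{
  uint32_t r10d = (uint32_t) (0 - expected_id); /* movl $inverse_type_id, %r10d */
  r10d += stored_id;                            /* addl offset(%target), %r10d */
  return r10d == 0;                             /* je taken iff types match */
}
```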
Build and run tested on x86_64 Linux kernel with various CPU errata
handling alternatives, with and without FineIBT patching.
gcc/ChangeLog:
config/i386/i386.h: KCFI enables TARGET_INDIRECT_BRANCH_REGISTER.
config/i386/i386-protos.h: Declare ix86_output_kcfi_insn().
config/i386/i386-expand.cc (ix86_expand_call): Expand indirect
calls into KCFI RTL.
config/i386/i386.cc (ix86_kcfi_mask_type_id): New function.
(ix86_output_kcfi_insn): New function to emit KCFI assembly.
config/i386/i386.md: Add KCFI RTL patterns.
doc/invoke.texi: Document x86 nuances.
Signed-off-by: Kees Cook <kees@kernel.org>
---
gcc/config/i386/i386-protos.h | 1 +
gcc/config/i386/i386.h | 3 +-
gcc/config/i386/i386-expand.cc | 21 +++++-
gcc/config/i386/i386.cc | 118 +++++++++++++++++++++++++++++++++
gcc/config/i386/i386.md | 62 +++++++++++++++--
gcc/doc/invoke.texi | 23 +++++++
6 files changed, 220 insertions(+), 8 deletions(-)
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index bdb8bb963b5d..b0b3864fb53c 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -377,6 +377,7 @@ extern enum attr_cpu ix86_schedule;
extern bool ix86_nopic_noplt_attribute_p (rtx call_op);
extern const char * ix86_output_call_insn (rtx_insn *insn, rtx call_op);
+extern const char * ix86_output_kcfi_insn (rtx_insn *insn, rtx *operands);
extern const char * ix86_output_indirect_jmp (rtx call_op);
extern const char * ix86_output_function_return (bool long_p);
extern const char * ix86_output_indirect_function_return (rtx ret_op);
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 2d53db683176..5c6012ac743b 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -3038,7 +3038,8 @@ extern void debug_dispatch_window (int);
#define TARGET_INDIRECT_BRANCH_REGISTER \
(ix86_indirect_branch_register \
- || cfun->machine->indirect_branch_type != indirect_branch_keep)
+ || cfun->machine->indirect_branch_type != indirect_branch_keep \
+ || (flag_sanitize & SANITIZE_KCFI))
#define IX86_HLE_ACQUIRE (1 << 16)
#define IX86_HLE_RELEASE (1 << 17)
diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index ef6c12cd5697..2a7feffa7ebc 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -94,6 +94,7 @@ along with GCC; see the file COPYING3. If not see
#include "i386-builtins.h"
#include "i386-expand.h"
#include "asan.h"
+#include "kcfi.h"
/* Split one or more double-mode RTL references into pairs of half-mode
references. The RTL can be REG, offsettable MEM, integer constant, or
@@ -10279,8 +10280,9 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
unsigned int vec_len = 0;
tree fndecl;
bool call_no_callee_saved_registers = false;
+ bool is_direct_call = SYMBOL_REF_P (XEXP (fnaddr, 0));
- if (SYMBOL_REF_P (XEXP (fnaddr, 0)))
+ if (is_direct_call)
{
fndecl = SYMBOL_REF_DECL (XEXP (fnaddr, 0));
if (fndecl)
@@ -10317,7 +10319,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
if (TARGET_MACHO && !TARGET_64BIT)
{
#if TARGET_MACHO
- if (flag_pic && SYMBOL_REF_P (XEXP (fnaddr, 0)))
+ if (flag_pic && is_direct_call)
fnaddr = machopic_indirect_call_target (fnaddr);
#endif
}
@@ -10401,7 +10403,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
if (ix86_cmodel == CM_LARGE_PIC
&& !TARGET_PECOFF
&& MEM_P (fnaddr)
- && SYMBOL_REF_P (XEXP (fnaddr, 0))
+ && is_direct_call
&& !local_symbolic_operand (XEXP (fnaddr, 0), VOIDmode))
fnaddr = gen_rtx_MEM (QImode, construct_plt_address (XEXP (fnaddr, 0)));
/* Since x32 GOT slot is 64 bit with zero upper 32 bits, indirect
@@ -10433,6 +10435,19 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
call = gen_rtx_CALL (VOIDmode, fnaddr, callarg1);
+ /* Only indirect calls need KCFI instrumentation. */
+ rtx kcfi_type_rtx = is_direct_call ? NULL_RTX : kcfi_get_call_type_id ();
+ if (kcfi_type_rtx)
+ {
+ /* Wrap call with KCFI. */
+ call = gen_rtx_KCFI (VOIDmode, call, kcfi_type_rtx);
+
+ /* Add KCFI clobbers for the insn sequence. */
+ clobber_reg (&use, gen_rtx_REG (DImode, R10_REG));
+ clobber_reg (&use, gen_rtx_REG (DImode, R11_REG));
+ clobber_reg (&use, gen_rtx_REG (CCmode, FLAGS_REG));
+ }
+
if (retval)
call = gen_rtx_SET (retval, call);
vec[vec_len++] = call;
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index b2c1acd12dac..95912533a445 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -98,6 +98,7 @@ along with GCC; see the file COPYING3. If not see
#include "i386-builtins.h"
#include "i386-expand.h"
#include "i386-features.h"
+#include "kcfi.h"
#include "function-abi.h"
#include "rtl-error.h"
#include "gimple-pretty-print.h"
@@ -1700,6 +1701,19 @@ ix86_function_naked (const_tree fn)
return false;
}
+/* Apply x86-64 specific masking to KCFI type ID. */
+static uint32_t
+ix86_kcfi_mask_type_id (uint32_t type_id)
+{
+ /* Avoid embedding ENDBR instructions in KCFI type IDs.
+ ENDBR64: 0xfa1e0ff3, ENDBR32: 0xfb1e0ff3
+ If the type ID matches either instruction encoding, increment by 1. */
+ if (type_id == 0xfa1e0ff3U || type_id == 0xfb1e0ff3U)
+ return type_id + 1;
+
+ return type_id;
+}
+
/* Write the extra assembler code needed to declare a function properly. */
void
@@ -28469,6 +28483,110 @@ ix86_set_handled_components (sbitmap components)
}
}
+/* Output the assembly for a KCFI checked call instruction. */
+const char *
+ix86_output_kcfi_insn (rtx_insn *insn, rtx *operands)
+{
+ /* Target is guaranteed to be in a register due to
+ TARGET_INDIRECT_BRANCH_REGISTER. */
+ rtx target_reg = operands[0];
+ gcc_assert (REG_P (target_reg));
+
+ /* In thunk-extern mode, the register must be R11 for FineIBT
+ compatibility. Should this be handled via constraints? */
+ if (cfun->machine->indirect_branch_type == indirect_branch_thunk_extern)
+ {
+ if (REGNO (target_reg) != R11_REG)
+ {
+ /* Emit move from current target to R11. */
+ target_reg = gen_rtx_REG (DImode, R11_REG);
+ rtx r11_operands[2] = { operands[0], target_reg };
+ output_asm_insn ("movq\t%0, %1", r11_operands);
+ }
+ }
+
+ /* Generate labels internally. */
+ rtx trap_label = gen_label_rtx ();
+ rtx call_label = gen_label_rtx ();
+
+ /* Get label numbers for custom naming. */
+ int trap_labelno = CODE_LABEL_NUMBER (trap_label);
+ int call_labelno = CODE_LABEL_NUMBER (call_label);
+
+ /* Generate custom label names. */
+ char trap_name[32];
+ char call_name[32];
+ ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", trap_labelno);
+ ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", call_labelno);
+
+ /* Choose scratch register: r10 by default, r11 if r10 is the target. */
+ bool target_is_r10 = (REGNO (target_reg) == R10_REG);
+ int scratch_reg = target_is_r10 ? R11_REG : R10_REG;
+
+ /* Get KCFI type ID from operand */
+ uint32_t type_id = (uint32_t) INTVAL (operands[2]);
+
+ /* Convert to inverse for the check (0 - hash) */
+ uint32_t inverse_type_id = (uint32_t)(0 - type_id);
+
+ /* Calculate offset to typeid from target address. */
+ HOST_WIDE_INT offset = -(4 + kcfi_patchable_entry_prefix_nops);
+
+ /* Output complete KCFI check + call/sibcall sequence atomically. */
+ rtx inverse_type_id_rtx = gen_int_mode (inverse_type_id, SImode);
+ rtx mov_operands[2] = { inverse_type_id_rtx, gen_rtx_REG (SImode, scratch_reg) };
+ output_asm_insn ("movl\t$%c0, %1", mov_operands);
+
+ /* Create memory operand for the addl instruction. */
+ rtx offset_rtx = gen_int_mode (offset, DImode);
+ rtx mem_op = gen_rtx_MEM (SImode, gen_rtx_PLUS (DImode, target_reg, offset_rtx));
+ rtx add_operands[2] = { mem_op, gen_rtx_REG (SImode, scratch_reg) };
+ output_asm_insn ("addl\t%0, %1", add_operands);
+
+ /* Output conditional jump to call label. */
+ fputs ("\tje\t", asm_out_file);
+ assemble_name (asm_out_file, call_name);
+ fputc ('\n', asm_out_file);
+
+ /* Output trap label and instruction. */
+ ASM_OUTPUT_LABEL (asm_out_file, trap_name);
+ output_asm_insn ("ud2", operands);
+
+ /* Use common helper for trap section entry. */
+ rtx trap_label_sym = gen_rtx_SYMBOL_REF (Pmode, trap_name);
+ kcfi_emit_traps_section (asm_out_file, trap_label_sym);
+
+ /* Output pass/call label. */
+ ASM_OUTPUT_LABEL (asm_out_file, call_name);
+
+ /* Finally emit the protected call or sibling call. */
+ if (SIBLING_CALL_P (insn))
+ return ix86_output_indirect_jmp (target_reg);
+ else
+ return ix86_output_call_insn (insn, target_reg);
+}
+
+/* Emit x86_64-specific type ID instruction and return instruction size. */
+static int
+ix86_kcfi_emit_type_id (FILE *file, uint32_t type_id)
+{
+ /* Emit movl instruction with type ID if file is not NULL. */
+ if (file)
+ fprintf (file, "\tmovl\t$0x%08x, %%eax\n", type_id);
+
+ /* x86_64 uses 5-byte movl instruction for type ID. */
+ return 5;
+}
+
+#undef TARGET_KCFI_SUPPORTED
+#define TARGET_KCFI_SUPPORTED hook_bool_void_true
+
+#undef TARGET_KCFI_MASK_TYPE_ID
+#define TARGET_KCFI_MASK_TYPE_ID ix86_kcfi_mask_type_id
+
+#undef TARGET_KCFI_EMIT_TYPE_ID
+#define TARGET_KCFI_EMIT_TYPE_ID ix86_kcfi_emit_type_id
+
#undef TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS
#define TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS ix86_get_separate_components
#undef TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index cea6c152f2b9..b343f78361a0 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -20274,11 +20274,24 @@
DONE;
})
+;; KCFI indirect call - matches KCFI wrapper RTL
+(define_insn "*call"
+ [(kcfi (call (mem:QI (match_operand:W 0 "call_insn_operand" "<c>BwBz"))
+ (match_operand 1))
+ (match_operand 2 "const_int_operand"))]
+ "!SIBLING_CALL_P (insn)"
+{
+ return ix86_output_kcfi_insn (insn, operands);
+}
+ [(set_attr "type" "call")])
+
(define_insn "*call"
[(call (mem:QI (match_operand:W 0 "call_insn_operand" "<c>BwBz"))
(match_operand 1))]
"!SIBLING_CALL_P (insn)"
- "* return ix86_output_call_insn (insn, operands[0]);"
+{
+ return ix86_output_call_insn (insn, operands[0]);
+}
[(set_attr "type" "call")])
;; This covers both call and sibcall since only GOT slot is allowed.
@@ -20311,11 +20324,24 @@
}
[(set_attr "type" "call")])
+;; KCFI sibling call - matches KCFI wrapper RTL
+(define_insn "*sibcall"
+ [(kcfi (call (mem:QI (match_operand:W 0 "sibcall_insn_operand" "UBsBz"))
+ (match_operand 1))
+ (match_operand 2 "const_int_operand"))]
+ "SIBLING_CALL_P (insn)"
+{
+ return ix86_output_kcfi_insn (insn, operands);
+}
+ [(set_attr "type" "call")])
+
(define_insn "*sibcall"
[(call (mem:QI (match_operand:W 0 "sibcall_insn_operand" "UBsBz"))
(match_operand 1))]
"SIBLING_CALL_P (insn)"
- "* return ix86_output_call_insn (insn, operands[0]);"
+{
+ return ix86_output_call_insn (insn, operands[0]);
+}
[(set_attr "type" "call")])
(define_insn "*sibcall_memory"
@@ -20472,12 +20498,26 @@
DONE;
})
+;; KCFI call with return value - matches when KCFI note present
+(define_insn "*call_value"
+ [(set (match_operand 0)
+ (kcfi (call (mem:QI (match_operand:W 1 "call_insn_operand" "<c>BwBz"))
+ (match_operand 2))
+ (match_operand 3 "const_int_operand")))]
+ "!SIBLING_CALL_P (insn)"
+{
+ return ix86_output_kcfi_insn (insn, &operands[1]);
+}
+ [(set_attr "type" "callv")])
+
(define_insn "*call_value"
[(set (match_operand 0)
(call (mem:QI (match_operand:W 1 "call_insn_operand" "<c>BwBz"))
(match_operand 2)))]
"!SIBLING_CALL_P (insn)"
- "* return ix86_output_call_insn (insn, operands[1]);"
+{
+ return ix86_output_call_insn (insn, operands[1]);
+}
[(set_attr "type" "callv")])
;; This covers both call and sibcall since only GOT slot is allowed.
@@ -20513,12 +20553,26 @@
}
[(set_attr "type" "callv")])
+;; KCFI sibling call with return value - matches KCFI wrapper RTL
+(define_insn "*sibcall_value"
+ [(set (match_operand 0)
+ (kcfi (call (mem:QI (match_operand:W 1 "sibcall_insn_operand" "UBsBz"))
+ (match_operand 2))
+ (match_operand 3 "const_int_operand")))]
+ "SIBLING_CALL_P (insn)"
+{
+ return ix86_output_kcfi_insn (insn, &operands[1]);
+}
+ [(set_attr "type" "callv")])
+
(define_insn "*sibcall_value"
[(set (match_operand 0)
(call (mem:QI (match_operand:W 1 "sibcall_insn_operand" "UBsBz"))
(match_operand 2)))]
"SIBLING_CALL_P (insn)"
- "* return ix86_output_call_insn (insn, operands[1]);"
+{
+ return ix86_output_call_insn (insn, operands[1]);
+}
[(set_attr "type" "callv")])
(define_insn "*sibcall_value_memory"
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd70e6351a4e..d44e7015facf 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18404,6 +18404,29 @@ and without changing the entry points of the target functions. Only
functions that have referenced by their address receive the KCFI preamble
instrumentation.
+Platform-specific implementation details:
+
+On x86_64, KCFI type identifiers are emitted as a @code{movl $ID, %eax}
+instruction before the function entry. The implementation ensures that
+type IDs never collide with ENDBR instruction encodings. When used
+with @option{-fpatchable-function-entry}, the type identifier is
+placed before any patchable NOPs, with appropriate alignment to
+maintain a 16-byte boundary for the function entry. KCFI automatically
+implies @option{-mindirect-branch-register}, forcing all indirect calls
+and jumps to use registers instead of memory operands. The runtime
+check loads the type ID from the target function into @code{%r10d} and
+uses an @code{addl} instruction to add the negative expected type ID,
+effectively zeroing the register if the types match. A conditional
+jump follows to either continue execution or trap on mismatch. The
+check sequence uses @code{%r10d} and @code{%r11d} as scratch registers.
+Trap locations are recorded in a special @code{.kcfi_traps} section
+that maps trap sites to their corresponding function entry points,
+enabling debuggers and crash handlers to identify KCFI violations.
+The exact instruction sequences for both the KCFI preamble and the
+check-call bundle are considered ABI, as the Linux kernel may
+optionally rewrite these areas at boot time to mitigate detected CPU
+errata.
+
KCFI is intended primarily for kernel code and may not be suitable
for user-space applications that rely on techniques incompatible
with strict type checking of indirect calls.
--
2.34.1
^ permalink raw reply related	[flat|nested] 32+ messages in thread
* [PATCH v2 4/7] aarch64: Add AArch64 Kernel Control Flow Integrity implementation
2025-09-05 0:24 [PATCH v2 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
` (2 preceding siblings ...)
2025-09-05 0:24 ` [PATCH v2 3/7] x86: Add x86_64 Kernel Control Flow Integrity implementation Kees Cook
@ 2025-09-05 0:24 ` Kees Cook
2025-09-05 0:24 ` [PATCH v2 5/7] arm: Add ARM 32-bit " Kees Cook
` (2 subsequent siblings)
6 siblings, 0 replies; 32+ messages in thread
From: Kees Cook @ 2025-09-05 0:24 UTC (permalink / raw)
To: Qing Zhao
Cc: Kees Cook, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
Implement AArch64-specific KCFI backend.
- Function preamble generation using .word directives for type ID storage
at an offset from the function entry point (no default alignment NOPs
needed due to the fixed 4-byte instruction size).
- Trap debugging through ESR (Exception Syndrome Register) encoding
in BRK instruction immediate values.
- Scratch register allocation using w16/w17 (x16/x17) following
AArch64 procedure call standard for intra-procedure-call registers.
Assembly Code Pattern for AArch64:
ldur w16, [target, #-4] ; Load actual type ID from preamble
mov w17, #type_id_low ; Load expected type (lower 16 bits)
movk w17, #type_id_high, lsl #16 ; Load upper 16 bits if needed
cmp w16, w17 ; Compare type IDs directly
b.eq .Lpass ; Branch if types match
.Ltrap: brk #esr_value ; Enhanced trap with register info
.Lpass: blr/br target ; Execute validated indirect transfer
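The mov/movk pair above materializes the 32-bit expected type ID in w17 sixteen bits at a time; a C model of that construction (illustrative helper name, not from the patch):

```c
#include <stdint.h>

/* Model of materializing a 32-bit type ID in w17 via mov/movk as in
   the sequence above: mov sets the low 16 bits (zeroing the rest),
   movk ... lsl #16 inserts the high 16 bits. */
static uint32_t
build_expected_w17 (uint32_t type_id)
{
  uint32_t w17 = type_id & 0xFFFFu;   /* mov  w17, #type_id_low */
  w17 |= type_id & 0xFFFF0000u;       /* movk w17, #type_id_high, lsl #16 */
  return w17;
}
```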
ESR (Exception Syndrome Register) Integration:
- BRK instruction immediate encoding format:
0x8000 | ((TypeIndex & 31) << 5) | (AddrIndex & 31)
- TypeIndex indicates which W register contains expected type (W17 = 17)
- AddrIndex indicates which X register contains target address (0-30)
- Example: brk #33313 (0x8221) = expected type in W17, target address in X1
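The BRK immediate encoding above can be sketched directly from the stated formula (the helper name is illustrative):

```c
/* ESR immediate encoding for the KCFI BRK instruction as described
   above: 0x8000 | ((TypeIndex & 31) << 5) | (AddrIndex & 31), where
   TypeIndex names the W register holding the expected type and
   AddrIndex names the X register holding the target address. */
static unsigned
kcfi_brk_esr (unsigned type_index, unsigned addr_index)
{
  return 0x8000u | ((type_index & 31u) << 5) | (addr_index & 31u);
}
```

For the document's own example (expected type in W17, target address in X1), this yields 0x8221, i.e. brk #33313.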
Build and run tested with Linux kernel ARCH=arm64.
gcc/ChangeLog:
config/aarch64/aarch64-protos.h: Declare aarch64_indirect_branch_asm,
and KCFI helpers.
config/aarch64/aarch64.cc (aarch64_expand_call): Wrap CALLs in
KCFI, with clobbers.
(aarch64_indirect_branch_asm): New function, extract common
logic for branch asm, like existing call asm helper.
(aarch64_output_kcfi_insn): Emit KCFI assembly.
config/aarch64/aarch64.md: Add KCFI RTL patterns and replace
open-coded branch emission with aarch64_indirect_branch_asm.
doc/invoke.texi: Document aarch64 nuances.
Signed-off-by: Kees Cook <kees@kernel.org>
---
gcc/config/aarch64/aarch64-protos.h | 5 ++
gcc/config/aarch64/aarch64.cc | 115 ++++++++++++++++++++++++++++
gcc/config/aarch64/aarch64.md | 64 ++++++++++++++--
gcc/doc/invoke.texi | 14 ++++
4 files changed, 190 insertions(+), 8 deletions(-)
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 56efcf2c7f2c..c91fdcc80ea3 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1261,6 +1261,7 @@ tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
const char *aarch64_sls_barrier (int);
const char *aarch64_indirect_call_asm (rtx);
+const char *aarch64_indirect_branch_asm (rtx);
extern bool aarch64_harden_sls_retbr_p (void);
extern bool aarch64_harden_sls_blr_p (void);
@@ -1284,4 +1285,8 @@ extern unsigned aarch64_stack_alignment (const_tree exp, unsigned align);
extern rtx aarch64_gen_compare_zero_and_branch (rtx_code code, rtx x,
rtx_code_label *label);
+/* KCFI support. */
+extern void kcfi_emit_trap_with_section (FILE *file, rtx trap_label_rtx);
+extern const char *aarch64_output_kcfi_insn (rtx_insn *insn, rtx *operands);
+
#endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index fb8311b655d7..a84018ff111e 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -83,6 +83,7 @@
#include "rtlanal.h"
#include "tree-dfa.h"
#include "asan.h"
+#include "kcfi.h"
#include "aarch64-elf-metadata.h"
#include "aarch64-feature-deps.h"
#include "config/arm/aarch-common.h"
@@ -11848,6 +11849,15 @@ aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall)
call = gen_rtx_CALL (VOIDmode, mem, const0_rtx);
+ /* Only indirect calls need KCFI instrumentation. */
+ bool is_direct_call = SYMBOL_REF_P (XEXP (mem, 0));
+ rtx kcfi_type_rtx = is_direct_call ? NULL_RTX : kcfi_get_call_type_id ();
+ if (kcfi_type_rtx)
+ {
+ /* Wrap call in KCFI. */
+ call = gen_rtx_KCFI (VOIDmode, call, kcfi_type_rtx);
+ }
+
if (result != NULL_RTX)
call = gen_rtx_SET (result, call);
@@ -11864,6 +11874,16 @@ aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall)
auto call_insn = aarch64_emit_call_insn (call);
+ /* Add KCFI clobbers for indirect calls. */
+ if (kcfi_type_rtx)
+ {
+ rtx usage = CALL_INSN_FUNCTION_USAGE (call_insn);
+ /* Add X16 and X17 clobbers for AArch64 KCFI scratch registers. */
+ clobber_reg (&usage, gen_rtx_REG (DImode, 16));
+ clobber_reg (&usage, gen_rtx_REG (DImode, 17));
+ CALL_INSN_FUNCTION_USAGE (call_insn) = usage;
+ }
+
/* Check whether the call requires a change to PSTATE.SM. We can't
emit the instructions to change PSTATE.SM yet, since they involve
a change in vector length and a change in instruction set, which
@@ -30630,6 +30650,14 @@ aarch64_indirect_call_asm (rtx addr)
return "";
}
+const char *
+aarch64_indirect_branch_asm (rtx addr)
+{
+ gcc_assert (REG_P (addr));
+ output_asm_insn ("br\t%0", &addr);
+ return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
+}
+
/* Emit the assembly instruction to load the thread pointer into DEST.
Select between different tpidr_elN registers depending on -mtp= setting. */
@@ -32823,6 +32851,93 @@ aarch64_libgcc_floating_mode_supported_p
#undef TARGET_DOCUMENTATION_NAME
#define TARGET_DOCUMENTATION_NAME "AArch64"
+/* Output the assembly for a KCFI checked call instruction. */
+const char *
+aarch64_output_kcfi_insn (rtx_insn *insn, rtx *operands)
+{
+ /* Target register is operands[0]. */
+ rtx target_reg = operands[0];
+ gcc_assert (REG_P (target_reg));
+
+ /* Get KCFI type ID from operand[3]. */
+ uint32_t type_id = (uint32_t) INTVAL (operands[3]);
+
+ /* Calculate typeid offset from call target. */
+ HOST_WIDE_INT offset = -(4 + kcfi_patchable_entry_prefix_nops);
+
+ /* Generate labels internally. */
+ rtx trap_label = gen_label_rtx ();
+ rtx call_label = gen_label_rtx ();
+
+ /* Get label numbers for custom naming. */
+ int trap_labelno = CODE_LABEL_NUMBER (trap_label);
+ int call_labelno = CODE_LABEL_NUMBER (call_label);
+
+ /* Generate custom label names. */
+ char trap_name[32];
+ char call_name[32];
+ ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", trap_labelno);
+ ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", call_labelno);
+
+ /* AArch64 KCFI check sequence:
+ 1. Load actual type from function preamble
+ 2. Load expected type
+ 3. Compare and branch if equal
+ 4. Trap if mismatch
+ 5. Call/branch to target. */
+
+ rtx temp_operands[3];
+
+ /* Load actual type from memory at offset using ldur. */
+ temp_operands[0] = gen_rtx_REG (SImode, R16_REGNUM); /* w16 */
+ temp_operands[1] = target_reg; /* x register */
+ temp_operands[2] = GEN_INT (offset); /* offset */
+ output_asm_insn ("ldur\t%w0, [%1, #%2]", temp_operands);
+
+ /* Load expected type low 16 bits into w17. */
+ temp_operands[0] = gen_rtx_REG (SImode, R17_REGNUM); /* w17 */
+ temp_operands[1] = GEN_INT (type_id & 0xFFFF);
+ output_asm_insn ("mov\t%w0, #%1", temp_operands);
+
+ /* Load expected type high 16 bits into w17. */
+ temp_operands[0] = gen_rtx_REG (SImode, R17_REGNUM); /* w17 */
+ temp_operands[1] = GEN_INT ((type_id >> 16) & 0xFFFF);
+ output_asm_insn ("movk\t%w0, #%1, lsl #16", temp_operands);
+
+ /* Compare types. */
+ temp_operands[0] = gen_rtx_REG (SImode, R16_REGNUM); /* w16 */
+ temp_operands[1] = gen_rtx_REG (SImode, R17_REGNUM); /* w17 */
+ output_asm_insn ("cmp\t%w0, %w1", temp_operands);
+
+ /* Output conditional branch to call label. */
+ fputs ("\tb.eq\t", asm_out_file);
+ assemble_name (asm_out_file, call_name);
+ fputc ('\n', asm_out_file);
+
+ /* Output trap label and BRK instruction. */
+ ASM_OUTPUT_LABEL (asm_out_file, trap_name);
+
+ /* Calculate and emit BRK with ESR encoding. */
+ unsigned type_index = 17; /* w17 contains expected type. */
+ unsigned addr_index = REGNO (operands[0]) - R0_REGNUM;
+ unsigned esr_value = 0x8000 | ((type_index & 31) << 5) | (addr_index & 31);
+
+ temp_operands[0] = GEN_INT (esr_value);
+ output_asm_insn ("brk\t#%0", temp_operands);
+
+ /* Output call label. */
+ ASM_OUTPUT_LABEL (asm_out_file, call_name);
+
+ /* Return appropriate call instruction based on SIBLING_CALL_P. */
+ if (SIBLING_CALL_P (insn))
+ return aarch64_indirect_branch_asm (operands[0]);
+ else
+ return aarch64_indirect_call_asm (operands[0]);
+}
+
+#undef TARGET_KCFI_SUPPORTED
+#define TARGET_KCFI_SUPPORTED hook_bool_void_true
+
struct gcc_target targetm = TARGET_INITIALIZER;
#include "gt-aarch64.h"
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index fedbd4026a06..3a91f681dc89 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1483,6 +1483,19 @@
}"
)
+;; KCFI indirect call - matches KCFI wrapper RTL
+(define_insn "*call_insn"
+ [(kcfi (call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "Ucr"))
+ (match_operand 1 "" ""))
+ (match_operand 3 "const_int_operand"))
+ (unspec:DI [(match_operand:DI 2 "const_int_operand")] UNSPEC_CALLEE_ABI)
+ (clobber (reg:DI LR_REGNUM))]
+ "!SIBLING_CALL_P (insn)"
+{
+ return aarch64_output_kcfi_insn (insn, operands);
+}
+ [(set_attr "type" "call")])
+
(define_insn "*call_insn"
[(call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand"))
(match_operand 1 "" ""))
@@ -1510,6 +1523,20 @@
}"
)
+;; KCFI call with return value - matches KCFI wrapper RTL
+(define_insn "*call_value_insn"
+ [(set (match_operand 0 "" "")
+ (kcfi (call (mem:DI (match_operand:DI 1 "aarch64_call_insn_operand" "Ucr"))
+ (match_operand 2 "" ""))
+ (match_operand 4 "const_int_operand")))
+ (unspec:DI [(match_operand:DI 3 "const_int_operand")] UNSPEC_CALLEE_ABI)
+ (clobber (reg:DI LR_REGNUM))]
+ "!SIBLING_CALL_P (insn)"
+{
+ return aarch64_output_kcfi_insn (insn, &operands[1]);
+}
+ [(set_attr "type" "call")])
+
(define_insn "*call_value_insn"
[(set (match_operand 0 "" "")
(call (mem:DI (match_operand:DI 1 "aarch64_call_insn_operand"))
@@ -1550,6 +1577,19 @@
}
)
+;; KCFI sibling call - matches KCFI wrapper RTL
+(define_insn "*sibcall_insn"
+ [(kcfi (call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "Ucs"))
+ (match_operand 1 ""))
+ (match_operand 3 "const_int_operand"))
+ (unspec:DI [(match_operand:DI 2 "const_int_operand")] UNSPEC_CALLEE_ABI)
+ (return)]
+ "SIBLING_CALL_P (insn)"
+{
+ return aarch64_output_kcfi_insn (insn, operands);
+}
+ [(set_attr "type" "branch")])
+
(define_insn "*sibcall_insn"
[(call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "Ucs, Usf"))
(match_operand 1 ""))
@@ -1558,16 +1598,27 @@
"SIBLING_CALL_P (insn)"
{
if (which_alternative == 0)
- {
- output_asm_insn ("br\\t%0", operands);
- return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
- }
+ return aarch64_indirect_branch_asm (operands[0]);
return "b\\t%c0";
}
[(set_attr "type" "branch, branch")
(set_attr "sls_length" "retbr,none")]
)
+;; KCFI sibling call with return value - matches KCFI wrapper RTL
+(define_insn "*sibcall_value_insn"
+ [(set (match_operand 0 "")
+ (kcfi (call (mem:DI (match_operand:DI 1 "aarch64_call_insn_operand" "Ucs"))
+ (match_operand 2 ""))
+ (match_operand 4 "const_int_operand")))
+ (unspec:DI [(match_operand:DI 3 "const_int_operand")] UNSPEC_CALLEE_ABI)
+ (return)]
+ "SIBLING_CALL_P (insn)"
+{
+ return aarch64_output_kcfi_insn (insn, &operands[1]);
+}
+ [(set_attr "type" "branch")])
+
(define_insn "*sibcall_value_insn"
[(set (match_operand 0 "")
(call (mem:DI
@@ -1578,10 +1629,7 @@
"SIBLING_CALL_P (insn)"
{
if (which_alternative == 0)
- {
- output_asm_insn ("br\\t%1", operands);
- return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
- }
+ return aarch64_indirect_branch_asm (operands[1]);
return "b\\t%c1";
}
[(set_attr "type" "branch, branch")
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d44e7015facf..45efc75a3b05 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18427,6 +18427,20 @@ check-call bundle are considered ABI, as the Linux kernel may
optionally rewrite these areas at boot time to mitigate detected CPU
errata.
+On AArch64, KCFI type identifiers are emitted as a @code{.word ID}
+directive (a 32-bit constant) before the function entry. AArch64's
+natural 4-byte instruction alignment eliminates the need for additional
+padding NOPs. When used with @option{-fpatchable-function-entry}, the
+type identifier is placed before any patchable NOPs. The runtime check
+uses @code{x16} and @code{x17} as scratch registers. Type mismatches
+trigger a @code{brk} instruction with an immediate value that encodes
+both the expected type register index and the target address register
+index in the format @code{0x8000 | (type_reg << 5) | addr_reg}. This
+encoding is captured in the ESR (Exception Syndrome Register) when the
+trap is taken, allowing the kernel to identify both the KCFI violation
+and the involved registers for detailed diagnostics (eliminating the need
+for a separate @code{.kcfi_traps} section as used on x86_64).
+
KCFI is intended primarily for kernel code and may not be suitable
for user-space applications that rely on techniques incompatible
with strict type checking of indirect calls.
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v2 5/7] arm: Add ARM 32-bit Kernel Control Flow Integrity implementation
2025-09-05 0:24 [PATCH v2 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
` (3 preceding siblings ...)
2025-09-05 0:24 ` [PATCH v2 4/7] aarch64: Add AArch64 " Kees Cook
@ 2025-09-05 0:24 ` Kees Cook
2025-09-11 7:49 ` Ard Biesheuvel
2025-09-05 0:24 ` [PATCH v2 6/7] riscv: Add RISC-V " Kees Cook
2025-09-05 0:24 ` [PATCH v2 7/7] kcfi: Add regression test suite Kees Cook
6 siblings, 1 reply; 32+ messages in thread
From: Kees Cook @ 2025-09-05 0:24 UTC (permalink / raw)
To: Qing Zhao
Cc: Kees Cook, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
Implement ARM 32-bit KCFI backend supporting ARMv7+:
- Function preamble generation using .word directives for type ID storage
at -4 byte offset from function entry point (no prefix NOPs needed due to
4-byte instruction alignment).
- Use movw/movt instructions for 32-bit immediate loading.
- Trap diagnostics via the UDF instruction's immediate encoding, following
the AArch64 BRK pattern of encoding which registers hold useful contents.
- Scratch register allocation using r0/r1, following the ARM procedure call
standard for caller-saved temporary registers, though they are
spilled to the stack due to register pressure.
Assembly Code Pattern for ARM 32-bit:
push {r0, r1} ; Spill r0, r1
ldr r0, [target, #-4] ; Load actual type ID from preamble
movw r1, #type_id_low ; Load expected type (lower 16 bits)
movt r1, #type_id_high ; Load upper 16 bits with top instruction
cmp r0, r1 ; Compare type IDs directly
pop {r0, r1} ; Reload r0, r1
beq .Lkcfi_call ; Branch if typeids match
.Lkcfi_trap: udf #udf_value ; Undefined instruction trap with encoding
.Lkcfi_call: blx/bx target ; Execute validated indirect transfer
UDF Immediate Encoding (following AArch64 ESR pattern):
- UDF instruction immediate encoding format:
0x8000 | ((ExpectedTypeReg & 31) << 5) | (TargetAddrReg & 31)
- ExpectedTypeReg indicates which register contains expected type (R12 = 12)
- TargetAddrReg indicates which register contains target address (0-15)
- Example: udf #33154 (0x8182) = expected type in R12, target address in R2
Build and run tested with Linux kernel ARCH=arm.
gcc/ChangeLog:
* config/arm/arm-protos.h: Declare KCFI helpers.
* config/arm/arm.cc (arm_maybe_wrap_call_with_kcfi): New function.
(arm_maybe_wrap_call_value_with_kcfi): New function.
(arm_output_kcfi_insn): Emit KCFI assembly.
* config/arm/arm.md: Add KCFI RTL patterns and hook expansion.
* doc/invoke.texi: Document arm32 nuances.
Signed-off-by: Kees Cook <kees@kernel.org>
---
gcc/config/arm/arm-protos.h | 4 +
gcc/config/arm/arm.cc | 144 ++++++++++++++++++++++++++++++++++++
gcc/config/arm/arm.md | 62 ++++++++++++++++
gcc/doc/invoke.texi | 17 +++++
4 files changed, 227 insertions(+)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index ff7e7658f912..ad3dc522e2b9 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -607,6 +607,10 @@ void arm_initialize_isa (sbitmap, const enum isa_feature *);
const char * arm_gen_far_branch (rtx *, int, const char * , const char *);
+rtx arm_maybe_wrap_call_with_kcfi (rtx, rtx);
+rtx arm_maybe_wrap_call_value_with_kcfi (rtx, rtx);
+const char *arm_output_kcfi_insn (rtx_insn *, rtx *);
+
bool arm_mve_immediate_check(rtx, machine_mode, bool);
opt_machine_mode arm_mve_data_mode (scalar_mode, poly_uint64);
diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 8b951f3d4a67..b74abc1aafcf 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -77,6 +77,8 @@
#include "aarch-common-protos.h"
#include "machmode.h"
#include "arm-builtins.h"
+#include "kcfi.h"
+#include "flags.h"
/* This file should be included last. */
#include "target-def.h"
@@ -35803,6 +35805,148 @@ arm_mode_base_reg_class (machine_mode mode)
return MODE_BASE_REG_REG_CLASS (mode);
}
+/* ARM KCFI target hook implementations. */
+
+/* KCFI wrapper helper functions for .md file */
+
+/* Apply KCFI wrapping to call pattern if needed. */
+rtx
+arm_maybe_wrap_call_with_kcfi (rtx pat, rtx addr)
+{
+ /* Only indirect calls need KCFI instrumentation. */
+ bool is_direct_call = SYMBOL_REF_P (addr);
+ if (!is_direct_call)
+ {
+ rtx kcfi_type_rtx = kcfi_get_call_type_id ();
+ if (kcfi_type_rtx)
+ {
+ /* Extract the CALL from the PARALLEL and wrap it with KCFI */
+ rtx call_rtx = XVECEXP (pat, 0, 0);
+ rtx kcfi_call = gen_rtx_KCFI (VOIDmode, call_rtx, kcfi_type_rtx);
+
+ /* Replace the CALL in the PARALLEL with the KCFI-wrapped call */
+ XVECEXP (pat, 0, 0) = kcfi_call;
+ }
+ }
+ return pat;
+}
+
+/* Apply KCFI wrapping to call_value pattern if needed. */
+rtx
+arm_maybe_wrap_call_value_with_kcfi (rtx pat, rtx addr)
+{
+ /* Only indirect calls need KCFI instrumentation. */
+ bool is_direct_call = SYMBOL_REF_P (addr);
+ if (!is_direct_call)
+ {
+ rtx kcfi_type_rtx = kcfi_get_call_type_id ();
+ if (kcfi_type_rtx)
+ {
+ /* Extract the SET from the PARALLEL and wrap its CALL with KCFI */
+ rtx set_rtx = XVECEXP (pat, 0, 0);
+ rtx call_rtx = SET_SRC (set_rtx);
+ rtx kcfi_call = gen_rtx_KCFI (VOIDmode, call_rtx, kcfi_type_rtx);
+
+ /* Replace the CALL in the SET with the KCFI-wrapped call */
+ SET_SRC (set_rtx) = kcfi_call;
+ }
+ }
+ return pat;
+}
+
+const char *
+arm_output_kcfi_insn (rtx_insn *insn, rtx *operands)
+{
+ /* KCFI requires movw/movt instructions for type ID loading. */
+ if (!TARGET_HAVE_MOVT)
+ sorry ("%<-fsanitize=kcfi%> requires movw/movt instructions (ARMv7 or later)");
+
+ /* KCFI type id. */
+ uint32_t type_id = INTVAL (operands[2]);
+
+ /* Calculate typeid offset from call target. */
+ HOST_WIDE_INT offset = -(4 + kcfi_patchable_entry_prefix_nops);
+
+ /* Calculate trap immediate. */
+ unsigned addr_reg_num = REGNO (operands[0]);
+ unsigned udf_immediate = 0x8000 | (0x1F << 5) | (addr_reg_num & 31);
+
+ /* Generate labels internally. */
+ rtx trap_label = gen_label_rtx ();
+ rtx call_label = gen_label_rtx ();
+
+ /* Get label numbers for custom naming. */
+ int trap_labelno = CODE_LABEL_NUMBER (trap_label);
+ int call_labelno = CODE_LABEL_NUMBER (call_label);
+
+ /* Generate custom label names. */
+ char trap_name[32];
+ char call_name[32];
+ ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", trap_labelno);
+ ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", call_labelno);
+
+ /* Create memory operand for the type load */
+ rtx mem_op = gen_rtx_MEM (SImode, gen_rtx_PLUS (SImode, operands[0], GEN_INT(offset)));
+ rtx temp_operands[6];
+
+ /* Spill r0 and r1 to stack */
+ output_asm_insn ("push\t{r0, r1}", NULL);
+
+ /* Load actual type from memory using r0 */
+ temp_operands[0] = gen_rtx_REG (SImode, 0); /* r0 */
+ temp_operands[1] = mem_op;
+ output_asm_insn ("ldr\t%0, %1", temp_operands);
+
+ /* Load expected type low 16 bits into r1 */
+ temp_operands[0] = gen_rtx_REG (SImode, 1); /* r1 */
+ temp_operands[1] = GEN_INT (type_id & 0xFFFF);
+ output_asm_insn ("movw\t%0, %1", temp_operands);
+
+ /* Load expected type high 16 bits into r1 */
+ temp_operands[0] = gen_rtx_REG (SImode, 1); /* r1 */
+ temp_operands[1] = GEN_INT ((type_id >> 16) & 0xFFFF);
+ output_asm_insn ("movt\t%0, %1", temp_operands);
+
+ /* Compare types */
+ temp_operands[0] = gen_rtx_REG (SImode, 0); /* r0 */
+ temp_operands[1] = gen_rtx_REG (SImode, 1); /* r1 */
+ output_asm_insn ("cmp\t%0, %1", temp_operands);
+
+ /* Restore r0 and r1 from stack */
+ output_asm_insn ("pop\t{r0, r1}", NULL);
+
+ /* Output conditional branch to call label. */
+ fputs ("\tbeq\t", asm_out_file);
+ assemble_name (asm_out_file, call_name);
+ fputc ('\n', asm_out_file);
+
+ /* Output trap label and UDF instruction. */
+ ASM_OUTPUT_LABEL (asm_out_file, trap_name);
+ temp_operands[0] = GEN_INT (udf_immediate);
+ output_asm_insn ("udf\t%0", temp_operands);
+
+ /* Output pass/call label. */
+ ASM_OUTPUT_LABEL (asm_out_file, call_name);
+
+ /* Handle calls through lr by using ip (which the callee may clobber anyway). */
+ if (REGNO (operands[0]) == LR_REGNUM)
+ {
+ operands[0] = gen_rtx_REG (SImode, IP_REGNUM);
+ output_asm_insn ("mov\t%0, lr", operands);
+ }
+
+ /* Call or tail call instruction */
+ if (SIBLING_CALL_P (insn))
+ output_asm_insn ("bx\t%0", operands);
+ else
+ output_asm_insn ("blx\t%0", operands);
+
+ return "";
+}
+
+#undef TARGET_KCFI_SUPPORTED
+#define TARGET_KCFI_SUPPORTED hook_bool_void_true
+
#undef TARGET_DOCUMENTATION_NAME
#define TARGET_DOCUMENTATION_NAME "ARM"
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 422ae549b65b..238220ae6417 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -8629,6 +8629,7 @@
else
{
pat = gen_call_internal (operands[0], operands[1], operands[2]);
+ pat = arm_maybe_wrap_call_with_kcfi (pat, XEXP (operands[0], 0));
arm_emit_call_insn (pat, XEXP (operands[0], 0), false);
}
@@ -8687,6 +8688,20 @@
}
)
+;; KCFI indirect call - KCFI wraps just the call pattern
+(define_insn "*kcfi_call_reg"
+ [(kcfi (call (mem:SI (match_operand:SI 0 "s_register_operand" "r"))
+ (match_operand 1 "" ""))
+ (match_operand 2 "const_int_operand"))
+ (use (match_operand 3 "" ""))
+ (clobber (reg:SI LR_REGNUM))]
+ "TARGET_32BIT && !SIBLING_CALL_P (insn) && arm_ccfsm_state == 0"
+{
+ return arm_output_kcfi_insn (insn, operands);
+}
+ [(set_attr "type" "call")
+ (set_attr "length" "36")])
+
(define_insn "*call_reg_armv5"
[(call (mem:SI (match_operand:SI 0 "s_register_operand" "r"))
(match_operand 1 "" ""))
@@ -8753,6 +8768,7 @@
{
pat = gen_call_value_internal (operands[0], operands[1],
operands[2], operands[3]);
+ pat = arm_maybe_wrap_call_value_with_kcfi (pat, XEXP (operands[1], 0));
arm_emit_call_insn (pat, XEXP (operands[1], 0), false);
}
@@ -8799,6 +8815,21 @@
}
}")
+;; KCFI indirect call_value - KCFI wraps just the call pattern
+(define_insn "*kcfi_call_value_reg"
+ [(set (match_operand 0 "" "")
+ (kcfi (call (mem:SI (match_operand:SI 1 "s_register_operand" "r"))
+ (match_operand 2 "" ""))
+ (match_operand 3 "const_int_operand")))
+ (use (match_operand 4 "" ""))
+ (clobber (reg:SI LR_REGNUM))]
+ "TARGET_32BIT && !SIBLING_CALL_P (insn) && arm_ccfsm_state == 0"
+{
+ return arm_output_kcfi_insn (insn, &operands[1]);
+}
+ [(set_attr "type" "call")
+ (set_attr "length" "36")])
+
(define_insn "*call_value_reg_armv5"
[(set (match_operand 0 "" "")
(call (mem:SI (match_operand:SI 1 "s_register_operand" "r"))
@@ -8901,6 +8932,7 @@
operands[2] = const0_rtx;
pat = gen_sibcall_internal (operands[0], operands[1], operands[2]);
+ pat = arm_maybe_wrap_call_with_kcfi (pat, XEXP (operands[0], 0));
arm_emit_call_insn (pat, operands[0], true);
DONE;
}"
@@ -8935,11 +8967,26 @@
pat = gen_sibcall_value_internal (operands[0], operands[1],
operands[2], operands[3]);
+ pat = arm_maybe_wrap_call_value_with_kcfi (pat, XEXP (operands[1], 0));
arm_emit_call_insn (pat, operands[1], true);
DONE;
}"
)
+;; KCFI sibling call - KCFI wraps just the call pattern
+(define_insn "*kcfi_sibcall_insn"
+ [(kcfi (call (mem:SI (match_operand:SI 0 "s_register_operand" "Cs"))
+ (match_operand 1 "" ""))
+ (match_operand 2 "const_int_operand"))
+ (return)
+ (use (match_operand 3 "" ""))]
+ "TARGET_32BIT && SIBLING_CALL_P (insn) && arm_ccfsm_state == 0"
+{
+ return arm_output_kcfi_insn (insn, operands);
+}
+ [(set_attr "type" "call")
+ (set_attr "length" "36")])
+
(define_insn "*sibcall_insn"
[(call (mem:SI (match_operand:SI 0 "call_insn_operand" "Cs, US"))
(match_operand 1 "" ""))
@@ -8960,6 +9007,21 @@
[(set_attr "type" "call")]
)
+;; KCFI sibling call with return value - KCFI wraps just the call pattern
+(define_insn "*kcfi_sibcall_value_insn"
+ [(set (match_operand 0 "" "")
+ (kcfi (call (mem:SI (match_operand:SI 1 "s_register_operand" "Cs"))
+ (match_operand 2 "" ""))
+ (match_operand 3 "const_int_operand")))
+ (return)
+ (use (match_operand 4 "" ""))]
+ "TARGET_32BIT && SIBLING_CALL_P (insn) && arm_ccfsm_state == 0"
+{
+ return arm_output_kcfi_insn (insn, &operands[1]);
+}
+ [(set_attr "type" "call")
+ (set_attr "length" "36")])
+
(define_insn "*sibcall_value_insn"
[(set (match_operand 0 "" "")
(call (mem:SI (match_operand:SI 1 "call_insn_operand" "Cs,US"))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 45efc75a3b05..25ee82c9cba7 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18441,6 +18441,23 @@ trap is taken, allowing the kernel to identify both the KCFI violation
and the involved registers for detailed diagnostics (eliminating the need
for a separate @code{.kcfi_traps} section as used on x86_64).
+On ARM 32-bit, KCFI type identifiers are emitted as a @code{.word ID}
+directive (a 32-bit constant) before the function entry. ARM's
+natural 4-byte instruction alignment eliminates the need for additional
+padding NOPs. When used with @option{-fpatchable-function-entry}, the
+type identifier is placed before any patchable NOPs. The runtime check
+preserves argument registers @code{r0} and @code{r1} using @code{push}
+and @code{pop} instructions, then uses them as scratch registers for
+the type comparison. The expected type is loaded using @code{movw} and
+@code{movt} instruction pairs for 32-bit immediate values. Type mismatches
+trigger a @code{udf} instruction with an immediate value that encodes
+both the expected type register index and the target address register
+index in the format @code{0x8000 | (type_reg << 5) | addr_reg}. This
+encoding is captured in the UDF immediate field when the trap is taken,
+allowing the kernel to identify both the KCFI violation and the involved
+registers for detailed diagnostics (eliminating the need for a separate
+@code{.kcfi_traps} section as used on x86_64).
+
KCFI is intended primarily for kernel code and may not be suitable
for user-space applications that rely on techniques incompatible
with strict type checking of indirect calls.
--
2.34.1
* Re: [PATCH v2 5/7] arm: Add ARM 32-bit Kernel Control Flow Integrity implementation
2025-09-05 0:24 ` [PATCH v2 5/7] arm: Add ARM 32-bit " Kees Cook
@ 2025-09-11 7:49 ` Ard Biesheuvel
2025-09-12 9:03 ` Kees Cook
0 siblings, 1 reply; 32+ messages in thread
From: Ard Biesheuvel @ 2025-09-11 7:49 UTC (permalink / raw)
To: Kees Cook
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
On Fri, 5 Sept 2025 at 02:24, Kees Cook <kees@kernel.org> wrote:
>
> Implement ARM 32-bit KCFI backend supporting ARMv7+:
>
> - Function preamble generation using .word directives for type ID storage
> at -4 byte offset from function entry point (no prefix NOPs needed due to
> 4-byte instruction alignment).
>
> - Use movw/movt instructions for 32-bit immediate loading.
>
> - Trap debugging through UDF instruction immediate encoding following
> AArch64 BRK pattern for encoding registers with useful contents.
>
> - Scratch register allocation using r0/r1 following ARM procedure call
> standard for caller-saved temporary registers, though they get
> stack spilled due to register pressure.
>
> Assembly Code Pattern for ARM 32-bit:
> push {r0, r1} ; Spill r0, r1
> ldr r0, [target, #-4] ; Load actual type ID from preamble
> movw r1, #type_id_low ; Load expected type (lower 16 bits)
> movt r1, #type_id_high ; Load upper 16 bits with top instruction
> cmp r0, r1 ; Compare type IDs directly
> pop [r0, r1] ; Reload r0, r1
We could avoid the MOVW/MOVT pair and the spilling by doing something
along the lines of
ldr ip, [target, #-4]
eor ip, ip, #type_id[0]
eor ip, ip, #type_id[1] << 8
eor ip, ip, #type_id[2] << 16
eors ip, ip, #type_id[3] << 24
ldrne ip, =type_id[3:0]
Note that IP (R12) should be dead before a function call. Here it is
conditionally loaded with the expected target typeid, removing the
need to decode the instructions to recover it when the trap occurs.
This should compile to Thumb2 as well as ARM encodings.
> beq .Lkcfi_call ; Branch if typeids match
> .Lkcfi_trap: udf #udf_value ; Undefined instruction trap with encoding
> .Lkcfi_call: blx/bx target ; Execute validated indirect transfer
>
> UDF Immediate Encoding (following AArch64 ESR pattern):
> - UDF instruction immediate encoding format:
> 0x8000 | ((ExpectedTypeReg & 31) << 5) | (TargetAddrReg & 31)
> - ExpectedTypeReg indicates which register contains expected type (R12 = 12)
> - TargetAddrReg indicates which register contains target address (0-15)
> - Example: udf #33154 (0x817A) = expected type in R12, target address in R2
>
> Build and run tested with Linux kernel ARCH=arm.
>
> gcc/ChangeLog:
>
> config/arm/arm-protos.h: Declare KCFI helpers.
> config/arm/arm.cc (arm_maybe_wrap_call_with_kcfi): New function.
> (arm_maybe_wrap_call_value_with_kcfi): New function.
> (arm_output_kcfi_insn): Emit KCFI assembly.
> config/arm/arm.md: Add KCFI RTL patterns and hook expansion.
> doc/invoke.texi: Document arm32 nuances.
>
> Signed-off-by: Kees Cook <kees@kernel.org>
> ---
> gcc/config/arm/arm-protos.h | 4 +
> gcc/config/arm/arm.cc | 144 ++++++++++++++++++++++++++++++++++++
> gcc/config/arm/arm.md | 62 ++++++++++++++++
> gcc/doc/invoke.texi | 17 +++++
> 4 files changed, 227 insertions(+)
>
* Re: [PATCH v2 5/7] arm: Add ARM 32-bit Kernel Control Flow Integrity implementation
2025-09-11 7:49 ` Ard Biesheuvel
@ 2025-09-12 9:03 ` Kees Cook
2025-09-12 9:08 ` Kees Cook
0 siblings, 1 reply; 32+ messages in thread
From: Kees Cook @ 2025-09-12 9:03 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
On Thu, Sep 11, 2025 at 09:49:56AM +0200, Ard Biesheuvel wrote:
> On Fri, 5 Sept 2025 at 02:24, Kees Cook <kees@kernel.org> wrote:
> >
> > Implement ARM 32-bit KCFI backend supporting ARMv7+:
> >
> > - Function preamble generation using .word directives for type ID storage
> > at -4 byte offset from function entry point (no prefix NOPs needed due to
> > 4-byte instruction alignment).
> >
> > - Use movw/movt instructions for 32-bit immediate loading.
> >
> > - Trap debugging through UDF instruction immediate encoding following
> > AArch64 BRK pattern for encoding registers with useful contents.
> >
> > - Scratch register allocation using r0/r1 following ARM procedure call
> > standard for caller-saved temporary registers, though they get
> > stack spilled due to register pressure.
> >
> > Assembly Code Pattern for ARM 32-bit:
> > push {r0, r1} ; Spill r0, r1
> > ldr r0, [target, #-4] ; Load actual type ID from preamble
> > movw r1, #type_id_low ; Load expected type (lower 16 bits)
> > movt r1, #type_id_high ; Load upper 16 bits with top instruction
> > cmp r0, r1 ; Compare type IDs directly
> > pop [r0, r1] ; Reload r0, r1
>
> We could avoid the MOVW/MOVT pair and the spilling by doing something
> along the lines of
>
> ldr ip, [target, #-4]
> eor ip, ip, #type_id[0]
> eor ip, ip, #type_id[1] << 8
> eor ip, ip, #type_id[2] << 16
> eors ip, ip, #type_id[3] << 24
> ldrne ip, =type_id[3:0]
Ah-ha, nice. And it could re-load the type_id on the slow path instead
of unconditionally, I guess? (So no "ne" suffix needed there.)
...
eors ip, ip, #type_id[3] << 24
beq .Lkcfi_call
.Lkcfi_trap:
ldr ip, =type_id[3:0]
udf #nnn
.Lkcfi_call:
blx target
>
> Note that IP (R12) should be dead before a function call. Here it is
> conditionally loaded with the expected target typeid, removing the
> need to decode the instructions to recover it when the trap occurs.
>
> This should compile to Thumb2 as well as ARM encodings.
Won't IP get used as the target register if r0-r3 are used for passing
arguments? AAPCS implies this is how it'll go (4 arguments in registers,
the rest on stack), but when I tried to force this to happen, it looked
like it'd only pass 3 via registers, and would make the call with r3.
I can't see if this is safe to unconditionally use IP?
--
Kees Cook
* Re: [PATCH v2 5/7] arm: Add ARM 32-bit Kernel Control Flow Integrity implementation
2025-09-12 9:03 ` Kees Cook
@ 2025-09-12 9:08 ` Kees Cook
2025-09-12 9:43 ` Ard Biesheuvel
0 siblings, 1 reply; 32+ messages in thread
From: Kees Cook @ 2025-09-12 9:08 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
On Fri, Sep 12, 2025 at 02:03:08AM -0700, Kees Cook wrote:
> On Thu, Sep 11, 2025 at 09:49:56AM +0200, Ard Biesheuvel wrote:
> > On Fri, 5 Sept 2025 at 02:24, Kees Cook <kees@kernel.org> wrote:
> > >
> > > Implement ARM 32-bit KCFI backend supporting ARMv7+:
> > >
> > > - Function preamble generation using .word directives for type ID storage
> > > at -4 byte offset from function entry point (no prefix NOPs needed due to
> > > 4-byte instruction alignment).
> > >
> > > - Use movw/movt instructions for 32-bit immediate loading.
> > >
> > > - Trap debugging through UDF instruction immediate encoding following
> > > AArch64 BRK pattern for encoding registers with useful contents.
> > >
> > > - Scratch register allocation using r0/r1 following ARM procedure call
> > > standard for caller-saved temporary registers, though they get
> > > stack spilled due to register pressure.
> > >
> > > Assembly Code Pattern for ARM 32-bit:
> > > push {r0, r1} ; Spill r0, r1
> > > ldr r0, [target, #-4] ; Load actual type ID from preamble
> > > movw r1, #type_id_low ; Load expected type (lower 16 bits)
> > > movt r1, #type_id_high ; Load upper 16 bits with top instruction
> > > cmp r0, r1 ; Compare type IDs directly
> > > pop [r0, r1] ; Reload r0, r1
> >
> > We could avoid the MOVW/MOVT pair and the spilling by doing something
> > along the lines of
> >
> > ldr ip, [target, #-4]
> > eor ip, ip, #type_id[0]
> > eor ip, ip, #type_id[1] << 8
> > eor ip, ip, #type_id[2] << 16
> > eors ip, ip, #type_id[3] << 24
> > ldrne ip, =type_id[3:0]
>
> Ah-ha, nice. And it could re-load the type_id on the slow path instead
> of unconditionally, I guess? (So no "ne" suffix needed there.)
>
> ...
> eors ip, ip, #type_id[3] << 24
> beq .Lkcfi_call
> .Lkcfi_trap:
> ldr ip, =type_id[3:0]
> udf #nnn
> .Lkcfi_call:
> blx target
>
>
> >
> > Note that IP (R12) should be dead before a function call. Here it is
> > conditionally loaded with the expected target typeid, removing the
> > need to decode the instructions to recover it when the trap occurs.
> >
> > This should compile to Thumb2 as well as ARM encodings.
>
> Won't IP get used as the target register if r0-r3 are used for passing
> arguments? AAPCS implies this is how it'll go (4 arguments in registers,
> the rest on stack), but when I tried to force this to happen, it looked
> like it'd only pass 3 via registers, and would make the call with r3.
Wait, I misread, my test is using r4 as the target! Still, is IP guaranteed
to never be used for the target?
-Kees
--
Kees Cook
* Re: [PATCH v2 5/7] arm: Add ARM 32-bit Kernel Control Flow Integrity implementation
2025-09-12 9:08 ` Kees Cook
@ 2025-09-12 9:43 ` Ard Biesheuvel
2025-09-12 19:01 ` Kees Cook
0 siblings, 1 reply; 32+ messages in thread
From: Ard Biesheuvel @ 2025-09-12 9:43 UTC (permalink / raw)
To: Kees Cook
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
On Fri, 12 Sept 2025 at 11:08, Kees Cook <kees@kernel.org> wrote:
>
> On Fri, Sep 12, 2025 at 02:03:08AM -0700, Kees Cook wrote:
> > On Thu, Sep 11, 2025 at 09:49:56AM +0200, Ard Biesheuvel wrote:
> > > On Fri, 5 Sept 2025 at 02:24, Kees Cook <kees@kernel.org> wrote:
> > > >
> > > > Implement ARM 32-bit KCFI backend supporting ARMv7+:
> > > >
> > > > - Function preamble generation using .word directives for type ID storage
> > > > at -4 byte offset from function entry point (no prefix NOPs needed due to
> > > > 4-byte instruction alignment).
> > > >
> > > > - Use movw/movt instructions for 32-bit immediate loading.
> > > >
> > > > - Trap debugging through UDF instruction immediate encoding following
> > > > AArch64 BRK pattern for encoding registers with useful contents.
> > > >
> > > > - Scratch register allocation using r0/r1 following ARM procedure call
> > > > standard for caller-saved temporary registers, though they get
> > > > stack spilled due to register pressure.
> > > >
> > > > Assembly Code Pattern for ARM 32-bit:
> > > > push {r0, r1} ; Spill r0, r1
> > > > ldr r0, [target, #-4] ; Load actual type ID from preamble
> > > > movw r1, #type_id_low ; Load expected type (lower 16 bits)
> > > > movt r1, #type_id_high ; Load upper 16 bits with top instruction
> > > > cmp r0, r1 ; Compare type IDs directly
> > > > pop [r0, r1] ; Reload r0, r1
> > >
> > > We could avoid the MOVW/MOVT pair and the spilling by doing something
> > > along the lines of
> > >
> > > ldr ip, [target, #-4]
> > > eor ip, ip, #type_id[0]
> > > eor ip, ip, #type_id[1] << 8
> > > eor ip, ip, #type_id[2] << 16
> > > eors ip, ip, #type_id[3] << 24
> > > ldrne ip, =type_id[3:0]
> >
> > Ah-ha, nice. And it could re-load the type_id on the slow path instead
> > of unconditionally, I guess? (So no "ne" suffix needed there.)
> >
> > ...
> > eors ip, ip, #type_id[3] << 24
> > beq .Lkcfi_call
> > .Lkcfi_trap:
> > ldr ip, =type_id[3:0]
Yeah better. If you use the right compiler abstraction to emit this
load, it will be turned into MOVW/MOVT if the target supports it.
> > udf #nnn
> > .Lkcfi_call:
> > blx target
> >
> >
> > >
> > > Note that IP (R12) should be dead before a function call. Here it is
> > > conditionally loaded with the expected target typeid, removing the
> > > need to decode the instructions to recover it when the trap occurs.
> > >
> > > This should compile to Thumb2 as well as ARM encodings.
> >
> > Won't IP get used as the target register if r0-r3 are used for passing
> > arguments? AAPCS implies this is how it'll go (4 arguments in registers,
> > the rest on stack), but when I tried to force this to happen, it looked
> > like it'd only pass 3 via registers, and would make the call with r3.
>
> Wait, I misread, my test is using r4 as the target! Still, is IP guaranteed
> to never be used for the target?
>
The target register can be any GPR. IP is guaranteed by AAPCS not to
play a role in parameter passing, because it is the Inter Procedural
scratch register, and may be clobbered by PLT trampolines that get
inserted between a direct call and its target. These are not direct
calls, of course, but the callee does not know that, and so it cannot
make any assumptions about the value of IP.
That said, I'm not sure I understand why this type register has to be
a fixed register. It /can/ be a fixed register, but you'd have to tell
the compiler that. In that case, it can still use the link register
for the target, unless it is emitting a tail call and LR needs to be
preserved. The upshot of that would be that some tail calls will be
converted into ordinary calls, due to the need to preserve some
registers on the stack. But I'd still assume letting the compiler do
this when needed is better than always pushing/popping two registers
in the CFI call sequence.
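For reference, the byte-wise EOR check sketched above can be modeled in C. This is a hypothetical helper (the name and structure are illustrative, not part of the patch): each EOR immediate covers one byte of the 32-bit type ID, since ARM modified immediates can encode an 8-bit value at any even rotation, and XOR-ing all four bytes into the loaded value leaves zero exactly when the types match.

```c
#include <stdint.h>

/* Model of the ldr/eor/eor/eor/eors sequence: returns 0 iff the
   type ID loaded from the target's preamble matches the expected
   type ID, mirroring the Z flag set by the final "eors".  */
static uint32_t
kcfi_eor_residue (uint32_t actual, uint32_t type_id)
{
  uint32_t r = actual;
  r ^= (type_id & 0x000000ffu);   /* eor  ip, ip, #type_id[0]       */
  r ^= (type_id & 0x0000ff00u);   /* eor  ip, ip, #type_id[1] << 8  */
  r ^= (type_id & 0x00ff0000u);   /* eor  ip, ip, #type_id[2] << 16 */
  r ^= (type_id & 0xff000000u);   /* eors ip, ip, #type_id[3] << 24 */
  return r;                       /* zero => Z set => take the call */
}
```

On a mismatch the residue is nonzero, and the slow path reloads the full type ID into IP before the trap so the handler can recover it without decoding instructions.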
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v2 5/7] arm: Add ARM 32-bit Kernel Control Flow Integrity implementation
2025-09-12 9:43 ` Ard Biesheuvel
@ 2025-09-12 19:01 ` Kees Cook
0 siblings, 0 replies; 32+ messages in thread
From: Kees Cook @ 2025-09-12 19:01 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
On Fri, Sep 12, 2025 at 11:43:00AM +0200, Ard Biesheuvel wrote:
> On Fri, 12 Sept 2025 at 11:08, Kees Cook <kees@kernel.org> wrote:
> >
> > On Fri, Sep 12, 2025 at 02:03:08AM -0700, Kees Cook wrote:
> > > On Thu, Sep 11, 2025 at 09:49:56AM +0200, Ard Biesheuvel wrote:
> > > > On Fri, 5 Sept 2025 at 02:24, Kees Cook <kees@kernel.org> wrote:
> > > > >
> > > > > Implement ARM 32-bit KCFI backend supporting ARMv7+:
> > > > >
> > > > > - Function preamble generation using .word directives for type ID storage
> > > > > at -4 byte offset from function entry point (no prefix NOPs needed due to
> > > > > 4-byte instruction alignment).
> > > > >
> > > > > - Use movw/movt instructions for 32-bit immediate loading.
> > > > >
> > > > > - Trap debugging through UDF instruction immediate encoding following
> > > > > AArch64 BRK pattern for encoding registers with useful contents.
> > > > >
> > > > > - Scratch register allocation using r0/r1 following ARM procedure call
> > > > > standard for caller-saved temporary registers, though they get
> > > > > stack spilled due to register pressure.
> > > > >
> > > > > Assembly Code Pattern for ARM 32-bit:
> > > > > push {r0, r1} ; Spill r0, r1
> > > > > ldr r0, [target, #-4] ; Load actual type ID from preamble
> > > > > movw r1, #type_id_low ; Load expected type (lower 16 bits)
> > > > > movt r1, #type_id_high ; Load upper 16 bits with top instruction
> > > > > cmp r0, r1 ; Compare type IDs directly
> > > > > pop {r0, r1} ; Reload r0, r1
> > > >
> > > > We could avoid the MOVW/MOVT pair and the spilling by doing something
> > > > along the lines of
> > > >
> > > > ldr ip, [target, #-4]
> > > > eor ip, ip, #type_id[0]
> > > > eor ip, ip, #type_id[1] << 8
> > > > eor ip, ip, #type_id[2] << 16
> > > > eors ip, ip, #type_id[3] << 24
> > > > ldrne ip, =type_id[3:0]
> > >
> > > Ah-ha, nice. And it could re-load the type_id on the slow path instead
> > > of unconditionally, I guess? (So no "ne" suffix needed there.)
> > >
> > > ...
> > > eors ip, ip, #type_id[3] << 24
> > > beq .Lkcfi_call
> > > .Lkcfi_trap:
> > > ldr ip, =type_id[3:0]
>
> Yeah better. If you use the right compiler abstraction to emit this
> load, it will be turned into MOVW/MOVT if the target supports it.
>
> > > udf #nnn
> > > .Lkcfi_call:
> > > blx target
> > >
> > >
> > > >
> > > > Note that IP (R12) should be dead before a function call. Here it is
> > > > conditionally loaded with the expected target typeid, removing the
> > > > need to decode the instructions to recover it when the trap occurs.
> > > >
> > > > This should compile to Thumb2 as well as ARM encodings.
> > >
> > > Won't IP get used as the target register if r0-r3 are used for passing
> > > arguments? AAPCS implies this is how it'll go (4 arguments in registers,
> > > the rest on stack), but when I tried to force this to happen, it looked
> > > like it'd only pass 3 via registers, and would make the call with r3.
> >
> > Wait, I misread, my test is using r4 as the target! Still, is IP guaranteed
> > to never be used for the target?
> >
>
> The target register can be any GPR. IP is guaranteed by AAPCS not to
> play a role in parameter passing, because it is the Inter Procedural
> scratch register, and may be clobbered by PLT trampolines that get
> inserted between a direct call and its target. These are not direct
> calls, of course, but the callee does not know that, and so it cannot
> make any assumptions about the value of IP.
Okay, it seems like I am close to having this replaced with the eor
method, but the backend really really does not like constructing the ldr
for me. I may leave this as a "future improvement", and just change the
Linux side of the trap handling to decode the eor insns instead of
pulling the value out of IP.
--
Kees Cook
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v2 6/7] riscv: Add RISC-V Kernel Control Flow Integrity implementation
2025-09-05 0:24 [PATCH v2 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
` (4 preceding siblings ...)
2025-09-05 0:24 ` [PATCH v2 5/7] arm: Add ARM 32-bit " Kees Cook
@ 2025-09-05 0:24 ` Kees Cook
2025-09-16 3:40 ` Jeff Law
2025-09-05 0:24 ` [PATCH v2 7/7] kcfi: Add regression test suite Kees Cook
6 siblings, 1 reply; 32+ messages in thread
From: Kees Cook @ 2025-09-05 0:24 UTC (permalink / raw)
To: Qing Zhao
Cc: Kees Cook, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
Implement RISC-V-specific KCFI backend.
- Function preamble generation using .word directives for type ID storage
at an offset from the function entry point (no alignment NOPs needed
due to the fixed 4-byte instruction size).
- Scratch register allocation using t1/t2 (x6/x7) following RISC-V
procedure call standard for temporary registers.
- Integration with .kcfi_traps section for debugger/runtime metadata
(like x86_64).
Assembly Code Pattern for RISC-V:
lw t1, -4(target_reg) ; Load actual type ID from preamble
lui t2, %hi(expected_type) ; Load expected type (upper 20 bits)
addiw t2, t2, %lo(expected_type) ; Add lower 12 bits (sign-extended)
beq t1, t2, .Lkcfi_call ; Branch if types match
.Lkcfi_trap: ebreak ; Environment break trap on mismatch
.Lkcfi_call: jalr/jr target_reg ; Execute validated indirect transfer
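The lui/addiw split used above can be modeled in C (helper names are illustrative). Because addiw sign-extends its 12-bit immediate, the upper 20 bits must be incremented by one whenever bit 11 of the constant is set, so that the addition cancels the borrow; this matches the hi20/lo12 computation in riscv_output_kcfi_insn later in the patch. This models only the low 32 bits of the register, which is what the beq comparison sees.

```c
#include <stdint.h>

/* Upper 20 bits for "lui", with the +1 compensation when bit 11
   of the constant is set.  */
static uint32_t
kcfi_hi20 (uint32_t id)
{
  return ((id >> 12) + ((id >> 11) & 1)) & 0xfffff;
}

/* Low 12 bits for "addiw", sign-extended as the hardware does.  */
static int32_t
kcfi_lo12 (uint32_t id)
{
  return ((int32_t) (id << 20)) >> 20;
}

/* lui t2, hi20 places hi20 << 12 in t2; addiw then adds the signed
   lo12, reconstructing the original 32-bit type ID.  */
static uint32_t
kcfi_materialize (uint32_t id)
{
  return (kcfi_hi20 (id) << 12) + (uint32_t) kcfi_lo12 (id);
}
```

For example, for an ID with bit 11 set the lo12 part is negative and the hi20 part is one larger than the plain upper bits, and the sum still round-trips to the original constant.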
Build and run tested with Linux kernel ARCH=riscv.
gcc/ChangeLog:
* config/riscv/riscv-protos.h: Declare KCFI helpers.
* config/riscv/riscv.cc (riscv_maybe_wrap_call_with_kcfi): New
function, to wrap calls.
(riscv_maybe_wrap_call_value_with_kcfi): New function, to
wrap calls with return values.
(riscv_output_kcfi_insn): New function to emit KCFI assembly.
* config/riscv/riscv.md: Add KCFI RTL patterns and hook expansion.
* doc/invoke.texi: Document RISC-V nuances.
Signed-off-by: Kees Cook <kees@kernel.org>
---
gcc/config/riscv/riscv-protos.h | 3 +
gcc/config/riscv/riscv.cc | 147 ++++++++++++++++++++++++++++++++
gcc/config/riscv/riscv.md | 74 ++++++++++++++--
gcc/doc/invoke.texi | 13 +++
4 files changed, 231 insertions(+), 6 deletions(-)
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 2d60a0ad44b3..0e916fbdde13 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -126,6 +126,9 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
extern void riscv_split_doubleword_move (rtx, rtx);
extern const char *riscv_output_move (rtx, rtx);
extern const char *riscv_output_return ();
+extern rtx riscv_maybe_wrap_call_with_kcfi (rtx, rtx);
+extern rtx riscv_maybe_wrap_call_value_with_kcfi (rtx, rtx);
+extern const char *riscv_output_kcfi_insn (rtx_insn *, rtx *);
extern void riscv_declare_function_name (FILE *, const char *, tree);
extern void riscv_declare_function_size (FILE *, const char *, tree);
extern void riscv_asm_output_alias (FILE *, const tree, const tree);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 41ee81b93acf..8dc54ffb19fe 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -81,6 +81,7 @@ along with GCC; see the file COPYING3. If not see
#include "cgraph.h"
#include "langhooks.h"
#include "gimplify.h"
+#include "kcfi.h"
/* This file should be included last. */
#include "target-def.h"
@@ -11346,6 +11347,149 @@ riscv_convert_vector_chunks (struct gcc_options *opts)
return 1;
}
+/* Apply KCFI wrapping to call pattern if needed. */
+rtx
+riscv_maybe_wrap_call_with_kcfi (rtx pat, rtx addr)
+{
+ /* Only indirect calls need KCFI instrumentation. */
+ bool is_direct_call = SYMBOL_REF_P (addr);
+ if (!is_direct_call)
+ {
+ rtx kcfi_type_rtx = kcfi_get_call_type_id ();
+ if (kcfi_type_rtx)
+ {
+ /* Extract the CALL from the PARALLEL and wrap it with KCFI */
+ rtx call_rtx = XVECEXP (pat, 0, 0);
+ rtx kcfi_call = gen_rtx_KCFI (VOIDmode, call_rtx, kcfi_type_rtx);
+
+ /* Replace the CALL in the PARALLEL with the KCFI-wrapped call */
+ XVECEXP (pat, 0, 0) = kcfi_call;
+ }
+ }
+ return pat;
+}
+
+/* Apply KCFI wrapping to call_value pattern if needed. */
+rtx
+riscv_maybe_wrap_call_value_with_kcfi (rtx pat, rtx addr)
+{
+ /* Only indirect calls need KCFI instrumentation. */
+ bool is_direct_call = SYMBOL_REF_P (addr);
+ if (!is_direct_call)
+ {
+ rtx kcfi_type_rtx = kcfi_get_call_type_id ();
+ if (kcfi_type_rtx)
+ {
+ /* Extract the SET from the PARALLEL and wrap its CALL with KCFI */
+ rtx set_rtx = XVECEXP (pat, 0, 0);
+ rtx call_rtx = SET_SRC (set_rtx);
+ rtx kcfi_call = gen_rtx_KCFI (VOIDmode, call_rtx, kcfi_type_rtx);
+
+ /* Replace the CALL in the SET with the KCFI-wrapped call */
+ SET_SRC (set_rtx) = kcfi_call;
+ }
+ }
+ return pat;
+}
+
+/* Output the assembly for a KCFI checked call instruction. */
+const char *
+riscv_output_kcfi_insn (rtx_insn *insn, rtx *operands)
+{
+ /* Target register. */
+ rtx target_reg = operands[0];
+ gcc_assert (REG_P (target_reg));
+
+ /* Get KCFI type ID. */
+ uint32_t expected_type = (uint32_t) INTVAL (operands[3]);
+
+ /* Calculate typeid offset from call target. */
+ HOST_WIDE_INT offset = -(4 + kcfi_patchable_entry_prefix_nops);
+
+ /* Choose scratch registers that don't conflict with target. */
+ unsigned temp1_regnum = T1_REGNUM;
+ unsigned temp2_regnum = T2_REGNUM;
+
+ if (REGNO (target_reg) == T1_REGNUM)
+ temp1_regnum = T3_REGNUM;
+ else if (REGNO (target_reg) == T2_REGNUM)
+ temp2_regnum = T3_REGNUM;
+
+ /* Generate labels internally. */
+ rtx trap_label = gen_label_rtx ();
+ rtx call_label = gen_label_rtx ();
+
+ /* Get label numbers for custom naming. */
+ int trap_labelno = CODE_LABEL_NUMBER (trap_label);
+ int call_labelno = CODE_LABEL_NUMBER (call_label);
+
+ /* Generate custom label names. */
+ char trap_name[32];
+ char call_name[32];
+ ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", trap_labelno);
+ ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", call_labelno);
+
+ /* Split expected_type for RISC-V immediate encoding.
+ If bit 11 is set, increment upper 20 bits to compensate for sign extension. */
+ int32_t lo12 = ((int32_t)(expected_type << 20)) >> 20;
+ uint32_t hi20 = ((expected_type >> 12) + ((expected_type & 0x800) ? 1 : 0)) & 0xFFFFF;
+
+ rtx temp_operands[3];
+
+ /* Load actual type from memory at offset. */
+ temp_operands[0] = gen_rtx_REG (SImode, temp1_regnum);
+ temp_operands[1] = gen_rtx_MEM (SImode,
+ gen_rtx_PLUS (DImode, target_reg,
+ GEN_INT (offset)));
+ output_asm_insn ("lw\t%0, %1", temp_operands);
+
+ /* Load expected type using lui + addiw for proper sign extension. */
+ temp_operands[0] = gen_rtx_REG (SImode, temp2_regnum);
+ temp_operands[1] = GEN_INT (hi20);
+ output_asm_insn ("lui\t%0, %1", temp_operands);
+
+ temp_operands[0] = gen_rtx_REG (SImode, temp2_regnum);
+ temp_operands[1] = gen_rtx_REG (SImode, temp2_regnum);
+ temp_operands[2] = GEN_INT (lo12);
+ output_asm_insn ("addiw\t%0, %1, %2", temp_operands);
+
+ /* Output conditional branch to call label. */
+ fprintf (asm_out_file, "\tbeq\t%s, %s, ", reg_names[temp1_regnum], reg_names[temp2_regnum]);
+ assemble_name (asm_out_file, call_name);
+ fputc ('\n', asm_out_file);
+
+ /* Output trap label and ebreak instruction. */
+ ASM_OUTPUT_LABEL (asm_out_file, trap_name);
+ output_asm_insn ("ebreak", operands);
+
+ /* Use common helper for trap section entry. */
+ rtx trap_label_sym = gen_rtx_SYMBOL_REF (Pmode, trap_name);
+ kcfi_emit_traps_section (asm_out_file, trap_label_sym);
+
+ /* Output pass/call label. */
+ ASM_OUTPUT_LABEL (asm_out_file, call_name);
+
+ /* Execute the indirect call. */
+ if (SIBLING_CALL_P (insn))
+ {
+ /* Tail call uses x0 (zero register) to avoid saving return address. */
+ temp_operands[0] = gen_rtx_REG (DImode, 0); /* x0 */
+ temp_operands[1] = target_reg; /* target register */
+ temp_operands[2] = const0_rtx;
+ output_asm_insn ("jalr\t%0, %1, %2", temp_operands);
+ }
+ else
+ {
+ /* Regular call uses x1 (return address register). */
+ temp_operands[0] = gen_rtx_REG (DImode, RETURN_ADDR_REGNUM); /* x1 */
+ temp_operands[1] = target_reg; /* target register */
+ temp_operands[2] = const0_rtx;
+ output_asm_insn ("jalr\t%0, %1, %2", temp_operands);
+ }
+
+ return "";
+}
+
/* 'Unpack' up the internal tuning structs and update the options
in OPTS. The caller must have set up selected_tune and selected_arch
as all the other target-specific codegen decisions are
@@ -15898,6 +16042,9 @@ riscv_prefetch_offset_address_p (rtx x, machine_mode mode)
#define TARGET_GET_FUNCTION_VERSIONS_DISPATCHER \
riscv_get_function_versions_dispatcher
+#undef TARGET_KCFI_SUPPORTED
+#define TARGET_KCFI_SUPPORTED hook_bool_void_true
+
#undef TARGET_DOCUMENTATION_NAME
#define TARGET_DOCUMENTATION_NAME "RISC-V"
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 4718a75598a6..9a9524a5e46f 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3982,10 +3982,25 @@
""
{
rtx target = riscv_legitimize_call_address (XEXP (operands[0], 0));
- emit_call_insn (gen_sibcall_internal (target, operands[1], operands[2]));
+ rtx pat = gen_sibcall_internal (target, operands[1], operands[2]);
+ pat = riscv_maybe_wrap_call_with_kcfi (pat, target);
+ emit_call_insn (pat);
DONE;
})
+;; KCFI sibling call - matches KCFI wrapper RTL
+(define_insn "*kcfi_sibcall_insn"
+ [(kcfi (call (mem:SI (match_operand:DI 0 "call_insn_operand" "l"))
+ (match_operand 1 ""))
+ (match_operand 3 "const_int_operand"))
+ (use (unspec:SI [(match_operand 2 "const_int_operand")] UNSPEC_CALLEE_CC))]
+ "SIBLING_CALL_P (insn)"
+{
+ return riscv_output_kcfi_insn (insn, operands);
+}
+ [(set_attr "type" "call")
+ (set_attr "length" "24")])
+
(define_insn "sibcall_internal"
[(call (mem:SI (match_operand 0 "call_insn_operand" "j,S,U"))
(match_operand 1 "" ""))
@@ -4009,11 +4024,26 @@
""
{
rtx target = riscv_legitimize_call_address (XEXP (operands[1], 0));
- emit_call_insn (gen_sibcall_value_internal (operands[0], target, operands[2],
- operands[3]));
+ rtx pat = gen_sibcall_value_internal (operands[0], target, operands[2], operands[3]);
+ pat = riscv_maybe_wrap_call_value_with_kcfi (pat, target);
+ emit_call_insn (pat);
DONE;
})
+;; KCFI sibling call with return value - matches KCFI wrapper RTL
+(define_insn "*kcfi_sibcall_value_insn"
+ [(set (match_operand 0 "")
+ (kcfi (call (mem:SI (match_operand:DI 1 "call_insn_operand" "l"))
+ (match_operand 2 ""))
+ (match_operand 4 "const_int_operand")))
+ (use (unspec:SI [(match_operand 3 "const_int_operand")] UNSPEC_CALLEE_CC))]
+ "SIBLING_CALL_P (insn)"
+{
+ return riscv_output_kcfi_insn (insn, &operands[1]);
+}
+ [(set_attr "type" "call")
+ (set_attr "length" "24")])
+
(define_insn "sibcall_value_internal"
[(set (match_operand 0 "" "")
(call (mem:SI (match_operand 1 "call_insn_operand" "j,S,U"))
@@ -4037,10 +4067,26 @@
""
{
rtx target = riscv_legitimize_call_address (XEXP (operands[0], 0));
- emit_call_insn (gen_call_internal (target, operands[1], operands[2]));
+ rtx pat = gen_call_internal (target, operands[1], operands[2]);
+ pat = riscv_maybe_wrap_call_with_kcfi (pat, target);
+ emit_call_insn (pat);
DONE;
})
+;; KCFI indirect call - matches KCFI wrapper RTL
+(define_insn "*kcfi_call_internal"
+ [(kcfi (call (mem:SI (match_operand:DI 0 "call_insn_operand" "l"))
+ (match_operand 1 "" ""))
+ (match_operand 3 "const_int_operand"))
+ (use (unspec:SI [(match_operand 2 "const_int_operand")] UNSPEC_CALLEE_CC))
+ (clobber (reg:SI RETURN_ADDR_REGNUM))]
+ "!SIBLING_CALL_P (insn)"
+{
+ return riscv_output_kcfi_insn (insn, operands);
+}
+ [(set_attr "type" "call")
+ (set_attr "length" "24")])
+
(define_insn "call_internal"
[(call (mem:SI (match_operand 0 "call_insn_operand" "l,S,U"))
(match_operand 1 "" ""))
@@ -4065,11 +4111,27 @@
""
{
rtx target = riscv_legitimize_call_address (XEXP (operands[1], 0));
- emit_call_insn (gen_call_value_internal (operands[0], target, operands[2],
- operands[3]));
+ rtx pat = gen_call_value_internal (operands[0], target, operands[2], operands[3]);
+ pat = riscv_maybe_wrap_call_value_with_kcfi (pat, target);
+ emit_call_insn (pat);
DONE;
})
+;; KCFI call with return value - matches KCFI wrapper RTL
+(define_insn "*kcfi_call_value_insn"
+ [(set (match_operand 0 "" "")
+ (kcfi (call (mem:SI (match_operand:DI 1 "call_insn_operand" "l"))
+ (match_operand 2 "" ""))
+ (match_operand 4 "const_int_operand")))
+ (use (unspec:SI [(match_operand 3 "const_int_operand")] UNSPEC_CALLEE_CC))
+ (clobber (reg:SI RETURN_ADDR_REGNUM))]
+ "!SIBLING_CALL_P (insn)"
+{
+ return riscv_output_kcfi_insn (insn, &operands[1]);
+}
+ [(set_attr "type" "call")
+ (set_attr "length" "24")])
+
(define_insn "call_value_internal"
[(set (match_operand 0 "" "")
(call (mem:SI (match_operand 1 "call_insn_operand" "l,S,U"))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 25ee82c9cba7..43e86f4bc5b4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18458,6 +18458,19 @@ allowing the kernel to identify both the KCFI violation and the involved
registers for detailed diagnostics (eliminating the need for a separate
@code{.kcfi_traps} section as used on x86_64).
+On RISC-V, KCFI type identifiers are emitted as a @code{.word ID}
+directive (a 32-bit constant) before the function entry, similar to AArch64.
+RISC-V's natural 4-byte instruction alignment eliminates the need for
+additional padding NOPs. When used with @option{-fpatchable-function-entry},
+the type identifier is placed before any patchable NOPs. The runtime check
+loads the actual type using @code{lw t1, OFFSET(target_reg)}, where the
+offset accounts for any prefix NOPs, constructs the expected type using
+@code{lui} and @code{addiw} instructions into @code{t2}, and compares them
+with @code{beq}. Type mismatches trigger an @code{ebreak} instruction.
+Like x86_64, RISC-V uses a @code{.kcfi_traps} section to map trap locations
+to their corresponding function entry points for debugging (RISC-V lacks
+ESR-style trap encoding unlike AArch64).
+
KCFI is intended primarily for kernel code and may not be suitable
for user-space applications that rely on techniques incompatible
with strict type checking of indirect calls.
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v2 6/7] riscv: Add RISC-V Kernel Control Flow Integrity implementation
2025-09-05 0:24 ` [PATCH v2 6/7] riscv: Add RISC-V " Kees Cook
@ 2025-09-16 3:40 ` Jeff Law
2025-09-16 6:04 ` Kees Cook
0 siblings, 1 reply; 32+ messages in thread
From: Jeff Law @ 2025-09-16 3:40 UTC (permalink / raw)
To: Kees Cook, Qing Zhao
Cc: Andrew Pinski, Richard Biener, Joseph Myers, Jan Hubicka,
Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
Jim Wilson, Peter Zijlstra, Dan Li, Sami Tolvanen,
Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
gcc-patches, linux-hardening
On 9/4/25 18:24, Kees Cook wrote:
> Implement RISC-V-specific KCFI backend.
>
> - Function preamble generation using .word directives for type ID storage
> at offset from function entry point (no alignment NOPs needed due to
> fix 4-byte instruction size).
>
> - Scratch register allocation using t1/t2 (x6/x7) following RISC-V
> procedure call standard for temporary registers.
>
> - Integration with .kcfi_traps section for debugger/runtime metadata
> (like x86_64).
>
> Assembly Code Pattern for RISC-V:
> lw t1, -4(target_reg) ; Load actual type ID from preamble
> lui t2, %hi(expected_type) ; Load expected type (upper 20 bits)
> addiw t2, t2, %lo(expected_type) ; Add lower 12 bits (sign-extended)
> beq t1, t2, .Lkcfi_call ; Branch if types match
> .Lkcfi_trap: ebreak ; Environment break trap on mismatch
> .Lkcfi_call: jalr/jr target_reg ; Execute validated indirect transfer
>
> Build and run tested with Linux kernel ARCH=riscv.
>
> gcc/ChangeLog:
>
> config/riscv/riscv-protos.h: Declare KCFI helpers.
> config/riscv/riscv.cc (riscv_maybe_wrap_call_with_kcfi): New
> function, to wrap calls.
> (riscv_maybe_wrap_call_value_with_kcfi): New function, to
> wrap calls with return values.
> (riscv_output_kcfi_insn): New function to emit KCFI assembly.
> config/riscv/riscv.md: Add KCFI RTL patterns and hook expansion.
> doc/invoke.texi: Document riscv nuances.
>
> Signed-off-by: Kees Cook <kees@kernel.org>
> ---
> gcc/config/riscv/riscv-protos.h | 3 +
> gcc/config/riscv/riscv.cc | 147 ++++++++++++++++++++++++++++++++
> gcc/config/riscv/riscv.md | 74 ++++++++++++++--
> gcc/doc/invoke.texi | 13 +++
> 4 files changed, 231 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 2d60a0ad44b3..0e916fbdde13 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -126,6 +126,9 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
> extern void riscv_split_doubleword_move (rtx, rtx);
> extern const char *riscv_output_move (rtx, rtx);
> extern const char *riscv_output_return ();
> +extern rtx riscv_maybe_wrap_call_with_kcfi (rtx, rtx);
> +extern rtx riscv_maybe_wrap_call_value_with_kcfi (rtx, rtx);
> +extern const char *riscv_output_kcfi_insn (rtx_insn *, rtx *);
> extern void riscv_declare_function_name (FILE *, const char *, tree);
> extern void riscv_declare_function_size (FILE *, const char *, tree);
> extern void riscv_asm_output_alias (FILE *, const tree, const tree);
> @@ -11346,6 +11347,149 @@ riscv_convert_vector_chunks (struct gcc_options *opts)
> return 1;
> }
>
> +/* Apply KCFI wrapping to call pattern if needed. */
> +rtx
> +riscv_maybe_wrap_call_with_kcfi (rtx pat, rtx addr)
So our coding standards require a bit more for that function comment.
What are PAT and ADDR and how are they used?
> +}
> +
> +/* Apply KCFI wrapping to call_value pattern if needed. */
> +rtx
> +riscv_maybe_wrap_call_value_with_kcfi (rtx pat, rtx addr)
Similarly here.
> +
> +/* Output the assembly for a KCFI checked call instruction. */
> +const char *
> +riscv_output_kcfi_insn (rtx_insn *insn, rtx *operands)
And here.
> +{
> + /* Target register. */
> + rtx target_reg = operands[0];
> + gcc_assert (REG_P (target_reg));
> +
> + /* Get KCFI type ID. */
> + uint32_t expected_type = (uint32_t) INTVAL (operands[3]);
Do we know operands[3] is a CONST_INT?
> +
> + /* Calculate typeid offset from call target. */
> + HOST_WIDE_INT offset = -(4 + kcfi_patchable_entry_prefix_nops);
> +
> + /* Choose scratch registers that don't conflict with target. */
> + unsigned temp1_regnum = T1_REGNUM;
> + unsigned temp2_regnum = T2_REGNUM;
ISTM that this will need some kind of adjustment if someone were to
compile with -ffixed-reg. Maybe all we really need is a sorry() so that
if someone were to try to fix the temporary registers they'd get a loud
complaint from the compiler.
> +
> + /* Load actual type from memory at offset. */
> + temp_operands[0] = gen_rtx_REG (SImode, temp1_regnum);
> + temp_operands[1] = gen_rtx_MEM (SImode,
> + gen_rtx_PLUS (DImode, target_reg,
> + GEN_INT (offset)));
> + output_asm_insn ("lw\t%0, %1", temp_operands);
Rather than using DImode for the PLUS, shouldn't it instead use Pmode so
that it at least tries to work on rv32? Or is this stuff somehow
defined as only working for rv64?
> +
> + /* Execute the indirect call. */
> + if (SIBLING_CALL_P (insn))
> + {
> + /* Tail call uses x0 (zero register) to avoid saving return address. */
> + temp_operands[0] = gen_rtx_REG (DImode, 0); /* x0 */
> + temp_operands[1] = target_reg; /* target register */
> + temp_operands[2] = const0_rtx;
> + output_asm_insn ("jalr\t%0, %1, %2", temp_operands);
> + }
> + else
> + {
> + /* Regular call uses x1 (return address register). */
> + temp_operands[0] = gen_rtx_REG (DImode, RETURN_ADDR_REGNUM); /* x1 */
> + temp_operands[1] = target_reg; /* target register */
> + temp_operands[2] = const0_rtx;
> + output_asm_insn ("jalr\t%0, %1, %2", temp_operands);
> + }
More cases where we probably should be using Pmode.
We generally prefer to not generate assembly code like you've done, but
instead prefer to generate actual RTL. Is there some reason why you
decided to use output_asm_insn rather than generating RTL and letting
usual mechanisms for generating assembly code kick in?
> rtx target = riscv_legitimize_call_address (XEXP (operands[0], 0));
> - emit_call_insn (gen_sibcall_internal (target, operands[1], operands[2]));
> + rtx pat = gen_sibcall_internal (target, operands[1], operands[2]);
> + pat = riscv_maybe_wrap_call_with_kcfi (pat, target);
> + emit_call_insn (pat);
> DONE;
> })
>
> +;; KCFI sibling call - matches KCFI wrapper RTL
> +(define_insn "*kcfi_sibcall_insn"
> + [(kcfi (call (mem:SI (match_operand:DI 0 "call_insn_operand" "l"))
> + (match_operand 1 ""))
> + (match_operand 3 "const_int_operand"))
> + (use (unspec:SI [(match_operand 2 "const_int_operand")] UNSPEC_CALLEE_CC))]
> + "SIBLING_CALL_P (insn)"
I think the DI for the memory operand should probably be :P instead so
that we're not so tied to rv64.
> +;; KCFI sibling call with return value - matches KCFI wrapper RTL
> +(define_insn "*kcfi_sibcall_value_insn"
> + [(set (match_operand 0 "")
> + (kcfi (call (mem:SI (match_operand:DI 1 "call_insn_operand" "l"))
> + (match_operand 2 ""))
> + (match_operand 4 "const_int_operand")))
> + (use (unspec:SI [(match_operand 3 "const_int_operand")] UNSPEC_CALLEE_CC))]
> + "SIBLING_CALL_P (insn)"
> +{
> + return riscv_output_kcfi_insn (insn, &operands[1]);
> +}
> + [(set_attr "type" "call")
> + (set_attr "length" "24")])
Similarly for this pattern.
>
> +;; KCFI indirect call - matches KCFI wrapper RTL
> +(define_insn "*kcfi_call_internal"
> + [(kcfi (call (mem:SI (match_operand:DI 0 "call_insn_operand" "l"))
> + (match_operand 1 "" ""))
> + (match_operand 3 "const_int_operand"))
> + (use (unspec:SI [(match_operand 2 "const_int_operand")] UNSPEC_CALLEE_CC))
> + (clobber (reg:SI RETURN_ADDR_REGNUM))]
> + "!SIBLING_CALL_P (insn)"
> +{
> + return riscv_output_kcfi_insn (insn, operands);
> +}
> + [(set_attr "type" "call")
> + (set_attr "length" "24")])
And this one.
>
> +;; KCFI call with return value - matches KCFI wrapper RTL
> +(define_insn "*kcfi_call_value_insn"
> + [(set (match_operand 0 "" "")
> + (kcfi (call (mem:SI (match_operand:DI 1 "call_insn_operand" "l"))
> + (match_operand 2 "" ""))
> + (match_operand 4 "const_int_operand")))
> + (use (unspec:SI [(match_operand 3 "const_int_operand")] UNSPEC_CALLEE_CC))
> + (clobber (reg:SI RETURN_ADDR_REGNUM))]
> + "!SIBLING_CALL_P (insn)"
> +{
> + return riscv_output_kcfi_insn (insn, &operands[1]);
> +}
> + [(set_attr "type" "call")
> + (set_attr "length" "24")])
This one too.
>
> +On RISC-V, KCFI type identifiers are emitted as a @code{.word ID}
> +directive (a 32-bit constant) before the function entry, similar to AArch64.
> +RISC-V's natural 4-byte instruction alignment eliminates the need for
> +additional padding NOPs. When used with @option{-fpatchable-function-entry},
> +the type identifier is placed before any patchable NOPs.
Note that many designs implement the "C" extension and as a result only
have a 2 byte alignment for instructions.
Jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v2 6/7] riscv: Add RISC-V Kernel Control Flow Integrity implementation
2025-09-16 3:40 ` Jeff Law
@ 2025-09-16 6:04 ` Kees Cook
2025-10-01 0:56 ` Jeff Law
0 siblings, 1 reply; 32+ messages in thread
From: Kees Cook @ 2025-09-16 6:04 UTC (permalink / raw)
To: Jeff Law
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
On Mon, Sep 15, 2025 at 09:40:53PM -0600, Jeff Law wrote:
>
>
> On 9/4/25 18:24, Kees Cook wrote:
> > Implement RISC-V-specific KCFI backend.
> >
> > - Function preamble generation using .word directives for type ID storage
> > at offset from function entry point (no alignment NOPs needed due to
> > fix 4-byte instruction size).
> >
> > - Scratch register allocation using t1/t2 (x6/x7) following RISC-V
> > procedure call standard for temporary registers.
> >
> > - Integration with .kcfi_traps section for debugger/runtime metadata
> > (like x86_64).
> >
> > Assembly Code Pattern for RISC-V:
> > lw t1, -4(target_reg) ; Load actual type ID from preamble
> > lui t2, %hi(expected_type) ; Load expected type (upper 20 bits)
> > addiw t2, t2, %lo(expected_type) ; Add lower 12 bits (sign-extended)
> > beq t1, t2, .Lkcfi_call ; Branch if types match
> > .Lkcfi_trap: ebreak ; Environment break trap on mismatch
> > .Lkcfi_call: jalr/jr target_reg ; Execute validated indirect transfer
> >
> > Build and run tested with Linux kernel ARCH=riscv.
> >
> > gcc/ChangeLog:
> >
> > config/riscv/riscv-protos.h: Declare KCFI helpers.
> > config/riscv/riscv.cc (riscv_maybe_wrap_call_with_kcfi): New
> > function, to wrap calls.
> > (riscv_maybe_wrap_call_value_with_kcfi): New function, to
> > wrap calls with return values.
> > (riscv_output_kcfi_insn): New function to emit KCFI assembly.
> > config/riscv/riscv.md: Add KCFI RTL patterns and hook expansion.
> > doc/invoke.texi: Document riscv nuances.
> >
> > Signed-off-by: Kees Cook <kees@kernel.org>
> > ---
> > gcc/config/riscv/riscv-protos.h | 3 +
> > gcc/config/riscv/riscv.cc | 147 ++++++++++++++++++++++++++++++++
> > gcc/config/riscv/riscv.md | 74 ++++++++++++++--
> > gcc/doc/invoke.texi | 13 +++
> > 4 files changed, 231 insertions(+), 6 deletions(-)
> >
> > diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> > index 2d60a0ad44b3..0e916fbdde13 100644
> > --- a/gcc/config/riscv/riscv-protos.h
> > +++ b/gcc/config/riscv/riscv-protos.h
> > @@ -126,6 +126,9 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
> > extern void riscv_split_doubleword_move (rtx, rtx);
> > extern const char *riscv_output_move (rtx, rtx);
> > extern const char *riscv_output_return ();
> > +extern rtx riscv_maybe_wrap_call_with_kcfi (rtx, rtx);
> > +extern rtx riscv_maybe_wrap_call_value_with_kcfi (rtx, rtx);
> > +extern const char *riscv_output_kcfi_insn (rtx_insn *, rtx *);
> > extern void riscv_declare_function_name (FILE *, const char *, tree);
> > extern void riscv_declare_function_size (FILE *, const char *, tree);
> > extern void riscv_asm_output_alias (FILE *, const tree, const tree);
>
> > @@ -11346,6 +11347,149 @@ riscv_convert_vector_chunks (struct gcc_options *opts)
> > return 1;
> > }
> > +/* Apply KCFI wrapping to call pattern if needed. */
> > +rtx
> > +riscv_maybe_wrap_call_with_kcfi (rtx pat, rtx addr)
> So our coding standards require a bit more for that function comment. What
> are PAT and ADDR and how are they used?
Sure, I can document those more fully in the next version.
> > +riscv_output_kcfi_insn (rtx_insn *insn, rtx *operands)
> > +{
> > + /* Target register. */
> > + rtx target_reg = operands[0];
> > + gcc_assert (REG_P (target_reg));
> > +
> > + /* Get KCFI type ID. */
> > + uint32_t expected_type = (uint32_t) INTVAL (operands[3]);
> Do we know operands[3] is a CONST_INT?
Yes, these all come from their respective RTL's:
(match_operand 3 "const_int_operand"))
(Except for sibcalls, where the RTL operand is offset by 1; that is
adjusted for in the call to riscv_output_kcfi_insn.)
>
> > +
> > + /* Calculate typeid offset from call target. */
> > + HOST_WIDE_INT offset = -(4 + kcfi_patchable_entry_prefix_nops);
> > +
> > + /* Choose scratch registers that don't conflict with target. */
> > + unsigned temp1_regnum = T1_REGNUM;
> > + unsigned temp2_regnum = T2_REGNUM;
> ISTM that this will need some kind of adjustment if someone were to compile
> with -ffixed-reg. Maybe all we really need is a sorry() so that if someone
> were to try to fix the temporary registers they'd get a loud complaint from
> the compiler.
Yeah, this needs more careful management of the scratch registers. I
have not been able to find a sane way to provide working constraints to
the RTL patterns, but I'd _much_ rather let the register allocator do
all this work.
> > +
> > + /* Load actual type from memory at offset. */
> > + temp_operands[0] = gen_rtx_REG (SImode, temp1_regnum);
> > + temp_operands[1] = gen_rtx_MEM (SImode,
> > + gen_rtx_PLUS (DImode, target_reg,
> > + GEN_INT (offset)));
> > + output_asm_insn ("lw\t%0, %1", temp_operands);
> Rather than using DImode for the PLUS, shouldn't it instead use Pmode so
> that it at least tries to work on rv32? Or is this stuff somehow defined as
> only working for rv64?
It was designed entirely for rv64. I'm not against making it work with
rv32, but I just haven't tried or tested it there.
> > +
> > + /* Execute the indirect call. */
> > + if (SIBLING_CALL_P (insn))
> > + {
> > + /* Tail call uses x0 (zero register) to avoid saving return address. */
> > + temp_operands[0] = gen_rtx_REG (DImode, 0); /* x0 */
> > + temp_operands[1] = target_reg; /* target register */
> > + temp_operands[2] = const0_rtx;
> > + output_asm_insn ("jalr\t%0, %1, %2", temp_operands);
> > + }
> > + else
> > + {
> > + /* Regular call uses x1 (return address register). */
> > + temp_operands[0] = gen_rtx_REG (DImode, RETURN_ADDR_REGNUM); /* x1 */
> > + temp_operands[1] = target_reg; /* target register */
> > + temp_operands[2] = const0_rtx;
> > + output_asm_insn ("jalr\t%0, %1, %2", temp_operands);
> > + }
> More cases where we probably should be using Pmode.
>
> We generally prefer to not generate assembly code like you've done, but
> instead prefer to generate actual RTL. Is there some reason why you decided
> to use output_asm_insn rather than generating RTL and letting usual
> mechanisms for generating assembly code kick in?
Yeah, I covered this a bit in patch #2 in the series, which describes the
design requirements. The main issue is that the typeid validation check
cannot be separated from the call, and the instruction pattern needs to
have very close control over the register usage so we don't introduce
any new indirect call gadgets (pop %target, call %target).
So, this is a replacement of the regular CALL rtl pattern. I am totally
open to any other way to do this. I have been bumbling around in here
(and on the other architectures) trying to find ways to make it all
work, but it still feels like a bit of a hack. :)
> > +On RISC-V, KCFI type identifiers are emitted as a @code{.word ID}
> > +directive (a 32-bit constant) before the function entry, similar to AArch64.
> > +RISC-V's natural 4-byte instruction alignment eliminates the need for
> > +additional padding NOPs. When used with @option{-fpatchable-function-entry},
> > +the type identifier is placed before any patchable NOPs.
> Note that many designs implement the "C" extension and as a result only have
> a 2 byte alignment for instructions.
Okay, noted. Are there any restrictions on function pointer alignment?
Regardless, I should probably rewrite this language a bit to try to
better say "we don't care about alignment padding since the preamble
typeid contents are a multiple of instruction size" (which would still
be true for 2 byte alignemnt).
Thanks for looking this over!
-Kees
--
Kees Cook
* Re: [PATCH v2 6/7] riscv: Add RISC-V Kernel Control Flow Integrity implementation
2025-09-16 6:04 ` Kees Cook
@ 2025-10-01 0:56 ` Jeff Law
0 siblings, 0 replies; 32+ messages in thread
From: Jeff Law @ 2025-10-01 0:56 UTC (permalink / raw)
To: Kees Cook
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
On 9/16/25 12:04 AM, Kees Cook wrote:
>>> +/* Apply KCFI wrapping to call pattern if needed. */
>>> +rtx
>>> +riscv_maybe_wrap_call_with_kcfi (rtx pat, rtx addr)
>> So our coding standards require a bit more for that function comment. What
>> are PAT and ADDR and how are they used?
>
> Sure, I can document those more fully in the next version.
Thanks.
>
>>> +riscv_output_kcfi_insn (rtx_insn *insn, rtx *operands)
>>> +{
>>> + /* Target register. */
>>> + rtx target_reg = operands[0];
>>> + gcc_assert (REG_P (target_reg));
>>> +
>>> + /* Get KCFI type ID. */
>>> + uint32_t expected_type = (uint32_t) INTVAL (operands[3]);
>> Do we know operands[3] is a CONST_INT?
>
> Yes, these all come from their respective RTL's:
>
> (match_operand 3 "const_int_operand"))
Perfect. Just wanted to make sure. It's a fairly common goof to try to
extract an integer value from a non-integer node.
>> ISTM that this will need some kind of adjustment if someone were to compile
>> with -ffixed-reg. Maybe all we really need is a sorry() so that if someone
>> were to try to fix the temporary registers they'd get a loud complaint from
>> the compiler.
>
> Yeah, this needs more careful management of the scratch registers. I
> have not been able to find a sane way to provide working constraints to
> the RTL patterns, but I'd _much_ rather let the register allocator do
> all this work.
Usually you end up having to define a register class with the single
register you want. Of course once you do that you also need to start
defining union classes and you also have to audit all kinds of code to
make sure it's doing something sensible. Yea, it's painful.
>
>>> +
>>> + /* Load actual type from memory at offset. */
>>> + temp_operands[0] = gen_rtx_REG (SImode, temp1_regnum);
>>> + temp_operands[1] = gen_rtx_MEM (SImode,
>>> + gen_rtx_PLUS (DImode, target_reg,
>>> + GEN_INT (offset)));
>>> + output_asm_insn ("lw\t%0, %1", temp_operands);
>> Rather than using DImode for the PLUS, shouldn't it instead use Pmode so
>> that it at least tries to work on rv32? Or is this stuff somehow defined as
>> only working for rv64?
>
> It was designed entirely for rv64. I'm not against making it work with
> rv32, but I just haven't tried or tested it there.
ACK. This may never end up being used on rv32. But we should at least
fix the obvious stuff since it's just the right thing to do.
Conceptually any pointer should be using Pmode. If you keep that in
mind, that covers one big blob of issues. It also means that if someone
were to try to light up 32-bit pointers on rv64, your code is ready
for that (and yes, we've had those kinds of requests, though to date
none of that code has been ready to integrate).
>>
>> We generally prefer to not generate assembly code like you've done, but
>> instead prefer to generate actual RTL. Is there some reason why you decided
>> to use output_asm_insn rather than generating RTL and letting usual
>> mechanisms for generating assembly code kick in?
>
> Yeah, I covered this a bit in patch #2 in the series, which describes the
> design requirements. The main issue is that the typeid validation check
> cannot be separated from the call, and the instruction pattern needs to
> have very close control over the register usage so we don't introduce
> any new indirect call gadgets (pop %target, call %target).
>
> So, this is a replacement of the regular CALL rtl pattern. I am totally
> open to any other way to do this. I have been bumbling around in here
> (and on the other architectures) trying to find ways to make it all
> work, but it still feels like a bit of a hack. :)
I didn't really see anything in patch #2 which would indicate we want to
generate blobs of assembly code. It feels like there's something
missing in both our understandings.
>
> Okay, noted. Are there any restrictions on function pointer alignment?
> Regardless, I should probably rewrite this language a bit to try to
> better say "we don't care about alignment padding since the preamble
> typeid contents are a multiple of instruction size" (which would still
> be true for 2-byte alignment).
Nope. The architecture will require them to be 2 byte aligned if "C" is
enabled or 4 byte aligned if "C" is not enabled. In both cases those
alignments correspond to the minimum instruction size.
Jeff
* [PATCH v2 7/7] kcfi: Add regression test suite
2025-09-05 0:24 [PATCH v2 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
` (5 preceding siblings ...)
2025-09-05 0:24 ` [PATCH v2 6/7] riscv: Add RISC-V " Kees Cook
@ 2025-09-05 0:24 ` Kees Cook
2025-09-05 7:06 ` Jakub Jelinek
6 siblings, 1 reply; 32+ messages in thread
From: Kees Cook @ 2025-09-05 0:24 UTC (permalink / raw)
To: Qing Zhao
Cc: Kees Cook, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
Adds a test suite for KCFI (Kernel Control Flow Integrity) ABI, covering
core functionality, optimization and code generation, addressing,
architecture-specific KCFI sequence emission, and integration with
patchable function entry.
Tests can be run via:
make check-gcc RUNTESTFLAGS='kcfi.exp'
gcc/testsuite/ChangeLog:
gcc.dg/kcfi/kcfi-adjacency.c: New test.
gcc.dg/kcfi/kcfi-basics.c: New test.
gcc.dg/kcfi/kcfi-call-sharing.c: New test.
gcc.dg/kcfi/kcfi-cold-partition.c: New test.
gcc.dg/kcfi/kcfi-complex-addressing.c: New test.
gcc.dg/kcfi/kcfi-ipa-robustness.c: New test.
gcc.dg/kcfi/kcfi-move-preservation.c: New test.
gcc.dg/kcfi/kcfi-no-sanitize-inline.c: New test.
gcc.dg/kcfi/kcfi-no-sanitize.c: New test.
gcc.dg/kcfi/kcfi-offset-validation.c: New test.
gcc.dg/kcfi/kcfi-patchable-basic.c: New test.
gcc.dg/kcfi/kcfi-patchable-entry-only.c: New test.
gcc.dg/kcfi/kcfi-patchable-large.c: New test.
gcc.dg/kcfi/kcfi-patchable-medium.c: New test.
gcc.dg/kcfi/kcfi-patchable-prefix-only.c: New test.
gcc.dg/kcfi/kcfi-pic-addressing.c: New test.
gcc.dg/kcfi/kcfi-retpoline-r11.c: New test.
gcc.dg/kcfi/kcfi-tail-calls.c: New test.
gcc.dg/kcfi/kcfi-trap-encoding.c: New test.
gcc.dg/kcfi/kcfi-trap-section.c: New test.
gcc.dg/kcfi/kcfi-type-mangling.c: New test.
gcc.dg/kcfi/kcfi.exp: New file, test infrastructure.
Signed-off-by: Kees Cook <kees@kernel.org>
---
gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c | 73 ++
gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c | 101 ++
gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c | 85 ++
.../gcc.dg/kcfi/kcfi-cold-partition.c | 137 +++
.../gcc.dg/kcfi/kcfi-complex-addressing.c | 125 ++
.../gcc.dg/kcfi/kcfi-ipa-robustness.c | 55 +
.../gcc.dg/kcfi/kcfi-move-preservation.c | 56 +
.../gcc.dg/kcfi/kcfi-no-sanitize-inline.c | 101 ++
gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c | 41 +
.../gcc.dg/kcfi/kcfi-offset-validation.c | 50 +
.../gcc.dg/kcfi/kcfi-patchable-basic.c | 71 ++
.../gcc.dg/kcfi/kcfi-patchable-entry-only.c | 64 +
.../gcc.dg/kcfi/kcfi-patchable-large.c | 52 +
.../gcc.dg/kcfi/kcfi-patchable-medium.c | 61 +
.../gcc.dg/kcfi/kcfi-patchable-prefix-only.c | 61 +
.../gcc.dg/kcfi/kcfi-pic-addressing.c | 105 ++
.../gcc.dg/kcfi/kcfi-retpoline-r11.c | 51 +
gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c | 143 +++
.../gcc.dg/kcfi/kcfi-trap-encoding.c | 56 +
gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c | 43 +
.../gcc.dg/kcfi/kcfi-type-mangling.c | 1064 +++++++++++++++++
gcc/testsuite/gcc.dg/kcfi/kcfi.exp | 36 +
22 files changed, 2631 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-type-mangling.c
create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi.exp
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
new file mode 100644
index 000000000000..3c52e01c9558
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
@@ -0,0 +1,73 @@
+/* Test KCFI check/transfer adjacency - regression test for instruction
+ insertion. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -O2" } */
+/* { dg-options "-fsanitize=kcfi -O2 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+/* This test ensures that KCFI security checks remain immediately adjacent
+ to their corresponding indirect calls/jumps, with no executable instructions
+ between the type ID check and the control flow transfer. */
+
+/* External function pointers to prevent optimization. */
+extern void (*complex_func_ptr)(int, int, int, int);
+extern int (*return_func_ptr)(int, int);
+
+/* Function with complex argument preparation that could tempt
+ the optimizer to insert instructions between KCFI check and call. */
+__attribute__((noinline)) void test_complex_args(int a, int b, int c, int d) {
+ /* Complex argument expressions that might cause instruction scheduling. */
+ complex_func_ptr(a * 2, b + c, d - a, (a << 1) | b);
+}
+
+/* Function with return value handling. */
+__attribute__((noinline)) int test_return_value(int x, int y) {
+ /* Return value handling that shouldn't interfere with adjacency. */
+ int result = return_func_ptr(x + 1, y * 2);
+ return result + 1;
+}
+
+/* Test struct field access that caused issues in try-catch.c. */
+struct call_info {
+ void (*handler)(void);
+ int status;
+ int data;
+};
+
+extern struct call_info *global_call_info;
+
+__attribute__((noinline)) void test_struct_field_call(void) {
+ /* This pattern caused adjacency issues before the fix. */
+ global_call_info->handler();
+}
+
+/* Test conditional indirect call. */
+__attribute__((noinline)) void test_conditional_call(int flag) {
+ if (flag) {
+ global_call_info->handler();
+ }
+}
+
+/* Should have KCFI instrumentation for all indirect calls. */
+
+/* x86_64: Complete KCFI check sequence should be present. */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r1[01]d\n\taddl\t[^,]+, %r1[01]d\n\tje\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tud2} { target x86_64-*-* } } } */
+
+/* AArch64: Complete KCFI check sequence should be present. */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-[0-9]+\]\n\tmov\tw17, #[0-9]+\n\tmovk\tw17, #[0-9]+, lsl #16\n\tcmp\tw16, w17\n\tb\.eq\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tbrk\t#[0-9]+\n\1:\n\tblr\tx[0-9]+} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Complete KCFI check sequence should be present with stack
+ spilling. */
+/* { dg-final { scan-assembler {push\t\{r0, r1\}\n\tldr\tr0, \[r[0-9]+, #-[0-9]+\]\n\tmovw\tr1, #[0-9]+\n\tmovt\tr1, #[0-9]+\n\tcmp\tr0, r1\n\tpop\t\{r0, r1\}\n\tbeq\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\t#[0-9]+\n\.Lkcfi_call[0-9]+:\n\tblx\tr[0-9]+} { target arm32 } } } */
+
+/* RISC-V: Complete KCFI check sequence should be present. */
+/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, \.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tebreak} { target riscv*-*-* } } } */
+
+/* Should have trap section with entries. */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
+
+/* AArch64 should NOT have trap section (uses brk immediate instead) */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit should NOT have trap section (uses udf immediate instead) */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
new file mode 100644
index 000000000000..ee156a8c5bb0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
@@ -0,0 +1,101 @@
+/* Test basic KCFI functionality - preamble generation. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi" } */
+/* { dg-options "-fsanitize=kcfi -falign-functions=16" { target x86_64-*-* } } */
+/* { dg-options "-fsanitize=kcfi -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+/* Extern function declarations - should NOT get KCFI preambles. */
+extern void external_func(void);
+extern int external_func_int(int x);
+
+void regular_function(int x) {
+ /* This should get KCFI preamble. */
+}
+
+void static_target_function(int x) {
+ /* Target function that can be called indirectly. */
+}
+
+static void static_caller(void) {
+ /* Static function that makes an indirect call
+ Should NOT get KCFI preamble (not address-taken)
+ But must generate KCFI check for the indirect call. */
+ void (*local_ptr)(int) = static_target_function;
+ local_ptr(42); /* This should generate KCFI check. */
+}
+
+/* Make external_func address-taken. */
+void (*func_ptr)(int) = regular_function;
+void (*ext_ptr)(void) = external_func;
+
+int main() {
+ func_ptr(42);
+ ext_ptr(); /* Indirect call to external_func. */
+ external_func_int(10); /* Direct call to external_func_int. */
+ static_caller(); /* Direct call to static function. */
+ return 0;
+}
+
+/* Verify KCFI preamble exists for regular_function. */
+/* { dg-final { scan-assembler {__cfi_regular_function:} } } */
+
+/* Verify KCFI preamble symbol comes before main function symbol. */
+/* { dg-final { scan-assembler {__cfi_regular_function:.*regular_function:} } } */
+
+/* Target function should have preamble (address-taken). */
+/* { dg-final { scan-assembler {__cfi_static_target_function:} } } */
+
+/* Static caller should NOT have preamble (it's only called directly,
+ not address-taken). */
+/* { dg-final { scan-assembler-not {__cfi_static_caller:} } } */
+
+/* x86_64: Verify type ID in preamble (after NOPs, before function label) */
+/* { dg-final { scan-assembler {__cfi_regular_function:\n\t+nop\n.*\n\t+movl\t+\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
+
+/* AArch64: Verify type ID word in preamble. */
+/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Verify type ID word in preamble. */
+/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target arm32 } } } */
+
+/* RISC-V: Verify type ID word in preamble */
+/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
+
+/* x86_64: Static function should generate complete KCFI check sequence. */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d\n\tje\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tud2\n.*\n\1:\n\tcall} { target x86_64-*-* } } } */
+
+/* AArch64: Static function should generate complete KCFI check sequence. */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]\n\tmov\tw17, #[0-9]+\n\tmovk\tw17, #[0-9]+, lsl #16\n\tcmp\tw16, w17\n\tb\.eq\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tbrk\t#[0-9]+\n\1:\n\tblr} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Static function should generate complete KCFI check sequence
+ with stack spilling. */
+/* { dg-final { scan-assembler {push\t\{r0, r1\}\n\tldr\tr0, \[r[0-9]+, #-4\]\n\tmovw\tr1, #[0-9]+\n\tmovt\tr1, #[0-9]+\n\tcmp\tr0, r1\n\tpop\t\{r0, r1\}\n\tbeq\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\t#[0-9]+\n\.Lkcfi_call[0-9]+:\n\tblx\tr[0-9]+} { target arm32 } } } */
+
+/* RISC-V: Static function should generate KCFI check for indirect call. */
+/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, (\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tebreak\n\t\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry[0-9]+:\n\t\.4byte\t\.Lkcfi_trap[0-9]+-\.Lkcfi_entry[0-9]+\n\t\.text\n\1:\n\tjalr} { target riscv*-*-* } } } */
+
+/* Extern functions should NOT get KCFI preambles. */
+/* { dg-final { scan-assembler-not {__cfi_external_func:} } } */
+/* { dg-final { scan-assembler-not {__cfi_external_func_int:} } } */
+
+/* Local functions should NOT get __kcfi_typeid_ symbols. */
+/* Only external declarations that are address-taken should get __kcfi_typeid_ */
+/* { dg-final { scan-assembler-not {__kcfi_typeid_regular_function} } } */
+/* { dg-final { scan-assembler-not {__kcfi_typeid_main} } } */
+
+/* External address-taken functions should get __kcfi_typeid_ symbols. */
+/* { dg-final { scan-assembler {__kcfi_typeid_external_func} } } */
+
+/* External functions that are only called directly should NOT get
+ __kcfi_typeid_ symbols. */
+/* { dg-final { scan-assembler-not {__kcfi_typeid_external_func_int} } } */
+
+/* Should have trap section for KCFI checks. */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
+
+/* AArch64 should NOT have trap section (uses brk immediate instead). */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit should NOT have trap section (uses udf immediate instead). */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
new file mode 100644
index 000000000000..800c802bf64d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
@@ -0,0 +1,85 @@
+/* Test KCFI check sharing bug - optimizer incorrectly shares KCFI checks
+ between different function types. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -O2" } */
+/* { dg-options "-fsanitize=kcfi -O2 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+/* Reproduce the pattern from Linux kernel internal_create_group where:
+ - Two different function pointer types (is_visible vs is_bin_visible).
+ - Both get loaded into the same register (%rcx).
+ - Optimizer creates shared KCFI check with wrong type ID.
+ - This causes CFI failures in production kernel. */
+
+struct kobject { int dummy; };
+struct attribute { int dummy; };
+struct bin_attribute { int dummy; };
+
+struct attribute_group {
+ const char *name;
+ // Type ID A
+ int (*is_visible)(struct kobject *, struct attribute *, int);
+ // Type ID B
+ int (*is_bin_visible)(struct kobject *, const struct bin_attribute *, int);
+ struct attribute **attrs;
+ const struct bin_attribute **bin_attrs;
+};
+
+/* Function that mimics __first_visible from kernel - gets inlined into
+ caller. */
+static int __first_visible(const struct attribute_group *grp, struct kobject *kobj)
+{
+ /* Path 1: Call is_visible function pointer. */
+ if (grp->attrs && grp->attrs[0] && grp->is_visible)
+ return grp->is_visible(kobj, grp->attrs[0], 0);
+
+ /* Path 2: Call is_bin_visible function pointer. */
+ if (grp->bin_attrs && grp->bin_attrs[0] && grp->is_bin_visible)
+ return grp->is_bin_visible(kobj, grp->bin_attrs[0], 0);
+
+ return 0;
+}
+
+/* Main function that triggers the optimization bug. */
+int test_kcfi_check_sharing(struct kobject *kobj, const struct attribute_group *grp)
+{
+ /* This should inline __first_visible and create the problematic pattern where:
+ 1. Both function pointers get loaded into same register.
+ 2. Optimizer shares KCFI check between them.
+ 3. Uses wrong type ID for one of the calls. */
+ return __first_visible(grp, kobj);
+}
+
+/* Each indirect call should have its own KCFI check with correct type ID.
+
+ Should see:
+ 1. KCFI check for is_visible call with is_visible type ID.
+ 2. KCFI check for is_bin_visible call with is_bin_visible type ID. */
+
+/* Verify we have TWO different KCFI check sequences. */
+/* Each check should have different type ID constants. */
+/* x86: { dg-final { scan-assembler-times {movl\s+\$-?[0-9]+,\s+%r10d} 2 { target i?86-*-* x86_64-*-* } } } */
+/* AArch64: { dg-final { scan-assembler-times {mov\s+w17, #[0-9]+} 2 { target aarch64*-*-* } } } */
+/* ARM 32-bit: { dg-final { scan-assembler-times {movw\s+r1, #[0-9]+} 2 { target arm32 } } } */
+/* RISC-V: { dg-final { scan-assembler-times {lui\tt2, [0-9]+} 2 { target riscv*-*-* } } } */
+
+/* Verify the checks use DIFFERENT type IDs (not shared).
+ We should NOT see the same type ID used twice - that would indicate
+ sharing bug. */
+/* x86: { dg-final { scan-assembler-not {movl\s+\$(-?[0-9]+),\s+%r10d.*movl\s+\$\1,\s+%r10d} { target i?86-*-* x86_64-*-* } } } */
+/* AArch64: { dg-final { scan-assembler-not {mov\s+w17, #([0-9]+).*mov\s+w17, #\1} { target aarch64*-*-* } } } */
+/* ARM 32-bit: { dg-final { scan-assembler-not {movw\s+r1, #([0-9]+).*movw\s+r1, #\1} { target arm32 } } } */
+/* RISC-V: { dg-final { scan-assembler-not {lui\s+t2, ([0-9]+)\s.*lui\s+t2, \1\s} { target riscv*-*-* } } } */
+
+/* Verify each call follows its own check (not shared) */
+/* Should have 2 separate trap instructions. */
+/* x86: { dg-final { scan-assembler-times {ud2} 2 { target i?86-*-* x86_64-*-* } } } */
+/* AArch64: { dg-final { scan-assembler-times {brk\s+#[0-9]+} 2 { target aarch64*-*-* } } } */
+/* ARM 32-bit: { dg-final { scan-assembler-times {udf\s+#[0-9]+} 2 { target arm32 } } } */
+/* RISC-V: { dg-final { scan-assembler-times {ebreak} 2 { target riscv*-*-* } } } */
+
+/* Verify 2 separate call sites. */
+/* x86: { dg-final { scan-assembler-times {jmp\s+\*%[a-z0-9]+} 2 { target i?86-*-* x86_64-*-* } } } */
+/* AArch64: Allow both blr (regular call) and br (tail call) */
+/* AArch64: { dg-final { scan-assembler-times {br\tx[0-9]+} 2 { target aarch64*-*-* } } } */
+/* ARM 32-bit: { dg-final { scan-assembler-times {bx\s+(?:r[0-9]+|ip)} 2 { target arm32 } } } */
+/* RISC-V: { dg-final { scan-assembler-times {jalr\t[a-z0-9]+} 2 { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
new file mode 100644
index 000000000000..1783c7bca135
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
@@ -0,0 +1,137 @@
+/* Test KCFI cold function and cold partition behavior. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -O2" } */
+/* { dg-options "-fsanitize=kcfi -O2 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+/* { dg-additional-options "-freorder-blocks-and-partition" { target freorder } } */
+
+void regular_function(void) {
+ /* Regular function should get preamble. */
+}
+
+/* Cold-attributed function should STILL get preamble (it's a regular
+ function, just marked cold). */
+__attribute__((cold))
+void cold_attributed_function(void) {
+ /* This function has cold attribute but should still get KCFI preamble. */
+}
+
+/* Hot-attributed function should get preamble. */
+__attribute__((hot))
+void hot_attributed_function(void) {
+ /* This function is explicitly hot and should get KCFI preamble. */
+}
+
+/* Global to prevent optimization from eliminating cold paths. */
+extern void abort(void);
+
+/* Additional function to test that normal functions still get preambles. */
+__attribute__((noinline))
+int another_regular_function(int x) {
+ return x + 42;
+}
+
+/* Function designed to generate cold partitions under optimization. */
+__attribute__((noinline))
+void function_with_cold_partition(int condition) {
+ /* Hot path - very likely to execute. */
+ if (__builtin_expect(condition == 42, 1)) {
+ /* Simple hot path that optimizer will keep inline. */
+ return;
+ }
+
+ /* Cold paths that actually do something to prevent elimination. */
+ if (__builtin_expect(condition < 0, 0)) {
+ /* Error path 1 - call abort to prevent elimination. */
+ abort();
+ }
+
+ if (__builtin_expect(condition > 1000000, 0)) {
+ /* Error path 2 - call abort to prevent elimination. */
+ abort();
+ }
+
+ if (__builtin_expect(condition == 999999, 0)) {
+ /* Error path 3 - more substantial cold code. */
+ volatile int sum = 0;
+ for (volatile int i = 0; i < 100; i++) {
+ sum += i * condition;
+ }
+ if (sum > 0)
+ abort();
+ }
+
+ /* More cold paths - switch with many unlikely cases. */
+ switch (condition) {
+ case 1000001: case 1000002: case 1000003: case 1000004: case 1000005:
+ case 1000006: case 1000007: case 1000008: case 1000009: case 1000010:
+ /* Each case does some work before abort. */
+ volatile int work = condition * 2;
+ if (work > 0) abort();
+ break;
+ default:
+ if (condition != 42) {
+ /* Fallback cold path - substantial work. */
+ volatile int result = 0;
+ for (volatile int j = 0; j < condition % 50; j++) {
+ result += j;
+ }
+ if (result >= 0) abort();
+ }
+ }
+}
+
+/* Test function pointers to ensure address-taken detection works. */
+void test_function_pointers(void) {
+ void (*regular_ptr)(void) = regular_function;
+ void (*cold_ptr)(void) = cold_attributed_function;
+ void (*hot_ptr)(void) = hot_attributed_function;
+
+ regular_ptr();
+ cold_ptr();
+ hot_ptr();
+}
+
+int main() {
+ regular_function();
+ cold_attributed_function();
+ hot_attributed_function();
+ function_with_cold_partition(42); /* Normal case - stay in hot path. */
+ another_regular_function(5);
+ test_function_pointers();
+ return 0;
+}
+
+/* Regular function should have preamble. */
+/* { dg-final { scan-assembler "__cfi_regular_function:" } } */
+
+/* Cold-attributed function should STILL have preamble (it's a legitimate function) */
+/* { dg-final { scan-assembler "__cfi_cold_attributed_function:" } } */
+
+/* Hot-attributed function should have preamble. */
+/* { dg-final { scan-assembler "__cfi_hot_attributed_function:" } } */
+
+/* Function that generates cold partitions should have preamble for main entry. */
+/* { dg-final { scan-assembler "__cfi_function_with_cold_partition:" } } */
+
+/* Address-taken functions should have preambles. */
+/* { dg-final { scan-assembler "__cfi_test_function_pointers:" } } */
+
+/* The function should generate a .cold partition (only on targets that support freorder). */
+/* { dg-final { scan-assembler "function_with_cold_partition\\.cold:" { target freorder } } } */
+
+/* The .cold partition should NOT get a __cfi_ preamble since it's never
+ reached via indirect calls. */
+/* { dg-final { scan-assembler-not "__cfi_function_with_cold_partition\\.cold:" { target freorder } } } */
+
+/* Additional regular function should get preamble. */
+/* { dg-final { scan-assembler "__cfi_another_regular_function:" } } */
+
+/* Test coverage summary:
+ 1. Cold-attributed function (__attribute__((cold))): SHOULD get preamble
+ 2. Cold partition (-freorder-blocks-and-partition): should NOT get preamble
+ 3. IPA split .part function (split_part=true): Logic in place, would skip if triggered
+
+ Note: IPA function splitting (creating .part functions with split_part=true) requires
+ specific optimization conditions that are difficult to trigger reliably in tests.
+ The KCFI logic correctly handles this case using the split_part flag check.
+*/
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
new file mode 100644
index 000000000000..83431dc6bd28
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
@@ -0,0 +1,125 @@
+/* Test KCFI with complex addressing modes (structure members, array
+ elements). This is a regression test for the change_address_1 RTL
+ error that occurred when target_addr was PLUS(reg, offset) instead
+ of a simple register. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -O2" } */
+/* { dg-options "-fsanitize=kcfi -O2 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+struct function_table {
+ int (*callback1)(int);
+ int (*callback2)(int, int);
+ void (*callback3)(void);
+ int data;
+};
+
+static int handler1(int x) {
+ return x * 2;
+}
+
+static int handler2(int x, int y) {
+ return x + y;
+}
+
+static void handler3(void) {
+ /* Empty handler. */
+}
+
+/* Test indirect calls through structure members - this creates
+ PLUS(reg, offset) addressing. */
+int test_struct_members(struct function_table *table) {
+ int result = 0;
+
+ /* These indirect calls will generate complex addressing modes:
+ * call *(%rdi) - callback1 at offset 0
+ * call *8(%rdi) - callback2 at offset 8
+ * call *16(%rdi) - callback3 at offset 16
+ * KCFI must handle PLUS(reg, struct_offset) + kcfi_offset. */
+
+ result += table->callback1(10);
+ result += table->callback2(5, 7);
+ table->callback3();
+
+ return result;
+}
+
+/* Test indirect calls through array elements - another source of
+ complex addressing. */
+typedef int (*func_array_t)(int);
+
+int test_array_elements(func_array_t functions[], int index) {
+ /* This creates addressing like MEM[PLUS(PLUS(reg, index*8), 0)]
+ which should be simplified to MEM[PLUS(reg, index*8)]. */
+ return functions[index](42);
+}
+
+/* Test with global structure. */
+static struct function_table global_table = {
+ .callback1 = handler1,
+ .callback2 = handler2,
+ .callback3 = handler3,
+ .data = 100
+};
+
+int test_global_struct(void) {
+ /* Access through global structure - may generate different
+ addressing patterns. */
+ return global_table.callback1(20) + global_table.callback2(3, 4);
+}
+
+/* Test nested structure access. */
+struct nested_table {
+ struct function_table inner;
+ int extra_data;
+};
+
+int test_nested_struct(struct nested_table *nested) {
+ /* Even more complex addressing: nested structure member access. */
+ return nested->inner.callback1(15);
+}
+
+int main() {
+ struct function_table local_table = {
+ .callback1 = handler1,
+ .callback2 = handler2,
+ .callback3 = handler3,
+ .data = 50
+ };
+
+ func_array_t func_array[] = { handler1, handler1, handler1 };
+
+ int result = 0;
+ result += test_struct_members(&local_table);
+ result += test_array_elements(func_array, 1);
+ result += test_global_struct();
+
+ struct nested_table nested = { .inner = local_table, .extra_data = 200 };
+ result += test_nested_struct(&nested);
+
+ return result;
+}
+
+/* Verify that all address-taken functions get KCFI preambles. */
+/* { dg-final { scan-assembler {__cfi_handler1:} } } */
+/* { dg-final { scan-assembler {__cfi_handler2:} } } */
+/* { dg-final { scan-assembler {__cfi_handler3:} } } */
+
+/* x86_64: Verify KCFI checks are generated for indirect calls through
+ complex addressing. */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {ud2} { target x86_64-*-* } } } */
+
+/* AArch64: Verify KCFI checks for complex addressing. */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler {brk} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Verify KCFI checks for complex addressing with stack spilling. */
+/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
+/* { dg-final { scan-assembler {udf} { target arm32 } } } */
+
+/* RISC-V: Verify KCFI check sequence for complex addressing. */
+/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, \.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tebreak} { target riscv*-*-* } } } */
+
+/* Should have trap section for x86 and RISC-V only. */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
new file mode 100644
index 000000000000..86787e9dad32
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
@@ -0,0 +1,55 @@
+/* Test KCFI IPA pass robustness with compiler-generated constructs. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -O2" } */
+/* { dg-options "-fsanitize=kcfi -O2 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+#include <stddef.h>
+
+/* Test various compiler-generated constructs that could confuse IPA pass. */
+
+/* static_assert - this was causing the original crash. */
+typedef struct {
+ int field1;
+ char field2;
+} test_struct_t;
+
+static_assert(offsetof(test_struct_t, field1) == 0, "layout check 1");
+static_assert(offsetof(test_struct_t, field2) == 4, "layout check 2");
+static_assert(sizeof(test_struct_t) >= 5, "size check");
+
+/* Regular functions that should get KCFI analysis. */
+void regular_function(void) {
+ /* Should get KCFI preamble. */
+}
+
+static void static_function(void) {
+ /* With -O2: correctly identified as not address-taken, no preamble. */
+}
+
+void address_taken_function(void) {
+ /* Should get KCFI preamble (address taken below). */
+}
+
+/* Function pointer to create address-taken scenario. */
+void (*func_ptr)(void) = address_taken_function;
+
+/* More static_asserts mixed with function definitions. */
+static_assert(sizeof(void*) >= 4, "pointer size check");
+
+int main(void) {
+ regular_function(); /* Direct call. */
+ static_function(); /* Direct call to static. */
+ func_ptr(); /* Indirect call. */
+
+ static_assert(sizeof(int) == 4, "int size check");
+
+ return 0;
+}
+
+/* Verify KCFI preambles are generated appropriately. */
+/* { dg-final { scan-assembler "__cfi_regular_function:" } } */
+/* { dg-final { scan-assembler "__cfi_address_taken_function:" } } */
+/* { dg-final { scan-assembler "__cfi_main:" } } */
+
+/* With -O2: static_function correctly identified as not address-taken. */
+/* { dg-final { scan-assembler-not "__cfi_static_function:" } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
new file mode 100644
index 000000000000..2d0140f9e429
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
@@ -0,0 +1,56 @@
+/* Test that KCFI preserves function pointer moves at -O2 optimization.
+ This test ensures that the combine pass doesn't incorrectly optimize away
+ the move instruction needed to transfer function pointers from argument
+ registers to the target registers used by KCFI patterns. */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fsanitize=kcfi -std=gnu11" } */
+/* { dg-options "-O2 -fsanitize=kcfi -std=gnu11 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+static int called_count = 0;
+
+/* Function taking one argument, returning void. */
+static __attribute__((noinline)) void increment_void(int *counter)
+{
+ (*counter)++;
+}
+
+/* Function taking one argument, returning int. */
+static __attribute__((noinline)) int increment_int(int *counter)
+{
+ (*counter)++;
+ return *counter;
+}
+
+/* Don't allow the compiler to inline the calls. */
+static __attribute__((noinline)) void indirect_call(void (*func)(int *))
+{
+ func(&called_count);
+}
+
+int main(void)
+{
+ /* This should work - matching prototype. */
+ indirect_call(increment_void);
+
+ /* This should trap - mismatched prototype. */
+ indirect_call((void *)increment_int);
+
+ return 0;
+}
+
+/* Verify complete KCFI check sequence with preserved move instruction. At
+ -O2, the combine pass previously optimized away the move from %rdi to %rax,
+ breaking KCFI. Verify the full sequence is preserved. */
+
+/* x86_64: Complete KCFI sequence with move preservation and indirect jump. */
+/* { dg-final { scan-assembler {(indirect_call):.*\n.*movq\s+%rdi,\s+(%rax)\n.*movl\s+\$[0-9]+,\s+%r10d\n\taddl\s+-4\(\2\),\s+%r10d\n\tje\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tud2.*\.Lkcfi_call[0-9]+:\n\tjmp\s+\*\2.*\.size\s+\1,\s+\.-\1} { target x86_64-*-* } } } */
+
+/* AArch64: Complete KCFI sequence with move preservation and indirect branch. */
+/* { dg-final { scan-assembler {(indirect_call):.*\n.*mov\s+(x[0-9]+),\s+x0\n.*ldur\s+w16,\s+\[\2,\s+#-4\]\n\tmov\s+w17,\s+#[0-9]+\n\tmovk\s+w17,\s+#[0-9]+,\s+lsl\s+#16\n\tcmp\s+w16,\s+w17\n\tb\.eq\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tbrk\s+#[0-9]+.*\.Lkcfi_call[0-9]+:\n\tbr\s+\2.*\.size\s+\1,\s+\.-\1} { target aarch64*-*-* } } } */
+
+/* ARM32: Complete KCFI sequence with move preservation and indirect branch. */
+/* { dg-final { scan-assembler {(indirect_call):.*\n.*mov\s+(r[0-9]+),\s+r0\n.*push\s+\{r0,\s+r1\}\n\tldr\s+r0,\s+\[\2,\s+#-4\]\n\tmovw\s+r1,\s+#[0-9]+\n\tmovt\s+r1,\s+#[0-9]+\n\tcmp\s+r0,\s+r1\n\tpop\s+\{r0,\s+r1\}\n\tbeq\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\s+#[0-9]+.*\.Lkcfi_call[0-9]+:\n\tbx\s+\2.*\.size\s+\1,\s+\.-\1} { target arm32 } } } */
+
+/* RISC-V: Complete KCFI sequence with move preservation and indirect jump. */
+/* { dg-final { scan-assembler {(indirect_call):.*mv\s+(a[0-9]+),a0.*lw\s+t1,\s+-4\(\2\).*lui\s+t2,\s+[0-9]+.*addiw\s+t2,\s+t2,\s+-?[0-9]+.*beq\s+t1,\s+t2,\s+\.Lkcfi_call[0-9]+.*ebreak.*jalr\s+zero,\s+\2,\s+0.*\.size\s+\1,\s+\.-\1} { target riscv64-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
new file mode 100644
index 000000000000..13e0d32c11fe
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
@@ -0,0 +1,101 @@
+/* Test that no_sanitize("kcfi") attribute is preserved during inlining. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -O2" } */
+/* { dg-options "-fsanitize=kcfi -O2 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+extern void external_side_effect(int value);
+
+/* Regular function (should get KCFI checks). */
+__attribute__((noinline))
+void normal_function(void (*callback)(int))
+{
+ /* This indirect call must generate KCFI checks. */
+ callback(300);
+ external_side_effect(300);
+}
+
+/* Regular function marked with no_sanitize("kcfi") (positive control). */
+__attribute__((noinline, no_sanitize("kcfi")))
+void sensitive_non_inline_function(void (*callback)(int))
+{
+ /* This indirect call should NOT generate KCFI checks. */
+ callback(100);
+ external_side_effect(100);
+}
+
+/* Function marked with both no_sanitize("kcfi") and always_inline. */
+__attribute__((always_inline, no_sanitize("kcfi")))
+static inline void sensitive_inline_function(void (*callback)(int))
+{
+ /* This indirect call should NOT generate KCFI checks when inlined. */
+ callback(42);
+ external_side_effect(42);
+}
+
+/* Explicit wrapper for testing sensitive_inline_function behavior. */
+__attribute__((noinline))
+void wrap_sensitive_inline(void (*callback)(int))
+{
+ sensitive_inline_function(callback);
+}
+
+/* Function marked with only always_inline (should get KCFI checks). */
+__attribute__((always_inline))
+static inline void normal_inline_function(void (*callback)(int))
+{
+ /* This indirect call must generate KCFI checks when inlined. */
+ callback(200);
+ external_side_effect(200);
+}
+
+/* Explicit wrapper for testing normal_inline_function behavior. */
+__attribute__((noinline))
+void wrap_normal_inline(void (*callback)(int))
+{
+ normal_inline_function(callback);
+}
+
+void test_callback(int value)
+{
+ external_side_effect(value);
+}
+
+static void (*volatile function_pointer)(int) = test_callback;
+
+int main(void)
+{
+ void (*fn_ptr)(int) = function_pointer;
+
+ normal_function(fn_ptr);
+ wrap_normal_inline(fn_ptr);
+ sensitive_non_inline_function(fn_ptr);
+ wrap_sensitive_inline(fn_ptr);
+
+ return 0;
+}
+
+/* Verify the correct number of KCFI checks: exactly 2. */
+/* { dg-final { scan-assembler-times {ud2} 2 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times {brk\s+#[0-9]+} 2 { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-times {udf\s+#[0-9]+} 2 { target arm32 } } } */
+/* { dg-final { scan-assembler-times {ebreak} 2 { target riscv*-*-* } } } */
+
+/* Positive controls: these should have KCFI checks. */
+/* { dg-final { scan-assembler {normal_function:.*ud2.*\.size\s+normal_function} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {wrap_normal_inline:.*ud2.*\.size\s+wrap_normal_inline} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {normal_function:.*brk\s+#[0-9]+.*\.size\s+normal_function} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler {wrap_normal_inline:.*brk\s+#[0-9]+.*\.size\s+wrap_normal_inline} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler {normal_function:.*udf\t#[0-9]+.*\.size\s+normal_function} { target arm32 } } } */
+/* { dg-final { scan-assembler {wrap_normal_inline:.*udf\t#[0-9]+.*\.size\s+wrap_normal_inline} { target arm32 } } } */
+/* { dg-final { scan-assembler {normal_function:.*ebreak.*\.size\s+normal_function} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler {wrap_normal_inline:.*ebreak.*\.size\s+wrap_normal_inline} { target riscv*-*-* } } } */
+
+/* Negative controls: these should NOT have KCFI checks. */
+/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*ud2.*\.size\s+sensitive_non_inline_function} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*ud2.*\.size\s+wrap_sensitive_inline} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*brk\s+#[0-9]+.*\.size\s+sensitive_non_inline_function} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*brk\s+#[0-9]+.*\.size\s+wrap_sensitive_inline} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-not {sensitive_non_inline_function:[^\n]*udf\t#[0-9]+[^\n]*\.size\tsensitive_non_inline_function} { target arm32 } } } */
+/* { dg-final { scan-assembler-not {wrap_sensitive_inline:[^\n]*udf\t#[0-9]+[^\n]*\.size\twrap_sensitive_inline} { target arm32 } } } */
+/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*ebreak.*\.size\s+sensitive_non_inline_function} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*ebreak.*\.size\s+wrap_sensitive_inline} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
new file mode 100644
index 000000000000..a0c1d6c23133
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
@@ -0,0 +1,41 @@
+/* Test KCFI with no_sanitize attribute. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi" } */
+/* { dg-options "-fsanitize=kcfi -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+void target_function(void) {
+ /* This should get KCFI preamble. */
+}
+
+void caller_with_checks(void) {
+ /* This function should generate KCFI checks. */
+ void (*func_ptr)(void) = target_function;
+ func_ptr();
+}
+
+__attribute__((no_sanitize("kcfi")))
+void caller_no_checks(void) {
+ /* This function should NOT generate KCFI checks due to no_sanitize. */
+ void (*func_ptr)(void) = target_function;
+ func_ptr();
+}
+
+int main() {
+ caller_with_checks(); /* This should generate checks inside. */
+ caller_no_checks(); /* This should NOT generate checks inside. */
+ return 0;
+}
+
+/* All functions should get preambles regardless of no_sanitize. */
+/* { dg-final { scan-assembler "__cfi_target_function:" } } */
+/* { dg-final { scan-assembler "__cfi_caller_with_checks:" } } */
+/* { dg-final { scan-assembler "__cfi_caller_no_checks:" } } */
+/* { dg-final { scan-assembler "__cfi_main:" } } */
+
+/* caller_with_checks() should generate KCFI check.
+ caller_no_checks() should NOT generate KCFI check (no_sanitize).
+ So a total of exactly 1 KCFI check in the entire program. */
+/* { dg-final { scan-assembler-times {addl\t-4\(%r[ad]x\), %r1[01]d} 1 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times {ldur\tw16, \[x[0-9]+, #-4\]} 1 { target aarch64-*-* } } } */
+/* { dg-final { scan-assembler-times {ldr\tr0, \[r[0-9]+, #-4\]} 1 { target arm32 } } } */
+/* { dg-final { scan-assembler-times {lw\tt1, -[0-9]+\(} 1 { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
new file mode 100644
index 000000000000..94952daa7831
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
@@ -0,0 +1,50 @@
+/* Test KCFI call-site offset validation across architectures. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi" } */
+/* { dg-options "-fsanitize=kcfi -falign-functions=16" { target x86_64-*-* } } */
+/* { dg-options "-fsanitize=kcfi -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+void target_func_a(void) { }
+void target_func_b(int x) { }
+void target_func_c(int x, int y) { }
+
+int main() {
+ void (*ptr_a)(void) = target_func_a;
+ void (*ptr_b)(int) = target_func_b;
+ void (*ptr_c)(int, int) = target_func_c;
+
+ /* Multiple indirect calls. */
+ ptr_a();
+ ptr_b(1);
+ ptr_c(1, 2);
+
+ return 0;
+}
+
+/* Should have KCFI preambles for all functions. */
+/* { dg-final { scan-assembler "__cfi_target_func_a:" } } */
+/* { dg-final { scan-assembler "__cfi_target_func_b:" } } */
+/* { dg-final { scan-assembler "__cfi_target_func_c:" } } */
+
+/* x86_64: All call sites should use -4 offset for KCFI type ID loads, even
+ with -falign-functions=16 (we're not using patchable entries here). */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
+
+/* AArch64: All call sites should use -4 offset. */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: All call sites should use -4 offset with stack spilling. */
+/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
+
+/* RISC-V: All call sites should use -4 offset. */
+/* { dg-final { scan-assembler {lw\tt1, -4\(} { target riscv*-*-* } } } */
+
+/* Should have trap section. */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
+
+/* AArch64 should NOT have trap section (uses brk immediate instead). */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit should NOT have trap section (uses udf immediate instead). */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
new file mode 100644
index 000000000000..191cc404a33a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
@@ -0,0 +1,71 @@
+/* Test KCFI with patchable function entries - basic case. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=5,2" } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=5,2 -falign-functions=16" { target x86_64-*-* } } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=5,2 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+void test_function(int x) {
+ /* Function should get both KCFI preamble and patchable entries. */
+}
+
+int main() {
+ test_function(42);
+ return 0;
+}
+
+/* Should have KCFI preamble. */
+/* { dg-final { scan-assembler "__cfi_test_function:" } } */
+
+/* Should have patchable function entry section. */
+/* { dg-final { scan-assembler "__patchable_function_entries" } } */
+
+/* x86_64: Should have exactly 2 prefix NOPs between .LPFE and .type. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
+
+/* x86_64: Should have exactly 3 entry NOPs between .cfi_startproc and
+ pushq. */
+/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
+
+/* x86_64: KCFI should have exactly 9 alignment NOPs between __cfi_ and the
+   typeid movl. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl} { target x86_64-*-* } } } */
+
+/* x86_64: Validate KCFI type ID is present. */
+/* { dg-final { scan-assembler {movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
+
+/* AArch64: Should have exactly 2 prefix NOPs between .LPFE and .type. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
+
+/* AArch64: Should have exactly 3 entry NOPs between .cfi_startproc and
+ stack manipulation. */
+/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*sub\t*sp} { target aarch64*-*-* } } } */
+
+/* AArch64: KCFI should have alignment NOPs then .word immediate. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
+
+/* AArch64: Validate clean KCFI boundary - .word then immediate end/size. */
+/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Should have exactly 2 prefix NOPs between .LPFE and .syntax. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.syntax} { target arm32 } } } */
+
+/* ARM 32-bit: Should have exactly 3 entry NOPs after function label. */
+/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
+
+/* ARM 32-bit: KCFI should have alignment NOPs then .word immediate. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target arm32 } } } */
+
+/* ARM 32-bit: Validate clean KCFI boundary - .word then immediate end/size. */
+/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target arm32 } } } */
+
+/* RISC-V: Should have exactly 2 prefix NOPs between .LPFE and .type. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
+
+/* RISC-V: Should have exactly 3 entry NOPs before .cfi_startproc followed
+ by addi sp. */
+/* { dg-final { scan-assembler {nop\n\t*nop\n\t*nop\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*addi\t*sp} { target riscv*-*-* } } } */
+
+/* RISC-V: KCFI should have alignment NOPs then .word immediate. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
+
+/* RISC-V: Validate clean KCFI boundary - .word then immediate end/size. */
+/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
new file mode 100644
index 000000000000..1d8a9fc8ba9e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
@@ -0,0 +1,64 @@
+/* Test KCFI with patchable function entries - entry NOPs only. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=4,0" } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=4,0 -falign-functions=16" { target x86_64-*-* } } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=4,0 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+void test_function(void) {
+}
+
+static void caller(void) {
+ /* Make an indirect call to test callsite offset calculation. */
+ void (*func_ptr)(void) = test_function;
+ func_ptr();
+}
+
+int main() {
+ test_function(); /* Direct call. */
+ caller(); /* Indirect call via static function. */
+ return 0;
+}
+
+/* x86_64: Should have KCFI preamble with 11 architecture-alignment NOPs. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+movl\t+\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
+
+/* AArch64: Should have KCFI preamble with no alignment NOPs. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Should have KCFI preamble with no alignment NOPs. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t\.word\t0x[0-9a-f]+} { target arm32 } } } */
+
+/* RISC-V: Should have KCFI preamble with no alignment NOPs. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
+
+/* x86_64: Indirect call should use original prefix NOPs (0) for offset
+ calculation: -4 offset. */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d\n\tje\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tud2\n.*\n\1:\n\tcall} { target x86_64-*-* } } } */
+
+/* x86_64: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
+
+/* AArch64: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*stp} { target aarch64*-*-* } } } */
+
+/* AArch64: No alignment NOPs - function type should come immediately before
+ function. */
+/* { dg-final { scan-assembler {\.type\t*test_function, %function\n*test_function:} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
+
+/* ARM 32-bit: No alignment NOPs - function type should come immediately
+ before function. */
+/* { dg-final { scan-assembler {\.type\t*test_function, %function\n*test_function:} { target arm32 } } } */
+
+/* RISC-V: All 4 NOPs are entry NOPs. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\.LFB} { target riscv*-*-* } } } */
+
+/* RISC-V: No alignment NOPs - function type should come immediately
+ before function. */
+/* { dg-final { scan-assembler {\.type\t*test_function, @function\n*test_function:} { target riscv*-*-* } } } */
+
+/* Should have patchable function entry section. */
+/* { dg-final { scan-assembler "__patchable_function_entries" } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
new file mode 100644
index 000000000000..e78eef5a8312
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
@@ -0,0 +1,52 @@
+/* Test KCFI with large patchable function entries. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=11,11" } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=11,11 -falign-functions=16" { target x86_64-*-* } } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=11,11 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+void test_function(void) {
+}
+
+int main() {
+ void (*func_ptr)(void) = test_function;
+ func_ptr();
+ return 0;
+}
+
+/* Should have KCFI preamble. */
+/* { dg-final { scan-assembler "__cfi_test_function:" } } */
+
+/* Should have patchable function entry section. */
+/* { dg-final { scan-assembler "__patchable_function_entries" } } */
+
+/* x86_64: Should have exactly 11 prefix NOPs between .LPFE and .type. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
+
+/* x86_64: Should have 0 entry NOPs - function starts immediately with
+ pushq. */
+/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*pushq\t*%rbp} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target x86_64-*-* } } } */
+
+/* x86_64: KCFI should have 0 entry NOPs - goes directly to typeid movl. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
+
+/* x86_64: Call site should use -15 offset. */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-15\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
+
+/* AArch64: Should have exactly 11 prefix NOPs between .LPFE and .type. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Should have exactly 11 prefix NOPs between .LPFE and .type. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
+
+/* AArch64: Call site should use -15 offset. */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-15\]} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Call site should use -15 offset. */
+/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-15\]} { target arm32 } } } */
+
+/* RISC-V: Should have 11 prefix NOPs between .LPFE and .type. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
+
+/* RISC-V: Call site should use -15 offset (same as x86/AArch64). */
+/* { dg-final { scan-assembler {lw\tt1, -15\(} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
new file mode 100644
index 000000000000..e594df25c1bf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
@@ -0,0 +1,61 @@
+/* Test KCFI with medium patchable function entries. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=8,4" } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=8,4 -falign-functions=16" { target x86_64-*-* } } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=8,4 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+void test_function(void) {
+}
+
+int main() {
+ void (*func_ptr)(void) = test_function;
+ func_ptr();
+ return 0;
+}
+
+/* Should have KCFI preamble. */
+/* { dg-final { scan-assembler "__cfi_test_function:" } } */
+
+/* Should have patchable function entry section. */
+/* { dg-final { scan-assembler "__patchable_function_entries" } } */
+
+/* x86_64: Should have exactly 4 prefix NOPs between .LPFE and .type. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
+
+/* x86_64: Should have exactly 4 entry NOPs between .cfi_startproc and
+ pushq. */
+/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
+
+/* x86_64: KCFI should have exactly 7 alignment NOPs between __cfi_ and
+ typeid movl. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
+
+/* x86_64: Call site should use -8 offset. */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-8\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
+
+/* AArch64: Should have exactly 4 prefix NOPs between .LPFE and .type. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
+
+/* AArch64: Should have exactly 4 entry NOPs after .cfi_startproc. */
+/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Should have exactly 4 prefix NOPs between .LPFE and .syntax. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.syntax} { target arm32 } } } */
+
+/* ARM 32-bit: Should have exactly 4 entry NOPs after function label. */
+/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
+
+/* AArch64: Call site should use -8 offset. */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-8\]} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Call site should use -8 offset. */
+/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-8\]} { target arm32 } } } */
+
+/* RISC-V: Should have exactly 4 prefix NOPs between .LPFE and .type. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
+
+/* RISC-V: Should have 4 entry NOPs. */
+/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\.LFB} { target riscv*-*-* } } } */
+
+/* RISC-V: Call site should use -8 offset (same as x86/AArch64). */
+/* { dg-final { scan-assembler {lw\tt1, -8\(} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
new file mode 100644
index 000000000000..46f61e3da042
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
@@ -0,0 +1,61 @@
+/* Test KCFI with patchable function entries - prefix NOPs only. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=3,3" } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=3,3 -falign-functions=16" { target x86_64-*-* } } */
+/* { dg-options "-fsanitize=kcfi -fpatchable-function-entry=3,3 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+void test_function(void) {
+}
+
+int main() {
+ test_function();
+ return 0;
+}
+
+/* Should have KCFI preamble. */
+/* { dg-final { scan-assembler "__cfi_test_function:" } } */
+
+/* x86_64: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target x86_64-*-* } } } */
+
+/* x86_64: No entry NOPs - function should start immediately with prologue. */
+/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*pushq\t*%rbp} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target x86_64-*-* } } } */
+
+/* x86_64: should have exactly 8 alignment NOPs. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl} { target x86_64-*-* } } } */
+
+/* AArch64: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target aarch64*-*-* } } } */
+
+/* AArch64: No entry NOPs - function should start immediately with prologue. */
+/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*nop\n\t*ret} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target aarch64*-*-* } } } */
+
+/* AArch64: KCFI type ID should have 1 alignment NOP then word. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
+
+/* ARM 32-bit: No entry NOPs - function should start immediately with
+ prologue. */
+/* { dg-final { scan-assembler {test_function:} { target arm32 } } } */
+/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target arm32 } } } */
+
+/* ARM 32-bit: KCFI type ID should have 1 alignment NOP then word. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target arm32 } } } */
+
+/* RISC-V: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs. */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target riscv*-*-* } } } */
+
+/* RISC-V: No entry NOPs - function should start immediately with
+ .cfi_startproc. */
+/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target riscv*-*-* } } } */
+
+/* RISC-V: KCFI type ID should have 1 alignment NOP then word. */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
+
+/* Should have patchable function entry section. */
+/* { dg-final { scan-assembler "__patchable_function_entries" } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
new file mode 100644
index 000000000000..f68d3d3f44db
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
@@ -0,0 +1,105 @@
+/* Test KCFI with position-independent code addressing modes.
+ This is a regression test for complex addressing like
+ PLUS(PLUS(...), symbol_ref) which can occur with PIC and caused
+ change_address_1 RTL errors. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -O2 -fpic" } */
+/* { dg-options "-fsanitize=kcfi -O2 -fpic -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+/* Global function pointer table that creates PIC addressing. */
+struct callbacks {
+ int (*handler1)(int);
+ void (*handler2)(void);
+ int (*handler3)(int, int);
+};
+
+static int simple_handler(int x) {
+ return x * 2;
+}
+
+static void void_handler(void) {
+ /* Empty handler. */
+}
+
+static int complex_handler(int a, int b) {
+ return a + b;
+}
+
+/* Global structure that will require PIC addressing. */
+struct callbacks global_callbacks = {
+ .handler1 = simple_handler,
+ .handler2 = void_handler,
+ .handler3 = complex_handler
+};
+
+/* Function that uses PIC addressing to access global callbacks. */
+int test_pic_addressing(int value) {
+ /* These indirect calls through global structure create complex
+ addressing like PLUS(PLUS(GOT_base, symbol_offset), struct_offset)
+ which previously caused RTL errors in KCFI instrumentation. */
+
+ int result = 0;
+ result += global_callbacks.handler1(value);
+
+ global_callbacks.handler2();
+
+ result += global_callbacks.handler3(value, result);
+
+ return result;
+}
+
+/* Test with function pointer arrays. */
+static int (*func_array[])(int) = {
+ simple_handler,
+ simple_handler,
+ simple_handler
+};
+
+int test_pic_array(int index, int value) {
+ /* Array access with PIC can also create complex addressing. */
+ return func_array[index % 3](value);
+}
+
+/* Test with dynamic PIC addressing. */
+struct callbacks *get_callbacks(void) {
+ return &global_callbacks;
+}
+
+int test_dynamic_pic(int value) {
+ /* Dynamic access through function call creates very complex addressing. */
+ struct callbacks *cb = get_callbacks();
+ return cb->handler1(value) + cb->handler3(value, value);
+}
+
+int main() {
+ int result = 0;
+ result += test_pic_addressing(10);
+ result += test_pic_array(1, 20);
+ result += test_dynamic_pic(5);
+ return result;
+}
+
+/* Verify that all address-taken functions get KCFI preambles. */
+/* { dg-final { scan-assembler {__cfi_simple_handler:} } } */
+/* { dg-final { scan-assembler {__cfi_void_handler:} } } */
+/* { dg-final { scan-assembler {__cfi_complex_handler:} } } */
+
+/* x86_64: Verify KCFI checks are generated. */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {ud2} { target x86_64-*-* } } } */
+
+/* AArch64: Verify KCFI checks. */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler {brk} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Verify KCFI checks with PIC addressing and stack spilling. */
+/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
+/* { dg-final { scan-assembler {udf} { target arm32 } } } */
+
+/* RISC-V: Verify KCFI checks are generated. */
+/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler {ebreak} { target riscv*-*-* } } } */
+
+/* Should have trap section. */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
new file mode 100644
index 000000000000..656a60db5a7e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
@@ -0,0 +1,51 @@
+/* Test that KCFI with the retpoline thunk-extern flag forces r11 usage. */
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options "-fsanitize=kcfi -mindirect-branch=thunk-extern -O2" } */
+/* { dg-options "-fsanitize=kcfi -mindirect-branch=thunk-extern -O2 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+extern int external_target(void);
+
+/* Test regular call (not tail call) */
+__attribute__((noinline))
+int call_test(int (*func_ptr)(void)) {
+ /* This indirect call should use r11 when both KCFI and
+ -mindirect-branch=thunk-extern are enabled. */
+ int result = func_ptr(); /* Function parameter prevents direct optimization. */
+ return result + 1; /* Prevent tail call optimization. */
+}
+
+/* Reference external_target to generate the required symbol. */
+int (*external_func_ptr)(void) = external_target;
+
+/* Test function for sibcalls (tail calls) */
+__attribute__((noinline))
+void sibcall_test(int (**func_ptr)(void)) {
+ /* This sibcall should use r11 when both KCFI and
+ -mindirect-branch=thunk-extern are enabled. */
+ (*func_ptr)(); /* Tail call - should be optimized to sibcall. */
+}
+
+/* Should have weak symbol for external function. */
+/* { dg-final { scan-assembler "__kcfi_typeid_external_target" } } */
+
+/* When both KCFI and -mindirect-branch=thunk-extern are enabled, indirect
+   calls should always use the r11 register and be converted to extern thunks. */
+/* { dg-final { scan-assembler-times {call\s+__x86_indirect_thunk_r11} 1 } } */
+
+/* Sibcalls should also use the r11 register and be converted to extern thunks. */
+/* { dg-final { scan-assembler-times {jmp\s+__x86_indirect_thunk_r11} 1 } } */
+
+/* Should have exactly 2 KCFI traps (one per function). */
+/* { dg-final { scan-assembler-times {ud2} 2 } } */
+
+/* Should NOT use other registers for indirect calls. */
+/* { dg-final { scan-assembler-not {call\s+\*%rax} } } */
+/* { dg-final { scan-assembler-not {call\s+\*%rcx} } } */
+/* { dg-final { scan-assembler-not {call\s+\*%rdx} } } */
+/* { dg-final { scan-assembler-not {call\s+\*%rdi} } } */
+
+/* Should NOT use other registers for sibcalls. */
+/* { dg-final { scan-assembler-not {jmp\s+\*%rax} } } */
+/* { dg-final { scan-assembler-not {jmp\s+\*%rcx} } } */
+/* { dg-final { scan-assembler-not {jmp\s+\*%rdx} } } */
+/* { dg-final { scan-assembler-not {jmp\s+\*%rdi} } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
new file mode 100644
index 000000000000..b044dd6fb993
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
@@ -0,0 +1,143 @@
+/* Test KCFI protection when indirect calls get converted to tail calls. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -O2" } */
+/* { dg-options "-fsanitize=kcfi -O2 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+typedef int (*func_ptr_t)(int);
+typedef void (*void_func_ptr_t)(void);
+
+struct function_table {
+ func_ptr_t process;
+ void_func_ptr_t cleanup;
+};
+
+/* Target functions. */
+int process_data(int x) { return x * 2; }
+void cleanup_data(void) {}
+
+/* Initialize function table. */
+volatile struct function_table vtable = {
+ .process = &process_data,
+ .cleanup = &cleanup_data
+};
+
+/* Indirect call through struct member that should become tail call. */
+int test_struct_indirect_call(int x) {
+  /* This is an indirect call that should be converted to a tail call:
+     Without -fno-optimize-sibling-calls: should become "jmp *vtable+0(%rip)"
+     With -fno-optimize-sibling-calls: should become "call *vtable+0(%rip)" */
+ return vtable.process(x);
+}
+
+/* Indirect call through function pointer parameter. */
+int test_param_indirect_call(func_ptr_t handler, int x) {
+  /* This is an indirect call that should be converted to a tail call:
+     Without -fno-optimize-sibling-calls: should become "jmp *%rdi"
+     With -fno-optimize-sibling-calls: should become "call *%rdi" */
+ return handler(x);
+}
+
+/* Void indirect call through struct member. */
+void test_void_indirect_call(void) {
+  /* This is an indirect call that should be converted to a tail call:
+     Without -fno-optimize-sibling-calls: should become "jmp *vtable+8(%rip)"
+     With -fno-optimize-sibling-calls: should become "call *vtable+8(%rip)" */
+ vtable.cleanup();
+}
+
+/* Non-tail call for comparison (should always be call). */
+int test_non_tail_indirect_call(func_ptr_t handler, int x) {
+ /* This should never become a tail call - always "call *%rdi" */
+ int result = handler(x);
+ return result + 1; /* Prevents tail call optimization. */
+}
+
+/* Should have KCFI preambles for all functions. */
+/* { dg-final { scan-assembler-times "__cfi_process_data:" 1 } } */
+/* { dg-final { scan-assembler-times "__cfi_cleanup_data:" 1 } } */
+/* { dg-final { scan-assembler-times "__cfi_test_struct_indirect_call:" 1 } } */
+/* { dg-final { scan-assembler-times "__cfi_test_param_indirect_call:" 1 } } */
+/* { dg-final { scan-assembler-times "__cfi_test_void_indirect_call:" 1 } } */
+/* { dg-final { scan-assembler-times "__cfi_test_non_tail_indirect_call:" 1 } } */
+
+/* Should have exactly 4 KCFI checks for indirect calls
+   (load type ID + compare). */
+/* { dg-final { scan-assembler-times {movl\t\$-?[0-9]+, %r10d} 4 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times {addl\t-4\(%r[a-z0-9]+\), %r10d} 4 { target x86_64-*-* } } } */
+
+/* Should have exactly 4 trap sections and 4 trap instructions. */
+/* { dg-final { scan-assembler-times "\\.kcfi_traps" 4 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times "ud2" 4 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times "\\.kcfi_traps" 4 { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler-times "ebreak" 4 { target riscv*-*-* } } } */
+
+/* Should NOT have unprotected direct jumps to vtable. */
+/* { dg-final { scan-assembler-not {jmp\t\*vtable\(%rip\)} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-not {jmp\t\*vtable\+8\(%rip\)} { target x86_64-*-* } } } */
+
+/* Should have exactly 3 protected tail calls (jmp through register after
+ KCFI check). */
+/* { dg-final { scan-assembler-times {jmp\t\*%[a-z0-9]+} 3 { target x86_64-*-* } } } */
+
+/* Should have exactly 1 regular call (non-tail call case). */
+/* { dg-final { scan-assembler-times {call\t\*%[a-z0-9]+} 1 { target x86_64-*-* } } } */
+
+/* RISC-V: Should have exactly 4 KCFI checks for indirect calls
+ (comparison instruction). */
+/* { dg-final { scan-assembler-times {beq\tt1, t2, \.Lkcfi_call[0-9]+} 4 { target riscv*-*-* } } } */
+
+/* RISC-V: Should have exactly 4 KCFI checks for indirect calls
+   (load type ID + compare). */
+/* { dg-final { scan-assembler-times {lw\tt1, -4\([a-z0-9]+\)} 4 { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler-times {lui\tt2, [0-9]+} 4 { target riscv*-*-* } } } */
+
+/* RISC-V: Should have exactly 3 protected tail calls (jr after
+ KCFI check - no return address save). */
+/* { dg-final { scan-assembler-times {jalr\t(x0|zero), [a-z0-9]+, 0} 3 { target riscv*-*-* } } } */
+
+/* RISC-V: Should have exactly 1 regular call (non-tail call case - saves
+ return address). */
+/* { dg-final { scan-assembler-times {jalr\t(x1|ra), [a-z0-9]+, 0} 1 { target riscv*-*-* } } } */
+
+/* Type ID loading should use lui + addiw pattern for 32-bit constants. */
+/* { dg-final { scan-assembler {lui\tt2, [0-9]+} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler {addiw\tt2, t2, -?[0-9]+} { target riscv*-*-* } } } */
+
+/* Should have exactly 4 KCFI checks for indirect calls (load type ID from
+ -4 offset + compare). */
+/* { dg-final { scan-assembler-times {ldur\tw16, \[x[0-9]+, #-4\]} 4 { target aarch64-*-* } } } */
+/* { dg-final { scan-assembler-times {cmp\tw16, w17} 4 { target aarch64-*-* } } } */
+
+/* Should have exactly 4 trap instructions. */
+/* { dg-final { scan-assembler-times {brk\t#[0-9]+} 4 { target aarch64-*-* } } } */
+
+/* Should have exactly 3 protected tail calls (br through register after
+ KCFI check). */
+/* { dg-final { scan-assembler-times {br\tx[0-9]+} 3 { target aarch64-*-* } } } */
+
+/* Should have exactly 1 regular call (non-tail call case). */
+/* { dg-final { scan-assembler-times {blr\tx[0-9]+} 1 { target aarch64-*-* } } } */
+
+/* Type ID loading should use mov + movk pattern for 32-bit constants. */
+/* { dg-final { scan-assembler {mov\tw17, #[0-9]+} { target aarch64-*-* } } } */
+/* { dg-final { scan-assembler {movk\tw17, #[0-9]+, lsl #16} { target aarch64-*-* } } } */
+
+/* Should have exactly 4 KCFI checks for indirect calls (load type ID from
+ -4 offset + compare). */
+/* { dg-final { scan-assembler-times {ldr\tr0, \[r[0-9]+, #-4\]} 4 { target arm32 } } } */
+/* { dg-final { scan-assembler-times {cmp\tr0, r1} 4 { target arm32 } } } */
+
+/* Should have exactly 4 trap instructions. */
+/* { dg-final { scan-assembler-times {udf\t#[0-9]+} 4 { target arm32 } } } */
+
+/* Should have exactly 3 protected tail calls (bx through register after
+ KCFI check). */
+/* { dg-final { scan-assembler-times {bx\tr[0-9]+} 3 { target arm32 } } } */
+
+/* Should have exactly 1 regular call (non-tail call case). */
+/* { dg-final { scan-assembler-times {blx\tr[0-9]+} 1 { target arm32 } } } */
+
+/* Type ID loading should use movw + movt pattern for 32-bit constants
+ into r1. */
+/* { dg-final { scan-assembler {movw\tr1, #[0-9]+} { target arm32 } } } */
+/* { dg-final { scan-assembler {movt\tr1, #[0-9]+} { target arm32 } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
new file mode 100644
index 000000000000..1427bb933c62
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
@@ -0,0 +1,56 @@
+/* Test AArch64 and ARM32 KCFI trap encoding in BRK/UDF instructions. */
+/* { dg-do compile { target { aarch64*-*-* || arm32 } } } */
+/* { dg-options "-fsanitize=kcfi" } */
+/* { dg-options "-fsanitize=kcfi -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+void target_function(int x, char y) {
+}
+
+int main() {
+ void (*func_ptr)(int, char) = target_function;
+
+ /* This should generate trap with immediate encoding. */
+ func_ptr(42, 'a');
+
+ return 0;
+}
+
+/* Should have KCFI preamble. */
+/* { dg-final { scan-assembler "__cfi_target_function:" } } */
+
+/* AArch64 specific: Should have BRK instruction with proper ESR encoding
+ ESR format: 0x8000 | ((type_reg & 31) << 5) | (addr_reg & 31)
+
+ Test the ESR encoding by checking for the expected value.
+ Since we know this test uses x2, we expect ESR = 0x8000 | (17<<5) | 2 = 33314
+
+ A truly dynamic test would need to extract the register from blr and compute
+ the corresponding ESR, but DejaGnu's regex limitations make this complex.
+ This test validates the specific case and documents the encoding.
+ */
+/* { dg-final { scan-assembler "blr\\s+x2" { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler "brk\\s+#33314" { target aarch64*-*-* } } } */
+
+/* Should have KCFI check with type comparison. */
+/* { dg-final { scan-assembler {ldur\t*w16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler {cmp\t*w16, w17} { target aarch64*-*-* } } } */
+
+/* ARM32 specific: Should have UDF instruction with proper encoding
+ UDF format: 0x8000 | ((type_reg & 31) << 5) | (addr_reg & 31)
+
+ Since ARM32 spills and restores r0/r1 before the trap, the type_reg
+ field uses 0x1F (31) to indicate "register was spilled" rather than
+ pointing to a live register. The addr_reg field contains the actual
+ target register number.
+
+ For this test case using r3, we expect:
+ UDF = 0x8000 | (31 << 5) | 3 = 0x8000 | 0x3E0 | 3 = 33763
+ */
+/* { dg-final { scan-assembler "blx\\s+r3" { target arm32 } } } */
+/* { dg-final { scan-assembler "udf\\s+#33763" { target arm32 } } } */
+
+/* Should have register spilling and restoration around type check. */
+/* { dg-final { scan-assembler {push\t*\{r0, r1\}} { target arm32 } } } */
+/* { dg-final { scan-assembler {pop\t*\{r0, r1\}} { target arm32 } } } */
+/* { dg-final { scan-assembler {ldr\t*r0, \[r[0-9]+, #-4\]} { target arm32 } } } */
+/* { dg-final { scan-assembler {cmp\t*r0, r1} { target arm32 } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
new file mode 100644
index 000000000000..bd42e08659f2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
@@ -0,0 +1,43 @@
+/* Test KCFI trap section generation. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi" } */
+/* { dg-options "-fsanitize=kcfi -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+void target_function(void) {}
+
+int main() {
+ void (*func_ptr)(void) = target_function;
+
+ /* Multiple indirect calls to generate multiple trap entries. */
+ func_ptr();
+ func_ptr();
+
+ return 0;
+}
+
+/* Should have KCFI preamble. */
+/* { dg-final { scan-assembler "__cfi_target_function:" } } */
+
+/* Should have exactly 2 trap labels in code. */
+/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*ud2} 2 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*brk} 2 { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*udf} 2 { target arm32 } } } */
+/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*ebreak} 2 { target riscv*-*-* } } } */
+
+/* x86_64: Should have complete .kcfi_traps section sequence with relative
+ offset and 2 entries. */
+/* { dg-final { scan-assembler {\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry([^:]+):\n\t\.long\t\.Lkcfi_trap([^\s\n]+)-\.Lkcfi_entry\1\n\t\.text} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times {\.section\t\.kcfi_traps,"ao",@progbits,\.text} 2 { target x86_64-*-* } } } */
+
+/* AArch64 should NOT have .kcfi_traps section (uses brk immediate instead). */
+/* { dg-final { scan-assembler-not {\.section\t+\.kcfi_traps} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-not {\.long.*-\.L} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit should NOT have .kcfi_traps section (uses udf immediate instead). */
+/* { dg-final { scan-assembler-not {\.section\t+\.kcfi_traps} { target arm32 } } } */
+/* { dg-final { scan-assembler-not {\.long.*-\.L} { target arm32 } } } */
+
+/* RISC-V: Should have complete .kcfi_traps section sequence with relative
+ offset and 2 entries. */
+/* { dg-final { scan-assembler {\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry([^:]+):\n\t\.4byte\t\.L([^\s\n]+)-\.Lkcfi_entry\1\n\t\.text} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler-times {\.section\t\.kcfi_traps,"ao",@progbits,\.text} 2 { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-type-mangling.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-type-mangling.c
new file mode 100644
index 000000000000..75d607fa170b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-type-mangling.c
@@ -0,0 +1,1064 @@
+/* Test KCFI type ID hashing - verify different signatures generate different
+ __kcfi_typeid_ symbols. Verifies the mangling strings via the dump file
+ output. */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kcfi -fdump-tree-kcfi0-details -fdump-ipa-ipa_kcfi-details" } */
+/* { dg-options "-fsanitize=kcfi -fdump-tree-kcfi0-details -fdump-ipa-ipa_kcfi-details -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
+
+#include <stdarg.h>
+
+/* Test __kcfi_typeid_ symbol generation for address-taken functions.
+ Verify precise type discrimination using Itanium C++ ABI mangling. */
+
+/* External function declarations - these will get __kcfi_typeid_ symbols
+ when address-taken. */
+extern void func_void(void); /* _ZTSFvvE -> 0x40e0d3c8 */
+extern void func_char(char x); /* _ZTSFvcE -> 0x64fce2f1 */
+extern void func_int(int x); /* _ZTSFviE -> 0x70e35def */
+extern void func_long(long x); /* _ZTSFvlE -> 0x24efb23e */
+
+/* Basic types - verify exact type IDs match with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_void\n\t\.set\t__kcfi_typeid_func_void, 0x40e0d3c8} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_char\n\t\.set\t__kcfi_typeid_func_char, 0x64fce2f1} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_int\n\t\.set\t__kcfi_typeid_func_int, 0x70e35def} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_long\n\t\.set\t__kcfi_typeid_func_long, 0x24efb23e} } } */
+
+/* Verify basic types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvvE' typeid=0x40e0d3c8} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvcE' typeid=0x64fce2f1} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFviE' typeid=0x70e35def} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvlE' typeid=0x24efb23e} kcfi0 } } */
+
+/* Count verification - basic types (void type used by multiple functions). */
+/* { dg-final { scan-assembler-times {0x40e0d3c8} 4 } }
+   +3 from local function preambles + memset test. */
+/* { dg-final { scan-assembler-times {0x64fce2f1} 1 } } */
+/* { dg-final { scan-assembler-times {0x70e35def} 1 } } */
+/* { dg-final { scan-assembler-times {0x24efb23e} 1 } } */
+
+/* Pointer parameter types - must all differ. */
+extern void func_int_ptr(int *x); /* _ZTSFvPiE -> 0xb2a15cf9 */
+extern void func_char_ptr(char *x); /* _ZTSFvPcE -> 0x1eaf7e87 */
+extern void func_void_ptr(void *x); /* _ZTSFvPvE -> 0xb2e442e6 */
+
+/* Pointer types - verify they all differ with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_int_ptr\n\t\.set\t__kcfi_typeid_func_int_ptr, 0xb2a15cf9} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_char_ptr\n\t\.set\t__kcfi_typeid_func_char_ptr, 0x1eaf7e87} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_void_ptr\n\t\.set\t__kcfi_typeid_func_void_ptr, 0xb2e442e6} } } */
+
+/* Verify pointer types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPiE' typeid=0xb2a15cf9} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPcE' typeid=0x1eaf7e87} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPvE' typeid=0xb2e442e6} kcfi0 } } */
+
+/* Count verification - pointer types (each appears twice due to the
+   array decay declarations below). */
+/* { dg-final { scan-assembler-times {0xb2a15cf9} 2 } } */
+/* { dg-final { scan-assembler-times {0x1eaf7e87} 2 } } */
+/* { dg-final { scan-assembler-times {0xb2e442e6} 1 } } */
+
+/* Const qualifier discrimination - const vs non-const must have different
+ type IDs. */
+extern void func_const_int_ptr(const int *x); /* _ZTSFvPKiE -> const int* (must differ from int*) */
+extern void func_const_char_ptr(const char *x); /* _ZTSFvPKcE -> const char* (must differ from char*) */
+extern void func_const_void_ptr(const void *x); /* _ZTSFvPKvE -> const void* (must differ from void*) */
+
+/* Const qualifier types - verify const vs non-const have different type IDs. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_const_int_ptr\n\t\.set\t__kcfi_typeid_func_const_int_ptr, 0x1dce360a} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_const_char_ptr\n\t\.set\t__kcfi_typeid_func_const_char_ptr, 0x39bf5794} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_const_void_ptr\n\t\.set\t__kcfi_typeid_func_const_void_ptr, 0x0dee7085} } } */
+
+/* Verify const qualifier types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPKiE' typeid=0x1dce360a} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPKcE' typeid=0x39bf5794} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPKvE' typeid=0x0dee7085} kcfi0 } } */
+
+/* Count verification - const qualifier types should appear exactly once. */
+/* { dg-final { scan-assembler-times {0x1dce360a} 1 } } */
+/* { dg-final { scan-assembler-times {0x39bf5794} 2 } } +1 from non-variadic simple test. */
+/* { dg-final { scan-assembler-times {0x0dee7085} 1 } } */
+
+/* Nested pointer types. */
+extern void func_int_ptr_ptr(int **x); /* _ZTSFvPPiE -> 0xf61ef6c7 */
+extern void func_char_ptr_ptr(char **x); /* _ZTSFvPPcE -> 0x8a0f4239 */
+
+/* Nested pointers with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_int_ptr_ptr\n\t\.set\t__kcfi_typeid_func_int_ptr_ptr, 0xf61ef6c7} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_char_ptr_ptr\n\t\.set\t__kcfi_typeid_func_char_ptr_ptr, 0x8a0f4239} } } */
+
+/* Verify nested pointer types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPPiE' typeid=0xf61ef6c7} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPPcE' typeid=0x8a0f4239} kcfi0 } } */
+
+/* Count verification - nested pointer types should appear exactly once. */
+/* { dg-final { scan-assembler-times {0xf61ef6c7} 1 } } */
+/* { dg-final { scan-assembler-times {0x8a0f4239} 1 } } */
+
+/* Multiple parameter types - order matters. */
+extern void func_int_char(int x, char y); /* _ZTSFvicE -> 0x5b983d44 */
+extern void func_char_int(char x, int y); /* _ZTSFvciE -> 0x4dbf9e00 */
+extern void func_two_int(int x, int y); /* _ZTSFviiE -> 0x3fa71bba */
+
+/* Multiple parameter tests with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_int_char\n\t\.set\t__kcfi_typeid_func_int_char, 0x5b983d44} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_char_int\n\t\.set\t__kcfi_typeid_func_char_int, 0x4dbf9e00} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_two_int\n\t\.set\t__kcfi_typeid_func_two_int, 0x3fa71bba} } } */
+
+/* Verify multiple parameter types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvicE' typeid=0x5b983d44} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvciE' typeid=0x4dbf9e00} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFviiE' typeid=0x3fa71bba} kcfi0 } } */
+
+/* Count verification - multiple parameter types should appear exactly once. */
+/* { dg-final { scan-assembler-times {0x5b983d44} 1 } } */
+/* { dg-final { scan-assembler-times {0x4dbf9e00} 1 } } */
+/* { dg-final { scan-assembler-times {0x3fa71bba} 1 } } */
+
+/* Return types. */
+extern int func_return_int(void); /* _ZTSFivE -> 0xb7f32039 */
+extern char func_return_char(void); /* _ZTSFcvE -> 0x9646527b */
+extern void* func_return_ptr(void); /* _ZTSFPvvE -> 0x81e76bc6 */
+
+/* Return type tests with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_return_int\n\t\.set\t__kcfi_typeid_func_return_int, 0xb7f32039} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_return_char\n\t\.set\t__kcfi_typeid_func_return_char, 0x9646527b} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_return_ptr\n\t\.set\t__kcfi_typeid_func_return_ptr, 0x81e76bc6} } } */
+
+/* Verify return types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFivE' typeid=0xb7f32039} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFcvE' typeid=0x9646527b} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFPvvE' typeid=0x81e76bc6} kcfi0 } } */
+
+/* Count verification - return types should appear exactly once. */
+/* { dg-final { scan-assembler-times {0xb7f32039} 1 } } */
+/* { dg-final { scan-assembler-times {0x9646527b} 1 } } */
+/* { dg-final { scan-assembler-times {0x81e76bc6} 1 } } */
+
+/* Array parameters - decay to pointers. */
+extern void func_int_array(int arr[]); /* _ZTSFvPiE -> 0xb2a15cf9 (same as int*) */
+extern void func_char_array(char arr[]); /* _ZTSFvPcE -> 0x1eaf7e87 (same as char*) */
+
+/* Array decay validation - arrays should have SAME type ID as corresponding pointers. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_int_array\n\t\.set\t__kcfi_typeid_func_int_array, 0xb2a15cf9} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_char_array\n\t\.set\t__kcfi_typeid_func_char_array, 0x1eaf7e87} } } */
+/* Counted below. */
+
+/* Function pointer parameters. */
+extern void func_fptr_void(void (*fp)(void)); /* _ZTSFvPFvvEE -> 0xc88f6251 */
+extern void func_fptr_int(void (*fp)(int)); /* _ZTSFvPFviEE -> 0xc4bf13bc */
+extern void func_fptr_ret_int(int (*fp)(void)); /* _ZTSFvPFivEE -> 0xf728b0c2 */
+
+/* Function pointer parameter tests with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_fptr_void\n\t\.set\t__kcfi_typeid_func_fptr_void, 0xc88f6251} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_fptr_int\n\t\.set\t__kcfi_typeid_func_fptr_int, 0xc4bf13bc} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_fptr_ret_int\n\t\.set\t__kcfi_typeid_func_fptr_ret_int, 0xf728b0c2} } } */
+
+/* Verify function pointer parameter types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPFvvEE' typeid=0xc88f6251} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPFviEE' typeid=0xc4bf13bc} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPFivEE' typeid=0xf728b0c2} kcfi0 } } */
+
+/* Count verification - function pointer parameter types should appear exactly once. */
+/* { dg-final { scan-assembler-times {0xc88f6251} 1 } } */
+/* { dg-final { scan-assembler-times {0xc4bf13bc} 1 } } */
+/* { dg-final { scan-assembler-times {0xf728b0c2} 1 } } */
+
+/* Variadic functions - must include 'z' marker for ellipsis parameter. */
+extern void func_variadic_simple(const char *fmt, ...); /* _ZTSFvPKczE -> uses z for variadic. */
+extern void func_variadic_mixed(int x, const char *fmt, ...); /* _ZTSFviPKczE -> int + const char* + variadic. */
+extern void func_variadic_multi(int x, char y, const char *fmt, ...); /* _ZTSFvicPKczE -> multiple params + variadic. */
+
+/* Audit log pattern - matches Linux kernel audit_log function signature. */
+struct audit_context { int dummy; };
+extern void audit_log_pattern(struct audit_context *ctx,
+ unsigned int gfp_mask, int type,
+ const char *fmt, ...); /* _ZTSFvP13audit_contextjiPKczE */
+
+/* va_start regression test. */
+void test_va_start_regression(float dummy, const char *fmt, ...);
+
+/* Variadic function tests - must differ from non-variadic equivalents. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_variadic_simple\n\t\.set\t__kcfi_typeid_func_variadic_simple, 0xc948a054} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_variadic_mixed\n\t\.set\t__kcfi_typeid_func_variadic_mixed, 0x00fbb853} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_variadic_multi\n\t\.set\t__kcfi_typeid_func_variadic_multi, 0xe22e4c64} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_audit_log_pattern\n\t\.set\t__kcfi_typeid_audit_log_pattern, 0xa610bd06} } } */
+
+/* Verify variadic function mangling includes 'z' marker. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPKczE' typeid=0xc948a054} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFviPKczE' typeid=0x00fbb853} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvicPKczE' typeid=0xe22e4c64} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvP13audit_contextjiPKczE' typeid=0xa610bd06} kcfi0 } } */
+
+/* Count verification - variadic function types should appear exactly once. */
+/* { dg-final { scan-assembler-times {0xc948a054} 1 } } */
+/* { dg-final { scan-assembler-times {0x00fbb853} 1 } } */
+/* { dg-final { scan-assembler-times {0xe22e4c64} 1 } } */
+/* { dg-final { scan-assembler-times {0xa610bd06} 1 } } */
+
+/* Non-variadic equivalents - must differ from variadic versions. */
+extern void func_non_variadic_simple(const char *fmt); /* _ZTSFvPKcE -> no z marker. */
+extern void func_non_variadic_mixed(int x, const char *fmt); /* _ZTSFviPKcE -> no z marker. */
+
+/* Non-variadic function tests - must have different type IDs from variadic versions. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_non_variadic_simple\n\t\.set\t__kcfi_typeid_func_non_variadic_simple, 0x39bf5794} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_non_variadic_mixed\n\t\.set\t__kcfi_typeid_func_non_variadic_mixed, 0xddf27ea9} } } */
+
+/* Verify non-variadic function mangling lacks 'z' marker. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPKcE' typeid=0x39bf5794} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFviPKcE' typeid=0xddf27ea9} kcfi0 } } */
+
+/* Count verification - non-variadic function types should appear exactly once. */
+/* { dg-final { scan-assembler-times {0x39bf5794} 2 } } +1 from earlier const char* test. */
+/* { dg-final { scan-assembler-times {0xddf27ea9} 1 } } */
+
+/* Struct/union/enum parameter types: each struct name must produce different type IDs. */
+struct test_struct_a { int x; };
+struct test_struct_b { int y; };
+struct test_struct_c { int z; };
+union test_union_a { int i; float f; };
+union test_union_b { int j; float g; };
+enum test_enum_a { ENUM_A_VAL1, ENUM_A_VAL2 };
+enum test_enum_b { ENUM_B_VAL1, ENUM_B_VAL2 };
+
+/* Functions taking struct pointers - must have different type IDs. */
+extern void func_struct_a_ptr(struct test_struct_a *x); /* _ZTSFvP13test_struct_aE -> unique. */
+extern void func_struct_b_ptr(struct test_struct_b *x); /* _ZTSFvP13test_struct_bE -> unique. */
+extern void func_struct_c_ptr(struct test_struct_c *x); /* _ZTSFvP13test_struct_cE -> unique. */
+
+/* Struct pointer tests with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_struct_a_ptr\n\t\.set\t__kcfi_typeid_func_struct_a_ptr, 0x784c51f8} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_struct_b_ptr\n\t\.set\t__kcfi_typeid_func_struct_b_ptr, 0x8845af63} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_struct_c_ptr\n\t\.set\t__kcfi_typeid_func_struct_c_ptr, 0x2c475d26} } } */
+
+/* Verify struct pointer types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvP13test_struct_aE' typeid=0x784c51f8} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvP13test_struct_bE' typeid=0x8845af63} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvP13test_struct_cE' typeid=0x2c475d26} kcfi0 } } */
+
+/* Count verification - struct pointer types should appear exactly once. */
+/* { dg-final { scan-assembler-times {0x784c51f8} 1 } } */
+/* { dg-final { scan-assembler-times {0x8845af63} 1 } } */
+/* { dg-final { scan-assembler-times {0x2c475d26} 1 } } */
+
+/* Functions taking const struct pointers - must differ from
+ non-const versions. */
+extern void func_const_struct_a_ptr(const struct test_struct_a *x); /* _ZTSFvPK13test_struct_aE -> unique, different from non-const. */
+extern void func_const_struct_b_ptr(const struct test_struct_b *x); /* _ZTSFvPK13test_struct_bE -> unique, different from non-const. */
+extern void func_const_struct_c_ptr(const struct test_struct_c *x); /* _ZTSFvPK13test_struct_cE -> unique, different from non-const. */
+
+/* Const struct pointer tests with precise patterns - must differ
+ from non-const. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_const_struct_a_ptr\n\t\.set\t__kcfi_typeid_func_const_struct_a_ptr, 0xe57ff62f} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_const_struct_b_ptr\n\t\.set\t__kcfi_typeid_func_const_struct_b_ptr, 0xd58698c4} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_const_struct_c_ptr\n\t\.set\t__kcfi_typeid_func_const_struct_c_ptr, 0xa98414e9} } } */
+
+/* Verify const struct pointer types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPK13test_struct_aE' typeid=0xe57ff62f} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPK13test_struct_bE' typeid=0xd58698c4} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvPK13test_struct_cE' typeid=0xa98414e9} kcfi0 } } */
+
+/* Count verification - const struct pointer types should appear exactly once. */
+/* { dg-final { scan-assembler-times {0xe57ff62f} 1 } } */
+/* { dg-final { scan-assembler-times {0xd58698c4} 1 } } */
+/* { dg-final { scan-assembler-times {0xa98414e9} 1 } } */
+
+extern void func_union_a_ptr(union test_union_a *x); /* _ZTSFvP12test_union_aE -> unique. */
+extern void func_union_b_ptr(union test_union_b *x); /* _ZTSFvP12test_union_bE -> unique. */
+extern void func_enum_a_ptr(enum test_enum_a *x); /* _ZTSFvP11test_enum_aE -> unique. */
+extern void func_enum_b_ptr(enum test_enum_b *x); /* _ZTSFvP11test_enum_bE -> unique. */
+
+/* Union member access discrimination test - prevents regression of
+ union member bug. */
+struct tasklet_like_struct {
+ int state;
+ union {
+ /* First union member - should NOT be used for callback calls. */
+ void (*func)(unsigned long data);
+ /* Second union member - should be used for callback calls. */
+ void (*callback)(struct tasklet_like_struct *t);
+ };
+ unsigned long data;
+};
+
+/* Function with callback signature - this should match when accessed via
+ union->callback. */
+extern void tasklet_callback_function(struct tasklet_like_struct *t); /* _ZTSFvP19tasklet_like_structE -> unique. */
+
+/* Function with func signature - this should NOT match callback calls. */
+extern void tasklet_func_function(unsigned long data); /* _ZTSFvmE -> different from callback. */
+
+/* Union member access discrimination tests. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_tasklet_callback_function\n\t\.set\t__kcfi_typeid_tasklet_callback_function, 0x84fa4a3e} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_tasklet_func_function\n\t\.set\t__kcfi_typeid_tasklet_func_function, 0x80ee047b} } } */
+
+/* Verify union member discrimination tests. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvP19tasklet_like_structE' typeid=0x84fa4a3e} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvmE' typeid=0x80ee047b} kcfi0 } } */
+
+/* Union pointer tests with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_union_a_ptr\n\t\.set\t__kcfi_typeid_func_union_a_ptr, 0xfeec6097} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_union_b_ptr\n\t\.set\t__kcfi_typeid_func_union_b_ptr, 0xeef3032c} } } */
+
+/* Verify union pointer types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvP12test_union_aE' typeid=0xfeec6097} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvP12test_union_bE' typeid=0xeef3032c} kcfi0 } } */
+
+/* Count verification - union pointer types should appear exactly once. */
+/* { dg-final { scan-assembler-times {0xfeec6097} 1 } } */
+/* { dg-final { scan-assembler-times {0xeef3032c} 1 } } */
+
+/* Enum pointer tests with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_enum_a_ptr\n\t\.set\t__kcfi_typeid_func_enum_a_ptr, 0xd2bdb84a} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_enum_b_ptr\n\t\.set\t__kcfi_typeid_func_enum_b_ptr, 0xf2c02941} } } */
+
+/* Verify enum pointer types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvP11test_enum_aE' typeid=0xd2bdb84a} kcfi0 } } */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvP11test_enum_bE' typeid=0xf2c02941} kcfi0 } } */
+
+/* Count verification - enum pointer types should appear exactly once. */
+/* { dg-final { scan-assembler-times {0xd2bdb84a} 1 } } */
+/* { dg-final { scan-assembler-times {0xf2c02941} 1 } } */
+
+/* Count verification - union member discrimination types should appear exactly once. */
+/* The key test is that callback and func functions have DIFFERENT type IDs, proving union member discrimination works. */
+/* { dg-final { scan-assembler-times {0x84fa4a3e} 1 } } */
+/* { dg-final { scan-assembler-times {0x80ee047b} 1 } } */
+
+/* Indirect call through t->callback union must use correct callback
+   type ID (0x84fa4a3e).  On x86 the check loads the two's-complement
+   negation of the type ID: 2063971778 == 0x7b05b5c2 == -0x84fa4a3e.
+   The other targets below encode the raw value 0x84fa4a3e. */
+/* { dg-final { scan-assembler-times {\tmovl\t\$2063971778, %r10d} 1 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times {\tmov\tw17, #19006\n\tmovk\tw17, #34042, lsl #16} 1 { target aarch64-*-* } } } */
+/* { dg-final { scan-assembler-times {\tpush\t\{r0, r1\}\n\tldr\tr0, \[r[0-9]+, #-4\]\n\tmovw\tr1, #19006\n\tmovt\tr1, #34042} 1 { target arm32 } } } */
+/* { dg-final { scan-assembler-times {\tlui\tt2, 544677\n\taddiw\tt2, t2, -1474} 1 { target riscv*-*-* } } } */
+
+/* Functions returning struct pointers - must have different type IDs. */
+extern struct test_struct_a* func_ret_struct_a_ptr(void); /* _ZTSFP13test_struct_avE -> unique. */
+extern struct test_struct_b* func_ret_struct_b_ptr(void); /* _ZTSFP13test_struct_bvE -> unique. */
+extern struct test_struct_c* func_ret_struct_c_ptr(void); /* _ZTSFP13test_struct_cvE -> unique. */
+
+/* Struct return pointer tests with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_ret_struct_a_ptr\n\t\.set\t__kcfi_typeid_func_ret_struct_a_ptr, 0x25780668} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_ret_struct_b_ptr\n\t\.set\t__kcfi_typeid_func_ret_struct_b_ptr, 0xb1377aa5} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_ret_struct_c_ptr\n\t\.set\t__kcfi_typeid_func_ret_struct_c_ptr, 0x0dc41dee} } } */
+
+/* Verify struct return pointer types. */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFP13test_struct_avE' typeid=0x25780668} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFP13test_struct_bvE' typeid=0xb1377aa5} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFP13test_struct_cvE' typeid=0x0dc41dee} kcfi0 } } */
+
+/* { dg-final { scan-assembler-times {0x25780668} 1 } } */
+/* { dg-final { scan-assembler-times {0xb1377aa5} 1 } } */
+/* { dg-final { scan-assembler-times {0x0dc41dee} 1 } } */
+
+/* Functions taking structs by value - must have different type IDs. */
+extern void func_struct_a_val(struct test_struct_a x); /* _ZTSFv13test_struct_aE -> unique. */
+extern void func_struct_b_val(struct test_struct_b x); /* _ZTSFv13test_struct_bE -> unique. */
+extern void func_struct_c_val(struct test_struct_c x); /* _ZTSFv13test_struct_cE -> unique. */
+
+/* Struct by-value parameter tests with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_struct_a_val\n\t\.set\t__kcfi_typeid_func_struct_a_val, 0xe0fb126a} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_struct_b_val\n\t\.set\t__kcfi_typeid_func_struct_b_val, 0x00fd8361} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_struct_c_val\n\t\.set\t__kcfi_typeid_func_struct_c_val, 0xad00d0bc} } } */
+
+/* Verify struct by-value parameter types. */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFv13test_struct_aE' typeid=0xe0fb126a} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFv13test_struct_bE' typeid=0x00fd8361} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFv13test_struct_cE' typeid=0xad00d0bc} kcfi0 } } */
+
+/* { dg-final { scan-assembler-times {0xe0fb126a} 1 } } */
+/* { dg-final { scan-assembler-times {0x00fd8361} 1 } } */
+/* { dg-final { scan-assembler-times {0xad00d0bc} 1 } } */
+
+/* Functions returning structs by value - must have different type IDs. */
+extern struct test_struct_a func_ret_struct_a_val(void); /* _ZTSF13test_struct_avE -> unique. */
+extern struct test_struct_b func_ret_struct_b_val(void); /* _ZTSF13test_struct_bvE -> unique. */
+extern struct test_struct_c func_ret_struct_c_val(void); /* _ZTSF13test_struct_cvE -> unique. */
+
+/* Struct return by-value tests with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_ret_struct_a_val\n\t\.set\t__kcfi_typeid_func_ret_struct_a_val, 0x0405e05a} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_ret_struct_b_val\n\t\.set\t__kcfi_typeid_func_ret_struct_b_val, 0x6c60f9bb} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_ret_struct_c_val\n\t\.set\t__kcfi_typeid_func_ret_struct_c_val, 0xd8ef4934} } } */
+
+/* Verify struct return by-value types. */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSF13test_struct_avE' typeid=0x0405e05a} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSF13test_struct_bvE' typeid=0x6c60f9bb} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSF13test_struct_cvE' typeid=0xd8ef4934} kcfi0 } } */
+
+/* { dg-final { scan-assembler-times {0x0405e05a} 1 } } */
+/* { dg-final { scan-assembler-times {0x6c60f9bb} 1 } } */
+/* { dg-final { scan-assembler-times {0xd8ef4934} 1 } } */
+
+/* Mixed struct parameters - order and type must matter. */
+extern void func_struct_a_b(struct test_struct_a *a, struct test_struct_b *b); /* unique. */
+extern void func_struct_b_a(struct test_struct_b *b, struct test_struct_a *a); /* different! */
+
+/* Mixed struct parameter tests - MUST be different (parameter order matters). */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_struct_a_b\n\t\.set\t__kcfi_typeid_func_struct_a_b, 0xf4af6e27} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_struct_b_a\n\t\.set\t__kcfi_typeid_func_struct_b_a, 0x16bb1ad3} } } */
+
+/* Verify mixed struct parameter types. */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP13test_struct_aP13test_struct_bE' typeid=0xf4af6e27} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP13test_struct_bP13test_struct_aE' typeid=0x16bb1ad3} kcfi0 } } */
+
+/* { dg-final { scan-assembler-times {0xf4af6e27} 1 } } */
+/* { dg-final { scan-assembler-times {0x16bb1ad3} 1 } } */
+
+/* Typedef structs - must be different from named structs. */
+typedef struct { int value; } typedef_struct_x;
+typedef struct { int value; } typedef_struct_y; /* Same layout but different typedef name. */
+extern void func_typedef_x_ptr(typedef_struct_x *x); /* Must be unique. */
+extern void func_typedef_y_ptr(typedef_struct_y *x); /* Must be different from typedef_struct_x. */
+
+/* Typedef struct tests - MUST be different from each other. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_typedef_x_ptr\n\t\.set\t__kcfi_typeid_func_typedef_x_ptr, 0x746f7969} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_typedef_y_ptr\n\t\.set\t__kcfi_typeid_func_typedef_y_ptr, 0xa071fd44} } } */
+
+/* Verify typedef struct pointer types. */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP16typedef_struct_xE' typeid=0x746f7969} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP16typedef_struct_yE' typeid=0xa071fd44} kcfi0 } } */
+
+/* { dg-final { scan-assembler-times {0x746f7969} 1 } } */
+/* { dg-final { scan-assembler-times {0xa071fd44} 1 } } */
+
+/* Typedef vs open-coded function types - MUST have identical type IDs. */
+typedef void (*func_ptr_typedef)(int x, char y);
+extern void func_with_typedef_param(func_ptr_typedef fp); /* Should match open-coded. */
+extern void func_with_opencoded_param(void (*fp)(int x, char y)); /* Should match typedef. */
+
+/* Function parameter types - typedef and open-coded should generate SAME type ID. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_with_typedef_param\n\t\.set\t__kcfi_typeid_func_with_typedef_param, 0xdc5c6da9} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_with_opencoded_param\n\t\.set\t__kcfi_typeid_func_with_opencoded_param, 0xdc5c6da9} } } */
+
+/* Verify function pointer parameter types. */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvPFvicEE' typeid=0xdc5c6da9} kcfi0 } } */
+
+/* Verify exact count - each typedef/opencoded pair should generate exactly 2 symbols with identical values. */
+/* { dg-final { scan-assembler-times {0xdc5c6da9} 2 } } */
+
+/* Typedef vs open-coded function types - MUST have identical type IDs. */
+typedef int (*ret_func_ptr_typedef)(void);
+extern ret_func_ptr_typedef func_ret_typedef_param(void); /* Should match open-coded. */
+extern int (*func_ret_opencoded_param(void))(void); /* Should match typedef. */
+
+/* Return function pointer types - typedef and open-coded should
+ generate SAME type ID. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_ret_typedef_param\n\t\.set\t__kcfi_typeid_func_ret_typedef_param, 0xdfeb316a} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_ret_opencoded_param\n\t\.set\t__kcfi_typeid_func_ret_opencoded_param, 0xdfeb316a} } } */
+
+/* Verify return function pointer types. */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFPFivEvE' typeid=0xdfeb316a} kcfi0 } } */
+
+/* Verify exact count - each typedef/opencoded pair should generate exactly
+ 2 symbols with identical values. */
+/* { dg-final { scan-assembler-times {0xdfeb316a} 2 } } */
+
+/* Anonymous struct via typedef - should get typedef name as struct name. */
+typedef struct { int anon_member_1; } anon_typedef_1;
+typedef struct { int anon_member_2; } anon_typedef_2;
+extern void func_anon_typedef_1(anon_typedef_1 *param); /* Should use typedef name. */
+extern void func_anon_typedef_2(anon_typedef_2 *param); /* Should be different from anon_typedef_1. */
+
+/* Anonymous typedef struct tests - MUST be different from each other. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_anon_typedef_1\n\t\.set\t__kcfi_typeid_func_anon_typedef_1, 0x55475a23} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_anon_typedef_2\n\t\.set\t__kcfi_typeid_func_anon_typedef_2, 0x454f8fb8} } } */
+
+/* Verify anonymous typedef struct types. */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP14anon_typedef_1E' typeid=0x55475a23} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP14anon_typedef_2E' typeid=0x454f8fb8} kcfi0 } } */
+
+/* { dg-final { scan-assembler-times {0x55475a23} 1 } } */
+/* { dg-final { scan-assembler-times {0x454f8fb8} 1 } } */
+
+/* Local function definitions - these will NOT get __kcfi_typeid_ symbols (only external declarations do). */
+void local_func_void(void) { } /* _ZTSFvvE -> 0x40e0d3c8 */
+void local_func_short(short x) { } /* _ZTSFvsE -> 0x84d472e1 */
+void local_func_uint(unsigned int x) { } /* _ZTSFvjE -> 0x60eb9384 */
+void local_func_float(float x) { } /* _ZTSFvfE -> 0x210943d8 */
+
+/* Local function validation - verify local function definitions do NOT get
+ __kcfi_typeid_ symbols. */
+/* { dg-final { scan-assembler-not {\t\.weak\t__kcfi_typeid_local_func_void\n} } } */
+/* { dg-final { scan-assembler-not {\t\.weak\t__kcfi_typeid_local_func_short\n} } } */
+/* { dg-final { scan-assembler-not {\t\.weak\t__kcfi_typeid_local_func_uint\n} } } */
+/* { dg-final { scan-assembler-not {\t\.weak\t__kcfi_typeid_local_func_float\n} } } */
+
+/* Local pointer parameter types. */
+void local_func_double_ptr(double *x) { } /* _ZTSFvPdE -> 0x1ec0c7a8 */
+void local_func_float_ptr(float *x) { } /* _ZTSFvPfE -> 0xd2bbd2d6 */
+
+/* Local pointer parameter types - should NOT emit symbols. */
+/* { dg-final { scan-assembler-not {\t\.weak\t__kcfi_typeid_local_func_double_ptr\n} } } */
+/* { dg-final { scan-assembler-not {\t\.weak\t__kcfi_typeid_local_func_float_ptr\n} } } */
+
+/* Local nested pointers. */
+void local_func_void_ptr_ptr(void **x) { } /* _ZTSFvPPvE -> 0xa64349b0 */
+
+/* Local nested pointers - should NOT emit symbols. */
+/* { dg-final { scan-assembler-not {\t\.weak\t__kcfi_typeid_local_func_void_ptr_ptr\n} } } */
+
+/* Local mixed parameters. */
+void local_func_ptr_val(int *x, int y) { } /* _ZTSFvPiiE -> 0xf072c2e8 */
+void local_func_val_ptr(int x, int *y) { } /* _ZTSFviPiE -> 0x0d1f87aa */
+
+/* Local mixed parameter validation - should NOT emit symbols. */
+/* { dg-final { scan-assembler-not {\t\.weak\t__kcfi_typeid_local_func_ptr_val\n} } } */
+/* { dg-final { scan-assembler-not {\t\.weak\t__kcfi_typeid_local_func_val_ptr\n} } } */
+
+/* Local return types. */
+float local_func_return_float(void) { return 0.0f; } /* _ZTSFfvE -> 0xee5e2118 */
+double local_func_return_double(void) { return 0.0; } /* _ZTSFdvE -> 0x59256b1e */
+
+/* Local return type discrimination - should NOT emit symbols. */
+/* { dg-final { scan-assembler-not {\t\.weak\t__kcfi_typeid_local_func_return_float\n} } } */
+/* { dg-final { scan-assembler-not {\t\.weak\t__kcfi_typeid_local_func_return_double\n} } } */
+
+/* Verify local function mangle strings appear in KCFI dump (even though no symbols are emitted). */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvsE' typeid=0x84d472e1} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvfE' typeid=0x210943d8} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvPdE' typeid=0x1ec0c7a8} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvPfE' typeid=0xd2bbd2d6} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvPPvE' typeid=0xa64349b0} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvPiiE' typeid=0xf072c2e8} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFviPiE' typeid=0x0d1f87aa} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFfvE' typeid=0xee5e2118} kcfi0 } } */
+/* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFdvE' typeid=0x59256b1e} kcfi0 } } */
+
+struct not_void {
+ int nothing;
+};
+
+/* Function that takes addresses to make functions visible to KCFI. */
+void test_address_taken(struct not_void *arg)
+{
+ /* External functions - taking addresses generates __kcfi_typeid_ symbols. */
+ void (*p1)(void) = func_void;
+ void (*p2)(char) = func_char;
+ void (*p3)(int) = func_int;
+ void (*p4)(long) = func_long;
+
+ void (*p5)(int*) = func_int_ptr;
+ void (*p6)(char*) = func_char_ptr;
+ void (*p7)(void*) = func_void_ptr;
+
+ void (*p_const_int_ptr)(const int*) = func_const_int_ptr;
+ void (*p_const_char_ptr)(const char*) = func_const_char_ptr;
+ void (*p_const_void_ptr)(const void*) = func_const_void_ptr;
+
+ void (*p8)(int**) = func_int_ptr_ptr;
+ void (*p9)(char**) = func_char_ptr_ptr;
+
+ void (*p10)(int, char) = func_int_char;
+ void (*p11)(char, int) = func_char_int;
+ void (*p12)(int, int) = func_two_int;
+
+ int (*p13)(void) = func_return_int;
+ char (*p14)(void) = func_return_char;
+ void* (*p15)(void) = func_return_ptr;
+
+ /* Array parameters - should decay to pointers. */
+ void (*p16)(int*) = func_int_array;
+ void (*p17)(char*) = func_char_array;
+
+ /* Function pointer parameters. */
+ void (*p18)(void(*)(void)) = func_fptr_void;
+ void (*p19)(void(*)(int)) = func_fptr_int;
+ void (*p20)(int(*)(void)) = func_fptr_ret_int;
+
+ /* Struct/union/enum function pointers. */
+ void (*p_struct_a_ptr)(struct test_struct_a*) = func_struct_a_ptr;
+ void (*p_struct_b_ptr)(struct test_struct_b*) = func_struct_b_ptr;
+ void (*p_struct_c_ptr)(struct test_struct_c*) = func_struct_c_ptr;
+
+ /* Const struct function pointers. */
+ void (*p_const_struct_a_ptr)(const struct test_struct_a*) = func_const_struct_a_ptr;
+ void (*p_const_struct_b_ptr)(const struct test_struct_b*) = func_const_struct_b_ptr;
+ void (*p_const_struct_c_ptr)(const struct test_struct_c*) = func_const_struct_c_ptr;
+ void (*p_union_a_ptr)(union test_union_a*) = func_union_a_ptr;
+ void (*p_union_b_ptr)(union test_union_b*) = func_union_b_ptr;
+ void (*p_enum_a_ptr)(enum test_enum_a*) = func_enum_a_ptr;
+ void (*p_enum_b_ptr)(enum test_enum_b*) = func_enum_b_ptr;
+
+ struct test_struct_a* (*p_ret_struct_a_ptr)(void) = func_ret_struct_a_ptr;
+ struct test_struct_b* (*p_ret_struct_b_ptr)(void) = func_ret_struct_b_ptr;
+ struct test_struct_c* (*p_ret_struct_c_ptr)(void) = func_ret_struct_c_ptr;
+
+ void (*p_struct_a_val)(struct test_struct_a) = func_struct_a_val;
+ void (*p_struct_b_val)(struct test_struct_b) = func_struct_b_val;
+ void (*p_struct_c_val)(struct test_struct_c) = func_struct_c_val;
+
+ struct test_struct_a (*p_ret_struct_a_val)(void) = func_ret_struct_a_val;
+ struct test_struct_b (*p_ret_struct_b_val)(void) = func_ret_struct_b_val;
+ struct test_struct_c (*p_ret_struct_c_val)(void) = func_ret_struct_c_val;
+
+ void (*p_struct_a_b)(struct test_struct_a*, struct test_struct_b*) = func_struct_a_b;
+ void (*p_struct_b_a)(struct test_struct_b*, struct test_struct_a*) = func_struct_b_a;
+
+ void (*p_typedef_x_ptr)(typedef_struct_x*) = func_typedef_x_ptr;
+ void (*p_typedef_y_ptr)(typedef_struct_y*) = func_typedef_y_ptr;
+
+ /* Typedef vs open-coded function type assignments should generate
+ identical type IDs. */
+ void (*p_with_typedef_param)(func_ptr_typedef) = func_with_typedef_param;
+ void (*p_with_opencoded_param)(void (*)(int, char)) = func_with_opencoded_param;
+ ret_func_ptr_typedef (*p_ret_typedef_param)(void) = func_ret_typedef_param;
+ int (*(*p_ret_opencoded_param)(void))(void) = func_ret_opencoded_param;
+
+ /* Anonymous struct typedef assignments - should generate unique type IDs. */
+ void (*p_anon_typedef_1)(anon_typedef_1 *) = func_anon_typedef_1;
+ void (*p_anon_typedef_2)(anon_typedef_2 *) = func_anon_typedef_2;
+
+ /* Union member access discrimination test. */
+ void (*p_tasklet_callback)(struct tasklet_like_struct *) = tasklet_callback_function;
+ void (*p_tasklet_func)(unsigned long) = tasklet_func_function;
+
+ /* Local functions - taking addresses does NOT generate __kcfi_typeid_
+ symbols (only external declarations do). */
+ void (*p21)(void) = local_func_void;
+ void (*p22)(short) = local_func_short;
+ void (*p23)(unsigned int) = local_func_uint;
+ void (*p24)(float) = local_func_float;
+
+ void (*p25)(double*) = local_func_double_ptr;
+ void (*p26)(float*) = local_func_float_ptr;
+
+ void (*p27)(void**) = local_func_void_ptr_ptr;
+
+ void (*p28)(int*, int) = local_func_ptr_val;
+ void (*p29)(int, int*) = local_func_val_ptr;
+
+ float (*p30)(void) = local_func_return_float;
+ double (*p31)(void) = local_func_return_double;
+
+ /* Use pointers to prevent optimization - external functions. */
+ if (p1) p1();
+ if (p2) p2('x');
+ if (p3) p3(42);
+ if (p4) p4(42L);
+ if (p5) p5((int*)0);
+ if (p6) p6((char*)0);
+ if (p7) p7((void*)0);
+
+ /* Use const qualifier pointers to prevent optimization. */
+ if (p_const_int_ptr) p_const_int_ptr((const int*)0);
+ if (p_const_char_ptr) p_const_char_ptr((const char*)0);
+ if (p_const_void_ptr) p_const_void_ptr((const void*)0);
+ if (p8) p8((int**)0);
+ if (p9) p9((char**)0);
+ if (p10) p10(1, 'x');
+ if (p11) p11('x', 1);
+ if (p12) p12(1, 2);
+ if (p13) p13();
+ if (p14) p14();
+ if (p15) p15();
+ if (p16) p16((int*)0);
+ if (p17) p17((char*)0);
+ if (p18) p18((void(*)(void))0);
+ if (p19) p19((void(*)(int))0);
+ if (p20) p20((int(*)(void))0);
+
+ /* Use pointers to prevent optimization - local functions. */
+ if (p21) p21();
+ if (p22) p22(1);
+ if (p23) p23(1U);
+ if (p24) p24(1.0f);
+ if (p25) p25((double*)0);
+ if (p26) p26((float*)0);
+ if (p27) p27((void**)0);
+ if (p28) p28((int*)0, 1);
+ if (p29) p29(1, (int*)0);
+ if (p30) p30();
+ if (p31) p31();
+
+ /* Use struct/union/enum function pointers to generate KCFI type IDs. */
+ if (p_struct_a_ptr) p_struct_a_ptr((struct test_struct_a*)0);
+ if (p_struct_b_ptr) p_struct_b_ptr((struct test_struct_b*)0);
+ if (p_struct_c_ptr) p_struct_c_ptr((struct test_struct_c*)0);
+ if (p_const_struct_a_ptr) p_const_struct_a_ptr((const struct test_struct_a*)0);
+ if (p_const_struct_b_ptr) p_const_struct_b_ptr((const struct test_struct_b*)0);
+ if (p_const_struct_c_ptr) p_const_struct_c_ptr((const struct test_struct_c*)0);
+ if (p_union_a_ptr) p_union_a_ptr((union test_union_a*)0);
+ if (p_union_b_ptr) p_union_b_ptr((union test_union_b*)0);
+ if (p_enum_a_ptr) p_enum_a_ptr((enum test_enum_a*)0);
+ if (p_enum_b_ptr) p_enum_b_ptr((enum test_enum_b*)0);
+
+ /* Use struct return type function pointers to generate type IDs. */
+ if (p_ret_struct_a_ptr) p_ret_struct_a_ptr();
+ if (p_ret_struct_b_ptr) p_ret_struct_b_ptr();
+ if (p_ret_struct_c_ptr) p_ret_struct_c_ptr();
+
+ /* Use struct by-value parameter function pointers to generate type IDs. */
+ struct test_struct_a dummy_a = {};
+ struct test_struct_b dummy_b = {};
+ struct test_struct_c dummy_c = {};
+ if (p_struct_a_val) p_struct_a_val(dummy_a);
+ if (p_struct_b_val) p_struct_b_val(dummy_b);
+ if (p_struct_c_val) p_struct_c_val(dummy_c);
+
+ /* Use struct return by-value function pointers to generate type IDs. */
+ if (p_ret_struct_a_val) p_ret_struct_a_val();
+ if (p_ret_struct_b_val) p_ret_struct_b_val();
+ if (p_ret_struct_c_val) p_ret_struct_c_val();
+
+ /* Use multi-parameter struct function pointers to generate type IDs. */
+ if (p_struct_a_b) p_struct_a_b((struct test_struct_a*)0, (struct test_struct_b*)0);
+ if (p_struct_b_a) p_struct_b_a((struct test_struct_b*)0, (struct test_struct_a*)0);
+
+ /* Use typedef struct function pointers to generate type IDs. */
+ if (p_typedef_x_ptr) p_typedef_x_ptr((typedef_struct_x*)0);
+ if (p_typedef_y_ptr) p_typedef_y_ptr((typedef_struct_y*)0);
+
+ /* Use typedef vs open-coded function pointers to generate type IDs. */
+ if (p_with_typedef_param) p_with_typedef_param((func_ptr_typedef)0);
+ if (p_with_opencoded_param) p_with_opencoded_param((void (*)(int, char))0);
+ if (p_ret_typedef_param) p_ret_typedef_param();
+ if (p_ret_opencoded_param) p_ret_opencoded_param();
+
+ /* Use anonymous typedef function pointers to generate type IDs. */
+ if (p_anon_typedef_1) p_anon_typedef_1((anon_typedef_1*)0);
+ if (p_anon_typedef_2) p_anon_typedef_2((anon_typedef_2*)0);
+
+ /* Use tasklet func function pointer to generate type ID. */
+ if (p_tasklet_func) p_tasklet_func(0);
+
+ struct tasklet_like_struct test_tasklet = { };
+ test_tasklet.callback = tasklet_callback_function;
+
+ /* This indirect call through union->callback MUST generate type ID
+ 0x84fa4a3e (callback signature). NOT type ID 0x80ee047b (func signature
+ from first union member). */
+ struct tasklet_like_struct *volatile tasklet_ptr = &test_tasklet;
+ if (tasklet_ptr->callback) {
+ /* This call should match tasklet_callback_function type ID */
+ tasklet_ptr->callback(tasklet_ptr);
+ }
+}
+
+/* Named struct and its typedef should have IDENTICAL type IDs after canonicalization. */
+struct named_for_typedef_test { int member; };
+typedef struct named_for_typedef_test named_for_typedef_test_t;
+
+extern void func_named_struct_param(struct named_for_typedef_test *param);
+extern void func_typedef_struct_param(named_for_typedef_test_t *param);
+
+/* Named struct typedef canonicalization - MUST have identical type IDs. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_named_struct_param\n\t\.set\t__kcfi_typeid_func_named_struct_param, 0x9316d030} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_typedef_struct_param\n\t\.set\t__kcfi_typeid_func_typedef_struct_param, 0x9316d030} } } */
+
+/* Verify named struct typedef canonicalization types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvP22named_for_typedef_testE' typeid=0x9316d030} kcfi0 } } */
+
+/* Verify exact count - both should generate exactly 2 symbols with identical values. */
+/* { dg-final { scan-assembler-times {0x9316d030} 2 } } */
+
+void test_named_struct_typedef_canonicalization(struct not_void *arg) {
+ /* These should be compatible after canonicalization. */
+ void (*fp_struct)(struct named_for_typedef_test *) = func_named_struct_param;
+ void (*fp_typedef)(struct named_for_typedef_test *) = func_typedef_struct_param;
+
+ /* Take addresses to generate type IDs. */
+ if (fp_struct) fp_struct((struct named_for_typedef_test *)0);
+ if (fp_typedef) fp_typedef((struct named_for_typedef_test *)0);
+}
+
+/* Basic type typedef canonicalization - typedef should canonicalize to
+ underlying basic type. */
+
+/* Basic type typedefs commonly used in kernel code. */
+typedef unsigned char u8;
+typedef unsigned short u16;
+typedef unsigned int u32;
+
+/* Functions with basic type typedef vs original type parameters. */
+extern void func_u8_param(u8 param);
+extern void func_unsigned_char_param(unsigned char param);
+extern void func_u16_param(u16 param);
+extern void func_unsigned_short_param(unsigned short param);
+extern void func_u32_param(u32 param);
+extern void func_unsigned_int_param(unsigned int param);
+
+void test_basic_typedef_canonicalization(struct not_void *arg) {
+ /* These should be compatible after canonicalization. */
+ void (*fp_u8)(unsigned char) = func_u8_param; /* Should work with canonicalization. */
+ void (*fp_uchar)(unsigned char) = func_unsigned_char_param; /* Should work normally. */
+ void (*fp_u16)(unsigned short) = func_u16_param; /* Should work with canonicalization. */
+ void (*fp_ushort)(unsigned short) = func_unsigned_short_param; /* Should work normally. */
+ void (*fp_u32)(unsigned int) = func_u32_param; /* Should work with canonicalization. */
+ void (*fp_uint)(unsigned int) = func_unsigned_int_param; /* Should work normally. */
+
+ /* Take addresses to generate type IDs. */
+ if (fp_u8) fp_u8(0);
+ if (fp_uchar) fp_uchar(0);
+ if (fp_u16) fp_u16(0);
+ if (fp_ushort) fp_ushort(0);
+ if (fp_u32) fp_u32(0);
+ if (fp_uint) fp_uint(0);
+}
+
+/* Basic type typedef canonicalization - MUST have identical type IDs
+ after canonicalization. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_u8_param\n\t\.set\t__kcfi_typeid_func_u8_param, 0x14e69eb2} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_unsigned_char_param\n\t\.set\t__kcfi_typeid_func_unsigned_char_param, 0x14e69eb2} } } */
+
+/* Verify basic type canonicalization (u8/unsigned char) */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvhE' typeid=0x14e69eb2} kcfi0 } } */
+
+/* Count test is below, which includes other tests that use this hash. */
+
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_u16_param\n\t\.set\t__kcfi_typeid_func_u16_param, 0x74dca876} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_unsigned_short_param\n\t\.set\t__kcfi_typeid_func_unsigned_short_param, 0x74dca876} } } */
+
+/* Verify basic type canonicalization (u16/unsigned short) */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvtE' typeid=0x74dca876} kcfi0 } } */
+
+/* { dg-final { scan-assembler-times {0x74dca876} 2 } } */
+
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_u32_param\n\t\.set\t__kcfi_typeid_func_u32_param, 0x60eb9384} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_unsigned_int_param\n\t\.set\t__kcfi_typeid_func_unsigned_int_param, 0x60eb9384} } } */
+
+/* Verify basic type canonicalization (u32/unsigned int) */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvjE' typeid=0x60eb9384} kcfi0 } } */
+
+/* Count test is below, which includes other tests that use this hash. */
+
+/* Verify exact count - each typedef/basic type pair should generate exactly 2 symbols with identical values. */
+/* Note: Counts updated below to include recursive typedef tests. */
+
+/* Recursive typedef canonicalization - test multi-level typedef chains. */
+
+/* Kernel-style recursive typedef chains that need full canonicalization. */
+typedef unsigned char __u8_recursive;
+typedef __u8_recursive u8_recursive;
+
+typedef unsigned int __u32_recursive;
+typedef __u32_recursive u32_recursive;
+
+/* Three-level typedef chains. */
+typedef unsigned char base_u8_recursive_t;
+typedef base_u8_recursive_t mid_u8_recursive_t;
+typedef mid_u8_recursive_t top_u8_recursive_t;
+
+/* Struct recursive typedef chains. */
+struct recursive_struct_test { int value; };
+typedef struct recursive_struct_test base_recursive_struct_t;
+typedef base_recursive_struct_t top_recursive_struct_t;
+
+/* Functions with recursive typedefs - MUST have same type IDs as canonical forms. */
+extern void func_u8_recursive_chain(u8_recursive param); /* u8_recursive -> __u8_recursive -> unsigned char. */
+extern void func_u8_recursive_mid(__u8_recursive param); /* __u8_recursive -> unsigned char. */
+extern void func_u8_recursive_base(unsigned char param); /* unsigned char (baseline) */
+
+extern void func_u32_recursive_chain(u32_recursive param); /* u32_recursive -> __u32_recursive -> unsigned int. */
+extern void func_u32_recursive_mid(__u32_recursive param); /* __u32_recursive -> unsigned int. */
+extern void func_u32_recursive_base(unsigned int param); /* unsigned int (baseline) */
+
+extern void func_three_level_recursive(top_u8_recursive_t param); /* top -> mid -> base -> unsigned char. */
+extern void func_three_level_mid(mid_u8_recursive_t param); /* mid -> base -> unsigned char. */
+extern void func_three_level_base(base_u8_recursive_t param); /* base -> unsigned char. */
+extern void func_three_level_final(unsigned char param); /* unsigned char (baseline) */
+
+extern void func_struct_recursive_chain(top_recursive_struct_t *param); /* Should resolve to struct name. */
+extern void func_struct_recursive_mid(base_recursive_struct_t *param); /* Should resolve to struct name. */
+extern void func_struct_recursive_original(struct recursive_struct_test *param); /* struct name (baseline) */
+
+void test_recursive_canonicalization(struct not_void *arg) {
+ /* Recursive typedef function pointers - should be compatible after
+ full canonicalization. */
+ void (*fp_u8_chain)(unsigned char) = func_u8_recursive_chain; /* Should work after 2-level canonicalization. */
+ void (*fp_u8_mid)(unsigned char) = func_u8_recursive_mid; /* Should work after 1-level canonicalization. */
+ void (*fp_u8_base)(unsigned char) = func_u8_recursive_base; /* Should work normally. */
+
+ void (*fp_u32_chain)(unsigned int) = func_u32_recursive_chain; /* Should work after 2-level canonicalization. */
+ void (*fp_u32_mid)(unsigned int) = func_u32_recursive_mid; /* Should work after 1-level canonicalization. */
+ void (*fp_u32_base)(unsigned int) = func_u32_recursive_base; /* Should work normally. */
+
+ void (*fp_three_chain)(unsigned char) = func_three_level_recursive; /* Should work after 3-level canonicalization. */
+ void (*fp_three_mid)(unsigned char) = func_three_level_mid; /* Should work after 2-level canonicalization. */
+ void (*fp_three_base)(unsigned char) = func_three_level_base; /* Should work after 1-level canonicalization. */
+ void (*fp_three_final)(unsigned char) = func_three_level_final; /* Should work normally. */
+
+ void (*fp_struct_chain)(struct recursive_struct_test *) = func_struct_recursive_chain; /* Should work after canonicalization. */
+ void (*fp_struct_mid)(struct recursive_struct_test *) = func_struct_recursive_mid; /* Should work after canonicalization. */
+ void (*fp_struct_orig)(struct recursive_struct_test *) = func_struct_recursive_original; /* Should work normally. */
+
+ /* Use function pointers to prevent optimization. */
+ if (fp_u8_chain) fp_u8_chain(0);
+ if (fp_u8_mid) fp_u8_mid(0);
+ if (fp_u8_base) fp_u8_base(0);
+ if (fp_u32_chain) fp_u32_chain(0);
+ if (fp_u32_mid) fp_u32_mid(0);
+ if (fp_u32_base) fp_u32_base(0);
+ if (fp_three_chain) fp_three_chain(0);
+ if (fp_three_mid) fp_three_mid(0);
+ if (fp_three_base) fp_three_base(0);
+ if (fp_three_final) fp_three_final(0);
+ if (fp_struct_chain) fp_struct_chain((struct recursive_struct_test *)0);
+ if (fp_struct_mid) fp_struct_mid((struct recursive_struct_test *)0);
+ if (fp_struct_orig) fp_struct_orig((struct recursive_struct_test *)0);
+}
+
+/* Recursive typedef canonicalization validation - MUST have identical type
+ IDs after full canonicalization. */
+
+/* u8 recursive chain - all should resolve to unsigned char. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_u8_recursive_chain\n\t\.set\t__kcfi_typeid_func_u8_recursive_chain, 0x14e69eb2} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_u8_recursive_mid\n\t\.set\t__kcfi_typeid_func_u8_recursive_mid, 0x14e69eb2} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_u8_recursive_base\n\t\.set\t__kcfi_typeid_func_u8_recursive_base, 0x14e69eb2} } } */
+
+/* u32 recursive chain - all should resolve to unsigned int. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_u32_recursive_chain\n\t\.set\t__kcfi_typeid_func_u32_recursive_chain, 0x60eb9384} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_u32_recursive_mid\n\t\.set\t__kcfi_typeid_func_u32_recursive_mid, 0x60eb9384} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_u32_recursive_base\n\t\.set\t__kcfi_typeid_func_u32_recursive_base, 0x60eb9384} } } */
+
+/* Three-level u8 recursive chain - all should resolve to unsigned char. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_three_level_recursive\n\t\.set\t__kcfi_typeid_func_three_level_recursive, 0x14e69eb2} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_three_level_mid\n\t\.set\t__kcfi_typeid_func_three_level_mid, 0x14e69eb2} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_three_level_base\n\t\.set\t__kcfi_typeid_func_three_level_base, 0x14e69eb2} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_three_level_final\n\t\.set\t__kcfi_typeid_func_three_level_final, 0x14e69eb2} } } */
+
+/* Struct recursive chain - all should resolve to same struct name. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_struct_recursive_chain\n\t\.set\t__kcfi_typeid_func_struct_recursive_chain, 0xf63dce36} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_struct_recursive_mid\n\t\.set\t__kcfi_typeid_func_struct_recursive_mid, 0xf63dce36} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_struct_recursive_original\n\t\.set\t__kcfi_typeid_func_struct_recursive_original, 0xf63dce36} } } */
+
+/* Update counts to include recursive typedef tests. */
+/* Note: u8/unsigned char recursive tests add 7 more occurrences (actual count: 9) */
+/* { dg-final { scan-assembler-times {0x14e69eb2} 9 } } */
+
+/* Note: u32/unsigned int recursive tests add 3 more occurrences (actual count: 6) */
+/* { dg-final { scan-assembler-times {0x60eb9384} 6 } } */
+
+/* Verify struct recursive typedef canonicalization types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFvP21recursive_struct_testE' typeid=0xf63dce36} kcfi0 } } */
+
+/* Struct recursive: 3 identical type IDs. */
+/* { dg-final { scan-assembler-times {0xf63dce36} 3 } } */
+
+/* VLA (Variable Length Array) mangling tests. */
+
+/* Basic VLA cases - all should decay to simple pointer types. */
+extern void func_vla_1d(int n, int arr[n]); /* _ZTSFviPiE -> 0x0d1f87aa */
+extern void func_vla_empty(int n, int arr[]); /* _ZTSFviPiE -> 0x0d1f87aa */
+extern void func_vla_ptr(int n, int *arr); /* _ZTSFviPiE -> 0x0d1f87aa */
+
+/* VLA 1D tests with precise patterns - all should be identical. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_vla_1d\n\t\.set\t__kcfi_typeid_func_vla_1d, 0x0d1f87aa} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_vla_empty\n\t\.set\t__kcfi_typeid_func_vla_empty, 0x0d1f87aa} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_vla_ptr\n\t\.set\t__kcfi_typeid_func_vla_ptr, 0x0d1f87aa} } } */
+
+/* Verify VLA 1D types. */
+/* { dg-final { scan-tree-dump {mangled='_ZTSFviPiE' typeid=0x0d1f87aa} kcfi0 } } */
+
+/* Count verification - VLA 1D types should appear 4 times in assembly
+   (3 from the declarations above, plus 1 from a local function preamble). */
+/* { dg-final { scan-assembler-times {0x0d1f87aa} 4 } } */
+
+/* 2D arrays with known dimension - VLA in first dimension, fixed in second. */
+extern void func_vla_2d_first(int n, int arr[n][10]); /* _ZTSFviPA10_iE -> 0x2cd9653d */
+extern void func_vla_2d_empty(int n, int arr[][10]); /* _ZTSFviPA10_iE -> 0x2cd9653d */
+extern void func_vla_2d_ptr(int n, int (*arr)[10]); /* _ZTSFviPA10_iE -> 0x2cd9653d */
+
+/* 2D VLA with fixed dimension tests with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_vla_2d_first\n\t\.set\t__kcfi_typeid_func_vla_2d_first, 0x2cd9653d} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_vla_2d_empty\n\t\.set\t__kcfi_typeid_func_vla_2d_empty, 0x2cd9653d} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_vla_2d_ptr\n\t\.set\t__kcfi_typeid_func_vla_2d_ptr, 0x2cd9653d} } } */
+
+/* Verify 2D VLA with fixed dimension types. */
+/* { dg-final { scan-ipa-dump {mangled='_ZTSFviPA10_iE' typeid=0x2cd9653d} ipa_kcfi } } */
+
+/* Count verification - 2D VLA with fixed dimension should appear exactly 3 times. */
+/* { dg-final { scan-assembler-times {0x2cd9653d} 3 } } */
+
+/* 2D VLA cases - both dimensions variable (Itanium ABI: variable dimension = empty) */
+extern void func_vla_2d_both(int rows, int cols, int arr[rows][cols]); /* _ZTSFviiPA_iE -> 0xc63cc57b */
+extern void func_vla_2d_second(int rows, int cols, int arr[][cols]); /* _ZTSFviiPA_iE -> 0xc63cc57b */
+extern void func_vla_2d_star(int rows, int cols, int arr[*][cols]); /* _ZTSFviiPA_iE -> 0xc63cc57b */
+
+/* 2D VLA with both dimensions variable tests with precise patterns. */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_vla_2d_both\n\t\.set\t__kcfi_typeid_func_vla_2d_both, 0xc63cc57b} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_vla_2d_second\n\t\.set\t__kcfi_typeid_func_vla_2d_second, 0xc63cc57b} } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_func_vla_2d_star\n\t\.set\t__kcfi_typeid_func_vla_2d_star, 0xc63cc57b} } } */
+
+/* Verify 2D VLA with both dimensions variable types. */
+/* { dg-final { scan-ipa-dump {mangled='_ZTSFviiPA_iE' typeid=0xc63cc57b} ipa_kcfi } } */
+
+/* Count verification - 2D VLA with both variable dimensions should appear exactly 3 times in assembly. */
+/* { dg-final { scan-assembler-times {0xc63cc57b} 3 } } */
+
+/* VLA test function to force mangling. */
+void test_vla_mangling_verification(void) {
+ void (*fp_vla_1d)(int, int*) = func_vla_1d;
+ void (*fp_vla_empty)(int, int*) = func_vla_empty;
+ void (*fp_vla_ptr)(int, int*) = func_vla_ptr;
+ void (*fp_vla_2d_first)(int, int(*)[10]) = func_vla_2d_first;
+ void (*fp_vla_2d_empty)(int, int(*)[10]) = func_vla_2d_empty;
+ void (*fp_vla_2d_ptr)(int, int(*)[10]) = func_vla_2d_ptr;
+
+ /* 2D VLA functions - take addresses to generate __kcfi_typeid_ symbols. */
+ volatile void *vla_p1 = func_vla_2d_both;
+ volatile void *vla_p2 = func_vla_2d_second;
+ volatile void *vla_p3 = func_vla_2d_star;
+ (void)vla_p1; (void)vla_p2; (void)vla_p3;
+
+ /* Variadic functions - take addresses and call through typed pointers to generate __kcfi_typeid_ symbols. */
+ void (*fp_variadic_simple)(const char *, ...) = func_variadic_simple;
+ void (*fp_variadic_mixed)(int, const char *, ...) = func_variadic_mixed;
+ void (*fp_variadic_multi)(int, char, const char *, ...) = func_variadic_multi;
+ void (*fp_audit_pattern)(struct audit_context *, unsigned int, int, const char *, ...) = audit_log_pattern;
+ void (*fp_non_variadic_simple)(const char *) = func_non_variadic_simple;
+ void (*fp_non_variadic_mixed)(int, const char *) = func_non_variadic_mixed;
+
+ /* Call through function pointers to trigger KCFI analysis. */
+ if (fp_variadic_simple) fp_variadic_simple("test");
+ if (fp_variadic_mixed) fp_variadic_mixed(1, "test");
+ if (fp_variadic_multi) fp_variadic_multi(1, 'x', "test");
+ if (fp_audit_pattern) fp_audit_pattern((struct audit_context *)0, 0, 1, "test");
+ if (fp_non_variadic_simple) fp_non_variadic_simple("test");
+ if (fp_non_variadic_mixed) fp_non_variadic_mixed(1, "test");
+
+ /* va_start regression test - ensures builtin functions are skipped in KCFI processing. */
+ test_va_start_regression(0.0f, "format", 42, 'x', "string");
+
+ /* Keep volatile assignments for backward compatibility. */
+ volatile void *variadic_p1 = func_variadic_simple;
+ volatile void *variadic_p2 = func_variadic_mixed;
+ volatile void *variadic_p3 = func_variadic_multi;
+ volatile void *audit_pattern_p = audit_log_pattern;
+ volatile void *non_variadic_p1 = func_non_variadic_simple;
+ volatile void *non_variadic_p2 = func_non_variadic_mixed;
+ (void)variadic_p1; (void)variadic_p2; (void)variadic_p3;
+ (void)audit_pattern_p;
+ (void)non_variadic_p1; (void)non_variadic_p2;
+}
+
+/* va_start regression test implementation - triggers __builtin_va_start usage. */
+void test_va_start_regression(float dummy, const char *fmt, ...) {
+ va_list args;
+ /* This previously caused crash due to __builtin_va_start processing. */
+ va_start(args, fmt);
+ /* Simple va_list usage to ensure the builtin call is generated. */
+ (void)args;
+ va_end(args);
+}
+
+/* Library builtin test - __builtin_memset resolves to memset and should
+ get KCFI type ID. */
+
+/* memset signature: void *memset(void *s, int c, size_t n)
+ - 64-bit targets: size_t is 'unsigned long' (m) -> _ZTSFPvPvimE -> 0x1d8c7ada
+ - 32-bit ARM: size_t is 'unsigned int' (j) -> _ZTSFPvPvijE -> 0xdd98e20d */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_memset\n\t\.set\t__kcfi_typeid_memset, 0x1d8c7ada} { target { ! arm*-*-* } } } } */
+/* { dg-final { scan-assembler {\t\.weak\t__kcfi_typeid_memset\n\t\.set\t__kcfi_typeid_memset, 0xdd98e20d} { target arm32 } } } */
+/* { dg-final { scan-ipa-dump {mangled='_ZTSFPvPvimE' typeid=0x1d8c7ada} ipa_kcfi { target { ! arm*-*-* } } } } */
+/* { dg-final { scan-ipa-dump {mangled='_ZTSFPvPvijE' typeid=0xdd98e20d} ipa_kcfi { target arm32 } } } */
+/* { dg-final { scan-assembler-times {0x1d8c7ada} 1 { target { ! arm*-*-* } } } } */
+/* { dg-final { scan-assembler-times {0xdd98e20d} 1 { target arm32 } } } */
+
+void test_builtin_memset_indirect(void) {
+ char buffer[64];
+ /* Force indirect call through function pointer to test KCFI validation.
+ __builtin_memset resolves to regular memset which should get a type ID. */
+ void *(*memset_ptr)(void *, int, __SIZE_TYPE__) = __builtin_memset;
+ volatile void *result = memset_ptr(buffer, 0, sizeof(buffer));
+ (void)result; /* Prevent optimization. */
+}
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi.exp b/gcc/testsuite/gcc.dg/kcfi/kcfi.exp
new file mode 100644
index 000000000000..2aebcbe1c01b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi.exp
@@ -0,0 +1,36 @@
+# Copyright (C) 2025 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3. If not see
+# <http://www.gnu.org/licenses/>.
+
+# GCC testsuite for KCFI (Kernel Control Flow Integrity) tests.
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+ set DEFAULT_CFLAGS ""
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] \
+ "" $DEFAULT_CFLAGS
+
+# All done.
+dg-finish
--
2.34.1
* Re: [PATCH v2 7/7] kcfi: Add regression test suite
2025-09-05 0:24 ` [PATCH v2 7/7] kcfi: Add regression test suite Kees Cook
@ 2025-09-05 7:06 ` Jakub Jelinek
2025-09-05 17:15 ` Kees Cook
0 siblings, 1 reply; 32+ messages in thread
From: Jakub Jelinek @ 2025-09-05 7:06 UTC (permalink / raw)
To: Kees Cook
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
On Thu, Sep 04, 2025 at 05:24:15PM -0700, Kees Cook wrote:
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
> @@ -0,0 +1,73 @@
> +/* Test KCFI check/transfer adjacency - regression test for instruction
> + insertion. */
> +/* { dg-do compile } */
> +/* { dg-options "-fsanitize=kcfi -O2" } */
> +/* { dg-options "-fsanitize=kcfi -O2 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
For stuff like this you should be using dg-additional-options.
/* { dg-options "-fsanitize=kcfi -O2" } */
/* { dg-additional-options "-march=armv7-a -mfloat-abi=soft" { target arm32 } } */
(in various other tests too).
> +/* Should have KCFI instrumentation for all indirect calls. */
> +
> +/* x86_64: Complete KCFI check sequence should be present. */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r1[01]d\n\taddl\t[^,]+, %r1[01]d\n\tje\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tud2} { target x86_64-*-* } } } */
This at least needs
/* { dg-additional-options "-masm=att" { target x86_64-*-* } } */
because Intel syntax wouldn't match. Does this match with all possible
-march/-mtune settings?
People very often do test
make check RUNTESTFLAGS='--target_board=unix/-march=skylake-avx512'
etc., so if the test depends on a particular ISA or tuning, better
add it explicitly to dg-options.
Also, we try not to use triplets like x86_64-*-* but instead
{ i?86-*-* x86_64-*-* } && lp64
or
{ i?86-*-* x86_64-*-* } && { ! ia32 }
depending on whether it is only for -m64, or for both -m64 and -mx32,
because on some targets the multilib compiler is i?86-*-* defaulting
to -m32, on most obviously x86_64-*-* defaulting to -m64.
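To make the two selector forms concrete, here is a sketch (the symbol name is a placeholder borrowed from this series; the patterns are illustrative, not from the actual tests):

```c
/* Matches -m64 only (lp64 excludes both -m32 and -mx32):  */
/* { dg-final { scan-assembler {__kcfi_typeid_test_function} { target { { i?86-*-* x86_64-*-* } && lp64 } } } } */

/* Matches -m64 and -mx32, but not -m32:  */
/* { dg-final { scan-assembler {__kcfi_typeid_test_function} { target { { i?86-*-* x86_64-*-* } && { ! ia32 } } } } } */
```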
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
> @@ -0,0 +1,101 @@
> +/* Test basic KCFI functionality - preamble generation. */
> +/* { dg-do compile } */
> +/* { dg-options "-fsanitize=kcfi" } */
> +/* { dg-options "-fsanitize=kcfi -falign-functions=16" { target x86_64-*-* } } */
> +/* { dg-options "-fsanitize=kcfi -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
Again (and in many others).
> +/* x86_64: Should have 0 entry NOPs - function starts immediately with
> + pushq. */
> +/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*pushq\t*%rbp} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target x86_64-*-* } } } */
.weak is ELF specific, not all targets have it, are the tests restricted to
targets that do support it and in this syntax? We have
/* { dg-require-weak "" } */
but that doesn't imply a particular function.
Also, not all configurations will support .cfi_* directives, that depends
both on command line parameters and on whether assembler supports those.
If you expect them in all tests, perhaps you should test for those in
kcfi.exp and not run the tests at all if the directives aren't supported
(or if weak isn't supported etc.).
Also, there are targets with different line endings, so usually one scans
for [\n\r]* instead of just \n. No idea why you're using \t*, the compiler
emits just one tab.
Jakub
* Re: [PATCH v2 7/7] kcfi: Add regression test suite
2025-09-05 7:06 ` Jakub Jelinek
@ 2025-09-05 17:15 ` Kees Cook
0 siblings, 0 replies; 32+ messages in thread
From: Kees Cook @ 2025-09-05 17:15 UTC (permalink / raw)
To: Jakub Jelinek
Cc: Qing Zhao, Andrew Pinski, Richard Biener, Joseph Myers,
Jan Hubicka, Richard Earnshaw, Richard Sandiford,
Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
Andrew Waterman, Jim Wilson, Peter Zijlstra, Dan Li,
Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
Bill Wendling, gcc-patches, linux-hardening
On Fri, Sep 05, 2025 at 09:06:41AM +0200, Jakub Jelinek wrote:
> On Thu, Sep 04, 2025 at 05:24:15PM -0700, Kees Cook wrote:
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
> > @@ -0,0 +1,73 @@
> > +/* Test KCFI check/transfer adjacency - regression test for instruction
> > + insertion. */
> > +/* { dg-do compile } */
> > +/* { dg-options "-fsanitize=kcfi -O2" } */
> > +/* { dg-options "-fsanitize=kcfi -O2 -march=armv7-a -mfloat-abi=soft" { target arm32 } } */
>
> For stuff like this you should be using dg-additional-options.
> /* { dg-options "-fsanitize=kcfi -O2" } */
> /* { dg-additional-options "-march=armv7-a -mfloat-abi=soft" { target arm32 } } */
> (in various other tests too).
Ah, perfect; thanks!
> > +/* Should have KCFI instrumentation for all indirect calls. */
> > +
> > +/* x86_64: Complete KCFI check sequence should be present. */
> > +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r1[01]d\n\taddl\t[^,]+, %r1[01]d\n\tje\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tud2} { target x86_64-*-* } } } */
>
> This at least needs
> /* { dg-additional-options "-masm=att" { target x86_64-*-* } } */
> because Intel syntax wouldn't match.
Ah, okay. Is the test suite ever run with -masm != att?
> Does this match with all possible -march/-mtune settings?
I was just running this with "default" state. I didn't think there was
value in testing all the combinations -- all the sequence tests are
basically validating that nothing surprising happened during emission,
etc. What's the best practice for this? Should I add specific
-march/-mtune options for each arch?
> People very often do test
> make check RUNTESTFLAGS='--target_board=unix/-march=skylake-avx512'
> etc. so if the test depends on a particular ISA or tuning, better
> add it explicitly to dg-options.
How does that end up meshing? i.e. if I have -mtune=generic in
dg-options, but someone runs with a different -mtune, what happens?
> Also, we try not to use triplets like x86_64-*-* but instead
> { i?86-*-* x86_64-*-* } && lp64
> or
> { i?86-*-* x86_64-*-* } && { ! ia32 }
> depending on whether it is only for -m64, or for both -m64 and -mx32,
> because on some targets the multilib compiler is i?86-*-* defaulting
> to -m32, on most obviously x86_64-*-* defaulting to -m64.
Okay, sounds good. I'll update all of these (for this we only care about
64-bit x86). Out of curiosity what triple matches i?86-*-* and lp64? I
thought x86_64 was sufficient here.
(Though I suddenly realize I think I have nothing in the KCFI patches
that rejects working under -m32 ... I only do careful target checks
under arm.)
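So if I follow, the x86-only scans above would end up with a selector
along these lines (sketching from your suggestion; I haven't verified
the exact brace nesting yet):
/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r1[01]d} { target { { i?86-*-* x86_64-*-* } && { ! ia32 } } } } } */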
> > +/* x86_64: Should have 0 entry NOPs - function starts immediately with
> > + pushq. */
> > +/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*pushq\t*%rbp} { target x86_64-*-* } } } */
> > +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target x86_64-*-* } } } */
>
> .weak is ELF specific, not all targets have it, are the tests restricted to
> targets that do support it and in this syntax? We have
> /* { dg-require-weak "" } */
> but that doesn't imply a particular function.
Oh, er, this is just for ELF targets. Is there a way to globally
restrict all of these tests to just the 4 arch combos? I'm suspecting
now that these tests will all universally fail for the archs that don't
support -fsanitize=kcfi. I thought dg-options was handling filtering
this, but maybe I've misunderstood?
I'm guessing I need to declare an effective-target keyword like "lp64",
or do what I think I saw asan doing for this feature?
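I.e., something like a check_effective_target_* proc following the
target-supports.exp convention, but probing for kcfi support (a rough
sketch -- the proc name, the check_no_compiler_messages flags argument,
and the overall shape are my guesses at the convention, untested):
proc check_effective_target_kcfi { } {
    # Succeeds only if -fsanitize=kcfi compiles without diagnostics.
    return [check_no_compiler_messages kcfi object {
	int main (void) { return 0; }
    } "-fsanitize=kcfi"]
}
Then each test could start with
/* { dg-require-effective-target kcfi } */
instead of relying on dg-options to filter targets.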
> Also, not all configurations will support .cfi_* directives, that depends
> both on command line parameters and on whether assembler supports those.
> If you expect them in all tests, perhaps you should test for those in
> kcfi.exp and not run the tests at all if the directives aren't supported
> (or if weak isn't supported etc.).
Yeah, this sounds like the place I need to limit the tests from?
Everything I know about dg I've learned in the last month. :P
Studying this some more, it looks like some .exp files use "istarget". I
found, e.g.:
if { [istarget nvptx-*-*] } {
return
}
So maybe I need that as a top-level filter in kcfi.exp:
if { ![istarget arm*-*-*] && ![istarget x86_64-*-*] && ... } {
unsupported "KCFI tests not supported on this target"
return
}
?
I will build some 5th target and see what happens when I run these
tests. :P
> Also, there are targets with different line endings, so usually one scans
> for [\n\r]* instead of just \n.
Okay -- I'm expecting this will go away once I limit to just the 4
targets I want. Or do you want me to universally update the \n patterns
to [\n\r]*?
> No idea why you're using \t*, the compiler emits just one tab.
Ah, I'm not sure where that came from (I will fix it). There has been
a lot of automation on my end to get all these patterns converted from
.s output into dg patterns.
Thanks for looking this over!
-Kees
--
Kees Cook