linux-hardening.vger.kernel.org archive mirror
* [PATCH v3 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048]
@ 2025-09-13 23:23 Kees Cook
  2025-09-13 23:23 ` [PATCH v3 1/7] typeinfo: Introduce KCFI typeinfo mangling API Kees Cook
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Kees Cook @ 2025-09-13 23:23 UTC
  To: Qing Zhao
  Cc: Kees Cook, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

Hi!

Here is v3, which has continued to evolve a lot from v2[1].

This series implements[2][3] the Linux Kernel Control Flow Integrity
ABI, which provides function-prototype-based forward-edge control flow
integrity protection. Every indirect call is instrumented to check a
hash value stored immediately before the target function's address. If
the hash at the call site and the hash at the target do not match,
execution will trap.
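
As a rough illustration of the check described above, here is a portable
C model. It is only a sketch: the real instrumentation is emitted by the
compiler as target-specific instructions, and the struct, the hash
constants, and kcfi_checked_call() are hypothetical names made up for
this example. The model returns -1 where real KCFI would trap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type hashes; real values are computed by the compiler
   from the function prototype.  */
#define HASH_VOID_INT  0x11111111u
#define HASH_VOID_LONG 0x22222222u

typedef void (*handler_t) (int);

/* Models a function whose preamble stores a type hash just before
   its entry point.  */
struct kcfi_target
{
  uint32_t type_hash;
  handler_t entry;
};

static int last_seen;

static void
handle_int (int v)
{
  last_seen = v;
}

/* Compare the hash expected at the call site with the hash stored at
   the target.  Returns 0 after making the call, or -1 in the case
   where real KCFI would trap.  */
static int
kcfi_checked_call (const struct kcfi_target *t, uint32_t expected, int arg)
{
  if (t == NULL || t->type_hash != expected)
    return -1;
  assert (t->entry != NULL);
  t->entry (arg);
  return 0;
}
```

A target carrying HASH_VOID_INT is called normally through
kcfi_checked_call(); a target carrying HASH_VOID_LONG is rejected
before its entry point is reached.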

Changes since v2:

- Refactored mangling to provide actual builtins, making it SO much
  easier to test. This is good not just for KCFI but also for coming
  type-aware allocators that need to have a stable value (32-bit
  hash) to represent C types.

- Consolidated DECL vs TYPE attributes for KCFI type_id, allowing
  for the removal of all the GIMPLE type wrapping and the GIMPLE
  passes entirely.

- Tightened testsuite to be much more target and option aware.

- Support nocf_check to disable preamble generation.

- Passes contrib/check_GNU_style.py (with some clear exceptions).

- Added more documentation.

- General cleanups and comment clarifications.

Thanks!

-Kees

[1] https://lore.kernel.org/linux-hardening/20250905001157.it.269-kees@kernel.org/
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107048
[3] https://github.com/KSPP/linux/issues/369

Kees Cook (7):
  typeinfo: Introduce KCFI typeinfo mangling API
  kcfi: Add core Kernel Control Flow Integrity infrastructure
  x86: Add x86_64 Kernel Control Flow Integrity implementation
  aarch64: Add AArch64 Kernel Control Flow Integrity implementation
  arm: Add ARM 32-bit Kernel Control Flow Integrity implementation
  riscv: Add RISC-V Kernel Control Flow Integrity implementation
  kcfi: Add regression test suite

 gcc/kcfi.h                                    |  52 ++
 gcc/kcfi.cc                                   | 601 ++++++++++++++++++
 gcc/Makefile.in                               |   2 +
 gcc/c-family/c-common.h                       |   1 +
 gcc/config/aarch64/aarch64-protos.h           |   5 +
 gcc/config/arm/arm-protos.h                   |   4 +
 gcc/config/i386/i386-protos.h                 |   1 +
 gcc/config/i386/i386.h                        |   3 +-
 gcc/config/riscv/riscv-protos.h               |   3 +
 gcc/flag-types.h                              |   2 +
 gcc/gimple.h                                  |  22 +
 gcc/kcfi-typeinfo.h                           |  32 +
 gcc/tree-pass.h                               |   1 +
 .../gcc.dg/builtin-typeinfo-errors.c          |  28 +
 gcc/testsuite/gcc.dg/builtin-typeinfo.c       | 350 ++++++++++
 gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c    |  72 +++
 gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c       | 108 ++++
 gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c |  84 +++
 .../gcc.dg/kcfi/kcfi-cold-partition.c         | 136 ++++
 .../gcc.dg/kcfi/kcfi-complex-addressing.c     | 135 ++++
 .../gcc.dg/kcfi/kcfi-ipa-robustness.c         |  54 ++
 .../gcc.dg/kcfi/kcfi-move-preservation.c      |  55 ++
 .../gcc.dg/kcfi/kcfi-no-sanitize-inline.c     | 100 +++
 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c  |  39 ++
 .../gcc.dg/kcfi/kcfi-offset-validation.c      |  48 ++
 .../gcc.dg/kcfi/kcfi-patchable-basic.c        |  70 ++
 .../gcc.dg/kcfi/kcfi-patchable-entry-only.c   |  62 ++
 .../gcc.dg/kcfi/kcfi-patchable-large.c        |  51 ++
 .../gcc.dg/kcfi/kcfi-patchable-medium.c       |  60 ++
 .../gcc.dg/kcfi/kcfi-patchable-prefix-only.c  |  60 ++
 .../gcc.dg/kcfi/kcfi-pic-addressing.c         | 104 +++
 .../gcc.dg/kcfi/kcfi-retpoline-r11.c          |  50 ++
 gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c      | 151 +++++
 gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c   | 142 +++++
 .../gcc.dg/kcfi/kcfi-trap-encoding.c          |  54 ++
 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c |  41 ++
 gcc/c-family/c-attribs.cc                     |  17 +-
 gcc/c-family/c-common.cc                      |   2 +
 gcc/c/c-parser.cc                             |  72 +++
 gcc/config/aarch64/aarch64.cc                 | 116 ++++
 gcc/config/aarch64/aarch64.md                 |  64 +-
 gcc/config/arm/arm.cc                         | 146 +++++
 gcc/config/arm/arm.md                         |  62 ++
 gcc/config/i386/i386-expand.cc                |  22 +-
 gcc/config/i386/i386.cc                       | 130 ++++
 gcc/config/i386/i386.md                       |  62 +-
 gcc/config/riscv/riscv.cc                     | 159 +++++
 gcc/config/riscv/riscv.md                     |  76 ++-
 gcc/df-scan.cc                                |   7 +
 gcc/doc/extend.texi                           | 132 ++++
 gcc/doc/invoke.texi                           | 100 +++
 gcc/doc/tm.texi                               |  31 +
 gcc/doc/tm.texi.in                            |  12 +
 gcc/final.cc                                  |   3 +
 gcc/kcfi-typeinfo.cc                          | 475 ++++++++++++++
 gcc/opts.cc                                   |   1 +
 gcc/passes.cc                                 |   1 +
 gcc/passes.def                                |   1 +
 gcc/rtl.def                                   |   6 +
 gcc/rtlanal.cc                                |   5 +
 gcc/target.def                                |  38 ++
 gcc/testsuite/gcc.dg/kcfi/kcfi.exp            |  64 ++
 gcc/toplev.cc                                 |  10 +
 gcc/tree-inline.cc                            |  10 +
 gcc/varasm.cc                                 |  37 +-
 65 files changed, 4611 insertions(+), 33 deletions(-)
 create mode 100644 gcc/kcfi.h
 create mode 100644 gcc/kcfi.cc
 create mode 100644 gcc/kcfi-typeinfo.h
 create mode 100644 gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-typeinfo.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
 create mode 100644 gcc/kcfi-typeinfo.cc
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi.exp

-- 
2.34.1



* [PATCH v3 1/7] typeinfo: Introduce KCFI typeinfo mangling API
  2025-09-13 23:23 [PATCH v3 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
@ 2025-09-13 23:23 ` Kees Cook
  2025-09-17 17:56   ` Qing Zhao
  2025-09-13 23:23 ` [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure Kees Cook
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2025-09-13 23:23 UTC
  To: Qing Zhao
  Cc: Kees Cook, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

To support the KCFI typeid and future type-based allocators, which need
to convert unique types into unique 32-bit values, add a mangling system
based on the Itanium C++ mangling ABI, adapted for C types. Introduce
__builtin_typeinfo_hash for the hash, and __builtin_typeinfo_name for
testing and debugging (to see the human-readable mangling form). Add
tests for typeinfo validation and error handling.
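
To make the mangled form concrete, here is a small standalone sketch
that assembles the function-type strings expected by the new test
(for example "_ZTSFviPKcE" for void (int, const char *)). It is not the
implementation in kcfi-typeinfo.cc: mangle_function() and append_str()
are hypothetical helpers, and the caller passes already-encoded
component strings such as "i" or "PKc" rather than real types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append S to OUT, keeping it NUL-terminated.  Returns -1 if the
   result would not fit in OUTSZ bytes.  */
static int
append_str (char *out, size_t outsz, size_t *used, const char *s)
{
  size_t len = strlen (s);
  if (*used + len + 1 > outsz)
    return -1;
  memcpy (out + *used, s, len + 1);
  *used += len;
  return 0;
}

/* Build "_ZTSF<ret><params>E".  An empty parameter list is encoded
   as "v", and a trailing "..." as "z".  Returns 0 on success, -1 if
   OUT is too small.  */
static int
mangle_function (char *out, size_t outsz, const char *ret,
		 const char *const *params, size_t nparams, int variadic)
{
  size_t used = 0;
  assert (out != NULL && ret != NULL);
  if (append_str (out, outsz, &used, "_ZTSF") != 0
      || append_str (out, outsz, &used, ret) != 0)
    return -1;
  if (nparams == 0 && !variadic
      && append_str (out, outsz, &used, "v") != 0)
    return -1;
  for (size_t i = 0; i < nparams; i++)
    if (append_str (out, outsz, &used, params[i]) != 0)
      return -1;
  if (variadic && append_str (out, outsz, &used, "z") != 0)
    return -1;
  return append_str (out, outsz, &used, "E");
}
```

Passing "v" as the return code with parameters { "PKc" } and the
variadic flag set yields "_ZTSFvPKczE", matching the
func_variadic_simple case in builtin-typeinfo.c.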

gcc/ChangeLog:

	* Makefile.in: Add kcfi-typeinfo.o.
	* doc/extend.texi: Document typeinfo builtins.
	* kcfi-typeinfo.h: New file, typeinfo mangling API.
	* kcfi-typeinfo.cc: New file, implement typeinfo mangling.

gcc/c-family/ChangeLog:

	* c-common.h (enum rid): Add typeinfo builtins.
	* c-common.cc: Add typeinfo builtins.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_get_builtin_type_arg): New function,
	parse type.
	(c_parser_postfix_expression): Add typeinfo builtins.

gcc/testsuite/ChangeLog:

	* gcc.dg/builtin-typeinfo-errors.c: New test, validate bad
	arguments are rejected.
	* gcc.dg/builtin-typeinfo.c: New test, typeinfo mangling.

Signed-off-by: Kees Cook <kees@kernel.org>
---
 gcc/Makefile.in                               |   1 +
 gcc/c-family/c-common.h                       |   1 +
 gcc/kcfi-typeinfo.h                           |  32 ++
 .../gcc.dg/builtin-typeinfo-errors.c          |  28 ++
 gcc/testsuite/gcc.dg/builtin-typeinfo.c       | 350 +++++++++++++
 gcc/c-family/c-common.cc                      |   2 +
 gcc/c/c-parser.cc                             |  72 +++
 gcc/doc/extend.texi                           |  94 ++++
 gcc/kcfi-typeinfo.cc                          | 475 ++++++++++++++++++
 9 files changed, 1055 insertions(+)
 create mode 100644 gcc/kcfi-typeinfo.h
 create mode 100644 gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-typeinfo.c
 create mode 100644 gcc/kcfi-typeinfo.cc

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index d2744db843d7..a14fb498ce44 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1591,6 +1591,7 @@ OBJS = \
 	ira-emit.o \
 	ira-lives.o \
 	jump.o \
+	kcfi-typeinfo.o \
 	langhooks.o \
 	late-combine.o \
 	lcm.o \
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index b6021d241731..e0100837946e 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -112,6 +112,7 @@ enum rid
   RID_BUILTIN_SHUFFLEVECTOR,   RID_BUILTIN_CONVERTVECTOR,  RID_BUILTIN_TGMATH,
   RID_BUILTIN_HAS_ATTRIBUTE,   RID_BUILTIN_ASSOC_BARRIER,  RID_BUILTIN_STDC,
   RID_BUILTIN_COUNTED_BY_REF,
+  RID_BUILTIN_TYPEINFO_NAME,  RID_BUILTIN_TYPEINFO_HASH,
   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128, RID_DFLOAT64X,
 
   /* TS 18661-3 keywords, in the same sequence as the TI_* values.  */
diff --git a/gcc/kcfi-typeinfo.h b/gcc/kcfi-typeinfo.h
new file mode 100644
index 000000000000..805f9ebaeca4
--- /dev/null
+++ b/gcc/kcfi-typeinfo.h
@@ -0,0 +1,32 @@
+/* KCFI-compatible type mangling, based on Itanium C++ ABI.
+   Copyright (C) 2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_KCFI_TYPEINFO_H
+#define GCC_KCFI_TYPEINFO_H
+
+#include "tree.h"
+#include <string>
+
+/* Get the typeinfo mangled name string for any C type.  */
+extern std::string typeinfo_get_name (tree type);
+
+/* Get the typeinfo hash for any C type.  */
+extern uint32_t typeinfo_get_hash (tree type);
+
+#endif /* GCC_KCFI_TYPEINFO_H */
diff --git a/gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c b/gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c
new file mode 100644
index 000000000000..71ad01337b4e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c
@@ -0,0 +1,28 @@
+/* Test error handling for __builtin_typeinfo_name and __builtin_typeinfo_hash.  */
+/* { dg-do compile } */
+
+int main() {
+    /* Test missing arguments */
+    const char *result1 = __builtin_typeinfo_name(); /* { dg-error "expected specifier-qualifier-list before '\\)'" } */
+    /* { dg-error "expected type name in '__builtin_typeinfo_name'" "" { target *-*-* } .-1 } */
+    unsigned int result2 = __builtin_typeinfo_hash(); /* { dg-error "expected specifier-qualifier-list before '\\)'" } */
+    /* { dg-error "expected type name in '__builtin_typeinfo_hash'" "" { target *-*-* } .-1 } */
+
+    /* Test wrong argument types (expressions instead of type names) */
+    const char *result3 = __builtin_typeinfo_name(42); /* { dg-error "expected specifier-qualifier-list before numeric constant" } */
+    /* { dg-error "expected type name in '__builtin_typeinfo_name'" "" { target *-*-* } .-1 } */
+    unsigned int result4 = __builtin_typeinfo_hash(42); /* { dg-error "expected specifier-qualifier-list before numeric constant" } */
+    /* { dg-error "expected type name in '__builtin_typeinfo_hash'" "" { target *-*-* } .-1 } */
+
+    int x = 5;
+    const char *result5 = __builtin_typeinfo_name(x); /* { dg-error "expected specifier-qualifier-list before" } */
+    /* { dg-error "expected type name in '__builtin_typeinfo_name'" "" { target *-*-* } .-1 } */
+    unsigned int result6 = __builtin_typeinfo_hash(x); /* { dg-error "expected specifier-qualifier-list before" } */
+    /* { dg-error "expected type name in '__builtin_typeinfo_hash'" "" { target *-*-* } .-1 } */
+
+    /* Test too many arguments */
+    const char *result7 = __builtin_typeinfo_name(int, int); /* { dg-error "expected '\\)' before ','" } */
+    unsigned int result8 = __builtin_typeinfo_hash(int, int); /* { dg-error "expected '\\)' before ','" } */
+
+    return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/builtin-typeinfo.c b/gcc/testsuite/gcc.dg/builtin-typeinfo.c
new file mode 100644
index 000000000000..744dc50f407e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-typeinfo.c
@@ -0,0 +1,350 @@
+/* Test KCFI type mangling using __builtin_typeinfo_name.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu99" } */
+
+#include <stdio.h>
+#include <string.h>
+#include <stdarg.h>
+
+int pass, fail;
+
+#define TEST_STRING(expr, expected_string) \
+  do { \
+    const char *actual_string = __builtin_typeinfo_name(typeof(expr)); \
+    printf("Testing %s: ", #expr); \
+    if (strcmp(actual_string, expected_string) == 0) { \
+      printf("PASS (%s)\n", actual_string); \
+      pass ++; \
+    } else { \
+      printf("FAIL\n"); \
+      printf("  Expected: %s\n", expected_string); \
+      printf("  Actual:   %s\n", actual_string); \
+      fail ++; \
+    } \
+  } while (0)
+
+int main(void)
+{
+    printf("Testing KCFI Typeinfo Mangling\n");
+    printf("======================================================\n");
+
+    /* Test basic types */
+    TEST_STRING(void, "_ZTSv");
+    TEST_STRING(char, "_ZTSc");
+    TEST_STRING(int, "_ZTSi");
+    TEST_STRING(short, "_ZTSs");
+    TEST_STRING(long, "_ZTSl");
+    TEST_STRING(float, "_ZTSf");
+    TEST_STRING(double, "_ZTSd");
+
+    /* Test qualified types */
+    TEST_STRING(const int, "_ZTSKi");
+    TEST_STRING(volatile int, "_ZTSVi");
+
+    /* Test pointer types */
+    TEST_STRING(char*, "_ZTSPc");
+    TEST_STRING(int*, "_ZTSPi");
+    TEST_STRING(void*, "_ZTSPv");
+    TEST_STRING(const char*, "_ZTSPKc");
+
+    /* Test array types */
+    TEST_STRING(int[10],  "_ZTSA10_i");
+    TEST_STRING(char[20], "_ZTSA20_c");
+    TEST_STRING(short[],  "_ZTSA_s");
+
+    /* Test basic function types */
+    extern void func_void(void);
+    extern void func_char(char x);
+    extern void func_short(short x);
+    extern void func_int(int x);
+    extern void func_long(long x);
+    TEST_STRING(func_void,  "_ZTSFvvE");
+    TEST_STRING(func_char,  "_ZTSFvcE");
+    TEST_STRING(func_short, "_ZTSFvsE");
+    TEST_STRING(func_int,   "_ZTSFviE");
+    TEST_STRING(func_long,  "_ZTSFvlE");
+
+    /* Test functions with unsigned types */
+    extern void func_unsigned_char(unsigned char x);
+    extern void func_unsigned_short(unsigned short x);
+    extern void func_unsigned_int(unsigned int x);
+    TEST_STRING(func_unsigned_char,  "_ZTSFvhE");
+    TEST_STRING(func_unsigned_short, "_ZTSFvtE");
+    TEST_STRING(func_unsigned_int,   "_ZTSFvjE");
+
+    /* Test functions with signed types */
+    extern void func_signed_char(signed char x);
+    extern void func_signed_short(signed short x);
+    extern void func_signed_int(signed int x);
+    TEST_STRING(func_signed_char,  "_ZTSFvaE");
+    TEST_STRING(func_signed_short, "_ZTSFvsE");
+    TEST_STRING(func_signed_int,   "_ZTSFviE");
+
+    /* Test functions with pointer types */
+    extern void func_void_ptr(void *x);
+    extern void func_char_ptr(char *x);
+    extern void func_short_ptr(short *x);
+    extern void func_int_ptr(int *x);
+    extern void func_int_array(int arr[]); /* Decays to "int *".  */
+    extern void func_long_ptr(long *x);
+    TEST_STRING(func_void_ptr,  "_ZTSFvPvE");
+    TEST_STRING(func_char_ptr,  "_ZTSFvPcE");
+    TEST_STRING(func_short_ptr, "_ZTSFvPsE");
+    TEST_STRING(func_int_ptr,   "_ZTSFvPiE");
+    TEST_STRING(func_int_array, "_ZTSFvPiE");
+    TEST_STRING(func_long_ptr,  "_ZTSFvPlE");
+
+    /* Test functions with const qualifiers */
+    extern void func_const_void_ptr(const void *x);
+    extern void func_const_char_ptr(const char *x);
+    extern void func_const_short_ptr(const short *x);
+    extern void func_const_int_ptr(const int *x);
+    extern void func_const_long_ptr(const long *x);
+    TEST_STRING(func_const_void_ptr,  "_ZTSFvPKvE");
+    TEST_STRING(func_const_char_ptr,  "_ZTSFvPKcE");
+    TEST_STRING(func_const_short_ptr, "_ZTSFvPKsE");
+    TEST_STRING(func_const_int_ptr,   "_ZTSFvPKiE");
+    TEST_STRING(func_const_long_ptr,  "_ZTSFvPKlE");
+
+    /* Test nested pointers */
+    extern void func_int_ptr_ptr(int **x);
+    extern void func_char_ptr_ptr(char **x);
+    TEST_STRING(func_int_ptr_ptr,  "_ZTSFvPPiE");
+    TEST_STRING(func_char_ptr_ptr, "_ZTSFvPPcE");
+
+    /* Test multiple parameters */
+    extern void func_int_char(int x, char y);
+    extern void func_char_int(char x, int y);
+    extern void func_two_int(int x, int y);
+    TEST_STRING(func_int_char, "_ZTSFvicE");
+    TEST_STRING(func_char_int, "_ZTSFvciE");
+    TEST_STRING(func_two_int,  "_ZTSFviiE");
+
+    /* Test return types */
+    extern int func_return_int(void);
+    extern char func_return_char(void);
+    extern void* func_return_ptr(void);
+    TEST_STRING(func_return_int,  "_ZTSFivE");
+    TEST_STRING(func_return_char, "_ZTSFcvE");
+    TEST_STRING(func_return_ptr,  "_ZTSFPvvE");
+
+    /* Test function pointer parameters */
+    extern void func_fptr_void(void (*fp)(void));
+    extern void func_fptr_int(void (*fp)(int));
+    extern void func_fptr_ret_int(int (*fp)(void));
+    TEST_STRING(func_fptr_void,    "_ZTSFvPFvvEE");
+    TEST_STRING(func_fptr_int,     "_ZTSFvPFviEE");
+    TEST_STRING(func_fptr_ret_int, "_ZTSFvPFivEE");
+
+    /* Test variadic functions */
+    struct audit_context { int dummy; };
+    extern void func_variadic_simple(const char *fmt, ...);
+    extern void func_variadic_mixed(int x, const char *fmt, ...);
+    extern void func_variadic_multi(int x, char y, const char *fmt, ...);
+    extern void audit_log_pattern(struct audit_context *ctx, unsigned int gfp_mask,
+				  int type, const char *fmt, ...);
+    TEST_STRING(func_variadic_simple, "_ZTSFvPKczE");
+    TEST_STRING(func_variadic_mixed,  "_ZTSFviPKczE");
+    TEST_STRING(func_variadic_multi,  "_ZTSFvicPKczE");
+    TEST_STRING(audit_log_pattern,    "_ZTSFvP13audit_contextjiPKczE");
+
+    /* Test mixed const/non-const */
+    extern void func_const_mixed(int x, const char *fmt);
+    TEST_STRING(func_const_mixed,  "_ZTSFviPKcE");
+
+    /* Test named struct types */
+    struct test_struct_a { int x; };
+    struct test_struct_b { char y; };
+    struct test_struct_c { void *ptr; };
+    TEST_STRING(struct test_struct_a, "_ZTS13test_struct_a");
+    extern void func_struct_a_ptr(struct test_struct_a *x);
+    extern void func_struct_b_ptr(struct test_struct_b *x);
+    extern void func_struct_c_ptr(struct test_struct_c *x);
+    TEST_STRING(func_struct_a_ptr, "_ZTSFvP13test_struct_aE");
+    TEST_STRING(func_struct_b_ptr, "_ZTSFvP13test_struct_bE");
+    TEST_STRING(func_struct_c_ptr, "_ZTSFvP13test_struct_cE");
+
+    /* Test const named struct types */
+    extern void func_const_struct_a_ptr(const struct test_struct_a *x);
+    extern void func_const_struct_b_ptr(const struct test_struct_b *x);
+    extern void func_const_struct_c_ptr(const struct test_struct_c *x);
+    TEST_STRING(func_const_struct_a_ptr, "_ZTSFvPK13test_struct_aE");
+    TEST_STRING(func_const_struct_b_ptr, "_ZTSFvPK13test_struct_bE");
+    TEST_STRING(func_const_struct_c_ptr, "_ZTSFvPK13test_struct_cE");
+
+    /* Test named union types */
+    union test_union_a { int x; float y; };
+    union test_union_b { char a; void *b; };
+    TEST_STRING(union test_union_a,  "_ZTS12test_union_a");
+    extern void func_union_a_ptr(union test_union_a *x);
+    extern void func_union_b_ptr(union test_union_b *x);
+    TEST_STRING(func_union_a_ptr, "_ZTSFvP12test_union_aE");
+    TEST_STRING(func_union_b_ptr, "_ZTSFvP12test_union_bE");
+
+    /* Test enum types: distinct from int */
+    enum test_enum_a { ENUM_A_VAL };
+    enum test_enum_b { ENUM_B_VAL };
+    TEST_STRING(enum test_enum_a, "_ZTS11test_enum_a");
+    extern void func_enum_a_ptr(enum test_enum_a *x);
+    extern void func_enum_b_ptr(enum test_enum_b *x);
+    TEST_STRING(func_enum_a_ptr, "_ZTSFvP11test_enum_aE");
+    TEST_STRING(func_enum_b_ptr, "_ZTSFvP11test_enum_bE");
+
+    /* Test union member discrimination */
+    struct tasklet {
+        int state;
+        union {
+            void (*func)(unsigned long data);
+            void (*callback)(struct tasklet *t);
+        };
+        unsigned long data;
+    } tasklet_instance;
+    TEST_STRING(tasklet_instance, "_ZTS7tasklet");
+    struct tasklet *p = &tasklet_instance;
+    extern void tasklet_callback_function(struct tasklet *t);
+    extern void tasklet_func_function(unsigned long data);
+    TEST_STRING(tasklet_func_function,     "_ZTSFvmE");
+    TEST_STRING(*p->func,                  "_ZTSFvmE");
+    TEST_STRING(tasklet_callback_function, "_ZTSFvP7taskletE");
+    TEST_STRING(*p->callback,              "_ZTSFvP7taskletE");
+
+    /* Test struct return pointers */
+    extern struct test_struct_a* func_ret_struct_a_ptr(void);
+    extern struct test_struct_b* func_ret_struct_b_ptr(void);
+    extern struct test_struct_c* func_ret_struct_c_ptr(void);
+    TEST_STRING(func_ret_struct_a_ptr, "_ZTSFP13test_struct_avE");
+    TEST_STRING(func_ret_struct_b_ptr, "_ZTSFP13test_struct_bvE");
+    TEST_STRING(func_ret_struct_c_ptr, "_ZTSFP13test_struct_cvE");
+
+    /* Test struct by-value parameters */
+    extern void func_struct_a_val(struct test_struct_a x);
+    extern void func_struct_b_val(struct test_struct_b x);
+    extern void func_struct_c_val(struct test_struct_c x);
+    TEST_STRING(func_struct_a_val, "_ZTSFv13test_struct_aE");
+    TEST_STRING(func_struct_b_val, "_ZTSFv13test_struct_bE");
+    TEST_STRING(func_struct_c_val, "_ZTSFv13test_struct_cE");
+
+    /* Test struct return by-value */
+    extern struct test_struct_a func_ret_struct_a_val(void);
+    extern struct test_struct_b func_ret_struct_b_val(void);
+    extern struct test_struct_c func_ret_struct_c_val(void);
+    TEST_STRING(func_ret_struct_a_val, "_ZTSF13test_struct_avE");
+    TEST_STRING(func_ret_struct_b_val, "_ZTSF13test_struct_bvE");
+    TEST_STRING(func_ret_struct_c_val, "_ZTSF13test_struct_cvE");
+
+    /* Test mixed struct parameters */
+    extern void func_struct_a_b(struct test_struct_a *a, struct test_struct_b *b);
+    extern void func_struct_b_a(struct test_struct_b *b, struct test_struct_a *a);
+    TEST_STRING(func_struct_a_b, "_ZTSFvP13test_struct_aP13test_struct_bE");
+    TEST_STRING(func_struct_b_a, "_ZTSFvP13test_struct_bP13test_struct_aE");
+
+    /* Test anonymous struct typedefs */
+    typedef struct { int x; } typedef_struct_x;
+    typedef struct { int y; } typedef_struct_y;
+    TEST_STRING(typedef_struct_x, "_ZTS16typedef_struct_x");
+    extern void func_typedef_x_ptr(typedef_struct_x *x);
+    extern void func_typedef_y_ptr(typedef_struct_y *y);
+    TEST_STRING(func_typedef_x_ptr, "_ZTSFvP16typedef_struct_xE");
+    TEST_STRING(func_typedef_y_ptr, "_ZTSFvP16typedef_struct_yE");
+    extern void func_typedef_x(typedef_struct_x x);
+    TEST_STRING(func_typedef_x, "_ZTSFv16typedef_struct_xE");
+
+    /* Test anonymous union typedefs */
+    typedef union { int x; short a; } typedef_union_x;
+    typedef union { int y; short b; } typedef_union_y;
+    TEST_STRING(typedef_union_x, "_ZTS15typedef_union_x");
+    extern void func_typedef_union_x_ptr(typedef_union_x *x);
+    extern void func_typedef_union_y_ptr(typedef_union_y *y);
+    TEST_STRING(func_typedef_union_x_ptr, "_ZTSFvP15typedef_union_xE");
+    TEST_STRING(func_typedef_union_y_ptr, "_ZTSFvP15typedef_union_yE");
+    extern void func_typedef_union_x(typedef_union_x x);
+    TEST_STRING(func_typedef_union_x, "_ZTSFv15typedef_union_xE");
+
+    /* Test anonymous enum typedefs */
+    typedef enum { STEP_1, STEP_2 } typedef_enum_x;
+    typedef enum { STEP_A, STEP_B } typedef_enum_y;
+    TEST_STRING(typedef_enum_x, "_ZTS14typedef_enum_x");
+    extern void func_typedef_enum_x_ptr(typedef_enum_x *x);
+    extern void func_typedef_enum_y_ptr(typedef_enum_y *y);
+    TEST_STRING(func_typedef_enum_x_ptr, "_ZTSFvP14typedef_enum_xE");
+    TEST_STRING(func_typedef_enum_y_ptr, "_ZTSFvP14typedef_enum_yE");
+    extern void func_typedef_enum_x(typedef_enum_x x);
+    TEST_STRING(func_typedef_enum_x, "_ZTSFv14typedef_enum_xE");
+
+    /* Test basic typedef vs open-coded function types: should be the same.  */
+    typedef void (*func_type_typedef)(int, char);
+    TEST_STRING(func_type_typedef,           "_ZTSPFvicE");
+    extern void func_with_typedef_param(func_type_typedef fp);
+    extern void func_with_opencoded_param(void (*fp)(int, char));
+    TEST_STRING(func_with_typedef_param,   "_ZTSFvPFvicEE");
+    TEST_STRING(func_with_opencoded_param, "_ZTSFvPFvicEE");
+
+    /* Test return function pointer types */
+    typedef int (*ret_func_type_typedef)(void);
+    TEST_STRING(ret_func_type_typedef,     "_ZTSPFivE");
+    extern ret_func_type_typedef func_ret_typedef_param(void);
+    extern int (*func_ret_opencoded_param(void))(void);
+    TEST_STRING(func_ret_typedef_param,   "_ZTSFPFivEvE");
+    TEST_STRING(func_ret_opencoded_param, "_ZTSFPFivEvE");
+
+    /* Test additional type combos */
+    extern void func_float(float x);
+    extern void func_double_ptr(double *x);
+    extern void func_float_ptr(float *x);
+    extern void func_void_ptr_ptr(void **x);
+    extern void func_ptr_val(int *x, int y);
+    extern void func_val_ptr(int x, int *y);
+    extern float func_return_float(void);
+    extern double func_return_double(void);
+    TEST_STRING(func_float,         "_ZTSFvfE");
+    TEST_STRING(func_double_ptr,    "_ZTSFvPdE");
+    TEST_STRING(func_float_ptr,     "_ZTSFvPfE");
+    TEST_STRING(func_void_ptr_ptr,  "_ZTSFvPPvE");
+    TEST_STRING(func_ptr_val,       "_ZTSFvPiiE");
+    TEST_STRING(func_val_ptr,       "_ZTSFviPiE");
+    TEST_STRING(func_return_float,  "_ZTSFfvE");
+    TEST_STRING(func_return_double, "_ZTSFdvE");
+
+    /* Test VLA types: should be all the same.  */
+    extern void func_vla_1d(int n, int arr[n]);
+    extern void func_vla_empty(int n, int arr[]);
+    extern void func_vla_ptr(int n, int *arr);
+    TEST_STRING(func_vla_1d,    "_ZTSFviPiE");
+    TEST_STRING(func_vla_empty, "_ZTSFviPiE");
+    TEST_STRING(func_vla_ptr,   "_ZTSFviPiE");
+
+    /* Test 2D VLA with fixed dimension: should be all the same.  */
+    extern void func_vla_2d_first(int n, int arr[n][10]);
+    extern void func_vla_2d_empty(int n, int arr[][10]);
+    extern void func_vla_2d_ptr(int n, int (*arr)[10]);
+    TEST_STRING(func_vla_2d_first, "_ZTSFviPA10_iE");
+    TEST_STRING(func_vla_2d_empty, "_ZTSFviPA10_iE");
+    TEST_STRING(func_vla_2d_ptr,   "_ZTSFviPA10_iE");
+
+    /* Test 2D VLA with both dimensions variable: should be all the same.  */
+    extern void func_vla_2d_both(int rows, int cols, int arr[rows][cols]);
+    extern void func_vla_2d_second(int rows, int cols, int arr[][cols]);
+    extern void func_vla_2d_star(int rows, int cols, int arr[*][cols]);
+    TEST_STRING(func_vla_2d_both,   "_ZTSFviiPA_iE");
+    TEST_STRING(func_vla_2d_second, "_ZTSFviiPA_iE");
+    TEST_STRING(func_vla_2d_star,   "_ZTSFviiPA_iE");
+
+    /* Test recursive typedef canonicalization */
+    struct recursive_struct_test { int field; };
+    typedef struct recursive_struct_test recursive_struct_typedef_1;
+    typedef recursive_struct_typedef_1 recursive_struct_typedef_2;
+    extern void func_recursive_struct_test(struct recursive_struct_test *x);
+    TEST_STRING(func_recursive_struct_test, "_ZTSFvP21recursive_struct_testE");
+
+    /* Test anonymous struct, union, enum types */
+    struct { int a; short b; } anon_struct;
+    union { int x; float y; } anon_union;
+    enum { ANON_VAL1, ANON_VAL2 } anon_enum;
+    TEST_STRING(anon_struct, "_ZTS3$_0"); // <length>$_<counter>
+    TEST_STRING(anon_union, "_ZTS3$_1");  // <length>$_<counter>
+    TEST_STRING(anon_enum, "_ZTS3$_2");   // <length>$_<counter>
+
+    printf("\n================================================================\n");
+    printf("Passed: %d Failed: %d (%d total tests)\n", pass, fail, pass + fail);
+    return fail;
+}
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index e7dd4602ac11..94f2c2001ad5 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -461,6 +461,8 @@ const struct c_common_resword c_common_reswords[] =
   { "__builtin_stdc_trailing_zeros", RID_BUILTIN_STDC, D_CONLY },
   { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
   { "__builtin_offsetof", RID_OFFSETOF, 0 },
+  { "__builtin_typeinfo_hash", RID_BUILTIN_TYPEINFO_HASH, D_CONLY },
+  { "__builtin_typeinfo_name", RID_BUILTIN_TYPEINFO_NAME, D_CONLY },
   { "__builtin_types_compatible_p", RID_TYPES_COMPATIBLE_P, D_CONLY },
   { "__builtin_c23_va_start", RID_C23_VA_START,	D_C23 },
   { "__builtin_va_arg",	RID_VA_ARG,	0 },
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index e8b64948bf69..996fb576ac7c 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -77,6 +77,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "c-family/c-ubsan.h"
 #include "gcc-urlifier.h"
+#include "kcfi-typeinfo.h"
 \f
 /* We need to walk over decls with incomplete struct/union/enum types
    after parsing the whole translation unit.
@@ -11017,6 +11018,38 @@ c_parser_has_attribute_expression (c_parser *parser)
   return result;
 }
 
+/* Parse the single type name argument of a builtin that takes a type name.
+   Returns true on success and stores the parsed type in *OUT_TYPE.
+   If successful, *OUT_CLOSE_PAREN_LOC is written with the location of
+   the closing parenthesis.  */
+
+static bool
+c_parser_get_builtin_type_arg (c_parser *parser, const char *bname,
+			       tree *out_type, location_t *out_close_paren_loc)
+{
+  matching_parens parens;
+  if (!parens.require_open (parser))
+    return false;
+
+  struct c_type_name *type_name = c_parser_type_name (parser);
+  if (type_name == NULL)
+    {
+      error_at (c_parser_peek_token (parser)->location,
+		"expected type name in %qs", bname);
+      return false;
+    }
+
+  *out_close_paren_loc = c_parser_peek_token (parser)->location;
+  parens.skip_until_found_close (parser);
+
+  tree type = groktypename (type_name, NULL, NULL);
+  if (type == error_mark_node)
+    return false;
+
+  *out_type = type;
+  return true;
+}
+
 /* Helper function to read arguments of builtins which are interfaces
    for the middle-end nodes like COMPLEX_EXPR, VEC_PERM_EXPR and
    others.  The name of the builtin is passed using BNAME parameter.
@@ -12025,6 +12058,45 @@ c_parser_postfix_expression (c_parser *parser)
 	    set_c_expr_source_range (&expr, loc, close_paren_loc);
 	  }
 	  break;
+	case RID_BUILTIN_TYPEINFO_NAME:
+	  {
+	    c_parser_consume_token (parser);
+	    location_t close_paren_loc;
+	    tree type;
+	    if (!c_parser_get_builtin_type_arg (parser,
+						"__builtin_typeinfo_name",
+						&type, &close_paren_loc))
+	      {
+		expr.set_error ();
+		break;
+	      }
+
+	    /* Call the typeinfo name function.  */
+	    std::string type_name = typeinfo_get_name (type);
+	    expr.value = build_string_literal (type_name.length () + 1,
+					       type_name.c_str ());
+	    set_c_expr_source_range (&expr, loc, close_paren_loc);
+	  }
+	  break;
+	case RID_BUILTIN_TYPEINFO_HASH:
+	  {
+	    c_parser_consume_token (parser);
+	    location_t close_paren_loc;
+	    tree type;
+	    if (!c_parser_get_builtin_type_arg (parser,
+						"__builtin_typeinfo_hash",
+						&type, &close_paren_loc))
+	      {
+		expr.set_error ();
+		break;
+	      }
+
+	    /* Call the typeinfo hash function.  */
+	    uint32_t type_hash = typeinfo_get_hash (type);
+	    expr.value = build_int_cst (unsigned_type_node, type_hash);
+	    set_c_expr_source_range (&expr, loc, close_paren_loc);
+	  }
+	  break;
 	case RID_BUILTIN_TGMATH:
 	  {
 	    vec<c_expr_t, va_gc> *cexpr_list;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 382295834035..7cddea1ed6c1 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -17547,6 +17547,100 @@ which will cause a @code{NULL} pointer to be used for the unsafe case.
 
 @enddefbuiltin
 
+@defbuiltin{{unsigned int} __builtin_typeinfo_hash (@var{type})}
+
+The built-in function @code{__builtin_typeinfo_hash} returns a hash value
+for the given type @var{type} (which is a type, not an expression).  The hash
+is computed using the FNV-1a algorithm on the type's mangled name representation,
+which follows a subset of the Itanium C++ ABI conventions adapted for C types.
+(See @code{__builtin_typeinfo_name} for the string representation.)
+
+This built-in is primarily intended for kernel control flow integrity (KCFI)
+implementations and other type-aware runtime systems that need to generate
+consistent type identifiers.  The hash value is a 32-bit unsigned integer.
+
+Key characteristics of the hash:
+@itemize @bullet
+@item
+The hash is consistent for the same type across different translation units.
+@item
+Typedefs are recursively canonicalized down to the underlying basic type or
+to the named struct, union, or enum tag.
+@item
+Typedefs of anonymous structs, unions, and enums preserve the typedef name
+in the hash calculation (e.g., @code{typedef struct @{ int x; @} foo_t;}
+uses @code{foo_t} in the hash).
+@item
+Type qualifiers (@code{const}, @code{volatile}, @code{restrict}) affect
+the hash value.
+@item
+Function types include parameter types and variadic markers in the hash.
+@end itemize
+
+For example:
+@smallexample
+typedef struct @{ int x; @} mytype_t;
+unsigned int hash1 = __builtin_typeinfo_hash(mytype_t);
+unsigned int hash2 = __builtin_typeinfo_hash(struct @{ int x; @});
+/* hash1 != hash2 because the typedef name is preserved */
+
+void func(int x, char y);
+unsigned int hash3 = __builtin_typeinfo_hash(typeof(func));
+/* Returns hash for function type "void(int, char)" */
+@end smallexample
+
+@emph{Note:} This construct is only available for C@. For C++, see
+@code{std::type_info::hash_code}.
+
+@enddefbuiltin
+
+@defbuiltin{{const char *} __builtin_typeinfo_name (@var{type})}
+
+The built-in function @code{__builtin_typeinfo_name} returns a string
+containing the mangled name representation of the given type @var{type}
+(which is a type, not an expression).  The string follows a subset of the
+Itanium C++ ABI mangling conventions adapted for C types.  (See
+@code{__builtin_typeinfo_hash} for the unsigned 32-bit hash representation.)
+
+The returned string is a compile-time constant suitable for use in
+string comparisons, debugging output, or other type introspection needs.
+The string begins with @code{_ZTS} followed by the encoded type information.
+
+Mangling examples:
+@itemize @bullet
+@item
+@code{int} becomes @code{"_ZTSi"}
+@item
+@code{char *} becomes @code{"_ZTSPc"}
+@item
+@code{const int} becomes @code{"_ZTSKi"}
+@item
+@code{int[10]} becomes @code{"_ZTSA10_i"}
+@item
+@code{void (*)(int)} becomes @code{"_ZTSPFviE"}
+@item
+@code{struct foo} becomes @code{"_ZTS3foo"}
+@item
+@code{typedef struct @{ int x; @} bar_t;} becomes @code{"_ZTS5bar_t"}
+@end itemize
+
+The mangling preserves typedef names for anonymous compound types, which
+is particularly useful for distinguishing between different typedefs of
+structurally identical anonymous types:
+
+@smallexample
+typedef struct @{ int x; @} type_a;
+typedef struct @{ int x; @} type_b;
+const char *name_a = __builtin_typeinfo_name(type_a);  /* "_ZTS6type_a" */
+const char *name_b = __builtin_typeinfo_name(type_b);  /* "_ZTS6type_b" */
+/* name_a and name_b are different despite identical structure */
+@end smallexample
+
+@emph{Note:} This construct is only available for C@. For C++, see
+@code{std::type_info::name}.
+
+@enddefbuiltin
+
 @defbuiltin{int __builtin_types_compatible_p (@var{type1}, @var{type2})}
 
 You can use the built-in function @code{__builtin_types_compatible_p} to
diff --git a/gcc/kcfi-typeinfo.cc b/gcc/kcfi-typeinfo.cc
new file mode 100644
index 000000000000..24099c42cc2e
--- /dev/null
+++ b/gcc/kcfi-typeinfo.cc
@@ -0,0 +1,475 @@
+/* KCFI-compatible type mangling, based on Itanium C++ ABI.
+   Copyright (C) 2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* Produces typeinfo mangling similar to Itanium C++ Mangling ABI, but
+   limited to types exposed within GCC for C language handling.  The
+   hashes are used by KCFI (and future type-aware allocator support).
+   The strings are used for testing and debugging.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "diagnostic-core.h"
+#include "stringpool.h"
+#include "stor-layout.h"
+#include "print-tree.h"
+#include "kcfi-typeinfo.h"
+
+/* Helper to update FNV-1a hash with a single character.  */
+
+static inline void
+fnv1a_hash_char (uint32_t *hash_state, unsigned char c)
+{
+  *hash_state ^= c;
+  *hash_state *= 16777619U; /* FNV-1a 32-bit prime.  */
+}
+
+/* Helper to append character to optional string and update hash using
+   FNV-1a.  */
+
+static void
+append_char (char c, std::string *out_str, uint32_t *hash_state)
+{
+  if (out_str)
+    *out_str += c;
+  if (!hash_state)
+    return;
+  fnv1a_hash_char (hash_state, (unsigned char) c);
+}
+
+/* Helper to append string to optional string and update hash using
+   FNV-1a.  */
+
+static void
+append_string (const char *str, std::string *out_str, uint32_t *hash_state)
+{
+  if (out_str)
+    *out_str += str;
+  if (!hash_state)
+    return;
+  for (const char *p = str; *p; p++)
+    fnv1a_hash_char (hash_state, (unsigned char) *p);
+}
+
+/* Forward declaration for recursive type mangling.  */
+
+static void mangle_type (tree type, std::string *out_str, uint32_t *hash_state);
+
+/* Mangle a builtin type following Itanium C++ ABI for C types.  */
+
+static void
+mangle_builtin_type (tree type, std::string *out_str, uint32_t *hash_state)
+{
+  gcc_assert (type != NULL_TREE);
+
+  switch (TREE_CODE (type))
+    {
+    case VOID_TYPE:
+      append_char ('v', out_str, hash_state);
+      return;
+
+    case BOOLEAN_TYPE:
+      append_char ('b', out_str, hash_state);
+      return;
+
+    case INTEGER_TYPE:
+      if (type == char_type_node)
+	append_char ('c', out_str, hash_state);
+      else if (type == signed_char_type_node)
+	append_char ('a', out_str, hash_state);
+      else if (type == unsigned_char_type_node)
+	append_char ('h', out_str, hash_state);
+      else if (type == short_integer_type_node)
+	append_char ('s', out_str, hash_state);
+      else if (type == short_unsigned_type_node)
+	append_char ('t', out_str, hash_state);
+      else if (type == integer_type_node)
+	append_char ('i', out_str, hash_state);
+      else if (type == unsigned_type_node)
+	append_char ('j', out_str, hash_state);
+      else if (type == long_integer_type_node)
+	append_char ('l', out_str, hash_state);
+      else if (type == long_unsigned_type_node)
+	append_char ('m', out_str, hash_state);
+      else if (type == long_long_integer_type_node)
+	append_char ('x', out_str, hash_state);
+      else if (type == long_long_unsigned_type_node)
+	append_char ('y', out_str, hash_state);
+      else
+	{
+	  /* Fallback for other integer types - use precision-based
+	     encoding.  */
+	  append_char ('i', out_str, hash_state);
+	  append_string (std::to_string (TYPE_PRECISION (type)).c_str (),
+			 out_str, hash_state);
+	}
+      return;
+
+    case REAL_TYPE:
+      if (type == float_type_node)
+	append_char ('f', out_str, hash_state);
+      else if (type == double_type_node)
+	append_char ('d', out_str, hash_state);
+      else if (type == long_double_type_node)
+	append_char ('e', out_str, hash_state);
+      else
+	{
+	  /* Fallback for other real types.  */
+	  append_char ('f', out_str, hash_state);
+	  append_string (std::to_string (TYPE_PRECISION (type)).c_str (),
+			 out_str, hash_state);
+	}
+      return;
+
+    case VECTOR_TYPE:
+      {
+	/* Handle vector types:
+	   Dv<num-elements>_<element-type-encoding>
+	   Example: uint8x16_t -> Dv16_h (vector of 16 unsigned char)  */
+	tree vector_size = TYPE_SIZE_UNIT (type);
+	tree element_type = TREE_TYPE (type);
+	tree element_size = TYPE_SIZE_UNIT (element_type);
+
+	if (vector_size && element_size
+	    && TREE_CODE (vector_size) == INTEGER_CST
+	    && TREE_CODE (element_size) == INTEGER_CST)
+	  {
+	    append_char ('D', out_str, hash_state);
+	    append_char ('v', out_str, hash_state);
+
+	    unsigned HOST_WIDE_INT vec_bytes = tree_to_uhwi (vector_size);
+	    unsigned HOST_WIDE_INT elem_bytes = tree_to_uhwi (element_size);
+	    unsigned HOST_WIDE_INT num_elements = vec_bytes / elem_bytes;
+
+	    /* Append number of elements.  */
+	    append_string (std::to_string (num_elements).c_str (),
+			   out_str, hash_state);
+	    append_char ('_', out_str, hash_state);
+
+	    /* Recursively mangle the element type.  */
+	    mangle_type (element_type, out_str, hash_state);
+	    return;
+	  }
+	/* Fail for vectors with unknown size.  */
+      }
+      break;
+
+    default:
+      break;
+    }
+
+  /* Unknown builtin type: this should never happen in well-formed C.  */
+  debug_tree (type);
+  internal_error ("mangle: Unknown builtin type - please report this as a bug");
+}
+
+/* Canonicalize typedef types to their underlying named struct/union types.  */
+
+static tree
+canonicalize_typedef_type (tree type)
+{
+  /* Handle typedef types: canonicalize to named structs when possible.  */
+  if (TYPE_NAME (type) && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL)
+    {
+      tree type_decl = TYPE_NAME (type);
+
+      /* Check if this is a typedef (not the original struct declaration) */
+      if (DECL_ORIGINAL_TYPE (type_decl))
+	{
+	  tree original_type = DECL_ORIGINAL_TYPE (type_decl);
+
+	  /* Handle struct/union/enum types.  */
+	  if (TREE_CODE (original_type) == RECORD_TYPE
+	      || TREE_CODE (original_type) == UNION_TYPE
+	      || TREE_CODE (original_type) == ENUMERAL_TYPE)
+	    {
+	      /* Preserve typedef of anonymous struct/union/enum types.  */
+	      if (!TYPE_NAME (original_type))
+		return type;
+
+	      /* Named compound type: canonicalize to it.  */
+	      return canonicalize_typedef_type (original_type);
+	    }
+
+	  /* For basic type typedefs (e.g., u8 -> unsigned char),
+	     canonicalize to original type.  */
+	  if (TREE_CODE (original_type) == INTEGER_TYPE
+	      || TREE_CODE (original_type) == REAL_TYPE
+	      || TREE_CODE (original_type) == POINTER_TYPE
+	      || TREE_CODE (original_type) == ARRAY_TYPE
+	      || TREE_CODE (original_type) == FUNCTION_TYPE
+	      || TREE_CODE (original_type) == METHOD_TYPE
+	      || TREE_CODE (original_type) == BOOLEAN_TYPE
+	      || TREE_CODE (original_type) == COMPLEX_TYPE
+	      || TREE_CODE (original_type) == VECTOR_TYPE)
+	    {
+	      /* Recursively canonicalize in case the original type is
+		 also a typedef.  */
+	      return canonicalize_typedef_type (original_type);
+	    }
+	}
+    }
+
+  return type;
+}
+
+/* Recursively mangle a C type following Itanium C++ ABI.  */
+
+static void
+mangle_type (tree type, std::string *out_str, uint32_t *hash_state)
+{
+  gcc_assert (type != NULL_TREE);
+
+  /* Canonicalize typedef types to their underlying named struct types.  */
+  type = canonicalize_typedef_type (type);
+
+  /* Save original qualified type for cases where we need typedef
+     information.  */
+  tree qualified_type = type;
+
+  /* Centralized qualifier handling: emit qualifiers for this type,
+     then continue with unqualified version.  */
+  if (TYPE_QUALS (type) != TYPE_UNQUALIFIED)
+    {
+      /* Emit qualifiers in Itanium ABI order: restrict, volatile, const.  */
+      if (TYPE_QUALS (type) & TYPE_QUAL_RESTRICT)
+	append_char ('r', out_str, hash_state);
+      if (TYPE_QUALS (type) & TYPE_QUAL_VOLATILE)
+	append_char ('V', out_str, hash_state);
+      if (TYPE_QUALS (type) & TYPE_QUAL_CONST)
+	append_char ('K', out_str, hash_state);
+
+      /* Get unqualified version for further processing.  */
+      type = TYPE_MAIN_VARIANT (type);
+    }
+
+  switch (TREE_CODE (type))
+    {
+    case POINTER_TYPE:
+      {
+	/* Pointer type: 'P' + pointed-to type.  */
+	append_char ('P', out_str, hash_state);
+
+	/* Recursively mangle the pointed-to type.  */
+	tree pointed_to_type = TREE_TYPE (type);
+	mangle_type (pointed_to_type, out_str, hash_state);
+	break;
+      }
+
+    case ARRAY_TYPE:
+      /* Array type: 'A' + size + '_' + element type (simplified).  */
+      append_char ('A', out_str, hash_state);
+      if (TYPE_DOMAIN (type) && TYPE_MAX_VALUE (TYPE_DOMAIN (type)))
+	{
+	  tree max_val = TYPE_MAX_VALUE (TYPE_DOMAIN (type));
+	  /* Check if array size is compile-time constant to handle VLAs. */
+	  if (TREE_CODE (max_val) == INTEGER_CST && tree_fits_shwi_p (max_val))
+	    {
+	      HOST_WIDE_INT size = tree_to_shwi (max_val) + 1;
+	      append_string (std::to_string ((long) size).c_str (),
+			     out_str, hash_state);
+	    }
+	  /* For VLAs or non-constant dimensions, emit empty size (A_).  */
+	  append_char ('_', out_str, hash_state);
+	}
+      else
+	{
+	  /* No domain or no max value: emit A_.  */
+	  append_char ('_', out_str, hash_state);
+	}
+      mangle_type (TREE_TYPE (type), out_str, hash_state);
+      break;
+
+    case REFERENCE_TYPE:
+      /* Reference type: 'R' + referenced type.
+	 Note: We must handle references to builtin types including compiler
+	 builtins like __builtin_va_list used in functions like va_start.  */
+      append_char ('R', out_str, hash_state);
+      mangle_type (TREE_TYPE (type), out_str, hash_state);
+      break;
+
+    case FUNCTION_TYPE:
+      {
+	/* Function type: 'F' + return type + parameter types + 'E' */
+	append_char ('F', out_str, hash_state);
+	mangle_type (TREE_TYPE (type), out_str, hash_state);
+
+	/* Add parameter types.  */
+	tree param_types = TYPE_ARG_TYPES (type);
+
+	if (param_types == NULL_TREE)
+	  {
+	    /* func () - no parameter list (could be variadic). */
+	  }
+	else
+	  {
+	    bool found_real_params = false;
+	    for (tree param = param_types; param; param = TREE_CHAIN (param))
+	      {
+		tree param_type = TREE_VALUE (param);
+		if (param_type == void_type_node)
+		  {
+		    /* Check if this is the first parameter (explicit void) or a
+		       sentinel.  */
+		    if (!found_real_params)
+		      {
+			/* func (void) - explicit empty parameter list.
+			   Mangle void to distinguish from variadic func (). */
+			mangle_type (void_type_node, out_str, hash_state);
+		      }
+		    /* If we found real params before this void, it's a sentinel
+		       so stop here.  */
+		    break;
+		  }
+
+		found_real_params = true;
+
+		/* For value parameters, ignore const/volatile qualifiers as
+		   they don't affect the calling convention.  "const int" and
+		   "int" are passed identically by value.  */
+		tree canonical_param_type = param_type;
+
+		if (TREE_CODE (param_type) != POINTER_TYPE
+		    && TREE_CODE (param_type) != REFERENCE_TYPE
+		    && TREE_CODE (param_type) != ARRAY_TYPE)
+		  {
+		    /* For non-pointer/reference value parameters, strip
+		       qualifiers by default.  */
+		    canonical_param_type = TYPE_MAIN_VARIANT (param_type);
+
+		    /* Exception: preserve typedef information for anonymous
+		       compound types.  */
+		    if (TYPE_NAME (param_type)
+			&& TREE_CODE (TYPE_NAME (param_type)) == TYPE_DECL
+			&& DECL_ORIGINAL_TYPE (TYPE_NAME (param_type)))
+		      {
+			tree original_type
+			  = DECL_ORIGINAL_TYPE (TYPE_NAME (param_type));
+			if ((TREE_CODE (original_type) == RECORD_TYPE
+			     || TREE_CODE (original_type) == UNION_TYPE
+			     || TREE_CODE (original_type) == ENUMERAL_TYPE)
+			    && !TYPE_NAME (original_type))
+			  {
+			    /* Preserve typedef of an anonymous
+			       struct/union/enum.  */
+			    canonical_param_type = param_type;
+			  }
+		      }
+		  }
+
+		mangle_type (canonical_param_type, out_str, hash_state);
+	      }
+	  }
+
+	/* Check if this is a variadic function and add 'z' marker.  */
+	if (stdarg_p (type))
+	  {
+	    append_char ('z', out_str, hash_state);
+	  }
+
+	append_char ('E', out_str, hash_state);
+	break;
+      }
+
+    case RECORD_TYPE:
+    case UNION_TYPE:
+    case ENUMERAL_TYPE:
+      {
+	/* Struct/union/enum: use simplified representation for C types.  */
+	const char *name = NULL;
+
+	/* For compound types, use the original qualified type to preserve
+	   typedef info.  */
+	if (TYPE_QUALS (qualified_type) != TYPE_UNQUALIFIED)
+	  {
+	    type = qualified_type;
+	  }
+
+	if (TYPE_NAME (type))
+	  {
+	    if (TREE_CODE (TYPE_NAME (type)) == TYPE_DECL)
+	      {
+		/* TYPE_DECL case: both named structs and typedef structs.  */
+		tree decl_name = DECL_NAME (TYPE_NAME (type));
+		if (decl_name && TREE_CODE (decl_name) == IDENTIFIER_NODE)
+		  {
+		    name = IDENTIFIER_POINTER (decl_name);
+		  }
+	      }
+	    else if (TREE_CODE (TYPE_NAME (type)) == IDENTIFIER_NODE)
+	      {
+		/* Direct identifier case.  */
+		name = IDENTIFIER_POINTER (TYPE_NAME (type));
+	      }
+	  }
+
+	if (name)
+	  {
+	    append_string (std::to_string (strlen (name)).c_str (),
+			   out_str, hash_state);
+	    append_string (name, out_str, hash_state);
+	    break;
+	  }
+
+	/* If no name found, use anonymous type format: <length>$_<counter>.  */
+	static unsigned anon_counter = 0;
+	std::string anon_name = "$_" + std::to_string (anon_counter++);
+
+	append_string (std::to_string (anon_name.length ()).c_str (),
+		       out_str, hash_state);
+	append_string (anon_name.c_str (), out_str, hash_state);
+	break;
+      }
+
+    default:
+      /* Handle builtin types.  */
+      mangle_builtin_type (type, out_str, hash_state);
+      break;
+    }
+}
+
+/* Get the typeinfo mangled name string for any C type.
+   Returns the mangled type string following Itanium C++ ABI conventions.  */
+
+std::string
+typeinfo_get_name (tree type)
+{
+  gcc_assert (type != NULL_TREE);
+  std::string result = "_ZTS";
+
+  mangle_type (type, &result, nullptr);
+  return result;
+}
+
+/* Get the typeinfo hash for any C type.
+   Returns the FNV-1a hash of the mangled type string.  */
+
+uint32_t
+typeinfo_get_hash (tree type)
+{
+  gcc_assert (type != NULL_TREE);
+  uint32_t hash_state = 2166136261U; /* FNV-1a 32-bit offset basis.  */
+
+  /* Include _ZTS prefix in hash calculation.  */
+  append_string ("_ZTS", nullptr, &hash_state);
+
+  mangle_type (type, nullptr, &hash_state);
+  return hash_state;
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
  2025-09-13 23:23 [PATCH v3 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
  2025-09-13 23:23 ` [PATCH v3 1/7] typeinfo: Introduce KCFI typeinfo mangling API Kees Cook
@ 2025-09-13 23:23 ` Kees Cook
  2025-09-17 13:42   ` Qing Zhao
  2025-09-13 23:23 ` [PATCH v3 3/7] x86: Add x86_64 Kernel Control Flow Integrity implementation Kees Cook
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2025-09-13 23:23 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Kees Cook, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

Implements the Linux Kernel Control Flow Integrity ABI, which provides a
function prototype based forward edge control flow integrity protection
by instrumenting every indirect call to check for a hash value before
the target function address. If the hash at the call site and the hash
at the target do not match, execution will trap.

See the start of kcfi.cc for design details.

gcc/ChangeLog:

	* kcfi.h: New file with KCFI public interface declarations.
	* kcfi.cc: New file implementing Kernel Control Flow Integrity
	infrastructure.
	* Makefile.in (OBJS): Add kcfi.o.
	* flag-types.h (enum sanitize_code): Add SANITIZE_KCFI.
	* gimple.h (enum gf_mask): Add GF_CALL_INLINED_FROM_KCFI_NOSANTIZE.
	(gimple_call_set_inlined_from_kcfi_nosantize): New function.
	(gimple_call_inlined_from_kcfi_nosantize_p): New function.
	* tree-pass.h: Add kcfi passes.
	* df-scan.cc (df_uses_record): Add KCFI case to handle KCFI RTL
	patterns and process wrapped RTL.
	* doc/extend.texi: Update nocf_check for kcfi.
	* doc/invoke.texi (fsanitize=kcfi): Add documentation for KCFI
	sanitizer option.
	* doc/tm.texi.in: Add Kernel Control Flow Integrity section with
	TARGET_KCFI_SUPPORTED, TARGET_KCFI_MASK_TYPE_ID,
	TARGET_KCFI_EMIT_TYPE_ID hooks.
	* doc/tm.texi: Regenerate.
	* final.cc (call_from_call_insn): Add KCFI case to handle
	KCFI-wrapped calls.
	* opts.cc (sanitizer_opts): Add kcfi entry.
	* passes.cc: Include kcfi.h.
	* passes.def: Add KCFI IPA pass.
	* rtl.def (KCFI): Add new RTL code for KCFI instrumentation.
	* rtlanal.cc (rtx_cost): Add KCFI case.
	* target.def: Add KCFI target hooks.
	* toplev.cc (process_options): Add KCFI option processing.
	* tree-inline.cc: Include kcfi.h and asan.h.
	(copy_bb): Handle KCFI no_sanitize attribute propagation during
	inlining.
	* varasm.cc (assemble_start_function): Emit KCFI preambles.
	(assemble_external_real): Emit KCFI typeid symbols.
	(default_elf_asm_named_section): Handle .kcfi_traps using
	SECTION_LINK_ORDER flag.

gcc/c-family/ChangeLog:

	* c-attribs.cc: Include asan.h.
	(handle_nocf_check_attribute): Enable nocf_check under kcfi.
	(handle_patchable_function_entry_attribute): Add error for using
	patchable_function_entry attribute with -fsanitize=kcfi.

Signed-off-by: Kees Cook <kees@kernel.org>
---
 gcc/kcfi.h                |  52 ++++
 gcc/kcfi.cc               | 601 ++++++++++++++++++++++++++++++++++++++
 gcc/Makefile.in           |   1 +
 gcc/flag-types.h          |   2 +
 gcc/gimple.h              |  22 ++
 gcc/tree-pass.h           |   1 +
 gcc/c-family/c-attribs.cc |  17 +-
 gcc/df-scan.cc            |   7 +
 gcc/doc/extend.texi       |  38 +++
 gcc/doc/invoke.texi       |  33 +++
 gcc/doc/tm.texi           |  31 ++
 gcc/doc/tm.texi.in        |  12 +
 gcc/final.cc              |   3 +
 gcc/opts.cc               |   1 +
 gcc/passes.cc             |   1 +
 gcc/passes.def            |   1 +
 gcc/rtl.def               |   6 +
 gcc/rtlanal.cc            |   5 +
 gcc/target.def            |  38 +++
 gcc/toplev.cc             |  10 +
 gcc/tree-inline.cc        |  10 +
 gcc/varasm.cc             |  37 ++-
 22 files changed, 918 insertions(+), 11 deletions(-)
 create mode 100644 gcc/kcfi.h
 create mode 100644 gcc/kcfi.cc

diff --git a/gcc/kcfi.h b/gcc/kcfi.h
new file mode 100644
index 000000000000..32c186416493
--- /dev/null
+++ b/gcc/kcfi.h
@@ -0,0 +1,52 @@
+/* Kernel Control Flow Integrity (KCFI) support for GCC.
+   Copyright (C) 2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_KCFI_H
+#define GCC_KCFI_H
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "rtl.h"
+
+/* Common helper for RTL patterns to emit .kcfi_traps section entry.
+   Call after emitting trap label and instruction with the trap symbol
+   reference.  */
+extern void kcfi_emit_traps_section (FILE *file, rtx trap_label_sym);
+
+/* Extract KCFI type ID from current GIMPLE statement.  */
+extern rtx __kcfi_get_type_id_for_expanding_gimple_call (void);
+
+/* Convenience wrapper to check for SANITIZE_KCFI.  */
+#define kcfi_get_type_id_for_expanding_gimple_call()	\
+  ((flag_sanitize & SANITIZE_KCFI)			\
+     ? __kcfi_get_type_id_for_expanding_gimple_call ()	\
+     : NULL_RTX)
+
+/* Emit KCFI type ID symbol for external address-taken functions.  */
+extern void kcfi_emit_typeid_symbol (FILE *asm_file, tree fndecl);
+
+/* Emit KCFI preamble for potential indirect call targets.  */
+extern void kcfi_emit_preamble (FILE *asm_file, tree fndecl,
+				const char *actual_fname);
+
+/* For calculating callsite offset.  */
+extern HOST_WIDE_INT kcfi_patchable_entry_prefix_nops;
+
+#endif /* GCC_KCFI_H */
diff --git a/gcc/kcfi.cc b/gcc/kcfi.cc
new file mode 100644
index 000000000000..9ed0cb00faa1
--- /dev/null
+++ b/gcc/kcfi.cc
@@ -0,0 +1,601 @@
+/* Kernel Control Flow Integrity (KCFI) support for GCC.
+   Copyright (C) 2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* KCFI ABI Design:
+
+The Linux Kernel Control Flow Integrity ABI provides a function prototype
+based forward edge control flow integrity protection by instrumenting
+every indirect call to check for a hash value before the target function
+address.  If the hash at the call site and the hash at the target do not
+match, execution will trap.
+
+The general CFI ideas are discussed here, though the paper focuses more on
+CFG analysis to construct valid call destinations, which tends to require LTO:
+https://users.soe.ucsc.edu/~abadi/Papers/cfi-tissec-revised.pdf
+
+Later refinement for using jump tables (constructed via CFG analysis
+during LTO) was proposed here:
+https://www.usenix.org/system/files/conference/usenixsecurity14/sec14-paper-tice.pdf
+
+Linux used the above implementation from 2018 to 2022:
+https://android-developers.googleblog.com/2018/10/control-flow-integrity-in-android-kernel.html
+but the corner cases for target addresses not being the actual functions
+(i.e. pointing into the jump table) were a continual source of problems,
+and generating the jump tables required full LTO, which had its own set
+of problems.
+
+Looking at function prototypes as the source of call validity was
+presented here, though it still relied on LTO:
+https://www.blackhat.com/docs/asia-17/materials/asia-17-Moreira-Drop-The-Rop-Fine-Grained-Control-Flow-Integrity-For-The-Linux-Kernel-wp.pdf
+
+The KCFI approach built on the function-prototype idea, but avoided
+needing LTO, and could be further updated to deal with CPU errata
+(retpolines, etc):
+https://lpc.events/event/16/contributions/1315/
+
+KCFI has a number of specific constraints.  Some are tied to the
+backend architecture, which are covered in arch-specific code.
+The constraints are:
+
+- The KCFI scheme generates a unique 32-bit hash ("typeid") for each
+  unique function prototype, allowing for indirect call sites to verify
+  that they are calling into a matching _type_ of function pointer.
+  This changes the semantics of some optimization logic because now
+  indirect calls to different types cannot be merged.  For example:
+
+    if (p->func_type_1)
+	return p->func_type_1 ();
+    if (p->func_type_2)
+	return p->func_type_2 ();
+
+  In final asm, the optimizer may collapse the second indirect call
+  into a jump to the first indirect call once it has loaded the function
+  pointer.  KCFI must block cross-type merging otherwise there will be a
+  single KCFI check happening for only 1 type but being used by 2 target
+  types.  The distinguishing characteristic for call merging becomes the
+  type, not the address/register usage.
+
+- The check-call instruction sequence must be treated as a single unit: it
+  cannot be rearranged or split or optimized.  The pattern is that
+  indirect calls, "call *%target", get converted into:
+
+    mov $target_expression, %target ; only present if the expression was
+				    ; not already in %target register
+    load -$offset(%target), %tmp    ; load typeid hash from target preamble
+    cmp $typeid, %tmp		    ; compare expected typeid with loaded
+    je .Lkcfi_call$N		    ; success: jump to the indirect call
+  .Lkcfi_trap$N:		    ; label of trap insn
+    trap			    ; trap on failure, but arranged so
+				    ; "permissive mode" falls through
+  .Lkcfi_call$N:		    ; label of call insn
+    call *%target		    ; actual indirect call
+
+  This pattern of call immediately after trap provides for the
+  "permissive" checking mode automatically: the trap gets handled,
+  a warning emitted, and then execution continues after the trap to
+  the call.
+
+- KCFI check-call instrumentation must survive tail call optimization.
+  If an indirect call is turned into an indirect jump, KCFI checking
+  must still happen (but it will use a jmp rather than a call).
+
+- Functions that may be called indirectly have a preamble added,
+  __cfi_$original_func_name, which contains the $typeid value:
+
+    __cfi_target_func:
+      .word $typeid
+    target_func:
+       [regular function entry...]
+
+- The preamble needs to interact with patchable function entry so that
+  the typeid appears further away from the actual start of the function
+  (leaving the prefix NOPs of the patchable function entry unchanged).
+  This means only _globally defined_ patchable function entry is supported
+  with KCFI (indirect call sites must know in advance what the offset is,
+  which may not be possible with extern functions that use a function
+  attribute to change their patchable function entry characteristics).
+  For example, a "4,4" patchable function entry would end up like:
+
+    __cfi_target_func:
+      .word $typeid
+      nop nop nop nop
+    target_func:
+       [regular function entry...]
+
+  Architectures may need to add alignment nops prior to the typeid to keep
+  __cfi_target_func aligned for function call conventions.
+
+- External functions that are address-taken have a weak __kcfi_typeid_$func
+  symbol added with the typeid value available so that the typeid can be
+  referenced from assembly linkages, etc, where the typeid values cannot be
+  calculated (i.e. where C type information is missing):
+
+    .weak   __kcfi_typeid_$func
+    .set    __kcfi_typeid_$func, $typeid
+
+- On architectures that do not have a good way to encode additional
+  details in their trap insn (e.g. x86_64 and riscv64), the trap location
+  is identified as a KCFI trap via a relative address offset entry
+  emitted into the .kcfi_traps section for each indirect call site's
+  trap instruction.  The previous check-call example's insn sequence would
+  then have section changes inserted between the trap and call:
+
+  ...
+  .Lkcfi_trap$N:
+    trap
+  .section	.kcfi_traps,"ao",@progbits,.text
+  .Lkcfi_entry$N:
+    .long	.Lkcfi_trap$N - .Lkcfi_entry$N
+  .text
+  .Lkcfi_call$N:
+    call *%target
+
+  It is up to such architectures to decode instructions prior to the
+  trap to locate the typeid that the callsite was expecting.
+
+  For architectures that can encode immediates in their trap insn
+  (e.g. aarch64 and arm32), this isn't needed: they just use immediate
+  codes that indicate a KCFI trap.
+
+- The no_sanitize("kcfi") function attribute means that the marked
+  function must not produce KCFI checking for indirect calls, and this
+  attribute must survive inlining.  This is used rarely by Linux, but
+  is required to make BPF JIT trampolines work on older Linux kernel
+  versions.
+
+- The "nocf_check" function attribute can be used to suppress the
+  KCFI preamble for a function, making that function unavailable
+  for indirect calls.
+
+As a result of these constraints, there are some behavioral aspects
+that need to be preserved across the middle-end and back-end.
+
+For indirect call sites:
+
+- All function types have their associated typeid attached as an
+  attribute.
+
+- Keep typeid information available through to the RTL expansion
+  phase via a new KCFI insn RTL pattern that wraps the CALL and the
+  typeid.
+
+- Keep indirect calls from being merged (see earlier example) by
+  checking the KCFI insn's typeid for equality.
+
+- To make sure KCFI expansion is skipped for inline functions that
+  are marked with no_sanitize("kcfi"), the inlining is marked during
+  GIMPLE with a new flag which is checked during expansion.
+
+- KCFI insn emission interacts with patchable function entry to
+  load the typeid from the target preamble, offset by prefix NOPs.
+
+For indirect call targets:
+
+- kcfi_emit_preamble interacts with patchable function entry to add
+  any needed alignment prior to emitting the typeid.
+
+- assemble_external_real calls kcfi_emit_typeid_symbol to add the
+  __kcfi_typeid_$func symbols.
+
+*/
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "target.h"
+#include "function.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "dumpfile.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "cgraph.h"
+#include "kcfi.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "rtl.h"
+#include "cfg.h"
+#include "cfgrtl.h"
+#include "asan.h"
+#include "diagnostic-core.h"
+#include "memmodel.h"
+#include "print-tree.h"
+#include "emit-rtl.h"
+#include "output.h"
+#include "builtins.h"
+#include "varasm.h"
+#include "opts.h"
+#include "target.h"
+#include "flags.h"
+#include "kcfi-typeinfo.h"
+#include "insn-config.h"
+#include "recog.h"
+
+/* For callsite typeid loading offset.  */
+HOST_WIDE_INT kcfi_patchable_entry_prefix_nops = 0;
+/* For preamble alignment.  */
+static HOST_WIDE_INT kcfi_patchable_entry_arch_alignment_nops = 0;
+static const char *kcfi_nop = NULL;
+
+/* Common helper for RTL patterns to emit .kcfi_traps section entry.  */
+
+void
+kcfi_emit_traps_section (FILE *file, rtx trap_label_sym)
+{
+  /* Generate entry label internally and get its number.  */
+  rtx entry_label = gen_label_rtx ();
+  int entry_labelno = CODE_LABEL_NUMBER (entry_label);
+
+  /* Generate entry label name with custom prefix.  */
+  char entry_name[32];
+  ASM_GENERATE_INTERNAL_LABEL (entry_name, "Lkcfi_entry", entry_labelno);
+
+  /* Save current section to restore later.  */
+  section *saved_section = in_section;
+
+  /* Use varasm infrastructure for section handling:
+     .section	.kcfi_traps,"ao",@progbits,.text  */
+  section *kcfi_traps_section = get_section (".kcfi_traps",
+					     SECTION_LINK_ORDER, NULL);
+  switch_to_section (kcfi_traps_section);
+
+  /* Emit entry label for relative offset:
+     .Lkcfi_entry$N:  */
+  ASM_OUTPUT_LABEL (file, entry_name);
+
+  /* Generate address difference using RTL infrastructure.  */
+  rtx entry_label_sym = gen_rtx_SYMBOL_REF (Pmode, entry_name);
+  rtx addr_diff = gen_rtx_MINUS (Pmode, trap_label_sym, entry_label_sym);
+
+  /* Emit the address difference as a 4-byte value:
+    .long	.Lkcfi_trap$N - .Lkcfi_entry$N  */
+  assemble_integer (addr_diff, 4, BITS_PER_UNIT, 1);
+
+  /* Restore the previous section:
+     .text  */
+  switch_to_section (saved_section);
+}
+
+/* Compute KCFI type ID for a function type.  */
+
+static uint32_t
+compute_kcfi_type_id (tree fntype)
+{
+  gcc_assert (fntype);
+  gcc_assert (TREE_CODE (fntype) == FUNCTION_TYPE);
+
+  uint32_t type_id = typeinfo_get_hash (fntype);
+
+  /* Apply target-specific masking if supported.  */
+  if (targetm.kcfi.mask_type_id)
+    type_id = targetm.kcfi.mask_type_id (type_id);
+
+  /* Output to dump file if enabled.  */
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      std::string mangled_name = typeinfo_get_name (fntype);
+      fprintf (dump_file, "KCFI type ID: mangled='%s' typeid=0x%08x\n",
+	       mangled_name.c_str (), type_id);
+    }
+
+  return type_id;
+}
+
+/* Function attribute to store KCFI type ID.  */
+static tree kcfi_type_id_attr = NULL_TREE;
+
+/* Get KCFI type ID for a function type.  Set it if missing.  */
+
+static uint32_t
+kcfi_get_type_id (tree fn_type)
+{
+  uint32_t type_id;
+
+  /* Cache the attribute identifier.  */
+  if (!kcfi_type_id_attr)
+    kcfi_type_id_attr = get_identifier ("kcfi_type_id");
+
+  tree attr = lookup_attribute (IDENTIFIER_POINTER (kcfi_type_id_attr),
+				TYPE_ATTRIBUTES (fn_type));
+  if (attr)
+    {
+      tree value = TREE_VALUE (attr);
+      gcc_assert (value && TREE_CODE (value) == INTEGER_CST);
+      type_id = (uint32_t) TREE_INT_CST_LOW (value);
+    }
+  else
+    {
+      type_id = compute_kcfi_type_id (fn_type);
+
+      tree type_id_tree = build_int_cst (unsigned_type_node, type_id);
+      tree attr = build_tree_list (kcfi_type_id_attr, type_id_tree);
+
+      TYPE_ATTRIBUTES (fn_type) = chainon (TYPE_ATTRIBUTES (fn_type), attr);
+    }
+
+  return type_id;
+}
+
+/* Prepare the global KCFI alignment NOPs calculation.
+   Called once during IPA pass to set global variables.  */
+
+static void
+kcfi_prepare_alignment_nops (void)
+{
+  /* Only use global patchable-function-entry flag, not function attributes.
+     KCFI callsites cannot know about function-specific attributes.  */
+  if (flag_patchable_function_entry)
+    {
+      HOST_WIDE_INT total_nops, prefix_nops = 0;
+      parse_and_check_patch_area (flag_patchable_function_entry, false,
+				  &total_nops, &prefix_nops);
+      /* Store value for callsite offset calculation.  */
+      kcfi_patchable_entry_prefix_nops = prefix_nops;
+    }
+
+  /* Calculate architecture-specific alignment NOPs.
+     KCFI preamble layout:
+     __cfi_func: [alignment_nops][typeid][prefix_nops] func: [entry_nops]
+
+     The alignment NOPs ensure __cfi_func stays at proper function entry
+     alignment when prefix NOPs are added.  */
+  HOST_WIDE_INT arch_alignment = 0;
+
+  /* Calculate alignment NOPs based on function alignment setting.
+     Use explicit -falign-functions value if set, otherwise default to 4.  */
+  int alignment_bytes = 4;
+  if (align_functions.levels[0].log > 0)
+    alignment_bytes = align_functions.levels[0].get_value ();
+
+  /* Get typeid instruction size from target hook, default to 4 bytes.  */
+  int typeid_size = targetm.kcfi.emit_type_id
+		    ? targetm.kcfi.emit_type_id (NULL, 0) : 4;
+
+  /* Calculate alignment NOP bytes needed.  */
+  arch_alignment = (alignment_bytes
+		    - ((kcfi_patchable_entry_prefix_nops + typeid_size)
+		       % alignment_bytes)) % alignment_bytes;
+
+  /* Prepare NOP template.  */
+  rtx_insn *nop_insn = make_insn_raw (gen_nop ());
+  int code_num = recog_memoized (nop_insn);
+  kcfi_nop = get_insn_template (code_num, nop_insn);
+
+  /* Calculate number of NOP instructions needed for alignment.  */
+  int nop_size = get_attr_length (nop_insn);
+  if (arch_alignment % nop_size != 0)
+    sorry ("KCFI function entry alignment padding bytes "
+	   "(" HOST_WIDE_INT_PRINT_DEC ") are not a multiple of "
+	   "architecture NOP instruction size (%d)",
+	   arch_alignment, nop_size);
+  kcfi_patchable_entry_arch_alignment_nops = arch_alignment / nop_size;
+}
+
+/* Extract KCFI type ID from indirect call GIMPLE statement.
+   Returns RTX constant with type ID, or NULL_RTX if no KCFI needed.  */
+
+rtx
+__kcfi_get_type_id_for_expanding_gimple_call (void)
+{
+  gcc_assert (currently_expanding_gimple_stmt);
+  gcc_assert (is_gimple_call (currently_expanding_gimple_stmt));
+
+  /* Internally checks for no_sanitize("kcfi") with current_function_decl.  */
+  if (!sanitize_flags_p (SANITIZE_KCFI))
+    return NULL_RTX;
+
+  gcall *call_stmt = as_a <gcall *> (currently_expanding_gimple_stmt);
+
+  /* Only indirect calls need KCFI instrumentation.  */
+  if (gimple_call_fndecl (call_stmt))
+    return NULL_RTX;
+
+  /* Skip calls originating from inlined no_sanitize("kcfi") functions.  */
+  if (gimple_call_inlined_from_kcfi_nosantize_p (call_stmt))
+    return NULL_RTX;
+
+  /* Get function type of call.  */
+  tree fn_type = gimple_call_fntype (call_stmt);
+  gcc_assert (fn_type);
+
+  /* Return the type_id.  */
+  return GEN_INT (kcfi_get_type_id (fn_type));
+}
+
+/* Emit KCFI type ID symbol for an address-taken external function.  */
+
+void
+kcfi_emit_typeid_symbol (FILE *asm_file, tree fndecl)
+{
+  /* Only emit for external function declarations.  */
+  if (TREE_CODE (fndecl) != FUNCTION_DECL || DECL_INITIAL (fndecl))
+    return;
+
+  /* Only emit for functions that are address-taken.  */
+  struct cgraph_node *node = cgraph_node::get (fndecl);
+  if (!node || !node->address_taken)
+    return;
+
+  /* Get symbol name from RTL and strip encoding prefixes.  */
+  rtx rtl = DECL_RTL (fndecl);
+  const char *name = XSTR (XEXP (rtl, 0), 0);
+  name = targetm.strip_name_encoding (name);
+
+  /* .weak __kcfi_typeid_{name} */
+  std::string symbol_name = std::string ("__kcfi_typeid_") + name;
+  ASM_WEAKEN_LABEL (asm_file, symbol_name.c_str ());
+
+  /* .set __kcfi_typeid_{name}, 0x{type_id} */
+  char val[16];
+  snprintf (val, sizeof (val), "0x%08x",
+	    kcfi_get_type_id (TREE_TYPE (fndecl)));
+  ASM_OUTPUT_DEF (asm_file, symbol_name.c_str (), val);
+}
+
+/* Emit KCFI preamble before the function label.
+   Functions get preambles when -fsanitize=kcfi is enabled, regardless of
+   no_sanitize("kcfi") attribute.  */
+
+void
+kcfi_emit_preamble (FILE *asm_file, tree fndecl, const char *actual_fname)
+{
+  /* Skip functions with nocf_check attribute.  */
+  if (lookup_attribute ("nocf_check", TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
+    return;
+
+  struct cgraph_node *node = cgraph_node::get (fndecl);
+
+  /* Ignore cold partition functions: not reached via indirect call.  */
+  if (node && node->split_part)
+    return;
+
+  /* Ignore cold partition sections: cold partitions are never indirect call
+     targets.  Only skip preambles for cold partitions (has_bb_partition = true)
+     not for entire cold-attributed functions (has_bb_partition = false).  */
+  if (in_cold_section_p && crtl && crtl->has_bb_partition)
+    return;
+
+  /* Check if function is truly address-taken using cgraph node analysis.  */
+  bool addr_taken = (node && node->address_taken);
+
+  /* Only instrument functions that can be targets of indirect calls:
+     - Public functions (can be called externally)
+     - External declarations (from other modules)
+     - Functions with true address-taken status from cgraph analysis.  */
+  if (!(TREE_PUBLIC (fndecl) || DECL_EXTERNAL (fndecl) || addr_taken))
+    return;
+
+  /* Use actual function name if provided, otherwise fall back to
+     DECL_ASSEMBLER_NAME.  */
+  const char *fname = actual_fname
+			? actual_fname
+			: IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (fndecl));
+
+  /* Create symbol name for reuse.  */
+  std::string cfi_symbol_name = std::string ("__cfi_") + fname;
+
+  /* Emit __cfi_ symbol with proper visibility.  */
+  if (TREE_PUBLIC (fndecl))
+    {
+      if (DECL_WEAK (fndecl))
+	ASM_WEAKEN_LABEL (asm_file, cfi_symbol_name.c_str ());
+      else
+	targetm.asm_out.globalize_label (asm_file, cfi_symbol_name.c_str ());
+    }
+
+  /* Emit .type directive.  */
+  ASM_OUTPUT_TYPE_DIRECTIVE (asm_file, cfi_symbol_name.c_str (), "function");
+  ASM_OUTPUT_LABEL (asm_file, cfi_symbol_name.c_str ());
+
+  /* Emit architecture-specific alignment NOPs using target's NOP template.  */
+  for (int i = 0; i < kcfi_patchable_entry_arch_alignment_nops; i++)
+    output_asm_insn (kcfi_nop, NULL);
+
+  /* Emit type ID bytes.  */
+  uint32_t type_id = kcfi_get_type_id (TREE_TYPE (fndecl));
+  if (targetm.kcfi.emit_type_id)
+    targetm.kcfi.emit_type_id (asm_file, type_id);
+  else
+    fprintf (asm_file, "\t.word\t0x%08x\n", type_id);
+
+  /* Mark end of __cfi_ symbol and emit size directive.  */
+  std::string cfi_end_label = std::string (".Lcfi_func_end_") + fname;
+  ASM_OUTPUT_LABEL (asm_file, cfi_end_label.c_str ());
+
+  ASM_OUTPUT_MEASURED_SIZE (asm_file, cfi_symbol_name.c_str ());
+}
+
+namespace {
+
+/* IPA pass for KCFI type ID setting - runs once per compilation unit.  */
+
+const pass_data pass_data_ipa_kcfi =
+{
+  SIMPLE_IPA_PASS, /* type */
+  "ipa_kcfi", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_IPA_OPT, /* tv_id */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+/* Set KCFI type_ids for all usable function types in compilation unit.  */
+
+static unsigned int
+ipa_kcfi_execute (void)
+{
+  struct cgraph_node *node;
+
+  /* Prepare global KCFI alignment NOPs calculation once for all functions.  */
+  kcfi_prepare_alignment_nops ();
+
+  /* Process all functions - both local and external.  */
+  FOR_EACH_FUNCTION (node)
+    {
+      tree fndecl = node->decl;
+
+      /* Skip all non-NORMAL builtins (MD, FRONTEND) entirely.
+	 For NORMAL builtins, skip those that lack an implicit
+	 implementation (closest way to distinguishing DEF_LIB_BUILTIN
+	 from others).  E.g. we need to have typeids for memset().  */
+      if (fndecl_built_in_p (fndecl))
+	{
+	  if (DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
+	    continue;
+	  if (!builtin_decl_implicit_p (DECL_FUNCTION_CODE (fndecl)))
+	    continue;
+	}
+
+      /* Cache the type_id in the function type.  */
+      kcfi_get_type_id (TREE_TYPE (fndecl));
+    }
+
+  return 0;
+}
+
+class pass_ipa_kcfi : public simple_ipa_opt_pass
+{
+public:
+  pass_ipa_kcfi (gcc::context *ctxt)
+    : simple_ipa_opt_pass (pass_data_ipa_kcfi, ctxt)
+  {}
+
+  bool gate (function *) final override
+  {
+    return sanitize_flags_p (SANITIZE_KCFI);
+  }
+
+  unsigned int execute (function *) final override
+  {
+    return ipa_kcfi_execute ();
+  }
+
+}; /* class pass_ipa_kcfi */
+
+} /* anon namespace */
+
+simple_ipa_opt_pass *
+make_pass_ipa_kcfi (gcc::context *ctxt)
+{
+  return new pass_ipa_kcfi (ctxt);
+}
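The padding arithmetic in kcfi_prepare_alignment_nops can be checked in
isolation. The following is a minimal sketch, not part of the patch: the
demo_* names are hypothetical, and the formula is copied from the function
above, including the way it adds the prefix NOP count directly to the typeid
size in bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the padding formula in
   kcfi_prepare_alignment_nops: how many bytes must precede the typeid
   so that (padding + typeid + prefix NOPs) is a multiple of the
   function alignment.  */
static int64_t
demo_kcfi_alignment_pad (int64_t alignment_bytes, int64_t prefix_nops,
                         int64_t typeid_size)
{
  assert (alignment_bytes > 0);
  return (alignment_bytes
          - ((prefix_nops + typeid_size) % alignment_bytes))
         % alignment_bytes;
}

/* Convert padding bytes into a NOP instruction count.  Returns -1 in
   the case where the patch would report sorry (): the padding is not
   a whole number of NOP instructions.  */
static int64_t
demo_kcfi_alignment_nops (int64_t pad_bytes, int64_t nop_size)
{
  assert (nop_size > 0);
  if (pad_bytes % nop_size != 0)
    return -1;
  return pad_bytes / nop_size;
}
```

With a 16-byte function alignment, 4 prefix NOPs, and a 4-byte typeid, the
sketch yields 8 padding bytes, so the preamble occupies exactly 16 bytes and
the function entry keeps its alignment.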
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a14fb498ce44..5b89161ac75a 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1592,6 +1592,7 @@ OBJS = \
 	ira-lives.o \
 	jump.o \
 	kcfi-typeinfo.o \
+	kcfi.o \
 	langhooks.o \
 	late-combine.o \
 	lcm.o \
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index bf681c3e8153..c3c0bc61ee3e 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -337,6 +337,8 @@ enum sanitize_code {
   SANITIZE_KERNEL_HWADDRESS = 1UL << 30,
   /* Shadow Call Stack.  */
   SANITIZE_SHADOW_CALL_STACK = 1UL << 31,
+  /* KCFI (Kernel Control Flow Integrity) */
+  SANITIZE_KCFI = 1ULL << 32,
   SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
 		       | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
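SANITIZE_KCFI is the first sanitizer flag above bit 31, and it is spelled
with 1ULL rather than 1UL. One likely reason: ISO C only guarantees that
unsigned long has at least 32 bits, so on a host where it is exactly 32 bits
wide a shift by 32 would be undefined. The sketch below uses hypothetical
DEMO_* stand-ins (only the bit positions come from the hunk) to show that
the new flag needs a 64-bit flag word and is lost if the word is narrowed to
32 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins; only the bit positions match the hunk.  */
#define DEMO_SANITIZE_SHADOW_CALL_STACK (UINT64_C (1) << 31)
#define DEMO_SANITIZE_KCFI (UINT64_C (1) << 32)

/* Test a flag in a 64-bit flag word.  */
static bool
demo_flag_set_p (uint64_t flags, uint64_t flag)
{
  return (flags & flag) != 0;
}

/* Model what happens if the flag word is stored in a 32-bit variable:
   every bit from 32 upwards is silently dropped.  */
static uint32_t
demo_truncate_flags (uint64_t flags)
{
  return (uint32_t) flags;
}
```

Any code that still carries sanitizer flags in a 32-bit variable would
silently drop the KCFI bit, which is what the truncation helper models.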
diff --git a/gcc/gimple.h b/gcc/gimple.h
index da32651ea017..d5e7acc2c6a7 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -142,6 +142,7 @@ enum gf_mask {
     GF_CALL_ALLOCA_FOR_VAR	= 1 << 5,
     GF_CALL_INTERNAL		= 1 << 6,
     GF_CALL_CTRL_ALTERING       = 1 << 7,
+    GF_CALL_INLINED_FROM_KCFI_NOSANTIZE = 1 << 8,
     GF_CALL_MUST_TAIL_CALL	= 1 << 9,
     GF_CALL_BY_DESCRIPTOR	= 1 << 10,
     GF_CALL_NOCF_CHECK		= 1 << 11,
@@ -3487,6 +3488,27 @@ gimple_call_from_thunk_p (gcall *s)
   return (s->subcode & GF_CALL_FROM_THUNK) != 0;
 }
 
+/* If INLINED_FROM_KCFI_NOSANTIZE_P is true, mark GIMPLE_CALL S as being
+   inlined from a function with no_sanitize("kcfi").  */
+
+inline void
+gimple_call_set_inlined_from_kcfi_nosantize (gcall *s,
+					     bool inlined_from_kcfi_nosantize_p)
+{
+  if (inlined_from_kcfi_nosantize_p)
+    s->subcode |= GF_CALL_INLINED_FROM_KCFI_NOSANTIZE;
+  else
+    s->subcode &= ~GF_CALL_INLINED_FROM_KCFI_NOSANTIZE;
+}
+
+/* Return true if GIMPLE_CALL S was inlined from a function with
+   no_sanitize("kcfi").  */
+
+inline bool
+gimple_call_inlined_from_kcfi_nosantize_p (const gcall *s)
+{
+  return (s->subcode & GF_CALL_INLINED_FROM_KCFI_NOSANTIZE) != 0;
+}
 
 /* If FROM_NEW_OR_DELETE_P is true, mark GIMPLE_CALL S as being a call
    to operator new or delete created from a new or delete expression.  */
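The new subcode bit lets a call statement remember that it came from a
no_sanitize("kcfi") function after it has been inlined elsewhere. Below is a
small standalone model of that idea, assuming the inliner sets the bit when
it copies a call out of such a function. The demo_* names and the struct are
hypothetical; the three tests in demo_needs_kcfi_check follow the order used
by __kcfi_get_type_id_for_expanding_gimple_call in kcfi.cc.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the new gf_mask bit.  */
#define DEMO_GF_CALL_INLINED_FROM_KCFI_NOSANITIZE (1u << 8)

/* Toy call statement: flag bits plus whether the call is indirect.  */
struct demo_call
{
  unsigned subcode;
  bool indirect;
};

/* Model of the inliner copying a call out of a callee body.  The mark
   is only ever added, so it also survives nested inlining.  */
static struct demo_call
demo_inline_call (struct demo_call call, bool callee_no_sanitize_kcfi)
{
  if (callee_no_sanitize_kcfi)
    call.subcode |= DEMO_GF_CALL_INLINED_FROM_KCFI_NOSANITIZE;
  return call;
}

/* Model of the decision made when the call is expanded to RTL.  */
static bool
demo_needs_kcfi_check (const struct demo_call *call,
                       bool caller_sanitizes_kcfi)
{
  if (!caller_sanitizes_kcfi)
    return false;
  if (!call->indirect)
    return false;
  if (call->subcode & DEMO_GF_CALL_INLINED_FROM_KCFI_NOSANITIZE)
    return false;
  return true;
}

/* Convenience wrapper: inline one indirect call and report whether it
   would still be instrumented in the caller.  */
static bool
demo_inlined_call_checked_p (bool callee_no_sanitize_kcfi,
                             bool caller_sanitizes_kcfi)
{
  struct demo_call call = { 0, true };
  call = demo_inline_call (call, callee_no_sanitize_kcfi);
  return demo_needs_kcfi_check (&call, caller_sanitizes_kcfi);
}
```

Without the per-statement mark, the decision would depend only on the
caller's attributes, and a call inlined from a no_sanitize("kcfi") function
into an instrumented caller would be instrumented.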
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 1c68a69350df..8155249c990a 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -544,6 +544,7 @@ extern ipa_opt_pass_d *make_pass_ipa_odr (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_ipa_pta (gcc::context *ctxt);
+extern simple_ipa_opt_pass *make_pass_ipa_kcfi (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_ipa_tm (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_target_clone (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_dispatcher_calls (gcc::context *ctxt);
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 1e3a94ed9493..1580ab25f70b 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "tree-pretty-print.h"
 #include "gcc-rich-location.h"
+#include "asan.h"
 #include "gcc-urlifier.h"
 
 static tree handle_packed_attribute (tree *, tree, tree, int, bool *);
@@ -1740,8 +1741,11 @@ handle_nocf_check_attribute (tree *node, tree name,
       warning (OPT_Wattributes, "%qE attribute ignored", name);
       *no_add_attrs = true;
     }
-  else if (!(flag_cf_protection & CF_BRANCH))
+  else if (!(flag_cf_protection & CF_BRANCH)
+	   && !(flag_sanitize & SANITIZE_KCFI))
     {
+      /* Allow it with -fsanitize=kcfi, but leave this warning alone
+	 to avoid confusion over this weird corner case.  */
       warning (OPT_Wattributes, "%qE attribute ignored. Use "
 				"%<-fcf-protection%> option to enable it",
 				name);
@@ -6508,6 +6512,17 @@ static tree
 handle_patchable_function_entry_attribute (tree *, tree name, tree args,
 					   int, bool *no_add_attrs)
 {
+  /* Function-specific patchable_function_entry attribute is incompatible
+     with KCFI because KCFI callsites cannot know about function-specific
+     patchable entry settings on a preamble in a different translation
+     unit.  */
+  if (sanitize_flags_p (SANITIZE_KCFI))
+    {
+      error ("%qE attribute cannot be used with %<-fsanitize=kcfi%>", name);
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+
   for (; args; args = TREE_CHAIN (args))
     {
       tree val = TREE_VALUE (args);
diff --git a/gcc/df-scan.cc b/gcc/df-scan.cc
index 1e4c6a2a4fb5..2be5e60786a3 100644
--- a/gcc/df-scan.cc
+++ b/gcc/df-scan.cc
@@ -2851,6 +2851,13 @@ df_uses_record (class df_collection_rec *collection_rec,
       /* If we're clobbering a REG then we have a def so ignore.  */
       return;
 
+    case KCFI:
+      /* KCFI wraps other RTL - process the wrapped RTL.  */
+      df_uses_record (collection_rec, &XEXP (x, 0), ref_type, bb, insn_info,
+		      flags);
+      /* The type ID operand (XEXP (x, 1)) doesn't contain register uses.  */
+      return;
+
     case MEM:
       df_uses_record (collection_rec,
 		      &XEXP (x, 0), DF_REF_REG_MEM_LOAD,
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7cddea1ed6c1..ae9c039ab589 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2740,6 +2740,44 @@ void __attribute__ ((no_sanitize ("alignment,object-size")))
 g () @{ /* @r{Do something.} */; @}
 @end smallexample
 
+When @code{no_sanitize("kcfi")} is applied to a function, it disables
+the generation of Kernel Control Flow Integrity (KCFI) instrumentation
+for indirect function calls within that function.  This means that
+indirect calls in the marked function will not be checked against the
+target function's type signature.
+
+However, the function itself will still receive a KCFI preamble (type
+identifier) when compiled with @option{-fsanitize=kcfi}, allowing it to
+be safely called indirectly from other functions that do perform KCFI
+checks.  In other words, @code{no_sanitize("kcfi")} affects outgoing
+calls from the function, not incoming calls to the function.
+
+@smallexample
+void __attribute__ ((no_sanitize ("kcfi")))
+trusted_function(void (*callback)(int))
+@{
+  /* This indirect call will NOT be instrumented with KCFI checks */
+  callback(42);
+@}
+
+void regular_function(void (*callback)(int))
+@{
+  /* This indirect call WILL be instrumented with KCFI checks */
+  callback(42);
+@}
+@end smallexample
+
+This attribute is primarily used in kernel code for special contexts such
+as BPF JIT trampolines or other low-level code where KCFI instrumentation
+might interfere with the intended operation.  The attribute survives
+inlining to ensure that @code{no_sanitize("kcfi")} functions do not generate
+KCFI checks even when inlined into a function that otherwise performs KCFI
+checks.
+
+Note: To disable KCFI preamble generation for a function so that it
+cannot be called indirectly, use the @code{nocf_check} function
+attribute instead.
+
 @cindex @code{no_sanitize_address} function attribute
 @item no_sanitize_address
 @itemx no_address_safety_analysis
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 56c4fa86e346..f96e104a7248 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18382,6 +18382,39 @@ possible by specifying the command-line options
 @option{--param hwasan-instrument-allocas=1} respectively. Using a random frame
 tag is not implemented for kernel instrumentation.
 
+@opindex fsanitize=kcfi
+@item -fsanitize=kcfi
+Enable Kernel Control Flow Integrity (KCFI), a lightweight control
+flow integrity mechanism designed for operating system kernels.
+KCFI instruments indirect function calls to verify that the target
+function has the expected type signature at runtime.  Each function
+receives a type identifier computed from a hash of its function
+prototype (including parameter types and return type).  Before each
+indirect call, the implementation inserts a check to verify that the
+target function's type identifier matches the expected identifier
+for the call site, issuing a trap instruction if a mismatch is detected.
+This provides forward-edge control flow protection against attacks that
+attempt to redirect indirect calls to unintended targets.
+
+The implementation adds minimal runtime overhead and does not require
+runtime library support, making it suitable for kernel environments.
+The type identifier is placed before the function entry point,
+allowing runtime verification without additional metadata structures,
+and without changing the entry points of the target functions.
+
+KCFI is intended primarily for kernel code and may not be suitable
+for user-space applications that rely on techniques incompatible
+with strict type checking of indirect calls.
+
+Note that KCFI is incompatible with function-specific
+@code{patchable_function_entry} attributes because KCFI call sites
+cannot know about function-specific patchable entry settings in different
+translation units.  Only the global @option{-fpatchable-function-entry}
+command-line option is supported with KCFI.
+
+Use @option{-fdump-ipa-kcfi-details} to examine the computed type identifier
+hashes and their corresponding mangled type strings during compilation.
+
 @opindex fsanitize=pointer-compare
 @item -fsanitize=pointer-compare
 Instrument comparison operation (<, <=, >, >=) with pointer operands.
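The call-site check described for -fsanitize=kcfi can be modeled in plain C
as a load from a fixed negative offset before the function entry, followed
by a compare. This is only a userspace sketch over a byte buffer, not what
the compiler emits: the demo_* names, the 64-byte image, and the zero-filled
padding are assumptions made for illustration. It also shows why the target
and the call site must agree on the number of prefix bytes, which is the
reason function-specific patchable_function_entry attributes are rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_TYPEID_SIZE 4u
#define DEMO_IMAGE_SIZE 64u
#define DEMO_ENTRY 32u

/* Offset of the typeid: it sits before the prefix bytes, which in turn
   sit immediately before the function entry.  */
static size_t
demo_typeid_offset (size_t entry, size_t prefix_bytes)
{
  assert (entry >= prefix_bytes + DEMO_TYPEID_SIZE);
  return entry - prefix_bytes - DEMO_TYPEID_SIZE;
}

/* Call-site side: load the typeid the call site believes is there and
   compare it with the expected value.  */
static bool
demo_kcfi_check (const uint8_t *image, size_t entry, size_t prefix_bytes,
                 uint32_t expected)
{
  uint32_t found;
  memcpy (&found, image + demo_typeid_offset (entry, prefix_bytes),
          sizeof found);
  return found == expected;
}

/* Build a toy image whose preamble uses TARGET_PREFIX bytes, then run
   the check as a call site that assumes CALLSITE_PREFIX bytes.  */
static bool
demo_call_allowed (size_t target_prefix, size_t callsite_prefix,
                   uint32_t target_id, uint32_t callsite_id)
{
  uint8_t image[DEMO_IMAGE_SIZE] = { 0 };
  memcpy (image + demo_typeid_offset (DEMO_ENTRY, target_prefix),
          &target_id, sizeof target_id);
  return demo_kcfi_check (image, DEMO_ENTRY, callsite_prefix, callsite_id);
}
```

A call succeeds only when both the typeid and the assumed prefix size match;
in real code a failed comparison reaches the trap instruction instead of
returning false.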
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 37642680f423..69603fdad090 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -3166,6 +3166,7 @@ This describes the stack layout and calling conventions.
 * Tail Calls::
 * Shrink-wrapping separate components::
 * Stack Smashing Protection::
+* Kernel Control Flow Integrity::
 * Miscellaneous Register Hooks::
 @end menu
 
@@ -5432,6 +5433,36 @@ should be allocated from heap memory and consumers should release them.
 The result will be pruned to cases with PREFIX if not NULL.
 @end deftypefn
 
+@node Kernel Control Flow Integrity
+@subsection Kernel Control Flow Integrity
+@cindex kernel control flow integrity
+@cindex KCFI
+
+@deftypefn {Target Hook} bool TARGET_KCFI_SUPPORTED (void)
+Return true if the target supports Kernel Control Flow Integrity (KCFI).
+This hook indicates whether the target has implemented the necessary RTL
+patterns and infrastructure to support KCFI instrumentation.  The default
+implementation returns false.
+@end deftypefn
+
+@deftypefn {Target Hook} uint32_t TARGET_KCFI_MASK_TYPE_ID (uint32_t @var{type_id})
+Apply architecture-specific masking to KCFI type ID.  This hook allows
+targets to apply bit masks or other transformations to the computed KCFI
+type identifier to match the target's specific requirements.  The default
+implementation returns the type ID unchanged.
+@end deftypefn
+
+@deftypefn {Target Hook} int TARGET_KCFI_EMIT_TYPE_ID (FILE *@var{file}, uint32_t @var{type_id})
+Emit architecture-specific type ID instruction for KCFI preambles
+and return the size of the instruction in bytes.
+@var{file} is the assembly output stream and @var{type_id} is the KCFI
+type identifier to emit.  If @var{file} is NULL, skip emission and only
+return the size.  If not overridden, the default fallback emits a
+@code{.word} directive with the type ID and returns 4 bytes.  Targets can
+override this to emit different instruction sequences and return their
+corresponding sizes.
+@end deftypefn
+
 @node Miscellaneous Register Hooks
 @subsection Miscellaneous register hooks
 @cindex miscellaneous register hooks
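The TARGET_KCFI_EMIT_TYPE_ID contract has two modes: emit and report the
size, or (with a NULL stream) report the size only. A minimal sketch of a
hook body with that shape follows. The function name is hypothetical, and
the .word output simply mirrors the fallback in kcfi_emit_preamble; a real
target would substitute its own instruction sequence and size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a target's TARGET_KCFI_EMIT_TYPE_ID
   implementation.  Emits the typeid when FILE is non-NULL and always
   returns the number of bytes the emitted sequence occupies.  */
static int
demo_kcfi_emit_type_id (FILE *file, uint32_t type_id)
{
  if (file)
    fprintf (file, "\t.word\t0x%08x\n", (unsigned int) type_id);
  return 4;
}
```

The size-only mode is what kcfi_prepare_alignment_nops relies on when it
calls the hook with (NULL, 0) before any function has been emitted.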
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index c3ed9a9fd7c2..b2856886194c 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2433,6 +2433,7 @@ This describes the stack layout and calling conventions.
 * Tail Calls::
 * Shrink-wrapping separate components::
 * Stack Smashing Protection::
+* Kernel Control Flow Integrity::
 * Miscellaneous Register Hooks::
 @end menu
 
@@ -3807,6 +3808,17 @@ generic code.
 
 @hook TARGET_GET_VALID_OPTION_VALUES
 
+@node Kernel Control Flow Integrity
+@subsection Kernel Control Flow Integrity
+@cindex kernel control flow integrity
+@cindex KCFI
+
+@hook TARGET_KCFI_SUPPORTED
+
+@hook TARGET_KCFI_MASK_TYPE_ID
+
+@hook TARGET_KCFI_EMIT_TYPE_ID
+
 @node Miscellaneous Register Hooks
 @subsection Miscellaneous register hooks
 @cindex miscellaneous register hooks
diff --git a/gcc/final.cc b/gcc/final.cc
index afcb0bb9efbc..7f6aa9f9e480 100644
--- a/gcc/final.cc
+++ b/gcc/final.cc
@@ -2094,6 +2094,9 @@ call_from_call_insn (const rtx_call_insn *insn)
 	case SET:
 	  x = XEXP (x, 1);
 	  break;
+	case KCFI:
+	  x = XEXP (x, 0);
+	  break;
 	}
     }
   return x;
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 3ab993aea573..0ee37e01d24a 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -2170,6 +2170,7 @@ const struct sanitizer_opts_s sanitizer_opts[] =
   SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true, true),
   SANITIZER_OPT (builtin, SANITIZE_BUILTIN, true, true),
   SANITIZER_OPT (shadow-call-stack, SANITIZE_SHADOW_CALL_STACK, false, false),
+  SANITIZER_OPT (kcfi, SANITIZE_KCFI, false, true),
   SANITIZER_OPT (all, ~sanitize_code_type (0), true, true),
 #undef SANITIZER_OPT
   { NULL, sanitize_code_type (0), 0UL, false, false }
diff --git a/gcc/passes.cc b/gcc/passes.cc
index a33c8d924a52..4c6ceac740ff 100644
--- a/gcc/passes.cc
+++ b/gcc/passes.cc
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "diagnostic-core.h" /* for fnotice */
 #include "stringpool.h"
 #include "attribs.h"
+#include "kcfi.h"
 
 /* Reserved TODOs */
 #define TODO_verify_il			(1u << 31)
diff --git a/gcc/passes.def b/gcc/passes.def
index 68ce53baa0f1..65dd0bf4a41e 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_ipa_auto_profile_offline);
   NEXT_PASS (pass_ipa_free_lang_data);
   NEXT_PASS (pass_ipa_function_and_variable_visibility);
+  NEXT_PASS (pass_ipa_kcfi);
   NEXT_PASS (pass_ipa_strub_mode);
   NEXT_PASS (pass_build_ssa_passes);
   PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
diff --git a/gcc/rtl.def b/gcc/rtl.def
index 15ae7d10fcc1..af643d187b95 100644
--- a/gcc/rtl.def
+++ b/gcc/rtl.def
@@ -318,6 +318,12 @@ DEF_RTL_EXPR(CLOBBER, "clobber", "e", RTX_EXTRA)
 
 DEF_RTL_EXPR(CALL, "call", "ee", RTX_EXTRA)
 
+/* KCFI wrapper for call expressions.
+   Operand 0 is the call expression.
+   Operand 1 is the KCFI type ID (const_int).  */
+
+DEF_RTL_EXPR(KCFI, "kcfi", "ee", RTX_EXTRA)
+
 /* Return from a subroutine.  */
 
 DEF_RTL_EXPR(RETURN, "return", "", RTX_EXTRA)
diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index 63a1d08c46cf..5016fe93ccac 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -1177,6 +1177,11 @@ reg_referenced_p (const_rtx x, const_rtx body)
     case IF_THEN_ELSE:
       return reg_overlap_mentioned_p (x, body);
 
+    case KCFI:
+      /* For KCFI wrapper, check both the wrapped call and the type ID.  */
+      return (reg_overlap_mentioned_p (x, XEXP (body, 0))
+	      || reg_overlap_mentioned_p (x, XEXP (body, 1)));
+
     case TRAP_IF:
       return reg_overlap_mentioned_p (x, TRAP_CONDITION (body));
 
diff --git a/gcc/target.def b/gcc/target.def
index 8e491d838642..47a11c60809a 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -7589,6 +7589,44 @@ DEFHOOKPOD
 The default value is NULL.",
  const char *, NULL)
 
+/* Kernel Control Flow Integrity (KCFI) hooks.  */
+#undef HOOK_PREFIX
+#define HOOK_PREFIX "TARGET_KCFI_"
+HOOK_VECTOR (TARGET_KCFI, kcfi)
+
+DEFHOOK
+(supported,
+ "Return true if the target supports Kernel Control Flow Integrity (KCFI).\n\
+This hook indicates whether the target has implemented the necessary RTL\n\
+patterns and infrastructure to support KCFI instrumentation.  The default\n\
+implementation returns false.",
+ bool, (void),
+ hook_bool_void_false)
+
+DEFHOOK
+(mask_type_id,
+ "Apply architecture-specific masking to KCFI type ID.  This hook allows\n\
+targets to apply bit masks or other transformations to the computed KCFI\n\
+type identifier to match the target's specific requirements.  The default\n\
+implementation returns the type ID unchanged.",
+ uint32_t, (uint32_t type_id),
+ NULL)
+
+DEFHOOK
+(emit_type_id,
+ "Emit architecture-specific type ID instruction for KCFI preambles\n\
+and return the size of the instruction in bytes.\n\
+@var{file} is the assembly output stream and @var{type_id} is the KCFI\n\
+type identifier to emit.  If @var{file} is NULL, skip emission and only\n\
+return the size.  If not overridden, the default fallback emits a\n\
+@code{.word} directive with the type ID and returns 4 bytes.  Targets can\n\
+override this to emit different instruction sequences and return their\n\
+corresponding sizes.",
+ int, (FILE *file, uint32_t type_id),
+ NULL)
+
+HOOK_VECTOR_END (kcfi)
+
 /* Close the 'struct gcc_target' definition.  */
 HOOK_VECTOR_END (C90_EMPTY_HACK)
 
diff --git a/gcc/toplev.cc b/gcc/toplev.cc
index d26467450e37..f48cfeb050aa 100644
--- a/gcc/toplev.cc
+++ b/gcc/toplev.cc
@@ -67,6 +67,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "asan.h"
 #include "tsan.h"
+#include "kcfi.h"
 #include "plugin.h"
 #include "context.h"
 #include "pass_manager.h"
@@ -1739,6 +1740,15 @@ process_options ()
 		  "requires %<-fno-exceptions%>");
     }
 
+  if (flag_sanitize & SANITIZE_KCFI)
+    {
+      if (!targetm.kcfi.supported ())
+	sorry ("%<-fsanitize=kcfi%> not supported by this target");
+
+      if (!lang_GNU_C ())
+	sorry ("%<-fsanitize=kcfi%> is only supported for C");
+    }
+
   HOST_WIDE_INT patch_area_size, patch_area_start;
   parse_and_check_patch_area (flag_patchable_function_entry, false,
 			      &patch_area_size, &patch_area_start);
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 08e642178ba5..e674e176f7d3 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -2104,6 +2104,16 @@ copy_bb (copy_body_data *id, basic_block bb,
 	  /* Advance iterator now before stmt is moved to seq_gsi.  */
 	  gsi_next (&stmts_gsi);
 
+	  /* If inlining from a function with no_sanitize("kcfi"), mark any
+	     call statements in the inlined body with the flag so they skip
+	     KCFI instrumentation.  */
+	  if (is_gimple_call (stmt)
+	      && !sanitize_flags_p (SANITIZE_KCFI, id->src_fn))
+	    {
+	      gcall *call = as_a <gcall *> (stmt);
+	      gimple_call_set_inlined_from_kcfi_nosantize (call, true);
+	    }
+
 	  if (gimple_nop_p (stmt))
 	      continue;
 
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 0d78f5b384fb..d4e9e2373c6c 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "asan.h"
 #include "rtl-iter.h"
+#include "kcfi.h"
 #include "file-prefix-map.h" /* remap_debug_filename()  */
 #include "alloc-pool.h"
 #include "toplev.h"
@@ -2199,6 +2200,10 @@ assemble_start_function (tree decl, const char *fnname)
   unsigned short patch_area_size = crtl->patch_area_size;
   unsigned short patch_area_entry = crtl->patch_area_entry;
 
+  /* Emit KCFI preamble before any patchable areas.  */
+  if (flag_sanitize & SANITIZE_KCFI)
+    kcfi_emit_preamble (asm_out_file, decl, fnname);
+
   /* Emit the patching area before the entry label, if any.  */
   if (patch_area_entry > 0)
     targetm.asm_out.print_patchable_function_entry (asm_out_file,
@@ -2767,6 +2772,9 @@ assemble_external_real (tree decl)
       /* Some systems do require some output.  */
       SYMBOL_REF_USED (XEXP (rtl, 0)) = 1;
       ASM_OUTPUT_EXTERNAL (asm_out_file, decl, XSTR (XEXP (rtl, 0), 0));
+
+      if (flag_sanitize & SANITIZE_KCFI)
+	kcfi_emit_typeid_symbol (asm_out_file, decl);
     }
 }
 #endif
@@ -7283,16 +7291,25 @@ default_elf_asm_named_section (const char *name, unsigned int flags,
 	fprintf (asm_out_file, ",%d", flags & SECTION_ENTSIZE);
       if (flags & SECTION_LINK_ORDER)
 	{
-	  /* For now, only section "__patchable_function_entries"
-	     adopts flag SECTION_LINK_ORDER, internal label LPFE*
-	     was emitted in default_print_patchable_function_entry,
-	     just place it here for linked_to section.  */
-	  gcc_assert (!strcmp (name, "__patchable_function_entries"));
-	  fprintf (asm_out_file, ",");
-	  char buf[256];
-	  ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE",
-				       current_function_funcdef_no);
-	  assemble_name_raw (asm_out_file, buf);
+	  if (!strcmp (name, "__patchable_function_entries"))
+	    {
+	      /* For patchable function entries, internal label LPFE*
+		 was emitted in default_print_patchable_function_entry,
+		 just place it here for linked_to section.  */
+	      fprintf (asm_out_file, ",");
+	      char buf[256];
+	      ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE",
+					   current_function_funcdef_no);
+	      assemble_name_raw (asm_out_file, buf);
+	    }
+	  else if (!strcmp (name, ".kcfi_traps"))
+	    {
+	      /* KCFI traps section links to .text section.  */
+	      fprintf (asm_out_file, ",.text");
+	    }
+	  else
+	    internal_error ("unexpected use of %<SECTION_LINK_ORDER%> by section %qs",
+			    name);
 	}
       if (HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE))
 	{
-- 
2.34.1



* [PATCH v3 3/7] x86: Add x86_64 Kernel Control Flow Integrity implementation
  2025-09-13 23:23 [PATCH v3 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
  2025-09-13 23:23 ` [PATCH v3 1/7] typeinfo: Introduce KCFI typeinfo mangling API Kees Cook
  2025-09-13 23:23 ` [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure Kees Cook
@ 2025-09-13 23:23 ` Kees Cook
  2025-09-13 23:24 ` [PATCH v3 4/7] aarch64: Add AArch64 " Kees Cook
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: Kees Cook @ 2025-09-13 23:23 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Kees Cook, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

Implement x86_64-specific KCFI backend:

- Implies -mindirect-branch-register since KCFI needs call target in
  a register for typeid hash loading.

- Function preamble generation with type IDs positioned at -(4+prefix_nops)
  offset from function entry point.

- Function-aligned KCFI preambles using calculated alignment NOPs:
  aligned(prefix_nops + 5, $func_align) to maintain ability to call the
  __cfi_ preamble directly in the case of Linux's FineIBT alternative
  CFI sequences (live patched into place).

- Type-id hash avoids generating ENDBR instruction in type IDs
  (0xfa1e0ff3/0xfb1e0ff3 are incremented by 1 to prevent execution).

- On-demand scratch register allocation strategy: r10 by default, r11
  when the call target is already in r10.  The clobbers are available
  both early and late.

- Uses the .kcfi_traps section for debugger/runtime metadata.

Assembly Code Pattern layout required by Linux kernel:
  movl $inverse_type_id, %r10d   ; Load expected type (0 - hash)
  addl offset(%target), %r10d    ; Add stored type ID from preamble
  je .Lkcfi_call                 ; Branch if types match (sum == 0)
  .Lkcfi_trap: ud2               ; Undefined instruction trap on mismatch
  .Lkcfi_call: call/jmp *%target ; Execute validated indirect transfer

Build and run tested on x86_64 Linux kernel with various CPU errata
handling alternatives, with and without FineIBT patching.

gcc/ChangeLog:

	config/i386/i386.h: KCFI enables TARGET_INDIRECT_BRANCH_REGISTER.
	config/i386/i386-protos.h: Declare ix86_output_kcfi_insn().
	config/i386/i386-expand.cc (ix86_expand_call): Expand indirect
	calls into KCFI RTL.
	config/i386/i386.cc (ix86_kcfi_mask_type_id): New function.
	(ix86_output_kcfi_insn): New function to emit KCFI assembly.
	config/i386/i386.md: Add KCFI RTL patterns.
	doc/invoke.texi: Document x86 nuances.

Signed-off-by: Kees Cook <kees@kernel.org>
---
 gcc/config/i386/i386-protos.h  |   1 +
 gcc/config/i386/i386.h         |   3 +-
 gcc/config/i386/i386-expand.cc |  22 +++++-
 gcc/config/i386/i386.cc        | 130 +++++++++++++++++++++++++++++++++
 gcc/config/i386/i386.md        |  62 +++++++++++++++-
 gcc/doc/invoke.texi            |  23 ++++++
 6 files changed, 233 insertions(+), 8 deletions(-)
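
A note for reviewers on the arithmetic behind the check sequence and the
ENDBR masking described in the commit message.  The following is only a
minimal C model, not code from the patch: the function names
mask_type_id() and kcfi_check_passes() are made up for illustration, and
the example hash value used below is arbitrary.  It shows that loading
(0 - hash) and adding the stored type ID wraps to zero exactly when the
two 32-bit values are equal, which is what the "je" tests.

```c
#include <stdint.h>

/* Same rule as the masking hook in this patch: a type ID equal to the
   ENDBR64 or ENDBR32 encoding is incremented by 1.  The function name
   is illustrative only.  */
uint32_t
mask_type_id (uint32_t type_id)
{
  if (type_id == 0xfa1e0ff3U || type_id == 0xfb1e0ff3U)
    return type_id + 1;
  return type_id;
}

/* Model of "movl $inverse_type_id, %r10d; addl offset(%target), %r10d; je".
   EXPECTED is the call-site hash, STORED is the value read from the
   target's preamble.  Returns nonzero when the call would proceed.  */
int
kcfi_check_passes (uint32_t expected, uint32_t stored)
{
  uint32_t scratch = (uint32_t) (0 - expected);	/* movl */
  scratch += stored;				/* addl, wraps modulo 2^32 */
  return scratch == 0;				/* je */
}
```

With an arbitrary hash such as 0x12345678, kcfi_check_passes() accepts a
matching preamble value and rejects any other value, and mask_type_id()
turns 0xfa1e0ff3 into 0xfa1e0ff4 while leaving other values unchanged.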

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index bdb8bb963b5d..b0b3864fb53c 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -377,6 +377,7 @@ extern enum attr_cpu ix86_schedule;
 
 extern bool ix86_nopic_noplt_attribute_p (rtx call_op);
 extern const char * ix86_output_call_insn (rtx_insn *insn, rtx call_op);
+extern const char * ix86_output_kcfi_insn (rtx_insn *insn, rtx *operands);
 extern const char * ix86_output_indirect_jmp (rtx call_op);
 extern const char * ix86_output_function_return (bool long_p);
 extern const char * ix86_output_indirect_function_return (rtx ret_op);
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 2d53db683176..5c6012ac743b 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -3038,7 +3038,8 @@ extern void debug_dispatch_window (int);
 
 #define TARGET_INDIRECT_BRANCH_REGISTER \
   (ix86_indirect_branch_register \
-   || cfun->machine->indirect_branch_type != indirect_branch_keep)
+   || cfun->machine->indirect_branch_type != indirect_branch_keep \
+   || (flag_sanitize & SANITIZE_KCFI))
 
 #define IX86_HLE_ACQUIRE (1 << 16)
 #define IX86_HLE_RELEASE (1 << 17)
diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index ef6c12cd5697..3f322271b98f 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -94,6 +94,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "i386-builtins.h"
 #include "i386-expand.h"
 #include "asan.h"
+#include "kcfi.h"
 
 /* Split one or more double-mode RTL references into pairs of half-mode
    references.  The RTL can be REG, offsettable MEM, integer constant, or
@@ -10279,8 +10280,9 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
   unsigned int vec_len = 0;
   tree fndecl;
   bool call_no_callee_saved_registers = false;
+  bool is_direct_call = SYMBOL_REF_P (XEXP (fnaddr, 0));
 
-  if (SYMBOL_REF_P (XEXP (fnaddr, 0)))
+  if (is_direct_call)
     {
       fndecl = SYMBOL_REF_DECL (XEXP (fnaddr, 0));
       if (fndecl)
@@ -10317,7 +10319,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
   if (TARGET_MACHO && !TARGET_64BIT)
     {
 #if TARGET_MACHO
-      if (flag_pic && SYMBOL_REF_P (XEXP (fnaddr, 0)))
+      if (flag_pic && is_direct_call)
 	fnaddr = machopic_indirect_call_target (fnaddr);
 #endif
     }
@@ -10401,7 +10403,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
   if (ix86_cmodel == CM_LARGE_PIC
       && !TARGET_PECOFF
       && MEM_P (fnaddr)
-      && SYMBOL_REF_P (XEXP (fnaddr, 0))
+      && is_direct_call
       && !local_symbolic_operand (XEXP (fnaddr, 0), VOIDmode))
     fnaddr = gen_rtx_MEM (QImode, construct_plt_address (XEXP (fnaddr, 0)));
   /* Since x32 GOT slot is 64 bit with zero upper 32 bits, indirect
@@ -10433,6 +10435,20 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
 
   call = gen_rtx_CALL (VOIDmode, fnaddr, callarg1);
 
+  /* Only indirect calls need KCFI instrumentation.  */
+  rtx kcfi_type_rtx = is_direct_call ? NULL_RTX
+    : kcfi_get_type_id_for_expanding_gimple_call ();
+  if (kcfi_type_rtx)
+    {
+      /* Wrap call with KCFI.  */
+      call = gen_rtx_KCFI (VOIDmode, call, kcfi_type_rtx);
+
+      /* Add KCFI clobbers for the insn sequence.  */
+      clobber_reg (&use, gen_rtx_REG (DImode, R10_REG));
+      clobber_reg (&use, gen_rtx_REG (DImode, R11_REG));
+      clobber_reg (&use, gen_rtx_REG (CCmode, FLAGS_REG));
+    }
+
   if (retval)
     call = gen_rtx_SET (retval, call);
   vec[vec_len++] = call;
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index b2c1acd12dac..c3dde17322f6 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -98,6 +98,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "i386-builtins.h"
 #include "i386-expand.h"
 #include "i386-features.h"
+#include "kcfi.h"
 #include "function-abi.h"
 #include "rtl-error.h"
 #include "gimple-pretty-print.h"
@@ -1700,6 +1701,20 @@ ix86_function_naked (const_tree fn)
   return false;
 }
 
+/* Apply x86-64 specific masking to KCFI type ID.  */
+
+static uint32_t
+ix86_kcfi_mask_type_id (uint32_t type_id)
+{
+  /* Avoid embedding ENDBR instructions in KCFI type IDs.
+     ENDBR64: 0xfa1e0ff3, ENDBR32: 0xfb1e0ff3
+     If the type ID matches either instruction encoding, increment by 1.  */
+  if (type_id == 0xfa1e0ff3U || type_id == 0xfb1e0ff3U)
+    return type_id + 1;
+
+  return type_id;
+}
+
 /* Write the extra assembler code needed to declare a function properly.  */
 
 void
@@ -28469,6 +28484,121 @@ ix86_set_handled_components (sbitmap components)
       }
 }
 
+/* Output the assembly for a KCFI checked call instruction.  */
+
+const char *
+ix86_output_kcfi_insn (rtx_insn *insn, rtx *operands)
+{
+  /* KCFI is only supported in 64-bit mode due to use of r10/r11 registers.  */
+  if (!TARGET_64BIT || TARGET_X32)
+    {
+      sorry ("%<-fsanitize=kcfi%> is not supported for 32-bit x86 or x32 mode");
+      return "";
+    }
+
+  /* Target is guaranteed to be in a register due to
+     TARGET_INDIRECT_BRANCH_REGISTER.  */
+  rtx target_reg = operands[0];
+  gcc_assert (REG_P (target_reg));
+
+  /* In thunk-extern mode, the register must be R11 for FineIBT
+     compatibility.  Should this be handled via constraints?  */
+  if (cfun->machine->indirect_branch_type == indirect_branch_thunk_extern)
+    {
+      if (REGNO (target_reg) != R11_REG)
+	{
+	  /* Emit move from current target to R11.  */
+	  target_reg = gen_rtx_REG (DImode, R11_REG);
+	  rtx r11_operands[2] = { operands[0], target_reg };
+	  output_asm_insn ("movq\t%0, %1", r11_operands);
+	}
+    }
+
+  /* Generate labels internally.  */
+  rtx trap_label = gen_label_rtx ();
+  rtx call_label = gen_label_rtx ();
+
+  /* Get label numbers for custom naming.  */
+  int trap_labelno = CODE_LABEL_NUMBER (trap_label);
+  int call_labelno = CODE_LABEL_NUMBER (call_label);
+
+  /* Generate custom label names.  */
+  char trap_name[32];
+  char call_name[32];
+  ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", trap_labelno);
+  ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", call_labelno);
+
+  /* Choose scratch register: r10 by default, r11 if r10 is the target.  */
+  bool target_is_r10 = (REGNO (target_reg) == R10_REG);
+  int scratch_reg = target_is_r10 ? R11_REG : R10_REG;
+
+  /* Get KCFI type ID from operand.  */
+  uint32_t type_id = (uint32_t) INTVAL (operands[2]);
+
+  /* Convert to inverse for the check (0 - hash) */
+  uint32_t inverse_type_id = (uint32_t)(0 - type_id);
+
+  /* Calculate offset to typeid from target address.  */
+  HOST_WIDE_INT offset = -(4 + kcfi_patchable_entry_prefix_nops);
+
+  /* Output complete KCFI check + call/sibcall sequence atomically.  */
+  rtx inverse_type_id_rtx = gen_int_mode (inverse_type_id, SImode);
+  rtx mov_operands[2] = { inverse_type_id_rtx,
+			  gen_rtx_REG (SImode, scratch_reg) };
+  output_asm_insn ("movl\t$%c0, %1", mov_operands);
+
+  /* Create memory operand for the addl instruction.  */
+  rtx offset_rtx = gen_int_mode (offset, DImode);
+  rtx mem_op = gen_rtx_MEM (SImode,
+			    gen_rtx_PLUS (DImode, target_reg, offset_rtx));
+  rtx add_operands[2] = { mem_op, gen_rtx_REG (SImode, scratch_reg) };
+  output_asm_insn ("addl\t%0, %1", add_operands);
+
+  /* Output conditional jump to call label.  */
+  fputs ("\tje\t", asm_out_file);
+  assemble_name (asm_out_file, call_name);
+  fputc ('\n', asm_out_file);
+
+  /* Output trap label and instruction.  */
+  ASM_OUTPUT_LABEL (asm_out_file, trap_name);
+  output_asm_insn ("ud2", operands);
+
+  /* Use common helper for trap section entry.  */
+  rtx trap_label_sym = gen_rtx_SYMBOL_REF (Pmode, trap_name);
+  kcfi_emit_traps_section (asm_out_file, trap_label_sym);
+
+  /* Output pass/call label.  */
+  ASM_OUTPUT_LABEL (asm_out_file, call_name);
+
+  /* Finally emit the protected call or sibling call.  */
+  if (SIBLING_CALL_P (insn))
+    return ix86_output_indirect_jmp (target_reg);
+  else
+    return ix86_output_call_insn (insn, target_reg);
+}
+
+/* Emit x86_64-specific type ID instruction and return instruction size.  */
+
+static int
+ix86_kcfi_emit_type_id (FILE *file, uint32_t type_id)
+{
+  /* Emit movl instruction with type ID if file is not NULL.  */
+  if (file)
+    fprintf (file, "\tmovl\t$0x%08x, %%eax\n", type_id);
+
+  /* x86_64 uses 5-byte movl instruction for type ID.  */
+  return 5;
+}
+
+#undef TARGET_KCFI_SUPPORTED
+#define TARGET_KCFI_SUPPORTED hook_bool_void_true
+
+#undef TARGET_KCFI_MASK_TYPE_ID
+#define TARGET_KCFI_MASK_TYPE_ID ix86_kcfi_mask_type_id
+
+#undef TARGET_KCFI_EMIT_TYPE_ID
+#define TARGET_KCFI_EMIT_TYPE_ID ix86_kcfi_emit_type_id
+
 #undef TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS
 #define TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS ix86_get_separate_components
 #undef TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index cea6c152f2b9..b36979e67981 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -20274,11 +20274,24 @@
   DONE;
 })
 
+;; KCFI indirect call
+(define_insn "*call"
+  [(kcfi (call (mem:QI (match_operand:W 0 "call_insn_operand" "<c>BwBz"))
+	       (match_operand 1))
+	 (match_operand 2 "const_int_operand"))]
+  "!SIBLING_CALL_P (insn)"
+{
+  return ix86_output_kcfi_insn (insn, operands);
+}
+  [(set_attr "type" "call")])
+
 (define_insn "*call"
   [(call (mem:QI (match_operand:W 0 "call_insn_operand" "<c>BwBz"))
 	 (match_operand 1))]
   "!SIBLING_CALL_P (insn)"
-  "* return ix86_output_call_insn (insn, operands[0]);"
+{
+  return ix86_output_call_insn (insn, operands[0]);
+}
   [(set_attr "type" "call")])
 
 ;; This covers both call and sibcall since only GOT slot is allowed.
@@ -20311,11 +20324,24 @@
 }
   [(set_attr "type" "call")])
 
+;; KCFI sibling call
+(define_insn "*sibcall"
+  [(kcfi (call (mem:QI (match_operand:W 0 "sibcall_insn_operand" "UBsBz"))
+	       (match_operand 1))
+	 (match_operand 2 "const_int_operand"))]
+  "SIBLING_CALL_P (insn)"
+{
+  return ix86_output_kcfi_insn (insn, operands);
+}
+  [(set_attr "type" "call")])
+
 (define_insn "*sibcall"
   [(call (mem:QI (match_operand:W 0 "sibcall_insn_operand" "UBsBz"))
 	 (match_operand 1))]
   "SIBLING_CALL_P (insn)"
-  "* return ix86_output_call_insn (insn, operands[0]);"
+{
+  return ix86_output_call_insn (insn, operands[0]);
+}
   [(set_attr "type" "call")])
 
 (define_insn "*sibcall_memory"
@@ -20472,12 +20498,26 @@
   DONE;
 })
 
+;; KCFI call with return value
+(define_insn "*call_value"
+  [(set (match_operand 0)
+	(kcfi (call (mem:QI (match_operand:W 1 "call_insn_operand" "<c>BwBz"))
+		    (match_operand 2))
+	      (match_operand 3 "const_int_operand")))]
+  "!SIBLING_CALL_P (insn)"
+{
+  return ix86_output_kcfi_insn (insn, &operands[1]);
+}
+  [(set_attr "type" "callv")])
+
 (define_insn "*call_value"
   [(set (match_operand 0)
 	(call (mem:QI (match_operand:W 1 "call_insn_operand" "<c>BwBz"))
 	      (match_operand 2)))]
   "!SIBLING_CALL_P (insn)"
-  "* return ix86_output_call_insn (insn, operands[1]);"
+{
+  return ix86_output_call_insn (insn, operands[1]);
+}
   [(set_attr "type" "callv")])
 
 ;; This covers both call and sibcall since only GOT slot is allowed.
@@ -20513,12 +20553,26 @@
 }
   [(set_attr "type" "callv")])
 
+;; KCFI sibling call with return value
+(define_insn "*sibcall_value"
+  [(set (match_operand 0)
+	(kcfi (call (mem:QI (match_operand:W 1 "sibcall_insn_operand" "UBsBz"))
+		    (match_operand 2))
+	      (match_operand 3 "const_int_operand")))]
+  "SIBLING_CALL_P (insn)"
+{
+  return ix86_output_kcfi_insn (insn, &operands[1]);
+}
+  [(set_attr "type" "callv")])
+
 (define_insn "*sibcall_value"
   [(set (match_operand 0)
 	(call (mem:QI (match_operand:W 1 "sibcall_insn_operand" "UBsBz"))
 	      (match_operand 2)))]
   "SIBLING_CALL_P (insn)"
-  "* return ix86_output_call_insn (insn, operands[1]);"
+{
+  return ix86_output_call_insn (insn, operands[1]);
+}
   [(set_attr "type" "callv")])
 
 (define_insn "*sibcall_value_memory"
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f96e104a7248..bd84b7dd903f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18402,6 +18402,29 @@ The type identifier is placed before the function entry point,
 allowing runtime verification without additional metadata structures,
 and without changing the entry points of the target functions.
 
+Platform-specific implementation details:
+
+On x86_64, KCFI type identifiers are emitted as a @code{movl $ID, %eax}
+instruction before the function entry.  The implementation ensures that
+type IDs never collide with ENDBR instruction encodings.  When used
+with @option{-fpatchable-function-entry}, the type identifier is
+placed before any patchable NOPs, with appropriate alignment to maintain
+the alignment specified by @option{-falign-functions}.  KCFI automatically
+implies @option{-mindirect-branch-register}, forcing all indirect calls
+and jumps to use registers instead of memory operands.  The runtime
+check loads the negated expected type ID into @code{%r10d} and uses an
+@code{addl} instruction to add the type ID stored before the target
+function, zeroing the register if the types match.  A conditional
+jump follows to either continue execution or trap on mismatch.  The
+check sequence uses @code{%r10d} and @code{%r11d} as scratch registers.
+Trap locations are recorded in a special @code{.kcfi_traps} section
+that maps trap sites to their corresponding function entry points,
+enabling debuggers and crash handlers to identify KCFI violations.
+The exact instruction sequences for both the KCFI preamble and the
+check-call bundle are considered ABI, as the Linux kernel may
+optionally rewrite these areas at boot time to mitigate detected CPU
+errata.
+
 KCFI is intended primarily for kernel code and may not be suitable
 for user-space applications that rely on techniques incompatible
 with strict type checking of indirect calls.
-- 
2.34.1



* [PATCH v3 4/7] aarch64: Add AArch64 Kernel Control Flow Integrity implementation
  2025-09-13 23:23 [PATCH v3 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
                   ` (2 preceding siblings ...)
  2025-09-13 23:23 ` [PATCH v3 3/7] x86: Add x86_64 Kernel Control Flow Integrity implementation Kees Cook
@ 2025-09-13 23:24 ` Kees Cook
  2025-09-13 23:43   ` Andrew Pinski
  2025-09-13 23:24 ` [PATCH v3 5/7] arm: Add ARM 32-bit " Kees Cook
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2025-09-13 23:24 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Kees Cook, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

Implement AArch64-specific KCFI backend.

- Trap debugging through ESR (Exception Syndrome Register) encoding
  in BRK instruction immediate values.

- Scratch registers are w16/w17 (x16/x17), the intra-procedure-call
  scratch registers (IP0/IP1) of the AArch64 procedure call standard.

Assembly Code Pattern for AArch64:
  ldur w16, [target, #-4]       ; Load actual type ID from preamble
  mov  w17, #type_id_low        ; Load expected type (lower 16 bits)
  movk w17, #type_id_high, lsl #16  ; Insert upper 16 bits
  cmp  w16, w17                 ; Compare type IDs directly
  b.eq .Lpass                   ; Branch if types match
  .Ltrap: brk #esr_value        ; Enhanced trap with register info
  .Lpass: blr/br target         ; Execute validated indirect transfer

ESR (Exception Syndrome Register) Integration:
- BRK instruction immediate encoding format:
  0x8000 | ((TypeIndex & 31) << 5) | (AddrIndex & 31)
  - TypeIndex indicates which W register contains expected type (W17 = 17)
  - AddrIndex indicates which X register contains target address (0-30)
  - Example: brk #33313 (0x8221) = expected type in W17, target address in X1

Build and run tested with Linux kernel ARCH=arm64.

gcc/ChangeLog:

	config/aarch64/aarch64-protos.h: Declare aarch64_indirect_branch_asm,
	and KCFI helpers.
	config/aarch64/aarch64.cc (aarch64_expand_call): Wrap CALLs in
	KCFI, with clobbers.
	(aarch64_indirect_branch_asm): New function, extract common
	logic for branch asm, like existing call asm helper.
	(aarch64_output_kcfi_insn): Emit KCFI assembly.
	config/aarch64/aarch64.md: Add KCFI RTL patterns and replace
	open-coded branch emission with aarch64_indirect_branch_asm.
	doc/invoke.texi: Document aarch64 nuances.

Signed-off-by: Kees Cook <kees@kernel.org>
---
 gcc/config/aarch64/aarch64-protos.h |   5 ++
 gcc/config/aarch64/aarch64.cc       | 116 ++++++++++++++++++++++++++++
 gcc/config/aarch64/aarch64.md       |  64 +++++++++++++--
 gcc/doc/invoke.texi                 |  14 ++++
 4 files changed, 191 insertions(+), 8 deletions(-)
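
A note for reviewers on the BRK immediate format and the mov/movk split
described in the commit message.  The following is a small C sketch, not
code from the patch: all function names are hypothetical, and the
decoding helpers only illustrate how a trap handler could recover the
register indexes, assuming it sees the same 16-bit immediate that the
compiler emitted.

```c
#include <stdint.h>

/* Build the BRK immediate: 0x8000 | ((TypeIndex & 31) << 5)
   | (AddrIndex & 31).  */
unsigned
kcfi_brk_imm (unsigned type_index, unsigned addr_index)
{
  return 0x8000u | ((type_index & 31u) << 5) | (addr_index & 31u);
}

/* Recover the W register number holding the expected type ID.  */
unsigned
kcfi_brk_type_index (unsigned imm)
{
  return (imm >> 5) & 31u;
}

/* Recover the X register number holding the call target.  */
unsigned
kcfi_brk_addr_index (unsigned imm)
{
  return imm & 31u;
}

/* Halves of the 32-bit type ID as used by "mov w17, #low" and
   "movk w17, #high, lsl #16".  */
unsigned
kcfi_type_id_low (uint32_t type_id)
{
  return type_id & 0xFFFFu;
}

unsigned
kcfi_type_id_high (uint32_t type_id)
{
  return (type_id >> 16) & 0xFFFFu;
}
```

For the example in the commit message, kcfi_brk_imm (17, 1) yields
0x8221 (33313), and decoding that value gives back W17 and X1.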

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 56efcf2c7f2c..c91fdcc80ea3 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1261,6 +1261,7 @@ tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
 const char *aarch64_sls_barrier (int);
 const char *aarch64_indirect_call_asm (rtx);
+const char *aarch64_indirect_branch_asm (rtx);
 extern bool aarch64_harden_sls_retbr_p (void);
 extern bool aarch64_harden_sls_blr_p (void);
 
@@ -1284,4 +1285,8 @@ extern unsigned aarch64_stack_alignment (const_tree exp, unsigned align);
 extern rtx aarch64_gen_compare_zero_and_branch (rtx_code code, rtx x,
 						rtx_code_label *label);
 
+/* KCFI support.  */
+extern void kcfi_emit_trap_with_section (FILE *file, rtx trap_label_rtx);
+extern const char *aarch64_output_kcfi_insn (rtx_insn *insn, rtx *operands);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index fb8311b655d7..a7d17f18b72e 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -83,6 +83,7 @@
 #include "rtlanal.h"
 #include "tree-dfa.h"
 #include "asan.h"
+#include "kcfi.h"
 #include "aarch64-elf-metadata.h"
 #include "aarch64-feature-deps.h"
 #include "config/arm/aarch-common.h"
@@ -11848,6 +11849,16 @@ aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall)
 
   call = gen_rtx_CALL (VOIDmode, mem, const0_rtx);
 
+  /* Only indirect calls need KCFI instrumentation.  */
+  bool is_direct_call = SYMBOL_REF_P (XEXP (mem, 0));
+  rtx kcfi_type_rtx = is_direct_call ? NULL_RTX
+    : kcfi_get_type_id_for_expanding_gimple_call ();
+  if (kcfi_type_rtx)
+    {
+      /* Wrap call in KCFI.  */
+      call = gen_rtx_KCFI (VOIDmode, call, kcfi_type_rtx);
+    }
+
   if (result != NULL_RTX)
     call = gen_rtx_SET (result, call);
 
@@ -11864,6 +11875,16 @@ aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall)
 
   auto call_insn = aarch64_emit_call_insn (call);
 
+  /* Add KCFI clobbers for indirect calls.  */
+  if (kcfi_type_rtx)
+    {
+      rtx usage = CALL_INSN_FUNCTION_USAGE (call_insn);
+      /* Add X16 and X17 clobbers for AArch64 KCFI scratch registers.  */
+      clobber_reg (&usage, gen_rtx_REG (DImode, 16));
+      clobber_reg (&usage, gen_rtx_REG (DImode, 17));
+      CALL_INSN_FUNCTION_USAGE (call_insn) = usage;
+    }
+
   /* Check whether the call requires a change to PSTATE.SM.  We can't
      emit the instructions to change PSTATE.SM yet, since they involve
      a change in vector length and a change in instruction set, which
@@ -30630,6 +30651,14 @@ aarch64_indirect_call_asm (rtx addr)
   return "";
 }
 
+const char *
+aarch64_indirect_branch_asm (rtx addr)
+{
+  gcc_assert (REG_P (addr));
+  output_asm_insn ("br\t%0", &addr);
+  return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
+}
+
 /* Emit the assembly instruction to load the thread pointer into DEST.
    Select between different tpidr_elN registers depending on -mtp= setting.  */
 
@@ -32823,6 +32852,93 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_DOCUMENTATION_NAME
 #define TARGET_DOCUMENTATION_NAME "AArch64"
 
+/* Output the assembly for a KCFI checked call instruction.  */
+
+const char *
+aarch64_output_kcfi_insn (rtx_insn *insn, rtx *operands)
+{
+  /* KCFI is only supported in LP64 mode.  */
+  if (TARGET_ILP32)
+    {
+      sorry ("%<-fsanitize=kcfi%> is not supported for %<-mabi=ilp32%>");
+      return "";
+    }
+  /* Target register is operands[0].  */
+  rtx target_reg = operands[0];
+  gcc_assert (REG_P (target_reg));
+
+  /* Get KCFI type ID from operand[3].  */
+  uint32_t type_id = (uint32_t) INTVAL (operands[3]);
+
+  /* Calculate typeid offset from call target.  */
+  HOST_WIDE_INT offset = -(4 + kcfi_patchable_entry_prefix_nops);
+
+  /* Generate labels internally.  */
+  rtx trap_label = gen_label_rtx ();
+  rtx call_label = gen_label_rtx ();
+
+  /* Get label numbers for custom naming.  */
+  int trap_labelno = CODE_LABEL_NUMBER (trap_label);
+  int call_labelno = CODE_LABEL_NUMBER (call_label);
+
+  /* Generate custom label names.  */
+  char trap_name[32];
+  char call_name[32];
+  ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", trap_labelno);
+  ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", call_labelno);
+
+  rtx temp_operands[3];
+
+  /* Load actual type into w16 from memory at offset using ldur.  */
+  temp_operands[0] = gen_rtx_REG (SImode, R16_REGNUM);
+  temp_operands[1] = target_reg;
+  temp_operands[2] = GEN_INT (offset);
+  output_asm_insn ("ldur\t%w0, [%1, #%2]", temp_operands);
+
+  /* Load expected type low 16 bits into w17.  */
+  temp_operands[0] = gen_rtx_REG (SImode, R17_REGNUM);
+  temp_operands[1] = GEN_INT (type_id & 0xFFFF);
+  output_asm_insn ("mov\t%w0, #%1", temp_operands);
+
+  /* Load expected type high 16 bits into w17.  */
+  temp_operands[0] = gen_rtx_REG (SImode, R17_REGNUM);
+  temp_operands[1] = GEN_INT ((type_id >> 16) & 0xFFFF);
+  output_asm_insn ("movk\t%w0, #%1, lsl #16", temp_operands);
+
+  /* Compare types.  */
+  temp_operands[0] = gen_rtx_REG (SImode, R16_REGNUM);
+  temp_operands[1] = gen_rtx_REG (SImode, R17_REGNUM);
+  output_asm_insn ("cmp\t%w0, %w1", temp_operands);
+
+  /* Output conditional branch to call label.  */
+  fputs ("\tb.eq\t", asm_out_file);
+  assemble_name (asm_out_file, call_name);
+  fputc ('\n', asm_out_file);
+
+  /* Output trap label and BRK instruction.  */
+  ASM_OUTPUT_LABEL (asm_out_file, trap_name);
+
+  /* Calculate and emit BRK with ESR encoding.  */
+  unsigned type_index = R17_REGNUM;
+  unsigned addr_index = REGNO (operands[0]) - R0_REGNUM;
+  unsigned esr_value = 0x8000 | ((type_index & 31) << 5) | (addr_index & 31);
+
+  temp_operands[0] = GEN_INT (esr_value);
+  output_asm_insn ("brk\t#%0", temp_operands);
+
+  /* Output call label.  */
+  ASM_OUTPUT_LABEL (asm_out_file, call_name);
+
+  /* Return appropriate call instruction based on SIBLING_CALL_P.  */
+  if (SIBLING_CALL_P (insn))
+    return aarch64_indirect_branch_asm (operands[0]);
+  else
+    return aarch64_indirect_call_asm (operands[0]);
+}
+
+#undef TARGET_KCFI_SUPPORTED
+#define TARGET_KCFI_SUPPORTED hook_bool_void_true
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-aarch64.h"
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index fedbd4026a06..1a5abc142f50 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1483,6 +1483,19 @@
   }"
 )
 
+;; KCFI indirect call
+(define_insn "*call_insn"
+  [(kcfi (call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "Ucr"))
+	       (match_operand 1 "" ""))
+	 (match_operand 3 "const_int_operand"))
+   (unspec:DI [(match_operand:DI 2 "const_int_operand")] UNSPEC_CALLEE_ABI)
+   (clobber (reg:DI LR_REGNUM))]
+  "!SIBLING_CALL_P (insn)"
+{
+  return aarch64_output_kcfi_insn (insn, operands);
+}
+  [(set_attr "type" "call")])
+
 (define_insn "*call_insn"
   [(call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand"))
 	 (match_operand 1 "" ""))
@@ -1510,6 +1523,20 @@
   }"
 )
 
+;; KCFI call with return value
+(define_insn "*call_value_insn"
+  [(set (match_operand 0 "" "")
+	(kcfi (call (mem:DI (match_operand:DI 1 "aarch64_call_insn_operand" "Ucr"))
+		    (match_operand 2 "" ""))
+	      (match_operand 4 "const_int_operand")))
+   (unspec:DI [(match_operand:DI 3 "const_int_operand")] UNSPEC_CALLEE_ABI)
+   (clobber (reg:DI LR_REGNUM))]
+  "!SIBLING_CALL_P (insn)"
+{
+  return aarch64_output_kcfi_insn (insn, &operands[1]);
+}
+  [(set_attr "type" "call")])
+
 (define_insn "*call_value_insn"
   [(set (match_operand 0 "" "")
 	(call (mem:DI (match_operand:DI 1 "aarch64_call_insn_operand"))
@@ -1550,6 +1577,19 @@
   }
 )
 
+;; KCFI sibling call
+(define_insn "*sibcall_insn"
+  [(kcfi (call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "Ucs"))
+	       (match_operand 1 ""))
+	 (match_operand 3 "const_int_operand"))
+   (unspec:DI [(match_operand:DI 2 "const_int_operand")] UNSPEC_CALLEE_ABI)
+   (return)]
+  "SIBLING_CALL_P (insn)"
+{
+  return aarch64_output_kcfi_insn (insn, operands);
+}
+  [(set_attr "type" "branch")])
+
 (define_insn "*sibcall_insn"
   [(call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "Ucs, Usf"))
 	 (match_operand 1 ""))
@@ -1558,16 +1598,27 @@
   "SIBLING_CALL_P (insn)"
   {
     if (which_alternative == 0)
-      {
-	output_asm_insn ("br\\t%0", operands);
-	return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
-      }
+      return aarch64_indirect_branch_asm (operands[0]);
     return "b\\t%c0";
   }
   [(set_attr "type" "branch, branch")
    (set_attr "sls_length" "retbr,none")]
 )
 
+;; KCFI sibling call with return value
+(define_insn "*sibcall_value_insn"
+  [(set (match_operand 0 "")
+	(kcfi (call (mem:DI (match_operand:DI 1 "aarch64_call_insn_operand" "Ucs"))
+		    (match_operand 2 ""))
+	      (match_operand 4 "const_int_operand")))
+   (unspec:DI [(match_operand:DI 3 "const_int_operand")] UNSPEC_CALLEE_ABI)
+   (return)]
+  "SIBLING_CALL_P (insn)"
+{
+  return aarch64_output_kcfi_insn (insn, &operands[1]);
+}
+  [(set_attr "type" "branch")])
+
 (define_insn "*sibcall_value_insn"
   [(set (match_operand 0 "")
 	(call (mem:DI
@@ -1578,10 +1629,7 @@
   "SIBLING_CALL_P (insn)"
   {
     if (which_alternative == 0)
-      {
-	output_asm_insn ("br\\t%1", operands);
-	return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
-      }
+      return aarch64_indirect_branch_asm (operands[1]);
     return "b\\t%c1";
   }
   [(set_attr "type" "branch, branch")
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index bd84b7dd903f..972e8e76494f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18425,6 +18425,20 @@ check-call bundle are considered ABI, as the Linux kernel may
 optionally rewrite these areas at boot time to mitigate detected CPU
 errata.
 
+On AArch64, KCFI type identifiers are emitted as a @code{.word ID}
+directive (a 32-bit constant) before the function entry.  AArch64's
+natural 4-byte instruction alignment eliminates the need for additional
+alignment NOPs.  When used with @option{-fpatchable-function-entry}, the
+type identifier is placed before any prefix NOPs.  The runtime check
+uses @code{x16} and @code{x17} as scratch registers.  Type mismatches
+trigger a @code{brk} instruction with an immediate value that encodes
+both the expected type register index and the target address register
+index in the format @code{0x8000 | (type_reg << 5) | addr_reg}.  This
+encoding is captured in the ESR (Exception Syndrome Register) when the
+trap is taken, allowing the kernel to identify both the KCFI violation
+and the involved registers for detailed diagnostics (eliminating the need
+for a separate @code{.kcfi_traps} section as used on x86_64).
+
 KCFI is intended primarily for kernel code and may not be suitable
 for user-space applications that rely on techniques incompatible
 with strict type checking of indirect calls.
-- 
2.34.1



* [PATCH v3 5/7] arm: Add ARM 32-bit Kernel Control Flow Integrity implementation
  2025-09-13 23:23 [PATCH v3 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
                   ` (3 preceding siblings ...)
  2025-09-13 23:24 ` [PATCH v3 4/7] aarch64: Add AArch64 " Kees Cook
@ 2025-09-13 23:24 ` Kees Cook
  2025-09-13 23:24 ` [PATCH v3 6/7] riscv: Add RISC-V " Kees Cook
  2025-09-13 23:24 ` [PATCH v3 7/7] kcfi: Add regression test suite Kees Cook
  6 siblings, 0 replies; 28+ messages in thread
From: Kees Cook @ 2025-09-13 23:24 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Kees Cook, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

Implement ARM 32-bit KCFI backend supporting ARMv7+:

- Use movw/movt instructions for 32-bit immediate loading.

- Trap debugging through UDF instruction immediate encoding following
  AArch64 BRK pattern for encoding registers with useful contents.

- Use r0/r1 as scratch registers, following the ARM procedure call
  standard for caller-saved temporary registers; they are spilled to
  the stack around the check due to register pressure. IP (r12) is not
  usable here because the register allocator regularly uses it as the
  branch target register.

Assembly Code Pattern for ARM 32-bit:
  push {r0, r1}                ; Spill r0, r1
  ldr  r0, [target, #-4]       ; Load actual type ID from preamble
  movw r1, #type_id_low        ; Load expected type (lower 16 bits)
  movt r1, #type_id_high       ; Load upper 16 bits with top instruction
  cmp  r0, r1                  ; Compare type IDs directly
  pop  {r0, r1}                ; Reload r0, r1
  beq  .Lkcfi_call             ; Branch if typeids match
  .Lkcfi_trap: udf #udf_value  ; Undefined instruction trap with encoding
  .Lkcfi_call: blx/bx target   ; Execute validated indirect transfer

UDF Immediate Encoding (following AArch64 ESR pattern):
- UDF instruction immediate encoding format:
  0x8000 | ((ExpectedTypeReg & 31) << 5) | (TargetAddrReg & 31)
  - ExpectedTypeReg indicates which register contains expected type; this
    patch always emits 0x1F (31), as r0/r1 are reloaded before the trap
  - TargetAddrReg indicates which register contains target address (0-15)
  - Example: udf #33762 (0x83E2) = type field 0x1F, target address in R2

Build and run tested with Linux kernel ARCH=arm.

gcc/ChangeLog:

	config/arm/arm-protos.h: Declare KCFI helpers.
	config/arm/arm.cc (arm_maybe_wrap_call_with_kcfi): New function.
	(arm_maybe_wrap_call_value_with_kcfi): New function.
	(arm_output_kcfi_insn): Emit KCFI assembly.
	config/arm/arm.md: Add KCFI RTL patterns and hook expansion.
	doc/invoke.texi: Document arm32 nuances.

Signed-off-by: Kees Cook <kees@kernel.org>
---
 gcc/config/arm/arm-protos.h |   4 +
 gcc/config/arm/arm.cc       | 146 ++++++++++++++++++++++++++++++++++++
 gcc/config/arm/arm.md       |  62 +++++++++++++++
 gcc/doc/invoke.texi         |  17 +++++
 4 files changed, 229 insertions(+)
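
A minimal sketch of the trap immediate layout described in the commit
message, which may help when checking a kernel-side decoder against the
compiler output. The macro and helper names are hypothetical and are not
part of this patch or of the kernel; only the bit layout
0x8000 | (type_reg << 5) | addr_reg is taken from the patch.

```c
#include <stdint.h>

/* Hypothetical name: bit 15 marks the immediate as a KCFI trap.  */
#define KCFI_TRAP_FLAG 0x8000u

/* Build the 16-bit immediate from two 5-bit register indices.  */
static uint16_t
kcfi_trap_encode (unsigned type_reg, unsigned addr_reg)
{
  return (uint16_t) (KCFI_TRAP_FLAG
		     | ((type_reg & 31u) << 5)
		     | (addr_reg & 31u));
}

/* Recover the expected-type register field (bits 9:5).  */
static unsigned
kcfi_trap_type_reg (uint16_t imm)
{
  return (imm >> 5) & 31u;
}

/* Recover the target-address register field (bits 4:0).  */
static unsigned
kcfi_trap_addr_reg (uint16_t imm)
{
  return imm & 31u;
}
```

Under these assumptions, kcfi_trap_encode (31, 2) yields 0x83E2, which
is the value arm_output_kcfi_insn computes for a call through r2, since
it hard-codes 0x1F in the type field.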

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index ff7e7658f912..ad3dc522e2b9 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -607,6 +607,10 @@ void arm_initialize_isa (sbitmap, const enum isa_feature *);
 
 const char * arm_gen_far_branch (rtx *, int, const char * , const char *);
 
+rtx arm_maybe_wrap_call_with_kcfi (rtx, rtx);
+rtx arm_maybe_wrap_call_value_with_kcfi (rtx, rtx);
+const char *arm_output_kcfi_insn (rtx_insn *, rtx *);
+
 bool arm_mve_immediate_check(rtx, machine_mode, bool);
 
 opt_machine_mode arm_mve_data_mode (scalar_mode, poly_uint64);
diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 8b951f3d4a67..d06183fd2d53 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -77,6 +77,8 @@
 #include "aarch-common-protos.h"
 #include "machmode.h"
 #include "arm-builtins.h"
+#include "kcfi.h"
+#include "flags.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -35803,6 +35805,150 @@ arm_mode_base_reg_class (machine_mode mode)
   return MODE_BASE_REG_REG_CLASS (mode);
 }
 
+/* Apply KCFI wrapping to call pattern if needed.  */
+
+rtx
+arm_maybe_wrap_call_with_kcfi (rtx pat, rtx addr)
+{
+  /* Only indirect calls need KCFI instrumentation.  */
+  bool is_direct_call = SYMBOL_REF_P (addr);
+  if (!is_direct_call)
+    {
+      rtx kcfi_type_rtx = kcfi_get_type_id_for_expanding_gimple_call ();
+      if (kcfi_type_rtx)
+	{
+	  /* Extract the CALL from the PARALLEL and wrap it with KCFI.  */
+	  rtx call_rtx = XVECEXP (pat, 0, 0);
+	  rtx kcfi_call = gen_rtx_KCFI (VOIDmode, call_rtx, kcfi_type_rtx);
+
+	  /* Replace the CALL in the PARALLEL with the KCFI-wrapped call.  */
+	  XVECEXP (pat, 0, 0) = kcfi_call;
+	}
+    }
+  return pat;
+}
+
+/* Apply KCFI wrapping to call_value pattern if needed.  */
+
+rtx
+arm_maybe_wrap_call_value_with_kcfi (rtx pat, rtx addr)
+{
+  /* Only indirect calls need KCFI instrumentation.  */
+  bool is_direct_call = SYMBOL_REF_P (addr);
+  if (!is_direct_call)
+    {
+      rtx kcfi_type_rtx = kcfi_get_type_id_for_expanding_gimple_call ();
+      if (kcfi_type_rtx)
+	{
+	  /* Extract the SET from the PARALLEL and wrap its CALL with KCFI.  */
+	  rtx set_rtx = XVECEXP (pat, 0, 0);
+	  rtx call_rtx = SET_SRC (set_rtx);
+	  rtx kcfi_call = gen_rtx_KCFI (VOIDmode, call_rtx, kcfi_type_rtx);
+
+	  /* Replace the CALL in the SET with the KCFI-wrapped call.  */
+	  SET_SRC (set_rtx) = kcfi_call;
+	}
+    }
+  return pat;
+}
+
+/* Output the assembly for a KCFI checked call instruction.  */
+
+const char *
+arm_output_kcfi_insn (rtx_insn *insn, rtx *operands)
+{
+  /* KCFI requires movw/movt instructions for type ID loading.  */
+  if (!TARGET_HAVE_MOVT)
+    sorry ("%<-fsanitize=kcfi%> requires movw/movt instructions (ARMv7 or later)");
+
+  /* KCFI type id.  */
+  uint32_t type_id = INTVAL (operands[2]);
+
+  /* Calculate typeid offset from call target.  */
+  HOST_WIDE_INT offset = -(4 + kcfi_patchable_entry_prefix_nops);
+
+  /* Calculate trap immediate.  */
+  unsigned addr_reg_num = REGNO (operands[0]);
+  unsigned udf_immediate = 0x8000 | (0x1F << 5) | (addr_reg_num & 31);
+
+  /* Generate labels internally.  */
+  rtx trap_label = gen_label_rtx ();
+  rtx call_label = gen_label_rtx ();
+
+  /* Get label numbers for custom naming.  */
+  int trap_labelno = CODE_LABEL_NUMBER (trap_label);
+  int call_labelno = CODE_LABEL_NUMBER (call_label);
+
+  /* Generate custom label names.  */
+  char trap_name[32];
+  char call_name[32];
+  ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", trap_labelno);
+  ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", call_labelno);
+
+  /* Create memory operand for the type load.  */
+  rtx mem_op = gen_rtx_MEM (SImode,
+			    gen_rtx_PLUS (SImode, operands[0],
+					  GEN_INT (offset)));
+  rtx temp_operands[6];
+
+  /* Spill r0 and r1 to stack.  */
+  output_asm_insn ("push\t{r0, r1}", NULL);
+
+  /* Load actual type from memory using r0.  */
+  temp_operands[0] = gen_rtx_REG (SImode, R0_REGNUM);
+  temp_operands[1] = mem_op;
+  output_asm_insn ("ldr\t%0, %1", temp_operands);
+
+  /* Load expected type low 16 bits into r1.  */
+  temp_operands[0] = gen_rtx_REG (SImode, R1_REGNUM);
+  temp_operands[1] = GEN_INT (type_id & 0xFFFF);
+  output_asm_insn ("movw\t%0, %1", temp_operands);
+
+  /* Load expected type high 16 bits into r1.  */
+  temp_operands[0] = gen_rtx_REG (SImode, R1_REGNUM);
+  temp_operands[1] = GEN_INT ((type_id >> 16) & 0xFFFF);
+  output_asm_insn ("movt\t%0, %1", temp_operands);
+
+  /* Compare types in r0 and r1.  */
+  temp_operands[0] = gen_rtx_REG (SImode, R0_REGNUM);
+  temp_operands[1] = gen_rtx_REG (SImode, R1_REGNUM);
+  output_asm_insn ("cmp\t%0, %1", temp_operands);
+
+  /* Restore r0 and r1 from stack.  */
+  output_asm_insn ("pop\t{r0, r1}", NULL);
+
+  /* Output conditional branch to call label.  */
+  fputs ("\tbeq\t", asm_out_file);
+  assemble_name (asm_out_file, call_name);
+  fputc ('\n', asm_out_file);
+
+  /* Output trap label and UDF instruction.  */
+  ASM_OUTPUT_LABEL (asm_out_file, trap_name);
+  temp_operands[0] = GEN_INT (udf_immediate);
+  output_asm_insn ("udf\t%0", temp_operands);
+
+  /* Output pass/call label.  */
+  ASM_OUTPUT_LABEL (asm_out_file, call_name);
+
+  /* Handle calls to lr using ip (which may be clobbered in subr anyway).  */
+  if (REGNO (operands[0]) == LR_REGNUM)
+    {
+      operands[0] = gen_rtx_REG (SImode, IP_REGNUM);
+      output_asm_insn ("mov\t%0, lr", operands);
+    }
+
+  /* Call or tail call instruction.  */
+  if (SIBLING_CALL_P (insn))
+    output_asm_insn ("bx\t%0", operands);
+  else
+    output_asm_insn ("blx\t%0", operands);
+
+  return "";
+}
+
+#undef TARGET_KCFI_SUPPORTED
+#define TARGET_KCFI_SUPPORTED hook_bool_void_true
+
 #undef TARGET_DOCUMENTATION_NAME
 #define TARGET_DOCUMENTATION_NAME "ARM"
 
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 422ae549b65b..646eb0d757b1 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -8629,6 +8629,7 @@
     else
       {
 	pat = gen_call_internal (operands[0], operands[1], operands[2]);
+	pat = arm_maybe_wrap_call_with_kcfi (pat, XEXP (operands[0], 0));
 	arm_emit_call_insn (pat, XEXP (operands[0], 0), false);
       }
 
@@ -8687,6 +8688,20 @@
   }
 )
 
+;; KCFI indirect call - KCFI wraps just the call pattern
+(define_insn "*kcfi_call_reg"
+  [(kcfi (call (mem:SI (match_operand:SI 0 "s_register_operand" "r"))
+	       (match_operand 1 "" ""))
+	 (match_operand 2 "const_int_operand"))
+   (use (match_operand 3 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
+  "TARGET_32BIT && !SIBLING_CALL_P (insn) && arm_ccfsm_state == 0"
+{
+  return arm_output_kcfi_insn (insn, operands);
+}
+  [(set_attr "type" "call")
+   (set_attr "length" "36")])
+
 (define_insn "*call_reg_armv5"
   [(call (mem:SI (match_operand:SI 0 "s_register_operand" "r"))
          (match_operand 1 "" ""))
@@ -8753,6 +8768,7 @@
       {
 	pat = gen_call_value_internal (operands[0], operands[1],
 				       operands[2], operands[3]);
+	pat = arm_maybe_wrap_call_value_with_kcfi (pat, XEXP (operands[1], 0));
 	arm_emit_call_insn (pat, XEXP (operands[1], 0), false);
       }
 
@@ -8799,6 +8815,21 @@
       }
   }")
 
+;; KCFI indirect call_value - KCFI wraps just the call pattern
+(define_insn "*kcfi_call_value_reg"
+  [(set (match_operand 0 "" "")
+	(kcfi (call (mem:SI (match_operand:SI 1 "s_register_operand" "r"))
+		    (match_operand 2 "" ""))
+	      (match_operand 3 "const_int_operand")))
+   (use (match_operand 4 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
+  "TARGET_32BIT && !SIBLING_CALL_P (insn) && arm_ccfsm_state == 0"
+{
+  return arm_output_kcfi_insn (insn, &operands[1]);
+}
+  [(set_attr "type" "call")
+   (set_attr "length" "36")])
+
 (define_insn "*call_value_reg_armv5"
   [(set (match_operand 0 "" "")
         (call (mem:SI (match_operand:SI 1 "s_register_operand" "r"))
@@ -8901,6 +8932,7 @@
       operands[2] = const0_rtx;
 
     pat = gen_sibcall_internal (operands[0], operands[1], operands[2]);
+    pat = arm_maybe_wrap_call_with_kcfi (pat, XEXP (operands[0], 0));
     arm_emit_call_insn (pat, operands[0], true);
     DONE;
   }"
@@ -8935,11 +8967,26 @@
 
     pat = gen_sibcall_value_internal (operands[0], operands[1],
                                       operands[2], operands[3]);
+    pat = arm_maybe_wrap_call_value_with_kcfi (pat, XEXP (operands[1], 0));
     arm_emit_call_insn (pat, operands[1], true);
     DONE;
   }"
 )
 
+;; KCFI sibling call - KCFI wraps just the call pattern
+(define_insn "*kcfi_sibcall_insn"
+  [(kcfi (call (mem:SI (match_operand:SI 0 "s_register_operand" "Cs"))
+	       (match_operand 1 "" ""))
+	 (match_operand 2 "const_int_operand"))
+   (return)
+   (use (match_operand 3 "" ""))]
+  "TARGET_32BIT && SIBLING_CALL_P (insn) && arm_ccfsm_state == 0"
+{
+  return arm_output_kcfi_insn (insn, operands);
+}
+  [(set_attr "type" "call")
+   (set_attr "length" "36")])
+
 (define_insn "*sibcall_insn"
  [(call (mem:SI (match_operand:SI 0 "call_insn_operand" "Cs, US"))
 	(match_operand 1 "" ""))
@@ -8960,6 +9007,21 @@
   [(set_attr "type" "call")]
 )
 
+;; KCFI sibling call with return value - KCFI wraps just the call pattern
+(define_insn "*kcfi_sibcall_value_insn"
+  [(set (match_operand 0 "" "")
+	(kcfi (call (mem:SI (match_operand:SI 1 "s_register_operand" "Cs"))
+		    (match_operand 2 "" ""))
+	      (match_operand 3 "const_int_operand")))
+   (return)
+   (use (match_operand 4 "" ""))]
+  "TARGET_32BIT && SIBLING_CALL_P (insn) && arm_ccfsm_state == 0"
+{
+  return arm_output_kcfi_insn (insn, &operands[1]);
+}
+  [(set_attr "type" "call")
+   (set_attr "length" "36")])
+
 (define_insn "*sibcall_value_insn"
  [(set (match_operand 0 "" "")
        (call (mem:SI (match_operand:SI 1 "call_insn_operand" "Cs,US"))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 972e8e76494f..dfaec475d2e1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18439,6 +18439,23 @@ trap is taken, allowing the kernel to identify both the KCFI violation
 and the involved registers for detailed diagnostics (eliminating the need
 for a separate @code{.kcfi_traps} section as used on x86_64).
 
+On ARM 32-bit, KCFI type identifiers are emitted as a @code{.word ID}
+directive (a 32-bit constant) before the function entry.  ARM's
+natural 4-byte instruction alignment eliminates the need for additional
+alignment NOPs.  When used with @option{-fpatchable-function-entry}, the
+type identifier is placed before any prefix NOPs.  The runtime check
+preserves argument registers @code{r0} and @code{r1} using @code{push}
+and @code{pop} instructions, then uses them as scratch registers for
+the type comparison.  The expected type is loaded using @code{movw} and
+@code{movt} instruction pairs for 32-bit immediate values.  Type mismatches
+trigger a @code{udf} instruction with an immediate value that encodes
+both the expected type register index and the target address register
+index in the format @code{0x8000 | (type_reg << 5) | addr_reg}.  This
+encoding is captured in the UDF immediate field when the trap is taken,
+allowing the kernel to identify both the KCFI violation and the involved
+registers for detailed diagnostics (eliminating the need for a separate
+@code{.kcfi_traps} section as used on x86_64).
+
 KCFI is intended primarily for kernel code and may not be suitable
 for user-space applications that rely on techniques incompatible
 with strict type checking of indirect calls.
-- 
2.34.1



* [PATCH v3 6/7] riscv: Add RISC-V Kernel Control Flow Integrity implementation
  2025-09-13 23:23 [PATCH v3 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
                   ` (4 preceding siblings ...)
  2025-09-13 23:24 ` [PATCH v3 5/7] arm: Add ARM 32-bit " Kees Cook
@ 2025-09-13 23:24 ` Kees Cook
  2025-09-13 23:24 ` [PATCH v3 7/7] kcfi: Add regression test suite Kees Cook
  6 siblings, 0 replies; 28+ messages in thread
From: Kees Cook @ 2025-09-13 23:24 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Kees Cook, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

Implement RISC-V-specific KCFI backend.

- Use t1/t2 (x6/x7) as scratch registers, following the RISC-V
  procedure call standard for temporary registers; t3 is substituted
  when the call target is already in t1 or t2.

- Integration with .kcfi_traps section for debugger/runtime metadata
  (like x86_64).

Assembly Code Pattern for RISC-V:
  lw      t1, -4(target_reg)         ; Load actual type ID from preamble
  lui     t2, %hi(expected_type)     ; Load expected type (upper 20 bits)
  addiw   t2, t2, %lo(expected_type) ; Add lower 12 bits (sign-extended)
  beq     t1, t2, .Lkcfi_call        ; Branch if types match
  .Lkcfi_trap: ebreak                ; Environment break trap on mismatch
  .Lkcfi_call: jalr/jr target_reg    ; Execute validated indirect transfer

Build and run tested with Linux kernel ARCH=riscv.

gcc/ChangeLog:

	config/riscv/riscv-protos.h: Declare KCFI helpers.
	config/riscv/riscv.cc (riscv_maybe_wrap_call_with_kcfi): New
	function, to wrap calls.
	(riscv_maybe_wrap_call_value_with_kcfi): New function, to
	wrap calls with return values.
	(riscv_output_kcfi_insn): New function to emit KCFI assembly.
	config/riscv/riscv.md: Add KCFI RTL patterns and hook expansion.
	doc/invoke.texi: Document riscv nuances.

Signed-off-by: Kees Cook <kees@kernel.org>
---
 gcc/config/riscv/riscv-protos.h |   3 +
 gcc/config/riscv/riscv.cc       | 159 ++++++++++++++++++++++++++++++++
 gcc/config/riscv/riscv.md       |  76 +++++++++++++--
 gcc/doc/invoke.texi             |  13 +++
 4 files changed, 245 insertions(+), 6 deletions(-)
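
A minimal sketch of the lui/addiw immediate split from the assembly
pattern above, written as standalone C so the carry adjustment can be
checked in isolation. The function names are hypothetical and are not
part of this patch. The arithmetic follows the hi20/lo12 computation in
riscv_output_kcfi_insn, using a subtraction rather than a signed shift
for the sign extension.

```c
#include <stdint.h>

/* Split ID into the 20-bit lui immediate and the signed 12-bit addiw
   immediate.  addiw sign-extends its immediate, so when bit 11 of ID
   is set the upper part is incremented to compensate.  */
static void
kcfi_split_type_id (uint32_t id, uint32_t *hi20, int32_t *lo12)
{
  int32_t lo = (int32_t) (id & 0xFFFu);
  if (lo & 0x800)
    lo -= 0x1000;		/* Sign-extend from 12 bits.  */

  *lo12 = lo;
  *hi20 = ((id >> 12) + ((id & 0x800u) ? 1u : 0u)) & 0xFFFFFu;
}

/* Model the low 32 bits produced by lui followed by addiw.  */
static uint32_t
kcfi_recombine (uint32_t hi20, int32_t lo12)
{
  return (hi20 << 12) + (uint32_t) lo12;
}
```

For example, 0x12345FFF splits into hi20 = 0x12346 and lo12 = -1, and
recombining gives back 0x12345FFF. On RV64 both lw and addiw
sign-extend their 32-bit results to 64 bits, so the beq comparison sees
matching upper halves.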

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 2d60a0ad44b3..0e916fbdde13 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -126,6 +126,9 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
 extern void riscv_split_doubleword_move (rtx, rtx);
 extern const char *riscv_output_move (rtx, rtx);
 extern const char *riscv_output_return ();
+extern rtx riscv_maybe_wrap_call_with_kcfi (rtx, rtx);
+extern rtx riscv_maybe_wrap_call_value_with_kcfi (rtx, rtx);
+extern const char *riscv_output_kcfi_insn (rtx_insn *, rtx *);
 extern void riscv_declare_function_name (FILE *, const char *, tree);
 extern void riscv_declare_function_size (FILE *, const char *, tree);
 extern void riscv_asm_output_alias (FILE *, const tree, const tree);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 41ee81b93acf..7d51ab11c7ee 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -81,6 +81,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cgraph.h"
 #include "langhooks.h"
 #include "gimplify.h"
+#include "kcfi.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -11346,6 +11347,161 @@ riscv_convert_vector_chunks (struct gcc_options *opts)
     return 1;
 }
 
+/* Apply KCFI wrapping to call pattern if needed.  */
+
+rtx
+riscv_maybe_wrap_call_with_kcfi (rtx pat, rtx addr)
+{
+  /* Only indirect calls need KCFI instrumentation.  */
+  bool is_direct_call = SYMBOL_REF_P (addr);
+  if (!is_direct_call)
+    {
+      rtx kcfi_type_rtx = kcfi_get_type_id_for_expanding_gimple_call ();
+      if (kcfi_type_rtx)
+	{
+	  /* Extract the CALL from the PARALLEL and wrap it with KCFI.  */
+	  rtx call_rtx = XVECEXP (pat, 0, 0);
+	  rtx kcfi_call = gen_rtx_KCFI (VOIDmode, call_rtx, kcfi_type_rtx);
+
+	  /* Replace the CALL in the PARALLEL with the KCFI-wrapped call.  */
+	  XVECEXP (pat, 0, 0) = kcfi_call;
+	}
+    }
+  return pat;
+}
+
+/* Apply KCFI wrapping to call_value pattern if needed.  */
+
+rtx
+riscv_maybe_wrap_call_value_with_kcfi (rtx pat, rtx addr)
+{
+  /* Only indirect calls need KCFI instrumentation.  */
+  bool is_direct_call = SYMBOL_REF_P (addr);
+  if (!is_direct_call)
+    {
+      rtx kcfi_type_rtx = kcfi_get_type_id_for_expanding_gimple_call ();
+      if (kcfi_type_rtx)
+	{
+	  /* Extract the SET from the PARALLEL and wrap its CALL with KCFI.  */
+	  rtx set_rtx = XVECEXP (pat, 0, 0);
+	  rtx call_rtx = SET_SRC (set_rtx);
+	  rtx kcfi_call = gen_rtx_KCFI (VOIDmode, call_rtx, kcfi_type_rtx);
+
+	  /* Replace the CALL in the SET with the KCFI-wrapped call.  */
+	  SET_SRC (set_rtx) = kcfi_call;
+	}
+    }
+  return pat;
+}
+
+/* Output the assembly for a KCFI checked call instruction.  */
+
+const char *
+riscv_output_kcfi_insn (rtx_insn *insn, rtx *operands)
+{
+  /* KCFI is only supported in 64-bit mode.  */
+  if (!TARGET_64BIT)
+    {
+      sorry ("%<-fsanitize=kcfi%> is not supported for 32-bit RISC-V");
+      return "";
+    }
+  /* Target register.  */
+  rtx target_reg = operands[0];
+  gcc_assert (REG_P (target_reg));
+
+  /* Get KCFI type ID.  */
+  uint32_t expected_type = (uint32_t) INTVAL (operands[3]);
+
+  /* Calculate typeid offset from call target.  */
+  HOST_WIDE_INT offset = -(4 + kcfi_patchable_entry_prefix_nops);
+
+  /* Choose scratch registers that don't conflict with target.  */
+  unsigned temp1_regnum = T1_REGNUM;
+  unsigned temp2_regnum = T2_REGNUM;
+
+  if (REGNO (target_reg) == T1_REGNUM)
+    temp1_regnum = T3_REGNUM;
+  else if (REGNO (target_reg) == T2_REGNUM)
+    temp2_regnum = T3_REGNUM;
+
+  /* Generate labels internally.  */
+  rtx trap_label = gen_label_rtx ();
+  rtx call_label = gen_label_rtx ();
+
+  /* Get label numbers for custom naming.  */
+  int trap_labelno = CODE_LABEL_NUMBER (trap_label);
+  int call_labelno = CODE_LABEL_NUMBER (call_label);
+
+  /* Generate custom label names.  */
+  char trap_name[32];
+  char call_name[32];
+  ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", trap_labelno);
+  ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", call_labelno);
+
+  /* Split expected_type for RISC-V immediate encoding.
+     If bit 11 is set, increment upper 20 bits to compensate for sign
+     extension.  */
+  int32_t lo12 = ((int32_t)(expected_type << 20)) >> 20;
+  uint32_t hi20 = ((expected_type >> 12)
+		    + ((expected_type & 0x800) ? 1 : 0)) & 0xFFFFF;
+
+  rtx temp_operands[3];
+
+  /* Load actual type from memory at offset.  */
+  temp_operands[0] = gen_rtx_REG (SImode, temp1_regnum);
+  temp_operands[1] = gen_rtx_MEM (SImode,
+				  gen_rtx_PLUS (DImode, target_reg,
+						GEN_INT (offset)));
+  output_asm_insn ("lw\t%0, %1", temp_operands);
+
+  /* Load expected type using lui + addiw for proper sign extension.  */
+  temp_operands[0] = gen_rtx_REG (SImode, temp2_regnum);
+  temp_operands[1] = GEN_INT (hi20);
+  output_asm_insn ("lui\t%0, %1", temp_operands);
+
+  temp_operands[0] = gen_rtx_REG (SImode, temp2_regnum);
+  temp_operands[1] = gen_rtx_REG (SImode, temp2_regnum);
+  temp_operands[2] = GEN_INT (lo12);
+  output_asm_insn ("addiw\t%0, %1, %2", temp_operands);
+
+  /* Output conditional branch to call label.  */
+  fprintf (asm_out_file, "\tbeq\t%s, %s, ",
+	   reg_names[temp1_regnum], reg_names[temp2_regnum]);
+  assemble_name (asm_out_file, call_name);
+  fputc ('\n', asm_out_file);
+
+  /* Output trap label and ebreak instruction.  */
+  ASM_OUTPUT_LABEL (asm_out_file, trap_name);
+  output_asm_insn ("ebreak", operands);
+
+  /* Use common helper for trap section entry.  */
+  rtx trap_label_sym = gen_rtx_SYMBOL_REF (Pmode, trap_name);
+  kcfi_emit_traps_section (asm_out_file, trap_label_sym);
+
+  /* Output pass/call label.  */
+  ASM_OUTPUT_LABEL (asm_out_file, call_name);
+
+  /* Execute the indirect call.  */
+  if (SIBLING_CALL_P (insn))
+    {
+      /* Tail call uses x0 (zero register) to avoid saving return address.  */
+      temp_operands[0] = gen_rtx_REG (DImode, 0);
+      temp_operands[1] = target_reg;
+      temp_operands[2] = const0_rtx;
+      output_asm_insn ("jalr\t%0, %1, %2", temp_operands);
+    }
+  else
+    {
+      /* Regular call uses x1 (return address register).  */
+      temp_operands[0] = gen_rtx_REG (DImode, RETURN_ADDR_REGNUM);
+      temp_operands[1] = target_reg;
+      temp_operands[2] = const0_rtx;
+      output_asm_insn ("jalr\t%0, %1, %2", temp_operands);
+    }
+
+  return "";
+}
+
 /* 'Unpack' up the internal tuning structs and update the options
     in OPTS.  The caller must have set up selected_tune and selected_arch
     as all the other target-specific codegen decisions are
@@ -15898,6 +16054,9 @@ riscv_prefetch_offset_address_p (rtx x, machine_mode mode)
 #define TARGET_GET_FUNCTION_VERSIONS_DISPATCHER \
   riscv_get_function_versions_dispatcher
 
+#undef TARGET_KCFI_SUPPORTED
+#define TARGET_KCFI_SUPPORTED hook_bool_void_true
+
 #undef TARGET_DOCUMENTATION_NAME
 #define TARGET_DOCUMENTATION_NAME "RISC-V"
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 4718a75598a6..0124999c4c45 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3982,10 +3982,25 @@
   ""
 {
   rtx target = riscv_legitimize_call_address (XEXP (operands[0], 0));
-  emit_call_insn (gen_sibcall_internal (target, operands[1], operands[2]));
+  rtx pat = gen_sibcall_internal (target, operands[1], operands[2]);
+  pat = riscv_maybe_wrap_call_with_kcfi (pat, target);
+  emit_call_insn (pat);
   DONE;
 })
 
+;; KCFI sibling call
+(define_insn "*kcfi_sibcall_insn"
+  [(kcfi (call (mem:SI (match_operand:DI 0 "call_insn_operand" "l"))
+	       (match_operand 1 ""))
+	 (match_operand 3 "const_int_operand"))
+   (use (unspec:SI [(match_operand 2 "const_int_operand")] UNSPEC_CALLEE_CC))]
+  "SIBLING_CALL_P (insn)"
+{
+  return riscv_output_kcfi_insn (insn, operands);
+}
+  [(set_attr "type" "call")
+   (set_attr "length" "24")])
+
 (define_insn "sibcall_internal"
   [(call (mem:SI (match_operand 0 "call_insn_operand" "j,S,U"))
 	 (match_operand 1 "" ""))
@@ -4009,11 +4024,27 @@
   ""
 {
   rtx target = riscv_legitimize_call_address (XEXP (operands[1], 0));
-  emit_call_insn (gen_sibcall_value_internal (operands[0], target, operands[2],
-					      operands[3]));
+  rtx pat = gen_sibcall_value_internal (operands[0], target, operands[2],
+					operands[3]);
+  pat = riscv_maybe_wrap_call_value_with_kcfi (pat, target);
+  emit_call_insn (pat);
   DONE;
 })
 
+;; KCFI sibling call with return value
+(define_insn "*kcfi_sibcall_value_insn"
+  [(set (match_operand 0 "")
+	(kcfi (call (mem:SI (match_operand:DI 1 "call_insn_operand" "l"))
+		    (match_operand 2 ""))
+	      (match_operand 4 "const_int_operand")))
+   (use (unspec:SI [(match_operand 3 "const_int_operand")] UNSPEC_CALLEE_CC))]
+  "SIBLING_CALL_P (insn)"
+{
+  return riscv_output_kcfi_insn (insn, &operands[1]);
+}
+  [(set_attr "type" "call")
+   (set_attr "length" "24")])
+
 (define_insn "sibcall_value_internal"
   [(set (match_operand 0 "" "")
 	(call (mem:SI (match_operand 1 "call_insn_operand" "j,S,U"))
@@ -4037,10 +4068,26 @@
   ""
 {
   rtx target = riscv_legitimize_call_address (XEXP (operands[0], 0));
-  emit_call_insn (gen_call_internal (target, operands[1], operands[2]));
+  rtx pat = gen_call_internal (target, operands[1], operands[2]);
+  pat = riscv_maybe_wrap_call_with_kcfi (pat, target);
+  emit_call_insn (pat);
   DONE;
 })
 
+;; KCFI indirect call
+(define_insn "*kcfi_call_internal"
+  [(kcfi (call (mem:SI (match_operand:DI 0 "call_insn_operand" "l"))
+	       (match_operand 1 "" ""))
+	 (match_operand 3 "const_int_operand"))
+   (use (unspec:SI [(match_operand 2 "const_int_operand")] UNSPEC_CALLEE_CC))
+   (clobber (reg:SI RETURN_ADDR_REGNUM))]
+  "!SIBLING_CALL_P (insn)"
+{
+  return riscv_output_kcfi_insn (insn, operands);
+}
+  [(set_attr "type" "call")
+   (set_attr "length" "24")])
+
 (define_insn "call_internal"
   [(call (mem:SI (match_operand 0 "call_insn_operand" "l,S,U"))
 	 (match_operand 1 "" ""))
@@ -4065,11 +4112,28 @@
   ""
 {
   rtx target = riscv_legitimize_call_address (XEXP (operands[1], 0));
-  emit_call_insn (gen_call_value_internal (operands[0], target, operands[2],
-					   operands[3]));
+  rtx pat = gen_call_value_internal (operands[0], target, operands[2],
+				     operands[3]);
+  pat = riscv_maybe_wrap_call_value_with_kcfi (pat, target);
+  emit_call_insn (pat);
   DONE;
 })
 
+;; KCFI call with return value
+(define_insn "*kcfi_call_value_insn"
+  [(set (match_operand 0 "" "")
+	(kcfi (call (mem:SI (match_operand:DI 1 "call_insn_operand" "l"))
+		    (match_operand 2 "" ""))
+	      (match_operand 4 "const_int_operand")))
+   (use (unspec:SI [(match_operand 3 "const_int_operand")] UNSPEC_CALLEE_CC))
+   (clobber (reg:SI RETURN_ADDR_REGNUM))]
+  "!SIBLING_CALL_P (insn)"
+{
+  return riscv_output_kcfi_insn (insn, &operands[1]);
+}
+  [(set_attr "type" "call")
+   (set_attr "length" "24")])
+
 (define_insn "call_value_internal"
   [(set (match_operand 0 "" "")
 	(call (mem:SI (match_operand 1 "call_insn_operand" "l,S,U"))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index dfaec475d2e1..4aaf23cb836d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18456,6 +18456,19 @@ allowing the kernel to identify both the KCFI violation and the involved
 registers for detailed diagnostics (eliminating the need for a separate
 @code{.kcfi_traps} section as used on x86_64).
 
+On RISC-V, KCFI type identifiers are emitted as a @code{.word ID}
+directive (a 32-bit constant) before the function entry, similar to AArch64.
+RISC-V's natural 4-byte instruction alignment eliminates the need for
+additional alignment NOPs.  When used with @option{-fpatchable-function-entry},
+the type identifier is placed before any prefix NOPs.  The runtime check
+loads the actual type using @code{lw t1, OFFSET(target_reg)}, where the
+offset accounts for any prefix NOPs, constructs the expected type using
+@code{lui} and @code{addiw} instructions into @code{t2}, and compares them
+with @code{beq}.  Type mismatches trigger an @code{ebreak} instruction.
+Like x86_64, RISC-V uses a @code{.kcfi_traps} section to map trap locations
+to their corresponding function entry points for debugging (RISC-V lacks
+the ESR-style trap encoding used on AArch64).
+
 KCFI is intended primarily for kernel code and may not be suitable
 for user-space applications that rely on techniques incompatible
 with strict type checking of indirect calls.
-- 
2.34.1



* [PATCH v3 7/7] kcfi: Add regression test suite
  2025-09-13 23:23 [PATCH v3 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
                   ` (5 preceding siblings ...)
  2025-09-13 23:24 ` [PATCH v3 6/7] riscv: Add RISC-V " Kees Cook
@ 2025-09-13 23:24 ` Kees Cook
  2025-09-13 23:51   ` Andrew Pinski
  2025-09-13 23:58   ` Andrew Pinski
  6 siblings, 2 replies; 28+ messages in thread
From: Kees Cook @ 2025-09-13 23:24 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Kees Cook, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

Add a test suite for the KCFI (Kernel Control Flow Integrity) ABI, covering
core functionality, optimization and code generation, addressing,
architecture-specific KCFI sequence emission, and integration with
patchable function entry.

Tests can be run via:
  make check-c RUNTESTFLAGS='kcfi.exp'

gcc/testsuite/ChangeLog:

	* gcc.dg/kcfi/kcfi-adjacency.c: New test.
	* gcc.dg/kcfi/kcfi-basics.c: New test.
	* gcc.dg/kcfi/kcfi-call-sharing.c: New test.
	* gcc.dg/kcfi/kcfi-cold-partition.c: New test.
	* gcc.dg/kcfi/kcfi-complex-addressing.c: New test.
	* gcc.dg/kcfi/kcfi-ipa-robustness.c: New test.
	* gcc.dg/kcfi/kcfi-move-preservation.c: New test.
	* gcc.dg/kcfi/kcfi-no-sanitize-inline.c: New test.
	* gcc.dg/kcfi/kcfi-no-sanitize.c: New test.
	* gcc.dg/kcfi/kcfi-offset-validation.c: New test.
	* gcc.dg/kcfi/kcfi-patchable-basic.c: New test.
	* gcc.dg/kcfi/kcfi-patchable-entry-only.c: New test.
	* gcc.dg/kcfi/kcfi-patchable-large.c: New test.
	* gcc.dg/kcfi/kcfi-patchable-medium.c: New test.
	* gcc.dg/kcfi/kcfi-patchable-prefix-only.c: New test.
	* gcc.dg/kcfi/kcfi-pic-addressing.c: New test.
	* gcc.dg/kcfi/kcfi-retpoline-r11.c: New test.
	* gcc.dg/kcfi/kcfi-runtime.c: New test.
	* gcc.dg/kcfi/kcfi-tail-calls.c: New test.
	* gcc.dg/kcfi/kcfi-trap-encoding.c: New test.
	* gcc.dg/kcfi/kcfi-trap-section.c: New test.
	* gcc.dg/kcfi/kcfi.exp: New test.

Signed-off-by: Kees Cook <kees@kernel.org>
---
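For readers checking the RISC-V scan-assembler patterns in this patch,
here is a minimal C sketch (not code from this series) that models the
lw/lui/addiw/beq check on the host.  It assumes the layout described in
the invoke.texi text (the 32-bit type ID sits before any prefix NOPs, so
the load offset is 4 plus the prefix size in bytes) and the standard
RISC-V hi20/lo12 split for 32-bit constants.  All function names here
(kcfi_hi20, kcfi_lo12, kcfi_rebuild, kcfi_load_id, kcfi_check) are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Upper 20 bits, as the operand of `lui t2, hi` (illustrative helper).
   0x800 is added first so that the signed low part can correct it.  */
uint32_t kcfi_hi20(uint32_t id)
{
    return ((id + 0x800u) >> 12) & 0xfffffu;
}

/* Low 12 bits, sign-extended, as the immediate of `addiw t2, t2, lo`.
   The result is always in [-2048, 2047].  */
int32_t kcfi_lo12(uint32_t id)
{
    int32_t lo = (int32_t)(id & 0xfffu);
    return lo >= 0x800 ? lo - 0x1000 : lo;
}

/* Recombine the two parts modulo 2^32, as lui followed by addiw does.  */
uint32_t kcfi_rebuild(uint32_t hi, int32_t lo)
{
    return (hi << 12) + (uint32_t)lo;
}

/* Model of `lw t1, -offset(target)`: read the 32-bit word stored
   `offset` bytes before the function entry.  */
uint32_t kcfi_load_id(const unsigned char *entry, size_t offset)
{
    uint32_t id;
    memcpy(&id, entry - offset, sizeof id);
    return id;
}

/* Returns 1 when the call would proceed (beq taken) and 0 when the
   ebreak would be reached.  prefix_bytes is the assumed total size of
   any -fpatchable-function-entry prefix NOPs.  */
int kcfi_check(const unsigned char *entry, size_t prefix_bytes,
               uint32_t expected)
{
    uint32_t actual = kcfi_load_id(entry, 4 + prefix_bytes);
    uint32_t built = kcfi_rebuild(kcfi_hi20(expected), kcfi_lo12(expected));
    return actual == built;
}
```

With a type ID of 0x12345fff the split gives hi = 0x12346 and lo = -1,
which is why the RISC-V patterns accept a negative addiw immediate
(-?[0-9]+) while the lui operand is matched as an unsigned number.
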
 gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c    |  72 +++++++++
 gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c       | 108 +++++++++++++
 gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c |  84 ++++++++++
 .../gcc.dg/kcfi/kcfi-cold-partition.c         | 136 ++++++++++++++++
 .../gcc.dg/kcfi/kcfi-complex-addressing.c     | 135 ++++++++++++++++
 .../gcc.dg/kcfi/kcfi-ipa-robustness.c         |  54 +++++++
 .../gcc.dg/kcfi/kcfi-move-preservation.c      |  55 +++++++
 .../gcc.dg/kcfi/kcfi-no-sanitize-inline.c     | 100 ++++++++++++
 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c  |  39 +++++
 .../gcc.dg/kcfi/kcfi-offset-validation.c      |  48 ++++++
 .../gcc.dg/kcfi/kcfi-patchable-basic.c        |  70 ++++++++
 .../gcc.dg/kcfi/kcfi-patchable-entry-only.c   |  62 +++++++
 .../gcc.dg/kcfi/kcfi-patchable-large.c        |  51 ++++++
 .../gcc.dg/kcfi/kcfi-patchable-medium.c       |  60 +++++++
 .../gcc.dg/kcfi/kcfi-patchable-prefix-only.c  |  60 +++++++
 .../gcc.dg/kcfi/kcfi-pic-addressing.c         | 104 ++++++++++++
 .../gcc.dg/kcfi/kcfi-retpoline-r11.c          |  50 ++++++
 gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c      | 151 ++++++++++++++++++
 gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c   | 142 ++++++++++++++++
 .../gcc.dg/kcfi/kcfi-trap-encoding.c          |  54 +++++++
 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c |  41 +++++
 gcc/testsuite/gcc.dg/kcfi/kcfi.exp            |  64 ++++++++
 22 files changed, 1740 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
 create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi.exp

diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
new file mode 100644
index 000000000000..becb47678df0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
@@ -0,0 +1,72 @@
+/* Test KCFI check/transfer adjacency - regression test for instruction
+   insertion.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-O2" } */
+
+/* This test ensures that KCFI security checks remain immediately adjacent
+   to their corresponding indirect calls/jumps, with no executable instructions
+   between the type ID check and the control flow transfer.  */
+
+/* External function pointers to prevent optimization.  */
+extern void (*complex_func_ptr)(int, int, int, int);
+extern int (*return_func_ptr)(int, int);
+
+/* Function with complex argument preparation that could tempt
+   the optimizer to insert instructions between KCFI check and call.  */
+__attribute__((noinline)) void test_complex_args(int a, int b, int c, int d) {
+    /* Complex argument expressions that might cause instruction scheduling.  */
+    complex_func_ptr(a * 2, b + c, d - a, (a << 1) | b);
+}
+
+/* Function with return value handling.  */
+__attribute__((noinline)) int test_return_value(int x, int y) {
+    /* Return value handling that shouldn't interfere with adjacency.  */
+    int result = return_func_ptr(x + 1, y * 2);
+    return result + 1;
+}
+
+/* Test struct field access that caused issues in try-catch.c.  */
+struct call_info {
+    void (*handler)(void);
+    int status;
+    int data;
+};
+
+extern struct call_info *global_call_info;
+
+__attribute__((noinline)) void test_struct_field_call(void) {
+    /* This pattern caused adjacency issues before the fix.  */
+    global_call_info->handler();
+}
+
+/* Test conditional indirect call.  */
+__attribute__((noinline)) void test_conditional_call(int flag) {
+    if (flag) {
+        global_call_info->handler();
+    }
+}
+
+/* Should have KCFI instrumentation for all indirect calls.  */
+
+/* x86_64: Complete KCFI check sequence should be present.  */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r1[01]d\n\taddl\t[^,]+, %r1[01]d\n\tje\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tud2} { target x86_64-*-* } } } */
+
+/* AArch64: Complete KCFI check sequence should be present.  */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-[0-9]+\]\n\tmov\tw17, #[0-9]+\n\tmovk\tw17, #[0-9]+, lsl #16\n\tcmp\tw16, w17\n\tb\.eq\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tbrk\t#[0-9]+\n\1:\n\tblr\tx[0-9]+} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Complete KCFI check sequence should be present with stack
+   spilling.  */
+/* { dg-final { scan-assembler {push\t\{r0, r1\}\n\tldr\tr0, \[r[0-9]+, #-[0-9]+\]\n\tmovw\tr1, #[0-9]+\n\tmovt\tr1, #[0-9]+\n\tcmp\tr0, r1\n\tpop\t\{r0, r1\}\n\tbeq\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\t#[0-9]+\n\.Lkcfi_call[0-9]+:\n\tblx\tr[0-9]+} { target arm32 } } } */
+
+/* RISC-V: Complete KCFI check sequence should be present.  */
+/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, \.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tebreak} { target riscv*-*-* } } } */
+
+/* Should have trap section with entries.  */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
+
+/* AArch64 should NOT have trap section (uses brk immediate instead).  */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit should NOT have trap section (uses udf immediate instead).  */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
new file mode 100644
index 000000000000..b0a9e11f1f3c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
@@ -0,0 +1,108 @@
+/* Test basic KCFI functionality - preamble generation.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
+
+/* Extern function declarations - should NOT get KCFI preambles.  */
+extern void external_func(void);
+extern int external_func_int(int x);
+
+void regular_function(int x) {
+    /* This should get KCFI preamble.  */
+}
+
+void static_target_function(int x) {
+    /* Target function that can be called indirectly.  */
+}
+
+__attribute__((nocf_check))
+void nocf_check_function(int x) {
+    /* This function has nocf_check attribute - should NOT get KCFI preamble.  */
+}
+
+static void static_caller(void) {
+    /* Static function that makes an indirect call.
+       It should NOT get a KCFI preamble (not address-taken),
+       but must generate a KCFI check for the indirect call.  */
+    void (*local_ptr)(int) = static_target_function;
+    local_ptr(42);  /* This should generate KCFI check.  */
+}
+
+/* Make these functions address-taken via global function pointers.  */
+void (*func_ptr)(int) = regular_function;
+void (*ext_ptr)(void) = external_func;
+void (__attribute__((nocf_check)) *nocf_ptr)(int) = nocf_check_function;
+
+int main() {
+    func_ptr(42);
+    ext_ptr();        /* Indirect call to external_func.  */
+    external_func_int(10);  /* Direct call to external_func_int.  */
+    static_caller();  /* Direct call to static function.  */
+    return 0;
+}
+
+/* Verify KCFI preamble exists for regular_function.  */
+/* { dg-final { scan-assembler {__cfi_regular_function:} } } */
+
+/* Verify the KCFI preamble symbol comes before the function's own symbol.  */
+/* { dg-final { scan-assembler {__cfi_regular_function:.*regular_function:} } } */
+
+/* Target function should have preamble (address-taken).  */
+/* { dg-final { scan-assembler {__cfi_static_target_function:} } } */
+
+/* Static caller should NOT have preamble (it's only called directly,
+   not address-taken). */
+/* { dg-final { scan-assembler-not {__cfi_static_caller:} } } */
+
+/* Function with nocf_check attribute should NOT have preamble.  */
+/* { dg-final { scan-assembler-not {__cfi_nocf_check_function:} } } */
+
+/* x86_64: Verify type ID in preamble (after NOPs, before function label) */
+/* { dg-final { scan-assembler {__cfi_regular_function:\n\t+nop\n.*\n\t+movl\t+\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
+
+/* AArch64: Verify type ID word in preamble.  */
+/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Verify type ID word in preamble.  */
+/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target arm32 } } } */
+
+/* RISC-V: Verify type ID word in preamble.  */
+/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
+
+/* x86_64: Static function should generate complete KCFI check sequence.  */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d\n\tje\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tud2\n.*\n\1:\n\tcall} { target x86_64-*-* } } } */
+
+/* AArch64: Static function should generate complete KCFI check sequence.  */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]\n\tmov\tw17, #[0-9]+\n\tmovk\tw17, #[0-9]+, lsl #16\n\tcmp\tw16, w17\n\tb\.eq\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tbrk\t#[0-9]+\n\1:\n\tblr} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Static function should generate complete KCFI check sequence
+   with stack spilling.  */
+/* { dg-final { scan-assembler {push\t\{r0, r1\}\n\tldr\tr0, \[r[0-9]+, #-4\]\n\tmovw\tr1, #[0-9]+\n\tmovt\tr1, #[0-9]+\n\tcmp\tr0, r1\n\tpop\t\{r0, r1\}\n\tbeq\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\t#[0-9]+\n\.Lkcfi_call[0-9]+:\n\tblx\tr[0-9]+} { target arm32 } } } */
+
+/* RISC-V: Static function should generate KCFI check for indirect call.  */
+/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, (\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tebreak\n\t\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry[0-9]+:\n\t\.4byte\t\.Lkcfi_trap[0-9]+-\.Lkcfi_entry[0-9]+\n\t\.text\n\1:\n\tjalr} { target riscv*-*-* } } } */
+
+/* Extern functions should NOT get KCFI preambles.  */
+/* { dg-final { scan-assembler-not {__cfi_external_func:} } } */
+/* { dg-final { scan-assembler-not {__cfi_external_func_int:} } } */
+
+/* Local functions should NOT get __kcfi_typeid_ symbols.  */
+/* Only external declarations that are address-taken should get __kcfi_typeid_ */
+/* { dg-final { scan-assembler-not {__kcfi_typeid_regular_function} } } */
+/* { dg-final { scan-assembler-not {__kcfi_typeid_main} } } */
+
+/* External address-taken functions should get __kcfi_typeid_ symbols.  */
+/* { dg-final { scan-assembler {__kcfi_typeid_external_func} } } */
+
+/* External functions that are only called directly should NOT get
+   __kcfi_typeid_ symbols.  */
+/* { dg-final { scan-assembler-not {__kcfi_typeid_external_func_int} } } */
+
+/* Should have trap section for KCFI checks.  */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
+
+/* AArch64 should NOT have trap section (uses brk immediate instead).  */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit should NOT have trap section (uses udf immediate instead).  */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
new file mode 100644
index 000000000000..f34d5f88547f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
@@ -0,0 +1,84 @@
+/* Test KCFI check sharing bug - optimizer incorrectly shares KCFI checks
+   between different function types.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-O2" } */
+
+/* Reproduce the pattern from Linux kernel internal_create_group where:
+   - Two different function pointer types (is_visible vs is_bin_visible).
+   - Both get loaded into the same register (%rcx).
+   - Optimizer creates shared KCFI check with wrong type ID.
+   - This causes CFI failures in production kernel.  */
+
+struct kobject { int dummy; };
+struct attribute { int dummy; };
+struct bin_attribute { int dummy; };
+
+struct attribute_group {
+    const char *name;
+    // Type ID A
+    int (*is_visible)(struct kobject *, struct attribute *, int);
+    // Type ID B
+    int (*is_bin_visible)(struct kobject *, const struct bin_attribute *, int);
+    struct attribute **attrs;
+    const struct bin_attribute **bin_attrs;
+};
+
+/* Function that mimics __first_visible from kernel - gets inlined into
+   caller.  */
+static int __first_visible(const struct attribute_group *grp, struct kobject *kobj)
+{
+    /* Path 1: Call is_visible function pointer.  */
+    if (grp->attrs && grp->attrs[0] && grp->is_visible)
+        return grp->is_visible(kobj, grp->attrs[0], 0);
+
+    /* Path 2: Call is_bin_visible function pointer.  */
+    if (grp->bin_attrs && grp->bin_attrs[0] && grp->is_bin_visible)
+        return grp->is_bin_visible(kobj, grp->bin_attrs[0], 0);
+
+    return 0;
+}
+
+/* Main function that triggers the optimization bug.  */
+int test_kcfi_check_sharing(struct kobject *kobj, const struct attribute_group *grp)
+{
+    /* This should inline __first_visible and create the problematic pattern where:
+       1. Both function pointers get loaded into same register.
+       2. Optimizer shares KCFI check between them.
+       3. Uses wrong type ID for one of the calls.  */
+    return __first_visible(grp, kobj);
+}
+
+/* Each indirect call should have its own KCFI check with correct type ID.
+
+   Should see:
+   1. KCFI check for is_visible call with is_visible type ID.
+   2. KCFI check for is_bin_visible call with is_bin_visible type ID.  */
+
+/* Verify we have TWO different KCFI check sequences.  */
+/* Each check should have different type ID constants.  */
+/* x86: { dg-final { scan-assembler-times {movl\s+\$-?[0-9]+,\s+%r10d} 2 { target i?86-*-* x86_64-*-* } } } */
+/* AArch64: { dg-final { scan-assembler-times {mov\s+w17, #[0-9]+} 2 { target aarch64*-*-* } } } */
+/* ARM 32-bit: { dg-final { scan-assembler-times {movw\s+r1, #[0-9]+} 2 { target arm32 } } } */
+/* RISC-V: { dg-final { scan-assembler-times {lui\tt2, [0-9]+} 2 { target riscv*-*-* } } } */
+
+/* Verify the checks use DIFFERENT type IDs (not shared).
+   We should NOT see the same type ID used twice - that would indicate
+   sharing bug.  */
+/* x86: { dg-final { scan-assembler-not {movl\s+\$(-?[0-9]+),\s+%r10d.*movl\s+\$\1,\s+%r10d} { target i?86-*-* x86_64-*-* } } } */
+/* AArch64: { dg-final { scan-assembler-not {mov\s+w17, #([0-9]+).*mov\s+w17, #\1} { target aarch64*-*-* } } } */
+/* ARM 32-bit: { dg-final { scan-assembler-not {movw\s+r1, #([0-9]+).*movw\s+r1, #\1} { target arm32 } } } */
+/* RISC-V: { dg-final { scan-assembler-not {lui\s+t2, ([0-9]+)\s.*lui\s+t2, \1\s} { target riscv*-*-* } } } */
+
+/* Verify each call follows its own check (not shared) */
+/* Should have 2 separate trap instructions.  */
+/* x86: { dg-final { scan-assembler-times {ud2} 2 { target i?86-*-* x86_64-*-* } } } */
+/* AArch64: { dg-final { scan-assembler-times {brk\s+#[0-9]+} 2 { target aarch64*-*-* } } } */
+/* ARM 32-bit: { dg-final { scan-assembler-times {udf\s+#[0-9]+} 2 { target arm32 } } } */
+/* RISC-V: { dg-final { scan-assembler-times {ebreak} 2 { target riscv*-*-* } } } */
+
+/* Verify 2 separate call sites.  */
+/* x86: { dg-final { scan-assembler-times {jmp\s+\*%[a-z0-9]+} 2 { target i?86-*-* x86_64-*-* } } } */
+/* AArch64: Match br (tail call); note this pattern does not match blr.  */
+/* AArch64: { dg-final { scan-assembler-times {br\tx[0-9]+} 2 { target aarch64*-*-* } } } */
+/* ARM 32-bit: { dg-final { scan-assembler-times {bx\s+(?:r[0-9]+|ip)} 2 { target arm32 } } } */
+/* RISC-V: { dg-final { scan-assembler-times {jalr\t[a-z0-9]+} 2 { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
new file mode 100644
index 000000000000..17def558ada4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
@@ -0,0 +1,136 @@
+/* Test KCFI cold function and cold partition behavior.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-freorder-blocks-and-partition" { target freorder } } */
+
+void regular_function(void) {
+    /* Regular function should get preamble.  */
+}
+
+/* Cold-attributed function should STILL get preamble (it's a regular
+   function, just marked cold).  */
+__attribute__((cold))
+void cold_attributed_function(void) {
+    /* This function has cold attribute but should still get KCFI preamble.  */
+}
+
+/* Hot-attributed function should get preamble.  */
+__attribute__((hot))
+void hot_attributed_function(void) {
+    /* This function is explicitly hot and should get KCFI preamble.  */
+}
+
+/* Global to prevent optimization from eliminating cold paths.  */
+extern void abort(void);
+
+/* Additional function to test that normal functions still get preambles.  */
+__attribute__((noinline))
+int another_regular_function(int x) {
+    return x + 42;
+}
+
+/* Function designed to generate cold partitions under optimization.  */
+__attribute__((noinline))
+void function_with_cold_partition(int condition) {
+    /* Hot path - very likely to execute.  */
+    if (__builtin_expect(condition == 42, 1)) {
+        /* Simple hot path that optimizer will keep inline.  */
+        return;
+    }
+
+    /* Cold paths that actually do something to prevent elimination.  */
+    if (__builtin_expect(condition < 0, 0)) {
+        /* Error path 1 - call abort to prevent elimination.  */
+        abort();
+    }
+
+    if (__builtin_expect(condition > 1000000, 0)) {
+        /* Error path 2 - call abort to prevent elimination.  */
+        abort();
+    }
+
+    if (__builtin_expect(condition == 999999, 0)) {
+        /* Error path 3 - more substantial cold code.  */
+        volatile int sum = 0;
+        for (volatile int i = 0; i < 100; i++) {
+            sum += i * condition;
+        }
+        if (sum > 0)
+            abort();
+    }
+
+    /* More cold paths - switch with many unlikely cases.  */
+    switch (condition) {
+        case 1000001: case 1000002: case 1000003: case 1000004: case 1000005:
+        case 1000006: case 1000007: case 1000008: case 1000009: case 1000010:
+            /* Each case does some work before abort.  */
+            volatile int work = condition * 2;
+            if (work > 0) abort();
+            break;
+        default:
+            if (condition != 42) {
+                /* Fallback cold path - substantial work.  */
+                volatile int result = 0;
+                for (volatile int j = 0; j < condition % 50; j++) {
+                    result += j;
+                }
+                if (result >= 0) abort();
+            }
+    }
+}
+
+/* Test function pointers to ensure address-taken detection works.  */
+void test_function_pointers(void) {
+    void (*regular_ptr)(void) = regular_function;
+    void (*cold_ptr)(void) = cold_attributed_function;
+    void (*hot_ptr)(void) = hot_attributed_function;
+
+    regular_ptr();
+    cold_ptr();
+    hot_ptr();
+}
+
+int main() {
+    regular_function();
+    cold_attributed_function();
+    hot_attributed_function();
+    function_with_cold_partition(42); /* Normal case - stay in hot path.  */
+    another_regular_function(5);
+    test_function_pointers();
+    return 0;
+}
+
+/* Regular function should have preamble.  */
+/* { dg-final { scan-assembler "__cfi_regular_function:" } } */
+
+/* Cold-attributed function should STILL have preamble (it's a legitimate function) */
+/* { dg-final { scan-assembler "__cfi_cold_attributed_function:" } } */
+
+/* Hot-attributed function should have preamble.  */
+/* { dg-final { scan-assembler "__cfi_hot_attributed_function:" } } */
+
+/* Function that generates cold partitions should have preamble for main entry.  */
+/* { dg-final { scan-assembler "__cfi_function_with_cold_partition:" } } */
+
+/* Address-taken functions should have preambles.  */
+/* { dg-final { scan-assembler "__cfi_test_function_pointers:" } } */
+
+/* The function should generate a .cold partition (only on targets that support freorder) */
+/* { dg-final { scan-assembler "function_with_cold_partition\\.cold:" { target freorder } } } */
+
+/* The .cold partition should NOT get a __cfi_ preamble since it's never
+   reached via indirect calls.  */
+/* { dg-final { scan-assembler-not "__cfi_function_with_cold_partition\\.cold:" { target freorder } } } */
+
+/* Additional regular function should get preamble.  */
+/* { dg-final { scan-assembler "__cfi_another_regular_function:" } } */
+
+/* Test coverage summary:
+   1. Cold-attributed function (__attribute__((cold))): SHOULD get preamble
+   2. Cold partition (-freorder-blocks-and-partition): should NOT get preamble
+   3. IPA split .part function (split_part=true): not exercised here; preamble would be skipped
+
+   Note: IPA function splitting (creating .part functions with split_part=true) requires
+   specific optimization conditions that are difficult to trigger reliably in tests.
+   The KCFI logic correctly handles this case using the split_part flag check.
+*/
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
new file mode 100644
index 000000000000..b9a8955b0899
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
@@ -0,0 +1,135 @@
+/* Test KCFI with complex addressing modes (structure members, array
+   elements).  This is a regression test for the change_address_1 RTL
+   error that occurred when target_addr was PLUS(reg, offset) instead
+   of a simple register.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-O2" } */
+
+struct function_table {
+    int (*callback1)(int);
+    int (*callback2)(int, int);
+    void (*callback3)(void);
+    int (*callback4)(void *, void *, void *, void *, void *, void *);
+    int data;
+};
+
+static int handler1(int x) {
+    return x * 2;
+}
+
+static int handler2(int x, int y) {
+    return x + y;
+}
+
+static void handler3(void) {
+    /* Empty handler.  */
+}
+
+/* Test indirect calls through structure members - this creates
+   PLUS(reg, offset) addressing.  */
+int test_struct_members(struct function_table *table) {
+    int result = 0;
+
+    /* These indirect calls will generate complex addressing modes:
+     * call *(%rdi)          - callback1 at offset 0
+     * call *8(%rdi)         - callback2 at offset 8
+     * call *16(%rdi)        - callback3 at offset 16
+     * KCFI must handle PLUS(reg, struct_offset) + kcfi_offset.  */
+
+    result += table->callback1(10);
+    result += table->callback2(5, 7);
+    table->callback3();
+
+    return result;
+}
+
+/* Test indirect calls through array elements - another source of
+   complex addressing.  */
+typedef int (*func_array_t)(int);
+
+int test_array_elements(func_array_t functions[], int index) {
+    /* This creates addressing like MEM[PLUS(PLUS(reg, index*8), 0)]
+       which should be simplified to MEM[PLUS(reg, index*8)].  */
+    return functions[index](42);
+}
+
+/* Test with global structure.  */
+static struct function_table global_table = {
+    .callback1 = handler1,
+    .callback2 = handler2,
+    .callback3 = handler3,
+    .data = 100
+};
+
+int test_global_struct(void) {
+    /* Access through global structure - may generate different
+       addressing patterns.  */
+    return global_table.callback1(20) + global_table.callback2(3, 4);
+}
+
+/* Test nested structure access.  */
+struct nested_table {
+    struct function_table inner;
+    int extra_data;
+};
+
+int test_nested_struct(struct nested_table *nested) {
+    /* Even more complex addressing: nested structure member access.  */
+    return nested->inner.callback1(15);
+}
+
+int test_many_args(void *one, void *two, void *three, void *four, void *five, void *six)
+{
+    return (unsigned long)one + (unsigned long)two + (unsigned long)three
+	   + (unsigned long)four + (unsigned long)five + (unsigned long)six;
+}
+
+int main() {
+    struct function_table local_table = {
+        .callback1 = handler1,
+        .callback2 = handler2,
+        .callback3 = handler3,
+        .callback4 = test_many_args,
+        .data = 50
+    };
+
+    func_array_t func_array[] = { handler1, handler1, handler1 };
+
+    int result = 0;
+    result += test_struct_members(&local_table);
+    result += test_array_elements(func_array, 1);
+    result += test_global_struct();
+
+    struct nested_table nested = { .inner = local_table, .extra_data = 200 };
+    result += test_nested_struct(&nested);
+
+    result += local_table.callback4(handler1, handler2, handler3, &result, main, &local_table);
+
+    return result;
+}
+
+/* Verify that all address-taken functions get KCFI preambles.  */
+/* { dg-final { scan-assembler {__cfi_handler1:} } } */
+/* { dg-final { scan-assembler {__cfi_handler2:} } } */
+/* { dg-final { scan-assembler {__cfi_handler3:} } } */
+/* { dg-final { scan-assembler {__cfi_test_many_args:} } } */
+
+/* x86_64: Verify KCFI checks are generated for indirect calls through
+   complex addressing.  */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {ud2} { target x86_64-*-* } } } */
+
+/* AArch64: Verify KCFI checks for complex addressing.  */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler {brk} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Verify KCFI checks for complex addressing with stack spilling.  */
+/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
+/* { dg-final { scan-assembler {udf} { target arm32 } } } */
+
+/* RISC-V: Verify KCFI check sequence for complex addressing.  */
+/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, \.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tebreak} { target riscv*-*-* } } } */
+
+/* Should have trap section for x86 and RISC-V only.  */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
new file mode 100644
index 000000000000..a43bcd4f3e3f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
@@ -0,0 +1,54 @@
+/* Test KCFI IPA pass robustness with compiler-generated constructs.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-O2" } */
+
+#include <stddef.h>
+
+/* Test various compiler-generated constructs that could confuse IPA pass.  */
+
+/* static_assert - this was causing the original crash.  */
+typedef struct {
+    int field1;
+    char field2;
+} test_struct_t;
+
+static_assert(offsetof(test_struct_t, field1) == 0, "layout check 1");
+static_assert(offsetof(test_struct_t, field2) == 4, "layout check 2");
+static_assert(sizeof(test_struct_t) >= 5, "size check");
+
+/* Regular functions that should get KCFI analysis.  */
+void regular_function(void) {
+    /* Should get KCFI preamble.  */
+}
+
+static void static_function(void) {
+    /* With -O2: correctly identified as not address-taken, no preamble.  */
+}
+
+void address_taken_function(void) {
+    /* Should get KCFI preamble (address taken below).  */
+}
+
+/* Function pointer to create address-taken scenario.  */
+void (*func_ptr)(void) = address_taken_function;
+
+/* More static_asserts mixed with function definitions.  */
+static_assert(sizeof(void*) >= 4, "pointer size check");
+
+int main(void) {
+    regular_function();    /* Direct call.  */
+    static_function();     /* Direct call to static.  */
+    func_ptr();            /* Indirect call.  */
+
+    static_assert(sizeof(int) == 4, "int size check");
+
+    return 0;
+}
+
+/* Verify KCFI preambles are generated appropriately.  */
+/* { dg-final { scan-assembler "__cfi_regular_function:" } } */
+/* { dg-final { scan-assembler "__cfi_address_taken_function:" } } */
+/* { dg-final { scan-assembler "__cfi_main:" } } */
+
+/* With -O2: static_function correctly identified as not address-taken.  */
+/* { dg-final { scan-assembler-not "__cfi_static_function:" } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
new file mode 100644
index 000000000000..50029d136716
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
@@ -0,0 +1,55 @@
+/* Test that KCFI preserves function pointer moves at -O2 optimization.
+   This test ensures that the combine pass doesn't incorrectly optimize away
+   the move instruction needed to transfer function pointers from argument
+   registers to the target registers used by KCFI patterns.  */
+
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -std=gnu11" } */
+
+static int called_count = 0;
+
+/* Function taking one argument, returning void.  */
+static __attribute__((noinline)) void increment_void(int *counter)
+{
+    (*counter)++;
+}
+
+/* Function taking one argument, returning int.  */
+static __attribute__((noinline)) int increment_int(int *counter)
+{
+    (*counter)++;
+    return *counter;
+}
+
+/* Don't allow the compiler to inline the calls.  */
+static __attribute__((noinline)) void indirect_call(void (*func)(int *))
+{
+    func(&called_count);
+}
+
+int main(void)
+{
+    /* This should work - matching prototype.  */
+    indirect_call(increment_void);
+
+    /* This should trap - mismatched prototype.  */
+    indirect_call((void *)increment_int);
+
+    return 0;
+}
+
+/* Verify complete KCFI check sequence with preserved move instruction.  At
+   -O2, the combine pass previously optimized away the move from %rdi to %rax,
+   breaking KCFI.  Verify the full sequence is preserved.  */
+
+/* x86_64: Complete KCFI sequence with move preservation and indirect jump.  */
+/* { dg-final { scan-assembler {(indirect_call):.*\n.*movq\s+%rdi,\s+(%rax)\n.*movl\s+\$[0-9]+,\s+%r10d\n\taddl\s+-4\(\2\),\s+%r10d\n\tje\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tud2.*\.Lkcfi_call[0-9]+:\n\tjmp\s+\*\2.*\.size\s+\1,\s+\.-\1} { target x86_64-*-* } } } */
+
+/* AArch64: Complete KCFI sequence with move preservation and indirect branch.  */
+/* { dg-final { scan-assembler {(indirect_call):.*\n.*mov\s+(x[0-9]+),\s+x0\n.*ldur\s+w16,\s+\[\2,\s+#-4\]\n\tmov\s+w17,\s+#[0-9]+\n\tmovk\s+w17,\s+#[0-9]+,\s+lsl\s+#16\n\tcmp\s+w16,\s+w17\n\tb\.eq\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tbrk\s+#[0-9]+.*\.Lkcfi_call[0-9]+:\n\tbr\s+\2.*\.size\s+\1,\s+\.-\1} { target aarch64*-*-* } } } */
+
+/* ARM32: Complete KCFI sequence with move preservation and indirect branch.  */
+/* { dg-final { scan-assembler {(indirect_call):.*\n.*mov\s+(r[0-9]+),\s+r0\n.*push\s+\{r0,\s+r1\}\n\tldr\s+r0,\s+\[\2,\s+#-4\]\n\tmovw\s+r1,\s+#[0-9]+\n\tmovt\s+r1,\s+#[0-9]+\n\tcmp\s+r0,\s+r1\n\tpop\s+\{r0,\s+r1\}\n\tbeq\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\s+#[0-9]+.*\.Lkcfi_call[0-9]+:\n\tbx\s+\2.*\.size\s+\1,\s+\.-\1} { target arm32 } } } */
+
+/* RISC-V: Complete KCFI sequence with move preservation and indirect jump.  */
+/* { dg-final { scan-assembler {(indirect_call):.*mv\s+(a[0-9]+),a0.*lw\s+t1,\s+-4\(\2\).*lui\s+t2,\s+[0-9]+.*addiw\s+t2,\s+t2,\s+-?[0-9]+.*beq\s+t1,\s+t2,\s+\.Lkcfi_call[0-9]+.*ebreak.*jalr\s+zero,\s+\2,\s+0.*\.size\s+\1,\s+\.-\1} { target riscv64-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
new file mode 100644
index 000000000000..c43d8014ff2d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
@@ -0,0 +1,100 @@
+/* Test that no_sanitize("kcfi") attribute is preserved during inlining.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-O2" } */
+
+extern void external_side_effect(int value);
+
+/* Regular function (should get KCFI checks).  */
+__attribute__((noinline))
+void normal_function(void (*callback)(int))
+{
+    /* This indirect call must generate KCFI checks.  */
+    callback(300);
+    external_side_effect(300);
+}
+
+/* Regular function marked with no_sanitize("kcfi") (negative control).  */
+__attribute__((noinline, no_sanitize("kcfi")))
+void sensitive_non_inline_function(void (*callback)(int))
+{
+    /* This indirect call should NOT generate KCFI checks.  */
+    callback(100);
+    external_side_effect(100);
+}
+
+/* Function marked with both no_sanitize("kcfi") and always_inline.  */
+__attribute__((always_inline, no_sanitize("kcfi")))
+static inline void sensitive_inline_function(void (*callback)(int))
+{
+    /* This indirect call should NOT generate KCFI checks when inlined.  */
+    callback(42);
+    external_side_effect(42);
+}
+
+/* Explicit wrapper for testing sensitive_inline_function behavior.  */
+__attribute__((noinline))
+void wrap_sensitive_inline(void (*callback)(int))
+{
+    sensitive_inline_function(callback);
+}
+
+/* Function marked with only always_inline (should get KCFI checks).  */
+__attribute__((always_inline))
+static inline void normal_inline_function(void (*callback)(int))
+{
+    /* This indirect call must generate KCFI checks when inlined.  */
+    callback(200);
+    external_side_effect(200);
+}
+
+/* Explicit wrapper for testing normal_inline_function behavior.  */
+__attribute__((noinline))
+void wrap_normal_inline(void (*callback)(int))
+{
+    normal_inline_function(callback);
+}
+
+void test_callback(int value)
+{
+    external_side_effect(value);
+}
+
+static void (*volatile function_pointer)(int) = test_callback;
+
+int main(void)
+{
+    void (*fn_ptr)(int) = function_pointer;
+
+    normal_function(fn_ptr);
+    wrap_normal_inline(fn_ptr);
+    sensitive_non_inline_function(fn_ptr);
+    wrap_sensitive_inline(fn_ptr);
+
+    return 0;
+}
+
+/* Verify correct number of KCFI checks: exactly 2.  */
+/* { dg-final { scan-assembler-times {ud2} 2 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times {brk\s+#[0-9]+} 2 { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-times {udf\s+#[0-9]+} 2 { target arm32 } } } */
+/* { dg-final { scan-assembler-times {ebreak} 2 { target riscv*-*-* } } } */
+
+/* Positive controls: these should have KCFI checks.  */
+/* { dg-final { scan-assembler {normal_function:.*ud2.*\.size\s+normal_function} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {wrap_normal_inline:.*ud2.*\.size\s+wrap_normal_inline} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {normal_function:.*brk\s+#[0-9]+.*\.size\s+normal_function} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler {wrap_normal_inline:.*brk\s+#[0-9]+.*\.size\s+wrap_normal_inline} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler {normal_function:.*udf\t#[0-9]+.*\.size\s+normal_function} { target arm32 } } } */
+/* { dg-final { scan-assembler {wrap_normal_inline:.*udf\t#[0-9]+.*\.size\s+wrap_normal_inline} { target arm32 } } } */
+/* { dg-final { scan-assembler {normal_function:.*ebreak.*\.size\s+normal_function} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler {wrap_normal_inline:.*ebreak.*\.size\s+wrap_normal_inline} { target riscv*-*-* } } } */
+
+/* Negative controls: these should NOT have KCFI checks.  */
+/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*ud2.*\.size\s+sensitive_non_inline_function} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*ud2.*\.size\s+wrap_sensitive_inline} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*brk\s+#[0-9]+.*\.size\s+sensitive_non_inline_function} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*brk\s+#[0-9]+.*\.size\s+wrap_sensitive_inline} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-not {sensitive_non_inline_function:[^\n]*udf\t#[0-9]+[^\n]*\.size\tsensitive_non_inline_function} { target arm32 } } } */
+/* { dg-final { scan-assembler-not {wrap_sensitive_inline:[^\n]*udf\t#[0-9]+[^\n]*\.size\twrap_sensitive_inline} { target arm32 } } } */
+/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*ebreak.*\.size\s+sensitive_non_inline_function} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*ebreak.*\.size\s+wrap_sensitive_inline} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
new file mode 100644
index 000000000000..6f1a558c0820
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
@@ -0,0 +1,39 @@
+/* Test KCFI with no_sanitize attribute.  */
+/* { dg-do compile } */
+
+void target_function(void) {
+    /* This should get KCFI preamble.  */
+}
+
+void caller_with_checks(void) {
+    /* This function should generate KCFI checks.  */
+    void (*func_ptr)(void) = target_function;
+    func_ptr();
+}
+
+__attribute__((no_sanitize("kcfi")))
+void caller_no_checks(void) {
+    /* This function should NOT generate KCFI checks due to no_sanitize.  */
+    void (*func_ptr)(void) = target_function;
+    func_ptr();
+}
+
+int main() {
+    caller_with_checks();    /* This should generate checks inside.  */
+    caller_no_checks();      /* This should NOT generate checks inside.  */
+    return 0;
+}
+
+/* All functions should get preambles regardless of no_sanitize.  */
+/* { dg-final { scan-assembler "__cfi_target_function:" } } */
+/* { dg-final { scan-assembler "__cfi_caller_with_checks:" } } */
+/* { dg-final { scan-assembler "__cfi_caller_no_checks:" } } */
+/* { dg-final { scan-assembler "__cfi_main:" } } */
+
+/* caller_with_checks() should generate KCFI check.
+   caller_no_checks() should NOT generate KCFI check (no_sanitize).
+   So a total of exactly 1 KCFI check in the entire program.  */
+/* { dg-final { scan-assembler-times {addl\t-4\(%r[ad]x\), %r1[01]d} 1 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times {ldur\tw16, \[x[0-9]+, #-4\]} 1 { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-times {ldr\tr0, \[r[0-9]+, #-4\]} 1 { target arm32 } } } */
+/* { dg-final { scan-assembler-times {lw\tt1, -[0-9]+\(} 1 { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
new file mode 100644
index 000000000000..f93a042d9752
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
@@ -0,0 +1,48 @@
+/* Test KCFI call-site offset validation across architectures.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
+
+void target_func_a(void) { }
+void target_func_b(int x) { }
+void target_func_c(int x, int y) { }
+
+int main() {
+    void (*ptr_a)(void) = target_func_a;
+    void (*ptr_b)(int) = target_func_b;
+    void (*ptr_c)(int, int) = target_func_c;
+
+    /* Multiple indirect calls.  */
+    ptr_a();
+    ptr_b(1);
+    ptr_c(1, 2);
+
+    return 0;
+}
+
+/* Should have KCFI preambles for all functions.  */
+/* { dg-final { scan-assembler "__cfi_target_func_a:" } } */
+/* { dg-final { scan-assembler "__cfi_target_func_b:" } } */
+/* { dg-final { scan-assembler "__cfi_target_func_c:" } } */
+
+/* x86_64: All call sites should use -4 offset for KCFI type ID loads, even
+   with -falign-functions=16 (we're not using patchable entries here).  */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
+
+/* AArch64: All call sites should use -4 offset.  */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: All call sites should use -4 offset with stack spilling.  */
+/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
+
+/* RISC-V: All call sites should use -4 offset.  */
+/* { dg-final { scan-assembler {lw\tt1, -4\(} { target riscv*-*-* } } } */
+
+/* Should have trap section.  */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
+
+/* AArch64 should NOT have trap section (uses brk immediate instead).  */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit should NOT have trap section (uses udf immediate instead).  */
+/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
new file mode 100644
index 000000000000..a2d0ef0c6ff6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
@@ -0,0 +1,70 @@
+/* Test KCFI with patchable function entries - basic case.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-fpatchable-function-entry=5,2" } */
+/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
+
+void test_function(int x) {
+    /* Function should get both KCFI preamble and patchable entries.  */
+}
+
+int main() {
+    test_function(42);
+    return 0;
+}
+
+/* Should have KCFI preamble.  */
+/* { dg-final { scan-assembler "__cfi_test_function:" } } */
+
+/* Should have patchable function entry section.  */
+/* { dg-final { scan-assembler "__patchable_function_entries" } } */
+
+/* x86_64: Should have exactly 2 prefix NOPs between .LPFE and .type.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
+
+/* x86_64: Should have exactly 3 entry NOPs between .cfi_startproc and
+   pushq.  */
+/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
+
+/* x86_64: KCFI should have exactly 9 NOPs between __cfi_ and movl.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl} { target x86_64-*-* } } } */
+
+/* x86_64: Validate KCFI type ID is present.  */
+/* { dg-final { scan-assembler {movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
+
+/* AArch64: Should have exactly 2 prefix NOPs between .LPFE and .type.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
+
+/* AArch64: Should have exactly 3 entry NOPs between .cfi_startproc and
+   stack manipulation.  */
+/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*sub\t*sp} { target aarch64*-*-* } } } */
+
+/* AArch64: KCFI should have alignment NOPs then .word immediate.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
+
+/* AArch64: Validate clean KCFI boundary - .word then immediate end/size.  */
+/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Should have exactly 2 prefix NOPs between .LPFE and .syntax.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.syntax} { target arm32 } } } */
+
+/* ARM 32-bit: Should have exactly 3 entry NOPs after function label.  */
+/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
+
+/* ARM 32-bit: KCFI should have alignment NOPs then .word immediate.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target arm32 } } } */
+
+/* ARM 32-bit: Validate clean KCFI boundary - .word then immediate end/size.  */
+/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target arm32 } } } */
+
+/* RISC-V: Should have exactly 2 prefix NOPs between .LPFE and .type.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
+
+/* RISC-V: Should have exactly 3 entry NOPs before .cfi_startproc followed
+   by addi sp.  */
+/* { dg-final { scan-assembler {nop\n\t*nop\n\t*nop\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*addi\t*sp} { target riscv*-*-* } } } */
+
+/* RISC-V: KCFI should have alignment NOPs then .word immediate.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
+
+/* RISC-V: Validate clean KCFI boundary - .word then immediate end/size.  */
+/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
new file mode 100644
index 000000000000..62e1926e107e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
@@ -0,0 +1,62 @@
+/* Test KCFI with patchable function entries - entry NOPs only.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-fpatchable-function-entry=4,0" } */
+/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
+
+void test_function(void) {
+}
+
+static void caller(void) {
+    /* Make an indirect call to test callsite offset calculation.  */
+    void (*func_ptr)(void) = test_function;
+    func_ptr();
+}
+
+int main() {
+    test_function();  /* Direct call.  */
+    caller();         /* Indirect call via static function.  */
+    return 0;
+}
+
+/* x86_64: Should have KCFI preamble with architecture alignment NOPs (11).  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+movl\t+\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
+
+/* AArch64: Should have KCFI preamble with no alignment NOPs.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Should have KCFI preamble with no alignment NOPs.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t\.word\t0x[0-9a-f]+} { target arm32 } } } */
+
+/* RISC-V: Should have KCFI preamble with no alignment NOPs.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
+
+/* x86_64: Indirect call should use original prefix NOPs (0) for offset
+   calculation: -4 offset.  */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d\n\tje\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tud2\n.*\n\1:\n\tcall} { target x86_64-*-* } } } */
+
+/* x86_64: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
+
+/* AArch64: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*stp} { target aarch64*-*-* } } } */
+
+/* AArch64: No alignment NOPs - function type should come immediately before
+   function.  */
+/* { dg-final { scan-assembler {\.type\t*test_function, %function\n*test_function:} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
+
+/* ARM 32-bit: No alignment NOPs - function type should come immediately
+   before function.  */
+/* { dg-final { scan-assembler {\.type\t*test_function, %function\n*test_function:} { target arm32 } } } */
+
+/* RISC-V: All 4 NOPs are entry NOPs.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\.LFB} { target riscv*-*-* } } } */
+
+/* RISC-V: No alignment NOPs - function type should come immediately
+   before function.  */
+/* { dg-final { scan-assembler {\.type\t*test_function, @function\n*test_function:} { target riscv*-*-* } } } */
+
+/* Should have patchable function entry section.  */
+/* { dg-final { scan-assembler "__patchable_function_entries" } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
new file mode 100644
index 000000000000..3d5618847840
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
@@ -0,0 +1,51 @@
+/* Test KCFI with large patchable function entries.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-fpatchable-function-entry=11,11" } */
+/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
+
+void test_function(void) {
+}
+
+int main() {
+    void (*func_ptr)(void) = test_function;
+    func_ptr();
+    return 0;
+}
+
+/* Should have KCFI preamble.  */
+/* { dg-final { scan-assembler "__cfi_test_function:" } } */
+
+/* Should have patchable function entry section.  */
+/* { dg-final { scan-assembler "__patchable_function_entries" } } */
+
+/* x86_64: Should have exactly 11 prefix NOPs between .LPFE and .type.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
+
+/* x86_64: Should have 0 entry NOPs - function starts immediately with
+   pushq.  */
+/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*pushq\t*%rbp} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target x86_64-*-* } } } */
+
+/* x86_64: KCFI should have 0 alignment NOPs - goes directly to typeid movl.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
+
+/* x86_64: Call site should use -15 offset.  */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-15\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
+
+/* AArch64: Should have exactly 11 prefix NOPs between .LPFE and .type.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Should have exactly 11 prefix NOPs between .LPFE and .type.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
+
+/* AArch64: Call site should use -15 offset.  */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-15\]} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Call site should use -15 offset.  */
+/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-15\]} { target arm32 } } } */
+
+/* RISC-V: Should have 11 prefix NOPs between .LPFE and .type.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
+
+/* RISC-V: Call site should use -15 offset (same as x86/AArch64).  */
+/* { dg-final { scan-assembler {lw\tt1, -15\(} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
new file mode 100644
index 000000000000..4f00a86dbcb7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
@@ -0,0 +1,60 @@
+/* Test KCFI with medium patchable function entries.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-fpatchable-function-entry=8,4" } */
+/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
+
+void test_function(void) {
+}
+
+int main() {
+    void (*func_ptr)(void) = test_function;
+    func_ptr();
+    return 0;
+}
+
+/* Should have KCFI preamble.  */
+/* { dg-final { scan-assembler "__cfi_test_function:" } } */
+
+/* Should have patchable function entry section.  */
+/* { dg-final { scan-assembler "__patchable_function_entries" } } */
+
+/* x86_64: Should have exactly 4 prefix NOPs between .LPFE and .type.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
+
+/* x86_64: Should have exactly 4 entry NOPs between .cfi_startproc and
+   pushq.  */
+/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
+
+/* x86_64: KCFI should have exactly 7 alignment NOPs between __cfi_ and
+   typeid movl.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
+
+/* x86_64: Call site should use -8 offset.  */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-8\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
+
+/* AArch64: Should have exactly 4 prefix NOPs between .LPFE and .type.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
+
+/* AArch64: Should have exactly 4 entry NOPs after .cfi_startproc.  */
+/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Should have exactly 4 prefix NOPs between .LPFE and .syntax.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.syntax} { target arm32 } } } */
+
+/* ARM 32-bit: Should have exactly 4 entry NOPs after function label.  */
+/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
+
+/* AArch64: Call site should use -8 offset.  */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-8\]} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Call site should use -8 offset.  */
+/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-8\]} { target arm32 } } } */
+
+/* RISC-V: Should have exactly 4 prefix NOPs between .LPFE and .type.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
+
+/* RISC-V: Should have 4 entry NOPs.  */
+/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\.LFB} { target riscv*-*-* } } } */
+
+/* RISC-V: Call site should use -8 offset (same as x86/AArch64).  */
+/* { dg-final { scan-assembler {lw\tt1, -8\(} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
new file mode 100644
index 000000000000..98c53ef52989
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
@@ -0,0 +1,60 @@
+/* Test KCFI with patchable function entries - prefix NOPs only.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-fpatchable-function-entry=3,3" } */
+/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
+
+void test_function(void) {
+}
+
+int main() {
+    test_function();
+    return 0;
+}
+
+/* Should have KCFI preamble.  */
+/* { dg-final { scan-assembler "__cfi_test_function:" } } */
+
+/* x86_64: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target x86_64-*-* } } } */
+
+/* x86_64: No entry NOPs - function should start immediately with prologue.  */
+/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*pushq\t*%rbp} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target x86_64-*-* } } } */
+
+/* x86_64: should have exactly 8 alignment NOPs.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl} { target x86_64-*-* } } } */
+
+/* AArch64: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target aarch64*-*-* } } } */
+
+/* AArch64: No entry NOPs - function should start immediately with prologue.  */
+/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*nop\n\t*ret} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target aarch64*-*-* } } } */
+
+/* AArch64: KCFI type ID should have 1 alignment NOP then word.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
+
+/* ARM 32-bit: No entry NOPs - function should start immediately with
+   prologue.  */
+/* { dg-final { scan-assembler {test_function:} { target arm32 } } } */
+/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target arm32 } } } */
+
+/* ARM 32-bit: KCFI type ID should have 1 alignment NOP then word.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target arm32 } } } */
+
+/* RISC-V: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
+/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target riscv*-*-* } } } */
+
+/* RISC-V: No entry NOPs - function should start immediately with
+   .cfi_startproc.  */
+/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target riscv*-*-* } } } */
+
+/* RISC-V: KCFI type ID should have 1 alignment NOP then word.  */
+/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
+
+/* Should have patchable function entry section.  */
+/* { dg-final { scan-assembler "__patchable_function_entries" } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
new file mode 100644
index 000000000000..26323db4572f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
@@ -0,0 +1,104 @@
+/* Test KCFI with position-independent code addressing modes.
+   This is a regression test for complex addressing like
+   PLUS(PLUS(...), symbol_ref) which can occur with PIC and caused
+   change_address_1 RTL errors.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -fpic" } */
+
+/* Global function pointer table that creates PIC addressing.  */
+struct callbacks {
+    int (*handler1)(int);
+    void (*handler2)(void);
+    int (*handler3)(int, int);
+};
+
+static int simple_handler(int x) {
+    return x * 2;
+}
+
+static void void_handler(void) {
+    /* Empty handler.  */
+}
+
+static int complex_handler(int a, int b) {
+    return a + b;
+}
+
+/* Global structure that will require PIC addressing.  */
+struct callbacks global_callbacks = {
+    .handler1 = simple_handler,
+    .handler2 = void_handler,
+    .handler3 = complex_handler
+};
+
+/* Function that uses PIC addressing to access global callbacks.  */
+int test_pic_addressing(int value) {
+    /* These indirect calls through global structure create complex
+       addressing like PLUS(PLUS(GOT_base, symbol_offset), struct_offset)
+       which previously caused RTL errors in KCFI instrumentation.  */
+
+    int result = 0;
+    result += global_callbacks.handler1(value);
+
+    global_callbacks.handler2();
+
+    result += global_callbacks.handler3(value, result);
+
+    return result;
+}
+
+/* Test with function pointer arrays.  */
+static int (*func_array[])(int) = {
+    simple_handler,
+    simple_handler,
+    simple_handler
+};
+
+int test_pic_array(int index, int value) {
+    /* Array access with PIC can also create complex addressing.  */
+    return func_array[index % 3](value);
+}
+
+/* Test with dynamic PIC addressing.  */
+struct callbacks *get_callbacks(void) {
+    return &global_callbacks;
+}
+
+int test_dynamic_pic(int value) {
+    /* Dynamic access through function call creates very complex addressing.  */
+    struct callbacks *cb = get_callbacks();
+    return cb->handler1(value) + cb->handler3(value, value);
+}
+
+int main() {
+    int result = 0;
+    result += test_pic_addressing(10);
+    result += test_pic_array(1, 20);
+    result += test_dynamic_pic(5);
+    return result;
+}
+
+/* Verify that all address-taken functions get KCFI preambles.  */
+/* { dg-final { scan-assembler {__cfi_simple_handler:} } } */
+/* { dg-final { scan-assembler {__cfi_void_handler:} } } */
+/* { dg-final { scan-assembler {__cfi_complex_handler:} } } */
+
+/* x86_64: Verify KCFI checks are generated.  */
+/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {ud2} { target x86_64-*-* } } } */
+
+/* AArch64: Verify KCFI checks.  */
+/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler {brk} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit: Verify KCFI checks with PIC addressing and stack spilling.  */
+/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
+/* { dg-final { scan-assembler {udf} { target arm32 } } } */
+
+/* RISC-V: Verify KCFI checks are generated.  */
+/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler {ebreak} { target riscv*-*-* } } } */
+
+/* Should have trap section.  */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
new file mode 100644
index 000000000000..79e5ca61cdc2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
@@ -0,0 +1,50 @@
+/* Test that KCFI with the retpoline thunk-extern flag forces r11 usage.  */
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-additional-options "-O2 -mindirect-branch=thunk-extern" } */
+
+extern int external_target(void);
+
+/* Test regular call (not tail call) */
+__attribute__((noinline))
+int call_test(int (*func_ptr)(void)) {
+    /* This indirect call should use r11 when both KCFI and
+       -mindirect-branch=thunk-extern are enabled.  */
+    int result = func_ptr();  /* Function parameter prevents direct optimization.  */
+    return result + 1;  /* Prevent tail call optimization.  */
+}
+
+/* Reference external_target to generate the required symbol.  */
+int (*external_func_ptr)(void) = external_target;
+
+/* Test function for sibcalls (tail calls) */
+__attribute__((noinline))
+void sibcall_test(int (**func_ptr)(void)) {
+    /* This sibcall should use r11 when both KCFI and
+       -mindirect-branch=thunk-extern are enabled.  */
+    (*func_ptr)();  /* Tail call - should be optimized to sibcall.  */
+}
+
+/* Should have weak symbol for external function.  */
+/* { dg-final { scan-assembler "__kcfi_typeid_external_target" } } */
+
+/* When both KCFI and -mindirect-branch=thunk-extern are enabled,
+   indirect calls should always use r11 register and convert to extern thunks.  */
+/* { dg-final { scan-assembler-times {call\s+__x86_indirect_thunk_r11} 1 } } */
+
+/* Sibcalls should also use r11 register and convert to extern thunks.  */
+/* { dg-final { scan-assembler-times {jmp\s+__x86_indirect_thunk_r11} 1 } } */
+
+/* Should have exactly 2 KCFI traps (one per function) */
+/* { dg-final { scan-assembler-times {ud2} 2 } } */
+
+/* Should NOT use other registers for indirect calls.  */
+/* { dg-final { scan-assembler-not {call\s+\*%rax} } } */
+/* { dg-final { scan-assembler-not {call\s+\*%rcx} } } */
+/* { dg-final { scan-assembler-not {call\s+\*%rdx} } } */
+/* { dg-final { scan-assembler-not {call\s+\*%rdi} } } */
+
+/* Should NOT use other registers for sibcalls.  */
+/* { dg-final { scan-assembler-not {jmp\s+\*%rax} } } */
+/* { dg-final { scan-assembler-not {jmp\s+\*%rcx} } } */
+/* { dg-final { scan-assembler-not {jmp\s+\*%rdx} } } */
+/* { dg-final { scan-assembler-not {jmp\s+\*%rdi} } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c
new file mode 100644
index 000000000000..6ad8fab5da80
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c
@@ -0,0 +1,151 @@
+/* Test KCFI runtime behavior: working calls and type mismatch trapping.
+   { dg-do run { target native } }
+   { dg-options "-fsanitize=kcfi" } */
+
+#include <stdio.h>
+#include <signal.h>
+#include <setjmp.h>
+#include <stdlib.h>
+#include <string.h>
+
+/* Test functions with different signatures */
+static int func_int_void(void)
+{
+    return 42;
+}
+
+__attribute__((nocf_check))
+static int func_int_void_nocf_check(void)
+{
+    return 42;
+}
+
+static int func_int_int(int x)
+{
+    return x * 4;
+}
+
+/* Global state for signal handling */
+static volatile int trap_occurred = 0;
+static jmp_buf trap_env;
+
+/* Signal handler for KCFI traps */
+static void trap_handler(int sig)
+{
+    trap_occurred = 1;
+    longjmp(trap_env, 1);
+}
+
+/* Compatible indirect call should work */
+static int test_compatible_call(void)
+{
+    typedef int (*int_void_ptr)(void);
+    int_void_ptr ptr = func_int_void;
+
+    fprintf(stderr, "Calling %s(0x%08x) through %s(0x%08x) ...\n",
+	    __builtin_typeinfo_name(typeof(func_int_void)),
+	    __builtin_typeinfo_hash(typeof(func_int_void)),
+	    __builtin_typeinfo_name(typeof(*ptr)),
+	    __builtin_typeinfo_hash(typeof(*ptr)));
+
+    trap_occurred = 0;
+    /* This should work - same signature */
+    int result = ptr();
+
+    return (trap_occurred == 0 && result == 42) ? 1 : 0;
+}
+
+/* Compatible indirect call to nocf_check should not work */
+static int test_nocf_check_trap(void)
+{
+    trap_occurred = 0;
+
+    if (setjmp(trap_env) == 0) {
+      typedef int (__attribute__((nocf_check)) *int_void_ptr_nocf)(void);
+      int_void_ptr_nocf ptr = func_int_void_nocf_check;
+
+      fprintf(stderr, "Calling %s(0x%08x) through %s(0x%08x) ...\n",
+	      __builtin_typeinfo_name(typeof(func_int_void_nocf_check)),
+	      __builtin_typeinfo_hash(typeof(func_int_void_nocf_check)),
+	      __builtin_typeinfo_name(typeof(*ptr)),
+	      __builtin_typeinfo_hash(typeof(*ptr)));
+
+      int result = ptr();
+
+      /* If we get here, the trap didn't occur */
+      return 0;
+    } else {
+      /* We caught the trap - this is expected */
+      return trap_occurred;
+    }
+}
+
+/* Type mismatch should trap */
+static int test_type_mismatch_trap(void)
+{
+    trap_occurred = 0;
+
+    if (setjmp(trap_env) == 0) {
+      /* Cast func_int_void to incompatible void(*)(void) type */
+      typedef void (*void_void_ptr)(void);
+      void_void_ptr ptr = (void_void_ptr)func_int_void;
+
+      fprintf(stderr, "Calling %s(0x%08x) through %s(0x%08x) ...\n",
+	      __builtin_typeinfo_name(typeof(func_int_void)),
+	      __builtin_typeinfo_hash(typeof(func_int_void)),
+	      __builtin_typeinfo_name(typeof(*ptr)),
+	      __builtin_typeinfo_hash(typeof(*ptr)));
+
+      /* This should trap because type IDs don't match:
+         - func_int_void has type ID for int(void)
+         - but we're calling through void(void) pointer type */
+      ptr();
+
+      /* If we get here, the trap didn't occur */
+      return 0;
+    } else {
+      /* We caught the trap - this is expected */
+      return trap_occurred;
+    }
+}
+
+int main(void)
+{
+    struct sigaction sa = {
+      .sa_handler = trap_handler,
+      .sa_flags = SA_NODEFER,
+    };
+    int failed = 3;
+
+    /* Install trap handler.  */
+    if (sigaction(SIGILL, &sa, NULL)) {
+      perror("sigaction");
+      return 1;
+    }
+
+    /* Compatible call should work */
+    if (test_compatible_call()) {
+      printf("OK: matched indirect call succeeded\n");
+      failed--;
+    } else {
+      printf("FAIL\n");
+    }
+
+    /* Using nocf_check should trap */
+    if (test_nocf_check_trap()) {
+      printf("OK: indirect call to nocf_check correctly trapped\n");
+      failed--;
+    } else {
+      printf("FAIL\n");
+    }
+
+    /* Type mismatch should trap */
+    if (test_type_mismatch_trap()) {
+      printf("OK: mismatched indirect call correctly trapped\n");
+      failed--;
+    } else {
+      printf("FAIL\n");
+    }
+
+    return failed;
+}
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
new file mode 100644
index 000000000000..e2e3912fffa3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
@@ -0,0 +1,142 @@
+/* Test KCFI protection when indirect calls get converted to tail calls.  */
+/* { dg-do compile } */
+/* { dg-additional-options "-O2" } */
+
+typedef int (*func_ptr_t)(int);
+typedef void (*void_func_ptr_t)(void);
+
+struct function_table {
+    func_ptr_t process;
+    void_func_ptr_t cleanup;
+};
+
+/* Target functions.  */
+int process_data(int x) { return x * 2; }
+void cleanup_data(void) {}
+
+/* Initialize function table.  */
+volatile struct function_table vtable = {
+    .process = &process_data,
+    .cleanup = &cleanup_data
+};
+
+/* Indirect call through struct member that should become tail call.  */
+int test_struct_indirect_call(int x) {
+    /* This is an indirect call that should be converted to tail call:
+       Without -fno-optimize-sibling-calls should become "jmp *vtable+0(%rip)"
+       With -fno-optimize-sibling-calls should become "call *vtable+0(%rip)"  */
+    return vtable.process(x);
+}
+
+/* Indirect call through function pointer parameter.  */
+int test_param_indirect_call(func_ptr_t handler, int x) {
+    /* This is an indirect call that should be converted to tail call:
+       Without -fno-optimize-sibling-calls should become "jmp *%rdi"
+       With -fno-optimize-sibling-calls should be "call *%rdi"  */
+    return handler(x);
+}
+
+/* Void indirect call through struct member.  */
+void test_void_indirect_call(void) {
+    /* This is an indirect call that should be converted to tail call:
+     * Without -fno-optimize-sibling-calls: should become "jmp *vtable+8(%rip)"
+     * With -fno-optimize-sibling-calls: should be "call *vtable+8(%rip)"  */
+    vtable.cleanup();
+}
+
+/* Non-tail call for comparison (should always be call).  */
+int test_non_tail_indirect_call(func_ptr_t handler, int x) {
+    /* This should never become a tail call - always "call *%rdi"  */
+    int result = handler(x);
+    return result + 1;  /* Prevents tail call optimization.  */
+}
+
+/* Should have KCFI preambles for all functions.  */
+/* { dg-final { scan-assembler-times "__cfi_process_data:" 1 } } */
+/* { dg-final { scan-assembler-times "__cfi_cleanup_data:" 1 } } */
+/* { dg-final { scan-assembler-times "__cfi_test_struct_indirect_call:" 1 } } */
+/* { dg-final { scan-assembler-times "__cfi_test_param_indirect_call:" 1 } } */
+/* { dg-final { scan-assembler-times "__cfi_test_void_indirect_call:" 1 } } */
+/* { dg-final { scan-assembler-times "__cfi_test_non_tail_indirect_call:" 1 } } */
+
+/* Should have exactly 4 KCFI checks for indirect calls
+   (load type ID + compare).  */
+/* { dg-final { scan-assembler-times {movl\t\$-?[0-9]+, %r10d} 4 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times {addl\t-4\(%r[a-z0-9]+\), %r10d} 4 { target x86_64-*-* } } } */
+
+/* Should have exactly 4 trap sections and 4 trap instructions.  */
+/* { dg-final { scan-assembler-times "\\.kcfi_traps" 4 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times "ud2" 4 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times "\\.kcfi_traps" 4 { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler-times "ebreak" 4 { target riscv*-*-* } } } */
+
+/* Should NOT have unprotected direct jumps to vtable.  */
+/* { dg-final { scan-assembler-not {jmp\t\*vtable\(%rip\)} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-not {jmp\t\*vtable\+8\(%rip\)} { target x86_64-*-* } } } */
+
+/* Should have exactly 3 protected tail calls (jmp through register after
+   KCFI check).  */
+/* { dg-final { scan-assembler-times {jmp\t\*%[a-z0-9]+} 3 { target x86_64-*-* } } } */
+
+/* Should have exactly 1 regular call (non-tail call case).  */
+/* { dg-final { scan-assembler-times {call\t\*%[a-z0-9]+} 1 { target x86_64-*-* } } } */
+
+/* RISC-V: Should have exactly 4 KCFI checks for indirect calls
+   (comparison instruction).  */
+/* { dg-final { scan-assembler-times {beq\tt1, t2, \.Lkcfi_call[0-9]+} 4 { target riscv*-*-* } } } */
+
+/* RISC-V: Should have exactly 4 KCFI checks for indirect calls
+   (load type ID + compare).  */
+/* { dg-final { scan-assembler-times {lw\tt1, -4\([a-z0-9]+\)} 4 { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler-times {lui\tt2, [0-9]+} 4 { target riscv*-*-* } } } */
+
+/* RISC-V: Should have exactly 3 protected tail calls (jr after
+   KCFI check - no return address save).  */
+/* { dg-final { scan-assembler-times {jalr\t(x0|zero), [a-z0-9]+, 0} 3 { target riscv*-*-* } } } */
+
+/* RISC-V: Should have exactly 1 regular call (non-tail call case - saves
+   return address).  */
+/* { dg-final { scan-assembler-times {jalr\t(x1|ra), [a-z0-9]+, 0} 1 { target riscv*-*-* } } } */
+
+/* Type ID loading should use lui + addiw pattern for 32-bit constants.  */
+/* { dg-final { scan-assembler {lui\tt2, [0-9]+} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler {addiw\tt2, t2, -?[0-9]+} { target riscv*-*-* } } } */
+
+/* Should have exactly 4 KCFI checks for indirect calls (load type ID from
+   -4 offset + compare).  */
+/* { dg-final { scan-assembler-times {ldur\tw16, \[x[0-9]+, #-4\]} 4 { target aarch64-*-* } } } */
+/* { dg-final { scan-assembler-times {cmp\tw16, w17} 4 { target aarch64-*-* } } } */
+
+/* Should have exactly 4 trap instructions.  */
+/* { dg-final { scan-assembler-times {brk\t#[0-9]+} 4 { target aarch64-*-* } } } */
+
+/* Should have exactly 3 protected tail calls (br through register after
+   KCFI check).  */
+/* { dg-final { scan-assembler-times {br\tx[0-9]+} 3 { target aarch64-*-* } } } */
+
+/* Should have exactly 1 regular call (non-tail call case).  */
+/* { dg-final { scan-assembler-times {blr\tx[0-9]+} 1 { target aarch64-*-* } } } */
+
+/* Type ID loading should use mov + movk pattern for 32-bit constants.  */
+/* { dg-final { scan-assembler {mov\tw17, #[0-9]+} { target aarch64-*-* } } } */
+/* { dg-final { scan-assembler {movk\tw17, #[0-9]+, lsl #16} { target aarch64-*-* } } } */
+
+/* Should have exactly 4 KCFI checks for indirect calls (load type ID from
+   -4 offset + compare).  */
+/* { dg-final { scan-assembler-times {ldr\tr0, \[r[0-9]+, #-4\]} 4 { target arm32 } } } */
+/* { dg-final { scan-assembler-times {cmp\tr0, r1} 4 { target arm32 } } } */
+
+/* Should have exactly 4 trap instructions.  */
+/* { dg-final { scan-assembler-times {udf\t#[0-9]+} 4 { target arm32 } } } */
+
+/* Should have exactly 3 protected tail calls (bx through register after
+   KCFI check).  */
+/* { dg-final { scan-assembler-times {bx\tr[0-9]+} 3 { target arm32 } } } */
+
+/* Should have exactly 1 regular call (non-tail call case).  */
+/* { dg-final { scan-assembler-times {blx\tr[0-9]+} 1 { target arm32 } } } */
+
+/* Type ID loading should use movw + movt pattern for 32-bit constants
+   into r1.  */
+/* { dg-final { scan-assembler {movw\tr1, #[0-9]+} { target arm32 } } } */
+/* { dg-final { scan-assembler {movt\tr1, #[0-9]+} { target arm32 } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
new file mode 100644
index 000000000000..f2226fa58ac9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
@@ -0,0 +1,54 @@
+/* Test AArch64 and ARM32 KCFI trap encoding in BRK/UDF instructions.  */
+/* { dg-do compile { target { aarch64*-*-* || arm32 } } } */
+
+void target_function(int x, char y) {
+}
+
+int main() {
+    void (*func_ptr)(int, char) = target_function;
+
+    /* This should generate trap with immediate encoding.  */
+    func_ptr(42, 'a');
+
+    return 0;
+}
+
+/* Should have KCFI preamble.  */
+/* { dg-final { scan-assembler "__cfi_target_function:" } } */
+
+/* AArch64 specific: Should have BRK instruction with proper ESR encoding
+   ESR format: 0x8000 | ((type_reg & 31) << 5) | (addr_reg & 31)
+
+   Test the ESR encoding by checking for the expected value.
+   Since we know this test uses x2, we expect ESR = 0x8000 | (17<<5) | 2 = 33314
+
+   A truly dynamic test would need to extract the register from blr and compute
+   the corresponding ESR, but DejaGnu's regex limitations make this complex.
+   This test validates the specific case and documents the encoding.
+   */
+/* { dg-final { scan-assembler "blr\\s+x2" { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler "brk\\s+#33314" { target aarch64*-*-* } } } */
+
+/* Should have KCFI check with type comparison.  */
+/* { dg-final { scan-assembler {ldur\t*w16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler {cmp\t*w16, w17} { target aarch64*-*-* } } } */
+
+/* ARM32 specific: Should have UDF instruction with proper encoding
+   UDF format: 0x8000 | ((type_reg & 31) << 5) | (addr_reg & 31)
+
+   Since ARM32 spills and restores r0/r1 before the trap, the type_reg
+   field uses 0x1F (31) to indicate "register was spilled" rather than
+   pointing to a live register. The addr_reg field contains the actual
+   target register number.
+
+   For this test case using r3, we expect:
+   UDF = 0x8000 | (31 << 5) | 3 = 0x8000 | 0x3E0 | 3 = 33763
+   */
+/* { dg-final { scan-assembler "blx\\s+r3" { target arm32 } } } */
+/* { dg-final { scan-assembler "udf\\s+#33763" { target arm32 } } } */
+
+/* Should have register spilling and restoration around type check.  */
+/* { dg-final { scan-assembler {push\t*\{r0, r1\}} { target arm32 } } } */
+/* { dg-final { scan-assembler {pop\t*\{r0, r1\}} { target arm32 } } } */
+/* { dg-final { scan-assembler {ldr\t*r0, \[r[0-9]+, #-4\]} { target arm32 } } } */
+/* { dg-final { scan-assembler {cmp\t*r0, r1} { target arm32 } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
new file mode 100644
index 000000000000..7f5f8a82f3dc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
@@ -0,0 +1,41 @@
+/* Test KCFI trap section generation.  */
+/* { dg-do compile } */
+
+void target_function(void) {}
+
+int main() {
+    void (*func_ptr)(void) = target_function;
+
+    /* Multiple indirect calls to generate multiple trap entries.  */
+    func_ptr();
+    func_ptr();
+
+    return 0;
+}
+
+/* Should have KCFI preamble.  */
+/* { dg-final { scan-assembler "__cfi_target_function:" } } */
+
+/* Should have exactly 2 trap labels in code.  */
+/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*ud2} 2 { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*brk} 2 { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*udf} 2 { target arm32 } } } */
+/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*ebreak} 2 { target riscv*-*-* } } } */
+
+/* x86_64: Should have complete .kcfi_traps section sequence with relative
+   offset and 2 entries.  */
+/* { dg-final { scan-assembler {\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry([^:]+):\n\t\.long\t\.Lkcfi_trap([^\s\n]+)-\.Lkcfi_entry\1\n\t\.text} { target x86_64-*-* } } } */
+/* { dg-final { scan-assembler-times {\.section\t\.kcfi_traps,"ao",@progbits,\.text} 2 { target x86_64-*-* } } } */
+
+/* AArch64 should NOT have .kcfi_traps section (uses brk immediate instead) */
+/* { dg-final { scan-assembler-not {\.section\t+\.kcfi_traps} { target aarch64*-*-* } } } */
+/* { dg-final { scan-assembler-not {\.long.*-\.L} { target aarch64*-*-* } } } */
+
+/* ARM 32-bit should NOT have .kcfi_traps section (uses udf immediate instead) */
+/* { dg-final { scan-assembler-not {\.section\t+\.kcfi_traps} { target arm32 } } } */
+/* { dg-final { scan-assembler-not {\.long.*-\.L} { target arm32 } } } */
+
+/* RISC-V: Should have complete .kcfi_traps section sequence with relative
+   offset and 2 entries.  */
+/* { dg-final { scan-assembler {\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry([^:]+):\n\t\.4byte\t\.L([^\s\n]+)-\.Lkcfi_entry\1\n\t\.text} { target riscv*-*-* } } } */
+/* { dg-final { scan-assembler-times {\.section\t\.kcfi_traps,"ao",@progbits,\.text} 2 { target riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi.exp b/gcc/testsuite/gcc.dg/kcfi/kcfi.exp
new file mode 100644
index 000000000000..0bbba196c82f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/kcfi/kcfi.exp
@@ -0,0 +1,64 @@
+#   Copyright (C) 2025 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# GCC testsuite for KCFI (Kernel Control Flow Integrity) tests.
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# KCFI is only supported on specific targets
+if { ![istarget "x86_64-*-*"] \
+     && ![istarget "aarch64-*-*"] && ![istarget "arm*-*-*"] \
+     && ![istarget "riscv*-*-*"] } {
+    return
+}
+
+# Skip tests if x86_64 is running in 32-bit mode (-m32)
+if { [istarget "x86_64-*-*"] && ![check_effective_target_lp64] } {
+    return
+}
+
+# Skip tests if AArch64 is running in ILP32 mode (-mabi=ilp32)
+if { [istarget "aarch64-*-*"] && ![check_effective_target_lp64] } {
+    return
+}
+
+# Skip tests if RISC-V is running in 32-bit mode (riscv32-*)
+if { [istarget "riscv*-*-*"] && ![check_effective_target_lp64] } {
+    return
+}
+
+# Add KCFI-specific flags to any existing DEFAULT_CFLAGS
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+    set DEFAULT_CFLAGS ""
+}
+set DEFAULT_CFLAGS "$DEFAULT_CFLAGS -fsanitize=kcfi"
+
+# Add ARM32-specific flags for arm32 targets
+if [check_effective_target_arm32] {
+    set DEFAULT_CFLAGS "$DEFAULT_CFLAGS -march=armv7-a -mfloat-abi=soft"
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] \
+	"" $DEFAULT_CFLAGS
+
+# All done.
+dg-finish
-- 
2.34.1



* Re: [PATCH v3 4/7] aarch64: Add AArch64 Kernel Control Flow Integrity implementation
  2025-09-13 23:24 ` [PATCH v3 4/7] aarch64: Add AArch64 " Kees Cook
@ 2025-09-13 23:43   ` Andrew Pinski
  2025-09-14 19:45     ` Kees Cook
  2025-09-17 20:01     ` Kees Cook
  0 siblings, 2 replies; 28+ messages in thread
From: Andrew Pinski @ 2025-09-13 23:43 UTC (permalink / raw)
  To: Kees Cook
  Cc: Qing Zhao, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

On Sat, Sep 13, 2025 at 4:28 PM Kees Cook <kees@kernel.org> wrote:
>
> Implement AArch64-specific KCFI backend.
>
> - Trap debugging through ESR (Exception Syndrome Register) encoding
>   in BRK instruction immediate values.
>
> - Scratch register allocation using w16/w17 (x16/x17) following
>   AArch64 procedure call standard for intra-procedure-call registers.

How does this interact with BTI and sibcalls? For indirect
calls, x17 is already used for the address.
Why do you need/want to use a fixed register here for the load/compare
anyway? Why can't you use any free register?


>
> Assembly Code Pattern for AArch64:
>   ldur w16, [target, #-4]       ; Load actual type ID from preamble
>   mov  w17, #type_id_low        ; Load expected type (lower 16 bits)
>   movk w17, #type_id_high, lsl #16  ; Load upper 16 bits if needed
>   cmp  w16, w17                 ; Compare type IDs directly
>   b.eq .Lpass                   ; Branch if types match
>   .Ltrap: brk #esr_value        ; Enhanced trap with register info
>   .Lpass: blr/br target         ; Execute validated indirect transfer
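
For readers following along, the check sequence quoted above can be
modelled in plain C. This is only a rough sketch under stated
assumptions: kcfi_type_matches, kcfi_id_low and kcfi_id_high are
made-up names that do not appear in the patch, and the "function" is
just a byte buffer whose 4 bytes before the entry point hold the type
ID, as the ldur with offset #-4 implies.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative model only: kcfi_type_matches is a hypothetical name.
   ENTRY points at the first instruction of the callee; the 32-bit
   type ID is assumed to sit in the 4 bytes immediately before it.  */
int
kcfi_type_matches (const unsigned char *entry, uint32_t expected)
{
  uint32_t actual;

  assert (entry != NULL);
  /* ldur w16, [target, #-4]: load the type ID word from the preamble.  */
  memcpy (&actual, entry - 4, sizeof actual);
  /* cmp w16, w17 ; b.eq .Lpass: 1 models "take the call",
     0 models "fall through to brk".  */
  return actual == expected;
}

/* mov/movk materialise the expected ID in two 16-bit halves
   (hypothetical helper names).  */
uint16_t
kcfi_id_low (uint32_t id)
{
  return (uint16_t) (id & 0xffffu);
}

uint16_t
kcfi_id_high (uint32_t id)
{
  return (uint16_t) (id >> 16);
}
```

A caller would place the ID directly in front of the code bytes and
pass a pointer to the first instruction; a return value of 0
corresponds to the brk path in the quoted sequence.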
>
> ESR (Exception Syndrome Register) Integration:
> - BRK instruction immediate encoding format:
>   0x8000 | ((TypeIndex & 31) << 5) | (AddrIndex & 31)
>   - TypeIndex indicates which W register contains expected type (W17 = 17)
>   - AddrIndex indicates which X register contains target address (0-30)
>   - Example: brk #33313 (0x8221) = expected type in W17, target address in X1
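
As a sanity check on the arithmetic, the immediate format quoted above
can be sketched in C. The helper names below (kcfi_brk_imm,
kcfi_brk_type_reg, kcfi_brk_addr_reg) are hypothetical and are not
taken from the patch; only the formula
0x8000 | ((TypeIndex & 31) << 5) | (AddrIndex & 31) comes from the
description.

```c
#include <assert.h>
#include <stdint.h>

/* Base value of the KCFI BRK immediate, per the quoted format.  */
#define KCFI_BRK_BASE 0x8000u

/* Build the BRK immediate from the index of the W register holding
   the expected type and the X register holding the target address.
   Hypothetical helper, for illustration only.  */
uint16_t
kcfi_brk_imm (unsigned type_reg, unsigned addr_reg)
{
  assert (type_reg <= 31 && addr_reg <= 30);
  return (uint16_t) (KCFI_BRK_BASE
		     | ((type_reg & 31u) << 5)
		     | (addr_reg & 31u));
}

/* Recover the two register indices from an immediate, as a trap
   handler would need to do.  */
unsigned
kcfi_brk_type_reg (uint16_t imm)
{
  return (imm >> 5) & 31u;
}

unsigned
kcfi_brk_addr_reg (uint16_t imm)
{
  return imm & 31u;
}
```

With these definitions, kcfi_brk_imm (17, 1) gives 33313 (0x8221),
matching the example in the commit message, and kcfi_brk_imm (17, 2)
gives 33314, the value the kcfi-trap-encoding.c test scans for.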
>
> Build and run tested with Linux kernel ARCH=arm64.
>
> gcc/ChangeLog:
>
>         config/aarch64/aarch64-protos.h: Declare aarch64_indirect_branch_asm,
>         and KCFI helpers.
>         config/aarch64/aarch64.cc (aarch64_expand_call): Wrap CALLs in
>         KCFI, with clobbers.
>         (aarch64_indirect_branch_asm): New function, extract common
>         logic for branch asm, like existing call asm helper.
>         (aarch64_output_kcfi_insn): Emit KCFI assembly.
>         config/aarch64/aarch64.md: Add KCFI RTL patterns and replace
>         open-coded branch emission with aarch64_indirect_branch_asm.
>         doc/invoke.texi: Document aarch64 nuances.
>
> Signed-off-by: Kees Cook <kees@kernel.org>
> ---
>  gcc/config/aarch64/aarch64-protos.h |   5 ++
>  gcc/config/aarch64/aarch64.cc       | 116 ++++++++++++++++++++++++++++
>  gcc/config/aarch64/aarch64.md       |  64 +++++++++++++--
>  gcc/doc/invoke.texi                 |  14 ++++
>  4 files changed, 191 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> index 56efcf2c7f2c..c91fdcc80ea3 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -1261,6 +1261,7 @@ tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
>
>  const char *aarch64_sls_barrier (int);
>  const char *aarch64_indirect_call_asm (rtx);
> +const char *aarch64_indirect_branch_asm (rtx);
>  extern bool aarch64_harden_sls_retbr_p (void);
>  extern bool aarch64_harden_sls_blr_p (void);
>
> @@ -1284,4 +1285,8 @@ extern unsigned aarch64_stack_alignment (const_tree exp, unsigned align);
>  extern rtx aarch64_gen_compare_zero_and_branch (rtx_code code, rtx x,
>                                                 rtx_code_label *label);
>
> +/* KCFI support.  */
> +extern void kcfi_emit_trap_with_section (FILE *file, rtx trap_label_rtx);
> +extern const char *aarch64_output_kcfi_insn (rtx_insn *insn, rtx *operands);
> +
>  #endif /* GCC_AARCH64_PROTOS_H */
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index fb8311b655d7..a7d17f18b72e 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -83,6 +83,7 @@
>  #include "rtlanal.h"
>  #include "tree-dfa.h"
>  #include "asan.h"
> +#include "kcfi.h"
>  #include "aarch64-elf-metadata.h"
>  #include "aarch64-feature-deps.h"
>  #include "config/arm/aarch-common.h"
> @@ -11848,6 +11849,16 @@ aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall)
>
>    call = gen_rtx_CALL (VOIDmode, mem, const0_rtx);
>
> +  /* Only indirect calls need KCFI instrumentation.  */
> +  bool is_direct_call = SYMBOL_REF_P (XEXP (mem, 0));
> +  rtx kcfi_type_rtx = is_direct_call ? NULL_RTX
> +    : kcfi_get_type_id_for_expanding_gimple_call ();
> +  if (kcfi_type_rtx)
> +    {
> +      /* Wrap call in KCFI.  */
> +      call = gen_rtx_KCFI (VOIDmode, call, kcfi_type_rtx);
> +    }
> +
>    if (result != NULL_RTX)
>      call = gen_rtx_SET (result, call);
>
> @@ -11864,6 +11875,16 @@ aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall)
>
>    auto call_insn = aarch64_emit_call_insn (call);
>
> +  /* Add KCFI clobbers for indirect calls.  */
> +  if (kcfi_type_rtx)
> +    {
> +      rtx usage = CALL_INSN_FUNCTION_USAGE (call_insn);
> +      /* Add X16 and X17 clobbers for AArch64 KCFI scratch registers.  */
> +      clobber_reg (&usage, gen_rtx_REG (DImode, 16));
> +      clobber_reg (&usage, gen_rtx_REG (DImode, 17));
> +      CALL_INSN_FUNCTION_USAGE (call_insn) = usage;
> +    }
> +
>    /* Check whether the call requires a change to PSTATE.SM.  We can't
>       emit the instructions to change PSTATE.SM yet, since they involve
>       a change in vector length and a change in instruction set, which

Also how does this interact with SME calls?

> @@ -30630,6 +30651,14 @@ aarch64_indirect_call_asm (rtx addr)
>    return "";
>  }
>
> +const char *
> +aarch64_indirect_branch_asm (rtx addr)
> +{
> +  gcc_assert (REG_P (addr));
> +  output_asm_insn ("br\t%0", &addr);
> +  return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
> +}
> +
>  /* Emit the assembly instruction to load the thread pointer into DEST.
>     Select between different tpidr_elN registers depending on -mtp= setting.  */
>
> @@ -32823,6 +32852,93 @@ aarch64_libgcc_floating_mode_supported_p
>  #undef TARGET_DOCUMENTATION_NAME
>  #define TARGET_DOCUMENTATION_NAME "AArch64"
>
> +/* Output the assembly for a KCFI checked call instruction.  */
> +
> +const char *
> +aarch64_output_kcfi_insn (rtx_insn *insn, rtx *operands)
> +{
> +  /* KCFI is only supported in LP64 mode.  */
> +  if (TARGET_ILP32)
> +    {
> +      sorry ("%<-fsanitize=kcfi%> is not supported for %<-mabi=ilp32%>");

You should reject -fsanitize=kcfi with -mabi=ilp32 during option
processing rather than this late in compilation.

> +      return "";
> +    }
> +  /* Target register is operands[0].  */
> +  rtx target_reg = operands[0];
> +  gcc_assert (REG_P (target_reg));
> +
> +  /* Get KCFI type ID from operand[3].  */
> +  uint32_t type_id = (uint32_t) INTVAL (operands[3]);

Maybe an assert that `(int32_t)type_id == INTVAL (operands[3])`?
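
A minimal standalone sketch of that check, assuming operands[3] is a
sign-extended 32-bit constant. `kcfi_type_id_fits` and its `int64_t`
parameter are hypothetical stand-ins (for the assert and for the
HOST_WIDE_INT that INTVAL yields), not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: VALUE plays the role of INTVAL (operands[3]).  */
bool
kcfi_type_id_fits (int64_t value)
{
  /* Same narrowing as the patch: keep only the low 32 bits.  */
  uint32_t type_id = (uint32_t) value;

  /* The narrowing is lossless only if going back through int32_t and
     sign-extending reproduces the original operand.  The unsigned to
     signed conversion is implementation-defined in ISO C; GCC defines
     it as reduction modulo 2^32.  */
  return (int64_t) (int32_t) type_id == value;
}

/* Models the suggested assert at the point where type_id is read.  */
uint32_t
kcfi_read_type_id (int64_t value)
{
  assert (kcfi_type_id_fits (value));
  return (uint32_t) value;
}
```

Under that assumption, -1 and INT32_MIN pass, while a zero-extended
0x80000000 would trip the assert.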

> +
> +  /* Calculate typeid offset from call target.  */
> +  HOST_WIDE_INT offset = -(4 + kcfi_patchable_entry_prefix_nops);
> +
> +  /* Generate labels internally.  */
> +  rtx trap_label = gen_label_rtx ();
> +  rtx call_label = gen_label_rtx ();
> +
> +  /* Get label numbers for custom naming.  */
> +  int trap_labelno = CODE_LABEL_NUMBER (trap_label);
> +  int call_labelno = CODE_LABEL_NUMBER (call_label);
> +
> +  /* Generate custom label names.  */
> +  char trap_name[32];
> +  char call_name[32];
> +  ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", trap_labelno);
> +  ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", call_labelno);
> +
> +  rtx temp_operands[3];
> +
> +  /* Load actual type into w16 from memory at offset using ldur.  */
> +  temp_operands[0] = gen_rtx_REG (SImode, R16_REGNUM);
> +  temp_operands[1] = target_reg;
> +  temp_operands[2] = GEN_INT (offset);
> +  output_asm_insn ("ldur\t%w0, [%1, #%2]", temp_operands);

Since you are using a fixed register, you don't need the temp_operands[0] here.
Also, what happens if target_reg is x16? The ldur would overwrite the call
target before the branch. Shouldn't there be an assert for that here?
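
A standalone sketch of such an assert. `kcfi_target_reg_ok` and
`kcfi_assert_target_reg` are hypothetical names, and the register
numbers 16 and 17 are taken from the clobbers added in
aarch64_expand_call; in the backend itself this would presumably be a
gcc_assert on REGNO (target_reg) against R16_REGNUM and R17_REGNUM:

```c
#include <assert.h>
#include <stdbool.h>

/* Scratch registers used by the KCFI check sequence (x16 and x17).  */
enum { KCFI_SCRATCH_ACTUAL = 16, KCFI_SCRATCH_EXPECTED = 17 };

/* Hypothetical helper: the call target must survive the check, so it
   cannot live in either scratch register.  */
bool
kcfi_target_reg_ok (unsigned regno)
{
  return regno != KCFI_SCRATCH_ACTUAL && regno != KCFI_SCRATCH_EXPECTED;
}

/* Models the assert that would sit before the ldur is emitted.  */
void
kcfi_assert_target_reg (unsigned regno)
{
  assert (kcfi_target_reg_ok (regno));
}
```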

> +
> +  /* Load expected type low 16 bits into w17.  */
> +  temp_operands[0] = gen_rtx_REG (SImode, R17_REGNUM);
> +  temp_operands[1] = GEN_INT (type_id & 0xFFFF);
> +  output_asm_insn ("mov\t%w0, #%1", temp_operands);
> +
> +  /* Load expected type high 16 bits into w17.  */
> +  temp_operands[0] = gen_rtx_REG (SImode, R17_REGNUM);
> +  temp_operands[1] = GEN_INT ((type_id >> 16) & 0xFFFF);
> +  output_asm_insn ("movk\t%w0, #%1, lsl #16", temp_operands);
> +
> +  /* Compare types.  */
> +  temp_operands[0] = gen_rtx_REG (SImode, R16_REGNUM);
> +  temp_operands[1] = gen_rtx_REG (SImode, R17_REGNUM);
> +  output_asm_insn ("cmp\t%w0, %w1", temp_operands);

There is no reason for temp_operands here either, since both registers are fixed.

> +
> +  /* Output conditional branch to call label.  */
> +  fputs ("\tb.eq\t", asm_out_file);
> +  assemble_name (asm_out_file, call_name);
> +  fputc ('\n', asm_out_file);

There has to be a better way of emitting this branch than writing it
out by hand with fputs and assemble_name.

> +
> +  /* Output trap label and BRK instruction.  */
> +  ASM_OUTPUT_LABEL (asm_out_file, trap_name);
> +
> +  /* Calculate and emit BRK with ESR encoding.  */
> +  unsigned type_index = R17_REGNUM;
> +  unsigned addr_index = REGNO (operands[0]) - R0_REGNUM;
> +  unsigned esr_value = 0x8000 | ((type_index & 31) << 5) | (addr_index & 31);
> +
> +  temp_operands[0] = GEN_INT (esr_value);
> +  output_asm_insn ("brk\t#%0", temp_operands);
> +
> +  /* Output call label.  */
> +  ASM_OUTPUT_LABEL (asm_out_file, call_name);
> +
> +  /* Return appropriate call instruction based on SIBLING_CALL_P.  */
> +  if (SIBLING_CALL_P (insn))
> +    return aarch64_indirect_branch_asm (operands[0]);
> +  else
> +    return aarch64_indirect_call_asm (operands[0]);
> +}
> +
> +#undef TARGET_KCFI_SUPPORTED
> +#define TARGET_KCFI_SUPPORTED hook_bool_void_true
> +
>  struct gcc_target targetm = TARGET_INITIALIZER;
>
>  #include "gt-aarch64.h"
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index fedbd4026a06..1a5abc142f50 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1483,6 +1483,19 @@
>    }"
>  )
>
> +;; KCFI indirect call
> +(define_insn "*call_insn"
> +  [(kcfi (call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "Ucr"))
> +              (match_operand 1 "" ""))
> +        (match_operand 3 "const_int_operand"))
> +   (unspec:DI [(match_operand:DI 2 "const_int_operand")] UNSPEC_CALLEE_ABI)
> +   (clobber (reg:DI LR_REGNUM))]
> +  "!SIBLING_CALL_P (insn)"
> +{
> +  return aarch64_output_kcfi_insn (insn, operands);
> +}
> +  [(set_attr "type" "call")])
> +
>  (define_insn "*call_insn"
>    [(call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand"))
>          (match_operand 1 "" ""))
> @@ -1510,6 +1523,20 @@
>    }"
>  )
>
> +;; KCFI call with return value
> +(define_insn "*call_value_insn"
> +  [(set (match_operand 0 "" "")
> +       (kcfi (call (mem:DI (match_operand:DI 1 "aarch64_call_insn_operand" "Ucr"))
> +                   (match_operand 2 "" ""))
> +             (match_operand 4 "const_int_operand")))
> +   (unspec:DI [(match_operand:DI 3 "const_int_operand")] UNSPEC_CALLEE_ABI)
> +   (clobber (reg:DI LR_REGNUM))]
> +  "!SIBLING_CALL_P (insn)"
> +{
> +  return aarch64_output_kcfi_insn (insn, &operands[1]);
> +}
> +  [(set_attr "type" "call")])
> +
>  (define_insn "*call_value_insn"
>    [(set (match_operand 0 "" "")
>         (call (mem:DI (match_operand:DI 1 "aarch64_call_insn_operand"))
> @@ -1550,6 +1577,19 @@
>    }
>  )
>
> +;; KCFI sibling call
> +(define_insn "*sibcall_insn"
> +  [(kcfi (call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "Ucs"))
> +              (match_operand 1 ""))
> +        (match_operand 3 "const_int_operand"))
> +   (unspec:DI [(match_operand:DI 2 "const_int_operand")] UNSPEC_CALLEE_ABI)
> +   (return)]
> +  "SIBLING_CALL_P (insn)"
> +{
> +  return aarch64_output_kcfi_insn (insn, operands);
> +}
> +  [(set_attr "type" "branch")])
> +
>  (define_insn "*sibcall_insn"
>    [(call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "Ucs, Usf"))
>          (match_operand 1 ""))
> @@ -1558,16 +1598,27 @@
>    "SIBLING_CALL_P (insn)"
>    {
>      if (which_alternative == 0)
> -      {
> -       output_asm_insn ("br\\t%0", operands);
> -       return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
> -      }
> +      return aarch64_indirect_branch_asm (operands[0]);
>      return "b\\t%c0";
>    }
>    [(set_attr "type" "branch, branch")
>     (set_attr "sls_length" "retbr,none")]
>  )
>
> +;; KCFI sibling call with return value
> +(define_insn "*sibcall_value_insn"
> +  [(set (match_operand 0 "")
> +       (kcfi (call (mem:DI (match_operand:DI 1 "aarch64_call_insn_operand" "Ucs"))
> +                   (match_operand 2 ""))
> +             (match_operand 4 "const_int_operand")))
> +   (unspec:DI [(match_operand:DI 3 "const_int_operand")] UNSPEC_CALLEE_ABI)
> +   (return)]
> +  "SIBLING_CALL_P (insn)"
> +{
> +  return aarch64_output_kcfi_insn (insn, &operands[1]);
> +}
> +  [(set_attr "type" "branch")])
> +
>  (define_insn "*sibcall_value_insn"
>    [(set (match_operand 0 "")
>         (call (mem:DI
> @@ -1578,10 +1629,7 @@
>    "SIBLING_CALL_P (insn)"
>    {
>      if (which_alternative == 0)
> -      {
> -       output_asm_insn ("br\\t%1", operands);
> -       return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
> -      }
> +      return aarch64_indirect_branch_asm (operands[1]);
>      return "b\\t%c1";
>    }
>    [(set_attr "type" "branch, branch")
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index bd84b7dd903f..972e8e76494f 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -18425,6 +18425,20 @@ check-call bundle are considered ABI, as the Linux kernel may
>  optionally rewrite these areas at boot time to mitigate detected CPU
>  errata.
>
> +On AArch64, KCFI type identifiers are emitted as a @code{.word ID}
> +directive (a 32-bit constant) before the function entry.  AArch64's
> +natural 4-byte instruction alignment eliminates the need for additional
> +alignment NOPs.  When used with @option{-fpatchable-function-entry}, the
> +type identifier is placed before any prefix NOPs.  The runtime check
> +uses @code{x16} and @code{x17} as scratch registers.  Type mismatches
> +trigger a @code{brk} instruction with an immediate value that encodes
> +both the expected type register index and the target address register
> +index in the format @code{0x8000 | (type_reg << 5) | addr_reg}.  This
> +encoding is captured in the ESR (Exception Syndrome Register) when the
> +trap is taken, allowing the kernel to identify both the KCFI violation
> +and the involved registers for detailed diagnostics (eliminating the need
> +for a separate @code{.kcfi_traps} section as used on x86_64).
> +
>  KCFI is intended primarily for kernel code and may not be suitable
>  for user-space applications that rely on techniques incompatible
>  with strict type checking of indirect calls.
> --
> 2.34.1
>
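
For reference, a standalone sketch of the BRK immediate described in
the invoke.texi hunk above (0x8000 | (type_reg << 5) | addr_reg). The
function names are hypothetical, and the decode helpers only illustrate
how a trap handler could recover the two register indices; they are not
taken from the kernel:

```c
#include <assert.h>
#include <stdint.h>

/* Encode the immediate as documented in the patch: bit 15 set, the
   expected-type register index in bits 9:5, and the call-target
   register index in bits 4:0.  */
uint16_t
kcfi_brk_imm (unsigned type_reg, unsigned addr_reg)
{
  assert (type_reg < 32 && addr_reg < 32);
  return (uint16_t) (0x8000u | (type_reg << 5) | addr_reg);
}

/* Decode side (illustrative only).  */
unsigned
kcfi_brk_type_reg (uint16_t imm)
{
  return (imm >> 5) & 31;
}

unsigned
kcfi_brk_addr_reg (uint16_t imm)
{
  return imm & 31;
}
```

With the expected type in w17 and the call target in x1, this yields
an immediate of 0x8221.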


* Re: [PATCH v3 7/7] kcfi: Add regression test suite
  2025-09-13 23:24 ` [PATCH v3 7/7] kcfi: Add regression test suite Kees Cook
@ 2025-09-13 23:51   ` Andrew Pinski
  2025-09-17 19:51     ` Kees Cook
  2025-09-13 23:58   ` Andrew Pinski
  1 sibling, 1 reply; 28+ messages in thread
From: Andrew Pinski @ 2025-09-13 23:51 UTC (permalink / raw)
  To: Kees Cook
  Cc: Qing Zhao, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

On Sat, Sep 13, 2025 at 4:36 PM Kees Cook <kees@kernel.org> wrote:
>
> Adds a test suite for KCFI (Kernel Control Flow Integrity) ABI, covering
> core functionality, optimization and code generation, addressing,
> architecture-specific KCFI sequence emission, and integration with
> patchable function entry.
>
> Tests can be run via:
>   make check-c RUNTESTFLAGS='kcfi.exp'
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/kcfi/kcfi-adjacency.c: New test.
>         * gcc.dg/kcfi/kcfi-basics.c: New test.
>         * gcc.dg/kcfi/kcfi-call-sharing.c: New test.
>         * gcc.dg/kcfi/kcfi-cold-partition.c: New test.
>         * gcc.dg/kcfi/kcfi-complex-addressing.c: New test.
>         * gcc.dg/kcfi/kcfi-ipa-robustness.c: New test.
>         * gcc.dg/kcfi/kcfi-move-preservation.c: New test.
>         * gcc.dg/kcfi/kcfi-no-sanitize-inline.c: New test.
>         * gcc.dg/kcfi/kcfi-no-sanitize.c: New test.
>         * gcc.dg/kcfi/kcfi-offset-validation.c: New test.
>         * gcc.dg/kcfi/kcfi-patchable-basic.c: New test.
>         * gcc.dg/kcfi/kcfi-patchable-entry-only.c: New test.
>         * gcc.dg/kcfi/kcfi-patchable-large.c: New test.
>         * gcc.dg/kcfi/kcfi-patchable-medium.c: New test.
>         * gcc.dg/kcfi/kcfi-patchable-prefix-only.c: New test.
>         * gcc.dg/kcfi/kcfi-pic-addressing.c: New test.
>         * gcc.dg/kcfi/kcfi-retpoline-r11.c: New test.
>         * gcc.dg/kcfi/kcfi-runtime.c: New test.
>         * gcc.dg/kcfi/kcfi-tail-calls.c: New test.
>         * gcc.dg/kcfi/kcfi-trap-encoding.c: New test.
>         * gcc.dg/kcfi/kcfi-trap-section.c: New test.
>         * gcc.dg/kcfi/kcfi.exp: New test.
>
> Signed-off-by: Kees Cook <kees@kernel.org>
> ---
>  gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c    |  72 +++++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c       | 108 +++++++++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c |  84 ++++++++++
>  .../gcc.dg/kcfi/kcfi-cold-partition.c         | 136 ++++++++++++++++
>  .../gcc.dg/kcfi/kcfi-complex-addressing.c     | 135 ++++++++++++++++
>  .../gcc.dg/kcfi/kcfi-ipa-robustness.c         |  54 +++++++
>  .../gcc.dg/kcfi/kcfi-move-preservation.c      |  55 +++++++
>  .../gcc.dg/kcfi/kcfi-no-sanitize-inline.c     | 100 ++++++++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c  |  39 +++++
>  .../gcc.dg/kcfi/kcfi-offset-validation.c      |  48 ++++++
>  .../gcc.dg/kcfi/kcfi-patchable-basic.c        |  70 ++++++++
>  .../gcc.dg/kcfi/kcfi-patchable-entry-only.c   |  62 +++++++
>  .../gcc.dg/kcfi/kcfi-patchable-large.c        |  51 ++++++
>  .../gcc.dg/kcfi/kcfi-patchable-medium.c       |  60 +++++++
>  .../gcc.dg/kcfi/kcfi-patchable-prefix-only.c  |  60 +++++++
>  .../gcc.dg/kcfi/kcfi-pic-addressing.c         | 104 ++++++++++++
>  .../gcc.dg/kcfi/kcfi-retpoline-r11.c          |  50 ++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c      | 151 ++++++++++++++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c   | 142 ++++++++++++++++
>  .../gcc.dg/kcfi/kcfi-trap-encoding.c          |  54 +++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c |  41 +++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi.exp            |  64 ++++++++
>  22 files changed, 1740 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi.exp
>
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
> new file mode 100644
> index 000000000000..becb47678df0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
> @@ -0,0 +1,72 @@
> +/* Test KCFI check/transfer adjacency - regression test for instruction
> +   insertion.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +/* This test ensures that KCFI security checks remain immediately adjacent
> +   to their corresponding indirect calls/jumps, with no executable instructions
> +   between the type ID check and the control flow transfer. */
> +
> +/* External function pointers to prevent optimization.  */
> +extern void (*complex_func_ptr)(int, int, int, int);
> +extern int (*return_func_ptr)(int, int);
> +
> +/* Function with complex argument preparation that could tempt
> +   the optimizer to insert instructions between KCFI check and call.  */
> +__attribute__((noinline)) void test_complex_args(int a, int b, int c, int d) {
> +    /* Complex argument expressions that might cause instruction scheduling.  */
> +    complex_func_ptr(a * 2, b + c, d - a, (a << 1) | b);
> +}
> +
> +/* Function with return value handling.  */
> +__attribute__((noinline)) int test_return_value(int x, int y) {
> +    /* Return value handling that shouldn't interfere with adjacency.  */
> +    int result = return_func_ptr(x + 1, y * 2);
> +    return result + 1;
> +}
> +
> +/* Test struct field access that caused issues in try-catch.c.  */
> +struct call_info {
> +    void (*handler)(void);
> +    int status;
> +    int data;
> +};
> +
> +extern struct call_info *global_call_info;
> +
> +__attribute__((noinline)) void test_struct_field_call(void) {
> +    /* This pattern caused adjacency issues before the fix.  */
> +    global_call_info->handler();
> +}
> +
> +/* Test conditional indirect call.  */
> +__attribute__((noinline)) void test_conditional_call(int flag) {
> +    if (flag) {
> +        global_call_info->handler();
> +    }
> +}
> +
> +/* Should have KCFI instrumentation for all indirect calls.  */
> +
> +/* x86_64: Complete KCFI check sequence should be present.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r1[01]d\n\taddl\t[^,]+, %r1[01]d\n\tje\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tud2} { target x86_64-*-* } } } */
> +
> +/* AArch64: Complete KCFI check sequence should be present.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-[0-9]+\]\n\tmov\tw17, #[0-9]+\n\tmovk\tw17, #[0-9]+, lsl #16\n\tcmp\tw16, w17\n\tb\.eq\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tbrk\t#[0-9]+\n\1:\n\tblr\tx[0-9]+} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Complete KCFI check sequence should be present with stack
> +   spilling.  */
> +/* { dg-final { scan-assembler {push\t\{r0, r1\}\n\tldr\tr0, \[r[0-9]+, #-[0-9]+\]\n\tmovw\tr1, #[0-9]+\n\tmovt\tr1, #[0-9]+\n\tcmp\tr0, r1\n\tpop\t\{r0, r1\}\n\tbeq\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\t#[0-9]+\n\.Lkcfi_call[0-9]+:\n\tblx\tr[0-9]+} { target arm32 } } } */
> +
> +/* RISC-V: Complete KCFI check sequence should be present.  */
> +/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, \.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tebreak} { target riscv*-*-* } } } */
> +
> +/* Should have trap section with entries.  */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
> +
> +/* AArch64 should NOT have trap section (uses brk immediate instead) */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit should NOT have trap section (uses udf immediate instead) */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */


I think it would be better to use check-function-bodies here rather
than scan-assembler for the sequences. Maybe each target should have
its own testcase rather than putting them all in one source file.
Also, I think the target testcases should be part of the target patch
rather than a separate patch, to make it easier to review both
together. While reviewing the aarch64 part, I kept wondering where
the testcases for the aarch64-specific changes were.

Thanks,
Andrew


> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
> new file mode 100644
> index 000000000000..b0a9e11f1f3c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
> @@ -0,0 +1,108 @@
> +/* Test basic KCFI functionality - preamble generation.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +/* Extern function declarations - should NOT get KCFI preambles.  */
> +extern void external_func(void);
> +extern int external_func_int(int x);
> +
> +void regular_function(int x) {
> +    /* This should get KCFI preamble.  */
> +}
> +
> +void static_target_function(int x) {
> +    /* Target function that can be called indirectly.  */
> +}
> +
> +__attribute__((nocf_check))
> +void nocf_check_function(int x) {
> +    /* This function has nocf_check attribute - should NOT get KCFI preamble.  */
> +}
> +
> +static void static_caller(void) {
> +    /* Static function that makes an indirect call
> +       Should NOT get KCFI preamble (not address-taken)
> +       But must generate KCFI check for the indirect call.  */
> +    void (*local_ptr)(int) = static_target_function;
> +    local_ptr(42);  /* This should generate KCFI check.  */
> +}
> +
> +/* Make external_func address-taken.  */
> +void (*func_ptr)(int) = regular_function;
> +void (*ext_ptr)(void) = external_func;
> +void (__attribute__((nocf_check)) *nocf_ptr)(int) = nocf_check_function;
> +
> +int main() {
> +    func_ptr(42);
> +    ext_ptr();        /* Indirect call to external_func.  */
> +    external_func_int(10);  /* Direct call to external_func_int.  */
> +    static_caller();  /* Direct call to static function.  */
> +    return 0;
> +}
> +
> +/* Verify KCFI preamble exists for regular_function.  */
> +/* { dg-final { scan-assembler {__cfi_regular_function:} } } */
> +
> +/* Verify KCFI preamble symbol comes before main function symbol.  */
> +/* { dg-final { scan-assembler {__cfi_regular_function:.*regular_function:} } } */
> +
> +/* Target function should have preamble (address-taken).  */
> +/* { dg-final { scan-assembler {__cfi_static_target_function:} } } */
> +
> +/* Static caller should NOT have preamble (it's only called directly,
> +   not address-taken). */
> +/* { dg-final { scan-assembler-not {__cfi_static_caller:} } } */
> +
> +/* Function with nocf_check attribute should NOT have preamble.  */
> +/* { dg-final { scan-assembler-not {__cfi_nocf_check_function:} } } */
> +
> +/* x86_64: Verify type ID in preamble (after NOPs, before function label) */
> +/* { dg-final { scan-assembler {__cfi_regular_function:\n\t+nop\n.*\n\t+movl\t+\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
> +
> +/* AArch64: Verify type ID word in preamble.  */
> +/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Verify type ID word in preamble.  */
> +/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target arm32 } } } */
> +
> +/* RISC-V: Verify type ID word in preamble */
> +/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
> +
> +/* x86_64: Static function should generate complete KCFI check sequence.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d\n\tje\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tud2\n.*\n\1:\n\tcall} { target x86_64-*-* } } } */
> +
> +/* AArch64: Static function should generate complete KCFI check sequence.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]\n\tmov\tw17, #[0-9]+\n\tmovk\tw17, #[0-9]+, lsl #16\n\tcmp\tw16, w17\n\tb\.eq\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tbrk\t#[0-9]+\n\1:\n\tblr} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Static function should generate complete KCFI check sequence
> +   with stack spilling.  */
> +/* { dg-final { scan-assembler {push\t\{r0, r1\}\n\tldr\tr0, \[r[0-9]+, #-4\]\n\tmovw\tr1, #[0-9]+\n\tmovt\tr1, #[0-9]+\n\tcmp\tr0, r1\n\tpop\t\{r0, r1\}\n\tbeq\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\t#[0-9]+\n\.Lkcfi_call[0-9]+:\n\tblx\tr[0-9]+} { target arm32 } } } */
> +
> +/* RISC-V: Static function should generate KCFI check for indirect call.  */
> +/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, (\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tebreak\n\t\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry[0-9]+:\n\t\.4byte\t\.Lkcfi_trap[0-9]+-\.Lkcfi_entry[0-9]+\n\t\.text\n\1:\n\tjalr} { target riscv*-*-* } } } */
> +
> +/* Extern functions should NOT get KCFI preambles.  */
> +/* { dg-final { scan-assembler-not {__cfi_external_func:} } } */
> +/* { dg-final { scan-assembler-not {__cfi_external_func_int:} } } */
> +
> +/* Local functions should NOT get __kcfi_typeid_ symbols.  */
> +/* Only external declarations that are address-taken should get __kcfi_typeid_ */
> +/* { dg-final { scan-assembler-not {__kcfi_typeid_regular_function} } } */
> +/* { dg-final { scan-assembler-not {__kcfi_typeid_main} } } */
> +
> +/* External address-taken functions should get __kcfi_typeid_ symbols.  */
> +/* { dg-final { scan-assembler {__kcfi_typeid_external_func} } } */
> +
> +/* External functions that are only called directly should NOT get
> +   __kcfi_typeid_ symbols.  */
> +/* { dg-final { scan-assembler-not {__kcfi_typeid_external_func_int} } } */
> +
> +/* Should have trap section for KCFI checks.  */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
> +
> +/* AArch64 should NOT have trap section (uses brk immediate instead).  */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit should NOT have trap section (uses udf immediate instead).  */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
> new file mode 100644
> index 000000000000..f34d5f88547f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
> @@ -0,0 +1,84 @@
> +/* Test KCFI check sharing bug - optimizer incorrectly shares KCFI checks
> +   between different function types.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +/* Reproduce the pattern from Linux kernel internal_create_group where:
> +   - Two different function pointer types (is_visible vs is_bin_visible).
> +   - Both get loaded into the same register (%rcx).
> +   - Optimizer creates shared KCFI check with wrong type ID.
> +   - This causes CFI failures in production kernel.  */
> +
> +struct kobject { int dummy; };
> +struct attribute { int dummy; };
> +struct bin_attribute { int dummy; };
> +
> +struct attribute_group {
> +    const char *name;
> +    // Type ID A
> +    int (*is_visible)(struct kobject *, struct attribute *, int);
> +    // Type ID B
> +    int (*is_bin_visible)(struct kobject *, const struct bin_attribute *, int);
> +    struct attribute **attrs;
> +    const struct bin_attribute **bin_attrs;
> +};
> +
> +/* Function that mimics __first_visible from kernel - gets inlined into
> +   caller.  */
> +static int __first_visible(const struct attribute_group *grp, struct kobject *kobj)
> +{
> +    /* Path 1: Call is_visible function pointer.  */
> +    if (grp->attrs && grp->attrs[0] && grp->is_visible)
> +        return grp->is_visible(kobj, grp->attrs[0], 0);
> +
> +    /* Path 2: Call is_bin_visible function pointer.  */
> +    if (grp->bin_attrs && grp->bin_attrs[0] && grp->is_bin_visible)
> +        return grp->is_bin_visible(kobj, grp->bin_attrs[0], 0);
> +
> +    return 0;
> +}
> +
> +/* Main function that triggers the optimization bug.  */
> +int test_kcfi_check_sharing(struct kobject *kobj, const struct attribute_group *grp)
> +{
> +    /* This should inline __first_visible and create the problematic pattern where:
> +       1. Both function pointers get loaded into same register.
> +       2. Optimizer shares KCFI check between them.
> +       3. Uses wrong type ID for one of the calls.  */
> +    return __first_visible(grp, kobj);
> +}
> +
> +/* Each indirect call should have its own KCFI check with correct type ID.
> +
> +   Should see:
> +   1. KCFI check for is_visible call with is_visible type ID.
> +   2. KCFI check for is_bin_visible call with is_bin_visible type ID.  */
> +
> +/* Verify we have TWO different KCFI check sequences.  */
> +/* Each check should have different type ID constants.  */
> +/* x86: { dg-final { scan-assembler-times {movl\s+\$-?[0-9]+,\s+%r10d} 2 { target i?86-*-* x86_64-*-* } } } */
> +/* AArch64: { dg-final { scan-assembler-times {mov\s+w17, #[0-9]+} 2 { target aarch64*-*-* } } } */
> +/* ARM 32-bit: { dg-final { scan-assembler-times {movw\s+r1, #[0-9]+} 2 { target arm32 } } } */
> +/* RISC-V: { dg-final { scan-assembler-times {lui\tt2, [0-9]+} 2 { target riscv*-*-* } } } */
> +
> +/* Verify the checks use DIFFERENT type IDs (not shared).
> +   We should NOT see the same type ID used twice - that would indicate
> +   sharing bug.  */
> +/* x86: { dg-final { scan-assembler-not {movl\s+\$(-?[0-9]+),\s+%r10d.*movl\s+\$\1,\s+%r10d} { target i?86-*-* x86_64-*-* } } } */
> +/* AArch64: { dg-final { scan-assembler-not {mov\s+w17, #([0-9]+).*mov\s+w17, #\1} { target aarch64*-*-* } } } */
> +/* ARM 32-bit: { dg-final { scan-assembler-not {movw\s+r1, #([0-9]+).*movw\s+r1, #\1} { target arm32 } } } */
> +/* RISC-V: { dg-final { scan-assembler-not {lui\s+t2, ([0-9]+)\s.*lui\s+t2, \1\s} { target riscv*-*-* } } } */
> +
> +/* Verify each call follows its own check (not shared) */
> +/* Should have 2 separate trap instructions.  */
> +/* x86: { dg-final { scan-assembler-times {ud2} 2 { target i?86-*-* x86_64-*-* } } } */
> +/* AArch64: { dg-final { scan-assembler-times {brk\s+#[0-9]+} 2 { target aarch64*-*-* } } } */
> +/* ARM 32-bit: { dg-final { scan-assembler-times {udf\s+#[0-9]+} 2 { target arm32 } } } */
> +/* RISC-V: { dg-final { scan-assembler-times {ebreak} 2 { target riscv*-*-* } } } */
> +
> +/* Verify 2 separate call sites.  */
> +/* x86: { dg-final { scan-assembler-times {jmp\s+\*%[a-z0-9]+} 2 { target i?86-*-* x86_64-*-* } } } */
> +/* AArch64: Allow both blr (regular call) and br (tail call) */
> +/* AArch64: { dg-final { scan-assembler-times {br\tx[0-9]+} 2 { target aarch64*-*-* } } } */
> +/* ARM 32-bit: { dg-final { scan-assembler-times {bx\s+(?:r[0-9]+|ip)} 2 { target arm32 } } } */
> +/* RISC-V: { dg-final { scan-assembler-times {jalr\t[a-z0-9]+} 2 { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
> new file mode 100644
> index 000000000000..17def558ada4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
> @@ -0,0 +1,136 @@
> +/* Test KCFI cold function and cold partition behavior.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-freorder-blocks-and-partition" { target freorder } } */
> +
> +void regular_function(void) {
> +    /* Regular function should get preamble.  */
> +}
> +
> +/* Cold-attributed function should STILL get preamble (it's a regular
> +   function, just marked cold).  */
> +__attribute__((cold))
> +void cold_attributed_function(void) {
> +    /* This function has cold attribute but should still get KCFI preamble.  */
> +}
> +
> +/* Hot-attributed function should get preamble.  */
> +__attribute__((hot))
> +void hot_attributed_function(void) {
> +    /* This function is explicitly hot and should get KCFI preamble.  */
> +}
> +
> +/* External declaration; calls to it keep the cold paths from being eliminated.  */
> +extern void abort(void);
> +
> +/* Additional function to test that normal functions still get preambles.  */
> +__attribute__((noinline))
> +int another_regular_function(int x) {
> +    return x + 42;
> +}
> +
> +/* Function designed to generate cold partitions under optimization.  */
> +__attribute__((noinline))
> +void function_with_cold_partition(int condition) {
> +    /* Hot path - very likely to execute.  */
> +    if (__builtin_expect(condition == 42, 1)) {
> +        /* Simple hot path that optimizer will keep inline.  */
> +        return;
> +    }
> +
> +    /* Cold paths that actually do something to prevent elimination.  */
> +    if (__builtin_expect(condition < 0, 0)) {
> +        /* Error path 1 - call abort to prevent elimination.  */
> +        abort();
> +    }
> +
> +    if (__builtin_expect(condition > 1000000, 0)) {
> +        /* Error path 2 - call abort to prevent elimination.  */
> +        abort();
> +    }
> +
> +    if (__builtin_expect(condition == 999999, 0)) {
> +        /* Error path 3 - more substantial cold code.  */
> +        volatile int sum = 0;
> +        for (volatile int i = 0; i < 100; i++) {
> +            sum += i * condition;
> +        }
> +        if (sum > 0)
> +            abort();
> +    }
> +
> +    /* More cold paths - switch with many unlikely cases.  */
> +    switch (condition) {
> +        case 1000001: case 1000002: case 1000003: case 1000004: case 1000005:
> +        case 1000006: case 1000007: case 1000008: case 1000009: case 1000010:
> +            /* Each case does some work before abort.  */
> +            volatile int work = condition * 2;
> +            if (work > 0) abort();
> +            break;
> +        default:
> +            if (condition != 42) {
> +                /* Fallback cold path - substantial work.  */
> +                volatile int result = 0;
> +                for (volatile int j = 0; j < condition % 50; j++) {
> +                    result += j;
> +                }
> +                if (result >= 0) abort();
> +            }
> +    }
> +}
> +
> +/* Test function pointers to ensure address-taken detection works.  */
> +void test_function_pointers(void) {
> +    void (*regular_ptr)(void) = regular_function;
> +    void (*cold_ptr)(void) = cold_attributed_function;
> +    void (*hot_ptr)(void) = hot_attributed_function;
> +
> +    regular_ptr();
> +    cold_ptr();
> +    hot_ptr();
> +}
> +
> +int main() {
> +    regular_function();
> +    cold_attributed_function();
> +    hot_attributed_function();
> +    function_with_cold_partition(42); /* Normal case - stay in hot path.  */
> +    another_regular_function(5);
> +    test_function_pointers();
> +    return 0;
> +}
> +
> +/* Regular function should have preamble.  */
> +/* { dg-final { scan-assembler "__cfi_regular_function:" } } */
> +
> +/* Cold-attributed function should STILL have preamble (it's a legitimate function).  */
> +/* { dg-final { scan-assembler "__cfi_cold_attributed_function:" } } */
> +
> +/* Hot-attributed function should have preamble.  */
> +/* { dg-final { scan-assembler "__cfi_hot_attributed_function:" } } */
> +
> +/* Function that generates cold partitions should have preamble for main entry.  */
> +/* { dg-final { scan-assembler "__cfi_function_with_cold_partition:" } } */
> +
> +/* The externally visible function that takes the addresses should have a preamble too.  */
> +/* { dg-final { scan-assembler "__cfi_test_function_pointers:" } } */
> +
> +/* The function should generate a .cold partition (only on targets that support freorder).  */
> +/* { dg-final { scan-assembler "function_with_cold_partition\\.cold:" { target freorder } } } */
> +
> +/* The .cold partition should NOT get a __cfi_ preamble since it's never
> +   reached via indirect calls.  */
> +/* { dg-final { scan-assembler-not "__cfi_function_with_cold_partition\\.cold:" { target freorder } } } */
> +
> +/* Additional regular function should get preamble.  */
> +/* { dg-final { scan-assembler "__cfi_another_regular_function:" } } */
> +
> +/* Test coverage summary:
> +   1. Cold-attributed function (__attribute__((cold))): SHOULD get preamble
> +   2. Cold partition (-freorder-blocks-and-partition): should NOT get preamble
> +   3. IPA split .part function (split_part=true): Logic in place, would skip if triggered
> +
> +   Note: IPA function splitting (creating .part functions with split_part=true) requires
> +   specific optimization conditions that are difficult to trigger reliably in tests.
> +   The KCFI logic correctly handles this case using the split_part flag check.
> +*/
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
> new file mode 100644
> index 000000000000..b9a8955b0899
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
> @@ -0,0 +1,135 @@
> +/* Test KCFI with complex addressing modes (structure members, array
> +   elements). This is a regression test for the change_address_1 RTL
> +   error that occurred when target_addr was PLUS(reg, offset) instead
> +   of a simple register.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +struct function_table {
> +    int (*callback1)(int);
> +    int (*callback2)(int, int);
> +    void (*callback3)(void);
> +    int (*callback4)(void *, void *, void *, void *, void *, void *);
> +    int data;
> +};
> +
> +static int handler1(int x) {
> +    return x * 2;
> +}
> +
> +static int handler2(int x, int y) {
> +    return x + y;
> +}
> +
> +static void handler3(void) {
> +    /* Empty handler.  */
> +}
> +
> +/* Test indirect calls through structure members - this creates
> +   PLUS(reg, offset) addressing.  */
> +int test_struct_members(struct function_table *table) {
> +    int result = 0;
> +
> +    /* These indirect calls will generate complex addressing modes:
> +     * call *(%rdi)          - callback1 at offset 0
> +     * call *8(%rdi)         - callback2 at offset 8
> +     * call *16(%rdi)        - callback3 at offset 16
> +     * KCFI must handle PLUS(reg, struct_offset) + kcfi_offset.  */
> +
> +    result += table->callback1(10);
> +    result += table->callback2(5, 7);
> +    table->callback3();
> +
> +    return result;
> +}
> +
> +/* Test indirect calls through array elements - another source of
> +   complex addressing.  */
> +typedef int (*func_array_t)(int);
> +
> +int test_array_elements(func_array_t functions[], int index) {
> +    /* This creates addressing like MEM[PLUS(PLUS(reg, index*8), 0)]
> +       which should be simplified to MEM[PLUS(reg, index*8)].  */
> +    return functions[index](42);
> +}
> +
> +/* Test with global structure.  */
> +static struct function_table global_table = {
> +    .callback1 = handler1,
> +    .callback2 = handler2,
> +    .callback3 = handler3,
> +    .data = 100
> +};
> +
> +int test_global_struct(void) {
> +    /* Access through global structure - may generate different
> +       addressing patterns.  */
> +    return global_table.callback1(20) + global_table.callback2(3, 4);
> +}
> +
> +/* Test nested structure access.  */
> +struct nested_table {
> +    struct function_table inner;
> +    int extra_data;
> +};
> +
> +int test_nested_struct(struct nested_table *nested) {
> +    /* Even more complex addressing: nested structure member access.  */
> +    return nested->inner.callback1(15);
> +}
> +
> +int test_many_args(void *one, void *two, void *three, void *four, void *five, void *six)
> +{
> +    return (unsigned long)one + (unsigned long)two + (unsigned long)three
> +          + (unsigned long)four + (unsigned long)five + (unsigned long)six;
> +}
> +
> +int main() {
> +    struct function_table local_table = {
> +        .callback1 = handler1,
> +        .callback2 = handler2,
> +        .callback3 = handler3,
> +        .callback4 = test_many_args,
> +        .data = 50
> +    };
> +
> +    func_array_t func_array[] = { handler1, handler1, handler1 };
> +
> +    int result = 0;
> +    result += test_struct_members(&local_table);
> +    result += test_array_elements(func_array, 1);
> +    result += test_global_struct();
> +
> +    struct nested_table nested = { .inner = local_table, .extra_data = 200 };
> +    result += test_nested_struct(&nested);
> +
> +    result += local_table.callback4(handler1, handler2, handler3, &result, main, &local_table);
> +
> +    return result;
> +}
> +
> +/* Verify that all address-taken functions get KCFI preambles.  */
> +/* { dg-final { scan-assembler {__cfi_handler1:} } } */
> +/* { dg-final { scan-assembler {__cfi_handler2:} } } */
> +/* { dg-final { scan-assembler {__cfi_handler3:} } } */
> +/* { dg-final { scan-assembler {__cfi_test_many_args:} } } */
> +
> +/* x86_64: Verify KCFI checks are generated for indirect calls through
> +   complex addressing.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {ud2} { target x86_64-*-* } } } */
> +
> +/* AArch64: Verify KCFI checks for complex addressing.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler {brk} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Verify KCFI checks for complex addressing with stack spilling.  */
> +/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
> +/* { dg-final { scan-assembler {udf} { target arm32 } } } */
> +
> +/* RISC-V: Verify KCFI check sequence for complex addressing.  */
> +/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, \.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tebreak} { target riscv*-*-* } } } */
> +
> +/* Should have trap section for x86 and RISC-V only.  */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
> new file mode 100644
> index 000000000000..a43bcd4f3e3f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
> @@ -0,0 +1,54 @@
> +/* Test KCFI IPA pass robustness with compiler-generated constructs.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include <stddef.h>
> +
> +/* Test various compiler-generated constructs that could confuse IPA pass.  */
> +
> +/* static_assert - this was causing the original crash.  */
> +typedef struct {
> +    int field1;
> +    char field2;
> +} test_struct_t;
> +
> +static_assert(offsetof(test_struct_t, field1) == 0, "layout check 1");
> +static_assert(offsetof(test_struct_t, field2) == 4, "layout check 2");
> +static_assert(sizeof(test_struct_t) >= 5, "size check");
> +
> +/* Regular functions that should get KCFI analysis.  */
> +void regular_function(void) {
> +    /* Should get KCFI preamble.  */
> +}
> +
> +static void static_function(void) {
> +    /* With -O2: correctly identified as not address-taken, no preamble.  */
> +}
> +
> +void address_taken_function(void) {
> +    /* Should get KCFI preamble (address taken below).  */
> +}
> +
> +/* Function pointer to create address-taken scenario.  */
> +void (*func_ptr)(void) = address_taken_function;
> +
> +/* More static_asserts mixed with function definitions.  */
> +static_assert(sizeof(void*) >= 4, "pointer size check");
> +
> +int main(void) {
> +    regular_function();    /* Direct call.  */
> +    static_function();     /* Direct call to static.  */
> +    func_ptr();            /* Indirect call.  */
> +
> +    static_assert(sizeof(int) == 4, "int size check");
> +
> +    return 0;
> +}
> +
> +/* Verify KCFI preambles are generated appropriately.  */
> +/* { dg-final { scan-assembler "__cfi_regular_function:" } } */
> +/* { dg-final { scan-assembler "__cfi_address_taken_function:" } } */
> +/* { dg-final { scan-assembler "__cfi_main:" } } */
> +
> +/* With -O2: static_function correctly identified as not address-taken.  */
> +/* { dg-final { scan-assembler-not "__cfi_static_function:" } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
> new file mode 100644
> index 000000000000..50029d136716
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
> @@ -0,0 +1,55 @@
> +/* Test that KCFI preserves function pointer moves at -O2 optimization.
> +   This test ensures that the combine pass doesn't incorrectly optimize away
> +   the move instruction needed to transfer function pointers from argument
> +   registers to the target registers used by KCFI patterns.  */
> +
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2 -std=gnu11" } */
> +
> +static int called_count = 0;
> +
> +/* Function taking one argument, returning void.  */
> +static __attribute__((noinline)) void increment_void(int *counter)
> +{
> +    (*counter)++;
> +}
> +
> +/* Function taking one argument, returning int.  */
> +static __attribute__((noinline)) int increment_int(int *counter)
> +{
> +    (*counter)++;
> +    return *counter;
> +}
> +
> +/* Don't allow the compiler to inline the calls.  */
> +static __attribute__((noinline)) void indirect_call(void (*func)(int *))
> +{
> +    func(&called_count);
> +}
> +
> +int main(void)
> +{
> +    /* This should work - matching prototype.  */
> +    indirect_call(increment_void);
> +
> +    /* This should trap - mismatched prototype.  */
> +    indirect_call((void *)increment_int);
> +
> +    return 0;
> +}
> +
> +/* Verify complete KCFI check sequence with preserved move instruction. At
> +   -O2, the combine pass previously optimized away the move from %rdi to %rax,
> +   breaking KCFI. Verify the full sequence is preserved.  */
> +
> +/* x86_64: Complete KCFI sequence with move preservation and indirect jump.  */
> +/* { dg-final { scan-assembler {(indirect_call):.*\n.*movq\s+%rdi,\s+(%rax)\n.*movl\s+\$[0-9]+,\s+%r10d\n\taddl\s+-4\(\2\),\s+%r10d\n\tje\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tud2.*\.Lkcfi_call[0-9]+:\n\tjmp\s+\*\2.*\.size\s+\1,\s+\.-\1} { target x86_64-*-* } } } */
> +
> +/* AArch64: Complete KCFI sequence with move preservation and indirect branch.  */
> +/* { dg-final { scan-assembler {(indirect_call):.*\n.*mov\s+(x[0-9]+),\s+x0\n.*ldur\s+w16,\s+\[\2,\s+#-4\]\n\tmov\s+w17,\s+#[0-9]+\n\tmovk\s+w17,\s+#[0-9]+,\s+lsl\s+#16\n\tcmp\s+w16,\s+w17\n\tb\.eq\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tbrk\s+#[0-9]+.*\.Lkcfi_call[0-9]+:\n\tbr\s+\2.*\.size\s+\1,\s+\.-\1} { target aarch64*-*-* } } } */
> +
> +/* ARM32: Complete KCFI sequence with move preservation and indirect branch.  */
> +/* { dg-final { scan-assembler {(indirect_call):.*\n.*mov\s+(r[0-9]+),\s+r0\n.*push\s+\{r0,\s+r1\}\n\tldr\s+r0,\s+\[\2,\s+#-4\]\n\tmovw\s+r1,\s+#[0-9]+\n\tmovt\s+r1,\s+#[0-9]+\n\tcmp\s+r0,\s+r1\n\tpop\s+\{r0,\s+r1\}\n\tbeq\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\s+#[0-9]+.*\.Lkcfi_call[0-9]+:\n\tbx\s+\2.*\.size\s+\1,\s+\.-\1} { target arm32 } } } */
> +
> +/* RISC-V: Complete KCFI sequence with move preservation and indirect jump.  */
> +/* { dg-final { scan-assembler {(indirect_call):.*mv\s+(a[0-9]+),a0.*lw\s+t1,\s+-4\(\2\).*lui\s+t2,\s+[0-9]+.*addiw\s+t2,\s+t2,\s+-?[0-9]+.*beq\s+t1,\s+t2,\s+\.Lkcfi_call[0-9]+.*ebreak.*jalr\s+zero,\s+\2,\s+0.*\.size\s+\1,\s+\.-\1} { target riscv64-*-* } } } */
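For readers following the regexps above: a minimal user-space sketch of what
the matched sequence checks, not part of the patch. It assumes a simplified
model in which the 32-bit type hash is a struct field rather than a word at
offset -4 from the entry point, and it reports a mismatch instead of trapping.
All names (model_func, model_checked_call, MODEL_TRAP, bump) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a KCFI target: the type hash travels with the
   entry point, standing in for the 32-bit word stored at -4.  */
struct model_func {
    uint32_t type_id;        /* Stands in for the preamble hash.  */
    int (*entry)(int *);     /* Stands in for the function body.  */
};

/* Stands in for ud2/brk/udf/ebreak; a real mismatch traps instead.  */
enum { MODEL_TRAP = -1 };

/* Example target: increments the counter and returns the new value.  */
static int bump(int *counter)
{
    return ++(*counter);
}

/* Compare the hash expected at the call site with the hash stored at
   the target, and transfer control only when they are equal.  */
static int model_checked_call(const struct model_func *target,
                              uint32_t expected_id, int *arg)
{
    assert(target && target->entry);
    if (target->type_id != expected_id)
        return MODEL_TRAP;
    return target->entry(arg);
}
```

Calling model_checked_call() with a matching hash runs bump(); with any other
hash the target is never entered, which corresponds to the "mismatched
prototype" call in main() above.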
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
> new file mode 100644
> index 000000000000..c43d8014ff2d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
> @@ -0,0 +1,100 @@
> +/* Test that no_sanitize("kcfi") attribute is preserved during inlining.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +extern void external_side_effect(int value);
> +
> +/* Regular function (should get KCFI checks).  */
> +__attribute__((noinline))
> +void normal_function(void (*callback)(int))
> +{
> +    /* This indirect call must generate KCFI checks.  */
> +    callback(300);
> +    external_side_effect(300);
> +}
> +
> +/* Regular function marked with no_sanitize("kcfi") (negative control: no checks).  */
> +__attribute__((noinline, no_sanitize("kcfi")))
> +void sensitive_non_inline_function(void (*callback)(int))
> +{
> +    /* This indirect call should NOT generate KCFI checks.  */
> +    callback(100);
> +    external_side_effect(100);
> +}
> +
> +/* Function marked with both no_sanitize("kcfi") and always_inline.  */
> +__attribute__((always_inline, no_sanitize("kcfi")))
> +static inline void sensitive_inline_function(void (*callback)(int))
> +{
> +    /* This indirect call should NOT generate KCFI checks when inlined.  */
> +    callback(42);
> +    external_side_effect(42);
> +}
> +
> +/* Explicit wrapper for testing sensitive_inline_function behavior.  */
> +__attribute__((noinline))
> +void wrap_sensitive_inline(void (*callback)(int))
> +{
> +    sensitive_inline_function(callback);
> +}
> +
> +/* Function marked with only always_inline (should get KCFI checks).  */
> +__attribute__((always_inline))
> +static inline void normal_inline_function(void (*callback)(int))
> +{
> +    /* This indirect call must generate KCFI checks when inlined.  */
> +    callback(200);
> +    external_side_effect(200);
> +}
> +
> +/* Explicit wrapper for testing normal_inline_function behavior.  */
> +__attribute__((noinline))
> +void wrap_normal_inline(void (*callback)(int))
> +{
> +    normal_inline_function(callback);
> +}
> +
> +void test_callback(int value)
> +{
> +    external_side_effect(value);
> +}
> +
> +static void (*volatile function_pointer)(int) = test_callback;
> +
> +int main(void)
> +{
> +    void (*fn_ptr)(int) = function_pointer;
> +
> +    normal_function(fn_ptr);
> +    wrap_normal_inline(fn_ptr);
> +    sensitive_non_inline_function(fn_ptr);
> +    wrap_sensitive_inline(fn_ptr);
> +
> +    return 0;
> +}
> +
> +/* Verify correct number of KCFI checks: exactly 2.  */
> +/* { dg-final { scan-assembler-times {ud2} 2 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times {brk\s+#[0-9]+} 2 { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-times {udf\s+#[0-9]+} 2 { target arm32 } } } */
> +/* { dg-final { scan-assembler-times {ebreak} 2 { target riscv*-*-* } } } */
> +
> +/* Positive controls: these should have KCFI checks.  */
> +/* { dg-final { scan-assembler {normal_function:.*ud2.*\.size\s+normal_function} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {wrap_normal_inline:.*ud2.*\.size\s+wrap_normal_inline} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {normal_function:.*brk\s+#[0-9]+.*\.size\s+normal_function} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler {wrap_normal_inline:.*brk\s+#[0-9]+.*\.size\s+wrap_normal_inline} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler {normal_function:.*udf\t#[0-9]+.*\.size\s+normal_function} { target arm32 } } } */
> +/* { dg-final { scan-assembler {wrap_normal_inline:.*udf\t#[0-9]+.*\.size\s+wrap_normal_inline} { target arm32 } } } */
> +/* { dg-final { scan-assembler {normal_function:.*ebreak.*\.size\s+normal_function} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler {wrap_normal_inline:.*ebreak.*\.size\s+wrap_normal_inline} { target riscv*-*-* } } } */
> +
> +/* Negative controls: these should NOT have KCFI checks.  */
> +/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*ud2.*\.size\s+sensitive_non_inline_function} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*ud2.*\.size\s+wrap_sensitive_inline} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*brk\s+#[0-9]+.*\.size\s+sensitive_non_inline_function} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*brk\s+#[0-9]+.*\.size\s+wrap_sensitive_inline} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-not {sensitive_non_inline_function:[^\n]*udf\t#[0-9]+[^\n]*\.size\tsensitive_non_inline_function} { target arm32 } } } */
> +/* { dg-final { scan-assembler-not {wrap_sensitive_inline:[^\n]*udf\t#[0-9]+[^\n]*\.size\twrap_sensitive_inline} { target arm32 } } } */
> +/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*ebreak.*\.size\s+sensitive_non_inline_function} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*ebreak.*\.size\s+wrap_sensitive_inline} { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
> new file mode 100644
> index 000000000000..6f1a558c0820
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
> @@ -0,0 +1,39 @@
> +/* Test KCFI with no_sanitize attribute.  */
> +/* { dg-do compile } */
> +
> +void target_function(void) {
> +    /* This should get KCFI preamble.  */
> +}
> +
> +void caller_with_checks(void) {
> +    /* This function should generate KCFI checks.  */
> +    void (*func_ptr)(void) = target_function;
> +    func_ptr();
> +}
> +
> +__attribute__((no_sanitize("kcfi")))
> +void caller_no_checks(void) {
> +    /* This function should NOT generate KCFI checks due to no_sanitize.  */
> +    void (*func_ptr)(void) = target_function;
> +    func_ptr();
> +}
> +
> +int main() {
> +    caller_with_checks();    /* This should generate checks inside.  */
> +    caller_no_checks();      /* This should NOT generate checks inside.  */
> +    return 0;
> +}
> +
> +/* All functions should get preambles regardless of no_sanitize.  */
> +/* { dg-final { scan-assembler "__cfi_target_function:" } } */
> +/* { dg-final { scan-assembler "__cfi_caller_with_checks:" } } */
> +/* { dg-final { scan-assembler "__cfi_caller_no_checks:" } } */
> +/* { dg-final { scan-assembler "__cfi_main:" } } */
> +
> +/* caller_with_checks() should generate KCFI check.
> +   caller_no_checks() should NOT generate KCFI check (no_sanitize).
> +   So a total of exactly 1 KCFI check in the entire program.  */
> +/* { dg-final { scan-assembler-times {addl\t-4\(%r[ad]x\), %r1[01]d} 1 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times {ldur\tw16, \[x[0-9]+, #-4\]} 1 { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-times {ldr\tr0, \[r[0-9]+, #-4\]} 1 { target arm32 } } } */
> +/* { dg-final { scan-assembler-times {lw\tt1, -[0-9]+\(} 1 { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
> new file mode 100644
> index 000000000000..f93a042d9752
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
> @@ -0,0 +1,48 @@
> +/* Test KCFI call-site offset validation across architectures.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void target_func_a(void) { }
> +void target_func_b(int x) { }
> +void target_func_c(int x, int y) { }
> +
> +int main() {
> +    void (*ptr_a)(void) = target_func_a;
> +    void (*ptr_b)(int) = target_func_b;
> +    void (*ptr_c)(int, int) = target_func_c;
> +
> +    /* Multiple indirect calls.  */
> +    ptr_a();
> +    ptr_b(1);
> +    ptr_c(1, 2);
> +
> +    return 0;
> +}
> +
> +/* Should have KCFI preambles for all functions.  */
> +/* { dg-final { scan-assembler "__cfi_target_func_a:" } } */
> +/* { dg-final { scan-assembler "__cfi_target_func_b:" } } */
> +/* { dg-final { scan-assembler "__cfi_target_func_c:" } } */
> +
> +/* x86_64: All call sites should use -4 offset for KCFI type ID loads, even
> +   with -falign-functions=16 (we're not using patchable entries here).  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
> +
> +/* AArch64: All call sites should use -4 offset.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: All call sites should use -4 offset with stack spilling.  */
> +/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
> +
> +/* RISC-V: All call sites should use -4 offset.  */
> +/* { dg-final { scan-assembler {lw\tt1, -4\(} { target riscv*-*-* } } } */
> +
> +/* x86_64 and RISC-V should have trap section.  */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
> +
> +/* AArch64 should NOT have trap section (uses brk immediate instead).  */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit should NOT have trap section (uses udf immediate instead).  */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
> new file mode 100644
> index 000000000000..a2d0ef0c6ff6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
> @@ -0,0 +1,70 @@
> +/* Test KCFI with patchable function entries - basic case.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fpatchable-function-entry=5,2" } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void test_function(int x) {
> +    /* Function should get both KCFI preamble and patchable entries.  */
> +}
> +
> +int main() {
> +    test_function(42);
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_test_function:" } } */
> +
> +/* Should have patchable function entry section.  */
> +/* { dg-final { scan-assembler "__patchable_function_entries" } } */
> +
> +/* x86_64: Should have exactly 2 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
> +
> +/* x86_64: Should have exactly 3 entry NOPs between .cfi_startproc and
> +   pushq.  */
> +/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
> +
> +/* x86_64: KCFI should have exactly 9 NOPs between __cfi_ and movl.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl} { target x86_64-*-* } } } */
> +
> +/* x86_64: Validate KCFI type ID is present.  */
> +/* { dg-final { scan-assembler {movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
> +
> +/* AArch64: Should have exactly 2 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
> +
> +/* AArch64: Should have exactly 3 entry NOPs between .cfi_startproc and
> +   stack manipulation.  */
> +/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*sub\t*sp} { target aarch64*-*-* } } } */
> +
> +/* AArch64: KCFI should have alignment NOPs then .word immediate.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
> +
> +/* AArch64: Validate clean KCFI boundary - .word then immediate end/size.  */
> +/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Should have exactly 2 prefix NOPs between .LPFE and .syntax.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.syntax} { target arm32 } } } */
> +
> +/* ARM 32-bit: Should have exactly 3 entry NOPs after function label.  */
> +/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
> +
> +/* ARM 32-bit: KCFI should have alignment NOPs then .word immediate.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target arm32 } } } */
> +
> +/* ARM 32-bit: Validate clean KCFI boundary - .word then immediate end/size.  */
> +/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target arm32 } } } */
> +
> +/* RISC-V: Should have exactly 2 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
> +
> +/* RISC-V: Should have exactly 3 entry NOPs before .cfi_startproc followed
> +   by addi sp.  */
> +/* { dg-final { scan-assembler {nop\n\t*nop\n\t*nop\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*addi\t*sp} { target riscv*-*-* } } } */
> +
> +/* RISC-V: KCFI should have alignment NOPs then .word immediate.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
> +
> +/* RISC-V: Validate clean KCFI boundary - .word then immediate end/size.  */
> +/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target riscv*-*-* } } } */
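The NOP counts these patterns expect can be cross-checked with a little
arithmetic. The sketch below is not part of the patch. -fpatchable-function-entry=N,M
requests N NOPs in total with M of them before the function entry, so the
split follows from the option itself. The x86_64 padding rule is only an
assumption inferred from the counts in kcfi-patchable-basic.c (9) and
kcfi-patchable-entry-only.c (11): a 16-byte slot, a 5-byte movl carrying the
type ID, and 1-byte NOPs. The names pfe_layout, pfe_split and
x86_64_preamble_pad are hypothetical.

```c
#include <assert.h>

/* Split of -fpatchable-function-entry=N,M into the two NOP groups.  */
struct pfe_layout {
    unsigned prefix_nops;    /* M: emitted before the function label.  */
    unsigned entry_nops;     /* N - M: emitted at the function entry.  */
};

static struct pfe_layout pfe_split(unsigned total, unsigned prefix)
{
    assert(prefix <= total);
    struct pfe_layout layout = { prefix, total - prefix };
    return layout;
}

/* Assumed x86_64 rule: the movl (5 bytes) plus the prefix NOPs are
   padded with 1-byte NOPs up to 16 bytes.  Only checked here against
   the two counts quoted in the tests, so treat it as a guess.  */
static unsigned x86_64_preamble_pad(unsigned prefix)
{
    assert(prefix <= 11);
    return 16u - 5u - prefix;
}
```

With the options used in kcfi-patchable-basic.c (5,2) this gives 2 prefix
NOPs, 3 entry NOPs and 9 padding NOPs; with (4,0) it gives 0, 4 and 11.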
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
> new file mode 100644
> index 000000000000..62e1926e107e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
> @@ -0,0 +1,62 @@
> +/* Test KCFI with patchable function entries - entry NOPs only.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fpatchable-function-entry=4,0" } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void test_function(void) {
> +}
> +
> +static void caller(void) {
> +    /* Make an indirect call to test callsite offset calculation.  */
> +    void (*func_ptr)(void) = test_function;
> +    func_ptr();
> +}
> +
> +int main() {
> +    test_function();  /* Direct call.  */
> +    caller();         /* Indirect call via static function.  */
> +    return 0;
> +}
> +
> +/* x86_64: Should have KCFI preamble with architecture alignment NOPs (11).  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+movl\t+\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
> +
> +/* AArch64: Should have KCFI preamble with no alignment NOPs.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Should have KCFI preamble with no alignment NOPs.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t\.word\t0x[0-9a-f]+} { target arm32 } } } */
> +
> +/* RISC-V: Should have KCFI preamble with no alignment NOPs.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
> +
> +/* x86_64: Indirect call should use original prefix NOPs (0) for offset
> +   calculation: -4 offset.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d\n\tje\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tud2\n.*\n\1:\n\tcall} { target x86_64-*-* } } } */
> +
> +/* x86_64: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
> +
> +/* AArch64: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*stp} { target aarch64*-*-* } } } */
> +
> +/* AArch64: No alignment NOPs - function type should come immediately before
> +   function.  */
> +/* { dg-final { scan-assembler {\.type\t*test_function, %function\n*test_function:} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
> +
> +/* ARM 32-bit: No alignment NOPs - function type should come immediately
> +   before function.  */
> +/* { dg-final { scan-assembler {\.type\t*test_function, %function\n*test_function:} { target arm32 } } } */
> +
> +/* RISC-V: All 4 NOPs are entry NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\.LFB} { target riscv*-*-* } } } */
> +
> +/* RISC-V: No alignment NOPs - function type should come immediately
> +   before function.  */
> +/* { dg-final { scan-assembler {\.type\t*test_function, @function\n*test_function:} { target riscv*-*-* } } } */
> +
> +/* Should have patchable function entry section.  */
> +/* { dg-final { scan-assembler "__patchable_function_entries" } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
> new file mode 100644
> index 000000000000..3d5618847840
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
> @@ -0,0 +1,51 @@
> +/* Test KCFI with large patchable function entries.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fpatchable-function-entry=11,11" } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void test_function(void) {
> +}
> +
> +int main() {
> +    void (*func_ptr)(void) = test_function;
> +    func_ptr();
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_test_function:" } } */
> +
> +/* Should have patchable function entry section.  */
> +/* { dg-final { scan-assembler "__patchable_function_entries" } } */
> +
> +/* x86_64: Should have exactly 11 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
> +
> +/* x86_64: Should have 0 entry NOPs - function starts immediately with
> +   pushq.  */
> +/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*pushq\t*%rbp} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target x86_64-*-* } } } */
> +
> +/* x86_64: KCFI should have 0 alignment NOPs - goes directly to typeid movl.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
> +
> +/* x86_64: Call site should use -15 offset.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-15\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
> +
> +/* AArch64: Should have exactly 11 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Should have exactly 11 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
> +
> +/* AArch64: Call site should use -15 offset.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-15\]} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Call site should use -15 offset.  */
> +/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-15\]} { target arm32 } } } */
> +
> +/* RISC-V: Should have 11 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
> +
> +/* RISC-V: Call site should use -15 offset (same as x86/AArch64).  */
> +/* { dg-final { scan-assembler {lw\tt1, -15\(} { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
> new file mode 100644
> index 000000000000..4f00a86dbcb7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
> @@ -0,0 +1,60 @@
> +/* Test KCFI with medium patchable function entries.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fpatchable-function-entry=8,4" } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void test_function(void) {
> +}
> +
> +int main() {
> +    void (*func_ptr)(void) = test_function;
> +    func_ptr();
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_test_function:" } } */
> +
> +/* Should have patchable function entry section.  */
> +/* { dg-final { scan-assembler "__patchable_function_entries" } } */
> +
> +/* x86_64: Should have exactly 4 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
> +
> +/* x86_64: Should have exactly 4 entry NOPs between .cfi_startproc and
> +   pushq.  */
> +/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
> +
> +/* x86_64: KCFI should have exactly 7 alignment NOPs between __cfi_ and
> +   typeid movl.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
> +
> +/* x86_64: Call site should use -8 offset.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-8\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
> +
> +/* AArch64: Should have exactly 4 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
> +
> +/* AArch64: Should have exactly 4 entry NOPs after .cfi_startproc.  */
> +/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Should have exactly 4 prefix NOPs between .LPFE and .syntax.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.syntax} { target arm32 } } } */
> +
> +/* ARM 32-bit: Should have exactly 4 entry NOPs after function label.  */
> +/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
> +
> +/* AArch64: Call site should use -8 offset.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-8\]} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Call site should use -8 offset.  */
> +/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-8\]} { target arm32 } } } */
> +
> +/* RISC-V: Should have exactly 4 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
> +
> +/* RISC-V: Should have 4 entry NOPs.  */
> +/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\.LFB} { target riscv*-*-* } } } */
> +
> +/* RISC-V: Call site should use -8 offset (same as x86/AArch64).  */
> +/* { dg-final { scan-assembler {lw\tt1, -8\(} { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
> new file mode 100644
> index 000000000000..98c53ef52989
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
> @@ -0,0 +1,60 @@
> +/* Test KCFI with patchable function entries - prefix NOPs only.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fpatchable-function-entry=3,3" } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void test_function(void) {
> +}
> +
> +int main() {
> +    test_function();
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_test_function:" } } */
> +
> +/* x86_64: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target x86_64-*-* } } } */
> +
> +/* x86_64: No entry NOPs - function should start immediately with prologue.  */
> +/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*pushq\t*%rbp} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target x86_64-*-* } } } */
> +
> +/* x86_64: Should have exactly 8 alignment NOPs.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl} { target x86_64-*-* } } } */
> +
> +/* AArch64: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target aarch64*-*-* } } } */
> +
> +/* AArch64: No entry NOPs - function should start immediately with prologue.  */
> +/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*nop\n\t*ret} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target aarch64*-*-* } } } */
> +
> +/* AArch64: KCFI type ID should have 1 alignment NOP then word.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
> +
> +/* ARM 32-bit: No entry NOPs - function should start immediately with
> +   prologue.  */
> +/* { dg-final { scan-assembler {test_function:} { target arm32 } } } */
> +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target arm32 } } } */
> +
> +/* ARM 32-bit: KCFI type ID should have 1 alignment NOP then word.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target arm32 } } } */
> +
> +/* RISC-V: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target riscv*-*-* } } } */
> +
> +/* RISC-V: No entry NOPs - function should start immediately with
> +   .cfi_startproc.  */
> +/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target riscv*-*-* } } } */
> +
> +/* RISC-V: KCFI type ID should have 1 alignment NOP then word.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
> +
> +/* Should have patchable function entry section.  */
> +/* { dg-final { scan-assembler "__patchable_function_entries" } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
> new file mode 100644
> index 000000000000..26323db4572f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
> @@ -0,0 +1,104 @@
> +/* Test KCFI with position-independent code addressing modes.
> +   This is a regression test for complex addressing like
> +   PLUS(PLUS(...), symbol_ref) which can occur with PIC and caused
> +   change_address_1 RTL errors.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2 -fpic" } */
> +
> +/* Global function pointer table that creates PIC addressing.  */
> +struct callbacks {
> +    int (*handler1)(int);
> +    void (*handler2)(void);
> +    int (*handler3)(int, int);
> +};
> +
> +static int simple_handler(int x) {
> +    return x * 2;
> +}
> +
> +static void void_handler(void) {
> +    /* Empty handler.  */
> +}
> +
> +static int complex_handler(int a, int b) {
> +    return a + b;
> +}
> +
> +/* Global structure that will require PIC addressing.  */
> +struct callbacks global_callbacks = {
> +    .handler1 = simple_handler,
> +    .handler2 = void_handler,
> +    .handler3 = complex_handler
> +};
> +
> +/* Function that uses PIC addressing to access global callbacks.  */
> +int test_pic_addressing(int value) {
> +    /* These indirect calls through global structure create complex
> +       addressing like PLUS(PLUS(GOT_base, symbol_offset), struct_offset)
> +       which previously caused RTL errors in KCFI instrumentation.  */
> +
> +    int result = 0;
> +    result += global_callbacks.handler1(value);
> +
> +    global_callbacks.handler2();
> +
> +    result += global_callbacks.handler3(value, result);
> +
> +    return result;
> +}
> +
> +/* Test with function pointer arrays.  */
> +static int (*func_array[])(int) = {
> +    simple_handler,
> +    simple_handler,
> +    simple_handler
> +};
> +
> +int test_pic_array(int index, int value) {
> +    /* Array access with PIC can also create complex addressing.  */
> +    return func_array[index % 3](value);
> +}
> +
> +/* Test with dynamic PIC addressing.  */
> +struct callbacks *get_callbacks(void) {
> +    return &global_callbacks;
> +}
> +
> +int test_dynamic_pic(int value) {
> +    /* Dynamic access through function call creates very complex addressing.  */
> +    struct callbacks *cb = get_callbacks();
> +    return cb->handler1(value) + cb->handler3(value, value);
> +}
> +
> +int main() {
> +    int result = 0;
> +    result += test_pic_addressing(10);
> +    result += test_pic_array(1, 20);
> +    result += test_dynamic_pic(5);
> +    return result;
> +}
> +
> +/* Verify that all address-taken functions get KCFI preambles.  */
> +/* { dg-final { scan-assembler {__cfi_simple_handler:} } } */
> +/* { dg-final { scan-assembler {__cfi_void_handler:} } } */
> +/* { dg-final { scan-assembler {__cfi_complex_handler:} } } */
> +
> +/* x86_64: Verify KCFI checks are generated.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {ud2} { target x86_64-*-* } } } */
> +
> +/* AArch64: Verify KCFI checks.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler {brk} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Verify KCFI checks with PIC addressing and stack spilling.  */
> +/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
> +/* { dg-final { scan-assembler {udf} { target arm32 } } } */
> +
> +/* RISC-V: Verify KCFI checks are generated.  */
> +/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler {ebreak} { target riscv*-*-* } } } */
> +
> +/* Should have trap section.  */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
> new file mode 100644
> index 000000000000..79e5ca61cdc2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
> @@ -0,0 +1,50 @@
> +/* Test KCFI with retpoline thunk-extern flag forces r11 usage.  */
> +/* { dg-do compile { target x86_64-*-* } } */
> +/* { dg-additional-options "-O2 -mindirect-branch=thunk-extern" } */
> +
> +extern int external_target(void);
> +
> +/* Test regular call (not tail call).  */
> +__attribute__((noinline))
> +int call_test(int (*func_ptr)(void)) {
> +    /* This indirect call should use r11 when both KCFI and
> +       -mindirect-branch=thunk-extern are enabled.  */
> +    int result = func_ptr();  /* Function parameter prevents direct optimization.  */
> +    return result + 1;  /* Prevent tail call optimization.  */
> +}
> +
> +/* Reference external_target to generate the required symbol.  */
> +int (*external_func_ptr)(void) = external_target;
> +
> +/* Test function for sibcalls (tail calls).  */
> +__attribute__((noinline))
> +void sibcall_test(int (**func_ptr)(void)) {
> +    /* This sibcall should use r11 when both KCFI and
> +       -mindirect-branch=thunk-extern are enabled.  */
> +    (*func_ptr)();  /* Tail call - should be optimized to sibcall.  */
> +}
> +
> +/* Should have weak symbol for external function.  */
> +/* { dg-final { scan-assembler "__kcfi_typeid_external_target" } } */
> +
> +/* When both KCFI and -mindirect-branch=thunk-extern are enabled,
> +   indirect calls should always use r11 register and convert to extern thunks.  */
> +/* { dg-final { scan-assembler-times {call\s+__x86_indirect_thunk_r11} 1 } } */
> +
> +/* Sibcalls should also use r11 register and convert to extern thunks.  */
> +/* { dg-final { scan-assembler-times {jmp\s+__x86_indirect_thunk_r11} 1 } } */
> +
> +/* Should have exactly 2 KCFI traps (one per function).  */
> +/* { dg-final { scan-assembler-times {ud2} 2 } } */
> +
> +/* Should NOT use other registers for indirect calls.  */
> +/* { dg-final { scan-assembler-not {call\s+\*%rax} } } */
> +/* { dg-final { scan-assembler-not {call\s+\*%rcx} } } */
> +/* { dg-final { scan-assembler-not {call\s+\*%rdx} } } */
> +/* { dg-final { scan-assembler-not {call\s+\*%rdi} } } */
> +
> +/* Should NOT use other registers for sibcalls.  */
> +/* { dg-final { scan-assembler-not {jmp\s+\*%rax} } } */
> +/* { dg-final { scan-assembler-not {jmp\s+\*%rcx} } } */
> +/* { dg-final { scan-assembler-not {jmp\s+\*%rdx} } } */
> +/* { dg-final { scan-assembler-not {jmp\s+\*%rdi} } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c
> new file mode 100644
> index 000000000000..6ad8fab5da80
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c
> @@ -0,0 +1,151 @@
> +/* Test KCFI runtime behavior: working calls and type mismatch trapping.
> +   { dg-do run { target native } }
> +   { dg-options "-fsanitize=kcfi" } */
> +
> +#include <stdio.h>
> +#include <signal.h>
> +#include <setjmp.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +/* Test functions with different signatures */
> +static int func_int_void(void)
> +{
> +    return 42;
> +}
> +
> +__attribute__((nocf_check))
> +static int func_int_void_nocf_check(void)
> +{
> +    return 42;
> +}
> +
> +static int func_int_int(int x)
> +{
> +    return x * 4;
> +}
> +
> +/* Global state for signal handling */
> +static volatile int trap_occurred = 0;
> +static jmp_buf trap_env;
> +
> +/* Signal handler for KCFI traps */
> +static void trap_handler(int sig)
> +{
> +    trap_occurred = 1;
> +    longjmp(trap_env, 1);
> +}
> +
> +/* Compatible indirect call should work */
> +static int test_compatible_call(void)
> +{
> +    typedef int (*int_void_ptr)(void);
> +    int_void_ptr ptr = func_int_void;
> +
> +    fprintf(stderr, "Calling %s(0x%08x) through %s(0x%08x) ...\n",
> +           __builtin_typeinfo_name(typeof(func_int_void)),
> +           __builtin_typeinfo_hash(typeof(func_int_void)),
> +           __builtin_typeinfo_name(typeof(*ptr)),
> +           __builtin_typeinfo_hash(typeof(*ptr)));
> +
> +    trap_occurred = 0;
> +    /* This should work - same signature */
> +    int result = ptr();
> +
> +    return (trap_occurred == 0 && result == 42) ? 1 : 0;
> +}
> +
> +/* Indirect call to a nocf_check function (no KCFI preamble) should trap.  */
> +static int test_nocf_check_trap(void)
> +{
> +    trap_occurred = 0;
> +
> +    if (setjmp(trap_env) == 0) {
> +      typedef int (__attribute__((nocf_check)) *int_void_ptr_nocf)(void);
> +      int_void_ptr_nocf ptr = func_int_void_nocf_check;
> +
> +      fprintf(stderr, "Calling %s(0x%08x) through %s(0x%08x) ...\n",
> +             __builtin_typeinfo_name(typeof(func_int_void_nocf_check)),
> +             __builtin_typeinfo_hash(typeof(func_int_void_nocf_check)),
> +             __builtin_typeinfo_name(typeof(*ptr)),
> +             __builtin_typeinfo_hash(typeof(*ptr)));
> +
> +      int result = ptr();
> +
> +      /* If we get here, the trap didn't occur */
> +      return 0;
> +    } else {
> +      /* We caught the trap - this is expected */
> +      return trap_occurred;
> +    }
> +}
> +
> +/* Type mismatch should trap */
> +static int test_type_mismatch_trap(void)
> +{
> +    trap_occurred = 0;
> +
> +    if (setjmp(trap_env) == 0) {
> +      /* Cast func_int_void to incompatible void(*)(void) type */
> +      typedef void (*void_void_ptr)(void);
> +      void_void_ptr ptr = (void_void_ptr)func_int_void;
> +
> +      fprintf(stderr, "Calling %s(0x%08x) through %s(0x%08x) ...\n",
> +             __builtin_typeinfo_name(typeof(func_int_void)),
> +             __builtin_typeinfo_hash(typeof(func_int_void)),
> +             __builtin_typeinfo_name(typeof(*ptr)),
> +             __builtin_typeinfo_hash(typeof(*ptr)));
> +
> +      /* This should trap because type IDs don't match:
> +         - func_int_void has type ID for int(void)
> +         - but we're calling through void(void) pointer type */
> +      ptr();
> +
> +      /* If we get here, the trap didn't occur */
> +      return 0;
> +    } else {
> +      /* We caught the trap - this is expected */
> +      return trap_occurred;
> +    }
> +}
> +
> +int main(void)
> +{
> +    struct sigaction sa = {
> +      .sa_handler = trap_handler,
> +      .sa_flags = SA_NODEFER,
> +    };
> +    int failed = 3;
> +
> +    /* Install trap handler.  */
> +    if (sigaction(SIGILL, &sa, NULL)) {
> +      perror("sigaction");
> +      return 1;
> +    }
> +
> +    /* Compatible call should work */
> +    if (test_compatible_call()) {
> +      printf("OK: matched indirect call succeeded\n");
> +      failed--;
> +    } else {
> +      printf("FAIL\n");
> +    }
> +
> +    /* Using nocf_check should trap */
> +    if (test_nocf_check_trap()) {
> +      printf("OK: indirect call to nocf_check correctly trapped\n");
> +      failed--;
> +    } else {
> +      printf("FAIL\n");
> +    }
> +
> +    /* Type mismatch should trap */
> +    if (test_type_mismatch_trap()) {
> +      printf("OK: mismatched indirect call correctly trapped\n");
> +      failed--;
> +    } else {
> +      printf("FAIL\n");
> +    }
> +
> +    return failed;
> +}
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
> new file mode 100644
> index 000000000000..e2e3912fffa3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
> @@ -0,0 +1,142 @@
> +/* Test KCFI protection when indirect calls get converted to tail calls.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +typedef int (*func_ptr_t)(int);
> +typedef void (*void_func_ptr_t)(void);
> +
> +struct function_table {
> +    func_ptr_t process;
> +    void_func_ptr_t cleanup;
> +};
> +
> +/* Target functions.  */
> +int process_data(int x) { return x * 2; }
> +void cleanup_data(void) {}
> +
> +/* Initialize function table.  */
> +volatile struct function_table vtable = {
> +    .process = &process_data,
> +    .cleanup = &cleanup_data
> +};
> +
> +/* Indirect call through struct member that should become tail call.  */
> +int test_struct_indirect_call(int x) {
> +    /* This is an indirect call that should be converted to tail call:
> +       Without -fno-optimize-sibling-calls should become "jmp *vtable+0(%rip)"
> +       With -fno-optimize-sibling-calls should become "call *vtable+0(%rip)"  */
> +    return vtable.process(x);
> +}
> +
> +/* Indirect call through function pointer parameter.  */
> +int test_param_indirect_call(func_ptr_t handler, int x) {
> +    /* This is an indirect call that should be converted to tail call:
> +       Without -fno-optimize-sibling-calls should become "jmp *%rdi"
> +       With -fno-optimize-sibling-calls should be "call *%rdi"  */
> +    return handler(x);
> +}
> +
> +/* Void indirect call through struct member.  */
> +void test_void_indirect_call(void) {
> +    /* This is an indirect call that should be converted to tail call:
> +       Without -fno-optimize-sibling-calls should become "jmp *vtable+8(%rip)"
> +       With -fno-optimize-sibling-calls should be "call *vtable+8(%rip)"  */
> +    vtable.cleanup();
> +}
> +
> +/* Non-tail call for comparison (should always be call).  */
> +int test_non_tail_indirect_call(func_ptr_t handler, int x) {
> +    /* This should never become a tail call - always "call *%rdi"  */
> +    int result = handler(x);
> +    return result + 1;  /* Prevents tail call optimization.  */
> +}
> +
> +/* Should have KCFI preambles for all functions.  */
> +/* { dg-final { scan-assembler-times "__cfi_process_data:" 1 } } */
> +/* { dg-final { scan-assembler-times "__cfi_cleanup_data:" 1 } } */
> +/* { dg-final { scan-assembler-times "__cfi_test_struct_indirect_call:" 1 } } */
> +/* { dg-final { scan-assembler-times "__cfi_test_param_indirect_call:" 1 } } */
> +/* { dg-final { scan-assembler-times "__cfi_test_void_indirect_call:" 1 } } */
> +/* { dg-final { scan-assembler-times "__cfi_test_non_tail_indirect_call:" 1 } } */
> +
> +/* x86_64: Should have exactly 4 KCFI checks for indirect calls
> +   (load type ID + compare).  */
> +/* { dg-final { scan-assembler-times {movl\t\$-?[0-9]+, %r10d} 4 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times {addl\t-4\(%r[a-z0-9]+\), %r10d} 4 { target x86_64-*-* } } } */
> +
> +/* Should have exactly 4 trap sections and 4 trap instructions.  */
> +/* { dg-final { scan-assembler-times "\\.kcfi_traps" 4 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times "ud2" 4 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times "\\.kcfi_traps" 4 { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler-times "ebreak" 4 { target riscv*-*-* } } } */
> +
> +/* Should NOT have unprotected direct jumps to vtable.  */
> +/* { dg-final { scan-assembler-not {jmp\t\*vtable\(%rip\)} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-not {jmp\t\*vtable\+8\(%rip\)} { target x86_64-*-* } } } */
> +
> +/* Should have exactly 3 protected tail calls (jmp through register after
> +   KCFI check).  */
> +/* { dg-final { scan-assembler-times {jmp\t\*%[a-z0-9]+} 3 { target x86_64-*-* } } } */
> +
> +/* Should have exactly 1 regular call (non-tail call case).  */
> +/* { dg-final { scan-assembler-times {call\t\*%[a-z0-9]+} 1 { target x86_64-*-* } } } */
> +
> +/* RISC-V: Should have exactly 4 KCFI checks for indirect calls
> +   (comparison instruction).  */
> +/* { dg-final { scan-assembler-times {beq\tt1, t2, \.Lkcfi_call[0-9]+} 4 { target riscv*-*-* } } } */
> +
> +/* RISC-V: Each of the 4 KCFI checks should load the stored type ID from
> +   the -4 offset and start building the expected type ID with lui.  */
> +/* { dg-final { scan-assembler-times {lw\tt1, -4\([a-z0-9]+\)} 4 { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler-times {lui\tt2, [0-9]+} 4 { target riscv*-*-* } } } */
> +
> +/* RISC-V: Should have exactly 3 protected tail calls (jr after
> +   KCFI check - no return address save).  */
> +/* { dg-final { scan-assembler-times {jalr\t(x0|zero), [a-z0-9]+, 0} 3 { target riscv*-*-* } } } */
> +
> +/* RISC-V: Should have exactly 1 regular call (non-tail call case - saves
> +   return address).  */
> +/* { dg-final { scan-assembler-times {jalr\t(x1|ra), [a-z0-9]+, 0} 1 { target riscv*-*-* } } } */
> +
> +/* RISC-V: Type ID loading should use lui + addiw pattern for 32-bit constants.  */
> +/* { dg-final { scan-assembler {lui\tt2, [0-9]+} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler {addiw\tt2, t2, -?[0-9]+} { target riscv*-*-* } } } */
> +
> +/* AArch64: Should have exactly 4 KCFI checks for indirect calls (load type
> +   ID from -4 offset + compare).  */
> +/* { dg-final { scan-assembler-times {ldur\tw16, \[x[0-9]+, #-4\]} 4 { target aarch64-*-* } } } */
> +/* { dg-final { scan-assembler-times {cmp\tw16, w17} 4 { target aarch64-*-* } } } */
> +
> +/* Should have exactly 4 trap instructions.  */
> +/* { dg-final { scan-assembler-times {brk\t#[0-9]+} 4 { target aarch64-*-* } } } */
> +
> +/* Should have exactly 3 protected tail calls (br through register after
> +   KCFI check).  */
> +/* { dg-final { scan-assembler-times {br\tx[0-9]+} 3 { target aarch64-*-* } } } */
> +
> +/* Should have exactly 1 regular call (non-tail call case).  */
> +/* { dg-final { scan-assembler-times {blr\tx[0-9]+} 1 { target aarch64-*-* } } } */
> +
> +/* Type ID loading should use mov + movk pattern for 32-bit constants.  */
> +/* { dg-final { scan-assembler {mov\tw17, #[0-9]+} { target aarch64-*-* } } } */
> +/* { dg-final { scan-assembler {movk\tw17, #[0-9]+, lsl #16} { target aarch64-*-* } } } */
> +
> +/* ARM 32-bit: Should have exactly 4 KCFI checks for indirect calls (load
> +   type ID from -4 offset + compare).  */
> +/* { dg-final { scan-assembler-times {ldr\tr0, \[r[0-9]+, #-4\]} 4 { target arm32 } } } */
> +/* { dg-final { scan-assembler-times {cmp\tr0, r1} 4 { target arm32 } } } */
> +
> +/* Should have exactly 4 trap instructions.  */
> +/* { dg-final { scan-assembler-times {udf\t#[0-9]+} 4 { target arm32 } } } */
> +
> +/* Should have exactly 3 protected tail calls (bx through register after
> +   KCFI check).  */
> +/* { dg-final { scan-assembler-times {bx\tr[0-9]+} 3 { target arm32 } } } */
> +
> +/* Should have exactly 1 regular call (non-tail call case).  */
> +/* { dg-final { scan-assembler-times {blx\tr[0-9]+} 1 { target arm32 } } } */
> +
> +/* Type ID loading should use movw + movt pattern for 32-bit constants
> +   into r1.  */
> +/* { dg-final { scan-assembler {movw\tr1, #[0-9]+} { target arm32 } } } */
> +/* { dg-final { scan-assembler {movt\tr1, #[0-9]+} { target arm32 } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
> new file mode 100644
> index 000000000000..f2226fa58ac9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
> @@ -0,0 +1,54 @@
> +/* Test AArch64 and ARM32 KCFI trap encoding in BRK/UDF instructions.  */
> +/* { dg-do compile { target { aarch64*-*-* || arm32 } } } */
> +
> +void target_function(int x, char y) {
> +}
> +
> +int main() {
> +    void (*func_ptr)(int, char) = target_function;
> +
> +    /* This should generate trap with immediate encoding.  */
> +    func_ptr(42, 'a');
> +
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_target_function:" } } */
> +
> +/* AArch64 specific: Should have BRK instruction with proper ESR encoding.
> +   ESR format: 0x8000 | ((type_reg & 31) << 5) | (addr_reg & 31)
> +
> +   Test the ESR encoding by checking for the expected value.  Since this
> +   test uses x2 and w17, we expect ESR = 0x8000 | (17 << 5) | 2 = 33314.
> +
> +   A truly dynamic test would need to extract the register from blr and compute
> +   the corresponding ESR, but DejaGnu's regex limitations make this complex.
> +   This test validates the specific case and documents the encoding.
> +   */
> +/* { dg-final { scan-assembler "blr\\s+x2" { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler "brk\\s+#33314" { target aarch64*-*-* } } } */
> +
> +/* Should have KCFI check with type comparison.  */
> +/* { dg-final { scan-assembler {ldur\t*w16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler {cmp\t*w16, w17} { target aarch64*-*-* } } } */
> +
> +/* ARM32 specific: Should have UDF instruction with proper encoding.
> +   UDF format: 0x8000 | ((type_reg & 31) << 5) | (addr_reg & 31)
> +
> +   Since ARM32 spills and restores r0/r1 before the trap, the type_reg
> +   field uses 0x1F (31) to indicate "register was spilled" rather than
> +   pointing to a live register. The addr_reg field contains the actual
> +   target register number.
> +
> +   For this test case using r3, we expect:
> +   UDF = 0x8000 | (31 << 5) | 3 = 0x8000 | 0x3E0 | 3 = 33763
> +   */
> +/* { dg-final { scan-assembler "blx\\s+r3" { target arm32 } } } */
> +/* { dg-final { scan-assembler "udf\\s+#33763" { target arm32 } } } */
> +
> +/* Should have register spilling and restoration around type check.  */
> +/* { dg-final { scan-assembler {push\t*\{r0, r1\}} { target arm32 } } } */
> +/* { dg-final { scan-assembler {pop\t*\{r0, r1\}} { target arm32 } } } */
> +/* { dg-final { scan-assembler {ldr\t*r0, \[r[0-9]+, #-4\]} { target arm32 } } } */
> +/* { dg-final { scan-assembler {cmp\t*r0, r1} { target arm32 } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
> new file mode 100644
> index 000000000000..7f5f8a82f3dc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
> @@ -0,0 +1,41 @@
> +/* Test KCFI trap section generation.  */
> +/* { dg-do compile } */
> +
> +void target_function(void) {}
> +
> +int main() {
> +    void (*func_ptr)(void) = target_function;
> +
> +    /* Multiple indirect calls to generate multiple trap entries.  */
> +    func_ptr();
> +    func_ptr();
> +
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_target_function:" } } */
> +
> +/* Should have exactly 2 trap labels in code.  */
> +/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*ud2} 2 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*brk} 2 { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*udf} 2 { target arm32 } } } */
> +/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*ebreak} 2 { target riscv*-*-* } } } */
> +
> +/* x86_64: Should have complete .kcfi_traps section sequence with relative
> +   offset and 2 entries.  */
> +/* { dg-final { scan-assembler {\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry([^:]+):\n\t\.long\t\.Lkcfi_trap([^\s\n]+)-\.Lkcfi_entry\1\n\t\.text} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times {\.section\t\.kcfi_traps,"ao",@progbits,\.text} 2 { target x86_64-*-* } } } */
> +
> +/* AArch64 should NOT have .kcfi_traps section (uses brk immediate instead) */
> +/* { dg-final { scan-assembler-not {\.section\t+\.kcfi_traps} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-not {\.long.*-\.L} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit should NOT have .kcfi_traps section (uses udf immediate instead) */
> +/* { dg-final { scan-assembler-not {\.section\t+\.kcfi_traps} { target arm32 } } } */
> +/* { dg-final { scan-assembler-not {\.long.*-\.L} { target arm32 } } } */
> +
> +/* RISC-V: Should have complete .kcfi_traps section sequence with relative
> +   offset and 2 entries.  */
> +/* { dg-final { scan-assembler {\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry([^:]+):\n\t\.4byte\t\.L([^\s\n]+)-\.Lkcfi_entry\1\n\t\.text} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler-times {\.section\t\.kcfi_traps,"ao",@progbits,\.text} 2 { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi.exp b/gcc/testsuite/gcc.dg/kcfi/kcfi.exp
> new file mode 100644
> index 000000000000..0bbba196c82f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi.exp
> @@ -0,0 +1,64 @@
> +#   Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# GCC testsuite for KCFI (Kernel Control Flow Integrity) tests.
> +
> +# Load support procs.
> +load_lib gcc-dg.exp
> +
> +# KCFI is only supported on specific targets
> +if { ![istarget "x86_64-*-*"] \
> +     && ![istarget "aarch64-*-*"] && ![istarget "arm*-*-*"] \
> +     && ![istarget "riscv*-*-*"] } {
> +    return
> +}
> +
> +# Skip tests if x86_64 is running in 32-bit mode (-m32)
> +if { [istarget "x86_64-*-*"] && ![check_effective_target_lp64] } {
> +    return
> +}
> +
> +# Skip tests if AArch64 is running in ILP32 mode (-mabi=ilp32)
> +if { [istarget "aarch64-*-*"] && ![check_effective_target_lp64] } {
> +    return
> +}
> +
> +# Skip tests if RISC-V is running in 32-bit mode (riscv32-*)
> +if { [istarget "riscv*-*-*"] && ![check_effective_target_lp64] } {
> +    return
> +}
> +
> +# Add KCFI-specific flags to any existing DEFAULT_CFLAGS
> +global DEFAULT_CFLAGS
> +if ![info exists DEFAULT_CFLAGS] then {
> +    set DEFAULT_CFLAGS ""
> +}
> +set DEFAULT_CFLAGS "$DEFAULT_CFLAGS -fsanitize=kcfi"
> +
> +# Add ARM32-specific flags for arm32 targets
> +if [check_effective_target_arm32] {
> +    set DEFAULT_CFLAGS "$DEFAULT_CFLAGS -march=armv7-a -mfloat-abi=soft"
> +}
> +
> +# Initialize `dg'.
> +dg-init
> +
> +# Main loop.
> +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] \
> +       "" $DEFAULT_CFLAGS
> +
> +# All done.
> +dg-finish
> --
> 2.34.1
>
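
The trap immediate documented in kcfi-trap-encoding.c above
(0x8000 | ((type_reg & 31) << 5) | (addr_reg & 31)) can be checked by hand
with a small standalone sketch. The helper names below are hypothetical and
are not part of the patch; they only restate the formula from the test
comments and verify the two constants the test scans for (AArch64: type
register 17 as in "cmp w16, w17", target x2; ARM32: spilled marker 31,
target r3).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): build the 16-bit BRK/UDF
   immediate using the formula quoted in kcfi-trap-encoding.c.  */
static inline uint16_t kcfi_trap_imm(unsigned type_reg, unsigned addr_reg)
{
    return (uint16_t)(0x8000u | ((type_reg & 31u) << 5) | (addr_reg & 31u));
}

/* Hypothetical inverse helpers: recover the two register fields, as a
   trap handler would need to do when decoding the immediate.  */
static inline unsigned kcfi_trap_type_reg(uint16_t imm)
{
    return (imm >> 5) & 31u;
}

static inline unsigned kcfi_trap_addr_reg(uint16_t imm)
{
    return imm & 31u;
}

/* Self-check against the two values the test expects.  */
static inline void kcfi_trap_imm_selfcheck(void)
{
    /* AArch64 case: 0x8000 | (17 << 5) | 2 = 0x8222.  */
    assert(kcfi_trap_imm(17, 2) == 33314);
    /* ARM32 case: 0x8000 | (31 << 5) | 3 = 0x83E3.  */
    assert(kcfi_trap_imm(31, 3) == 33763);
}
```

Decoding works the same way in reverse: kcfi_trap_type_reg(33314) gives 17
and kcfi_trap_addr_reg(33763) gives 3, which matches the register numbers
named in the test comments.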

* Re: [PATCH v3 7/7] kcfi: Add regression test suite
  2025-09-13 23:24 ` [PATCH v3 7/7] kcfi: Add regression test suite Kees Cook
  2025-09-13 23:51   ` Andrew Pinski
@ 2025-09-13 23:58   ` Andrew Pinski
  1 sibling, 0 replies; 28+ messages in thread
From: Andrew Pinski @ 2025-09-13 23:58 UTC (permalink / raw)
  To: Kees Cook
  Cc: Qing Zhao, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

On Sat, Sep 13, 2025 at 4:36 PM Kees Cook <kees@kernel.org> wrote:
>
> Adds a test suite for KCFI (Kernel Control Flow Integrity) ABI, covering
> core functionality, optimization and code generation, addressing,
> architecture-specific KCFI sequence emission, and integration with
> patchable function entry.
>
> Tests can be run via:
>   make check-c RUNTESTFLAGS='kcfi.exp'
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/kcfi/kcfi-adjacency.c: New test.
>         * gcc.dg/kcfi/kcfi-basics.c: New test.
>         * gcc.dg/kcfi/kcfi-call-sharing.c: New test.
>         * gcc.dg/kcfi/kcfi-cold-partition.c: New test.
>         * gcc.dg/kcfi/kcfi-complex-addressing.c: New test.
>         * gcc.dg/kcfi/kcfi-ipa-robustness.c: New test.
>         * gcc.dg/kcfi/kcfi-move-preservation.c: New test.
>         * gcc.dg/kcfi/kcfi-no-sanitize-inline.c: New test.
>         * gcc.dg/kcfi/kcfi-no-sanitize.c: New test.
>         * gcc.dg/kcfi/kcfi-offset-validation.c: New test.
>         * gcc.dg/kcfi/kcfi-patchable-basic.c: New test.
>         * gcc.dg/kcfi/kcfi-patchable-entry-only.c: New test.
>         * gcc.dg/kcfi/kcfi-patchable-large.c: New test.
>         * gcc.dg/kcfi/kcfi-patchable-medium.c: New test.
>         * gcc.dg/kcfi/kcfi-patchable-prefix-only.c: New test.
>         * gcc.dg/kcfi/kcfi-pic-addressing.c: New test.
>         * gcc.dg/kcfi/kcfi-retpoline-r11.c: New test.
>         * gcc.dg/kcfi/kcfi-runtime.c: New test.
>         * gcc.dg/kcfi/kcfi-tail-calls.c: New test.
>         * gcc.dg/kcfi/kcfi-trap-encoding.c: New test.
>         * gcc.dg/kcfi/kcfi-trap-section.c: New test.
>         * gcc.dg/kcfi/kcfi.exp: New test.
>
> Signed-off-by: Kees Cook <kees@kernel.org>
> ---
>  gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c    |  72 +++++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c       | 108 +++++++++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c |  84 ++++++++++
>  .../gcc.dg/kcfi/kcfi-cold-partition.c         | 136 ++++++++++++++++
>  .../gcc.dg/kcfi/kcfi-complex-addressing.c     | 135 ++++++++++++++++
>  .../gcc.dg/kcfi/kcfi-ipa-robustness.c         |  54 +++++++
>  .../gcc.dg/kcfi/kcfi-move-preservation.c      |  55 +++++++
>  .../gcc.dg/kcfi/kcfi-no-sanitize-inline.c     | 100 ++++++++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c  |  39 +++++
>  .../gcc.dg/kcfi/kcfi-offset-validation.c      |  48 ++++++
>  .../gcc.dg/kcfi/kcfi-patchable-basic.c        |  70 ++++++++
>  .../gcc.dg/kcfi/kcfi-patchable-entry-only.c   |  62 +++++++
>  .../gcc.dg/kcfi/kcfi-patchable-large.c        |  51 ++++++
>  .../gcc.dg/kcfi/kcfi-patchable-medium.c       |  60 +++++++
>  .../gcc.dg/kcfi/kcfi-patchable-prefix-only.c  |  60 +++++++
>  .../gcc.dg/kcfi/kcfi-pic-addressing.c         | 104 ++++++++++++
>  .../gcc.dg/kcfi/kcfi-retpoline-r11.c          |  50 ++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c      | 151 ++++++++++++++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c   | 142 ++++++++++++++++
>  .../gcc.dg/kcfi/kcfi-trap-encoding.c          |  54 +++++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c |  41 +++++
>  gcc/testsuite/gcc.dg/kcfi/kcfi.exp            |  64 ++++++++
>  22 files changed, 1740 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
>  create mode 100644 gcc/testsuite/gcc.dg/kcfi/kcfi.exp
>
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
> new file mode 100644
> index 000000000000..becb47678df0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-adjacency.c
> @@ -0,0 +1,72 @@
> +/* Test KCFI check/transfer adjacency - regression test for instruction
> +   insertion.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +/* This test ensures that KCFI security checks remain immediately adjacent
> +   to their corresponding indirect calls/jumps, with no executable instructions
> +   between the type ID check and the control flow transfer. */
> +
> +/* External function pointers to prevent optimization.  */
> +extern void (*complex_func_ptr)(int, int, int, int);
> +extern int (*return_func_ptr)(int, int);
> +
> +/* Function with complex argument preparation that could tempt
> +   the optimizer to insert instructions between KCFI check and call.  */
> +__attribute__((noinline)) void test_complex_args(int a, int b, int c, int d) {
> +    /* Complex argument expressions that might cause instruction scheduling.  */
> +    complex_func_ptr(a * 2, b + c, d - a, (a << 1) | b);
> +}
> +
> +/* Function with return value handling.  */
> +__attribute__((noinline)) int test_return_value(int x, int y) {
> +    /* Return value handling that shouldn't interfere with adjacency.  */
> +    int result = return_func_ptr(x + 1, y * 2);
> +    return result + 1;
> +}
> +
> +/* Test struct field access that caused issues in try-catch.c.  */
> +struct call_info {
> +    void (*handler)(void);
> +    int status;
> +    int data;
> +};
> +
> +extern struct call_info *global_call_info;
> +
> +__attribute__((noinline)) void test_struct_field_call(void) {
> +    /* This pattern caused adjacency issues before the fix.  */
> +    global_call_info->handler();
> +}
> +
> +/* Test conditional indirect call.  */
> +__attribute__((noinline)) void test_conditional_call(int flag) {
> +    if (flag) {
> +        global_call_info->handler();
> +    }
> +}
> +
> +/* Should have KCFI instrumentation for all indirect calls.  */
> +
> +/* x86_64: Complete KCFI check sequence should be present.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r1[01]d\n\taddl\t[^,]+, %r1[01]d\n\tje\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tud2} { target x86_64-*-* } } } */
> +
> +/* AArch64: Complete KCFI check sequence should be present.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-[0-9]+\]\n\tmov\tw17, #[0-9]+\n\tmovk\tw17, #[0-9]+, lsl #16\n\tcmp\tw16, w17\n\tb\.eq\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tbrk\t#[0-9]+\n\1:\n\tblr\tx[0-9]+} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Complete KCFI check sequence should be present with stack
> +   spilling.  */
> +/* { dg-final { scan-assembler {push\t\{r0, r1\}\n\tldr\tr0, \[r[0-9]+, #-[0-9]+\]\n\tmovw\tr1, #[0-9]+\n\tmovt\tr1, #[0-9]+\n\tcmp\tr0, r1\n\tpop\t\{r0, r1\}\n\tbeq\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\t#[0-9]+\n\.Lkcfi_call[0-9]+:\n\tblx\tr[0-9]+} { target arm32 } } } */
> +
> +/* RISC-V: Complete KCFI check sequence should be present.  */
> +/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, \.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tebreak} { target riscv*-*-* } } } */
> +
> +/* Should have trap section with entries.  */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
> +
> +/* AArch64 should NOT have trap section (uses brk immediate instead) */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit should NOT have trap section (uses udf immediate instead) */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
> new file mode 100644
> index 000000000000..b0a9e11f1f3c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-basics.c
> @@ -0,0 +1,108 @@
> +/* Test basic KCFI functionality - preamble generation.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +/* Extern function declarations - should NOT get KCFI preambles.  */
> +extern void external_func(void);
> +extern int external_func_int(int x);
> +
> +void regular_function(int x) {
> +    /* This should get KCFI preamble.  */
> +}
> +
> +void static_target_function(int x) {
> +    /* Target function that can be called indirectly.  */
> +}
> +
> +__attribute__((nocf_check))
> +void nocf_check_function(int x) {
> +    /* This function has nocf_check attribute - should NOT get KCFI preamble.  */
> +}
> +
> +static void static_caller(void) {
> +    /* Static function that makes an indirect call.
> +       It should NOT get a KCFI preamble (not address-taken),
> +       but must generate a KCFI check for the indirect call.  */
> +    void (*local_ptr)(int) = static_target_function;
> +    local_ptr(42);  /* This should generate KCFI check.  */
> +}
> +
> +/* Make these functions address-taken via global pointers.  */
> +void (*func_ptr)(int) = regular_function;
> +void (*ext_ptr)(void) = external_func;
> +void (__attribute__((nocf_check)) *nocf_ptr)(int) = nocf_check_function;
> +
> +int main() {
> +    func_ptr(42);
> +    ext_ptr();        /* Indirect call to external_func.  */
> +    external_func_int(10);  /* Direct call to external_func_int.  */
> +    static_caller();  /* Direct call to static function.  */
> +    return 0;
> +}
> +
> +/* Verify KCFI preamble exists for regular_function.  */
> +/* { dg-final { scan-assembler {__cfi_regular_function:} } } */
> +
> +/* Verify KCFI preamble symbol comes before the function's own symbol.  */
> +/* { dg-final { scan-assembler {__cfi_regular_function:.*regular_function:} } } */
> +
> +/* Target function should have preamble (address-taken).  */
> +/* { dg-final { scan-assembler {__cfi_static_target_function:} } } */
> +
> +/* Static caller should NOT have preamble (it's only called directly,
> +   not address-taken). */
> +/* { dg-final { scan-assembler-not {__cfi_static_caller:} } } */
> +
> +/* Function with nocf_check attribute should NOT have preamble.  */
> +/* { dg-final { scan-assembler-not {__cfi_nocf_check_function:} } } */
> +
> +/* x86_64: Verify type ID in preamble (after NOPs, before function label) */
> +/* { dg-final { scan-assembler {__cfi_regular_function:\n\t+nop\n.*\n\t+movl\t+\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
> +
> +/* AArch64: Verify type ID word in preamble.  */
> +/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Verify type ID word in preamble.  */
> +/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target arm32 } } } */
> +
> +/* RISC-V: Verify type ID word in preamble */
> +/* { dg-final { scan-assembler {__cfi_regular_function:\n\t\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
> +
> +/* x86_64: Static function should generate complete KCFI check sequence.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d\n\tje\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tud2\n.*\n\1:\n\tcall} { target x86_64-*-* } } } */
> +
> +/* AArch64: Static function should generate complete KCFI check sequence.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]\n\tmov\tw17, #[0-9]+\n\tmovk\tw17, #[0-9]+, lsl #16\n\tcmp\tw16, w17\n\tb\.eq\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tbrk\t#[0-9]+\n\1:\n\tblr} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Static function should generate complete KCFI check sequence
> +   with stack spilling.  */
> +/* { dg-final { scan-assembler {push\t\{r0, r1\}\n\tldr\tr0, \[r[0-9]+, #-4\]\n\tmovw\tr1, #[0-9]+\n\tmovt\tr1, #[0-9]+\n\tcmp\tr0, r1\n\tpop\t\{r0, r1\}\n\tbeq\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\t#[0-9]+\n\.Lkcfi_call[0-9]+:\n\tblx\tr[0-9]+} { target arm32 } } } */
> +
> +/* RISC-V: Static function should generate KCFI check for indirect call.  */
> +/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, (\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tebreak\n\t\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry[0-9]+:\n\t\.4byte\t\.Lkcfi_trap[0-9]+-\.Lkcfi_entry[0-9]+\n\t\.text\n\1:\n\tjalr} { target riscv*-*-* } } } */
> +
> +/* Extern functions should NOT get KCFI preambles.  */
> +/* { dg-final { scan-assembler-not {__cfi_external_func:} } } */
> +/* { dg-final { scan-assembler-not {__cfi_external_func_int:} } } */
> +
> +/* Local functions should NOT get __kcfi_typeid_ symbols.  */
> +/* Only external declarations that are address-taken should get __kcfi_typeid_ */
> +/* { dg-final { scan-assembler-not {__kcfi_typeid_regular_function} } } */
> +/* { dg-final { scan-assembler-not {__kcfi_typeid_main} } } */
> +
> +/* External address-taken functions should get __kcfi_typeid_ symbols.  */
> +/* { dg-final { scan-assembler {__kcfi_typeid_external_func} } } */
> +
> +/* External functions that are only called directly should NOT get
> +   __kcfi_typeid_ symbols.  */
> +/* { dg-final { scan-assembler-not {__kcfi_typeid_external_func_int} } } */
> +
> +/* Should have trap section for KCFI checks.  */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
> +
> +/* AArch64 should NOT have trap section (uses brk immediate instead).  */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit should NOT have trap section (uses udf immediate instead).  */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
> new file mode 100644
> index 000000000000..f34d5f88547f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-call-sharing.c
> @@ -0,0 +1,84 @@
> +/* Test KCFI check sharing bug - optimizer incorrectly shares KCFI checks
> +   between different function types.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +/* Reproduce the pattern from Linux kernel internal_create_group where:
> +   - Two different function pointer types (is_visible vs is_bin_visible).
> +   - Both get loaded into the same register (%rcx).
> +   - Optimizer creates shared KCFI check with wrong type ID.
> +   - This causes CFI failures in production kernel.  */
> +
> +struct kobject { int dummy; };
> +struct attribute { int dummy; };
> +struct bin_attribute { int dummy; };
> +
> +struct attribute_group {
> +    const char *name;
> +    // Type ID A
> +    int (*is_visible)(struct kobject *, struct attribute *, int);
> +    // Type ID B
> +    int (*is_bin_visible)(struct kobject *, const struct bin_attribute *, int);
> +    struct attribute **attrs;
> +    const struct bin_attribute **bin_attrs;
> +};
> +
> +/* Function that mimics __first_visible from kernel - gets inlined into
> +   caller.  */
> +static int __first_visible(const struct attribute_group *grp, struct kobject *kobj)
> +{
> +    /* Path 1: Call is_visible function pointer.  */
> +    if (grp->attrs && grp->attrs[0] && grp->is_visible)
> +        return grp->is_visible(kobj, grp->attrs[0], 0);
> +
> +    /* Path 2: Call is_bin_visible function pointer.  */
> +    if (grp->bin_attrs && grp->bin_attrs[0] && grp->is_bin_visible)
> +        return grp->is_bin_visible(kobj, grp->bin_attrs[0], 0);
> +
> +    return 0;
> +}
> +
> +/* Main function that triggers the optimization bug.  */
> +int test_kcfi_check_sharing(struct kobject *kobj, const struct attribute_group *grp)
> +{
> +    /* This should inline __first_visible and create the problematic pattern where:
> +       1. Both function pointers get loaded into same register.
> +       2. Optimizer shares KCFI check between them.
> +       3. Uses wrong type ID for one of the calls.  */
> +    return __first_visible(grp, kobj);
> +}
> +
> +/* Each indirect call should have its own KCFI check with correct type ID.
> +
> +   Should see:
> +   1. KCFI check for is_visible call with is_visible type ID.
> +   2. KCFI check for is_bin_visible call with is_bin_visible type ID.  */
> +
> +/* Verify we have TWO different KCFI check sequences.  */
> +/* Each check should have different type ID constants.  */
> +/* x86: { dg-final { scan-assembler-times {movl\s+\$-?[0-9]+,\s+%r10d} 2 { target i?86-*-* x86_64-*-* } } } */
> +/* AArch64: { dg-final { scan-assembler-times {mov\s+w17, #[0-9]+} 2 { target aarch64*-*-* } } } */
> +/* ARM 32-bit: { dg-final { scan-assembler-times {movw\s+r1, #[0-9]+} 2 { target arm32 } } } */
> +/* RISC-V: { dg-final { scan-assembler-times {lui\tt2, [0-9]+} 2 { target riscv*-*-* } } } */
> +
> +/* Verify the checks use DIFFERENT type IDs (not shared).
> +   We should NOT see the same type ID used twice - that would indicate
> +   sharing bug.  */
> +/* x86: { dg-final { scan-assembler-not {movl\s+\$(-?[0-9]+),\s+%r10d.*movl\s+\$\1,\s+%r10d} { target i?86-*-* x86_64-*-* } } } */
> +/* AArch64: { dg-final { scan-assembler-not {mov\s+w17, #([0-9]+).*mov\s+w17, #\1} { target aarch64*-*-* } } } */
> +/* ARM 32-bit: { dg-final { scan-assembler-not {movw\s+r1, #([0-9]+).*movw\s+r1, #\1} { target arm32 } } } */
> +/* RISC-V: { dg-final { scan-assembler-not {lui\s+t2, ([0-9]+)\s.*lui\s+t2, \1\s} { target riscv*-*-* } } } */
> +
> +/* Verify each call follows its own check (not shared) */
> +/* Should have 2 separate trap instructions.  */
> +/* x86: { dg-final { scan-assembler-times {ud2} 2 { target i?86-*-* x86_64-*-* } } } */
> +/* AArch64: { dg-final { scan-assembler-times {brk\s+#[0-9]+} 2 { target aarch64*-*-* } } } */
> +/* ARM 32-bit: { dg-final { scan-assembler-times {udf\s+#[0-9]+} 2 { target arm32 } } } */
> +/* RISC-V: { dg-final { scan-assembler-times {ebreak} 2 { target riscv*-*-* } } } */
> +
> +/* Verify 2 separate call sites.  */
> +/* x86: { dg-final { scan-assembler-times {jmp\s+\*%[a-z0-9]+} 2 { target i?86-*-* x86_64-*-* } } } */
> +/* AArch64: The pattern matches br (tail call) only, not blr.  */
> +/* AArch64: { dg-final { scan-assembler-times {br\tx[0-9]+} 2 { target aarch64*-*-* } } } */
> +/* ARM 32-bit: { dg-final { scan-assembler-times {bx\s+(?:r[0-9]+|ip)} 2 { target arm32 } } } */
> +/* RISC-V: { dg-final { scan-assembler-times {jalr\t[a-z0-9]+} 2 { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
> new file mode 100644
> index 000000000000..17def558ada4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-cold-partition.c
> @@ -0,0 +1,136 @@
> +/* Test KCFI cold function and cold partition behavior.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-freorder-blocks-and-partition" { target freorder } } */
> +
> +void regular_function(void) {
> +    /* Regular function should get preamble.  */
> +}
> +
> +/* Cold-attributed function should STILL get preamble (it's a regular
> +   function, just marked cold).  */
> +__attribute__((cold))
> +void cold_attributed_function(void) {
> +    /* This function has cold attribute but should still get KCFI preamble.  */
> +}
> +
> +/* Hot-attributed function should get preamble.  */
> +__attribute__((hot))
> +void hot_attributed_function(void) {
> +    /* This function is explicitly hot and should get KCFI preamble.  */
> +}
> +
> +/* Global to prevent optimization from eliminating cold paths.  */
> +extern void abort(void);
> +
> +/* Additional function to test that normal functions still get preambles.  */
> +__attribute__((noinline))
> +int another_regular_function(int x) {
> +    return x + 42;
> +}
> +
> +/* Function designed to generate cold partitions under optimization.  */
> +__attribute__((noinline))
> +void function_with_cold_partition(int condition) {
> +    /* Hot path - very likely to execute.  */
> +    if (__builtin_expect(condition == 42, 1)) {
> +        /* Simple hot path that optimizer will keep inline.  */
> +        return;
> +    }
> +
> +    /* Cold paths that actually do something to prevent elimination.  */
> +    if (__builtin_expect(condition < 0, 0)) {
> +        /* Error path 1 - call abort to prevent elimination.  */
> +        abort();
> +    }
> +
> +    if (__builtin_expect(condition > 1000000, 0)) {
> +        /* Error path 2 - call abort to prevent elimination.  */
> +        abort();
> +    }
> +
> +    if (__builtin_expect(condition == 999999, 0)) {
> +        /* Error path 3 - more substantial cold code.  */
> +        volatile int sum = 0;
> +        for (volatile int i = 0; i < 100; i++) {
> +            sum += i * condition;
> +        }
> +        if (sum > 0)
> +            abort();
> +    }
> +
> +    /* More cold paths - switch with many unlikely cases.  */
> +    switch (condition) {
> +        case 1000001: case 1000002: case 1000003: case 1000004: case 1000005:
> +        case 1000006: case 1000007: case 1000008: case 1000009: case 1000010:
> +            /* Each case does some work before abort.  */
> +            volatile int work = condition * 2;
> +            if (work > 0) abort();
> +            break;
> +        default:
> +            if (condition != 42) {
> +                /* Fallback cold path - substantial work.  */
> +                volatile int result = 0;
> +                for (volatile int j = 0; j < condition % 50; j++) {
> +                    result += j;
> +                }
> +                if (result >= 0) abort();
> +            }
> +    }
> +}
> +
> +/* Test function pointers to ensure address-taken detection works.  */
> +void test_function_pointers(void) {
> +    void (*regular_ptr)(void) = regular_function;
> +    void (*cold_ptr)(void) = cold_attributed_function;
> +    void (*hot_ptr)(void) = hot_attributed_function;
> +
> +    regular_ptr();
> +    cold_ptr();
> +    hot_ptr();
> +}
> +
> +int main() {
> +    regular_function();
> +    cold_attributed_function();
> +    hot_attributed_function();
> +    function_with_cold_partition(42); /* Normal case - stay in hot path.  */
> +    another_regular_function(5);
> +    test_function_pointers();
> +    return 0;
> +}
> +
> +/* Regular function should have preamble.  */
> +/* { dg-final { scan-assembler "__cfi_regular_function:" } } */
> +
> +/* Cold-attributed function should STILL have preamble (it's a legitimate function) */
> +/* { dg-final { scan-assembler "__cfi_cold_attributed_function:" } } */
> +
> +/* Hot-attributed function should have preamble.  */
> +/* { dg-final { scan-assembler "__cfi_hot_attributed_function:" } } */
> +
> +/* Function that generates cold partitions should have preamble for main entry.  */
> +/* { dg-final { scan-assembler "__cfi_function_with_cold_partition:" } } */
> +
> +/* Address-taken functions should have preambles.  */
> +/* { dg-final { scan-assembler "__cfi_test_function_pointers:" } } */
> +
> +/* The function should generate a .cold partition (only on targets that support freorder) */
> +/* { dg-final { scan-assembler "function_with_cold_partition\\.cold:" { target freorder } } } */
> +
> +/* The .cold partition should NOT get a __cfi_ preamble since it's never
> +   reached via indirect calls.  */
> +/* { dg-final { scan-assembler-not "__cfi_function_with_cold_partition\\.cold:" { target freorder } } } */
> +
> +/* Additional regular function should get preamble.  */
> +/* { dg-final { scan-assembler "__cfi_another_regular_function:" } } */
> +
> +/* Test coverage summary:
> +   1. Cold-attributed function (__attribute__((cold))): SHOULD get preamble
> +   2. Cold partition (-freorder-blocks-and-partition): should NOT get preamble
> +   3. IPA split .part function (split_part=true): Logic in place, would skip if triggered
> +
> +   Note: IPA function splitting (creating .part functions with split_part=true) requires
> +   specific optimization conditions that are difficult to trigger reliably in tests.
> +   The KCFI logic correctly handles this case using the split_part flag check.
> +*/
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
> new file mode 100644
> index 000000000000..b9a8955b0899
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-complex-addressing.c
> @@ -0,0 +1,135 @@
> +/* Test KCFI with complex addressing modes (structure members, array
> +   elements). This is a regression test for the change_address_1 RTL
> +   error that occurred when target_addr was PLUS(reg, offset) instead
> +   of a simple register.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +struct function_table {
> +    int (*callback1)(int);
> +    int (*callback2)(int, int);
> +    void (*callback3)(void);
> +    int (*callback4)(void *, void *, void *, void *, void *, void *);
> +    int data;
> +};
> +
> +static int handler1(int x) {
> +    return x * 2;
> +}
> +
> +static int handler2(int x, int y) {
> +    return x + y;
> +}
> +
> +static void handler3(void) {
> +    /* Empty handler.  */
> +}
> +
> +/* Test indirect calls through structure members - this creates
> +   PLUS(reg, offset) addressing.  */
> +int test_struct_members(struct function_table *table) {
> +    int result = 0;
> +
> +    /* These indirect calls will generate complex addressing modes:
> +     * call *(%rdi)          - callback1 at offset 0
> +     * call *8(%rdi)         - callback2 at offset 8
> +     * call *16(%rdi)        - callback3 at offset 16
> +     * KCFI must handle PLUS(reg, struct_offset) + kcfi_offset.  */
> +
> +    result += table->callback1(10);
> +    result += table->callback2(5, 7);
> +    table->callback3();
> +
> +    return result;
> +}
> +
> +/* Test indirect calls through array elements - another source of
> +   complex addressing.  */
> +typedef int (*func_array_t)(int);
> +
> +int test_array_elements(func_array_t functions[], int index) {
> +    /* This creates addressing like MEM[PLUS(PLUS(reg, index*8), 0)]
> +       which should be simplified to MEM[PLUS(reg, index*8)].  */
> +    return functions[index](42);
> +}
> +
> +/* Test with global structure.  */
> +static struct function_table global_table = {
> +    .callback1 = handler1,
> +    .callback2 = handler2,
> +    .callback3 = handler3,
> +    .data = 100
> +};
> +
> +int test_global_struct(void) {
> +    /* Access through global structure - may generate different
> +       addressing patterns.  */
> +    return global_table.callback1(20) + global_table.callback2(3, 4);
> +}
> +
> +/* Test nested structure access.  */
> +struct nested_table {
> +    struct function_table inner;
> +    int extra_data;
> +};
> +
> +int test_nested_struct(struct nested_table *nested) {
> +    /* Even more complex addressing: nested structure member access.  */
> +    return nested->inner.callback1(15);
> +}
> +
> +int test_many_args(void *one, void *two, void *three, void *four, void *five, void *six)
> +{
> +    return (unsigned long)one + (unsigned long)two + (unsigned long)three
> +          + (unsigned long)four + (unsigned long)five + (unsigned long)six;
> +}
> +
> +int main() {
> +    struct function_table local_table = {
> +        .callback1 = handler1,
> +        .callback2 = handler2,
> +        .callback3 = handler3,
> +        .callback4 = test_many_args,
> +        .data = 50
> +    };
> +
> +    func_array_t func_array[] = { handler1, handler1, handler1 };
> +
> +    int result = 0;
> +    result += test_struct_members(&local_table);
> +    result += test_array_elements(func_array, 1);
> +    result += test_global_struct();
> +
> +    struct nested_table nested = { .inner = local_table, .extra_data = 200 };
> +    result += test_nested_struct(&nested);
> +
> +    result += local_table.callback4(handler1, handler2, handler3, &result, main, &local_table);
> +
> +    return result;
> +}
> +
> +/* Verify that all address-taken functions get KCFI preambles.  */
> +/* { dg-final { scan-assembler {__cfi_handler1:} } } */
> +/* { dg-final { scan-assembler {__cfi_handler2:} } } */
> +/* { dg-final { scan-assembler {__cfi_handler3:} } } */
> +/* { dg-final { scan-assembler {__cfi_test_many_args:} } } */
> +
> +/* x86_64: Verify KCFI checks are generated for indirect calls through
> +   complex addressing.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {ud2} { target x86_64-*-* } } } */
> +
> +/* AArch64: Verify KCFI checks for complex addressing.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler {brk} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Verify KCFI checks for complex addressing with stack spilling.  */
> +/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
> +/* { dg-final { scan-assembler {udf} { target arm32 } } } */
> +
> +/* RISC-V: Verify KCFI check sequence for complex addressing.  */
> +/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, \.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tebreak} { target riscv*-*-* } } } */
> +
> +/* Should have trap section for x86 and RISC-V only.  */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
> new file mode 100644
> index 000000000000..a43bcd4f3e3f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-ipa-robustness.c
> @@ -0,0 +1,54 @@
> +/* Test KCFI IPA pass robustness with compiler-generated constructs.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include <stddef.h>
> +
> +/* Test various compiler-generated constructs that could confuse IPA pass.  */
> +
> +/* static_assert - this was causing the original crash.  */
> +typedef struct {
> +    int field1;
> +    char field2;
> +} test_struct_t;
> +
> +static_assert(offsetof(test_struct_t, field1) == 0, "layout check 1");
> +static_assert(offsetof(test_struct_t, field2) == 4, "layout check 2");
> +static_assert(sizeof(test_struct_t) >= 5, "size check");
> +
> +/* Regular functions that should get KCFI analysis.  */
> +void regular_function(void) {
> +    /* Should get KCFI preamble.  */
> +}
> +
> +static void static_function(void) {
> +    /* With -O2: correctly identified as not address-taken, no preamble.  */
> +}
> +
> +void address_taken_function(void) {
> +    /* Should get KCFI preamble (address taken below).  */
> +}
> +
> +/* Function pointer to create address-taken scenario.  */
> +void (*func_ptr)(void) = address_taken_function;
> +
> +/* More static_asserts mixed with function definitions.  */
> +static_assert(sizeof(void*) >= 4, "pointer size check");
> +
> +int main(void) {
> +    regular_function();    /* Direct call.  */
> +    static_function();     /* Direct call to static.  */
> +    func_ptr();            /* Indirect call.  */
> +
> +    static_assert(sizeof(int) == 4, "int size check");
> +
> +    return 0;
> +}
> +
> +/* Verify KCFI preambles are generated appropriately.  */
> +/* { dg-final { scan-assembler "__cfi_regular_function:" } } */
> +/* { dg-final { scan-assembler "__cfi_address_taken_function:" } } */
> +/* { dg-final { scan-assembler "__cfi_main:" } } */
> +
> +/* With -O2: static_function correctly identified as not address-taken.  */
> +/* { dg-final { scan-assembler-not "__cfi_static_function:" } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
> new file mode 100644
> index 000000000000..50029d136716
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-move-preservation.c
> @@ -0,0 +1,55 @@
> +/* Test that KCFI preserves function pointer moves at -O2 optimization.
> +   This test ensures that the combine pass doesn't incorrectly optimize away
> +   the move instruction needed to transfer function pointers from argument
> +   registers to the target registers used by KCFI patterns.  */
> +
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2 -std=gnu11" } */
> +
> +static int called_count = 0;
> +
> +/* Function taking one argument, returning void.  */
> +static __attribute__((noinline)) void increment_void(int *counter)
> +{
> +    (*counter)++;
> +}
> +
> +/* Function taking one argument, returning int.  */
> +static __attribute__((noinline)) int increment_int(int *counter)
> +{
> +    (*counter)++;
> +    return *counter;
> +}
> +
> +/* Don't allow the compiler to inline the calls.  */
> +static __attribute__((noinline)) void indirect_call(void (*func)(int *))
> +{
> +    func(&called_count);
> +}
> +
> +int main(void)
> +{
> +    /* This should work - matching prototype.  */
> +    indirect_call(increment_void);
> +
> +    /* This should trap - mismatched prototype.  */
> +    indirect_call((void *)increment_int);
> +
> +    return 0;
> +}
> +
> +/* Verify complete KCFI check sequence with preserved move instruction. At
> +   -O2, the combine pass previously optimized away the move from %rdi to %rax,
> +   breaking KCFI. Verify the full sequence is preserved.  */
> +
> +/* x86_64: Complete KCFI sequence with move preservation and indirect jump.  */
> +/* { dg-final { scan-assembler {(indirect_call):.*\n.*movq\s+%rdi,\s+(%rax)\n.*movl\s+\$[0-9]+,\s+%r10d\n\taddl\s+-4\(\2\),\s+%r10d\n\tje\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tud2.*\.Lkcfi_call[0-9]+:\n\tjmp\s+\*\2.*\.size\s+\1,\s+\.-\1} { target x86_64-*-* } } } */
> +
> +/* AArch64: Complete KCFI sequence with move preservation and indirect branch.  */
> +/* { dg-final { scan-assembler {(indirect_call):.*\n.*mov\s+(x[0-9]+),\s+x0\n.*ldur\s+w16,\s+\[\2,\s+#-4\]\n\tmov\s+w17,\s+#[0-9]+\n\tmovk\s+w17,\s+#[0-9]+,\s+lsl\s+#16\n\tcmp\s+w16,\s+w17\n\tb\.eq\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tbrk\s+#[0-9]+.*\.Lkcfi_call[0-9]+:\n\tbr\s+\2.*\.size\s+\1,\s+\.-\1} { target aarch64*-*-* } } } */
> +
> +/* ARM32: Complete KCFI sequence with move preservation and indirect branch.  */
> +/* { dg-final { scan-assembler {(indirect_call):.*\n.*mov\s+(r[0-9]+),\s+r0\n.*push\s+\{r0,\s+r1\}\n\tldr\s+r0,\s+\[\2,\s+#-4\]\n\tmovw\s+r1,\s+#[0-9]+\n\tmovt\s+r1,\s+#[0-9]+\n\tcmp\s+r0,\s+r1\n\tpop\s+\{r0,\s+r1\}\n\tbeq\s+\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\s+#[0-9]+.*\.Lkcfi_call[0-9]+:\n\tbx\s+\2.*\.size\s+\1,\s+\.-\1} { target arm32 } } } */
> +
> +/* RISC-V: Complete KCFI sequence with move preservation and indirect jump.  */
> +/* { dg-final { scan-assembler {(indirect_call):.*mv\s+(a[0-9]+),a0.*lw\s+t1,\s+-4\(\2\).*lui\s+t2,\s+[0-9]+.*addiw\s+t2,\s+t2,\s+-?[0-9]+.*beq\s+t1,\s+t2,\s+\.Lkcfi_call[0-9]+.*ebreak.*jalr\s+zero,\s+\2,\s+0.*\.size\s+\1,\s+\.-\1} { target riscv64-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
> new file mode 100644
> index 000000000000..c43d8014ff2d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize-inline.c
> @@ -0,0 +1,100 @@
> +/* Test that no_sanitize("kcfi") attribute is preserved during inlining.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +extern void external_side_effect(int value);
> +
> +/* Regular function (should get KCFI checks).  */
> +__attribute__((noinline))
> +void normal_function(void (*callback)(int))
> +{
> +    /* This indirect call must generate KCFI checks.  */
> +    callback(300);
> +    external_side_effect(300);
> +}
> +
> +/* Non-inlined function marked with no_sanitize("kcfi") (should NOT get KCFI checks).  */
> +__attribute__((noinline, no_sanitize("kcfi")))
> +void sensitive_non_inline_function(void (*callback)(int))
> +{
> +    /* This indirect call should NOT generate KCFI checks.  */
> +    callback(100);
> +    external_side_effect(100);
> +}
> +
> +/* Function marked with both no_sanitize("kcfi") and always_inline.  */
> +__attribute__((always_inline, no_sanitize("kcfi")))
> +static inline void sensitive_inline_function(void (*callback)(int))
> +{
> +    /* This indirect call should NOT generate KCFI checks when inlined.  */
> +    callback(42);
> +    external_side_effect(42);
> +}
> +
> +/* Explicit wrapper for testing sensitive_inline_function behavior.  */
> +__attribute__((noinline))
> +void wrap_sensitive_inline(void (*callback)(int))
> +{
> +    sensitive_inline_function(callback);
> +}
> +
> +/* Function marked with only always_inline (should get KCFI checks).  */
> +__attribute__((always_inline))
> +static inline void normal_inline_function(void (*callback)(int))
> +{
> +    /* This indirect call must generate KCFI checks when inlined.  */
> +    callback(200);
> +    external_side_effect(200);
> +}
> +
> +/* Explicit wrapper for testing normal_inline_function behavior.  */
> +__attribute__((noinline))
> +void wrap_normal_inline(void (*callback)(int))
> +{
> +    normal_inline_function(callback);
> +}
> +
> +void test_callback(int value)
> +{
> +    external_side_effect(value);
> +}
> +
> +static void (*volatile function_pointer)(int) = test_callback;
> +
> +int main(void)
> +{
> +    void (*fn_ptr)(int) = function_pointer;
> +
> +    normal_function(fn_ptr);
> +    wrap_normal_inline(fn_ptr);
> +    sensitive_non_inline_function(fn_ptr);
> +    wrap_sensitive_inline(fn_ptr);
> +
> +    return 0;
> +}
> +
> +/* Verify correct number of KCFI checks: exactly 2.  */
> +/* { dg-final { scan-assembler-times {ud2} 2 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times {brk\s+#[0-9]+} 2 { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-times {udf\s+#[0-9]+} 2 { target arm32 } } } */
> +/* { dg-final { scan-assembler-times {ebreak} 2 { target riscv*-*-* } } } */
> +
> +/* Positive controls: these should have KCFI checks.  */
> +/* { dg-final { scan-assembler {normal_function:.*ud2.*\.size\s+normal_function} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {wrap_normal_inline:.*ud2.*\.size\s+wrap_normal_inline} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {normal_function:.*brk\s+#[0-9]+.*\.size\s+normal_function} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler {wrap_normal_inline:.*brk\s+#[0-9]+.*\.size\s+wrap_normal_inline} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler {normal_function:.*udf\t#[0-9]+.*\.size\s+normal_function} { target arm32 } } } */
> +/* { dg-final { scan-assembler {wrap_normal_inline:.*udf\t#[0-9]+.*\.size\s+wrap_normal_inline} { target arm32 } } } */
> +/* { dg-final { scan-assembler {normal_function:.*ebreak.*\.size\s+normal_function} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler {wrap_normal_inline:.*ebreak.*\.size\s+wrap_normal_inline} { target riscv*-*-* } } } */
> +
> +/* Negative controls: these should NOT have KCFI checks.  */
> +/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*ud2.*\.size\s+sensitive_non_inline_function} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*ud2.*\.size\s+wrap_sensitive_inline} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*brk\s+#[0-9]+.*\.size\s+sensitive_non_inline_function} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*brk\s+#[0-9]+.*\.size\s+wrap_sensitive_inline} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*udf\t#[0-9]+.*\.size\s+sensitive_non_inline_function} { target arm32 } } } */
> +/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*udf\t#[0-9]+.*\.size\s+wrap_sensitive_inline} { target arm32 } } } */
> +/* { dg-final { scan-assembler-not {sensitive_non_inline_function:.*ebreak.*\.size\s+sensitive_non_inline_function} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler-not {wrap_sensitive_inline:.*ebreak.*\.size\s+wrap_sensitive_inline} { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
> new file mode 100644
> index 000000000000..6f1a558c0820
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-no-sanitize.c
> @@ -0,0 +1,39 @@
> +/* Test KCFI with no_sanitize attribute.  */
> +/* { dg-do compile } */
> +
> +void target_function(void) {
> +    /* This should get KCFI preamble.  */
> +}
> +
> +void caller_with_checks(void) {
> +    /* This function should generate KCFI checks.  */
> +    void (*func_ptr)(void) = target_function;
> +    func_ptr();
> +}
> +
> +__attribute__((no_sanitize("kcfi")))
> +void caller_no_checks(void) {
> +    /* This function should NOT generate KCFI checks due to no_sanitize.  */
> +    void (*func_ptr)(void) = target_function;
> +    func_ptr();
> +}
> +
> +int main() {
> +    caller_with_checks();    /* This should generate checks inside.  */
> +    caller_no_checks();      /* This should NOT generate checks inside.  */
> +    return 0;
> +}
> +
> +/* All functions should get preambles regardless of no_sanitize.  */
> +/* { dg-final { scan-assembler "__cfi_target_function:" } } */
> +/* { dg-final { scan-assembler "__cfi_caller_with_checks:" } } */
> +/* { dg-final { scan-assembler "__cfi_caller_no_checks:" } } */
> +/* { dg-final { scan-assembler "__cfi_main:" } } */
> +
> +/* caller_with_checks() should generate KCFI check.
> +   caller_no_checks() should NOT generate KCFI check (no_sanitize).
> +   So a total of exactly 1 KCFI check in the entire program.  */
> +/* { dg-final { scan-assembler-times {addl\t-4\(%r[ad]x\), %r1[01]d} 1 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times {ldur\tw16, \[x[0-9]+, #-4\]} 1 { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-times {ldr\tr0, \[r[0-9]+, #-4\]} 1 { target arm32 } } } */
> +/* { dg-final { scan-assembler-times {lw\tt1, -[0-9]+\(} 1 { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
> new file mode 100644
> index 000000000000..f93a042d9752
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-offset-validation.c
> @@ -0,0 +1,48 @@
> +/* Test KCFI call-site offset validation across architectures.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void target_func_a(void) { }
> +void target_func_b(int x) { }
> +void target_func_c(int x, int y) { }
> +
> +int main() {
> +    void (*ptr_a)(void) = target_func_a;
> +    void (*ptr_b)(int) = target_func_b;
> +    void (*ptr_c)(int, int) = target_func_c;
> +
> +    /* Multiple indirect calls.  */
> +    ptr_a();
> +    ptr_b(1);
> +    ptr_c(1, 2);
> +
> +    return 0;
> +}
> +
> +/* Should have KCFI preambles for all functions.  */
> +/* { dg-final { scan-assembler "__cfi_target_func_a:" } } */
> +/* { dg-final { scan-assembler "__cfi_target_func_b:" } } */
> +/* { dg-final { scan-assembler "__cfi_target_func_c:" } } */
> +
> +/* x86_64: All call sites should use -4 offset for KCFI type ID loads, even
> +   with -falign-functions=16 (we're not using patchable entries here).  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
> +
> +/* AArch64: All call sites should use -4 offset.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: All call sites should use -4 offset with stack spilling.  */
> +/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
> +
> +/* RISC-V: All call sites should use -4 offset.  */
> +/* { dg-final { scan-assembler {lw\tt1, -4\(} { target riscv*-*-* } } } */
> +
> +/* x86_64 and RISC-V: Should have trap section.  */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
> +
> +/* AArch64 should NOT have trap section (uses brk immediate instead).  */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit should NOT have trap section (uses udf immediate instead).  */
> +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
> new file mode 100644
> index 000000000000..a2d0ef0c6ff6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-basic.c
> @@ -0,0 +1,70 @@
> +/* Test KCFI with patchable function entries - basic case.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fpatchable-function-entry=5,2" } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void test_function(int x) {
> +    /* Function should get both KCFI preamble and patchable entries.  */
> +}
> +
> +int main() {
> +    test_function(42);
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_test_function:" } } */
> +
> +/* Should have patchable function entry section.  */
> +/* { dg-final { scan-assembler "__patchable_function_entries" } } */
> +
> +/* x86_64: Should have exactly 2 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
> +
> +/* x86_64: Should have exactly 3 entry NOPs between .cfi_startproc and
> +   pushq.  */
> +/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
> +
> +/* x86_64: KCFI should have exactly 9 alignment NOPs between __cfi_ and movl.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl} { target x86_64-*-* } } } */
> +
> +/* x86_64: Validate KCFI type ID is present.  */
> +/* { dg-final { scan-assembler {movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
> +
> +/* AArch64: Should have exactly 2 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
> +
> +/* AArch64: Should have exactly 3 entry NOPs between .cfi_startproc and
> +   stack manipulation.  */
> +/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*sub\t*sp} { target aarch64*-*-* } } } */
> +
> +/* AArch64: KCFI should have alignment NOPs then .word immediate.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
> +
> +/* AArch64: Validate clean KCFI boundary - .word then immediate end/size.  */
> +/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Should have exactly 2 prefix NOPs between .LPFE and .syntax.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.syntax} { target arm32 } } } */
> +
> +/* ARM 32-bit: Should have exactly 3 entry NOPs after function label.  */
> +/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
> +
> +/* ARM 32-bit: KCFI should have alignment NOPs then .word immediate.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target arm32 } } } */
> +
> +/* ARM 32-bit: Validate clean KCFI boundary - .word then immediate end/size.  */
> +/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target arm32 } } } */
> +
> +/* RISC-V: Should have exactly 2 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
> +
> +/* RISC-V: Should have exactly 3 entry NOPs before .cfi_startproc followed
> +   by addi sp.  */
> +/* { dg-final { scan-assembler {nop\n\t*nop\n\t*nop\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*addi\t*sp} { target riscv*-*-* } } } */
> +
> +/* RISC-V: KCFI should have alignment NOPs then .word immediate.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
> +
> +/* RISC-V: Validate clean KCFI boundary - .word then immediate end/size.  */
> +/* { dg-final { scan-assembler {\.word\t0x[0-9a-f]+\n\.Lcfi_func_end_test_function:\n\t\.size\t__cfi_test_function, \.-__cfi_test_function} { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
> new file mode 100644
> index 000000000000..62e1926e107e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-entry-only.c
> @@ -0,0 +1,62 @@
> +/* Test KCFI with patchable function entries - entry NOPs only.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fpatchable-function-entry=4,0" } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void test_function(void) {
> +}
> +
> +static void caller(void) {
> +    /* Make an indirect call to test callsite offset calculation.  */
> +    void (*func_ptr)(void) = test_function;
> +    func_ptr();
> +}
> +
> +int main() {
> +    test_function();  /* Direct call.  */
> +    caller();         /* Indirect call via static function.  */
> +    return 0;
> +}
> +
> +/* x86_64: Should have KCFI preamble with architecture alignment NOPs (11).  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+nop\n\t+movl\t+\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
> +
> +/* AArch64: Should have KCFI preamble with no alignment NOPs.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Should have KCFI preamble with no alignment NOPs.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t\.word\t0x[0-9a-f]+} { target arm32 } } } */
> +
> +/* RISC-V: Should have KCFI preamble with no alignment NOPs.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
> +
> +/* x86_64: Indirect call should use original prefix NOPs (0) for offset
> +   calculation: -4 offset.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d\n\tje\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tud2\n.*\n\1:\n\tcall} { target x86_64-*-* } } } */
> +
> +/* x86_64: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
> +
> +/* AArch64: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*stp} { target aarch64*-*-* } } } */
> +
> +/* AArch64: No alignment NOPs - function type should come immediately before
> +   function.  */
> +/* { dg-final { scan-assembler {\.type\t*test_function, %function\n*test_function:} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: All 4 NOPs are entry NOPs - should have exactly 4 entry NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
> +
> +/* ARM 32-bit: No alignment NOPs - function type should come immediately
> +   before function.  */
> +/* { dg-final { scan-assembler {\.type\t*test_function, %function\n*test_function:} { target arm32 } } } */
> +
> +/* RISC-V: All 4 NOPs are entry NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\.LFB} { target riscv*-*-* } } } */
> +
> +/* RISC-V: No alignment NOPs - function type should come immediately
> +   before function.  */
> +/* { dg-final { scan-assembler {\.type\t*test_function, @function\n*test_function:} { target riscv*-*-* } } } */
> +
> +/* Should have patchable function entry section.  */
> +/* { dg-final { scan-assembler "__patchable_function_entries" } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
> new file mode 100644
> index 000000000000..3d5618847840
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-large.c
> @@ -0,0 +1,51 @@
> +/* Test KCFI with large patchable function entries.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fpatchable-function-entry=11,11" } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void test_function(void) {
> +}
> +
> +int main() {
> +    void (*func_ptr)(void) = test_function;
> +    func_ptr();
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_test_function:" } } */
> +
> +/* Should have patchable function entry section.  */
> +/* { dg-final { scan-assembler "__patchable_function_entries" } } */
> +
> +/* x86_64: Should have exactly 11 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
> +
> +/* x86_64: Should have 0 entry NOPs - function starts immediately with
> +   pushq.  */
> +/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*pushq\t*%rbp} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target x86_64-*-* } } } */
> +
> +/* x86_64: KCFI should have 0 alignment NOPs - goes directly to typeid movl.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
> +
> +/* x86_64: Call site should use -15 offset.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-15\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
> +
> +/* AArch64: Should have exactly 11 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Should have exactly 11 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
> +
> +/* AArch64: Call site should use -15 offset.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-15\]} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Call site should use -15 offset.  */
> +/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-15\]} { target arm32 } } } */
> +
> +/* RISC-V: Should have 11 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
> +
> +/* RISC-V: Call site should use -15 offset (same as x86/AArch64).  */
> +/* { dg-final { scan-assembler {lw\tt1, -15\(} { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
> new file mode 100644
> index 000000000000..4f00a86dbcb7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-medium.c
> @@ -0,0 +1,60 @@
> +/* Test KCFI with medium patchable function entries.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fpatchable-function-entry=8,4" } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void test_function(void) {
> +}
> +
> +int main() {
> +    void (*func_ptr)(void) = test_function;
> +    func_ptr();
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_test_function:" } } */
> +
> +/* Should have patchable function entry section.  */
> +/* { dg-final { scan-assembler "__patchable_function_entries" } } */
> +
> +/* x86_64: Should have exactly 4 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target x86_64-*-* } } } */
> +
> +/* x86_64: Should have exactly 4 entry NOPs between .cfi_startproc and
> +   pushq.  */
> +/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*pushq} { target x86_64-*-* } } } */
> +
> +/* x86_64: KCFI should have exactly 7 alignment NOPs between __cfi_ and
> +   typeid movl.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl\t\$0x[0-9a-f]+, %eax} { target x86_64-*-* } } } */
> +
> +/* x86_64: Call site should use -8 offset.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-8\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
> +
> +/* AArch64: Should have exactly 4 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target aarch64*-*-* } } } */
> +
> +/* AArch64: Should have exactly 4 entry NOPs after .cfi_startproc.  */
> +/* { dg-final { scan-assembler {\.cfi_startproc\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Should have exactly 4 prefix NOPs between .LPFE and .syntax.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.syntax} { target arm32 } } } */
> +
> +/* ARM 32-bit: Should have exactly 4 entry NOPs after function label.  */
> +/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
> +
> +/* AArch64: Call site should use -8 offset.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-8\]} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Call site should use -8 offset.  */
> +/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-8\]} { target arm32 } } } */
> +
> +/* RISC-V: Should have exactly 4 prefix NOPs between .LPFE and .type.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*\.type} { target riscv*-*-* } } } */
> +
> +/* RISC-V: Should have 4 entry NOPs.  */
> +/* { dg-final { scan-assembler {test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\.LFB} { target riscv*-*-* } } } */
> +
> +/* RISC-V: Call site should use -8 offset (same as x86/AArch64) */
> +/* { dg-final { scan-assembler {lw\tt1, -8\(} { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
> new file mode 100644
> index 000000000000..98c53ef52989
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-patchable-prefix-only.c
> @@ -0,0 +1,60 @@
> +/* Test KCFI with patchable function entries - prefix NOPs only.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fpatchable-function-entry=3,3" } */
> +/* { dg-additional-options "-falign-functions=16" { target x86_64-*-* } } */
> +
> +void test_function(void) {
> +}
> +
> +int main() {
> +    test_function();
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_test_function:" } } */
> +
> +/* x86_64: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target x86_64-*-* } } } */
> +
> +/* x86_64: No entry NOPs - function should start immediately with prologue. */
> +/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*pushq\t*%rbp} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target x86_64-*-* } } } */
> +
> +/* x86_64: should have exactly 8 alignment NOPs.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*nop\n\t*movl} { target x86_64-*-* } } } */
> +
> +/* AArch64: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target aarch64*-*-* } } } */
> +
> +/* AArch64: No entry NOPs - function should start immediately with prologue.  */
> +/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc\n\t*nop\n\t*ret} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target aarch64*-*-* } } } */
> +
> +/* AArch64: KCFI type ID should have 1 alignment NOP then word.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop} { target arm32 } } } */
> +
> +/* ARM 32-bit: No entry NOPs - function should start immediately with
> +   prologue.  */
> +/* { dg-final { scan-assembler {test_function:} { target arm32 } } } */
> +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target arm32 } } } */
> +
> +/* ARM 32-bit: KCFI type ID should have 1 alignment NOP then word.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target arm32 } } } */
> +
> +/* RISC-V: All 3 NOPs are prefix NOPs - should have exactly 3 prefix NOPs.  */
> +/* { dg-final { scan-assembler {\.LPFE[0-9]+:\n\t*nop\n\t*nop\n\t*nop\n\t*\.type\t*test_function} { target riscv*-*-* } } } */
> +
> +/* RISC-V: No entry NOPs - function should start immediately with
> +   .cfi_startproc.  */
> +/* { dg-final { scan-assembler {test_function:\n\.LFB[0-9]+:\n\t*\.cfi_startproc} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler-not {\t*\.weak\t*__kcfi_typeid_test_function\n} { target riscv*-*-* } } } */
> +
> +/* RISC-V: KCFI type ID should have 1 alignment NOP then word.  */
> +/* { dg-final { scan-assembler {__cfi_test_function:\n\t*nop\n\t*\.word\t0x[0-9a-f]+} { target riscv*-*-* } } } */
> +
> +/* Should have patchable function entry section.  */
> +/* { dg-final { scan-assembler "__patchable_function_entries" } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
> new file mode 100644
> index 000000000000..26323db4572f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-pic-addressing.c
> @@ -0,0 +1,104 @@
> +/* Test KCFI with position-independent code addressing modes.
> +   This is a regression test for complex addressing like
> +   PLUS(PLUS(...), symbol_ref) which can occur with PIC and caused
> +   change_address_1 RTL errors.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2 -fpic" } */
> +
> +/* Global function pointer table that creates PIC addressing.  */
> +struct callbacks {
> +    int (*handler1)(int);
> +    void (*handler2)(void);
> +    int (*handler3)(int, int);
> +};
> +
> +static int simple_handler(int x) {
> +    return x * 2;
> +}
> +
> +static void void_handler(void) {
> +    /* Empty handler.  */
> +}
> +
> +static int complex_handler(int a, int b) {
> +    return a + b;
> +}
> +
> +/* Global structure that will require PIC addressing.  */
> +struct callbacks global_callbacks = {
> +    .handler1 = simple_handler,
> +    .handler2 = void_handler,
> +    .handler3 = complex_handler
> +};
> +
> +/* Function that uses PIC addressing to access global callbacks.  */
> +int test_pic_addressing(int value) {
> +    /* These indirect calls through global structure create complex
> +       addressing like PLUS(PLUS(GOT_base, symbol_offset), struct_offset)
> +       which previously caused RTL errors in KCFI instrumentation.  */
> +
> +    int result = 0;
> +    result += global_callbacks.handler1(value);
> +
> +    global_callbacks.handler2();
> +
> +    result += global_callbacks.handler3(value, result);
> +
> +    return result;
> +}
> +
> +/* Test with function pointer arrays.  */
> +static int (*func_array[])(int) = {
> +    simple_handler,
> +    simple_handler,
> +    simple_handler
> +};
> +
> +int test_pic_array(int index, int value) {
> +    /* Array access with PIC can also create complex addressing.  */
> +    return func_array[index % 3](value);
> +}
> +
> +/* Test with dynamic PIC addressing.  */
> +struct callbacks *get_callbacks(void) {
> +    return &global_callbacks;
> +}
> +
> +int test_dynamic_pic(int value) {
> +    /* Dynamic access through function call creates very complex addressing.  */
> +    struct callbacks *cb = get_callbacks();
> +    return cb->handler1(value) + cb->handler3(value, value);
> +}
> +
> +int main() {
> +    int result = 0;
> +    result += test_pic_addressing(10);
> +    result += test_pic_array(1, 20);
> +    result += test_dynamic_pic(5);
> +    return result;
> +}
> +
> +/* Verify that all address-taken functions get KCFI preambles.  */
> +/* { dg-final { scan-assembler {__cfi_simple_handler:} } } */
> +/* { dg-final { scan-assembler {__cfi_void_handler:} } } */
> +/* { dg-final { scan-assembler {__cfi_complex_handler:} } } */
> +
> +/* x86_64: Verify KCFI checks are generated.  */
> +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r10d\n\taddl\t-4\(%r[a-z0-9]+\), %r10d} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {ud2} { target x86_64-*-* } } } */
> +
> +/* AArch64: Verify KCFI checks.  */
> +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler {brk} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit: Verify KCFI checks with PIC addressing and stack spilling.  */
> +/* { dg-final { scan-assembler {ldr\tr0, \[r[0-9]+, #-4\]} { target arm32 } } } */
> +/* { dg-final { scan-assembler {udf} { target arm32 } } } */
> +
> +/* RISC-V: Verify KCFI checks are generated.  */
> +/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler {ebreak} { target riscv*-*-* } } } */
> +
> +/* Should have trap section.  */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
> new file mode 100644
> index 000000000000..79e5ca61cdc2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-retpoline-r11.c
> @@ -0,0 +1,50 @@
> +/* Test KCFI with retpoline thunk-extern flag forces r11 usage.  */
> +/* { dg-do compile { target x86_64-*-* } } */
> +/* { dg-additional-options "-O2 -mindirect-branch=thunk-extern" } */
> +
> +extern int external_target(void);
> +
> +/* Test regular call (not tail call) */
> +__attribute__((noinline))
> +int call_test(int (*func_ptr)(void)) {
> +    /* This indirect call should use r11 when both KCFI and
> +       -mindirect-branch=thunk-extern are enabled.  */
> +    int result = func_ptr();  /* Function parameter prevents direct optimization.  */
> +    return result + 1;  /* Prevent tail call optimization.  */
> +}
> +
> +/* Reference external_target to generate the required symbol.  */
> +int (*external_func_ptr)(void) = external_target;
> +
> +/* Test function for sibcalls (tail calls) */
> +__attribute__((noinline))
> +void sibcall_test(int (**func_ptr)(void)) {
> +    /* This sibcall should use r11 when both KCFI and
> +       -mindirect-branch=thunk-extern are enabled.  */
> +    (*func_ptr)();  /* Tail call - should be optimized to sibcall.  */
> +}
> +
> +/* Should have weak symbol for external function.  */
> +/* { dg-final { scan-assembler "__kcfi_typeid_external_target" } } */
> +
> +/* When both KCFI and -mindirect-branch=thunk-extern are enabled,
> +   indirect calls should always use r11 register and convert to extern thunks.  */
> +/* { dg-final { scan-assembler-times {call\s+__x86_indirect_thunk_r11} 1 } } */
> +
> +/* Sibcalls should also use r11 register and convert to extern thunks.  */
> +/* { dg-final { scan-assembler-times {jmp\s+__x86_indirect_thunk_r11} 1 } } */
> +
> +/* Should have exactly 2 KCFI traps (one per function) */
> +/* { dg-final { scan-assembler-times {ud2} 2 } } */
> +
> +/* Should NOT use other registers for indirect calls.  */
> +/* { dg-final { scan-assembler-not {call\s+\*%rax} } } */
> +/* { dg-final { scan-assembler-not {call\s+\*%rcx} } } */
> +/* { dg-final { scan-assembler-not {call\s+\*%rdx} } } */
> +/* { dg-final { scan-assembler-not {call\s+\*%rdi} } } */
> +
> +/* Should NOT use other registers for sibcalls.  */
> +/* { dg-final { scan-assembler-not {jmp\s+\*%rax} } } */
> +/* { dg-final { scan-assembler-not {jmp\s+\*%rcx} } } */
> +/* { dg-final { scan-assembler-not {jmp\s+\*%rdx} } } */
> +/* { dg-final { scan-assembler-not {jmp\s+\*%rdi} } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c
> new file mode 100644
> index 000000000000..6ad8fab5da80
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-runtime.c
> @@ -0,0 +1,151 @@
> +/* Test KCFI runtime behavior: working calls and type mismatch trapping.
> +   { dg-do run { target native } }
> +   { dg-options "-fsanitize=kcfi" } */
> +
> +#include <stdio.h>
> +#include <signal.h>
> +#include <setjmp.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +/* Test functions with different signatures */
> +static int func_int_void(void)
> +{
> +    return 42;
> +}
> +
> +__attribute__((nocf_check))
> +static int func_int_void_nocf_check(void)
> +{
> +    return 42;
> +}
> +
> +static int func_int_int(int x)
> +{
> +    return x * 4;
> +}
> +
> +/* Global state for signal handling */
> +static volatile int trap_occurred = 0;
> +static jmp_buf trap_env;
> +
> +/* Signal handler for KCFI traps */
> +static void trap_handler(int sig)
> +{
> +    trap_occurred = 1;
> +    longjmp(trap_env, 1);
> +}
> +
> +/* Compatible indirect call should work */
> +static int test_compatible_call(void)
> +{
> +    typedef int (*int_void_ptr)(void);
> +    int_void_ptr ptr = func_int_void;
> +
> +    fprintf(stderr, "Calling %s(0x%08x) through %s(0x%08x) ...\n",
> +           __builtin_typeinfo_name(typeof(func_int_void)),
> +           __builtin_typeinfo_hash(typeof(func_int_void)),
> +           __builtin_typeinfo_name(typeof(*ptr)),
> +           __builtin_typeinfo_hash(typeof(*ptr)));
> +
> +    trap_occurred = 0;
> +    /* This should work - same signature */
> +    int result = ptr();
> +
> +    return (trap_occurred == 0 && result == 42) ? 1 : 0;
> +}
> +
> +/* Compatible indirect call to nocf_check should not work */
> +static int test_nocf_check_trap(void)
> +{
> +    trap_occurred = 0;
> +
> +    if (setjmp(trap_env) == 0) {
> +      typedef int (__attribute__((nocf_check)) *int_void_ptr_nocf)(void);
> +      int_void_ptr_nocf ptr = func_int_void_nocf_check;
> +
> +      fprintf(stderr, "Calling %s(0x%08x) through %s(0x%08x) ...\n",
> +             __builtin_typeinfo_name(typeof(func_int_void_nocf_check)),
> +             __builtin_typeinfo_hash(typeof(func_int_void_nocf_check)),
> +             __builtin_typeinfo_name(typeof(*ptr)),
> +             __builtin_typeinfo_hash(typeof(*ptr)));
> +
> +      int result = ptr();
> +
> +      /* If we get here, the trap didn't occur */
> +      return 0;
> +    } else {
> +      /* We caught the trap - this is expected */
> +      return trap_occurred;
> +    }
> +}
> +
> +/* Type mismatch should trap */
> +static int test_type_mismatch_trap(void)
> +{
> +    trap_occurred = 0;
> +
> +    if (setjmp(trap_env) == 0) {
> +      /* Cast func_int_void to incompatible void(*)(void) type */
> +      typedef void (*void_void_ptr)(void);
> +      void_void_ptr ptr = (void_void_ptr)func_int_void;
> +
> +      fprintf(stderr, "Calling %s(0x%08x) through %s(0x%08x) ...\n",
> +             __builtin_typeinfo_name(typeof(func_int_void)),
> +             __builtin_typeinfo_hash(typeof(func_int_void)),
> +             __builtin_typeinfo_name(typeof(*ptr)),
> +             __builtin_typeinfo_hash(typeof(*ptr)));
> +
> +      /* This should trap because type IDs don't match:
> +         - func_int_void has type ID for int(void)
> +         - but we're calling through void(void) pointer type */
> +      ptr();
> +
> +      /* If we get here, the trap didn't occur */
> +      return 0;
> +    } else {
> +      /* We caught the trap - this is expected */
> +      return trap_occurred;
> +    }
> +}
> +
> +int main(void)
> +{
> +    struct sigaction sa = {
> +      .sa_handler = trap_handler,
> +      .sa_flags = SA_NODEFER,
> +    };
> +    int failed = 3;
> +
> +    /* Install trap handler.  */
> +    if (sigaction(SIGILL, &sa, NULL)) {
> +      perror("sigaction");
> +      return 1;
> +    }
> +
> +    /* Compatible call should work */
> +    if (test_compatible_call()) {
> +      printf("OK: matched indirect call succeeded\n");
> +      failed--;
> +    } else {
> +      printf("FAIL\n");
> +    }
> +
> +    /* Using nocf_check should trap */
> +    if (test_nocf_check_trap()) {
> +      printf("OK: indirect call to nocf_check correctly trapped\n");
> +      failed--;
> +    } else {
> +      printf("FAIL\n");
> +    }
> +
> +    /* Type mismatch should trap */
> +    if (test_type_mismatch_trap()) {
> +      printf("OK: mismatched indirect call correctly trapped\n");
> +      failed--;
> +    } else {
> +      printf("FAIL\n");
> +    }
> +
> +    return failed;
> +}
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
> new file mode 100644
> index 000000000000..e2e3912fffa3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-tail-calls.c
> @@ -0,0 +1,142 @@
> +/* Test KCFI protection when indirect calls get converted to tail calls.  */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +typedef int (*func_ptr_t)(int);
> +typedef void (*void_func_ptr_t)(void);
> +
> +struct function_table {
> +    func_ptr_t process;
> +    void_func_ptr_t cleanup;
> +};
> +
> +/* Target functions.  */
> +int process_data(int x) { return x * 2; }
> +void cleanup_data(void) {}
> +
> +/* Initialize function table.  */
> +volatile struct function_table vtable = {
> +    .process = &process_data,
> +    .cleanup = &cleanup_data
> +};
> +
> +/* Indirect call through struct member that should become tail call.  */
> +int test_struct_indirect_call(int x) {
> +    /* This is an indirect call that should be converted to tail call:
> +       Without -fno-optimize-sibling-calls should become "jmp *vtable+0(%rip)"
> +       With -fno-optimize-sibling-calls should become "call *vtable+0(%rip)"  */
> +    return vtable.process(x);
> +}
> +
> +/* Indirect call through function pointer parameter.  */
> +int test_param_indirect_call(func_ptr_t handler, int x) {
> +    /* This is an indirect call that should be converted to tail call:
> +       Without -fno-optimize-sibling-calls should become "jmp *%rdi"
> +       With -fno-optimize-sibling-calls should be "call *%rdi"  */
> +    return handler(x);
> +}
> +
> +/* Void indirect call through struct member.  */
> +void test_void_indirect_call(void) {
> +    /* This is an indirect call that should be converted to tail call:
> +     * Without -fno-optimize-sibling-calls: should become "jmp *vtable+8(%rip)"
> +     * With -fno-optimize-sibling-calls: should be "call *vtable+8(%rip)"  */
> +    vtable.cleanup();
> +}
> +
> +/* Non-tail call for comparison (should always be call).  */
> +int test_non_tail_indirect_call(func_ptr_t handler, int x) {
> +    /* This should never become a tail call - always "call *%rdi"  */
> +    int result = handler(x);
> +    return result + 1;  /* Prevents tail call optimization.  */
> +}
> +
> +/* Should have KCFI preambles for all functions.  */
> +/* { dg-final { scan-assembler-times "__cfi_process_data:" 1 } } */
> +/* { dg-final { scan-assembler-times "__cfi_cleanup_data:" 1 } } */
> +/* { dg-final { scan-assembler-times "__cfi_test_struct_indirect_call:" 1 } } */
> +/* { dg-final { scan-assembler-times "__cfi_test_param_indirect_call:" 1 } } */
> +/* { dg-final { scan-assembler-times "__cfi_test_void_indirect_call:" 1 } } */
> +/* { dg-final { scan-assembler-times "__cfi_test_non_tail_indirect_call:" 1 } } */
> +
> +/* Should have exactly 4 KCFI checks for indirect calls as
> +   (load type ID + compare).  */
> +/* { dg-final { scan-assembler-times {movl\t\$-?[0-9]+, %r10d} 4 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times {addl\t-4\(%r[a-z0-9]+\), %r10d} 4 { target x86_64-*-* } } } */
> +
> +/* Should have exactly 4 trap sections and 4 trap instructions.  */
> +/* { dg-final { scan-assembler-times "\\.kcfi_traps" 4 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times "ud2" 4 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times "\\.kcfi_traps" 4 { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler-times "ebreak" 4 { target riscv*-*-* } } } */
> +
> +/* Should NOT have unprotected direct jumps to vtable.  */
> +/* { dg-final { scan-assembler-not {jmp\t\*vtable\(%rip\)} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-not {jmp\t\*vtable\+8\(%rip\)} { target x86_64-*-* } } } */
> +
> +/* Should have exactly 3 protected tail calls (jmp through register after
> +   KCFI check).  */
> +/* { dg-final { scan-assembler-times {jmp\t\*%[a-z0-9]+} 3 { target x86_64-*-* } } } */
> +
> +/* Should have exactly 1 regular call (non-tail call case).  */
> +/* { dg-final { scan-assembler-times {call\t\*%[a-z0-9]+} 1 { target x86_64-*-* } } } */
> +
> +/* RISC-V: Should have exactly 4 KCFI checks for indirect calls
> +   (comparison instruction).  */
> +/* { dg-final { scan-assembler-times {beq\tt1, t2, \.Lkcfi_call[0-9]+} 4 { target riscv*-*-* } } } */
> +
> +/* RISC-V: Should have exactly 4 KCFI checks for indirect calls as
> +   (load type ID + compare).  */
> +/* { dg-final { scan-assembler-times {lw\tt1, -4\([a-z0-9]+\)} 4 { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler-times {lui\tt2, [0-9]+} 4 { target riscv*-*-* } } } */
> +
> +/* RISC-V: Should have exactly 3 protected tail calls (jr after
> +   KCFI check - no return address save).  */
> +/* { dg-final { scan-assembler-times {jalr\t(x0|zero), [a-z0-9]+, 0} 3 { target riscv*-*-* } } } */
> +
> +/* RISC-V: Should have exactly 1 regular call (non-tail call case - saves
> +   return address).  */
> +/* { dg-final { scan-assembler-times {jalr\t(x1|ra), [a-z0-9]+, 0} 1 { target riscv*-*-* } } } */
> +
> +/* Type ID loading should use lui + addiw pattern for 32-bit constants.  */
> +/* { dg-final { scan-assembler {lui\tt2, [0-9]+} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler {addiw\tt2, t2, -?[0-9]+} { target riscv*-*-* } } } */
> +
> +/* Should have exactly 4 KCFI checks for indirect calls (load type ID from
> +   -4 offset + compare).  */
> +/* { dg-final { scan-assembler-times {ldur\tw16, \[x[0-9]+, #-4\]} 4 { target aarch64-*-* } } } */
> +/* { dg-final { scan-assembler-times {cmp\tw16, w17} 4 { target aarch64-*-* } } } */
> +
> +/* Should have exactly 4 trap instructions.  */
> +/* { dg-final { scan-assembler-times {brk\t#[0-9]+} 4 { target aarch64-*-* } } } */
> +
> +/* Should have exactly 3 protected tail calls (br through register after
> +   KCFI check).  */
> +/* { dg-final { scan-assembler-times {br\tx[0-9]+} 3 { target aarch64-*-* } } } */
> +
> +/* Should have exactly 1 regular call (non-tail call case).  */
> +/* { dg-final { scan-assembler-times {blr\tx[0-9]+} 1 { target aarch64-*-* } } } */
> +
> +/* Type ID loading should use mov + movk pattern for 32-bit constants.  */
> +/* { dg-final { scan-assembler {mov\tw17, #[0-9]+} { target aarch64-*-* } } } */
> +/* { dg-final { scan-assembler {movk\tw17, #[0-9]+, lsl #16} { target aarch64-*-* } } } */
> +
> +/* Should have exactly 4 KCFI checks for indirect calls (load type ID from
> +   -4 offset + compare).  */
> +/* { dg-final { scan-assembler-times {ldr\tr0, \[r[0-9]+, #-4\]} 4 { target arm32 } } } */
> +/* { dg-final { scan-assembler-times {cmp\tr0, r1} 4 { target arm32 } } } */
> +
> +/* Should have exactly 4 trap instructions.  */
> +/* { dg-final { scan-assembler-times {udf\t#[0-9]+} 4 { target arm32 } } } */
> +
> +/* Should have exactly 3 protected tail calls (bx through register after
> +   KCFI check).  */
> +/* { dg-final { scan-assembler-times {bx\tr[0-9]+} 3 { target arm32 } } } */
> +
> +/* Should have exactly 1 regular call (non-tail call case).  */
> +/* { dg-final { scan-assembler-times {blx\tr[0-9]+} 1 { target arm32 } } } */
> +
> +/* Type ID loading should use movw + movt pattern for 32-bit constants
> +   into r1.  */
> +/* { dg-final { scan-assembler {movw\tr1, #[0-9]+} { target arm32 } } } */
> +/* { dg-final { scan-assembler {movt\tr1, #[0-9]+} { target arm32 } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
> new file mode 100644
> index 000000000000..f2226fa58ac9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-encoding.c
> @@ -0,0 +1,54 @@
> +/* Test AArch64 and ARM32 KCFI trap encoding in BRK/UDF instructions.  */
> +/* { dg-do compile { target { aarch64*-*-* || arm32 } } } */
> +
> +void target_function(int x, char y) {
> +}
> +
> +int main() {
> +    void (*func_ptr)(int, char) = target_function;
> +
> +    /* This should generate trap with immediate encoding.  */
> +    func_ptr(42, 'a');
> +
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_target_function:" } } */
> +
> +/* AArch64 specific: Should have BRK instruction with proper ESR encoding
> +   ESR format: 0x8000 | ((type_reg & 31) << 5) | (addr_reg & 31)
> +
> +   Test the ESR encoding by checking for the expected value.
> +   Since we know this test uses x2, we expect ESR = 0x8000 | (17<<5) | 2 = 33314
> +
> +   A truly dynamic test would need to extract the register from blr and compute
> +   the corresponding ESR, but DejaGnu's regex limitations make this complex.
> +   This test validates the specific case and documents the encoding.
> +   */
> +/* { dg-final { scan-assembler "blr\\s+x2" { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler "brk\\s+#33314" { target aarch64*-*-* } } } */
> +
> +/* Should have KCFI check with type comparison.  */
> +/* { dg-final { scan-assembler {ldur\t*w16, \[x[0-9]+, #-4\]} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler {cmp\t*w16, w17} { target aarch64*-*-* } } } */
> +
> +/* ARM32 specific: Should have UDF instruction with proper encoding
> +   UDF format: 0x8000 | ((type_reg & 31) << 5) | (addr_reg & 31)
> +
> +   Since ARM32 spills and restores r0/r1 before the trap, the type_reg
> +   field uses 0x1F (31) to indicate "register was spilled" rather than
> +   pointing to a live register. The addr_reg field contains the actual
> +   target register number.
> +
> +   For this test case using r3, we expect:
> +   UDF = 0x8000 | (31 << 5) | 3 = 0x8000 | 0x3E0 | 3 = 33763
> +   */
> +/* { dg-final { scan-assembler "blx\\s+r3" { target arm32 } } } */
> +/* { dg-final { scan-assembler "udf\\s+#33763" { target arm32 } } } */
> +
> +/* Should have register spilling and restoration around type check.  */
> +/* { dg-final { scan-assembler {push\t*\{r0, r1\}} { target arm32 } } } */
> +/* { dg-final { scan-assembler {pop\t*\{r0, r1\}} { target arm32 } } } */
> +/* { dg-final { scan-assembler {ldr\t*r0, \[r[0-9]+, #-4\]} { target arm32 } } } */
> +/* { dg-final { scan-assembler {cmp\t*r0, r1} { target arm32 } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
> new file mode 100644
> index 000000000000..7f5f8a82f3dc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi-trap-section.c
> @@ -0,0 +1,41 @@
> +/* Test KCFI trap section generation.  */
> +/* { dg-do compile } */
> +
> +void target_function(void) {}
> +
> +int main() {
> +    void (*func_ptr)(void) = target_function;
> +
> +    /* Multiple indirect calls to generate multiple trap entries.  */
> +    func_ptr();
> +    func_ptr();
> +
> +    return 0;
> +}
> +
> +/* Should have KCFI preamble.  */
> +/* { dg-final { scan-assembler "__cfi_target_function:" } } */
> +
> +/* Should have exactly 2 trap labels in code.  */
> +/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*ud2} 2 { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*brk} 2 { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*udf} 2 { target arm32 } } } */
> +/* { dg-final { scan-assembler-times {\.L[^:]+:\n\s*ebreak} 2 { target riscv*-*-* } } } */
> +
> +/* x86_64: Should have complete .kcfi_traps section sequence with relative
> +   offset and 2 entries.  */
> +/* { dg-final { scan-assembler {\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry([^:]+):\n\t\.long\t\.Lkcfi_trap([^\s\n]+)-\.Lkcfi_entry\1\n\t\.text} { target x86_64-*-* } } } */
> +/* { dg-final { scan-assembler-times {\.section\t\.kcfi_traps,"ao",@progbits,\.text} 2 { target x86_64-*-* } } } */
> +
> +/* AArch64 should NOT have .kcfi_traps section (uses brk immediate instead) */
> +/* { dg-final { scan-assembler-not {\.section\t+\.kcfi_traps} { target aarch64*-*-* } } } */
> +/* { dg-final { scan-assembler-not {\.long.*-\.L} { target aarch64*-*-* } } } */
> +
> +/* ARM 32-bit should NOT have .kcfi_traps section (uses udf immediate instead) */
> +/* { dg-final { scan-assembler-not {\.section\t+\.kcfi_traps} { target arm32 } } } */
> +/* { dg-final { scan-assembler-not {\.long.*-\.L} { target arm32 } } } */
> +
> +/* RISC-V: Should have complete .kcfi_traps section sequence with relative
> +   offset and 2 entries.  */
> +/* { dg-final { scan-assembler {\.section\t\.kcfi_traps,"ao",@progbits,\.text\n\.Lkcfi_entry([^:]+):\n\t\.4byte\t\.L([^\s\n]+)-\.Lkcfi_entry\1\n\t\.text} { target riscv*-*-* } } } */
> +/* { dg-final { scan-assembler-times {\.section\t\.kcfi_traps,"ao",@progbits,\.text} 2 { target riscv*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/kcfi/kcfi.exp b/gcc/testsuite/gcc.dg/kcfi/kcfi.exp
> new file mode 100644
> index 000000000000..0bbba196c82f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/kcfi/kcfi.exp
> @@ -0,0 +1,64 @@
> +#   Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# GCC testsuite for KCFI (Kernel Control Flow Integrity) tests.
> +
> +# Load support procs.
> +load_lib gcc-dg.exp
> +
> +# KCFI is only supported on specific targets
> +if { ![istarget "x86_64-*-*"] \
> +     && ![istarget "aarch64-*-*"] && ![istarget "arm*-*-*"] \
> +     && ![istarget "riscv*-*-*"] } {
> +    return
> +}
> +
> +# Skip tests if x86_64 is running in 32-bit mode (-m32)
> +if { [istarget "x86_64-*-*"] && ![check_effective_target_lp64] } {
> +    return
> +}

This is also wrong: you can have a multilib version of i?86 which
supports -m64 too.


> +
> +# Skip tests if AArch64 is running in ILP32 mode (-mabi=ilp32)
> +if { [istarget "aarch64-*-*"] && ![check_effective_target_lp64] } {
> +    return
> +}
> +
> +# Skip tests if RISC-V is running in 32-bit mode (riscv32-*)
> +if { [istarget "riscv*-*-*"] && ![check_effective_target_lp64] } {
> +    return
> +}

These should really be part of a new check_effective_target_kcfi instead
of being embedded here.
Then you can just call that from here.

Does kcfi work on non-elf targets, e.g. aarch64-mingw or x86_64-mingw
or x86_64-darwin?  I see you check lp64 and that would reject the
mingw case but not the darwin case.

Thanks,
Andrew


> +
> +# Add KCFI-specific flags to any existing DEFAULT_CFLAGS
> +global DEFAULT_CFLAGS
> +if ![info exists DEFAULT_CFLAGS] then {
> +    set DEFAULT_CFLAGS ""
> +}
> +set DEFAULT_CFLAGS "$DEFAULT_CFLAGS -fsanitize=kcfi"
> +
> +# Add ARM32-specific flags for arm32 targets
> +if [check_effective_target_arm32] {
> +    set DEFAULT_CFLAGS "$DEFAULT_CFLAGS -march=armv7-a -mfloat-abi=soft"

I think the above is incorrect. Can you explain why you want to change
this, especially since you might not have a soft-float ABI compatible
glibc installed?
Also, it is either arm or aarch32, not arm32.

Thanks
Andrew


> +}
> +
> +# Initialize `dg'.
> +dg-init
> +
> +# Main loop.
> +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] \
> +       "" $DEFAULT_CFLAGS
> +
> +# All done.
> +dg-finish
> --
> 2.34.1
>

* Re: [PATCH v3 4/7] aarch64: Add AArch64 Kernel Control Flow Integrity implementation
  2025-09-13 23:43   ` Andrew Pinski
@ 2025-09-14 19:45     ` Kees Cook
  2025-09-14 19:52       ` Andrew Pinski
  2025-09-17 20:01     ` Kees Cook
  1 sibling, 1 reply; 28+ messages in thread
From: Kees Cook @ 2025-09-14 19:45 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Qing Zhao, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

On Sat, Sep 13, 2025 at 04:43:29PM -0700, Andrew Pinski wrote:
> On Sat, Sep 13, 2025 at 4:28 PM Kees Cook <kees@kernel.org> wrote:
> >
> > Implement AArch64-specific KCFI backend.
> >
> > - Trap debugging through ESR (Exception Syndrome Register) encoding
> >   in BRK instruction immediate values.
> >
> > - Scratch register allocation using w16/w17 (x16/x17) following
> >   AArch64 procedure call standard for intra-procedure-call registers.
> 
> How does this interact with BTI and sibcalls? Since for indirect
> calls, x17 is already used for the address.
> Why do you need/want to use a fixed register here for the load/compare
> anyways? Why can't you use any free register?

Ah, yeah, good point. I'm struggling with this on aarch32 too for Ard's
suggestion about using an eor sequence. So, the problem I haven't been
able to solve is that call instructions cannot have scratch register
operands. Or rather, can't when there is register pressure like on
aarch32, where a spill would be needed. This is in the LRA:

  if (CALL_P (curr_insn))
    no_output_reloads_p = true;

And that does make perfect sense, there's no place (in the current
design) to provide a place where the reload would happen. But I also
can't let the kcfi check/call get split up arbitrarily.

Is there some way I can convince LRA to let me do the restore manually?

> > +const char *
> > +aarch64_output_kcfi_insn (rtx_insn *insn, rtx *operands)
> > +{
> > +  /* KCFI is only supported in LP64 mode.  */
> > +  if (TARGET_ILP32)
> > +    {
> > +      sorry ("%<-fsanitize=kcfi%> is not supported for %<-mabi=ilp32%>");
> 
> You should reject -fsanitize=kcfi during option processing instead of
> this late in the compilation.

Where is best to do this on a per-arch basis?

> > +  /* Get KCFI type ID from operand[3].  */
> > +  uint32_t type_id = (uint32_t) INTVAL (operands[3]);
> 
> Maybe an assert that `(int32_t)type_id == INTVAL (operands[3])`?

Oh, hm, actually, I think I should be using UINTVAL instead?
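
To make the INTVAL vs. UINTVAL question concrete, here is a minimal standalone sketch in plain C. It does not use any GCC internals: `hwi_model` and the helper names are hypothetical stand-ins, and the only assumption is that the constant is held in a signed 64-bit integer. It suggests that truncating to 32 bits yields the same typeid whether the value is read as signed or unsigned, while the suggested round-trip assert only holds when the stored value is sign-extended from bit 31.

```c
#include <stdint.h>

/* Hypothetical model of a constant's payload: a signed 64-bit integer
   standing in for HOST_WIDE_INT.  Not GCC code.  */
typedef int64_t hwi_model;

/* Store a 32-bit pattern in sign-extended form.  The conversion to
   int32_t wraps on the usual two's complement hosts assumed here.  */
hwi_model
model_simode_const (uint32_t bits)
{
  return (hwi_model) (int32_t) bits;
}

/* Signed read followed by truncation to 32 bits.  */
uint32_t
typeid_via_signed (hwi_model v)
{
  return (uint32_t) v;
}

/* Unsigned read followed by truncation to 32 bits.  */
uint32_t
typeid_via_unsigned (hwi_model v)
{
  return (uint32_t) (uint64_t) v;
}

/* The suggested sanity check: nonzero only when the stored value
   round-trips through int32_t, i.e. when it is sign-extended.  */
int
fits_sign_extended_32 (hwi_model v)
{
  return (hwi_model) (int32_t) (uint32_t) v == v;
}
```

In this model a typeid with the top bit set passes the check only in its sign-extended form; the same bits stored zero-extended fail it, even though both truncate to the same 32-bit typeid.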

> > +  /* Load actual type into w16 from memory at offset using ldur.  */
> > +  temp_operands[0] = gen_rtx_REG (SImode, R16_REGNUM);
> > +  temp_operands[1] = target_reg;
> > +  temp_operands[2] = GEN_INT (offset);
> > +  output_asm_insn ("ldur\t%w0, [%1, #%2]", temp_operands);
> 
> Since you are using a fixed register, you don't need the temp_operands[0] here.
> Also what happens if target_reg is x16? Shouldn't there be an assert
> on that here?

Yeah, I need to solve the scratch register issue more generally.

> > +  /* Output conditional branch to call label.  */
> > +  fputs ("\tb.eq\t", asm_out_file);
> > +  assemble_name (asm_out_file, call_name);
> > +  fputc ('\n', asm_out_file);
> 
> There has to be a better way of implementing this.

I couldn't find one that would let me keep the custom label name. I'd
love to have something better! :)


-- 
Kees Cook

* Re: [PATCH v3 4/7] aarch64: Add AArch64 Kernel Control Flow Integrity implementation
  2025-09-14 19:45     ` Kees Cook
@ 2025-09-14 19:52       ` Andrew Pinski
  0 siblings, 0 replies; 28+ messages in thread
From: Andrew Pinski @ 2025-09-14 19:52 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Pinski, Qing Zhao, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

On Sun, Sep 14, 2025 at 12:45 PM Kees Cook <kees@kernel.org> wrote:
>
> On Sat, Sep 13, 2025 at 04:43:29PM -0700, Andrew Pinski wrote:
> > On Sat, Sep 13, 2025 at 4:28 PM Kees Cook <kees@kernel.org> wrote:
> > >
> > > Implement AArch64-specific KCFI backend.
> > >
> > > - Trap debugging through ESR (Exception Syndrome Register) encoding
> > >   in BRK instruction immediate values.
> > >
> > > - Scratch register allocation using w16/w17 (x16/x17) following
> > >   AArch64 procedure call standard for intra-procedure-call registers.
> >
> > How does this interact with BTI and sibcalls? Since for indirect
> > calls, x17 is already used for the address.
> > Why do you need/want to use a fixed register here for the load/compare
> > anyways? Why can't you use any free register?
>
> Ah, yeah, good point. I'm struggling with this on aarch32 too for Ard's
> suggestion about using an eor sequence. So, the problem I haven't been
> able to solve is that call instructions cannot have scratch register
> operands. Or rather, can't when there is register pressure like on
> aarch32, where a spill would be needed. This is in the LRA:
>
>   if (CALL_P (curr_insn))
>     no_output_reloads_p = true;
>
> And that does make perfect sense, there's no place (in the current
> design) to provide a place where the reload would happen. But I also
> can't let the kcfi check/call get split up arbitrarily.
>
> Is there some way I can convince LRA to let me do the restore manually?
>
> > > +const char *
> > > +aarch64_output_kcfi_insn (rtx_insn *insn, rtx *operands)
> > > +{
> > > +  /* KCFI is only supported in LP64 mode.  */
> > > +  if (TARGET_ILP32)
> > > +    {
> > > +      sorry ("%<-fsanitize=kcfi%> is not supported for %<-mabi=ilp32%>");
> >
> > You should reject -fsanitize=kcfi during option processing instead of
> > this late in the compilation.
>
> Where is best to do this on a per-arch basis?

In the case of aarch64, aarch64_override_options_internal looks like a
good place. There are already errors produced here when it comes to
incompatible options. Other targets will have a similar place too; I
was only reviewing the aarch64 specific changes here.

Thanks,
Andrew Pinski

>
> > > +  /* Get KCFI type ID from operand[3].  */
> > > +  uint32_t type_id = (uint32_t) INTVAL (operands[3]);
> >
> > Maybe an assert that `(int32_t)type_id == INTVAL (operands[3])`?
>
> Oh, hm, actually, I think I should be using UINTVAL instead?
>
> > > +  /* Load actual type into w16 from memory at offset using ldur.  */
> > > +  temp_operands[0] = gen_rtx_REG (SImode, R16_REGNUM);
> > > +  temp_operands[1] = target_reg;
> > > +  temp_operands[2] = GEN_INT (offset);
> > > +  output_asm_insn ("ldur\t%w0, [%1, #%2]", temp_operands);
> >
> > Since you are using a fixed register, you don't need the temp_operands[0] here.
> > Also what happens if target_reg is x16? Shouldn't there be an assert
> > on that here?
>
> Yeah, I need to solve the scratch register issue more generally.
>
> > > +  /* Output conditional branch to call label.  */
> > > +  fputs ("\tb.eq\t", asm_out_file);
> > > +  assemble_name (asm_out_file, call_name);
> > > +  fputc ('\n', asm_out_file);
> >
> > There has to be a better way of implementing this.
>
> I couldn't find one that would let me keep the custom label name. I'd
> love to have something better! :)
>
>
> --
> Kees Cook

* Re: [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
  2025-09-13 23:23 ` [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure Kees Cook
@ 2025-09-17 13:42   ` Qing Zhao
  2025-09-17 21:09     ` Kees Cook
  0 siblings, 1 reply; 28+ messages in thread
From: Qing Zhao @ 2025-09-17 13:42 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Pinski, Jakub Jelinek, Martin Uecker, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org

Hi, Kees,

This version of the middle-end change is much simpler and cleaner :-).
See my comments and questions below:

> On Sep 13, 2025, at 19:23, Kees Cook <kees@kernel.org> wrote:
> 
> Implements the Linux Kernel Control Flow Integrity ABI, which provides a
> function prototype based forward edge control flow integrity protection
> by instrumenting every indirect call to check for a hash value before
> the target function address. If the hash at the call site and the hash
> at the target do not match, execution will trap.
> 
> See the start of kcfi.cc for design details.
> 
> gcc/ChangeLog:
> 
> * kcfi.h: New file with KCFI public interface declarations.
> * kcfi.cc: New file implementing Kernel Control Flow Integrity
> infrastructure.
> * Makefile.in (OBJS): Add kcfi.o.
> * flag-types.h (enum sanitize_code): Add SANITIZE_KCFI.
> * gimple.h (enum gf_mask): Add GF_CALL_INLINED_FROM_KCFI_NOSANTIZE.
> (gimple_call_set_inlined_from_kcfi_nosantize): New function.
> (gimple_call_inlined_from_kcfi_nosantize_p): New function.
> * tree-pass.h Add kcfi passes.
> * df-scan.cc (df_uses_record): Add KCFI case to handle KCFI RTL
> patterns and process wrapped RTL.
>        * doc/extend.texi: Update nocf_check for kcfi.
> * doc/invoke.texi (fsanitize=kcfi): Add documentation for KCFI
> sanitizer option.
> * doc/tm.texi.in: Add Kernel Control Flow Integrity section with
> TARGET_KCFI_SUPPORTED, TARGET_KCFI_MASK_TYPE_ID,
> TARGET_KCFI_EMIT_TYPE_ID hooks.
> * doc/tm.texi: Regenerate.
> * final.cc (call_from_call_insn): Add KCFI case to handle
> KCFI-wrapped calls.
> * opts.cc (sanitizer_opts): Add kcfi entry.
> * passes.cc: Include kcfi.h.
> * passes.def: Add KCFI IPA pass.
> * rtl.def (KCFI): Add new RTL code for KCFI instrumentation.
> * rtlanal.cc (rtx_cost): Add KCFI case.
> * target.def: Add KCFI target hooks.
> * toplev.cc (process_options): Add KCFI option processing.
> * tree-inline.cc: Include kcfi.h and asan.h.
> (copy_bb): Handle KCFI no_sanitize attribute propagation during
> inlining.
> * varasm.cc (assemble_start_function): Emit KCFI preambles.
> (assemble_external_real): Emit KCFI typeid symbols.
> (default_elf_asm_named_section): Handle .kcfi_traps using
> SECTION_LINK_ORDER flag.
> 
> gcc/c-family/ChangeLog:
> 
> * c-attribs.cc: Include asan.h.
>        (handle_nocf_check_attribute): Enable nocf_check under kcfi.
> (handle_patchable_function_entry_attribute): Add error for using
> patchable_function_entry attribute with -fsanitize=kcfi.
> 
> Signed-off-by: Kees Cook <kees@kernel.org>
> ---
> gcc/kcfi.h                |  52 ++++
> gcc/kcfi.cc               | 601 ++++++++++++++++++++++++++++++++++++++
> gcc/Makefile.in           |   1 +
> gcc/flag-types.h          |   2 +
> gcc/gimple.h              |  22 ++
> gcc/tree-pass.h           |   1 +
> gcc/c-family/c-attribs.cc |  17 +-
> gcc/df-scan.cc            |   7 +
> gcc/doc/extend.texi       |  38 +++
> gcc/doc/invoke.texi       |  33 +++
> gcc/doc/tm.texi           |  31 ++
> gcc/doc/tm.texi.in        |  12 +
> gcc/final.cc              |   3 +
> gcc/opts.cc               |   1 +
> gcc/passes.cc             |   1 +
> gcc/passes.def            |   1 +
> gcc/rtl.def               |   6 +
> gcc/rtlanal.cc            |   5 +
> gcc/target.def            |  38 +++
> gcc/toplev.cc             |  10 +
> gcc/tree-inline.cc        |  10 +
> gcc/varasm.cc             |  37 ++-
> 22 files changed, 918 insertions(+), 11 deletions(-)
> create mode 100644 gcc/kcfi.h
> create mode 100644 gcc/kcfi.cc
> 
> diff --git a/gcc/kcfi.h b/gcc/kcfi.h
> new file mode 100644
> index 000000000000..32c186416493
> --- /dev/null
> +++ b/gcc/kcfi.h
> @@ -0,0 +1,52 @@
> +/* Kernel Control Flow Integrity (KCFI) support for GCC.
> +   Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_KCFI_H
> +#define GCC_KCFI_H
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "rtl.h"
> +
> +/* Common helper for RTL patterns to emit .kcfi_traps section entry.
> +   Call after emitting trap label and instruction with the trap symbol
> +   reference.  */
> +extern void kcfi_emit_traps_section (FILE *file, rtx trap_label_sym);
> +
> +/* Extract KCFI type ID from current GIMPLE statement.  */
> +extern rtx __kcfi_get_type_id_for_expanding_gimple_call (void);
> +
> +/* Convenience wrapper to check for SANITIZE_KCFI.  */
> +#define kcfi_get_type_id_for_expanding_gimple_call() \
> +  ((flag_sanitize & SANITIZE_KCFI) \
> +     ? __kcfi_get_type_id_for_expanding_gimple_call () \
> +     : NULL_RTX)
> +
> +/* Emit KCFI type ID symbol for external address-taken functions.  */
> +extern void kcfi_emit_typeid_symbol (FILE *asm_file, tree fndecl);
> +
> +/* Emit KCFI preamble for potential indirect call targets.  */
> +extern void kcfi_emit_preamble (FILE *asm_file, tree fndecl,
> + const char *actual_fname);
> +
> +/* For calculating callsite offset.  */
> +extern HOST_WIDE_INT kcfi_patchable_entry_prefix_nops;
> +
> +#endif /* GCC_KCFI_H */
> diff --git a/gcc/kcfi.cc b/gcc/kcfi.cc
> new file mode 100644
> index 000000000000..9ed0cb00faa1
> --- /dev/null
> +++ b/gcc/kcfi.cc
> @@ -0,0 +1,601 @@
> +/* Kernel Control Flow Integrity (KCFI) support for GCC.
> +   Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +/* KCFI ABI Design:
> +
> +The Linux Kernel Control Flow Integrity ABI provides a function prototype
> +based forward edge control flow integrity protection by instrumenting
> +every indirect call to check for a hash value before the target function
> +address.  If the hash at the call site and the hash at the target do not
> +match, execution will trap.
> +
> +The general CFI ideas are discussed here, but focuses more on a CFG
> +analysis to construct valid call destinations, which tends to require LTO:
> +https://users.soe.ucsc.edu/~abadi/Papers/cfi-tissec-revised.pdf
> +
> +Later refinement for using jump tables (constructed via CFG analysis
> +during LTO) was proposed here:
> +https://www.usenix.org/system/files/conference/usenixsecurity14/sec14-paper-tice.pdf
> +
> +Linux used the above implementation from 2018 to 2022:
> +https://android-developers.googleblog.com/2018/10/control-flow-integrity-in-android-kernel.html
> +but the corner cases for target addresses not being the actual functions
> +(i.e. pointing into the jump table) was a continual source of problems,
> +and generating the jump tables required full LTO, which had its own set
> +of problems.
> +
> +Looking at function prototypes as the source of call validity was
> +presented here, though still relied on LTO:
> +https://www.blackhat.com/docs/asia-17/materials/asia-17-Moreira-Drop-The-Rop-Fine-Grained-Control-Flow-Integrity-For-The-Linux-Kernel-wp.pdf
> +
> +The KCFI approach built on the function-prototype idea, but avoided
> +needing LTO, and could be further updated to deal with CPU errata
> +(retpolines, etc):
> +https://lpc.events/event/16/contributions/1315/
> +
> +KCFI has a number of specific constraints.  Some are tied to the
> +backend architecture, which are covered in arch-specific code.
> +The constraints are:
> +
> +- The KCFI scheme generates a unique 32-bit hash ("typeid") for each
> +  unique function prototype, allowing for indirect call sites to verify
> +  that they are calling into a matching _type_ of function pointer.
> +  This changes the semantics of some optimization logic because now
> +  indirect calls to different types cannot be merged.  For example:
> +
> +    if (p->func_type_1)
> + return p->func_type_1 ();
> +    if (p->func_type_2)
> + return p->func_type_2 ();
> +
> +  In final asm, the optimizer may collapse the second indirect call
> +  into a jump to the first indirect call once it has loaded the function
> +  pointer.  KCFI must block cross-type merging otherwise there will be a
> +  single KCFI check happening for only 1 type but being used by 2 target
> +  types.  The distinguishing characteristic for call merging becomes the
> +  type, not the address/register usage.
> +
> +- The check-call instruction sequence must be treated as a single unit: it
> +  cannot be rearranged or split or optimized.  The pattern is that
> +  indirect calls, "call *%target", get converted into:
> +
> +    mov $target_expression, %target ; only present if the expression was
> +    ; not already in %target register
> +    load -$offset(%target), %tmp    ; load typeid hash from target preamble
> +    cmp $typeid, %tmp    ; compare expected typeid with loaded
> +    je .Lkcfi_call$N    ; success: jump to the indirect call
> +  .Lkcfi_trap$N:    ; label of trap insn
> +    trap    ; trap on failure, but arranged so
> +    ; "permissive mode" falls through
> +  .Lkcfi_call$N:    ; label of call insn
> +    call *%target    ; actual indirect call
> +
> +  This pattern of call immediately after trap provides for the
> +  "permissive" checking mode automatically: the trap gets handled,
> +  a warning emitted, and then execution continues after the trap to
> +  the call.
> +
> +- KCFI check-call instrumentation must survive tail call optimization.
> +  If an indirect call is turned into an indirect jump, KCFI checking
> +  must still happen (but it will use a jmp rather than a call).

I didn’t see any code changes in this patch that address the above issue.
Is the issue automatically resolved without special handling?
> +
> +- Functions that may be called indirectly have a preamble added,
> +  __cfi_$original_func_name, which contains the $typeid value:
> +
> +    __cfi_target_func:
> +      .word $typeid
> +    target_func:
> +       [regular function entry...]
> +
> +- The preamble needs to interact with patchable function entry so that
> +  the typeid appears further away from the actual start of the function
> +  (leaving the prefix NOPs of the patchable function entry unchanged).
> +  This means only _globally defined_ patchable function entry is supported
> +  with KCFI (indirect call sites must know in advance what the offset is,
> +  which may not be possible with extern functions that use a function
> +  attribute to change their patchable function entry characteristics).
> +  For example, a "4,4" patchable function entry would end up like:
> +
> +    __cfi_target_func:
> +      .data $typeid
> +      nop nop nop nop
> +    target_func:
> +       [regular function entry...]
> +
> +  Architectures may need to add alignment nops prior to the typeid to keep
> +  __cfi_target_func aligned for function call conventions.

I am still a little confused by the above. Are there two kinds of “nops” that need
to be computed and added: one for the patchable function entry, and the other for
architecture-specific alignment?
If so, please clarify the comment above to make this clear.

> +
> +- External functions that are address-taken have a weak __kcfi_typeid_$func
> +  symbol added with the typeid value available so that the typeid can be
> +  referenced from assembly linkages, etc, where the typeid values cannot be
> +  calculated (i.e where C type information is missing):
> +
> +    .weak   __kcfi_typeid_$func
> +    .set    __kcfi_typeid_$func, $typeid
> +

From my previous understanding, the above weak symbol is emitted for external functions
that are address-taken AND do not have a definition in the compilation. So the weak symbol
is emitted at the declaration site of the external function. Is this true?

If so, could you please clarify this in the comment above?

> +- On architectures that do not have a good way to encode additional
> +  details in their trap insn (e.g. x86_64 and riscv64), the trap location
> +  is identified as a KCFI trap via a relative address offset entry
> +  emitted into the .kcfi_traps section for each indirect call site's
> +  trap instruction.  The previous check-call example's insn sequence would
> +  then have section changes inserted between the trap and call:
> +
> +  ...
> +  .Lkcfi_trap$N:
> +    trap
> +  .section .kcfi_traps,"ao",@progbits,.text
> +  .Lkcfi_entry$N:
> +    .long .Lkcfi_trap$N - .Lkcfi_entry$N
> +  .text
> +  .Lkcfi_call$N:
> +    call *%target
> +
> +  It is up to such architectures to decode instructions prior to the
> +  trap to locate the typeid that the callsite was expecting.
> +
> +  For architectures that can encode immediates in their trap function
> +  (e.g. aarch64 and arm32), this isn't needed: they just use immediate
> +  codes that indicate a KCFI trap.
> +
> +- The no_sanitize("kcfi") function attribute means that the marked
> +  function must not produce KCFI checking for indirect calls, and this
> +  attribute must survive inlining.  This is used rarely by Linux, but
> +  is required to make BPF JIT trampolines work on older Linux kernel
> +  versions.
> +
> +- The "nocf_check" function attribute can be used to suppress the
> +  KCFI preamble for a function, making that function unavailable
> +  for indirect calls.
> +
> +As a result of these constraints, there are some behavioral aspects
> +that need to be preserved across the middle-end and back-end.
> +
> +For indirect call sites:
> +
> +- All function types have their associated typeid attached as an
> +  attribute.
> +
> +- Keeping typeid information available through to the RTL expansion
> +  phase is done via a new KCFI insn RTL pattern that wraps the CALL
> +  and the typeid.
> +
> +- Keep indirect calls from being merged (see earlier example) by
> +  checking the KCFI insn's typeid for equality.

Is this resolved by the following code:

rtlanal.cc
index 63a1d08c46cf..5016fe93ccac 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -1177,6 +1177,11 @@ reg_referenced_p (const_rtx x, const_rtx body)
    case IF_THEN_ELSE:
      return reg_overlap_mentioned_p (x, body);

+    case KCFI:
+      /* For KCFI wrapper, check both the wrapped call and the type ID.  */
+      return (reg_overlap_mentioned_p (x, XEXP (body, 0))
+      || reg_overlap_mentioned_p (x, XEXP (body, 1)));
+

> +
> +- To make sure KCFI expansion is skipped for inline functions that
> +  are marked with no_sanitize("kcfi"), the inlining is marked during
> +  GIMPLE with a new flag which is checked during expansion.
> +
> +- KCFI insn emission interacts with patchable function entry to
> +  load the typeid from the target preamble, offset by prefix NOPs.
> +
> +For indirect call targets:
> +
> +- kcfi_emit_preamble interacts with patchable function entry to add
> +  any needed alignment prior to emitting the typeid.
> +
> +- assemble_external_real calls kcfi_emit_typeid_symbol to add the
> +  __kcfi_typeid_$func symbols.
> +
> +*/
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "target.h"
> +#include "function.h"
> +#include "tree.h"
> +#include "tree-pass.h"
> +#include "dumpfile.h"
> +#include "basic-block.h"
> +#include "gimple.h"
> +#include "gimple-iterator.h"
> +#include "cgraph.h"
> +#include "kcfi.h"
> +#include "stringpool.h"
> +#include "attribs.h"
> +#include "rtl.h"
> +#include "cfg.h"
> +#include "cfgrtl.h"
> +#include "asan.h"
> +#include "diagnostic-core.h"
> +#include "memmodel.h"
> +#include "print-tree.h"
> +#include "emit-rtl.h"
> +#include "output.h"
> +#include "builtins.h"
> +#include "varasm.h"
> +#include "opts.h"
> +#include "target.h"
> +#include "flags.h"
> +#include "kcfi-typeinfo.h"
> +#include "insn-config.h"
> +#include "recog.h"
> +
> +/* For callsite typeid loading offset.  */
> +HOST_WIDE_INT kcfi_patchable_entry_prefix_nops = 0;
> +/* For preamble alignment.  */
> +static HOST_WIDE_INT kcfi_patchable_entry_arch_alignment_nops = 0;
> +static const char *kcfi_nop = NULL;
> +
> +/* Common helper for RTL patterns to emit .kcfi_traps section entry.  */

I noticed that the function comments do not explain each parameter of the function.
This needs to be updated for all the new functions.

> +void
> +kcfi_emit_traps_section (FILE *file, rtx trap_label_sym)
> +{
> +  /* Generate entry label internally and get its number.  */
> +  rtx entry_label = gen_label_rtx ();
> +  int entry_labelno = CODE_LABEL_NUMBER (entry_label);

Is the only use of the new RTX “entry_label” to generate a label number?
If so, entry_label is not needed at all. You can get a distinct labelno for each
Lkcfi_entry from, for example, the function id of the current function.
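
For context on why each entry needs its own label: per the design comment, every .kcfi_traps entry stores the trap address relative to the entry's own address. A minimal sketch of that encoding in plain C follows. The function names and the byte-buffer "image" are hypothetical, and this is not how any particular kernel reads the table.

```c
#include <stdint.h>
#include <string.h>

/* Write one table entry: a 32-bit offset from the entry's own address
   to the trap instruction (".long .Lkcfi_trap$N - .Lkcfi_entry$N").  */
void
kcfi_model_store_entry (unsigned char *entry, const unsigned char *trap)
{
  int32_t rel = (int32_t) ((intptr_t) trap - (intptr_t) entry);
  memcpy (entry, &rel, sizeof rel);
}

/* Resolve one table entry back to the trap address by adding the
   stored offset to the address of the entry itself.  */
uintptr_t
kcfi_model_trap_addr (const unsigned char *entry)
{
  int32_t rel;
  memcpy (&rel, entry, sizeof rel);
  return (uintptr_t) entry + (uintptr_t) (intptr_t) rel;
}
```

Because the offset is taken relative to the entry, the same trap produces a different stored value at every table position, which is why the emitter needs a distinct label for each entry.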

> +
> +  /* Generate entry label name with custom prefix.  */
> +  char entry_name[32];
> +  ASM_GENERATE_INTERNAL_LABEL (entry_name, "Lkcfi_entry", entry_labelno);
> +
> +  /* Save current section to restore later.  */
> +  section *saved_section = in_section;
> +
> +  /* Use varasm infrastructure for section handling:
> +     .section .kcfi_traps,"ao",@progbits,.text  */
> +  section *kcfi_traps_section = get_section (".kcfi_traps",
> +     SECTION_LINK_ORDER, NULL);
> +  switch_to_section (kcfi_traps_section);
> +
> +  /* Emit entry label for relative offset:
> +     .Lkcfi_entry$N:  */
> +  ASM_OUTPUT_LABEL (file, entry_name);
> +
> +  /* Generate address difference using RTL infrastructure.  */
> +  rtx entry_label_sym = gen_rtx_SYMBOL_REF (Pmode, entry_name);
> +  rtx addr_diff = gen_rtx_MINUS (Pmode, trap_label_sym, entry_label_sym);
> +
> +  /* Emit the address difference as a 4-byte value:
> +    .long .Lkcfi_trap$N - .Lkcfi_entry$N  */
> +  assemble_integer (addr_diff, 4, BITS_PER_UNIT, 1);
> +
> +  /* Restore the previous section:
> +     .text  */
> +  switch_to_section (saved_section);
> +}
> +
> +/* Compute KCFI type ID for a function type.  */
> +
> +static uint32_t
> +compute_kcfi_type_id (tree fntype)
> +{
> +  gcc_assert (fntype);
> +  gcc_assert (TREE_CODE (fntype) == FUNCTION_TYPE);
> +
> +  uint32_t type_id = typeinfo_get_hash (fntype);
> +
> +  /* Apply target-specific masking if supported.  */
> +  if (targetm.kcfi.mask_type_id)
> +    type_id = targetm.kcfi.mask_type_id (type_id);
> +
> +  /* Output to dump file if enabled.  */
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    {
> +      std::string mangled_name = typeinfo_get_name (fntype);
> +      fprintf (dump_file, "KCFI type ID: mangled='%s' typeid=0x%08x\n",
> +       mangled_name.c_str (), type_id);
> +    }
> +
> +  return type_id;
> +}
> +
> +/* Function attribute to store KCFI type ID.  */
> +static tree kcfi_type_id_attr = NULL_TREE;
> +
> +/* Get KCFI type ID for a function type.  Set it if missing.  */
> +
> +static uint32_t
> +kcfi_get_type_id (tree fn_type)
> +{
> +  uint32_t type_id;
> +
> +  /* Cache the attribute identifier.  */
> +  if (!kcfi_type_id_attr)
> +    kcfi_type_id_attr = get_identifier ("kcfi_type_id");
> +
> +  tree attr = lookup_attribute (IDENTIFIER_POINTER (kcfi_type_id_attr),
> + TYPE_ATTRIBUTES (fn_type));

The above can be simplified to:
+  tree attr = lookup_attribute ("kcfi_type_id", TYPE_ATTRIBUTES (fn_type));

> +  if (attr)
> +    {
> +      tree value = TREE_VALUE (attr);
> +      gcc_assert (value && TREE_CODE (value) == INTEGER_CST);
> +      type_id = (uint32_t) TREE_INT_CST_LOW (value);
> +    }
> +  else
> +    {
> +      type_id = compute_kcfi_type_id (fn_type);
> +
> +      tree type_id_tree = build_int_cst (unsigned_type_node, type_id);
> +      tree attr = build_tree_list (kcfi_type_id_attr, type_id_tree);
> +
> +      TYPE_ATTRIBUTES (fn_type) = chainon (TYPE_ATTRIBUTES (fn_type), attr);
> +    }
> +
> +  return type_id;
> +}
> +
> +/* Prepare the global KCFI alignment NOPs calculation.
> +   Called once during IPA pass to set global variables.  */
> +
> +static void
> +kcfi_prepare_alignment_nops (void)
> +{
> +  /* Only use global patchable-function-entry flag, not function attributes.
> +     KCFI callsites cannot know about function-specific attributes.  */
> +  if (flag_patchable_function_entry)
> +    {
> +      HOST_WIDE_INT total_nops, prefix_nops = 0;
> +      parse_and_check_patch_area (flag_patchable_function_entry, false,
> +  &total_nops, &prefix_nops);
> +      /* Store value for callsite offset calculation.  */
> +      kcfi_patchable_entry_prefix_nops = prefix_nops;
> +    }
> +
> +  /* Calculate architecture-specific alignment NOPs.
> +     KCFI preamble layout:
> +     __cfi_func: [alignment_nops][typeid][prefix_nops] func: [entry_nops]
> +
> +     The alignment NOPs ensure __cfi_func stays at proper function entry
> +     alignment when prefix NOPs are added.  */

In the above, it looks like there are three kinds of “nops”:

alignment_nops
prefix_nops
entry_nops

Which global maps to each of the above? My guess is:

kcfi_patchable_entry_prefix_nops —> prefix_nops
kcfi_patchable_entry_arch_alignment_nops —> alignment_nops
?? —> entry_nops

Is my understanding correct?

I have a hard time mapping these concepts to the code in
this routine.

I think a more detailed description of the “nops”, and a clear mapping
between them and the global variables being calculated, is needed
in the comments of this routine.
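
As a reading aid, the padding formula quoted just below can be tried out in isolation. This is a minimal sketch in plain C, not the patch's code: the function name is hypothetical, and it assumes all three quantities are expressed in bytes (equivalently, 1-byte NOPs), whereas the patch keeps the prefix as a NOP count.

```c
/* Bytes of padding to place before the typeid so that
   padding + typeid + prefix NOPs is a multiple of the function
   alignment, which keeps the __cfi_ symbol aligned.  */
long
kcfi_model_alignment_pad (long alignment_bytes, long prefix_bytes,
			  long typeid_bytes)
{
  return (alignment_bytes
	  - ((prefix_bytes + typeid_bytes) % alignment_bytes))
	 % alignment_bytes;
}
```

For example, with 16-byte function alignment, 4 bytes of prefix NOPs and a 4-byte typeid, the model asks for 8 bytes of padding, so the preamble occupies exactly 16 bytes. With 4-byte alignment, no prefix and a 4-byte typeid, it asks for none.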

> +  HOST_WIDE_INT arch_alignment = 0;
> +
> +  /* Calculate alignment NOPs based on function alignment setting.
> +     Use explicit -falign-functions value if set, otherwise default to 4.  */
> +  int alignment_bytes = 4;
> +  if (align_functions.levels[0].log > 0)
> +    alignment_bytes = align_functions.levels[0].get_value ();
> +
> +  /* Get typeid instruction size from target hook, default to 4 bytes.  */
> +  int typeid_size = targetm.kcfi.emit_type_id
> +    ? targetm.kcfi.emit_type_id (NULL, 0) : 4;
> +
> +  /* Calculate alignment NOP bytes needed.  */
> +  arch_alignment = (alignment_bytes
> +    - ((kcfi_patchable_entry_prefix_nops + typeid_size)
> +       % alignment_bytes)) % alignment_bytes;
> +
> +  /* Prepare NOP template.  */
> +  rtx_insn *nop_insn = make_insn_raw (gen_nop ());
> +  int code_num = recog_memoized (nop_insn);
> +  kcfi_nop = get_insn_template (code_num, nop_insn);
> +
> +  /* Calculate number of NOP instructions needed for alignment.  */
> +  int nop_size = get_attr_length (nop_insn);
> +  if (arch_alignment % nop_size != 0)
> +    sorry ("KCFI function entry alignment padding bytes "
> +   "(" HOST_WIDE_INT_PRINT_DEC ") are not a multiple of "
> +   "architecture NOP instruction size (%d)",
> +   arch_alignment, nop_size);
> +  kcfi_patchable_entry_arch_alignment_nops = arch_alignment / nop_size;
> +}
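
The padding arithmetic in the routine above can be sketched on its own.
This is a minimal model, not the patch's code: the function names are
hypothetical, and the prefix is taken in bytes here (the quoted code adds
the prefix NOP count directly to the typeid size in bytes, which only
coincides when a NOP is one byte wide).

```c
#include <stdio.h>

/* Bytes of padding to place before the type ID so that
   padding + typeid + prefix is a multiple of the function alignment.
   All three arguments are in bytes.  */
static int
kcfi_alignment_padding (int alignment_bytes, int prefix_bytes,
			int typeid_size)
{
  return (alignment_bytes
	  - ((prefix_bytes + typeid_size) % alignment_bytes))
	 % alignment_bytes;
}

/* Convert padding bytes into a NOP count.  Returns -1 when the padding
   is not a whole number of NOPs (the quoted code calls sorry () in
   that case).  */
static int
kcfi_alignment_nop_count (int padding_bytes, int nop_size)
{
  if (padding_bytes % nop_size != 0)
    return -1;
  return padding_bytes / nop_size;
}
```

With 16-byte function alignment, no prefix NOPs and a 4-byte type ID,
the first helper gives 12 bytes of padding, which is three NOPs on a
target whose NOP is 4 bytes wide.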
> +
> +/* Extract KCFI type ID from indirect call GIMPLE statement.
> +   Returns RTX constant with type ID, or NULL_RTX if no KCFI needed.  */


> +
> +rtx
> +__kcfi_get_type_id_for_expanding_gimple_call (void)
> +{
> +  gcc_assert (currently_expanding_gimple_stmt);
> +  gcc_assert (is_gimple_call (currently_expanding_gimple_stmt));
> +
> +  /* Internally checks for no_sanitize("kcfi") with current_function_decl.  */
> +  if (!sanitize_flags_p (SANITIZE_KCFI))
> +    return NULL_RTX;
> +
> +  gcall *call_stmt = as_a <gcall *> (currently_expanding_gimple_stmt);
> +
> +  /* Only indirect calls need KCFI instrumentation.  */
> +  if (gimple_call_fndecl (call_stmt))
> +    return NULL_RTX;
> +
> +  /* Skip calls originating from inlined no_sanitize("kcfi") functions.  */
> +  if (gimple_call_inlined_from_kcfi_nosantize_p (call_stmt))
> +    return NULL_RTX;
> +
> +  /* Get function type of call.  */
> +  tree fn_type = gimple_call_fntype (call_stmt);
> +  gcc_assert (fn_type);
> +
> +  /* Return the type_id.  */
> +  return GEN_INT (kcfi_get_type_id (fn_type));
> +}
> +
> +/* Emit KCFI type ID symbol for an address-taken external function.  */

Would it be more accurate to say:

Emit KCFI type ID symbol for the declaration of an address-taken external function FNDECL
to the assembly file ASM_FILE.

??

> +
> +void
> +kcfi_emit_typeid_symbol (FILE *asm_file, tree fndecl)
> +{
> +  /* Only emit for external function declarations.  */
> +  if (TREE_CODE (fndecl) != FUNCTION_DECL || DECL_INITIAL (fndecl))
> +    return;
> +
> +  /* Only emit for functions that are address-taken.  */
> +  struct cgraph_node *node = cgraph_node::get (fndecl);
> +  if (!node || !node->address_taken)
> +    return;
> +
> +  /* Get symbol name from RTL and strip encoding prefixes.  */
> +  rtx rtl = DECL_RTL (fndecl);
> +  const char *name = XSTR (XEXP (rtl, 0), 0);
> +  name = targetm.strip_name_encoding (name);
> +
> +  /* .weak __kcfi_typeid_{name} */
> +  std::string symbol_name = std::string ("__kcfi_typeid_") + name;
> +  ASM_WEAKEN_LABEL (asm_file, symbol_name.c_str ());
> +
> +  /* .set __kcfi_typeid_{name}, 0x{type_id} */
> +  char val[16];
> +  snprintf (val, sizeof (val), "0x%08x",
> +    kcfi_get_type_id (TREE_TYPE (fndecl)));
> +  ASM_OUTPUT_DEF (asm_file, symbol_name.c_str (), val);
> +}
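
For reference, the two directives described by the comments in the
function above can be formatted as in the sketch below. The helper name
is hypothetical, and the exact spelling and whitespace that
ASM_WEAKEN_LABEL and ASM_OUTPUT_DEF produce are target-specific; the
ELF-style form used here is an assumption.

```c
#include <stdint.h>
#include <stdio.h>

/* Format the weak symbol definition for NAME with value TYPE_ID into
   BUF.  Returns what snprintf returns.  The directive layout is an
   assumed ELF-style rendering, not taken from the patch.  */
static int
format_kcfi_typeid_symbol (char *buf, size_t len, const char *name,
			   uint32_t type_id)
{
  return snprintf (buf, len,
		   "\t.weak\t__kcfi_typeid_%s\n"
		   "\t.set\t__kcfi_typeid_%s, 0x%08x\n",
		   name, name, (unsigned int) type_id);
}
```

Because the symbol is weak and defined with .set, several translation
units can emit the same definition for one external function without a
link-time conflict.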
> +
> +/* Emit KCFI preamble before the function label.
> +   Functions get preambles when -fsanitize=kcfi is enabled, regardless of
> +   no_sanitize("kcfi") attribute.  */
> +
> +void
> +kcfi_emit_preamble (FILE *asm_file, tree fndecl, const char *actual_fname)
> +{
> +  /* Skip functions with nocf_check attribute.  */
> +  if (lookup_attribute ("nocf_check", TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
> +    return;
> +
> +  struct cgraph_node *node = cgraph_node::get (fndecl);
> +
> +  /* Ignore cold partition functions: not reached via indirect call.  */
> +  if (node && node->split_part)
> +    return;
> +
> +  /* Ignore cold partition sections: cold partitions are never indirect call
> +     targets.  Only skip preambles for cold partitions (has_bb_partition = true)
> +     not for entire cold-attributed functions (has_bb_partition = false).  */
> +  if (in_cold_section_p && crtl && crtl->has_bb_partition)
> +    return;
> +
> +  /* Check if function is truly address-taken using cgraph node analysis.  */
> +  bool addr_taken = (node && node->address_taken);
> +
> +  /* Only instrument functions that can be targets of indirect calls:
> +     - Public functions (can be called externally)
> +     - External declarations (from other modules)
> +     - Functions with true address-taken status from cgraph analysis.  */
> +  if (!(TREE_PUBLIC (fndecl) || DECL_EXTERNAL (fndecl) || addr_taken))
> +    return;
> +
> +  /* Use actual function name if provided, otherwise fall back to
> +     DECL_ASSEMBLER_NAME.  */
> +  const char *fname = actual_fname
> + ? actual_fname
> + : IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (fndecl));
> +
> +  /* Create symbol name for reuse.  */
> +  std::string cfi_symbol_name = std::string ("__cfi_") + fname;
> +
> +  /* Emit __cfi_ symbol with proper visibility.  */
> +  if (TREE_PUBLIC (fndecl))
> +    {
> +      if (DECL_WEAK (fndecl))
> + ASM_WEAKEN_LABEL (asm_file, cfi_symbol_name.c_str ());
> +      else
> + targetm.asm_out.globalize_label (asm_file, cfi_symbol_name.c_str ());
> +    }
> +
> +  /* Emit .type directive.  */
> +  ASM_OUTPUT_TYPE_DIRECTIVE (asm_file, cfi_symbol_name.c_str (), "function");
> +  ASM_OUTPUT_LABEL (asm_file, cfi_symbol_name.c_str ());
> +
> +  /* Emit architecture-specific alignment NOPs using target's NOP template.  */
> +  for (int i = 0; i < kcfi_patchable_entry_arch_alignment_nops; i++)
> +    output_asm_insn (kcfi_nop, NULL);
> +
> +  /* Emit type ID bytes.  */
> +  uint32_t type_id = kcfi_get_type_id (TREE_TYPE (fndecl));
> +  if (targetm.kcfi.emit_type_id)
> +    targetm.kcfi.emit_type_id (asm_file, type_id);
> +  else
> +    fprintf (asm_file, "\t.word\t0x%08x\n", type_id);
> +
> +  /* Mark end of __cfi_ symbol and emit size directive.  */
> +  std::string cfi_end_label = std::string (".Lcfi_func_end_") + fname;
> +  ASM_OUTPUT_LABEL (asm_file, cfi_end_label.c_str ());
> +
> +  ASM_OUTPUT_MEASURED_SIZE (asm_file, cfi_symbol_name.c_str ());
> +}
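
The early-return chain at the top of kcfi_emit_preamble can be read as
a single predicate. The sketch below restates it under that reading;
the struct and its field names are hypothetical stand-ins for the tree
and cgraph queries in the quoted code.

```c
#include <stdbool.h>

/* Hypothetical summary of the properties the quoted code queries.  */
struct kcfi_fn_info
{
  bool nocf_check;	  /* has the nocf_check attribute */
  bool split_part;	  /* cgraph node is a split part */
  bool in_cold_partition; /* emitting a cold partition of a split function */
  bool is_public;	  /* TREE_PUBLIC */
  bool is_external;	  /* DECL_EXTERNAL */
  bool address_taken;	  /* cgraph address_taken */
};

/* True when a preamble would be emitted, following the order of the
   checks in the quoted function.  */
static bool
kcfi_wants_preamble (const struct kcfi_fn_info *fn)
{
  if (fn->nocf_check || fn->split_part || fn->in_cold_partition)
    return false;
  return fn->is_public || fn->is_external || fn->address_taken;
}
```

Note that no_sanitize("kcfi") does not appear in the predicate: per the
comment on the function, it does not suppress the preamble.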
> +
> +namespace {
> +
> +/* IPA pass for KCFI type ID setting - runs once per compilation unit.  */
> +
> +const pass_data pass_data_ipa_kcfi =
> +{
> +  SIMPLE_IPA_PASS, /* type */
> +  "ipa_kcfi", /* name */
> +  OPTGROUP_NONE, /* optinfo_flags */
> +  TV_IPA_OPT, /* tv_id */
> +  0, /* properties_required */
> +  0, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  0, /* todo_flags_finish */
> +};
> +
> +/* Set KCFI type_ids for all usable function types in compilation unit.  */
> +
> +static unsigned int
> +ipa_kcfi_execute (void)
> +{
> +  struct cgraph_node *node;
> +
> +  /* Prepare global KCFI alignment NOPs calculation once for all functions.  */
> +  kcfi_prepare_alignment_nops ();
> +
> +  /* Process all functions - both local and external.  */
> +  FOR_EACH_FUNCTION (node)
> +    {
> +      tree fndecl = node->decl;
> +
> +      /* Skip all non-NORMAL builtins (MD, FRONTEND) entirely.
> + For NORMAL builtins, skip those that lack an implicit
> + implementation (closest way to distinguishing DEF_LIB_BUILTIN
> + from others).  E.g. we need to have typeids for memset().  */

I see an indentation issue in the above comment.

> +      if (fndecl_built_in_p (fndecl))
> + {
> +  if (DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
> +    continue;
> +  if (!builtin_decl_implicit_p (DECL_FUNCTION_CODE (fndecl)))
> +    continue;
> + }

There is also an indentation issue in the above.
> +
> +      /* Cache the type_id in the function type.  */
> +      kcfi_get_type_id (TREE_TYPE (fndecl));
> +    }
> +
> +  return 0;
> +}
> +
> +class pass_ipa_kcfi : public simple_ipa_opt_pass
> +{
> +public:
> +  pass_ipa_kcfi (gcc::context *ctxt)
> +    : simple_ipa_opt_pass (pass_data_ipa_kcfi, ctxt)
> +  {}
> +
> +  bool gate (function *) final override
> +  {
> +    return sanitize_flags_p (SANITIZE_KCFI);
> +  }
> +
> +  unsigned int execute (function *) final override
> +  {
> +    return ipa_kcfi_execute ();
> +  }
> +
> +}; /* class pass_ipa_kcfi */
> +
> +} /* anon namespace */
> +
> +simple_ipa_opt_pass *
> +make_pass_ipa_kcfi (gcc::context *ctxt)
> +{
> +  return new pass_ipa_kcfi (ctxt);
> +}
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index a14fb498ce44..5b89161ac75a 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1592,6 +1592,7 @@ OBJS = \
> ira-lives.o \
> jump.o \
> kcfi-typeinfo.o \
> + kcfi.o \
> langhooks.o \
> late-combine.o \
> lcm.o \
> diff --git a/gcc/flag-types.h b/gcc/flag-types.h
> index bf681c3e8153..c3c0bc61ee3e 100644
> --- a/gcc/flag-types.h
> +++ b/gcc/flag-types.h
> @@ -337,6 +337,8 @@ enum sanitize_code {
>   SANITIZE_KERNEL_HWADDRESS = 1UL << 30,
>   /* Shadow Call Stack.  */
>   SANITIZE_SHADOW_CALL_STACK = 1UL << 31,
> +  /* KCFI (Kernel Control Flow Integrity) */
> +  SANITIZE_KCFI = 1ULL << 32,
>   SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
>   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
>       | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
> diff --git a/gcc/gimple.h b/gcc/gimple.h
> index da32651ea017..d5e7acc2c6a7 100644
> --- a/gcc/gimple.h
> +++ b/gcc/gimple.h
> @@ -142,6 +142,7 @@ enum gf_mask {
>     GF_CALL_ALLOCA_FOR_VAR = 1 << 5,
>     GF_CALL_INTERNAL = 1 << 6,
>     GF_CALL_CTRL_ALTERING       = 1 << 7,
> +    GF_CALL_INLINED_FROM_KCFI_NOSANTIZE = 1 << 8,
>     GF_CALL_MUST_TAIL_CALL = 1 << 9,
>     GF_CALL_BY_DESCRIPTOR = 1 << 10,
>     GF_CALL_NOCF_CHECK = 1 << 11,
> @@ -3487,6 +3488,27 @@ gimple_call_from_thunk_p (gcall *s)
>   return (s->subcode & GF_CALL_FROM_THUNK) != 0;
> }
> 
> +/* If INLINED_FROM_KCFI_NOSANTIZE_P is true, mark GIMPLE_CALL S as being
> +   inlined from a function with no_sanitize("kcfi").  */
> +
> +inline void
> +gimple_call_set_inlined_from_kcfi_nosantize (gcall *s,
> +     bool inlined_from_kcfi_nosantize_p)
> +{
> +  if (inlined_from_kcfi_nosantize_p)
> +    s->subcode |= GF_CALL_INLINED_FROM_KCFI_NOSANTIZE;
> +  else
> +    s->subcode &= ~GF_CALL_INLINED_FROM_KCFI_NOSANTIZE;
> +}
> +
> +/* Return true if GIMPLE_CALL S was inlined from a function with
> +   no_sanitize("kcfi").  */
> +
> +inline bool
> +gimple_call_inlined_from_kcfi_nosantize_p (const gcall *s)
> +{
> +  return (s->subcode & GF_CALL_INLINED_FROM_KCFI_NOSANTIZE) != 0;
> +}
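
How this flag feeds into the call-site decision can be sketched
outside of GCC. The names below are hypothetical; the bit position
mirrors the quoted gf_mask entry, and the decision order follows
__kcfi_get_type_id_for_expanding_gimple_call as quoted earlier in the
patch.

```c
#include <stdbool.h>

/* Mirrors GF_CALL_INLINED_FROM_KCFI_NOSANTIZE in the quoted patch.  */
enum { SKETCH_INLINED_FROM_KCFI_NOSANITIZE = 1 << 8 };

/* Set or clear the flag in SUBCODE and return the new value.  */
static unsigned int
sketch_set_inlined_flag (unsigned int subcode, bool on)
{
  if (on)
    return subcode | SKETCH_INLINED_FROM_KCFI_NOSANITIZE;
  return subcode & ~(unsigned int) SKETCH_INLINED_FROM_KCFI_NOSANITIZE;
}

/* A call is instrumented only if the containing function is
   sanitized, the call is indirect, and the statement was not inlined
   from a no_sanitize("kcfi") function.  */
static bool
sketch_call_needs_kcfi (bool caller_sanitized, bool is_direct_call,
			unsigned int subcode)
{
  if (!caller_sanitized || is_direct_call)
    return false;
  return (subcode & SKETCH_INLINED_FROM_KCFI_NOSANITIZE) == 0;
}
```

The flag is what lets an indirect call keep its "unchecked" status
after it has been copied into a caller that is itself sanitized.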
> 
> /* If FROM_NEW_OR_DELETE_P is true, mark GIMPLE_CALL S as being a call
>    to operator new or delete created from a new or delete expression.  */
> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
> index 1c68a69350df..8155249c990a 100644
> --- a/gcc/tree-pass.h
> +++ b/gcc/tree-pass.h
> @@ -544,6 +544,7 @@ extern ipa_opt_pass_d *make_pass_ipa_odr (gcc::context *ctxt);
> extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
> extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
> extern simple_ipa_opt_pass *make_pass_ipa_pta (gcc::context *ctxt);
> +extern simple_ipa_opt_pass *make_pass_ipa_kcfi (gcc::context *ctxt);
> extern simple_ipa_opt_pass *make_pass_ipa_tm (gcc::context *ctxt);
> extern simple_ipa_opt_pass *make_pass_target_clone (gcc::context *ctxt);
> extern simple_ipa_opt_pass *make_pass_dispatcher_calls (gcc::context *ctxt);
> diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
> index 1e3a94ed9493..1580ab25f70b 100644
> --- a/gcc/c-family/c-attribs.cc
> +++ b/gcc/c-family/c-attribs.cc
> @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "gimplify.h"
> #include "tree-pretty-print.h"
> #include "gcc-rich-location.h"
> +#include "asan.h"
> #include "gcc-urlifier.h"
> 
> static tree handle_packed_attribute (tree *, tree, tree, int, bool *);
> @@ -1740,8 +1741,11 @@ handle_nocf_check_attribute (tree *node, tree name,
>       warning (OPT_Wattributes, "%qE attribute ignored", name);
>       *no_add_attrs = true;
>     }
> -  else if (!(flag_cf_protection & CF_BRANCH))
> +  else if (!(flag_cf_protection & CF_BRANCH)
> +   && !(flag_sanitize & SANITIZE_KCFI))
>     {
> +      /* Allow it with -fsanitize=kcfi, but leave this warning alone
> + to avoid confusion over this weird corner case.  */
>       warning (OPT_Wattributes, "%qE attribute ignored. Use "
> "%<-fcf-protection%> option to enable it",
> name);
> @@ -6508,6 +6512,17 @@ static tree
> handle_patchable_function_entry_attribute (tree *, tree name, tree args,
>   int, bool *no_add_attrs)
> {
> +  /* Function-specific patchable_function_entry attribute is incompatible
> +     with KCFI because KCFI callsites cannot know about function-specific
> +     patchable entry settings on a preamble in a different translation
> +     unit.  */
> +  if (sanitize_flags_p (SANITIZE_KCFI))
> +    {
> +      error ("%qE attribute cannot be used with %<-fsanitize=kcfi%>", name);
> +      *no_add_attrs = true;
> +      return NULL_TREE;
> +    }
> +
>   for (; args; args = TREE_CHAIN (args))
>     {
>       tree val = TREE_VALUE (args);
> diff --git a/gcc/df-scan.cc b/gcc/df-scan.cc
> index 1e4c6a2a4fb5..2be5e60786a3 100644
> --- a/gcc/df-scan.cc
> +++ b/gcc/df-scan.cc
> @@ -2851,6 +2851,13 @@ df_uses_record (class df_collection_rec *collection_rec,
>       /* If we're clobbering a REG then we have a def so ignore.  */
>       return;
> 
> +    case KCFI:
> +      /* KCFI wraps other RTL - process the wrapped RTL.  */
> +      df_uses_record (collection_rec, &XEXP (x, 0), ref_type, bb, insn_info,
> +      flags);
> +      /* The type ID operand (XEXP (x, 1)) doesn't contain register uses.  */
> +      return;
> +
>     case MEM:
>       df_uses_record (collection_rec,
>      &XEXP (x, 0), DF_REF_REG_MEM_LOAD,
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 7cddea1ed6c1..ae9c039ab589 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -2740,6 +2740,44 @@ void __attribute__ ((no_sanitize ("alignment,object-size")))
> g () @{ /* @r{Do something.} */; @}
> @end smallexample
> 
> +When @code{no_sanitize("kcfi")} is applied to a function, it disables
> +the generation of Kernel Control Flow Integrity (KCFI) instrumentation
> +for indirect function calls within that function.  This means that
> +indirect calls in the marked function will not be checked against the
> +target function's type signature.
> +
> +However, the function itself will still receive a KCFI preamble (type
> +identifier) when compiled with @option{-fsanitize=kcfi}, allowing it to
> +be safely called indirectly from other functions that do perform KCFI
> +checks.  In other words, @code{no_sanitize("kcfi")} affects outgoing
> +calls from the function, not incoming calls to the function.
> +
> +@smallexample
> +void __attribute__ ((no_sanitize ("kcfi")))
> +trusted_function(void (*callback)(int))
> +@{
> +  /* This indirect call will NOT be instrumented with KCFI checks */
> +  callback(42);
> +@}
> +
> +void regular_function(void (*callback)(int))
> +@{
> +  /* This indirect call WILL be instrumented with KCFI checks */
> +  callback(42);
> +@}
> +@end smallexample
> +
> +This attribute is primarily used in kernel code for special contexts such
> +as BPF JIT trampolines or other low-level code where KCFI instrumentation
> +might interfere with the intended operation.  The attribute survives
> +inlining to ensure that @code{no_sanitize("kcfi")} functions do not generate
> +KCFI checks even when inlined into a function that otherwise performs KCFI
> +checks.
> +
> +Note: To disable KCFI preamble generation for functions so that they may
> +explicitly not be called indirectly, use the @code{nocf_check} function
> +attribute instead.
> +
> @cindex @code{no_sanitize_address} function attribute
> @item no_sanitize_address
> @itemx no_address_safety_analysis
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 56c4fa86e346..f96e104a7248 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -18382,6 +18382,39 @@ possible by specifying the command-line options
> @option{--param hwasan-instrument-allocas=1} respectively. Using a random frame
> tag is not implemented for kernel instrumentation.
> 
> +@opindex fsanitize=kcfi
> +@item -fsanitize=kcfi
> +Enable Kernel Control Flow Integrity (KCFI), a lightweight control
> +flow integrity mechanism designed for operating system kernels.
> +KCFI instruments indirect function calls to verify that the target
> +function has the expected type signature at runtime.  Each function
> +receives a unique type identifier computed from a hash of its function
> +prototype (including parameter types and return type).  Before each
> +indirect call, the implementation inserts a check to verify that the
> +target function's type identifier matches the expected identifier
> +for the call site, issuing a trap instruction if a mismatch is detected.
> +This provides forward-edge control flow protection against attacks that
> +attempt to redirect indirect calls to unintended targets.
> +
> +The implementation adds minimal runtime overhead and does not require
> +runtime library support, making it suitable for kernel environments.
> +The type identifier is placed before the function entry point,
> +allowing runtime verification without additional metadata structures,
> +and without changing the entry points of the target functions.
> +
> +KCFI is intended primarily for kernel code and may not be suitable
> +for user-space applications that rely on techniques incompatible
> +with strict type checking of indirect calls.
> +
> +Note that KCFI is incompatible with function-specific
> +@code{patchable_function_entry} attributes because KCFI call sites
> +cannot know about function-specific patchable entry settings in different
> +translation units.  Only the global @option{-fpatchable-function-entry}
> +command-line option is supported with KCFI.
> +
> +Use @option{-fdump-ipa-kcfi-details} to examine the computed type identifier
> +hashes and their corresponding mangled type strings during compilation.
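
The check described in the -fsanitize=kcfi text above can be modelled
in portable C. This is only a model under stated assumptions: the real
instrumentation reads the hash from the bytes placed before the
function entry point and traps on mismatch, whereas the sketch stores
the ID in a struct next to the pointer and returns false instead of
trapping. All names and the ID values are made up.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int (*sketch_handler_fn) (int);

/* Stand-in for "type ID placed before the function entry".  */
struct sketch_kcfi_target
{
  uint32_t type_id;
  sketch_handler_fn fn;
};

static int
sketch_add_one (int x)
{
  return x + 1;
}

/* Compare the call site's expected ID with the target's ID before
   calling.  Returning false models the trap.  */
static bool
sketch_kcfi_checked_call (const struct sketch_kcfi_target *target,
			  uint32_t expected_id, int arg, int *result)
{
  if (target->type_id != expected_id)
    return false;
  *result = target->fn (arg);
  return true;
}
```

A call through a pointer whose prototype hash differs from the call
site's expectation never reaches the target, which is the forward-edge
property the documentation text describes.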
> +
> @opindex fsanitize=pointer-compare
> @item -fsanitize=pointer-compare
> Instrument comparison operation (<, <=, >, >=) with pointer operands.
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 37642680f423..69603fdad090 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -3166,6 +3166,7 @@ This describes the stack layout and calling conventions.
> * Tail Calls::
> * Shrink-wrapping separate components::
> * Stack Smashing Protection::
> +* Kernel Control Flow Integrity::
> * Miscellaneous Register Hooks::
> @end menu
> 
> @@ -5432,6 +5433,36 @@ should be allocated from heap memory and consumers should release them.
> The result will be pruned to cases with PREFIX if not NULL.
> @end deftypefn
> 
> +@node Kernel Control Flow Integrity
> +@subsection Kernel Control Flow Integrity
> +@cindex kernel control flow integrity
> +@cindex KCFI
> +
> +@deftypefn {Target Hook} bool TARGET_KCFI_SUPPORTED (void)
> +Return true if the target supports Kernel Control Flow Integrity (KCFI).
> +This hook indicates whether the target has implemented the necessary RTL
> +patterns and infrastructure to support KCFI instrumentation.  The default
> +implementation returns false.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} uint32_t TARGET_KCFI_MASK_TYPE_ID (uint32_t @var{type_id})
> +Apply architecture-specific masking to KCFI type ID.  This hook allows
> +targets to apply bit masks or other transformations to the computed KCFI
> +type identifier to match the target's specific requirements.  The default
> +implementation returns the type ID unchanged.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} int TARGET_KCFI_EMIT_TYPE_ID (FILE *@var{file}, uint32_t @var{type_id})
> +Emit architecture-specific type ID instruction for KCFI preambles
> +and return the size of the instruction in bytes.
> +@var{file} is the assembly output stream and @var{type_id} is the KCFI
> +type identifier to emit.  If @var{file} is NULL, skip emission and only
> +return the size.  If not overridden, the default fallback emits a
> +@code{.word} directive with the type ID and returns 4 bytes.  Targets can
> +override this to emit different instruction sequences and return their
> +corresponding sizes.
> +@end deftypefn
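
The contract documented for TARGET_KCFI_EMIT_TYPE_ID (emit when FILE
is non-NULL, always return the size) matches the fallback in
kcfi_emit_preamble. A standalone sketch of a function honouring that
contract, with a hypothetical name and the documented .word default,
could look like this:

```c
#include <stdint.h>
#include <stdio.h>

/* Emit the type ID as a .word directive when FILE is non-NULL and
   return the number of bytes the directive occupies.  With a NULL
   FILE only the size is reported, as the hook documentation
   requires.  */
static int
sketch_kcfi_emit_type_id (FILE *file, uint32_t type_id)
{
  if (file)
    fprintf (file, "\t.word\t0x%08x\n", (unsigned int) type_id);
  return 4;
}
```

The size-only mode is what kcfi_prepare_alignment_nops relies on when
it calls the hook with (NULL, 0) to compute the padding.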
> +
> @node Miscellaneous Register Hooks
> @subsection Miscellaneous register hooks
> @cindex miscellaneous register hooks
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index c3ed9a9fd7c2..b2856886194c 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -2433,6 +2433,7 @@ This describes the stack layout and calling conventions.
> * Tail Calls::
> * Shrink-wrapping separate components::
> * Stack Smashing Protection::
> +* Kernel Control Flow Integrity::
> * Miscellaneous Register Hooks::
> @end menu
> 
> @@ -3807,6 +3808,17 @@ generic code.
> 
> @hook TARGET_GET_VALID_OPTION_VALUES
> 
> +@node Kernel Control Flow Integrity
> +@subsection Kernel Control Flow Integrity
> +@cindex kernel control flow integrity
> +@cindex KCFI
> +
> +@hook TARGET_KCFI_SUPPORTED
> +
> +@hook TARGET_KCFI_MASK_TYPE_ID
> +
> +@hook TARGET_KCFI_EMIT_TYPE_ID
> +
> @node Miscellaneous Register Hooks
> @subsection Miscellaneous register hooks
> @cindex miscellaneous register hooks
> diff --git a/gcc/final.cc b/gcc/final.cc
> index afcb0bb9efbc..7f6aa9f9e480 100644
> --- a/gcc/final.cc
> +++ b/gcc/final.cc
> @@ -2094,6 +2094,9 @@ call_from_call_insn (const rtx_call_insn *insn)
> case SET:
>  x = XEXP (x, 1);
>  break;
> + case KCFI:
> +  x = XEXP (x, 0);
> +  break;
> }
>     }
>   return x;
> diff --git a/gcc/opts.cc b/gcc/opts.cc
> index 3ab993aea573..0ee37e01d24a 100644
> --- a/gcc/opts.cc
> +++ b/gcc/opts.cc
> @@ -2170,6 +2170,7 @@ const struct sanitizer_opts_s sanitizer_opts[] =
>   SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true, true),
>   SANITIZER_OPT (builtin, SANITIZE_BUILTIN, true, true),
>   SANITIZER_OPT (shadow-call-stack, SANITIZE_SHADOW_CALL_STACK, false, false),
> +  SANITIZER_OPT (kcfi, SANITIZE_KCFI, false, true),
>   SANITIZER_OPT (all, ~sanitize_code_type (0), true, true),
> #undef SANITIZER_OPT
>   { NULL, sanitize_code_type (0), 0UL, false, false }
> diff --git a/gcc/passes.cc b/gcc/passes.cc
> index a33c8d924a52..4c6ceac740ff 100644
> --- a/gcc/passes.cc
> +++ b/gcc/passes.cc
> @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "diagnostic-core.h" /* for fnotice */
> #include "stringpool.h"
> #include "attribs.h"
> +#include "kcfi.h"
> 
> /* Reserved TODOs */
> #define TODO_verify_il (1u << 31)
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 68ce53baa0f1..65dd0bf4a41e 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
>   NEXT_PASS (pass_ipa_auto_profile_offline);
>   NEXT_PASS (pass_ipa_free_lang_data);
>   NEXT_PASS (pass_ipa_function_and_variable_visibility);
> +  NEXT_PASS (pass_ipa_kcfi);
>   NEXT_PASS (pass_ipa_strub_mode);
>   NEXT_PASS (pass_build_ssa_passes);
>   PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
> diff --git a/gcc/rtl.def b/gcc/rtl.def
> index 15ae7d10fcc1..af643d187b95 100644
> --- a/gcc/rtl.def
> +++ b/gcc/rtl.def
> @@ -318,6 +318,12 @@ DEF_RTL_EXPR(CLOBBER, "clobber", "e", RTX_EXTRA)
> 
> DEF_RTL_EXPR(CALL, "call", "ee", RTX_EXTRA)
> 
> +/* KCFI wrapper for call expressions.
> +   Operand 0 is the call expression.
> +   Operand 1 is the KCFI type ID (const_int).  */
> +
> +DEF_RTL_EXPR(KCFI, "kcfi", "ee", RTX_EXTRA)
> +
> /* Return from a subroutine.  */
> 
> DEF_RTL_EXPR(RETURN, "return", "", RTX_EXTRA)
> diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
> index 63a1d08c46cf..5016fe93ccac 100644
> --- a/gcc/rtlanal.cc
> +++ b/gcc/rtlanal.cc
> @@ -1177,6 +1177,11 @@ reg_referenced_p (const_rtx x, const_rtx body)
>     case IF_THEN_ELSE:
>       return reg_overlap_mentioned_p (x, body);
> 
> +    case KCFI:
> +      /* For KCFI wrapper, check both the wrapped call and the type ID.  */
> +      return (reg_overlap_mentioned_p (x, XEXP (body, 0))
> +      || reg_overlap_mentioned_p (x, XEXP (body, 1)));
> +

Does the above change prevent indirect call-site merging?

thanks.

Qing


>     case TRAP_IF:
>       return reg_overlap_mentioned_p (x, TRAP_CONDITION (body));
> 
> diff --git a/gcc/target.def b/gcc/target.def
> index 8e491d838642..47a11c60809a 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -7589,6 +7589,44 @@ DEFHOOKPOD
> The default value is NULL.",
>  const char *, NULL)
> 
> +/* Kernel Control Flow Integrity (KCFI) hooks.  */
> +#undef HOOK_PREFIX
> +#define HOOK_PREFIX "TARGET_KCFI_"
> +HOOK_VECTOR (TARGET_KCFI, kcfi)
> +
> +DEFHOOK
> +(supported,
> + "Return true if the target supports Kernel Control Flow Integrity (KCFI).\n\
> +This hook indicates whether the target has implemented the necessary RTL\n\
> +patterns and infrastructure to support KCFI instrumentation.  The default\n\
> +implementation returns false.",
> + bool, (void),
> + hook_bool_void_false)
> +
> +DEFHOOK
> +(mask_type_id,
> + "Apply architecture-specific masking to KCFI type ID.  This hook allows\n\
> +targets to apply bit masks or other transformations to the computed KCFI\n\
> +type identifier to match the target's specific requirements.  The default\n\
> +implementation returns the type ID unchanged.",
> + uint32_t, (uint32_t type_id),
> + NULL)
> +
> +DEFHOOK
> +(emit_type_id,
> + "Emit architecture-specific type ID instruction for KCFI preambles\n\
> +and return the size of the instruction in bytes.\n\
> +@var{file} is the assembly output stream and @var{type_id} is the KCFI\n\
> +type identifier to emit.  If @var{file} is NULL, skip emission and only\n\
> +return the size.  If not overridden, the default fallback emits a\n\
> +@code{.word} directive with the type ID and returns 4 bytes.  Targets can\n\
> +override this to emit different instruction sequences and return their\n\
> +corresponding sizes.",
> + int, (FILE *file, uint32_t type_id),
> + NULL)
> +
> +HOOK_VECTOR_END (kcfi)
> +
> /* Close the 'struct gcc_target' definition.  */
> HOOK_VECTOR_END (C90_EMPTY_HACK)
> 
> diff --git a/gcc/toplev.cc b/gcc/toplev.cc
> index d26467450e37..f48cfeb050aa 100644
> --- a/gcc/toplev.cc
> +++ b/gcc/toplev.cc
> @@ -67,6 +67,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "attribs.h"
> #include "asan.h"
> #include "tsan.h"
> +#include "kcfi.h"
> #include "plugin.h"
> #include "context.h"
> #include "pass_manager.h"
> @@ -1739,6 +1740,15 @@ process_options ()
>  "requires %<-fno-exceptions%>");
>     }
> 
> +  if (flag_sanitize & SANITIZE_KCFI)
> +    {
> +      if (!targetm.kcfi.supported ())
> + sorry ("%<-fsanitize=kcfi%> not supported by this target");
> +
> +      if (!lang_GNU_C ())
> + sorry ("%<-fsanitize=kcfi%> is only supported for C");
> +    }
> +
>   HOST_WIDE_INT patch_area_size, patch_area_start;
>   parse_and_check_patch_area (flag_patchable_function_entry, false,
>      &patch_area_size, &patch_area_start);
> diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
> index 08e642178ba5..e674e176f7d3 100644
> --- a/gcc/tree-inline.cc
> +++ b/gcc/tree-inline.cc
> @@ -2104,6 +2104,16 @@ copy_bb (copy_body_data *id, basic_block bb,
>  /* Advance iterator now before stmt is moved to seq_gsi.  */
>  gsi_next (&stmts_gsi);
> 
> +  /* If inlining from a function with no_sanitize("kcfi"), mark any
> +     call statements in the inlined body with the flag so they skip
> +     KCFI instrumentation.  */
> +  if (is_gimple_call (stmt)
> +      && !sanitize_flags_p (SANITIZE_KCFI, id->src_fn))
> +    {
> +      gcall *call = as_a <gcall *> (stmt);
> +      gimple_call_set_inlined_from_kcfi_nosantize (call, true);
> +    }
> +
>  if (gimple_nop_p (stmt))
>      continue;
> 
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 0d78f5b384fb..d4e9e2373c6c 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -57,6 +57,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "attribs.h"
> #include "asan.h"
> #include "rtl-iter.h"
> +#include "kcfi.h"
> #include "file-prefix-map.h" /* remap_debug_filename()  */
> #include "alloc-pool.h"
> #include "toplev.h"
> @@ -2199,6 +2200,10 @@ assemble_start_function (tree decl, const char *fnname)
>   unsigned short patch_area_size = crtl->patch_area_size;
>   unsigned short patch_area_entry = crtl->patch_area_entry;
> 
> +  /* Emit KCFI preamble before any patchable areas.  */
> +  if (flag_sanitize & SANITIZE_KCFI)
> +    kcfi_emit_preamble (asm_out_file, decl, fnname);
> +
>   /* Emit the patching area before the entry label, if any.  */
>   if (patch_area_entry > 0)
>     targetm.asm_out.print_patchable_function_entry (asm_out_file,
> @@ -2767,6 +2772,9 @@ assemble_external_real (tree decl)
>       /* Some systems do require some output.  */
>       SYMBOL_REF_USED (XEXP (rtl, 0)) = 1;
>       ASM_OUTPUT_EXTERNAL (asm_out_file, decl, XSTR (XEXP (rtl, 0), 0));
> +
> +      if (flag_sanitize & SANITIZE_KCFI)
> + kcfi_emit_typeid_symbol (asm_out_file, decl);
>     }
> }
> #endif
> @@ -7283,16 +7291,25 @@ default_elf_asm_named_section (const char *name, unsigned int flags,
> fprintf (asm_out_file, ",%d", flags & SECTION_ENTSIZE);
>       if (flags & SECTION_LINK_ORDER)
> {
> -  /* For now, only section "__patchable_function_entries"
> -     adopts flag SECTION_LINK_ORDER, internal label LPFE*
> -     was emitted in default_print_patchable_function_entry,
> -     just place it here for linked_to section.  */
> -  gcc_assert (!strcmp (name, "__patchable_function_entries"));
> -  fprintf (asm_out_file, ",");
> -  char buf[256];
> -  ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE",
> -       current_function_funcdef_no);
> -  assemble_name_raw (asm_out_file, buf);
> +  if (!strcmp (name, "__patchable_function_entries"))
> +    {
> +      /* For patchable function entries, internal label LPFE*
> + was emitted in default_print_patchable_function_entry,
> + just place it here for linked_to section.  */
> +      fprintf (asm_out_file, ",");
> +      char buf[256];
> +      ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE",
> +   current_function_funcdef_no);
> +      assemble_name_raw (asm_out_file, buf);
> +    }
> +  else if (!strcmp (name, ".kcfi_traps"))
> +    {
> +      /* KCFI traps section links to .text section.  */
> +      fprintf (asm_out_file, ",.text");
> +    }
> +  else
> +    internal_error ("unexpected use of %<SECTION_LINK_ORDER%> by section %qs",
> +    name);
> }
>       if (HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE))
> {
> -- 
> 2.34.1
> 



* Re: [PATCH v3 1/7] typeinfo: Introduce KCFI typeinfo mangling API
  2025-09-13 23:23 ` [PATCH v3 1/7] typeinfo: Introduce KCFI typeinfo mangling API Kees Cook
@ 2025-09-17 17:56   ` Qing Zhao
  2025-09-17 21:20     ` Kees Cook
  2025-09-18  7:20     ` Martin Uecker
  0 siblings, 2 replies; 28+ messages in thread
From: Qing Zhao @ 2025-09-17 17:56 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Pinski, Jakub Jelinek, Martin Uecker, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org

Hi, 

> On Sep 13, 2025, at 19:23, Kees Cook <kees@kernel.org> wrote:
> 
> To support the KCFI typeid and future type-based allocators,

Could you please explain a little more about the “future type-based allocators”?

And why are these two new builtins necessary for this purpose?

> which need
> to convert unique types into unique 32-bit values, add a mangling system
> based on the Itanium C++ mangling ABI, adapted for for C types.

There is a redundant “for” in the last sentence above.

> Introduce
> __builtin_typeinfo_hash for the hash, and __builtin_typeinfo_name for
> testing and debugging (to see the human-readable mangling form).
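
To make "convert a type into a 32-bit value" concrete, here is a
sketch of hashing a mangled type string. Everything specific in it is
an assumption for illustration: the excerpt does not say which hash
function the patch uses (FNV-1a is swapped in here), and the strings
in the usage note follow the plain Itanium C++ encoding rather than
the patch's C adaptation.

```c
#include <stdint.h>

/* 32-bit FNV-1a over a NUL-terminated mangled type string.  The
   choice of FNV-1a is an assumption, not the patch's stated
   algorithm.  */
static uint32_t
sketch_typeinfo_hash (const char *mangled)
{
  uint32_t hash = 0x811c9dc5u;	/* FNV-1a 32-bit offset basis */
  for (const unsigned char *p = (const unsigned char *) mangled; *p; p++)
    {
      hash ^= *p;
      hash *= 0x01000193u;	/* FNV-1a 32-bit prime */
    }
  return hash;
}
```

Under the Itanium encoding, void (int) is spelled "FviE" and
int (int) is "FiiE", so the two prototypes hash to different values
while every translation unit that sees the same prototype computes the
same one.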

In addition to testing and debugging, are there any other use cases for
these two new compiler-provided builtins?

thanks.

Qing

> Add
> tests for typeinfo validation and error handling.
> 
> gcc/ChangeLog:
> 
> * Makefile.in: Add kcfi-typeinfo.o.
> * doc/extend.texi: Document typeinfo builtins.
> * kcfi-typeinfo.h: New file, typeinfo mangling API.
> * kcfi-typeinfo.cc: New file, implement typeinfo mangling.
> 
> gcc/c-family/ChangeLog:
> 
> * c-common.h (enum rid): Add typeinfo builtins.
> * c-common.cc: Add typeinfo builtins.
> 
> gcc/c/ChangeLog:
> 
> * c-parser.cc (c_parser_get_builtin_type_arg): New function,
> parse type.
> (c_parser_postfix_expression): Add typeinfo builtins.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.dg/builtin-typeinfo-errors.c: New test, validate bad
> arguments are rejected.
> * gcc.dg/builtin-typeinfo.c: New test, typeinfo mangling.
> 
> Signed-off-by: Kees Cook <kees@kernel.org>
> ---
> gcc/Makefile.in                               |   1 +
> gcc/c-family/c-common.h                       |   1 +
> gcc/kcfi-typeinfo.h                           |  32 ++
> .../gcc.dg/builtin-typeinfo-errors.c          |  28 ++
> gcc/testsuite/gcc.dg/builtin-typeinfo.c       | 350 +++++++++++++
> gcc/c-family/c-common.cc                      |   2 +
> gcc/c/c-parser.cc                             |  72 +++
> gcc/doc/extend.texi                           |  94 ++++
> gcc/kcfi-typeinfo.cc                          | 475 ++++++++++++++++++
> 9 files changed, 1055 insertions(+)
> create mode 100644 gcc/kcfi-typeinfo.h
> create mode 100644 gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c
> create mode 100644 gcc/testsuite/gcc.dg/builtin-typeinfo.c
> create mode 100644 gcc/kcfi-typeinfo.cc
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index d2744db843d7..a14fb498ce44 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1591,6 +1591,7 @@ OBJS = \
> ira-emit.o \
> ira-lives.o \
> jump.o \
> + kcfi-typeinfo.o \
> langhooks.o \
> late-combine.o \
> lcm.o \
> diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
> index b6021d241731..e0100837946e 100644
> --- a/gcc/c-family/c-common.h
> +++ b/gcc/c-family/c-common.h
> @@ -112,6 +112,7 @@ enum rid
>   RID_BUILTIN_SHUFFLEVECTOR,   RID_BUILTIN_CONVERTVECTOR,  RID_BUILTIN_TGMATH,
>   RID_BUILTIN_HAS_ATTRIBUTE,   RID_BUILTIN_ASSOC_BARRIER,  RID_BUILTIN_STDC,
>   RID_BUILTIN_COUNTED_BY_REF,
> +  RID_BUILTIN_TYPEINFO_NAME,  RID_BUILTIN_TYPEINFO_HASH,
>   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128, RID_DFLOAT64X,
> 
>   /* TS 18661-3 keywords, in the same sequence as the TI_* values.  */
> diff --git a/gcc/kcfi-typeinfo.h b/gcc/kcfi-typeinfo.h
> new file mode 100644
> index 000000000000..805f9ebaeca4
> --- /dev/null
> +++ b/gcc/kcfi-typeinfo.h
> @@ -0,0 +1,32 @@
> +/* KCFI-compatible type mangling, based on Itanium C++ ABI.
> +   Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_KCFI_TYPEINFO_H
> +#define GCC_KCFI_TYPEINFO_H
> +
> +#include "tree.h"
> +#include <string>
> +
> +/* Get the typeinfo mangled name string for any C type.  */
> +extern std::string typeinfo_get_name (tree type);
> +
> +/* Get the typeinfo hash for any C type.  */
> +extern uint32_t typeinfo_get_hash (tree type);
> +
> +#endif /* GCC_KCFI_TYPEINFO_H */
> diff --git a/gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c b/gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c
> new file mode 100644
> index 000000000000..71ad01337b4e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c
> @@ -0,0 +1,28 @@
> +/* Test error handling for __builtin_typeinfo_name and __builtin_typeinfo_hash.  */
> +/* { dg-do compile } */
> +
> +int main() {
> +    /* Test missing arguments */
> +    const char *result1 = __builtin_typeinfo_name(); /* { dg-error "expected specifier-qualifier-list before '\\)'" } */
> +    /* { dg-error "expected type name in '__builtin_typeinfo_name'" "" { target *-*-* } .-1 } */
> +    unsigned int result2 = __builtin_typeinfo_hash(); /* { dg-error "expected specifier-qualifier-list before '\\)'" } */
> +    /* { dg-error "expected type name in '__builtin_typeinfo_hash'" "" { target *-*-* } .-1 } */
> +
> +    /* Test wrong argument types (expressions instead of type names) */
> +    const char *result3 = __builtin_typeinfo_name(42); /* { dg-error "expected specifier-qualifier-list before numeric constant" } */
> +    /* { dg-error "expected type name in '__builtin_typeinfo_name'" "" { target *-*-* } .-1 } */
> +    unsigned int result4 = __builtin_typeinfo_hash(42); /* { dg-error "expected specifier-qualifier-list before numeric constant" } */
> +    /* { dg-error "expected type name in '__builtin_typeinfo_hash'" "" { target *-*-* } .-1 } */
> +
> +    int x = 5;
> +    const char *result5 = __builtin_typeinfo_name(x); /* { dg-error "expected specifier-qualifier-list before" } */
> +    /* { dg-error "expected type name in '__builtin_typeinfo_name'" "" { target *-*-* } .-1 } */
> +    unsigned int result6 = __builtin_typeinfo_hash(x); /* { dg-error "expected specifier-qualifier-list before" } */
> +    /* { dg-error "expected type name in '__builtin_typeinfo_hash'" "" { target *-*-* } .-1 } */
> +
> +    /* Test too many arguments */
> +    const char *result7 = __builtin_typeinfo_name(int, int); /* { dg-error "expected '\\)' before ','" } */
> +    unsigned int result8 = __builtin_typeinfo_hash(int, int); /* { dg-error "expected '\\)' before ','" } */
> +
> +    return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/builtin-typeinfo.c b/gcc/testsuite/gcc.dg/builtin-typeinfo.c
> new file mode 100644
> index 000000000000..744dc50f407e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/builtin-typeinfo.c
> @@ -0,0 +1,350 @@
> +/* Test KCFI type mangling using __builtin_typeinfo_name.  */
> +/* { dg-do run } */
> +/* { dg-options "-std=gnu99" } */
> +
> +#include <stdio.h>
> +#include <string.h>
> +#include <stdarg.h>
> +
> +int pass, fail;
> +
> +#define TEST_STRING(expr, expected_string) \
> +  do { \
> +    const char *actual_string = __builtin_typeinfo_name(typeof(expr)); \
> +    printf("Testing %s: ", #expr); \
> +    if (strcmp(actual_string, expected_string) == 0) { \
> +      printf("PASS (%s)\n", actual_string); \
> +      pass ++; \
> +    } else { \
> +      printf("FAIL\n"); \
> +      printf("  Expected: %s\n", expected_string); \
> +      printf("  Actual:   %s\n", actual_string); \
> +      fail ++; \
> +    } \
> +  } while (0)
> +
> +int main(void)
> +{
> +    printf("Testing KCFI Typeinfo Mangling\n");
> +    printf("======================================================\n");
> +
> +    /* Test basic types */
> +    TEST_STRING(void, "_ZTSv");
> +    TEST_STRING(char, "_ZTSc");
> +    TEST_STRING(int, "_ZTSi");
> +    TEST_STRING(short, "_ZTSs");
> +    TEST_STRING(long, "_ZTSl");
> +    TEST_STRING(float, "_ZTSf");
> +    TEST_STRING(double, "_ZTSd");
> +
> +    /* Test qualified types */
> +    TEST_STRING(const int, "_ZTSKi");
> +    TEST_STRING(volatile int, "_ZTSVi");
> +
> +    /* Test pointer types */
> +    TEST_STRING(char*, "_ZTSPc");
> +    TEST_STRING(int*, "_ZTSPi");
> +    TEST_STRING(void*, "_ZTSPv");
> +    TEST_STRING(const char*, "_ZTSPKc");
> +
> +    /* Test array types */
> +    TEST_STRING(int[10],  "_ZTSA10_i");
> +    TEST_STRING(char[20], "_ZTSA20_c");
> +    TEST_STRING(short[],  "_ZTSA_s");
> +
> +    /* Test basic function types */
> +    extern void func_void(void);
> +    extern void func_char(char x);
> +    extern void func_short(short x);
> +    extern void func_int(int x);
> +    extern void func_long(long x);
> +    TEST_STRING(func_void,  "_ZTSFvvE");
> +    TEST_STRING(func_char,  "_ZTSFvcE");
> +    TEST_STRING(func_short, "_ZTSFvsE");
> +    TEST_STRING(func_int,   "_ZTSFviE");
> +    TEST_STRING(func_long,  "_ZTSFvlE");
> +
> +    /* Test functions with unsigned types */
> +    extern void func_unsigned_char(unsigned char x);
> +    extern void func_unsigned_short(unsigned short x);
> +    extern void func_unsigned_int(unsigned int x);
> +    TEST_STRING(func_unsigned_char,  "_ZTSFvhE");
> +    TEST_STRING(func_unsigned_short, "_ZTSFvtE");
> +    TEST_STRING(func_unsigned_int,   "_ZTSFvjE");
> +
> +    /* Test functions with signed types */
> +    extern void func_signed_char(signed char x);
> +    extern void func_signed_short(signed short x);
> +    extern void func_signed_int(signed int x);
> +    TEST_STRING(func_signed_char,  "_ZTSFvaE");
> +    TEST_STRING(func_signed_short, "_ZTSFvsE");
> +    TEST_STRING(func_signed_int,   "_ZTSFviE");
> +
> +    /* Test functions with pointer types */
> +    extern void func_void_ptr(void *x);
> +    extern void func_char_ptr(char *x);
> +    extern void func_short_ptr(short *x);
> +    extern void func_int_ptr(int *x);
> +    extern void func_int_array(int arr[]); /* Decays to "int *".  */
> +    extern void func_long_ptr(long *x);
> +    TEST_STRING(func_void_ptr,  "_ZTSFvPvE");
> +    TEST_STRING(func_char_ptr,  "_ZTSFvPcE");
> +    TEST_STRING(func_short_ptr, "_ZTSFvPsE");
> +    TEST_STRING(func_int_ptr,   "_ZTSFvPiE");
> +    TEST_STRING(func_int_array, "_ZTSFvPiE");
> +    TEST_STRING(func_long_ptr,  "_ZTSFvPlE");
> +
> +    /* Test functions with const qualifiers */
> +    extern void func_const_void_ptr(const void *x);
> +    extern void func_const_char_ptr(const char *x);
> +    extern void func_const_short_ptr(const short *x);
> +    extern void func_const_int_ptr(const int *x);
> +    extern void func_const_long_ptr(const long *x);
> +    TEST_STRING(func_const_void_ptr,  "_ZTSFvPKvE");
> +    TEST_STRING(func_const_char_ptr,  "_ZTSFvPKcE");
> +    TEST_STRING(func_const_short_ptr, "_ZTSFvPKsE");
> +    TEST_STRING(func_const_int_ptr,   "_ZTSFvPKiE");
> +    TEST_STRING(func_const_long_ptr,  "_ZTSFvPKlE");
> +
> +    /* Test nested pointers */
> +    extern void func_int_ptr_ptr(int **x);
> +    extern void func_char_ptr_ptr(char **x);
> +    TEST_STRING(func_int_ptr_ptr,  "_ZTSFvPPiE");
> +    TEST_STRING(func_char_ptr_ptr, "_ZTSFvPPcE");
> +
> +    /* Test multiple parameters */
> +    extern void func_int_char(int x, char y);
> +    extern void func_char_int(char x, int y);
> +    extern void func_two_int(int x, int y);
> +    TEST_STRING(func_int_char, "_ZTSFvicE");
> +    TEST_STRING(func_char_int, "_ZTSFvciE");
> +    TEST_STRING(func_two_int,  "_ZTSFviiE");
> +
> +    /* Test return types */
> +    extern int func_return_int(void);
> +    extern char func_return_char(void);
> +    extern void* func_return_ptr(void);
> +    TEST_STRING(func_return_int,  "_ZTSFivE");
> +    TEST_STRING(func_return_char, "_ZTSFcvE");
> +    TEST_STRING(func_return_ptr,  "_ZTSFPvvE");
> +
> +    /* Test function pointer parameters */
> +    extern void func_fptr_void(void (*fp)(void));
> +    extern void func_fptr_int(void (*fp)(int));
> +    extern void func_fptr_ret_int(int (*fp)(void));
> +    TEST_STRING(func_fptr_void,    "_ZTSFvPFvvEE");
> +    TEST_STRING(func_fptr_int,     "_ZTSFvPFviEE");
> +    TEST_STRING(func_fptr_ret_int, "_ZTSFvPFivEE");
> +
> +    /* Test variadic functions */
> +    struct audit_context { int dummy; };
> +    extern void func_variadic_simple(const char *fmt, ...);
> +    extern void func_variadic_mixed(int x, const char *fmt, ...);
> +    extern void func_variadic_multi(int x, char y, const char *fmt, ...);
> +    extern void audit_log_pattern(struct audit_context *ctx, unsigned int gfp_mask,
> +  int type, const char *fmt, ...);
> +    TEST_STRING(func_variadic_simple, "_ZTSFvPKczE");
> +    TEST_STRING(func_variadic_mixed,  "_ZTSFviPKczE");
> +    TEST_STRING(func_variadic_multi,  "_ZTSFvicPKczE");
> +    TEST_STRING(audit_log_pattern,    "_ZTSFvP13audit_contextjiPKczE");
> +
> +    /* Test mixed const/non-const */
> +    extern void func_const_mixed(int x, const char *fmt);
> +    TEST_STRING(func_const_mixed,  "_ZTSFviPKcE");
> +
> +    /* Test named struct types */
> +    struct test_struct_a { int x; };
> +    struct test_struct_b { char y; };
> +    struct test_struct_c { void *ptr; };
> +    TEST_STRING(struct test_struct_a, "_ZTS13test_struct_a");
> +    extern void func_struct_a_ptr(struct test_struct_a *x);
> +    extern void func_struct_b_ptr(struct test_struct_b *x);
> +    extern void func_struct_c_ptr(struct test_struct_c *x);
> +    TEST_STRING(func_struct_a_ptr, "_ZTSFvP13test_struct_aE");
> +    TEST_STRING(func_struct_b_ptr, "_ZTSFvP13test_struct_bE");
> +    TEST_STRING(func_struct_c_ptr, "_ZTSFvP13test_struct_cE");
> +
> +    /* Test const named struct types */
> +    extern void func_const_struct_a_ptr(const struct test_struct_a *x);
> +    extern void func_const_struct_b_ptr(const struct test_struct_b *x);
> +    extern void func_const_struct_c_ptr(const struct test_struct_c *x);
> +    TEST_STRING(func_const_struct_a_ptr, "_ZTSFvPK13test_struct_aE");
> +    TEST_STRING(func_const_struct_b_ptr, "_ZTSFvPK13test_struct_bE");
> +    TEST_STRING(func_const_struct_c_ptr, "_ZTSFvPK13test_struct_cE");
> +
> +    /* Test named union types */
> +    union test_union_a { int x; float y; };
> +    union test_union_b { char a; void *b; };
> +    TEST_STRING(union test_union_a,  "_ZTS12test_union_a");
> +    extern void func_union_a_ptr(union test_union_a *x);
> +    extern void func_union_b_ptr(union test_union_b *x);
> +    TEST_STRING(func_union_a_ptr, "_ZTSFvP12test_union_aE");
> +    TEST_STRING(func_union_b_ptr, "_ZTSFvP12test_union_bE");
> +
> +    /* Test enum types: distinct from int */
> +    enum test_enum_a { ENUM_A_VAL };
> +    enum test_enum_b { ENUM_B_VAL };
> +    TEST_STRING(enum test_enum_a, "_ZTS11test_enum_a");
> +    extern void func_enum_a_ptr(enum test_enum_a *x);
> +    extern void func_enum_b_ptr(enum test_enum_b *x);
> +    TEST_STRING(func_enum_a_ptr, "_ZTSFvP11test_enum_aE");
> +    TEST_STRING(func_enum_b_ptr, "_ZTSFvP11test_enum_bE");
> +
> +    /* Test union member discrimination */
> +    struct tasklet {
> +        int state;
> +        union {
> +            void (*func)(unsigned long data);
> +            void (*callback)(struct tasklet *t);
> +        };
> +        unsigned long data;
> +    } tasklet_instance;
> +    TEST_STRING(tasklet_instance, "_ZTS7tasklet");
> +    struct tasklet *p = &tasklet_instance;
> +    extern void tasklet_callback_function(struct tasklet *t);
> +    extern void tasklet_func_function(unsigned long data);
> +    TEST_STRING(tasklet_func_function,     "_ZTSFvmE");
> +    TEST_STRING(*p->func,                  "_ZTSFvmE");
> +    TEST_STRING(tasklet_callback_function, "_ZTSFvP7taskletE");
> +    TEST_STRING(*p->callback,              "_ZTSFvP7taskletE");
> +
> +    /* Test struct return pointers */
> +    extern struct test_struct_a* func_ret_struct_a_ptr(void);
> +    extern struct test_struct_b* func_ret_struct_b_ptr(void);
> +    extern struct test_struct_c* func_ret_struct_c_ptr(void);
> +    TEST_STRING(func_ret_struct_a_ptr, "_ZTSFP13test_struct_avE");
> +    TEST_STRING(func_ret_struct_b_ptr, "_ZTSFP13test_struct_bvE");
> +    TEST_STRING(func_ret_struct_c_ptr, "_ZTSFP13test_struct_cvE");
> +
> +    /* Test struct by-value parameters */
> +    extern void func_struct_a_val(struct test_struct_a x);
> +    extern void func_struct_b_val(struct test_struct_b x);
> +    extern void func_struct_c_val(struct test_struct_c x);
> +    TEST_STRING(func_struct_a_val, "_ZTSFv13test_struct_aE");
> +    TEST_STRING(func_struct_b_val, "_ZTSFv13test_struct_bE");
> +    TEST_STRING(func_struct_c_val, "_ZTSFv13test_struct_cE");
> +
> +    /* Test struct return by-value */
> +    extern struct test_struct_a func_ret_struct_a_val(void);
> +    extern struct test_struct_b func_ret_struct_b_val(void);
> +    extern struct test_struct_c func_ret_struct_c_val(void);
> +    TEST_STRING(func_ret_struct_a_val, "_ZTSF13test_struct_avE");
> +    TEST_STRING(func_ret_struct_b_val, "_ZTSF13test_struct_bvE");
> +    TEST_STRING(func_ret_struct_c_val, "_ZTSF13test_struct_cvE");
> +
> +    /* Test mixed struct parameters */
> +    extern void func_struct_a_b(struct test_struct_a *a, struct test_struct_b *b);
> +    extern void func_struct_b_a(struct test_struct_b *b, struct test_struct_a *a);
> +    TEST_STRING(func_struct_a_b, "_ZTSFvP13test_struct_aP13test_struct_bE");
> +    TEST_STRING(func_struct_b_a, "_ZTSFvP13test_struct_bP13test_struct_aE");
> +
> +    /* Test anonymous struct typedefs */
> +    typedef struct { int x; } typedef_struct_x;
> +    typedef struct { int y; } typedef_struct_y;
> +    TEST_STRING(typedef_struct_x, "_ZTS16typedef_struct_x");
> +    extern void func_typedef_x_ptr(typedef_struct_x *x);
> +    extern void func_typedef_y_ptr(typedef_struct_y *y);
> +    TEST_STRING(func_typedef_x_ptr, "_ZTSFvP16typedef_struct_xE");
> +    TEST_STRING(func_typedef_y_ptr, "_ZTSFvP16typedef_struct_yE");
> +    extern void func_typedef_x(typedef_struct_x x);
> +    TEST_STRING(func_typedef_x, "_ZTSFv16typedef_struct_xE");
> +
> +    /* Test anonymous union typedefs */
> +    typedef union { int x; short a; } typedef_union_x;
> +    typedef union { int y; short b; } typedef_union_y;
> +    TEST_STRING(typedef_union_x, "_ZTS15typedef_union_x");
> +    extern void func_typedef_union_x_ptr(typedef_union_x *x);
> +    extern void func_typedef_union_y_ptr(typedef_union_y *y);
> +    TEST_STRING(func_typedef_union_x_ptr, "_ZTSFvP15typedef_union_xE");
> +    TEST_STRING(func_typedef_union_y_ptr, "_ZTSFvP15typedef_union_yE");
> +    extern void func_typedef_union_x(typedef_union_x x);
> +    TEST_STRING(func_typedef_union_x, "_ZTSFv15typedef_union_xE");
> +
> +    /* Test anonymous enum typedefs */
> +    typedef enum { STEP_1, STEP_2 } typedef_enum_x;
> +    typedef enum { STEP_A, STEP_B } typedef_enum_y;
> +    TEST_STRING(typedef_enum_x, "_ZTS14typedef_enum_x");
> +    extern void func_typedef_enum_x_ptr(typedef_enum_x *x);
> +    extern void func_typedef_enum_y_ptr(typedef_enum_y *y);
> +    TEST_STRING(func_typedef_enum_x_ptr, "_ZTSFvP14typedef_enum_xE");
> +    TEST_STRING(func_typedef_enum_y_ptr, "_ZTSFvP14typedef_enum_yE");
> +    extern void func_typedef_enum_x(typedef_enum_x x);
> +    TEST_STRING(func_typedef_enum_x, "_ZTSFv14typedef_enum_xE");
> +
> +    /* Test basic typedef vs open-coded function types: should be the same.  */
> +    typedef void (*func_type_typedef)(int, char);
> +    TEST_STRING(func_type_typedef,           "_ZTSPFvicE");
> +    extern void func_with_typedef_param(func_type_typedef fp);
> +    extern void func_with_opencoded_param(void (*fp)(int, char));
> +    TEST_STRING(func_with_typedef_param,   "_ZTSFvPFvicEE");
> +    TEST_STRING(func_with_opencoded_param, "_ZTSFvPFvicEE");
> +
> +    /* Test return function pointer types */
> +    typedef int (*ret_func_type_typedef)(void);
> +    TEST_STRING(ret_func_type_typedef,     "_ZTSPFivE");
> +    extern ret_func_type_typedef func_ret_typedef_param(void);
> +    extern int (*func_ret_opencoded_param(void))(void);
> +    TEST_STRING(func_ret_typedef_param,   "_ZTSFPFivEvE");
> +    TEST_STRING(func_ret_opencoded_param, "_ZTSFPFivEvE");
> +
> +    /* Test additional type combos */
> +    extern void func_float(float x);
> +    extern void func_double_ptr(double *x);
> +    extern void func_float_ptr(float *x);
> +    extern void func_void_ptr_ptr(void **x);
> +    extern void func_ptr_val(int *x, int y);
> +    extern void func_val_ptr(int x, int *y);
> +    extern float func_return_float(void);
> +    extern double func_return_double(void);
> +    TEST_STRING(func_float,         "_ZTSFvfE");
> +    TEST_STRING(func_double_ptr,    "_ZTSFvPdE");
> +    TEST_STRING(func_float_ptr,     "_ZTSFvPfE");
> +    TEST_STRING(func_void_ptr_ptr,  "_ZTSFvPPvE");
> +    TEST_STRING(func_ptr_val,       "_ZTSFvPiiE");
> +    TEST_STRING(func_val_ptr,       "_ZTSFviPiE");
> +    TEST_STRING(func_return_float,  "_ZTSFfvE");
> +    TEST_STRING(func_return_double, "_ZTSFdvE");
> +
> +    /* Test VLA types: should be all the same.  */
> +    extern void func_vla_1d(int n, int arr[n]);
> +    extern void func_vla_empty(int n, int arr[]);
> +    extern void func_vla_ptr(int n, int *arr);
> +    TEST_STRING(func_vla_1d,    "_ZTSFviPiE");
> +    TEST_STRING(func_vla_empty, "_ZTSFviPiE");
> +    TEST_STRING(func_vla_ptr,   "_ZTSFviPiE");
> +
> +    /* Test 2D VLA with fixed dimension: should be all the same.  */
> +    extern void func_vla_2d_first(int n, int arr[n][10]);
> +    extern void func_vla_2d_empty(int n, int arr[][10]);
> +    extern void func_vla_2d_ptr(int n, int (*arr)[10]);
> +    TEST_STRING(func_vla_2d_first, "_ZTSFviPA10_iE");
> +    TEST_STRING(func_vla_2d_empty, "_ZTSFviPA10_iE");
> +    TEST_STRING(func_vla_2d_ptr,   "_ZTSFviPA10_iE");
> +
> +    /* Test 2D VLA with both dimensions variable: should be all the same.  */
> +    extern void func_vla_2d_both(int rows, int cols, int arr[rows][cols]);
> +    extern void func_vla_2d_second(int rows, int cols, int arr[][cols]);
> +    extern void func_vla_2d_star(int rows, int cols, int arr[*][cols]);
> +    TEST_STRING(func_vla_2d_both,   "_ZTSFviiPA_iE");
> +    TEST_STRING(func_vla_2d_second, "_ZTSFviiPA_iE");
> +    TEST_STRING(func_vla_2d_star,   "_ZTSFviiPA_iE");
> +
> +    /* Test recursive typedef canonicalization */
> +    struct recursive_struct_test { int field; };
> +    typedef struct recursive_struct_test recursive_struct_typedef_1;
> +    typedef recursive_struct_typedef_1 recursive_struct_typedef_2;
> +    extern void func_recursive_struct_test(struct recursive_struct_test *x);
> +    TEST_STRING(func_recursive_struct_test, "_ZTSFvP21recursive_struct_testE");
> +
> +    /* Test anonymous struct, union, enum types */
> +    struct { int a; short b; } anon_struct;
> +    union { int x; float y; } anon_union;
> +    enum { ANON_VAL1, ANON_VAL2 } anon_enum;
> +    TEST_STRING(anon_struct, "_ZTS3$_0"); // <length>$_<counter>
> +    TEST_STRING(anon_union, "_ZTS3$_1");  // <length>$_<counter>
> +    TEST_STRING(anon_enum, "_ZTS3$_2");   // <length>$_<counter>
> +
> +    printf("\n================================================================\n");
> +    printf("Passed: %d Failed: %d (%d total tests)\n", pass, fail, pass + fail);
> +    return fail;
> +}
> diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> index e7dd4602ac11..94f2c2001ad5 100644
> --- a/gcc/c-family/c-common.cc
> +++ b/gcc/c-family/c-common.cc
> @@ -461,6 +461,8 @@ const struct c_common_resword c_common_reswords[] =
>   { "__builtin_stdc_trailing_zeros", RID_BUILTIN_STDC, D_CONLY },
>   { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
>   { "__builtin_offsetof", RID_OFFSETOF, 0 },
> +  { "__builtin_typeinfo_hash", RID_BUILTIN_TYPEINFO_HASH, D_CONLY },
> +  { "__builtin_typeinfo_name", RID_BUILTIN_TYPEINFO_NAME, D_CONLY },
>   { "__builtin_types_compatible_p", RID_TYPES_COMPATIBLE_P, D_CONLY },
>   { "__builtin_c23_va_start", RID_C23_VA_START, D_C23 },
>   { "__builtin_va_arg", RID_VA_ARG, 0 },
> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> index e8b64948bf69..996fb576ac7c 100644
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -77,6 +77,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "asan.h"
> #include "c-family/c-ubsan.h"
> #include "gcc-urlifier.h"
> +#include "kcfi-typeinfo.h"
> 
> /* We need to walk over decls with incomplete struct/union/enum types
>    after parsing the whole translation unit.
> @@ -11017,6 +11018,38 @@ c_parser_has_attribute_expression (c_parser *parser)
>   return result;
> }
> 
> +/* Parse the single type name argument of a builtin that takes a type name.
> +   Returns true on success and stores the parsed type in *OUT_TYPE.
> +   If successful, *OUT_CLOSE_PAREN_LOC is written with the location of
> +   the closing parenthesis.  */
> +
> +static bool
> +c_parser_get_builtin_type_arg (c_parser *parser, const char *bname,
> +       tree *out_type, location_t *out_close_paren_loc)
> +{
> +  matching_parens parens;
> +  if (!parens.require_open (parser))
> +    return false;
> +
> +  struct c_type_name *type_name = c_parser_type_name (parser);
> +  if (type_name == NULL)
> +    {
> +      error_at (c_parser_peek_token (parser)->location,
> + "expected type name in %qs", bname);
> +      return false;
> +    }
> +
> +  *out_close_paren_loc = c_parser_peek_token (parser)->location;
> +  parens.skip_until_found_close (parser);
> +
> +  tree type = groktypename (type_name, NULL, NULL);
> +  if (type == error_mark_node)
> +    return false;
> +
> +  *out_type = type;
> +  return true;
> +}
> +
> /* Helper function to read arguments of builtins which are interfaces
>    for the middle-end nodes like COMPLEX_EXPR, VEC_PERM_EXPR and
>    others.  The name of the builtin is passed using BNAME parameter.
> @@ -12025,6 +12058,45 @@ c_parser_postfix_expression (c_parser *parser)
>    set_c_expr_source_range (&expr, loc, close_paren_loc);
>  }
>  break;
> + case RID_BUILTIN_TYPEINFO_NAME:
> +  {
> +    c_parser_consume_token (parser);
> +    location_t close_paren_loc;
> +    tree type;
> +    if (!c_parser_get_builtin_type_arg (parser,
> + "__builtin_typeinfo_name",
> + &type, &close_paren_loc))
> +      {
> + expr.set_error ();
> + break;
> +      }
> +
> +    /* Call the typeinfo name function.  */
> +    std::string type_name = typeinfo_get_name (type);
> +    expr.value = build_string_literal (type_name.length () + 1,
> +       type_name.c_str ());
> +    set_c_expr_source_range (&expr, loc, close_paren_loc);
> +  }
> +  break;
> + case RID_BUILTIN_TYPEINFO_HASH:
> +  {
> +    c_parser_consume_token (parser);
> +    location_t close_paren_loc;
> +    tree type;
> +    if (!c_parser_get_builtin_type_arg (parser,
> + "__builtin_typeinfo_hash",
> + &type, &close_paren_loc))
> +      {
> + expr.set_error ();
> + break;
> +      }
> +
> +    /* Call the typeinfo hash function.  */
> +    uint32_t type_hash = typeinfo_get_hash (type);
> +    expr.value = build_int_cst (unsigned_type_node, type_hash);
> +    set_c_expr_source_range (&expr, loc, close_paren_loc);
> +  }
> +  break;
> case RID_BUILTIN_TGMATH:
>  {
>    vec<c_expr_t, va_gc> *cexpr_list;
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 382295834035..7cddea1ed6c1 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -17547,6 +17547,100 @@ which will cause a @code{NULL} pointer to be used for the unsafe case.
> 
> @enddefbuiltin
> 
> +@defbuiltin{{unsigned int} __builtin_typeinfo_hash (@var{type})}
> +
> +The built-in function @code{__builtin_typeinfo_hash} returns a hash value
> +for the given type @var{type} (which is a type, not an expression).  The hash
> +is computed using the FNV-1a algorithm on the type's mangled name representation,
> +which follows a subset of the Itanium C++ ABI conventions adapted for C types.
> +(See @code{__builtin_typeinfo_name} for the string representation.)
> +
> +This built-in is primarily intended for kernel control flow integrity (KCFI)
> +implementations and other type-aware runtime systems that need to generate
> +consistent type identifiers.  The hash value is a 32-bit unsigned integer.
> +
> +Key characteristics of the hash:
> +@itemize @bullet
> +@item
> +The hash is consistent for the same type across different translation units.
> +@item
> +Typedefs are recursively canonicalized down to integral type name or named
> +struct, union, or enum tag name.
> +@item
> +Typedefs of anonymous structs, unions, and enums preserve the typedef name
> +in the hash calculation (e.g., @code{typedef struct @{ int x; @} foo_t;}
> +uses @code{foo_t} in the hash).
> +@item
> +Type qualifiers (@code{const}, @code{volatile}, @code{restrict}) affect
> +the hash value.
> +@item
> +Function types include parameter types and variadic markers in the hash.
> +@end itemize
> +
> +For example:
> +@smallexample
> +typedef struct @{ int x; @} mytype_t;
> +unsigned int hash1 = __builtin_typeinfo_hash(mytype_t);
> +unsigned int hash2 = __builtin_typeinfo_hash(struct @{ int x; @});
> +/* hash1 != hash2 because the typedef name is preserved */
> +
> +void func(int x, char y);
> +unsigned int hash3 = __builtin_typeinfo_hash(typeof(func));
> +/* Returns hash for function type "void(int, char)" */
> +@end smallexample
> +
> +@emph{Note:} This construct is only available for C@. For C++, see
> +@code{std::type_info::hash_code}.
> +
> +@enddefbuiltin
> +
> +@defbuiltin{{const char *} __builtin_typeinfo_name (@var{type})}
> +
> +The built-in function @code{__builtin_typeinfo_name} returns a string
> +containing the mangled name representation of the given type @var{type}
> +(which is a type, not an expression).  The string follows a subset of the
> +Itanium C++ ABI mangling conventions adapted for C types.  (See
> +@code{__builtin_typeinfo_hash} for the unsigned 32-bit hash representation.)
> +
> +The returned string is a compile-time constant suitable for use in
> +string comparisons, debugging output, or other type introspection needs.
> +The string begins with @code{_ZTS} followed by the encoded type information.
> +
> +Mangling examples:
> +@itemize @bullet
> +@item
> +@code{int} becomes @code{"_ZTSi"}
> +@item
> +@code{char *} becomes @code{"_ZTSPc"}
> +@item
> +@code{const int} becomes @code{"_ZTSKi"}
> +@item
> +@code{int[10]} becomes @code{"_ZTSA10_i"}
> +@item
> +@code{void (*)(int)} becomes @code{"_ZTSPFviE"}
> +@item
> +@code{struct foo} becomes @code{"_ZTS3foo"}
> +@item
> +@code{typedef struct @{ int x; @} bar_t;} becomes @code{"_ZTS5bar_t"}
> +@end itemize
> +
> +The mangling preserves typedef names for anonymous compound types, which
> +is particularly useful for distinguishing between different typedefs of
> +structurally identical anonymous types:
> +
> +@smallexample
> +typedef struct @{ int x; @} type_a;
> +typedef struct @{ int x; @} type_b;
> +const char *name_a = __builtin_typeinfo_name(type_a);  /* "_ZTS6type_a" */
> +const char *name_b = __builtin_typeinfo_name(type_b);  /* "_ZTS6type_b" */
> +/* name_a and name_b are different despite identical structure */
> +@end smallexample
> +
> +@emph{Note:} This construct is only available for C@. For C++, see
> +@code{std::type_info::name}.
> +
> +@enddefbuiltin
> +
> @defbuiltin{int __builtin_types_compatible_p (@var{type1}, @var{type2})}
> 
> You can use the built-in function @code{__builtin_types_compatible_p} to
> diff --git a/gcc/kcfi-typeinfo.cc b/gcc/kcfi-typeinfo.cc
> new file mode 100644
> index 000000000000..24099c42cc2e
> --- /dev/null
> +++ b/gcc/kcfi-typeinfo.cc
> @@ -0,0 +1,475 @@
> +/* KCFI-compatible type mangling, based on Itanium C++ ABI.
> +   Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +/* Produces typeinfo mangling similar to Itanium C++ Mangling ABI, but
> +   limited to types exposed within GCC for C language handling.  The
> +   hashes are used by KCFI (and future type-aware allocator support).
> +   The strings are used for testing and debugging.  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tree.h"
> +#include "diagnostic-core.h"
> +#include "stringpool.h"
> +#include "stor-layout.h"
> +#include "print-tree.h"
> +#include "kcfi-typeinfo.h"
> +
> +/* Helper to update FNV-1a hash with a single character.  */
> +
> +static inline void
> +fnv1a_hash_char (uint32_t *hash_state, unsigned char c)
> +{
> +  *hash_state ^= c;
> +  *hash_state *= 16777619U; /* FNV-1a 32-bit prime.  */
> +}
> +
> +/* Helper to append character to optional string and update hash using
> +   FNV-1a.  */
> +
> +static void
> +append_char (char c, std::string *out_str, uint32_t *hash_state)
> +{
> +  if (out_str)
> +    *out_str += c;
> +  if (!hash_state)
> +    return;
> +  fnv1a_hash_char (hash_state, (unsigned char) c);
> +}
> +
> +/* Helper to append string to optional string and update hash using
> +   FNV-1a.  */
> +
> +static void
> +append_string (const char *str, std::string *out_str, uint32_t *hash_state)
> +{
> +  if (out_str)
> +    *out_str += str;
> +  if (!hash_state)
> +    return;
> +  for (const char *p = str; *p; p++)
> +    fnv1a_hash_char (hash_state, (unsigned char) *p);
> +}
> +
> +/* Forward declaration for recursive type mangling.  */
> +
> +static void mangle_type (tree type, std::string *out_str, uint32_t *hash_state);
> +
> +/* Mangle a builtin type following Itanium C++ ABI for C types.  */
> +
> +static void
> +mangle_builtin_type (tree type, std::string *out_str, uint32_t *hash_state)
> +{
> +  gcc_assert (type != NULL_TREE);
> +
> +  switch (TREE_CODE (type))
> +    {
> +    case VOID_TYPE:
> +      append_char ('v', out_str, hash_state);
> +      return;
> +
> +    case BOOLEAN_TYPE:
> +      append_char ('b', out_str, hash_state);
> +      return;
> +
> +    case INTEGER_TYPE:
> +      if (type == char_type_node)
> + append_char ('c', out_str, hash_state);
> +      else if (type == signed_char_type_node)
> + append_char ('a', out_str, hash_state);
> +      else if (type == unsigned_char_type_node)
> + append_char ('h', out_str, hash_state);
> +      else if (type == short_integer_type_node)
> + append_char ('s', out_str, hash_state);
> +      else if (type == short_unsigned_type_node)
> + append_char ('t', out_str, hash_state);
> +      else if (type == integer_type_node)
> + append_char ('i', out_str, hash_state);
> +      else if (type == unsigned_type_node)
> + append_char ('j', out_str, hash_state);
> +      else if (type == long_integer_type_node)
> + append_char ('l', out_str, hash_state);
> +      else if (type == long_unsigned_type_node)
> + append_char ('m', out_str, hash_state);
> +      else if (type == long_long_integer_type_node)
> + append_char ('x', out_str, hash_state);
> +      else if (type == long_long_unsigned_type_node)
> + append_char ('y', out_str, hash_state);
> +      else
> + {
> +  /* Fallback for other integer types - use precision-based
> +     encoding.  */
> +  append_char ('i', out_str, hash_state);
> +  append_string (std::to_string (TYPE_PRECISION (type)).c_str (),
> + out_str, hash_state);
> + }
> +      return;
> +
> +    case REAL_TYPE:
> +      if (type == float_type_node)
> + append_char ('f', out_str, hash_state);
> +      else if (type == double_type_node)
> + append_char ('d', out_str, hash_state);
> +      else if (type == long_double_type_node)
> + append_char ('e', out_str, hash_state);
> +      else
> + {
> +  /* Fallback for other real types.  */
> +  append_char ('f', out_str, hash_state);
> +  append_string (std::to_string (TYPE_PRECISION (type)).c_str (),
> + out_str, hash_state);
> + }
> +      return;
> +
> +    case VECTOR_TYPE:
> +      {
> + /* Handle vector types:
> +   Dv<num-elements>_<element-type-encoding>
> +   Example: uint8x16_t -> Dv16_h (vector of 16 unsigned char)  */
> + tree vector_size = TYPE_SIZE_UNIT (type);
> + tree element_type = TREE_TYPE (type);
> + tree element_size = TYPE_SIZE_UNIT (element_type);
> +
> + if (vector_size && element_size
> +    && TREE_CODE (vector_size) == INTEGER_CST
> +    && TREE_CODE (element_size) == INTEGER_CST)
> +  {
> +    append_char ('D', out_str, hash_state);
> +    append_char ('v', out_str, hash_state);
> +
> +    unsigned HOST_WIDE_INT vec_bytes = tree_to_uhwi (vector_size);
> +    unsigned HOST_WIDE_INT elem_bytes = tree_to_uhwi (element_size);
> +    unsigned HOST_WIDE_INT num_elements = vec_bytes / elem_bytes;
> +
> +    /* Append number of elements.  */
> +    append_string (std::to_string (num_elements).c_str (),
> +   out_str, hash_state);
> +    append_char ('_', out_str, hash_state);
> +
> +    /* Recursively mangle the element type.  */
> +    mangle_type (element_type, out_str, hash_state);
> +    return;
> +  }
> + /* Fail for vectors with unknown size.  */
> +      }
> +      break;
> +
> +    default:
> +      break;
> +    }
> +
> +  /* Unknown builtin type: this should never happen in a well-formed C.  */
> +  debug_tree (type);
> +  internal_error ("mangle: Unknown builtin type - please report this as a bug");
> +}
> +
> +/* Canonicalize typedef types to their underlying named struct/union types.  */
> +
> +static tree
> +canonicalize_typedef_type (tree type)
> +{
> +  /* Handle typedef types: canonicalize to named structs when possible.  */
> +  if (TYPE_NAME (type) && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL)
> +    {
> +      tree type_decl = TYPE_NAME (type);
> +
> +      /* Check if this is a typedef (not the original struct declaration) */
> +      if (DECL_ORIGINAL_TYPE (type_decl))
> + {
> +  tree original_type = DECL_ORIGINAL_TYPE (type_decl);
> +
> +  /* Handle struct/union/enum types.  */
> +  if (TREE_CODE (original_type) == RECORD_TYPE
> +      || TREE_CODE (original_type) == UNION_TYPE
> +      || TREE_CODE (original_type) == ENUMERAL_TYPE)
> +    {
> +      /* Preserve typedef of anonymous struct/union/enum types.  */
> +      if (!TYPE_NAME (original_type))
> + return type;
> +
> +      /* Named compound type: canonicalize to it.  */
> +      return canonicalize_typedef_type (original_type);
> +    }
> +
> +  /* For basic type typedefs (e.g., u8 -> unsigned char),
> +     canonicalize to original type.  */
> +  if (TREE_CODE (original_type) == INTEGER_TYPE
> +      || TREE_CODE (original_type) == REAL_TYPE
> +      || TREE_CODE (original_type) == POINTER_TYPE
> +      || TREE_CODE (original_type) == ARRAY_TYPE
> +      || TREE_CODE (original_type) == FUNCTION_TYPE
> +      || TREE_CODE (original_type) == METHOD_TYPE
> +      || TREE_CODE (original_type) == BOOLEAN_TYPE
> +      || TREE_CODE (original_type) == COMPLEX_TYPE
> +      || TREE_CODE (original_type) == VECTOR_TYPE)
> +    {
> +      /* Recursively canonicalize in case the original type is
> + also a typedef.  */
> +      return canonicalize_typedef_type (original_type);
> +    }
> + }
> +    }
> +
> +  return type;
> +}
> +
> +/* Recursively mangle a C type following Itanium C++ ABI.  */
> +
> +static void
> +mangle_type (tree type, std::string *out_str, uint32_t *hash_state)
> +{
> +  gcc_assert (type != NULL_TREE);
> +
> +  /* Canonicalize typedef types to their underlying named struct types.  */
> +  type = canonicalize_typedef_type (type);
> +
> +  /* Save original qualified type for cases where we need typedef
> +     information.  */
> +  tree qualified_type = type;
> +
> +  /* Centralized qualifier handling: emit qualifiers for this type,
> +     then continue with unqualified version.  */
> +  if (TYPE_QUALS (type) != TYPE_UNQUALIFIED)
> +    {
> +      /* Emit qualifiers in Itanium ABI order: restrict, volatile, const.  */
> +      if (TYPE_QUALS (type) & TYPE_QUAL_RESTRICT)
> + append_char ('r', out_str, hash_state);
> +      if (TYPE_QUALS (type) & TYPE_QUAL_VOLATILE)
> + append_char ('V', out_str, hash_state);
> +      if (TYPE_QUALS (type) & TYPE_QUAL_CONST)
> + append_char ('K', out_str, hash_state);
> +
> +      /* Get unqualified version for further processing.  */
> +      type = TYPE_MAIN_VARIANT (type);
> +    }
> +
> +  switch (TREE_CODE (type))
> +    {
> +    case POINTER_TYPE:
> +      {
> + /* Pointer type: 'P' + pointed-to type.  */
> + append_char ('P', out_str, hash_state);
> +
> + /* Recursively mangle the pointed-to type.  */
> + tree pointed_to_type = TREE_TYPE (type);
> + mangle_type (pointed_to_type, out_str, hash_state);
> + break;
> +      }
> +
> +    case ARRAY_TYPE:
> +      /* Array type: 'A' + size + '_' + element type (simplified).  */
> +      append_char ('A', out_str, hash_state);
> +      if (TYPE_DOMAIN (type) && TYPE_MAX_VALUE (TYPE_DOMAIN (type)))
> + {
> +  tree max_val = TYPE_MAX_VALUE (TYPE_DOMAIN (type));
> +  /* Check if array size is compile-time constant to handle VLAs. */
> +  if (TREE_CODE (max_val) == INTEGER_CST && tree_fits_shwi_p (max_val))
> +    {
> +      HOST_WIDE_INT size = tree_to_shwi (max_val) + 1;
> +      append_string (std::to_string ((long) size).c_str (),
> +     out_str, hash_state);
> +    }
> +  /* For VLAs or non-constant dimensions, emit empty size (A_).  */
> +  append_char ('_', out_str, hash_state);
> + }
> +      else
> + {
> +  /* No domain or no max value: emit A_.  */
> +  append_char ('_', out_str, hash_state);
> + }
> +      mangle_type (TREE_TYPE (type), out_str, hash_state);
> +      break;
> +
> +    case REFERENCE_TYPE:
> +      /* Reference type: 'R' + referenced type.
> + Note: We must handle references to builtin types including compiler
> + builtins like __builtin_va_list used in functions like va_start.  */
> +      append_char ('R', out_str, hash_state);
> +      mangle_type (TREE_TYPE (type), out_str, hash_state);
> +      break;
> +
> +    case FUNCTION_TYPE:
> +      {
> + /* Function type: 'F' + return type + parameter types + 'E' */
> + append_char ('F', out_str, hash_state);
> + mangle_type (TREE_TYPE (type), out_str, hash_state);
> +
> + /* Add parameter types.  */
> + tree param_types = TYPE_ARG_TYPES (type);
> +
> + if (param_types == NULL_TREE)
> +  {
> +    /* func () - no parameter list (could be variadic). */
> +  }
> + else
> +  {
> +    bool found_real_params = false;
> +    for (tree param = param_types; param; param = TREE_CHAIN (param))
> +      {
> + tree param_type = TREE_VALUE (param);
> + if (param_type == void_type_node)
> +  {
> +    /* Check if this is the first parameter (explicit void) or a
> +       sentinel.  */
> +    if (!found_real_params)
> +      {
> + /* func (void) - explicit empty parameter list.
> +   Mangle void to distinguish from variadic func (). */
> + mangle_type (void_type_node, out_str, hash_state);
> +      }
> +    /* If we found real params before this void, it's a sentinel
> +       so stop here.  */
> +    break;
> +  }
> +
> + found_real_params = true;
> +
> + /* For value parameters, ignore const/volatile qualifiers as
> +   they don't affect the calling convention.  "const int" and
> +   "int" are passed identically by value.  */
> + tree canonical_param_type = param_type;
> +
> + if (TREE_CODE (param_type) != POINTER_TYPE
> +    && TREE_CODE (param_type) != REFERENCE_TYPE
> +    && TREE_CODE (param_type) != ARRAY_TYPE)
> +  {
> +    /* For non-pointer/reference value parameters, strip
> +       qualifiers by default.  */
> +    canonical_param_type = TYPE_MAIN_VARIANT (param_type);
> +
> +    /* Exception: preserve typedef information for anonymous
> +       compound types.  */
> +    if (TYPE_NAME (param_type)
> + && TREE_CODE (TYPE_NAME (param_type)) == TYPE_DECL
> + && DECL_ORIGINAL_TYPE (TYPE_NAME (param_type)))
> +      {
> + tree original_type
> +  = DECL_ORIGINAL_TYPE (TYPE_NAME (param_type));
> + if ((TREE_CODE (original_type) == RECORD_TYPE
> +     || TREE_CODE (original_type) == UNION_TYPE
> +     || TREE_CODE (original_type) == ENUMERAL_TYPE)
> +    && !TYPE_NAME (original_type))
> +  {
> +    /* Preserve typedef of an anonymous
> +       struct/union/enum.  */
> +    canonical_param_type = param_type;
> +  }
> +      }
> +  }
> +
> + mangle_type (canonical_param_type, out_str, hash_state);
> +      }
> +  }
> +
> + /* Check if this is a variadic function and add 'z' marker.  */
> + if (stdarg_p (type))
> +  {
> +    append_char ('z', out_str, hash_state);
> +  }
> +
> + append_char ('E', out_str, hash_state);
> + break;
> +      }
> +
> +    case RECORD_TYPE:
> +    case UNION_TYPE:
> +    case ENUMERAL_TYPE:
> +      {
> + /* Struct/union/enum: use simplified representation for C types.  */
> + const char *name = NULL;
> +
> + /* For compound types, use the original qualified type to preserve
> +   typedef info.  */
> + if (TYPE_QUALS (qualified_type) != TYPE_UNQUALIFIED)
> +  {
> +    type = qualified_type;
> +  }
> +
> + if (TYPE_NAME (type))
> +  {
> +    if (TREE_CODE (TYPE_NAME (type)) == TYPE_DECL)
> +      {
> + /* TYPE_DECL case: both named structs and typedef structs.  */
> + tree decl_name = DECL_NAME (TYPE_NAME (type));
> + if (decl_name && TREE_CODE (decl_name) == IDENTIFIER_NODE)
> +  {
> +    name = IDENTIFIER_POINTER (decl_name);
> +  }
> +      }
> +    else if (TREE_CODE (TYPE_NAME (type)) == IDENTIFIER_NODE)
> +      {
> + /* Direct identifier case.  */
> + name = IDENTIFIER_POINTER (TYPE_NAME (type));
> +      }
> +  }
> +
> + if (name)
> +  {
> +    append_string (std::to_string (strlen (name)).c_str (),
> +   out_str, hash_state);
> +    append_string (name, out_str, hash_state);
> +    break;
> +  }
> +
> + /* If no name found, use anonymous type format: <length>$_<counter>.  */
> + static unsigned anon_counter = 0;
> + std::string anon_name = "$_" + std::to_string (anon_counter++);
> +
> + append_string (std::to_string (anon_name.length ()).c_str (),
> +       out_str, hash_state);
> + append_string (anon_name.c_str (), out_str, hash_state);
> + break;
> +      }
> +
> +    default:
> +      /* Handle builtin types.  */
> +      mangle_builtin_type (type, out_str, hash_state);
> +      break;
> +    }
> +}
> +
> +/* Get the typeinfo mangled name string for any C type.
> +   Returns the mangled type string following Itanium C++ ABI conventions.  */
> +
> +std::string
> +typeinfo_get_name (tree type)
> +{
> +  gcc_assert (type != NULL_TREE);
> +  std::string result = "_ZTS";
> +
> +  mangle_type (type, &result, nullptr);
> +  return result;
> +}
> +
> +/* Get the typeinfo hash for any C type.
> +   Returns the FNV-1a hash of the mangled type string.  */
> +
> +uint32_t
> +typeinfo_get_hash (tree type)
> +{
> +  gcc_assert (type != NULL_TREE);
> +  uint32_t hash_state = 2166136261U; /* FNV-1a 32-bit offset basis.  */
> +
> +  /* Include _ZTS prefix in hash calculation.  */
> +  append_string ("_ZTS", nullptr, &hash_state);
> +
> +  mangle_type (type, nullptr, &hash_state);
> +  return hash_state;
> +}
> -- 
> 2.34.1
> 
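
For readers following along, the hashing half of the quoted file can be sketched outside of GCC. This is a minimal standalone sketch, assuming only the two constants shown in the patch (the FNV-1a 32-bit prime and offset basis); the function names fnv1a_step and fnv1a_hash_string are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* One FNV-1a step: XOR the byte into the state, then multiply by the
   32-bit FNV prime (16777619), as fnv1a_hash_char does in the patch.  */
static inline uint32_t
fnv1a_step (uint32_t hash, unsigned char c)
{
  hash ^= c;
  hash *= 16777619U;
  return hash;
}

/* Hash a NUL-terminated string, starting from the 32-bit offset basis
   (2166136261).  Hypothetical helper: the patch feeds characters to the
   hash incrementally while mangling rather than hashing a finished
   string, but the resulting value should be the same.  */
static uint32_t
fnv1a_hash_string (const char *str)
{
  uint32_t hash = 2166136261U;
  for (const char *p = str; *p; p++)
    hash = fnv1a_step (hash, (unsigned char) *p);
  return hash;
}
```

Reading mangle_type above, a "void (void)" function type would appear to mangle to "_ZTSFvvE" ('F', return type 'v', explicit void parameter 'v', 'E'), so fnv1a_hash_string ("_ZTSFvvE") is presumably the value typeinfo_get_hash would return for it. That is an inference from the code, not something tested here.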



* Re: [PATCH v3 7/7] kcfi: Add regression test suite
  2025-09-13 23:51   ` Andrew Pinski
@ 2025-09-17 19:51     ` Kees Cook
  0 siblings, 0 replies; 28+ messages in thread
From: Kees Cook @ 2025-09-17 19:51 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Qing Zhao, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

On Sat, Sep 13, 2025 at 04:51:21PM -0700, Andrew Pinski wrote:
> On Sat, Sep 13, 2025 at 4:36 PM Kees Cook <kees@kernel.org> wrote:
> > +/* Should have KCFI instrumentation for all indirect calls.  */
> > +
> > +/* x86_64: Complete KCFI check sequence should be present.  */
> > +/* { dg-final { scan-assembler {movl\t\$-?[0-9]+, %r1[01]d\n\taddl\t[^,]+, %r1[01]d\n\tje\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tud2} { target x86_64-*-* } } } */
> > +
> > +/* AArch64: Complete KCFI check sequence should be present.  */
> > +/* { dg-final { scan-assembler {ldur\tw16, \[x[0-9]+, #-[0-9]+\]\n\tmov\tw17, #[0-9]+\n\tmovk\tw17, #[0-9]+, lsl #16\n\tcmp\tw16, w17\n\tb\.eq\t(\.Lkcfi_call[0-9]+)\n\.Lkcfi_trap[0-9]+:\n\tbrk\t#[0-9]+\n\1:\n\tblr\tx[0-9]+} { target aarch64*-*-* } } } */
> > +
> > +/* ARM 32-bit: Complete KCFI check sequence should be present with stack
> > +   spilling.  */
> > +/* { dg-final { scan-assembler {push\t\{r0, r1\}\n\tldr\tr0, \[r[0-9]+, #-[0-9]+\]\n\tmovw\tr1, #[0-9]+\n\tmovt\tr1, #[0-9]+\n\tcmp\tr0, r1\n\tpop\t\{r0, r1\}\n\tbeq\t\.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tudf\t#[0-9]+\n\.Lkcfi_call[0-9]+:\n\tblx\tr[0-9]+} { target arm32 } } } */
> > +
> > +/* RISC-V: Complete KCFI check sequence should be present.  */
> > +/* { dg-final { scan-assembler {lw\tt1, -4\([a-z0-9]+\)\n\tlui\tt2, [0-9]+\n\taddiw\tt2, t2, -?[0-9]+\n\tbeq\tt1, t2, \.Lkcfi_call[0-9]+\n\.Lkcfi_trap[0-9]+:\n\tebreak} { target riscv*-*-* } } } */
> > +
> > +/* Should have trap section with entries.  */
> > +/* { dg-final { scan-assembler {\.kcfi_traps} { target x86_64-*-* } } } */
> > +/* { dg-final { scan-assembler {\.kcfi_traps} { target riscv*-*-* } } } */
> > +
> > +/* AArch64 should NOT have trap section (uses brk immediate instead) */
> > +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target aarch64*-*-* } } } */
> > +
> > +/* ARM 32-bit should NOT have trap section (uses udf immediate instead) */
> > +/* { dg-final { scan-assembler-not {\.kcfi_traps} { target arm32 } } } */
> 
> 
> I think it would be better to use check-function-bodies here rather
> than scan-assembler for the sequences. Maybe each target should have
> its own testcase rather than putting it all in one source.
> Plus I think the target testcase should be part of the target patch
> rather than its own patch to make it easier to review both things
> together. Because while I was reviewing the aarch64 part I was
> thinking where are the testcases for the aarch64 specific changes.

Ah yeah, that works. I spent some time scratching my head over how to
have it not drop labels, but I've gotten a bunch of these converted now.
I left some constructs as they were for v4, especially the "scan-assembler-not" tests.
It's significantly more readable now! Thanks! :)

-Kees

-- 
Kees Cook


* Re: [PATCH v3 4/7] aarch64: Add AArch64 Kernel Control Flow Integrity implementation
  2025-09-13 23:43   ` Andrew Pinski
  2025-09-14 19:45     ` Kees Cook
@ 2025-09-17 20:01     ` Kees Cook
  1 sibling, 0 replies; 28+ messages in thread
From: Kees Cook @ 2025-09-17 20:01 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Qing Zhao, Andrew Pinski, Jakub Jelinek, Martin Uecker,
	Richard Biener, Joseph Myers, Peter Zijlstra, Jan Hubicka,
	Richard Earnshaw, Richard Sandiford, Marcus Shawcroft,
	Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Dan Li, Sami Tolvanen, Ramon de C Valle, Joao Moreira,
	Nathan Chancellor, Bill Wendling, gcc-patches, linux-hardening

On Sat, Sep 13, 2025 at 04:43:29PM -0700, Andrew Pinski wrote:
> On Sat, Sep 13, 2025 at 4:28 PM Kees Cook <kees@kernel.org> wrote:
> >
> > Implement AArch64-specific KCFI backend.
> >
> > - Trap debugging through ESR (Exception Syndrome Register) encoding
> >   in BRK instruction immediate values.
> >
> > - Scratch register allocation using w16/w17 (x16/x17) following
> >   AArch64 procedure call standard for intra-procedure-call registers.
> 
> How does this interact with BTI and sibcalls?

BTI and KCFI are complementary. BTI uses passes to insert insns at entry
points and at call-return sites. Like x86's CET "endbr" stuff, KCFI is
providing finer granularity checking for forward-edge.

Sibcalls are handled normally and there's no change to their
construction beyond the KCFI sequence using jmp instead of call.

> Since for indirect
> calls, x17 is already used for the address.
> Why do you need/want to use a fixed register here for the load/compare
> anyways? Why can't you use any free register?

I spent a bunch of time trying to understand the register allocator,
and the bottom line is that the register allocator won't give me a
scratch register if we hit register pressure because it (correctly) sees
that while it can do a spill, it can't do a reload since the insn is a
"CALL". As such, I have to do register lifetime management internally
to the KCFI insn sequence.

For aarch32, I've done this by using ip (r12) by default, but if it's
used as the target register, I switch to r3, and do a spill/reload only
if r3 is used as a call argument. Since r3 is already in the clobber list
due to the call, the register allocator is already doing a spill/reload
of r3 when it is live.
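
The aarch32 selection rule described above can be sketched as follows. This is only an illustration of the decision logic as stated in this message, not the code from the patch; the names kcfi_scratch and kcfi_pick_scratch, and the idea of returning a "needs spill" flag, are assumptions made for the example.

```c
#include <stdbool.h>

/* ARM core register numbers used in this sketch.  */
enum { REG_R3 = 3, REG_IP = 12 };

/* Result of the choice: which register to use as scratch, and whether
   the KCFI sequence itself has to spill/reload it.  Hypothetical type.  */
struct kcfi_scratch
{
  int regno;
  bool needs_spill;
};

/* Pick a scratch register for the KCFI check:
   - use ip (r12) by default;
   - if the call target already lives in ip, fall back to r3;
   - spill/reload r3 only when it carries a call argument.  */
static struct kcfi_scratch
kcfi_pick_scratch (int target_regno, bool r3_is_call_arg)
{
  struct kcfi_scratch s = { REG_IP, false };

  if (target_regno == REG_IP)
    {
      s.regno = REG_R3;
      s.needs_spill = r3_is_call_arg;
    }
  return s;
}
```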

For aarch64 w16 and w17 are universally on the clobber list (even for
sibcalls), so I'm free to use them internally. But "proving" this to
answer your question led me to find where that clobber is happening,
which means I can drop the redundant clobber I was adding in this patch.

> > +  /* Add KCFI clobbers for indirect calls.  */
> > +  if (kcfi_type_rtx)
> > +    {
> > +      rtx usage = CALL_INSN_FUNCTION_USAGE (call_insn);
> > +      /* Add X16 and X17 clobbers for AArch64 KCFI scratch registers.  */
> > +      clobber_reg (&usage, gen_rtx_REG (DImode, 16));
> > +      clobber_reg (&usage, gen_rtx_REG (DImode, 17));
> > +      CALL_INSN_FUNCTION_USAGE (call_insn) = usage;
> > +    }

i.e. I've dropped the above.

> > +
> >    /* Check whether the call requires a change to PSTATE.SM.  We can't
> >       emit the instructions to change PSTATE.SM yet, since they involve
> >       a change in vector length and a change in instruction set, which
> 
> Also how does this interact with SME calls?

Based on what I've been able to find, there's no conflict: the KCFI
typeid is tied strictly to the function type and doesn't take the SME
attributes into account. So this appears to be fine.

-Kees

-- 
Kees Cook


* Re: [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
  2025-09-17 13:42   ` Qing Zhao
@ 2025-09-17 21:09     ` Kees Cook
  2025-09-18 16:59       ` Qing Zhao
  2025-09-18 19:39       ` Kees Cook
  0 siblings, 2 replies; 28+ messages in thread
From: Kees Cook @ 2025-09-17 21:09 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Andrew Pinski, Jakub Jelinek, Martin Uecker, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org

On Wed, Sep 17, 2025 at 01:42:32PM +0000, Qing Zhao wrote:
> This version of the middle-end change is much simpler and cleaner-:).

Thanks! I think it's getting closer (hopefully). :)

> > On Sep 13, 2025, at 19:23, Kees Cook <kees@kernel.org> wrote:
> > +- KCFI check-call instrumentation must survive tail call optimization.
> > +  If an indirect call is turned into an indirect jump, KCFI checking
> > +  must still happen (but it will use a jmp rather than a call).
> 
> I didn’t see any code changes in this patch address the above issue,
>  is the issue automatically resolved without special handling? 

The logic for this is handled by the split RTL patterns on the backend. We
end up with 4 RTL patterns for KCFI that match the regular 4 call
patterns:

- call
- call with return value
- sibcall
- sibcall with return value

In the RTL assembly output the "is this a sibcall?" test is made to
choose between emitting a "call" or a "jump" insn.

> > +- Functions that may be called indirectly have a preamble added,
> > +  __cfi_$original_func_name, which contains the $typeid value:
> > +
> > +    __cfi_target_func:
> > +      .word $typeid
> > +    target_func:
> > +       [regular function entry...]
> > +
> > +- The preamble needs to interact with patchable function entry so that
> > +  the typeid appears further away from the actual start of the function
> > +  (leaving the prefix NOPs of the patchable function entry unchanged).
> > +  This means only _globally defined_ patchable function entry is supported
> > +  with KCFI (indirect call sites must know in advance what the offset is,
> > +  which may not be possible with extern functions that use a function
> > +  attribute to change their patchable function entry characteristics).
> > +  For example, a "4,4" patchable function entry would end up like:
> > +
> > +    __cfi_target_func:
> > +      .data $typeid
> > +      nop nop nop nop
> > +    target_func:
> > +       [regular function entry...]
> > +
> > +  Architectures may need to add alignment nops prior to the typeid to keep
> > +  __cfi_target_func aligned for function call conventions.
> 
> I am still a little confused with the above, are there two “nops” need to be computed
> and added: one is for patchable function entry, the other one is for architecture specific
> alignment nops? 
> If so, you might need to clarify the above to make this clear. 

Yes, this is a confusing bit of logic that needs more clarity. I'll
improve this. Here's what happens:

Normal function has no preamble:

func:
	body...

With KCFI, a preamble is created to hold the typeid to be checked from
call sites (addressed as -4 from "func"):

__cfi_func:
	.word typeid_value
func:
	body...

A "patchable function entry" function has both "prefix" and "entry" nops
added:

__pfe_func:
	nop	// "prefix" nops
	nop
func:
	nop	// "entry" nops
	nop
	nop
	body...

Confusingly, the argument specifies total (and optionally prefix):
 -fpatchable-function-entry=TOTAL[,PREFIX]
So the above example is -fpatchable-function-entry=5,2 (5 total NOPs,
with 2 of them being prefix NOPs placed before "func").

For KCFI, callsites need to address the typeid, so a normal KCFI
callsite would use:

	load %tmp, -4(%target)

but when PFE is active, the typeid must be placed before the prefix NOPs
since PFE requires that the entire space is NOPs. Therefore the prefix
NOPs need to be included (and measured in _bytes_, not instructions)
when loading the typeid:

	load %tmp, -12(%target)
	// 2 nops (8 bytes on aarch64) and 4 bytes for typeid == -12

Which corresponds to the resulting function preamble layout:

__cfi_func:
	.word typeid_value
__pfe_func:
	nop	// "prefix" nops
	nop
func:
	nop	// "entry" nops
	nop
	nop
	body...

Now, an _additional_ requirement for x86 is that __cfi_func be function
entry aligned, so that Linux can, if it chooses, live-patch the entire
KCFI and PFE prefix area into a callable target (this is the "FineIBT"
KCFI alternative). So, when -falign-functions=N is set, given x86's 1
byte NOPs and the "movl" encoding used for holding the KCFI type id, the
final layout, given -falign-functions=8 -fpatchable-function-entry=4,1
would be:

__cfi_func:
	nop	// "alignment" nops	// 2 bytes total
	nop
	.word typeid_value		// 5 bytes total
__pfe_func:
	nop	// "prefix" nops	// 1 byte total
func:
	nop	// "entry" nops
	nop
	nop
	body...

4 total PFE bytes with 1 as prefix (leaving 3 at the func entry). And
to align __cfi_func to 8 bytes, we have 5 byte typeid insn, and 1 byte
"prefix" nop, so we need 2 more bytes to be the "alignment" nops.


This layout was not obvious initially for x86 because Linux's FineIBT
implementation uses -falign-functions=16 -fpatchable-function-entry=11,11
so the alignment nops are pre-calculated.
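
The arithmetic in the layouts above can be sketched in a few lines of C. This is a sketch under the assumptions stated in this message (4-byte typeid, 4-byte NOPs on aarch64, 1-byte NOPs and a 5-byte typeid insn on x86); the function names are hypothetical and are not taken from the patch.

```c
#include <stddef.h>

/* Offset (relative to the function entry) that a call site would use to
   load the typeid: skip back over the prefix NOPs, measured in bytes,
   and then over the typeid itself.  */
static long
kcfi_typeid_offset (size_t prefix_nops, size_t nop_bytes,
                    size_t typeid_bytes)
{
  return -(long) (prefix_nops * nop_bytes + typeid_bytes);
}

/* Number of "alignment" NOP bytes needed in front of the typeid insn so
   that __cfi_func lands on an ALIGN-byte boundary, given that "func"
   itself is ALIGN-byte aligned.  */
static size_t
kcfi_alignment_nop_bytes (size_t align, size_t typeid_insn_bytes,
                          size_t prefix_nop_bytes)
{
  size_t used = typeid_insn_bytes + prefix_nop_bytes;

  if (align == 0)
    return 0;
  return (align - used % align) % align;
}
```

With the numbers used above, kcfi_typeid_offset (2, 4, 4) gives -12 for the aarch64 example, kcfi_alignment_nop_bytes (8, 5, 1) gives 2 for the -falign-functions=8 -fpatchable-function-entry=4,1 example, and kcfi_alignment_nop_bytes (16, 5, 11) gives 0 for the Linux FineIBT settings.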

> 
> > +
> > +- External functions that are address-taken have a weak __kcfi_typeid_$func
> > +  symbol added with the typeid value available so that the typeid can be
> > +  referenced from assembly linkages, etc, where the typeid values cannot be
> > +  calculated (i.e where C type information is missing):
> > +
> > +    .weak   __kcfi_typeid_$func
> > +    .set    __kcfi_typeid_$func, $typeid
> > +
> 
> From my previous understanding, the above weak symbol is emitted for external functions
> that are address-taken AND does not have a definition in the compilation. So the weak symbols
> Is emitted at the declaration site of the external function, is this true?
> 
> If so, could you please clarify this in the above?

Yes, this happens via assemble_external_real, which can be called under
a few conditions in gcc/varasm.cc.
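
As an illustration of the output being discussed, here is a standalone sketch that formats the two directives from the quoted documentation. It is not the code in the patch: the helper name kcfi_format_typeid_symbol, the use of a buffer instead of the assembly output file, and the hexadecimal formatting of the typeid are all assumptions for the example.

```c
#include <stdint.h>
#include <stdio.h>

/* Format the weak __kcfi_typeid_<func> symbol definition for an
   address-taken external function into BUF.  Returns what snprintf
   returns (the length that would have been written).  */
static int
kcfi_format_typeid_symbol (char *buf, size_t len, const char *func,
                           uint32_t type_id)
{
  return snprintf (buf, len,
                   "\t.weak\t__kcfi_typeid_%s\n"
                   "\t.set\t__kcfi_typeid_%s, 0x%08x\n",
                   func, func, (unsigned) type_id);
}
```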

> > +- Keep indirect calls from being merged (see earlier example) by
> > +  checking the KCFI insn's typeid for equality.
> 
> Is this resolved by the following code:
> 
> rtlanal.cc
> index 63a1d08c46cf..5016fe93ccac 100644
> --- a/gcc/rtlanal.cc
> +++ b/gcc/rtlanal.cc
> @@ -1177,6 +1177,11 @@ reg_referenced_p (const_rtx x, const_rtx body)
>     case IF_THEN_ELSE:
>       return reg_overlap_mentioned_p (x, body);
> 
> +    case KCFI:
> +      /* For KCFI wrapper, check both the wrapped call and the type ID.  */
> +      return (reg_overlap_mentioned_p (x, XEXP (body, 0))
> +      || reg_overlap_mentioned_p (x, XEXP (body, 1)));
> +

The above is needed for accurate register "liveness" checking. When the
above code is removed, the kcfi-move-preservation.c regression test
fails (since it doesn't see the clobbers).

AFAICT, simply making it a new type of RTL (the DEF_RTL_EXPR), made it
unmergeable. I assume this is because whatever was doing the call
merging was looking strictly for "CALL" types, but I honestly don't know
where that was happening.

> > +/* Common helper for RTL patterns to emit .kcfi_traps section entry.  */
> 
> I noticed that you didn’t explain each parameter of the function in all the comments for the functions.
> This need to be updated for all the new functions. 

For externs like these, should the parameter documentation go in the .h
file, or the .cc file?

> > +void
> > +kcfi_emit_traps_section (FILE *file, rtx trap_label_sym)
> > +{
> > +  /* Generate entry label internally and get its number.  */
> > +  rtx entry_label = gen_label_rtx ();
> > +  int entry_labelno = CODE_LABEL_NUMBER (entry_label);
> 
> Is the only usage of the new RTX “entry_label” is to generate a label_number? 
> If so, the entry_label is not needed at all.  You can get a distinct labelno for each
> Lkcfi_entry, for example, the function id for the current function.

It is, yes. I can't use the function id because it's only incremented per
function and a given function may have multiple kcfi call sites within
it. I did have a version of this logic that used a kcfi-specific global
counter but (at the time) I was having trouble with it and had seen that
other "custom label" examples in the code base used this style, so I
switched to that.

I have since figured out why the global counter wasn't working (I was using
it during expansion and not during insn output, so I had cases where a
call was getting duplicated and I had a repeated label). If it's
preferred, I could try switching back to the global counter to avoid
these "useless" gen_label_rtx calls?
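
For reference, the global-counter alternative could look roughly like the sketch below. It is a hypothetical illustration and not code from the patch: the names kcfi_label_counter, kcfi_next_labelno and kcfi_format_label are made up, and the ".L<prefix><number>" spelling is only modeled on the .Lkcfi_call/.Lkcfi_trap labels seen in the testsuite patterns. The counter would have to be bumped at insn output time, not at expansion, for the reason given above.

```c
#include <stdio.h>
#include <string.h>

/* Counter shared by all KCFI call sites in the translation unit.  */
static unsigned kcfi_label_counter;

/* Hand out the next unused label number.  Meant to be called once per
   emitted KCFI sequence, so a duplicated call gets a fresh number.  */
static unsigned
kcfi_next_labelno (void)
{
  return kcfi_label_counter++;
}

/* Build a local label name such as ".Lkcfi_trap7" into BUF.  */
static int
kcfi_format_label (char *buf, size_t len, const char *prefix,
                   unsigned labelno)
{
  return snprintf (buf, len, ".L%s%u", prefix, labelno);
}
```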

> > +static uint32_t
> > +kcfi_get_type_id (tree fn_type)
> > +{
> > +  uint32_t type_id;
> > +
> > +  /* Cache the attribute identifier.  */
> > +  if (!kcfi_type_id_attr)
> > +    kcfi_type_id_attr = get_identifier ("kcfi_type_id");
> > +
> > +  tree attr = lookup_attribute (IDENTIFIER_POINTER (kcfi_type_id_attr),
> > + TYPE_ATTRIBUTES (fn_type));
> 
> The above can be simplified as:
> +  tree attr = lookup_attribute (“kcfi_type_id”, TYPE_ATTRIBUTES (fn_type));

Ugh, I totally misunderstood the examples I saw of this. I thought they
were caching the string lookup, but now that I look more closely, I see:

#define IDENTIFIER_POINTER(NODE) \
  ((const char *) IDENTIFIER_NODE_CHECK (NODE)->identifier.id.str)

it's just returning the string!

I will throw away the "caching" I was doing. I thought it would actually
look up the attribute using the tree returned by get_identifier, but I
see there is no overloaded lookup_attribute that takes a tree argument.

*face palm*

> > +/* Emit KCFI type ID symbol for an address-taken external function.  */
> 
> Is it more accurate to say:
> 
> Emit KCFI type ID symbol for the declaration of an address-taken external function FNDECL
> to the assembly file ASM_FILE.
> 
> ??

Yup, I will update it.

> > +  /* Process all functions - both local and external.  */
> > +  FOR_EACH_FUNCTION (node)
> > +    {
> > +      tree fndecl = node->decl;
> > +
> > +      /* Skip all non-NORMAL builtins (MD, FRONTEND) entirely.
> > + For NORMAL builtins, skip those that lack an implicit
> > + implementation (closest way to distinguishing DEF_LIB_BUILTIN
> > + from others).  E.g. we need to have typeids for memset().  */
> 
> I see indentation issue in the above comments.

This looks like your email client again. It passes
contrib/check_GNU_style.py:

  FOR_EACH_FUNCTION (node)$
    {$
      tree fndecl = node->decl;$
$
      /* Skip all non-NORMAL builtins (MD, FRONTEND) entirely.$
^I For NORMAL builtins, skip those that lack an implicit$
^I implementation (closest way to distinguishing DEF_LIB_BUILTIN$
^I from others).  E.g. we need to have typeids for memset().  */$

Or is there something special I need to be doing differently for
comments?

> 
> > +      if (fndecl_built_in_p (fndecl))
> > + {
> > +  if (DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
> > +    continue;
> > +  if (!builtin_decl_implicit_p (DECL_FUNCTION_CODE (fndecl)))
> > +    continue;
> > + }
> 
> Also see indentation issue in the above.

      if (fndecl_built_in_p (fndecl))$
^I{$
^I  if (DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)$
^I    continue;$
^I  if (!builtin_decl_implicit_p (DECL_FUNCTION_CODE (fndecl)))$
^I    continue;$
^I}$

Looks like the same thing?
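
For anyone comparing the two listings above: "^I" is a tab character
and "$" marks the end of the line. Assuming tab stops every 8 columns
(the usual GNU style convention), the small helper below, which is only
an illustrative sketch and not part of contrib/check_GNU_style.py,
computes the visual column where the text of a line starts. A tab plus
one space lands on column 9, which is where the comment text begins
after 6 spaces and the 3 characters of "/* ".

```c
#include <assert.h>
#include <stddef.h>

/* Tab stops every TAB_WIDTH columns (an assumption of this sketch).  */
#define TAB_WIDTH 8u

/* Return the visual width of the leading whitespace of LINE.  */
static unsigned
visual_indent (const char *line)
{
  unsigned col = 0;

  assert (line != NULL);
  for (; *line == ' ' || *line == '\t'; line++)
    {
      if (*line == '\t')
        col = (col / TAB_WIDTH + 1) * TAB_WIDTH;  /* Jump to next stop.  */
      else
        col++;
    }
  return col;
}
```

A mail client that drops or collapses the tab shows these lines at
column 1 or 2 instead, which would explain the reported misalignment.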


Thanks for the review! I'll have v4 ready soon.

-Kees

-- 
Kees Cook


* Re: [PATCH v3 1/7] typeinfo: Introduce KCFI typeinfo mangling API
  2025-09-17 17:56   ` Qing Zhao
@ 2025-09-17 21:20     ` Kees Cook
  2025-09-18  7:20     ` Martin Uecker
  1 sibling, 0 replies; 28+ messages in thread
From: Kees Cook @ 2025-09-17 21:20 UTC (permalink / raw)
  To: Qing Zhao, Marco Elver
  Cc: Andrew Pinski, Jakub Jelinek, Martin Uecker, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org

On Wed, Sep 17, 2025 at 05:56:17PM +0000, Qing Zhao wrote:
> Hi, 
> 
> > On Sep 13, 2025, at 19:23, Kees Cook <kees@kernel.org> wrote:
> > 
> > To support the KCFI typeid and future type-based allocators,
> 
> Could you please explain a little bit more on the “future type-based allocators”?

Sure, here's a link to Marco Elver's work:
https://lore.kernel.org/lkml/20250825154505.1558444-1-elver@google.com/
The "alloc token" is a bit more complicated:
https://github.com/melver/llvm-project/blob/alloc-token/clang/docs/AllocToken.rst
but my proposed __builtin_typeinfo_hash would align with mode=2 in the
Clang proposal.

> And why these two new builtins are necessary for this purpose?

Andrew didn't want strings produced when they were unused so I needed a
second API for getting string names. In either case, the result needed
to be compile-time constant. Also, I wanted to mirror the existing C++
typeinfo API that has "hash" and "name" accessors separate.

> > which need
> > to convert unique types into unique 32-bit values, add a mangling system
> > based on the Itanium C++ mangling ABI, adapted for for C types.
> 
> There is a redundant “for” in the above last sentence. 

Oops! Fixed.

> > Introduce
> > __builtin_typeinfo_hash for the hash, and __builtin_typeinfo_name for
> > testing and debugging (to see the human-readable mangling form).
> 
> In addition to the testing and debugging purpose, are  there any use cases for
> these two new compiler provided builtins? 

For _hash, yes, using it for type-based compile-time constant values
provides a building block for having type-aware logic in Linux,
especially with the allocator. For _name, it's a nice way to get at the
typeinfo for getting a stable string (right now there is no way to use
typeof() output for any reporting).
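
As an aid for reading the strings that _name produces, here is a small
sketch that decodes the simplest forms seen in the testsuite (for
example "_ZTSPKc"). The single-letter codes are the builtin type codes
of the Itanium C++ ABI; the function names are hypothetical, and the
decoder deliberately rejects function, array and named types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map an Itanium ABI builtin type code to its C spelling.  */
static const char *
builtin_type_name (char code)
{
  switch (code)
    {
    case 'v': return "void";
    case 'c': return "char";
    case 'a': return "signed char";
    case 'h': return "unsigned char";
    case 's': return "short";
    case 't': return "unsigned short";
    case 'i': return "int";
    case 'j': return "unsigned int";
    case 'l': return "long";
    case 'm': return "unsigned long";
    case 'f': return "float";
    case 'd': return "double";
    default:  return NULL;
    }
}

/* Describe "_ZTS" followed by P/K/V prefixes and one builtin code.
   Returns 0 on success, -1 for anything this sketch does not handle.  */
static int
describe_simple_typeinfo (const char *mangled, char *out, size_t len)
{
  int done = 0;

  assert (mangled != NULL && out != NULL && len > 0);
  if (strncmp (mangled, "_ZTS", 4) != 0)
    return -1;
  out[0] = '\0';
  for (const char *p = mangled + 4; *p; p++)
    {
      const char *piece;
      if (*p == 'P')
        piece = "pointer to ";
      else if (*p == 'K')
        piece = "const ";
      else if (*p == 'V')
        piece = "volatile ";
      else
        {
          piece = builtin_type_name (*p);
          if (piece == NULL || p[1] != '\0')
            return -1;  /* Unknown code or trailing characters.  */
          done = 1;
        }
      if (strlen (out) + strlen (piece) >= len)
        return -1;  /* Output buffer too small.  */
      strcat (out, piece);
    }
  return done ? 0 : -1;
}
```

With this, "_ZTSPKc" reads as "pointer to const char" and "_ZTSVi" as
"volatile int".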

-- 
Kees Cook


* Re: [PATCH v3 1/7] typeinfo: Introduce KCFI typeinfo mangling API
  2025-09-17 17:56   ` Qing Zhao
  2025-09-17 21:20     ` Kees Cook
@ 2025-09-18  7:20     ` Martin Uecker
  2025-09-18 18:09       ` Kees Cook
  1 sibling, 1 reply; 28+ messages in thread
From: Martin Uecker @ 2025-09-18  7:20 UTC (permalink / raw)
  To: Qing Zhao, Kees Cook
  Cc: Andrew Pinski, Jakub Jelinek, Richard Biener, Joseph Myers,
	Peter Zijlstra, Jan Hubicka, Richard Earnshaw, Richard Sandiford,
	Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng, Palmer Dabbelt,
	Andrew Waterman, Jim Wilson, Dan Li, Sami Tolvanen,
	Ramon de C Valle, Joao Moreira, Nathan Chancellor, Bill Wendling,
	gcc-patches@gcc.gnu.org, linux-hardening@vger.kernel.org

Am Mittwoch, dem 17.09.2025 um 17:56 +0000 schrieb Qing Zhao:
> Hi, 
> 
> > On Sep 13, 2025, at 19:23, Kees Cook <kees@kernel.org> wrote:
> > 
> > To support the KCFI typeid and future type-based allocators,
> 
> Could you please explain a little bit more on the “future type-based allocators”?

People work on allocators where different types go into different
storage regions.  This prevents type-confusion errors, especially
where data could be misinterpreted as a pointer.
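
A rough sketch of the idea, under stated assumptions: the 32-bit value
would come from something like __builtin_typeinfo_hash at compile time,
but since that builtin only exists in this proposed series, the example
substitutes an FNV-1a hash of the mangled name. FNV-1a, the region
count and all function names here are illustrative choices, not what
the series or any existing allocator uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGIONS 8u
#define REGION_BYTES 4096u

/* One bump-allocated storage region per hash bucket.  */
struct region
{
  _Alignas (16) unsigned char mem[REGION_BYTES];
  size_t used;
};

static struct region regions[NUM_REGIONS];

/* Stand-in for a compiler-provided type hash: 32-bit FNV-1a of the
   mangled type name.  This is an assumption of the sketch.  */
static uint32_t
type_name_hash (const char *name)
{
  uint32_t h = 2166136261u;

  assert (name != NULL);
  for (; *name; name++)
    {
      h ^= (unsigned char) *name;
      h *= 16777619u;
    }
  return h;
}

/* Allocate SIZE bytes from the region selected by TYPE_HASH, so that
   objects with different hashes tend to live in different regions.
   Returns NULL when the request cannot be satisfied.  */
static void *
typed_alloc (size_t size, uint32_t type_hash)
{
  struct region *r = &regions[type_hash % NUM_REGIONS];
  size_t aligned;

  if (size == 0 || size > REGION_BYTES)
    return NULL;
  aligned = (size + 15u) & ~(size_t) 15u;  /* Round up to 16 bytes.  */
  if (aligned > REGION_BYTES - r->used)
    return NULL;
  void *p = r->mem + r->used;
  r->used += aligned;
  return p;
}
```

With only a few regions, distinct types can still share one; a real
design would have to decide how many buckets it can afford.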

> 
> And why these two new builtins are necessary for this purpose?
> 
> > which need
> > to convert unique types into unique 32-bit values, add a mangling system
> > based on the Itanium C++ mangling ABI, adapted for for C types.
> 
> There is a redundant “for” in the above last sentence. 
> 
> > Introduce
> > __builtin_typeinfo_hash for the hash, and __builtin_typeinfo_name for
> > testing and debugging (to see the human-readable mangling form).
> 
> In addition to the testing and debugging purpose, are  there any use cases for
> these two new compiler provided builtins? 

Kees may have other applications in mind, but it would be useful for all
user or library code that may do some kind of run-time type checking.
It might also be useful for serialization.

What I find problematic though is that this is not based on GNU / ISO C
rules but on stricter Linux kernel rules.  I think such a builtin should
have two versions.

So maybe

__builtin_typeinfo_hash_strict // strict
__builtin_typeinfo_hash_canonical // standard

or similar, or maybe instead have a flag argument so that we can add
other options which may turn out to be important in the future
(such as ignoring qualifiers or supporting newer language features).

  
Martin




> 
> thanks.
> 
> Qing
> 
> > Add
> > tests for typeinfo validation and error handling.
> > 
> > gcc/ChangeLog:
> > 
> > * Makefile.in: Add kcfi-typeinfo.o.
> > * doc/extend.texi: Document typeinfo builtins.
> > * kcfi-typeinfo.h: New file, typeinfo mangling API.
> > * kcfi-typeinfo.cc: New file, implement typeinfo mangling.
> > 
> > gcc/c-family/ChangeLog:
> > 
> > * c-common.h (enum rid): Add typeinfo builtins.
> > * c-common.cc: Add typeinfo builtins.
> > 
> > gcc/c/ChangeLog:
> > 
> > * c-parser.cc (c_parser_get_builtin_type_arg): New function,
> > parse type.
> > (c_parser_postfix_expression): Add typeinfo builtins.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * gcc.dg/builtin-typeinfo-errors.c: New test, validate bad
> > arguments are rejected.
> > * gcc.dg/builtin-typeinfo.c: New test, typeinfo mangling.
> > 
> > Signed-off-by: Kees Cook <kees@kernel.org>
> > ---
> > gcc/Makefile.in                               |   1 +
> > gcc/c-family/c-common.h                       |   1 +
> > gcc/kcfi-typeinfo.h                           |  32 ++
> > .../gcc.dg/builtin-typeinfo-errors.c          |  28 ++
> > gcc/testsuite/gcc.dg/builtin-typeinfo.c       | 350 +++++++++++++
> > gcc/c-family/c-common.cc                      |   2 +
> > gcc/c/c-parser.cc                             |  72 +++
> > gcc/doc/extend.texi                           |  94 ++++
> > gcc/kcfi-typeinfo.cc                          | 475 ++++++++++++++++++
> > 9 files changed, 1055 insertions(+)
> > create mode 100644 gcc/kcfi-typeinfo.h
> > create mode 100644 gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c
> > create mode 100644 gcc/testsuite/gcc.dg/builtin-typeinfo.c
> > create mode 100644 gcc/kcfi-typeinfo.cc
> > 
> > diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> > index d2744db843d7..a14fb498ce44 100644
> > --- a/gcc/Makefile.in
> > +++ b/gcc/Makefile.in
> > @@ -1591,6 +1591,7 @@ OBJS = \
> > ira-emit.o \
> > ira-lives.o \
> > jump.o \
> > + kcfi-typeinfo.o \
> > langhooks.o \
> > late-combine.o \
> > lcm.o \
> > diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
> > index b6021d241731..e0100837946e 100644
> > --- a/gcc/c-family/c-common.h
> > +++ b/gcc/c-family/c-common.h
> > @@ -112,6 +112,7 @@ enum rid
> >   RID_BUILTIN_SHUFFLEVECTOR,   RID_BUILTIN_CONVERTVECTOR,  RID_BUILTIN_TGMATH,
> >   RID_BUILTIN_HAS_ATTRIBUTE,   RID_BUILTIN_ASSOC_BARRIER,  RID_BUILTIN_STDC,
> >   RID_BUILTIN_COUNTED_BY_REF,
> > +  RID_BUILTIN_TYPEINFO_NAME,  RID_BUILTIN_TYPEINFO_HASH,
> >   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128, RID_DFLOAT64X,
> > 
> >   /* TS 18661-3 keywords, in the same sequence as the TI_* values.  */
> > diff --git a/gcc/kcfi-typeinfo.h b/gcc/kcfi-typeinfo.h
> > new file mode 100644
> > index 000000000000..805f9ebaeca4
> > --- /dev/null
> > +++ b/gcc/kcfi-typeinfo.h
> > @@ -0,0 +1,32 @@
> > +/* KCFI-compatible type mangling, based on Itanium C++ ABI.
> > +   Copyright (C) 2025 Free Software Foundation, Inc.
> > +
> > +This file is part of GCC.
> > +
> > +GCC is free software; you can redistribute it and/or modify it under
> > +the terms of the GNU General Public License as published by the Free
> > +Software Foundation; either version 3, or (at your option) any later
> > +version.
> > +
> > +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> > +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> > +for more details.
> > +
> > +You should have received a copy of the GNU General Public License
> > +along with GCC; see the file COPYING3.  If not see
> > +<http://www.gnu.org/licenses/>.  */
> > +
> > +#ifndef GCC_KCFI_TYPEINFO_H
> > +#define GCC_KCFI_TYPEINFO_H
> > +
> > +#include "tree.h"
> > +#include <string>
> > +
> > +/* Get the typeinfo mangled name string for any C type.  */
> > +extern std::string typeinfo_get_name (tree type);
> > +
> > +/* Get the typeinfo hash for any C type.  */
> > +extern uint32_t typeinfo_get_hash (tree type);
> > +
> > +#endif /* GCC_KCFI_TYPEINFO_H */
> > diff --git a/gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c b/gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c
> > new file mode 100644
> > index 000000000000..71ad01337b4e
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/builtin-typeinfo-errors.c
> > @@ -0,0 +1,28 @@
> > +/* Test error handling for __builtin_typeinfo_name and __builtin_typeinfo_hash.  */
> > +/* { dg-do compile } */
> > +
> > +int main() {
> > +    /* Test missing arguments */
> > +    const char *result1 = __builtin_typeinfo_name(); /* { dg-error "expected specifier-qualifier-list before '\\)'" } */
> > +    /* { dg-error "expected type name in '__builtin_typeinfo_name'" "" { target *-*-* } .-1 } */
> > +    unsigned int result2 = __builtin_typeinfo_hash(); /* { dg-error "expected specifier-qualifier-list before '\\)'" } */
> > +    /* { dg-error "expected type name in '__builtin_typeinfo_hash'" "" { target *-*-* } .-1 } */
> > +
> > +    /* Test wrong argument types (expressions instead of type names) */
> > +    const char *result3 = __builtin_typeinfo_name(42); /* { dg-error "expected specifier-qualifier-list before numeric constant" } */
> > +    /* { dg-error "expected type name in '__builtin_typeinfo_name'" "" { target *-*-* } .-1 } */
> > +    unsigned int result4 = __builtin_typeinfo_hash(42); /* { dg-error "expected specifier-qualifier-list before numeric constant" } */
> > +    /* { dg-error "expected type name in '__builtin_typeinfo_hash'" "" { target *-*-* } .-1 } */
> > +
> > +    int x = 5;
> > +    const char *result5 = __builtin_typeinfo_name(x); /* { dg-error "expected specifier-qualifier-list before" } */
> > +    /* { dg-error "expected type name in '__builtin_typeinfo_name'" "" { target *-*-* } .-1 } */
> > +    unsigned int result6 = __builtin_typeinfo_hash(x); /* { dg-error "expected specifier-qualifier-list before" } */
> > +    /* { dg-error "expected type name in '__builtin_typeinfo_hash'" "" { target *-*-* } .-1 } */
> > +
> > +    /* Test too many arguments */
> > +    const char *result7 = __builtin_typeinfo_name(int, int); /* { dg-error "expected '\\)' before ','" } */
> > +    unsigned int result8 = __builtin_typeinfo_hash(int, int); /* { dg-error "expected '\\)' before ','" } */
> > +
> > +    return 0;
> > +}
> > diff --git a/gcc/testsuite/gcc.dg/builtin-typeinfo.c b/gcc/testsuite/gcc.dg/builtin-typeinfo.c
> > new file mode 100644
> > index 000000000000..744dc50f407e
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/builtin-typeinfo.c
> > @@ -0,0 +1,350 @@
> > +/* Test KCFI type mangling using __builtin_typeinfo_name.  */
> > +/* { dg-do run } */
> > +/* { dg-options "-std=gnu99" } */
> > +
> > +#include <stdio.h>
> > +#include <string.h>
> > +#include <stdarg.h>
> > +
> > +int pass, fail;
> > +
> > +#define TEST_STRING(expr, expected_string) \
> > +  do { \
> > +    const char *actual_string = __builtin_typeinfo_name(typeof(expr)); \
> > +    printf("Testing %s: ", #expr); \
> > +    if (strcmp(actual_string, expected_string) == 0) { \
> > +      printf("PASS (%s)\n", actual_string); \
> > +      pass ++; \
> > +    } else { \
> > +      printf("FAIL\n"); \
> > +      printf("  Expected: %s\n", expected_string); \
> > +      printf("  Actual:   %s\n", actual_string); \
> > +      fail ++; \
> > +    } \
> > +  } while (0)
> > +
> > +int main(void)
> > +{
> > +    printf("Testing KCFI Typeinfo Mangling\n");
> > +    printf("======================================================\n");
> > +
> > +    /* Test basic types */
> > +    TEST_STRING(void, "_ZTSv");
> > +    TEST_STRING(char, "_ZTSc");
> > +    TEST_STRING(int, "_ZTSi");
> > +    TEST_STRING(short, "_ZTSs");
> > +    TEST_STRING(long, "_ZTSl");
> > +    TEST_STRING(float, "_ZTSf");
> > +    TEST_STRING(double, "_ZTSd");
> > +
> > +    /* Test qualified types */
> > +    TEST_STRING(const int, "_ZTSKi");
> > +    TEST_STRING(volatile int, "_ZTSVi");
> > +
> > +    /* Test pointer types */
> > +    TEST_STRING(char*, "_ZTSPc");
> > +    TEST_STRING(int*, "_ZTSPi");
> > +    TEST_STRING(void*, "_ZTSPv");
> > +    TEST_STRING(const char*, "_ZTSPKc");
> > +
> > +    /* Test array types */
> > +    TEST_STRING(int[10],  "_ZTSA10_i");
> > +    TEST_STRING(char[20], "_ZTSA20_c");
> > +    TEST_STRING(short[],  "_ZTSA_s");
> > +
> > +    /* Test basic function types */
> > +    extern void func_void(void);
> > +    extern void func_char(char x);
> > +    extern void func_short(short x);
> > +    extern void func_int(int x);
> > +    extern void func_long(long x);
> > +    TEST_STRING(func_void,  "_ZTSFvvE");
> > +    TEST_STRING(func_char,  "_ZTSFvcE");
> > +    TEST_STRING(func_short, "_ZTSFvsE");
> > +    TEST_STRING(func_int,   "_ZTSFviE");
> > +    TEST_STRING(func_long,  "_ZTSFvlE");
> > +
> > +    /* Test functions with unsigned types */
> > +    extern void func_unsigned_char(unsigned char x);
> > +    extern void func_unsigned_short(unsigned short x);
> > +    extern void func_unsigned_int(unsigned int x);
> > +    TEST_STRING(func_unsigned_char,  "_ZTSFvhE");
> > +    TEST_STRING(func_unsigned_short, "_ZTSFvtE");
> > +    TEST_STRING(func_unsigned_int,   "_ZTSFvjE");
> > +
> > +    /* Test functions with signed types */
> > +    extern void func_signed_char(signed char x);
> > +    extern void func_signed_short(signed short x);
> > +    extern void func_signed_int(signed int x);
> > +    TEST_STRING(func_signed_char,  "_ZTSFvaE");
> > +    TEST_STRING(func_signed_short, "_ZTSFvsE");
> > +    TEST_STRING(func_signed_int,   "_ZTSFviE");
> > +
> > +    /* Test functions with pointer types */
> > +    extern void func_void_ptr(void *x);
> > +    extern void func_char_ptr(char *x);
> > +    extern void func_short_ptr(short *x);
> > +    extern void func_int_ptr(int *x);
> > +    extern void func_int_array(int arr[]); /* Decays to "int *".  */
> > +    extern void func_long_ptr(long *x);
> > +    TEST_STRING(func_void_ptr,  "_ZTSFvPvE");
> > +    TEST_STRING(func_char_ptr,  "_ZTSFvPcE");
> > +    TEST_STRING(func_short_ptr, "_ZTSFvPsE");
> > +    TEST_STRING(func_int_ptr,   "_ZTSFvPiE");
> > +    TEST_STRING(func_int_array, "_ZTSFvPiE");
> > +    TEST_STRING(func_long_ptr,  "_ZTSFvPlE");
> > +
> > +    /* Test functions with const qualifiers */
> > +    extern void func_const_void_ptr(const void *x);
> > +    extern void func_const_char_ptr(const char *x);
> > +    extern void func_const_short_ptr(const short *x);
> > +    extern void func_const_int_ptr(const int *x);
> > +    extern void func_const_long_ptr(const long *x);
> > +    TEST_STRING(func_const_void_ptr,  "_ZTSFvPKvE");
> > +    TEST_STRING(func_const_char_ptr,  "_ZTSFvPKcE");
> > +    TEST_STRING(func_const_short_ptr, "_ZTSFvPKsE");
> > +    TEST_STRING(func_const_int_ptr,   "_ZTSFvPKiE");
> > +    TEST_STRING(func_const_long_ptr,  "_ZTSFvPKlE");
> > +
> > +    /* Test nested pointers */
> > +    extern void func_int_ptr_ptr(int **x);
> > +    extern void func_char_ptr_ptr(char **x);
> > +    TEST_STRING(func_int_ptr_ptr,  "_ZTSFvPPiE");
> > +    TEST_STRING(func_char_ptr_ptr, "_ZTSFvPPcE");
> > +
> > +    /* Test multiple parameters */
> > +    extern void func_int_char(int x, char y);
> > +    extern void func_char_int(char x, int y);
> > +    extern void func_two_int(int x, int y);
> > +    TEST_STRING(func_int_char, "_ZTSFvicE");
> > +    TEST_STRING(func_char_int, "_ZTSFvciE");
> > +    TEST_STRING(func_two_int,  "_ZTSFviiE");
> > +
> > +    /* Test return types */
> > +    extern int func_return_int(void);
> > +    extern char func_return_char(void);
> > +    extern void* func_return_ptr(void);
> > +    TEST_STRING(func_return_int,  "_ZTSFivE");
> > +    TEST_STRING(func_return_char, "_ZTSFcvE");
> > +    TEST_STRING(func_return_ptr,  "_ZTSFPvvE");
> > +
> > +    /* Test function pointer parameters */
> > +    extern void func_fptr_void(void (*fp)(void));
> > +    extern void func_fptr_int(void (*fp)(int));
> > +    extern void func_fptr_ret_int(int (*fp)(void));
> > +    TEST_STRING(func_fptr_void,    "_ZTSFvPFvvEE");
> > +    TEST_STRING(func_fptr_int,     "_ZTSFvPFviEE");
> > +    TEST_STRING(func_fptr_ret_int, "_ZTSFvPFivEE");
> > +
> > +    /* Test variadic functions */
> > +    struct audit_context { int dummy; };
> > +    extern void func_variadic_simple(const char *fmt, ...);
> > +    extern void func_variadic_mixed(int x, const char *fmt, ...);
> > +    extern void func_variadic_multi(int x, char y, const char *fmt, ...);
> > +    extern void audit_log_pattern(struct audit_context *ctx, unsigned int gfp_mask,
> > +  int type, const char *fmt, ...);
> > +    TEST_STRING(func_variadic_simple, "_ZTSFvPKczE");
> > +    TEST_STRING(func_variadic_mixed,  "_ZTSFviPKczE");
> > +    TEST_STRING(func_variadic_multi,  "_ZTSFvicPKczE");
> > +    TEST_STRING(audit_log_pattern,    "_ZTSFvP13audit_contextjiPKczE");
> > +
> > +    /* Test mixed const/non-const */
> > +    extern void func_const_mixed(int x, const char *fmt);
> > +    TEST_STRING(func_const_mixed,  "_ZTSFviPKcE");
> > +
> > +    /* Test named struct types */
> > +    struct test_struct_a { int x; };
> > +    struct test_struct_b { char y; };
> > +    struct test_struct_c { void *ptr; };
> > +    TEST_STRING(struct test_struct_a, "_ZTS13test_struct_a");
> > +    extern void func_struct_a_ptr(struct test_struct_a *x);
> > +    extern void func_struct_b_ptr(struct test_struct_b *x);
> > +    extern void func_struct_c_ptr(struct test_struct_c *x);
> > +    TEST_STRING(func_struct_a_ptr, "_ZTSFvP13test_struct_aE");
> > +    TEST_STRING(func_struct_b_ptr, "_ZTSFvP13test_struct_bE");
> > +    TEST_STRING(func_struct_c_ptr, "_ZTSFvP13test_struct_cE");
> > +
> > +    /* Test const named struct types */
> > +    extern void func_const_struct_a_ptr(const struct test_struct_a *x);
> > +    extern void func_const_struct_b_ptr(const struct test_struct_b *x);
> > +    extern void func_const_struct_c_ptr(const struct test_struct_c *x);
> > +    TEST_STRING(func_const_struct_a_ptr, "_ZTSFvPK13test_struct_aE");
> > +    TEST_STRING(func_const_struct_b_ptr, "_ZTSFvPK13test_struct_bE");
> > +    TEST_STRING(func_const_struct_c_ptr, "_ZTSFvPK13test_struct_cE");
> > +
> > +    /* Test named union types */
> > +    union test_union_a { int x; float y; };
> > +    union test_union_b { char a; void *b; };
> > +    TEST_STRING(union test_union_a,  "_ZTS12test_union_a");
> > +    extern void func_union_a_ptr(union test_union_a *x);
> > +    extern void func_union_b_ptr(union test_union_b *x);
> > +    TEST_STRING(func_union_a_ptr, "_ZTSFvP12test_union_aE");
> > +    TEST_STRING(func_union_b_ptr, "_ZTSFvP12test_union_bE");
> > +
> > +    /* Test enum types: distinct from int */
> > +    enum test_enum_a { ENUM_A_VAL };
> > +    enum test_enum_b { ENUM_B_VAL };
> > +    TEST_STRING(enum test_enum_a, "_ZTS11test_enum_a");
> > +    extern void func_enum_a_ptr(enum test_enum_a *x);
> > +    extern void func_enum_b_ptr(enum test_enum_b *x);
> > +    TEST_STRING(func_enum_a_ptr, "_ZTSFvP11test_enum_aE");
> > +    TEST_STRING(func_enum_b_ptr, "_ZTSFvP11test_enum_bE");
> > +
> > +    /* Test union member discrimination */
> > +    struct tasklet {
> > +        int state;
> > +        union {
> > +            void (*func)(unsigned long data);
> > +            void (*callback)(struct tasklet *t);
> > +        };
> > +        unsigned long data;
> > +    } tasklet_instance;
> > +    TEST_STRING(tasklet_instance, "_ZTS7tasklet");
> > +    struct tasklet *p = &tasklet_instance;
> > +    extern void tasklet_callback_function(struct tasklet *t);
> > +    extern void tasklet_func_function(unsigned long data);
> > +    TEST_STRING(tasklet_func_function,     "_ZTSFvmE");
> > +    TEST_STRING(*p->func,                  "_ZTSFvmE");
> > +    TEST_STRING(tasklet_callback_function, "_ZTSFvP7taskletE");
> > +    TEST_STRING(*p->callback,              "_ZTSFvP7taskletE");
> > +
> > +    /* Test struct return pointers */
> > +    extern struct test_struct_a* func_ret_struct_a_ptr(void);
> > +    extern struct test_struct_b* func_ret_struct_b_ptr(void);
> > +    extern struct test_struct_c* func_ret_struct_c_ptr(void);
> > +    TEST_STRING(func_ret_struct_a_ptr, "_ZTSFP13test_struct_avE");
> > +    TEST_STRING(func_ret_struct_b_ptr, "_ZTSFP13test_struct_bvE");
> > +    TEST_STRING(func_ret_struct_c_ptr, "_ZTSFP13test_struct_cvE");
> > +
> > +    /* Test struct by-value parameters */
> > +    extern void func_struct_a_val(struct test_struct_a x);
> > +    extern void func_struct_b_val(struct test_struct_b x);
> > +    extern void func_struct_c_val(struct test_struct_c x);
> > +    TEST_STRING(func_struct_a_val, "_ZTSFv13test_struct_aE");
> > +    TEST_STRING(func_struct_b_val, "_ZTSFv13test_struct_bE");
> > +    TEST_STRING(func_struct_c_val, "_ZTSFv13test_struct_cE");
> > +
> > +    /* Test struct return by-value */
> > +    extern struct test_struct_a func_ret_struct_a_val(void);
> > +    extern struct test_struct_b func_ret_struct_b_val(void);
> > +    extern struct test_struct_c func_ret_struct_c_val(void);
> > +    TEST_STRING(func_ret_struct_a_val, "_ZTSF13test_struct_avE");
> > +    TEST_STRING(func_ret_struct_b_val, "_ZTSF13test_struct_bvE");
> > +    TEST_STRING(func_ret_struct_c_val, "_ZTSF13test_struct_cvE");
> > +
> > +    /* Test mixed struct parameters */
> > +    extern void func_struct_a_b(struct test_struct_a *a, struct test_struct_b *b);
> > +    extern void func_struct_b_a(struct test_struct_b *b, struct test_struct_a *a);
> > +    TEST_STRING(func_struct_a_b, "_ZTSFvP13test_struct_aP13test_struct_bE");
> > +    TEST_STRING(func_struct_b_a, "_ZTSFvP13test_struct_bP13test_struct_aE");
> > +
> > +    /* Test anonymous struct typedefs */
> > +    typedef struct { int x; } typedef_struct_x;
> > +    typedef struct { int y; } typedef_struct_y;
> > +    TEST_STRING(typedef_struct_x, "_ZTS16typedef_struct_x");
> > +    extern void func_typedef_x_ptr(typedef_struct_x *x);
> > +    extern void func_typedef_y_ptr(typedef_struct_y *y);
> > +    TEST_STRING(func_typedef_x_ptr, "_ZTSFvP16typedef_struct_xE");
> > +    TEST_STRING(func_typedef_y_ptr, "_ZTSFvP16typedef_struct_yE");
> > +    extern void func_typedef_x(typedef_struct_x x);
> > +    TEST_STRING(func_typedef_x, "_ZTSFv16typedef_struct_xE");
> > +
> > +    /* Test anonymous union typedefs */
> > +    typedef union { int x; short a; } typedef_union_x;
> > +    typedef union { int y; short b; } typedef_union_y;
> > +    TEST_STRING(typedef_union_x, "_ZTS15typedef_union_x");
> > +    extern void func_typedef_union_x_ptr(typedef_union_x *x);
> > +    extern void func_typedef_union_y_ptr(typedef_union_y *y);
> > +    TEST_STRING(func_typedef_union_x_ptr, "_ZTSFvP15typedef_union_xE");
> > +    TEST_STRING(func_typedef_union_y_ptr, "_ZTSFvP15typedef_union_yE");
> > +    extern void func_typedef_union_x(typedef_union_x x);
> > +    TEST_STRING(func_typedef_union_x, "_ZTSFv15typedef_union_xE");
> > +
> > +    /* Test anonymous enum typedefs */
> > +    typedef enum { STEP_1, STEP_2 } typedef_enum_x;
> > +    typedef enum { STEP_A, STEP_B } typedef_enum_y;
> > +    TEST_STRING(typedef_enum_x, "_ZTS14typedef_enum_x");
> > +    extern void func_typedef_enum_x_ptr(typedef_enum_x *x);
> > +    extern void func_typedef_enum_y_ptr(typedef_enum_y *y);
> > +    TEST_STRING(func_typedef_enum_x_ptr, "_ZTSFvP14typedef_enum_xE");
> > +    TEST_STRING(func_typedef_enum_y_ptr, "_ZTSFvP14typedef_enum_yE");
> > +    extern void func_typedef_enum_x(typedef_enum_x x);
> > +    TEST_STRING(func_typedef_enum_x, "_ZTSFv14typedef_enum_xE");
> > +
> > +    /* Test basic typedef vs open-coded function types: should be the same.  */
> > +    typedef void (*func_type_typedef)(int, char);
> > +    TEST_STRING(func_type_typedef,           "_ZTSPFvicE");
> > +    extern void func_with_typedef_param(func_type_typedef fp);
> > +    extern void func_with_opencoded_param(void (*fp)(int, char));
> > +    TEST_STRING(func_with_typedef_param,   "_ZTSFvPFvicEE");
> > +    TEST_STRING(func_with_opencoded_param, "_ZTSFvPFvicEE");
> > +
> > +    /* Test return function pointer types */
> > +    typedef int (*ret_func_type_typedef)(void);
> > +    TEST_STRING(ret_func_type_typedef,     "_ZTSPFivE");
> > +    extern ret_func_type_typedef func_ret_typedef_param(void);
> > +    extern int (*func_ret_opencoded_param(void))(void);
> > +    TEST_STRING(func_ret_typedef_param,   "_ZTSFPFivEvE");
> > +    TEST_STRING(func_ret_opencoded_param, "_ZTSFPFivEvE");
> > +
> > +    /* Test additional type combos */
> > +    extern void func_float(float x);
> > +    extern void func_double_ptr(double *x);
> > +    extern void func_float_ptr(float *x);
> > +    extern void func_void_ptr_ptr(void **x);
> > +    extern void func_ptr_val(int *x, int y);
> > +    extern void func_val_ptr(int x, int *y);
> > +    extern float func_return_float(void);
> > +    extern double func_return_double(void);
> > +    TEST_STRING(func_float,         "_ZTSFvfE");
> > +    TEST_STRING(func_double_ptr,    "_ZTSFvPdE");
> > +    TEST_STRING(func_float_ptr,     "_ZTSFvPfE");
> > +    TEST_STRING(func_void_ptr_ptr,  "_ZTSFvPPvE");
> > +    TEST_STRING(func_ptr_val,       "_ZTSFvPiiE");
> > +    TEST_STRING(func_val_ptr,       "_ZTSFviPiE");
> > +    TEST_STRING(func_return_float,  "_ZTSFfvE");
> > +    TEST_STRING(func_return_double, "_ZTSFdvE");
> > +
> > +    /* Test VLA types: should be all the same.  */
> > +    extern void func_vla_1d(int n, int arr[n]);
> > +    extern void func_vla_empty(int n, int arr[]);
> > +    extern void func_vla_ptr(int n, int *arr);
> > +    TEST_STRING(func_vla_1d,    "_ZTSFviPiE");
> > +    TEST_STRING(func_vla_empty, "_ZTSFviPiE");
> > +    TEST_STRING(func_vla_ptr,   "_ZTSFviPiE");
> > +
> > +    /* Test 2D VLA with fixed dimension: should be all the same.  */
> > +    extern void func_vla_2d_first(int n, int arr[n][10]);
> > +    extern void func_vla_2d_empty(int n, int arr[][10]);
> > +    extern void func_vla_2d_ptr(int n, int (*arr)[10]);
> > +    TEST_STRING(func_vla_2d_first, "_ZTSFviPA10_iE");
> > +    TEST_STRING(func_vla_2d_empty, "_ZTSFviPA10_iE");
> > +    TEST_STRING(func_vla_2d_ptr,   "_ZTSFviPA10_iE");
> > +
> > +    /* Test 2D VLA with both dimensions variable: should be all the same.  */
> > +    extern void func_vla_2d_both(int rows, int cols, int arr[rows][cols]);
> > +    extern void func_vla_2d_second(int rows, int cols, int arr[][cols]);
> > +    extern void func_vla_2d_star(int rows, int cols, int arr[*][cols]);
> > +    TEST_STRING(func_vla_2d_both,   "_ZTSFviiPA_iE");
> > +    TEST_STRING(func_vla_2d_second, "_ZTSFviiPA_iE");
> > +    TEST_STRING(func_vla_2d_star,   "_ZTSFviiPA_iE");
> > +
> > +    /* Test recursive typedef canonicalization */
> > +    struct recursive_struct_test { int field; };
> > +    typedef struct recursive_struct_test recursive_struct_typedef_1;
> > +    typedef recursive_struct_typedef_1 recursive_struct_typedef_2;
> > +    extern void func_recursive_struct_test(struct recursive_struct_test *x);
> > +    TEST_STRING(func_recursive_struct_test, "_ZTSFvP21recursive_struct_testE");
> > +
> > +    /* Test anonymous struct, union, enum types */
> > +    struct { int a; short b; } anon_struct;
> > +    union { int x; float y; } anon_union;
> > +    enum { ANON_VAL1, ANON_VAL2 } anon_enum;
> > +    TEST_STRING(anon_struct, "_ZTS3$_0"); // <length>$_<counter>
> > +    TEST_STRING(anon_union, "_ZTS3$_1");  // <length>$_<counter>
> > +    TEST_STRING(anon_enum, "_ZTS3$_2");   // <length>$_<counter>
> > +
> > +    printf("\n================================================================\n");
> > +    printf("Passed: %d Failed: %d (%d total tests)\n", pass, fail, pass + fail);
> > +    return fail;
> > +}
> > diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> > index e7dd4602ac11..94f2c2001ad5 100644
> > --- a/gcc/c-family/c-common.cc
> > +++ b/gcc/c-family/c-common.cc
> > @@ -461,6 +461,8 @@ const struct c_common_resword c_common_reswords[] =
> >   { "__builtin_stdc_trailing_zeros", RID_BUILTIN_STDC, D_CONLY },
> >   { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
> >   { "__builtin_offsetof", RID_OFFSETOF, 0 },
> > +  { "__builtin_typeinfo_hash", RID_BUILTIN_TYPEINFO_HASH, D_CONLY },
> > +  { "__builtin_typeinfo_name", RID_BUILTIN_TYPEINFO_NAME, D_CONLY },
> >   { "__builtin_types_compatible_p", RID_TYPES_COMPATIBLE_P, D_CONLY },
> >   { "__builtin_c23_va_start", RID_C23_VA_START, D_C23 },
> >   { "__builtin_va_arg", RID_VA_ARG, 0 },
> > diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> > index e8b64948bf69..996fb576ac7c 100644
> > --- a/gcc/c/c-parser.cc
> > +++ b/gcc/c/c-parser.cc
> > @@ -77,6 +77,7 @@ along with GCC; see the file COPYING3.  If not see
> > #include "asan.h"
> > #include "c-family/c-ubsan.h"
> > #include "gcc-urlifier.h"
> > +#include "kcfi-typeinfo.h"
> > 
> > /* We need to walk over decls with incomplete struct/union/enum types
> >    after parsing the whole translation unit.
> > @@ -11017,6 +11018,38 @@ c_parser_has_attribute_expression (c_parser *parser)
> >   return result;
> > }
> > 
> > +/* Parse the single type name argument of a builtin that takes a type name.
> > +   Returns true on success and stores the parsed type in *OUT_TYPE.
> > +   If successful, *OUT_CLOSE_PAREN_LOC is written with the location of
> > +   the closing parenthesis.  */
> > +
> > +static bool
> > +c_parser_get_builtin_type_arg (c_parser *parser, const char *bname,
> > +       tree *out_type, location_t *out_close_paren_loc)
> > +{
> > +  matching_parens parens;
> > +  if (!parens.require_open (parser))
> > +    return false;
> > +
> > +  struct c_type_name *type_name = c_parser_type_name (parser);
> > +  if (type_name == NULL)
> > +    {
> > +      error_at (c_parser_peek_token (parser)->location,
> > + "expected type name in %qs", bname);
> > +      return false;
> > +    }
> > +
> > +  *out_close_paren_loc = c_parser_peek_token (parser)->location;
> > +  parens.skip_until_found_close (parser);
> > +
> > +  tree type = groktypename (type_name, NULL, NULL);
> > +  if (type == error_mark_node)
> > +    return false;
> > +
> > +  *out_type = type;
> > +  return true;
> > +}
> > +
> > /* Helper function to read arguments of builtins which are interfaces
> >    for the middle-end nodes like COMPLEX_EXPR, VEC_PERM_EXPR and
> >    others.  The name of the builtin is passed using BNAME parameter.
> > @@ -12025,6 +12058,45 @@ c_parser_postfix_expression (c_parser *parser)
> >    set_c_expr_source_range (&expr, loc, close_paren_loc);
> >  }
> >  break;
> > + case RID_BUILTIN_TYPEINFO_NAME:
> > +  {
> > +    c_parser_consume_token (parser);
> > +    location_t close_paren_loc;
> > +    tree type;
> > +    if (!c_parser_get_builtin_type_arg (parser,
> > + "__builtin_typeinfo_name",
> > + &type, &close_paren_loc))
> > +      {
> > + expr.set_error ();
> > + break;
> > +      }
> > +
> > +    /* Call the typeinfo name function.  */
> > +    std::string type_name = typeinfo_get_name (type);
> > +    expr.value = build_string_literal (type_name.length () + 1,
> > +       type_name.c_str ());
> > +    set_c_expr_source_range (&expr, loc, close_paren_loc);
> > +  }
> > +  break;
> > + case RID_BUILTIN_TYPEINFO_HASH:
> > +  {
> > +    c_parser_consume_token (parser);
> > +    location_t close_paren_loc;
> > +    tree type;
> > +    if (!c_parser_get_builtin_type_arg (parser,
> > + "__builtin_typeinfo_hash",
> > + &type, &close_paren_loc))
> > +      {
> > + expr.set_error ();
> > + break;
> > +      }
> > +
> > +    /* Call the typeinfo hash function.  */
> > +    uint32_t type_hash = typeinfo_get_hash (type);
> > +    expr.value = build_int_cst (unsigned_type_node, type_hash);
> > +    set_c_expr_source_range (&expr, loc, close_paren_loc);
> > +  }
> > +  break;
> > case RID_BUILTIN_TGMATH:
> >  {
> >    vec<c_expr_t, va_gc> *cexpr_list;
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index 382295834035..7cddea1ed6c1 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -17547,6 +17547,100 @@ which will cause a @code{NULL} pointer to be used for the unsafe case.
> > 
> > @enddefbuiltin
> > 
> > +@defbuiltin{{unsigned int} __builtin_typeinfo_hash (@var{type})}
> > +
> > +The built-in function @code{__builtin_typeinfo_hash} returns a hash value
> > +for the given type @var{type} (which is a type, not an expression).  The hash
> > +is computed using the FNV-1a algorithm on the type's mangled name representation,
> > +which follows a subset of the Itanium C++ ABI conventions adapted for C types.
> > +(See @code{__builtin_typeinfo_name} for the string representation.)
> > +
> > +This built-in is primarily intended for kernel control flow integrity (KCFI)
> > +implementations and other type-aware runtime systems that need to generate
> > +consistent type identifiers.  The hash value is a 32-bit unsigned integer.
> > +
> > +Key characteristics of the hash:
> > +@itemize @bullet
> > +@item
> > +The hash is consistent for the same type across different translation units.
> > +@item
> > +Typedefs are recursively canonicalized down to the underlying basic type
> > +or to the named struct, union, or enum tag.
> > +@item
> > +Typedefs of anonymous structs, unions, and enums preserve the typedef name
> > +in the hash calculation (e.g., @code{typedef struct @{ int x; @} foo_t;}
> > +uses @code{foo_t} in the hash).
> > +@item
> > +Type qualifiers (@code{const}, @code{volatile}, @code{restrict}) affect
> > +the hash value (except on by-value function parameters, where they are ignored).
> > +@item
> > +Function types include parameter types and variadic markers in the hash.
> > +@end itemize
> > +
> > +For example:
> > +@smallexample
> > +typedef struct @{ int x; @} mytype_t;
> > +unsigned int hash1 = __builtin_typeinfo_hash(mytype_t);
> > +unsigned int hash2 = __builtin_typeinfo_hash(struct @{ int x; @});
> > +/* hash1 != hash2 because the typedef name is preserved */
> > +
> > +void func(int x, char y);
> > +unsigned int hash3 = __builtin_typeinfo_hash(typeof(func));
> > +/* Returns hash for function type "void(int, char)" */
> > +@end smallexample
> > +
> > +@emph{Note:} This construct is only available for C@. For C++, see
> > +@code{std::type_info::hash_code}.
> > +
> > +@enddefbuiltin
> > +
> > +@defbuiltin{{const char *} __builtin_typeinfo_name (@var{type})}
> > +
> > +The built-in function @code{__builtin_typeinfo_name} returns a string
> > +containing the mangled name representation of the given type @var{type}
> > +(which is a type, not an expression).  The string follows a subset of the
> > +Itanium C++ ABI mangling conventions adapted for C types.  (See
> > +@code{__builtin_typeinfo_hash} for the unsigned 32-bit hash representation.)
> > +
> > +The returned string is a compile-time constant suitable for use in
> > +string comparisons, debugging output, or other type introspection needs.
> > +The string begins with @code{_ZTS} followed by the encoded type information.
> > +
> > +Mangling examples:
> > +@itemize @bullet
> > +@item
> > +@code{int} becomes @code{"_ZTSi"}
> > +@item
> > +@code{char *} becomes @code{"_ZTSPc"}
> > +@item
> > +@code{const int} becomes @code{"_ZTSKi"}
> > +@item
> > +@code{int[10]} becomes @code{"_ZTSA10_i"}
> > +@item
> > +@code{void (*)(int)} becomes @code{"_ZTSPFviE"}
> > +@item
> > +@code{struct foo} becomes @code{"_ZTS3foo"}
> > +@item
> > +@code{typedef struct @{ int x; @} bar_t;} becomes @code{"_ZTS5bar_t"}
> > +@end itemize
> > +
> > +The mangling preserves typedef names for anonymous compound types, which
> > +is particularly useful for distinguishing between different typedefs of
> > +structurally identical anonymous types:
> > +
> > +@smallexample
> > +typedef struct @{ int x; @} type_a;
> > +typedef struct @{ int x; @} type_b;
> > +const char *name_a = __builtin_typeinfo_name(type_a);  /* "_ZTS6type_a" */
> > +const char *name_b = __builtin_typeinfo_name(type_b);  /* "_ZTS6type_b" */
> > +/* name_a and name_b are different despite identical structure */
> > +@end smallexample
> > +
> > +@emph{Note:} This construct is only available for C@. For C++, see
> > +@code{std::type_info::name}.
> > +
> > +@enddefbuiltin
> > +
> > @defbuiltin{int __builtin_types_compatible_p (@var{type1}, @var{type2})}
> > 
> > You can use the built-in function @code{__builtin_types_compatible_p} to
> > diff --git a/gcc/kcfi-typeinfo.cc b/gcc/kcfi-typeinfo.cc
> > new file mode 100644
> > index 000000000000..24099c42cc2e
> > --- /dev/null
> > +++ b/gcc/kcfi-typeinfo.cc
> > @@ -0,0 +1,475 @@
> > +/* KCFI-compatible type mangling, based on Itanium C++ ABI.
> > +   Copyright (C) 2025 Free Software Foundation, Inc.
> > +
> > +This file is part of GCC.
> > +
> > +GCC is free software; you can redistribute it and/or modify it under
> > +the terms of the GNU General Public License as published by the Free
> > +Software Foundation; either version 3, or (at your option) any later
> > +version.
> > +
> > +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> > +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> > +for more details.
> > +
> > +You should have received a copy of the GNU General Public License
> > +along with GCC; see the file COPYING3.  If not see
> > +<http://www.gnu.org/licenses/>.  */
> > +
> > +/* Produces typeinfo mangling similar to Itanium C++ Mangling ABI, but
> > +   limited to types exposed within GCC for C language handling.  The
> > +   hashes are used by KCFI (and future type-aware allocator support).
> > +   The strings are used for testing and debugging.  */
> > +
> > +#include "config.h"
> > +#include "system.h"
> > +#include "coretypes.h"
> > +#include "tree.h"
> > +#include "diagnostic-core.h"
> > +#include "stringpool.h"
> > +#include "stor-layout.h"
> > +#include "print-tree.h"
> > +#include "kcfi-typeinfo.h"
> > +
> > +/* Helper to update FNV-1a hash with a single character.  */
> > +
> > +static inline void
> > +fnv1a_hash_char (uint32_t *hash_state, unsigned char c)
> > +{
> > +  *hash_state ^= c;
> > +  *hash_state *= 16777619U; /* FNV-1a 32-bit prime.  */
> > +}
> > +
> > +/* Helper to append character to optional string and update hash using
> > +   FNV-1a.  */
> > +
> > +static void
> > +append_char (char c, std::string *out_str, uint32_t *hash_state)
> > +{
> > +  if (out_str)
> > +    *out_str += c;
> > +  if (!hash_state)
> > +    return;
> > +  fnv1a_hash_char (hash_state, (unsigned char) c);
> > +}
> > +
> > +/* Helper to append string to optional string and update hash using
> > +   FNV-1a.  */
> > +
> > +static void
> > +append_string (const char *str, std::string *out_str, uint32_t *hash_state)
> > +{
> > +  if (out_str)
> > +    *out_str += str;
> > +  if (!hash_state)
> > +    return;
> > +  for (const char *p = str; *p; p++)
> > +    fnv1a_hash_char (hash_state, (unsigned char) *p);
> > +}
> > +
> > +/* Forward declaration for recursive type mangling.  */
> > +
> > +static void mangle_type (tree type, std::string *out_str, uint32_t *hash_state);
> > +
> > +/* Mangle a builtin type following Itanium C++ ABI for C types.  */
> > +
> > +static void
> > +mangle_builtin_type (tree type, std::string *out_str, uint32_t *hash_state)
> > +{
> > +  gcc_assert (type != NULL_TREE);
> > +
> > +  switch (TREE_CODE (type))
> > +    {
> > +    case VOID_TYPE:
> > +      append_char ('v', out_str, hash_state);
> > +      return;
> > +
> > +    case BOOLEAN_TYPE:
> > +      append_char ('b', out_str, hash_state);
> > +      return;
> > +
> > +    case INTEGER_TYPE:
> > +      if (type == char_type_node)
> > + append_char ('c', out_str, hash_state);
> > +      else if (type == signed_char_type_node)
> > + append_char ('a', out_str, hash_state);
> > +      else if (type == unsigned_char_type_node)
> > + append_char ('h', out_str, hash_state);
> > +      else if (type == short_integer_type_node)
> > + append_char ('s', out_str, hash_state);
> > +      else if (type == short_unsigned_type_node)
> > + append_char ('t', out_str, hash_state);
> > +      else if (type == integer_type_node)
> > + append_char ('i', out_str, hash_state);
> > +      else if (type == unsigned_type_node)
> > + append_char ('j', out_str, hash_state);
> > +      else if (type == long_integer_type_node)
> > + append_char ('l', out_str, hash_state);
> > +      else if (type == long_unsigned_type_node)
> > + append_char ('m', out_str, hash_state);
> > +      else if (type == long_long_integer_type_node)
> > + append_char ('x', out_str, hash_state);
> > +      else if (type == long_long_unsigned_type_node)
> > + append_char ('y', out_str, hash_state);
> > +      else
> > + {
> > +  /* Fallback for other integer types - use precision-based
> > +     encoding.  */
> > +  append_char ('i', out_str, hash_state);
> > +  append_string (std::to_string (TYPE_PRECISION (type)).c_str (),
> > + out_str, hash_state);
> > + }
> > +      return;
> > +
> > +    case REAL_TYPE:
> > +      if (type == float_type_node)
> > + append_char ('f', out_str, hash_state);
> > +      else if (type == double_type_node)
> > + append_char ('d', out_str, hash_state);
> > +      else if (type == long_double_type_node)
> > + append_char ('e', out_str, hash_state);
> > +      else
> > + {
> > +  /* Fallback for other real types.  */
> > +  append_char ('f', out_str, hash_state);
> > +  append_string (std::to_string (TYPE_PRECISION (type)).c_str (),
> > + out_str, hash_state);
> > + }
> > +      return;
> > +
> > +    case VECTOR_TYPE:
> > +      {
> > + /* Handle vector types:
> > +   Dv<num-elements>_<element-type-encoding>
> > +   Example: uint8x16_t -> Dv16_h (vector of 16 unsigned char)  */
> > + tree vector_size = TYPE_SIZE_UNIT (type);
> > + tree element_type = TREE_TYPE (type);
> > + tree element_size = TYPE_SIZE_UNIT (element_type);
> > +
> > + if (vector_size && element_size
> > +    && TREE_CODE (vector_size) == INTEGER_CST
> > +    && TREE_CODE (element_size) == INTEGER_CST)
> > +  {
> > +    append_char ('D', out_str, hash_state);
> > +    append_char ('v', out_str, hash_state);
> > +
> > +    unsigned HOST_WIDE_INT vec_bytes = tree_to_uhwi (vector_size);
> > +    unsigned HOST_WIDE_INT elem_bytes = tree_to_uhwi (element_size);
> > +    unsigned HOST_WIDE_INT num_elements = vec_bytes / elem_bytes;
> > +
> > +    /* Append number of elements.  */
> > +    append_string (std::to_string (num_elements).c_str (),
> > +   out_str, hash_state);
> > +    append_char ('_', out_str, hash_state);
> > +
> > +    /* Recursively mangle the element type.  */
> > +    mangle_type (element_type, out_str, hash_state);
> > +    return;
> > +  }
> > + /* Fail for vectors with unknown size.  */
> > +      }
> > +      break;
> > +
> > +    default:
> > +      break;
> > +    }
> > +
> > +  /* Unknown builtin type: this should never happen in well-formed C.  */
> > +  debug_tree (type);
> > +  internal_error ("mangle: Unknown builtin type - please report this as a bug");
> > +}
> > +
> > +/* Canonicalize typedef types to their underlying named struct/union types.  */
> > +
> > +static tree
> > +canonicalize_typedef_type (tree type)
> > +{
> > +  /* Handle typedef types: canonicalize to named structs when possible.  */
> > +  if (TYPE_NAME (type) && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL)
> > +    {
> > +      tree type_decl = TYPE_NAME (type);
> > +
> > +      /* Check if this is a typedef (not the original struct declaration) */
> > +      if (DECL_ORIGINAL_TYPE (type_decl))
> > + {
> > +  tree original_type = DECL_ORIGINAL_TYPE (type_decl);
> > +
> > +  /* Handle struct/union/enum types.  */
> > +  if (TREE_CODE (original_type) == RECORD_TYPE
> > +      || TREE_CODE (original_type) == UNION_TYPE
> > +      || TREE_CODE (original_type) == ENUMERAL_TYPE)
> > +    {
> > +      /* Preserve typedef of anonymous struct/union/enum types.  */
> > +      if (!TYPE_NAME (original_type))
> > + return type;
> > +
> > +      /* Named compound type: canonicalize to it.  */
> > +      return canonicalize_typedef_type (original_type);
> > +    }
> > +
> > +  /* For basic type typedefs (e.g., u8 -> unsigned char),
> > +     canonicalize to original type.  */
> > +  if (TREE_CODE (original_type) == INTEGER_TYPE
> > +      || TREE_CODE (original_type) == REAL_TYPE
> > +      || TREE_CODE (original_type) == POINTER_TYPE
> > +      || TREE_CODE (original_type) == ARRAY_TYPE
> > +      || TREE_CODE (original_type) == FUNCTION_TYPE
> > +      || TREE_CODE (original_type) == METHOD_TYPE
> > +      || TREE_CODE (original_type) == BOOLEAN_TYPE
> > +      || TREE_CODE (original_type) == COMPLEX_TYPE
> > +      || TREE_CODE (original_type) == VECTOR_TYPE)
> > +    {
> > +      /* Recursively canonicalize in case the original type is
> > + also a typedef.  */
> > +      return canonicalize_typedef_type (original_type);
> > +    }
> > + }
> > +    }
> > +
> > +  return type;
> > +}
> > +
> > +/* Recursively mangle a C type following Itanium C++ ABI.  */
> > +
> > +static void
> > +mangle_type (tree type, std::string *out_str, uint32_t *hash_state)
> > +{
> > +  gcc_assert (type != NULL_TREE);
> > +
> > +  /* Canonicalize typedef types to their underlying named struct types.  */
> > +  type = canonicalize_typedef_type (type);
> > +
> > +  /* Save original qualified type for cases where we need typedef
> > +     information.  */
> > +  tree qualified_type = type;
> > +
> > +  /* Centralized qualifier handling: emit qualifiers for this type,
> > +     then continue with unqualified version.  */
> > +  if (TYPE_QUALS (type) != TYPE_UNQUALIFIED)
> > +    {
> > +      /* Emit qualifiers in Itanium ABI order: restrict, volatile, const.  */
> > +      if (TYPE_QUALS (type) & TYPE_QUAL_RESTRICT)
> > + append_char ('r', out_str, hash_state);
> > +      if (TYPE_QUALS (type) & TYPE_QUAL_VOLATILE)
> > + append_char ('V', out_str, hash_state);
> > +      if (TYPE_QUALS (type) & TYPE_QUAL_CONST)
> > + append_char ('K', out_str, hash_state);
> > +
> > +      /* Get unqualified version for further processing.  */
> > +      type = TYPE_MAIN_VARIANT (type);
> > +    }
> > +
> > +  switch (TREE_CODE (type))
> > +    {
> > +    case POINTER_TYPE:
> > +      {
> > + /* Pointer type: 'P' + pointed-to type.  */
> > + append_char ('P', out_str, hash_state);
> > +
> > + /* Recursively mangle the pointed-to type.  */
> > + tree pointed_to_type = TREE_TYPE (type);
> > + mangle_type (pointed_to_type, out_str, hash_state);
> > + break;
> > +      }
> > +
> > +    case ARRAY_TYPE:
> > +      /* Array type: 'A' + size + '_' + element type (simplified).  */
> > +      append_char ('A', out_str, hash_state);
> > +      if (TYPE_DOMAIN (type) && TYPE_MAX_VALUE (TYPE_DOMAIN (type)))
> > + {
> > +  tree max_val = TYPE_MAX_VALUE (TYPE_DOMAIN (type));
> > +  /* Check if array size is compile-time constant to handle VLAs. */
> > +  if (TREE_CODE (max_val) == INTEGER_CST && tree_fits_shwi_p (max_val))
> > +    {
> > +      HOST_WIDE_INT size = tree_to_shwi (max_val) + 1;
> > +      append_string (std::to_string ((long) size).c_str (),
> > +     out_str, hash_state);
> > +    }
> > +  /* For VLAs or non-constant dimensions, emit empty size (A_).  */
> > +  append_char ('_', out_str, hash_state);
> > + }
> > +      else
> > + {
> > +  /* No domain or no max value: emit A_.  */
> > +  append_char ('_', out_str, hash_state);
> > + }
> > +      mangle_type (TREE_TYPE (type), out_str, hash_state);
> > +      break;
> > +
> > +    case REFERENCE_TYPE:
> > +      /* Reference type: 'R' + referenced type.
> > + Note: We must handle references to builtin types including compiler
> > + builtins like __builtin_va_list used in functions like va_start.  */
> > +      append_char ('R', out_str, hash_state);
> > +      mangle_type (TREE_TYPE (type), out_str, hash_state);
> > +      break;
> > +
> > +    case FUNCTION_TYPE:
> > +      {
> > + /* Function type: 'F' + return type + parameter types + 'E' */
> > + append_char ('F', out_str, hash_state);
> > + mangle_type (TREE_TYPE (type), out_str, hash_state);
> > +
> > + /* Add parameter types.  */
> > + tree param_types = TYPE_ARG_TYPES (type);
> > +
> > + if (param_types == NULL_TREE)
> > +  {
> > +    /* func () - no parameter list (could be variadic). */
> > +  }
> > + else
> > +  {
> > +    bool found_real_params = false;
> > +    for (tree param = param_types; param; param = TREE_CHAIN (param))
> > +      {
> > + tree param_type = TREE_VALUE (param);
> > + if (param_type == void_type_node)
> > +  {
> > +    /* Check if this is the first parameter (explicit void) or a
> > +       sentinel.  */
> > +    if (!found_real_params)
> > +      {
> > + /* func (void) - explicit empty parameter list.
> > +   Mangle void to distinguish from variadic func (). */
> > + mangle_type (void_type_node, out_str, hash_state);
> > +      }
> > +    /* If we found real params before this void, it's a sentinel
> > +       so stop here.  */
> > +    break;
> > +  }
> > +
> > + found_real_params = true;
> > +
> > + /* For value parameters, ignore const/volatile qualifiers as
> > +   they don't affect the calling convention.  "const int" and
> > +   "int" are passed identically by value.  */
> > + tree canonical_param_type = param_type;
> > +
> > + if (TREE_CODE (param_type) != POINTER_TYPE
> > +    && TREE_CODE (param_type) != REFERENCE_TYPE
> > +    && TREE_CODE (param_type) != ARRAY_TYPE)
> > +  {
> > +    /* For non-pointer/reference value parameters, strip
> > +       qualifiers by default.  */
> > +    canonical_param_type = TYPE_MAIN_VARIANT (param_type);
> > +
> > +    /* Exception: preserve typedef information for anonymous
> > +       compound types.  */
> > +    if (TYPE_NAME (param_type)
> > + && TREE_CODE (TYPE_NAME (param_type)) == TYPE_DECL
> > + && DECL_ORIGINAL_TYPE (TYPE_NAME (param_type)))
> > +      {
> > + tree original_type
> > +  = DECL_ORIGINAL_TYPE (TYPE_NAME (param_type));
> > + if ((TREE_CODE (original_type) == RECORD_TYPE
> > +     || TREE_CODE (original_type) == UNION_TYPE
> > +     || TREE_CODE (original_type) == ENUMERAL_TYPE)
> > +    && !TYPE_NAME (original_type))
> > +  {
> > +    /* Preserve typedef of an anonymous
> > +       struct/union/enum.  */
> > +    canonical_param_type = param_type;
> > +  }
> > +      }
> > +  }
> > +
> > + mangle_type (canonical_param_type, out_str, hash_state);
> > +      }
> > +  }
> > +
> > + /* Check if this is a variadic function and add 'z' marker.  */
> > + if (stdarg_p (type))
> > +  {
> > +    append_char ('z', out_str, hash_state);
> > +  }
> > +
> > + append_char ('E', out_str, hash_state);
> > + break;
> > +      }
> > +
> > +    case RECORD_TYPE:
> > +    case UNION_TYPE:
> > +    case ENUMERAL_TYPE:
> > +      {
> > + /* Struct/union/enum: use simplified representation for C types.  */
> > + const char *name = NULL;
> > +
> > + /* For compound types, use the original qualified type to preserve
> > +   typedef info.  */
> > + if (TYPE_QUALS (qualified_type) != TYPE_UNQUALIFIED)
> > +  {
> > +    type = qualified_type;
> > +  }
> > +
> > + if (TYPE_NAME (type))
> > +  {
> > +    if (TREE_CODE (TYPE_NAME (type)) == TYPE_DECL)
> > +      {
> > + /* TYPE_DECL case: both named structs and typedef structs.  */
> > + tree decl_name = DECL_NAME (TYPE_NAME (type));
> > + if (decl_name && TREE_CODE (decl_name) == IDENTIFIER_NODE)
> > +  {
> > +    name = IDENTIFIER_POINTER (decl_name);
> > +  }
> > +      }
> > +    else if (TREE_CODE (TYPE_NAME (type)) == IDENTIFIER_NODE)
> > +      {
> > + /* Direct identifier case.  */
> > + name = IDENTIFIER_POINTER (TYPE_NAME (type));
> > +      }
> > +  }
> > +
> > + if (name)
> > +  {
> > +    append_string (std::to_string (strlen (name)).c_str (),
> > +   out_str, hash_state);
> > +    append_string (name, out_str, hash_state);
> > +    break;
> > +  }
> > +
> > + /* If no name found, use anonymous type format: <length>$_<counter>.  */
> > + static unsigned anon_counter = 0;
> > + std::string anon_name = "$_" + std::to_string (anon_counter++);
> > +
> > + append_string (std::to_string (anon_name.length ()).c_str (),
> > +       out_str, hash_state);
> > + append_string (anon_name.c_str (), out_str, hash_state);
> > + break;
> > +      }
> > +
> > +    default:
> > +      /* Handle builtin types.  */
> > +      mangle_builtin_type (type, out_str, hash_state);
> > +      break;
> > +    }
> > +}
> > +
> > +/* Get the typeinfo mangled name string for any C type.
> > +   Returns the mangled type string following Itanium C++ ABI conventions.  */
> > +
> > +std::string
> > +typeinfo_get_name (tree type)
> > +{
> > +  gcc_assert (type != NULL_TREE);
> > +  std::string result = "_ZTS";
> > +
> > +  mangle_type (type, &result, nullptr);
> > +  return result;
> > +}
> > +
> > +/* Get the typeinfo hash for any C type.
> > +   Returns the FNV-1a hash of the mangled type string.  */
> > +
> > +uint32_t
> > +typeinfo_get_hash (tree type)
> > +{
> > +  gcc_assert (type != NULL_TREE);
> > +  uint32_t hash_state = 2166136261U; /* FNV-1a 32-bit offset basis.  */
> > +
> > +  /* Include _ZTS prefix in hash calculation.  */
> > +  append_string ("_ZTS", nullptr, &hash_state);
> > +
> > +  mangle_type (type, nullptr, &hash_state);
> > +  return hash_state;
> > +}
> > -- 
> > 2.34.1
> > 
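For readers who want to cross-check hash values outside the compiler, the
hashing step described in the patch (32-bit FNV-1a over the mangled string,
with the "_ZTS" prefix included) can be sketched in standalone C. This is a
minimal sketch, not code from the patch: fnv1a32_update and mangled_name_hash
are hypothetical names, and it assumes the mangled string is already known
(for example, as reported by __builtin_typeinfo_name).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FNV1A32_OFFSET_BASIS 2166136261u /* Same basis as typeinfo_get_hash.  */
#define FNV1A32_PRIME 16777619u          /* Same prime as fnv1a_hash_char.  */

/* Fold the bytes of the NUL-terminated string STR into the running
   FNV-1a state STATE and return the updated state.  */
uint32_t
fnv1a32_update (uint32_t state, const char *str)
{
  assert (str != NULL);
  for (const unsigned char *p = (const unsigned char *) str; *p; p++)
    {
      state ^= *p;            /* XOR the byte in first...  */
      state *= FNV1A32_PRIME; /* ...then multiply (wraps modulo 2^32).  */
    }
  return state;
}

/* Hash a complete mangled name such as "_ZTSPFviE".  */
uint32_t
mangled_name_hash (const char *mangled)
{
  return fnv1a32_update (FNV1A32_OFFSET_BASIS, mangled);
}
```

Because the state is threaded through, hashing "_ZTS" and then the rest of
the encoding gives the same result as hashing the whole string at once, which
mirrors how typeinfo_get_hash seeds the state with the prefix before calling
mangle_type.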

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
  2025-09-17 21:09     ` Kees Cook
@ 2025-09-18 16:59       ` Qing Zhao
  2025-09-18 18:20         ` Kees Cook
  2025-09-18 19:39       ` Kees Cook
  1 sibling, 1 reply; 28+ messages in thread
From: Qing Zhao @ 2025-09-18 16:59 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Pinski, Jakub Jelinek, Martin Uecker, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org



> On Sep 17, 2025, at 17:09, Kees Cook <kees@kernel.org> wrote:
> 
> On Wed, Sep 17, 2025 at 01:42:32PM +0000, Qing Zhao wrote:
>> This version of the middle-end change is much simpler and cleaner-:).
> 
> > Thanks! I think it's getting closer (hopefully). :)
> 
>>> On Sep 13, 2025, at 19:23, Kees Cook <kees@kernel.org> wrote:
>>> +- KCFI check-call instrumentation must survive tail call optimization.
>>> +  If an indirect call is turned into an indirect jump, KCFI checking
>>> +  must still happen (but it will use a jmp rather than a call).
>> 
> >> I didn’t see any code changes in this patch that address the above issue.
> >> Is the issue automatically resolved without special handling?
> 
> The logic for this is handled by the split RTL patterns on the backend. We
> end up with 4 RTL patterns for KCFI that match the regular 4 call
> patterns:
> 
> - call
> - call with return value
> - sibcall
> - sibcall with return value
> 
> In the RTL assembly output the "is this a sibcall?" test is made to
> choose between emitting a "call" or a "jump" insn.

Oh, okay, I see. 

> 
>>> +- Functions that may be called indirectly have a preamble added,
>>> +  __cfi_$original_func_name, which contains the $typeid value:
>>> +
>>> +    __cfi_target_func:
>>> +      .word $typeid
>>> +    target_func:
>>> +       [regular function entry...]
>>> +
>>> +- The preamble needs to interact with patchable function entry so that
>>> +  the typeid appears further away from the actual start of the function
>>> +  (leaving the prefix NOPs of the patchable function entry unchanged).
>>> +  This means only _globally defined_ patchable function entry is supported
> >>> +  with KCFI (indirect call sites must know in advance what the offset is,
>>> +  which may not be possible with extern functions that use a function
>>> +  attribute to change their patchable function entry characteristics).
>>> +  For example, a "4,4" patchable function entry would end up like:
>>> +
>>> +    __cfi_target_func:
> >>> +      .word $typeid
>>> +      nop nop nop nop
>>> +    target_func:
>>> +       [regular function entry...]
>>> +
>>> +  Architectures may need to add alignment nops prior to the typeid to keep
>>> +  __cfi_target_func aligned for function call conventions.
>> 
> >> I am still a little confused by the above. Are there two kinds of “nops” that
> >> need to be computed and added: one for patchable function entry, and another
> >> for architecture-specific alignment?
> >> If so, you might need to clarify this in the text above.
> 
> Yes, this is a confusing bit of logic that needs more clarity. I'll
> improve this. Here's what happens:
> 
> Normal function has no preamble:
> 
> func:
> body...
> 
> > With KCFI, a preamble is created to hold the typeid to be checked from
> > call sites (addressed as -4 from "func"):
> 
> __cfi_func:
> .word typeid_value
> func:
> body...
> 
> A "patchable function entry" function has both "prefix" and "entry" nops
> added:
> 
> __pfe_func:
> nop // "prefix" nops
> nop
> func:
> nop // "entry" nops
> nop
> nop
> body...
> 
> Confusingly, the argument specifies total (and optionally prefix):
> -fpatchable-function-entry=TOTAL[,PREFIX]
> > So the above example is -fpatchable-function-entry=5,2 (5 total NOPs,
> > with 2 of them being prefix NOPs).
> 
> For KCFI, callsites need to address the typeid, so a normal KCFI
> callsite would use:
> 
> load %tmp, -4(%target)
> 
> but when PFE is active, the typeid must be placed before the prefix NOPs
> since PFE requires that the entire space is NOPs. Therefore the prefix
> NOPs need to be included (and measured in _bytes_, not instructions)
> when loading the typeid:
> 
> load %tmp, -12(%target)
> // 2 nops (8 bytes on aarch64) and 4 bytes for typeid == -12
> 
> Which corresponds to the resulting function preamble layout:
> 
> __cfi_func:
> .word typeid_value
> __pfe_func:
> nop // "prefix" nops
> nop
> func:
> nop // "entry" nops
> nop
> nop
> body...

Okay, the above is clear. 

The global:
/* For callsite typeid loading offset.  */
+HOST_WIDE_INT kcfi_patchable_entry_prefix_nops = 0;

is for the above “prefix” nops.  And this “prefix_nops” will impact the call site loading offset. 
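The offset arithmetic described above can be written out as a small
standalone sketch. This is not the patch's code: pfe_entry_nops and
kcfi_typeid_offset are hypothetical helper names, and the sketch assumes a
4-byte typeid and a fixed NOP size in bytes (4 on aarch64, as in the example
in the quoted text).

```c
#include <assert.h>

/* Size of the KCFI typeid itself, in bytes.  */
#define KCFI_TYPEID_BYTES 4

/* Number of NOPs left at the function entry for
   -fpatchable-function-entry=TOTAL,PREFIX.  */
long
pfe_entry_nops (long total, long prefix)
{
  assert (prefix >= 0 && total >= prefix);
  return total - prefix;
}

/* Byte offset from the function entry to the typeid.  The typeid sits
   before the prefix NOPs, so the prefix is counted in bytes, not
   instructions.  */
long
kcfi_typeid_offset (long prefix_nops, long nop_bytes)
{
  assert (prefix_nops >= 0 && nop_bytes > 0);
  return -(prefix_nops * nop_bytes + KCFI_TYPEID_BYTES);
}
```

With no prefix NOPs this gives the plain -4 load, and with the 5,2 example
on aarch64 it gives -12 (2 NOPs of 4 bytes each plus the 4-byte typeid),
leaving 3 entry NOPs after "func".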


> 
> Now, an _additional_ requirement for x86 is that __cfi_func be function
> entry aligned, so that Linux can, if it chooses, live-patch the entire
> KCFI and PFE prefix area into a callable target (this is the "FineIBT"
> KCFI alternative). So, when -falign-functions=N is set, given x86's 1
> byte NOPs and the "movl" encoding used for holding the KCFI type id, the
> final layout, given -falign-functions=8 -fpatchable-function-entry=4,1
> would be:
> 
> __cfi_func:
> nop // "alignment" nops // 2 bytes total
> nop
> .word typeid_value // 5 bytes total
> __pfe_func:
> nop // "prefix" nops // 1 byte total
> func:
> nop // "entry" nops
> nop
> nop
> body...
> 
> > 4 total PFE bytes with 1 as prefix (leaving 3 at the func entry). And
> > to align __cfi_func to 8 bytes, we have a 5-byte typeid insn and a 1-byte
> > "prefix" nop, so we need 2 more bytes to be the "alignment" nops.

Okay, I see. 

+/* For preamble alignment.  */
+static HOST_WIDE_INT kcfi_patchable_entry_arch_alignment_nops = 0;

is for these “alignment” nops. These “alignment_nops” will NOT impact the call site loading offset;
they only impact the position of the “__cfi_func” symbol. 

The above examples explain the whole picture very well.
It might be a good idea to include them as comments of the routine “kcfi_prepare_alignment_nops”. 
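As a worked check of the alignment example, the padding can be computed as
below. This is a sketch under stated assumptions, not the logic of
kcfi_prepare_alignment_nops itself: kcfi_alignment_nop_bytes is a
hypothetical name, and the sketch assumes 1-byte NOPs (as on x86), so bytes
and NOP counts coincide.

```c
#include <assert.h>

/* Bytes of padding needed before the typeid insn so that the typeid
   insn plus the prefix NOPs end exactly on an ALIGN-byte boundary,
   which keeps __cfi_func aligned whenever func itself is aligned.  */
long
kcfi_alignment_nop_bytes (long align, long typeid_insn_bytes,
                          long prefix_nop_bytes)
{
  assert (align > 0 && typeid_insn_bytes >= 0 && prefix_nop_bytes >= 0);
  long used = (typeid_insn_bytes + prefix_nop_bytes) % align;
  return used == 0 ? 0 : align - used;
}
```

For -falign-functions=8 -fpatchable-function-entry=4,1 with a 5-byte typeid
insn this yields 2, matching the layout in the quoted text. For the FineIBT
configuration (-falign-functions=16 -fpatchable-function-entry=11,11) it
yields 0, since 5 + 11 already equals 16.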

> 
> 
> This layout was not obvious initially for x86 because Linux's FineIBT
> implementation uses -falign-functions=16 -fpatchable-function-entry=11,11
> so the alignment nops are pre-calculated.
> 
>> 
>>> +
>>> +- External functions that are address-taken have a weak __kcfi_typeid_$func
>>> +  symbol added with the typeid value available so that the typeid can be
>>> +  referenced from assembly linkages, etc, where the typeid values cannot be
> >>> +  calculated (i.e., where C type information is missing):
>>> +
>>> +    .weak   __kcfi_typeid_$func
>>> +    .set    __kcfi_typeid_$func, $typeid
>>> +
>> 
> >> From my previous understanding, the above weak symbol is emitted for external functions
> >> that are address-taken AND do not have a definition in the compilation. So the weak symbol
> >> is emitted at the declaration site of the external function. Is this true?
>> 
>> If so, could you please clarify this in the above?
> 
> Yes, this happens via assemble_external_real, which can be called under
> a few conditions in gcc/varasm.cc.

Okay. Please clarify this in the design doc. 
> 
>>> +- Keep indirect calls from being merged (see earlier example) by
>>> +  checking the KCFI insn's typeid for equality.
>> 
>> Is this resolved by the following code:
>> 
>> rtlanal.cc
>> index 63a1d08c46cf..5016fe93ccac 100644
>> --- a/gcc/rtlanal.cc
>> +++ b/gcc/rtlanal.cc
>> @@ -1177,6 +1177,11 @@ reg_referenced_p (const_rtx x, const_rtx body)
>>    case IF_THEN_ELSE:
>>      return reg_overlap_mentioned_p (x, body);
>> 
>> +    case KCFI:
>> +      /* For KCFI wrapper, check both the wrapped call and the type ID.  */
>> +      return (reg_overlap_mentioned_p (x, XEXP (body, 0))
>> +      || reg_overlap_mentioned_p (x, XEXP (body, 1)));
>> +
> 
> The above is needed for accurate register "liveness" checking. When the
> above code is removed, the kcfi-move-preservation.c regression test
> fails (since it doesn't see the clobbers).
Okay.  I see. 
> 
> AFAICT, simply making it a new type of RTL (the DEF_RTL_EXPR), made it
> unmergeable.

Then is it possible that some legitimate merging might no longer work with this change? 

> I assume this is because whatever was doing the call
> merging was looking strictly for "CALL" types, but I honestly don't know
> where that was happening.
> 
>>> +/* Common helper for RTL patterns to emit .kcfi_traps section entry.  */
>> 
>> I noticed that the function comments don't explain each parameter.
>> This needs to be updated for all the new functions.
> 
> For externs like these, should the parameter documentation go in the .h
> file, or the .cc file?

My understanding is that the parameter documentation goes in the .cc file (I just double-checked some GCC files to confirm this) -:)
> 
>>> +void
>>> +kcfi_emit_traps_section (FILE *file, rtx trap_label_sym)
>>> +{
>>> +  /* Generate entry label internally and get its number.  */
>>> +  rtx entry_label = gen_label_rtx ();
>>> +  int entry_labelno = CODE_LABEL_NUMBER (entry_label);
>> 
>> Is the only use of the new RTX "entry_label" to generate a label number?
>> If so, the entry_label is not needed at all.  You can get a distinct labelno for each
>> Lkcfi_entry from, for example, the function id for the current function.
> 
> It is, yes. I can't use the function id because it's only incremented per
> function and a given function may have multiple kcfi call sites within
> it.

Okay.  I see. 

So, you need a unique labelno for each Lkcfi_entryN? Any other requirements?

> I did have a version of this logic that used a kcfi-specific global
> counter but (at the time) I was having trouble with it

What kind of issue? 

> and had seen that
> other "custom label" examples in the code base used this style, so I
> switched to that.

My concern is that the newly generated RTX "entry_label" is not used at all. Will there be any memory leak from this?


> 
> I have since figured out why the global counter wasn't working (I was using
> it during expansion and not during insn output, so I had cases where a
> call was getting duplicated and I had a repeated label). If it's
> preferred, I could try switching back to the global counter to avoid
> these "useless" gen_label_rtx calls?

Yes, the global counter approach is better.

> 
>>> +static uint32_t
>>> +kcfi_get_type_id (tree fn_type)
>>> +{
>>> +  uint32_t type_id;
>>> +
>>> +  /* Cache the attribute identifier.  */
>>> +  if (!kcfi_type_id_attr)
>>> +    kcfi_type_id_attr = get_identifier ("kcfi_type_id");
>>> +
>>> +  tree attr = lookup_attribute (IDENTIFIER_POINTER (kcfi_type_id_attr),
>>> + TYPE_ATTRIBUTES (fn_type));
>> 
>> The above can be simplified as:
>> +  tree attr = lookup_attribute (“kcfi_type_id”, TYPE_ATTRIBUTES (fn_type));
> 
> Ugh, I totally misunderstood the examples I saw of this. I thought they
> were caching the string lookup, but now that I look more closely, I see:
> 
> #define IDENTIFIER_POINTER(NODE) \
>  ((const char *) IDENTIFIER_NODE_CHECK (NODE)->identifier.id.str)
> 
> it's just returning the string!
> 
> I will throw away the "caching" I was doing. I thought it would actually
> look up the attribute using the tree returned by get_identifier, but I
> see there is no overloaded lookup_attribute that takes a tree argument.
> 
> *face palm*

-:)

> 
>>> +/* Emit KCFI type ID symbol for an address-taken external function.  */
>> 
>> Is it more accurate to say:
>> 
>> Emit KCFI type ID symbol for the declaration of an address-taken external function FNDECL
>> to the assembly file ASM_FILE.
>> 
>> ??
> 
> Yup, I will update it.
> 
>>> +  /* Process all functions - both local and external.  */
>>> +  FOR_EACH_FUNCTION (node)
>>> +    {
>>> +      tree fndecl = node->decl;
>>> +
>>> +      /* Skip all non-NORMAL builtins (MD, FRONTEND) entirely.
>>> + For NORMAL builtins, skip those that lack an implicit
>>> + implementation (closest way to distinguishing DEF_LIB_BUILTIN
>>> + from others).  E.g. we need to have typeids for memset().  */
>> 
>> I see indentation issue in the above comments.
> 
> This looks like your email client again. It passes
> contrib/check_GNU_style.py:
> 
>  FOR_EACH_FUNCTION (node)$
>    {$
>      tree fndecl = node->decl;$
> $
>      /* Skip all non-NORMAL builtins (MD, FRONTEND) entirely.$
> ^I For NORMAL builtins, skip those that lack an implicit$
> ^I implementation (closest way to distinguishing DEF_LIB_BUILTIN$
> ^I from others).  E.g. we need to have typeids for memset().  */$
> 
> Or is there something special I need to be doing differently for
> comments?

Yeah, I guess it's an issue with my mail client. Sorry about that.
> 
>> 
>>> +      if (fndecl_built_in_p (fndecl))
>>> + {
>>> +  if (DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
>>> +    continue;
>>> +  if (!builtin_decl_implicit_p (DECL_FUNCTION_CODE (fndecl)))
>>> +    continue;
>>> + }
>> 
>> Also see indentation issue in the above.
> 
>      if (fndecl_built_in_p (fndecl))$
> ^I{$
> ^I  if (DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)$
> ^I    continue;$
> ^I  if (!builtin_decl_implicit_p (DECL_FUNCTION_CODE (fndecl)))$
> ^I    continue;$
> ^I}$
> 
> Looks like the same thing?

Yeah. 
> 
> 
> Thanks for the review! I'll have v4 ready soon.

Thanks.

Qing
> 
> -Kees
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/7] typeinfo: Introduce KCFI typeinfo mangling API
  2025-09-18  7:20     ` Martin Uecker
@ 2025-09-18 18:09       ` Kees Cook
  2025-09-18 18:40         ` Martin Uecker
  0 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2025-09-18 18:09 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Qing Zhao, Andrew Pinski, Jakub Jelinek, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org

On Thu, Sep 18, 2025 at 09:20:52AM +0200, Martin Uecker wrote:
> Am Mittwoch, dem 17.09.2025 um 17:56 +0000 schrieb Qing Zhao:
> > Hi, 
> > 
> > > On Sep 13, 2025, at 19:23, Kees Cook <kees@kernel.org> wrote:
> > > 
> > > To support the KCFI typeid and future type-based allocators,
> 
> What I find problematic though is that this is not based on GNU / ISO C
> rules but on stricter Linux kernel rules.   I think such builtin should
> have two versions.  
> 
> So maybe
> 
> __builtin_typeinfo_hash_strict // strict
> __builtin_typeinfo_hash_canonical // standard
> 
> or similar, or maybe instead have a flag argument so that we can
> add other options which may turn out to be important in the future
> (such as ignoring qualifiers or supporting newer language features).

Can you send me a patch to gcc/testsuite/gcc.dg/builtin-typeinfo.c
that shows what differences you mean? Because AFAICT, this C version
matches the C++ typeinfo implementation. There isn't a need for these
hashes to be comparable in a way that they could be used to, for
example, reimplement __builtin_types_compatible_p. It's called
"typeinfo" and that has a specific meaning currently...

Given:

    typedef int arr10[10];
    typedef int arr_unknown[];
    typedef int *arr;
    typedef struct named { int a; int b; } named_t;
    typedef struct { int a; int b; } nameless_t;
    typedef void (*func_arr10)(int[10]);
    typedef void (*func_arr_unknown)(int[]);
    typedef void (*func_ptr)(int*);
    typedef void (*func_named)(named_t*);
    typedef void (*func_nameless)(nameless_t*);

C++ typeid(...).name() shows:

  int[10]:		A10_i
  int[]:		A_i
  int *:		Pi
  named_t:		5named
  nameless_t:		10nameless_t
  void(*)(int[10]):	PFvPiE
  void(*)(int[]):	PFvPiE
  void(*)(int*):	PFvPiE
  void(*)(named_t*):	PFvP5namedE
  void(*)(nameless_t*):	PFvP10nameless_tE

This __builtin_typeinfo_name(...) shows:

  int[10]:		A10_i
  int[]:		A_i
  int *:		Pi
  __builtin_types_compatible_p(int[10], int[]): true
  __builtin_types_compatible_p(int[], int*):	false
  named_t:		5named
  nameless_t:		10nameless_t
  void(*)(int[10]):	PFvPiE
  void(*)(int[]):	PFvPiE
  void(*)(int*):	PFvPiE
  void(*)(named_t*):	PFvP5namedE
  void(*)(nameless_t*):	PFvP10nameless_tE

What would you want the "Strict ISO C" builtin to do instead?
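
For reference, the two compatibility results in the listing above can be reproduced with the existing GCC/Clang builtin __builtin_types_compatible_p. This is only a minimal sketch; the helper function names are made up for illustration and are not part of the series:

```c
#include <assert.h>
#include <stdbool.h>

/* int[10] and int[] are compatible types in ISO C: an array of unknown
   bound is compatible with a sized array of the same element type.  */
static bool
arr10_compatible_with_unknown (void)
{
  return __builtin_types_compatible_p (int[10], int[]);
}

/* An array type is not compatible with a pointer type.  */
static bool
unknown_compatible_with_ptr (void)
{
  return __builtin_types_compatible_p (int[], int *);
}

/* Array parameters are adjusted to pointers, so these two function
   pointer types are the same type (both mangle as PFvPiE above).  */
static bool
func_arr10_compatible_with_func_ptr (void)
{
  return __builtin_types_compatible_p (void (*) (int[10]), void (*) (int *));
}

/* Check the results against the listing.  */
static void
check_listing (void)
{
  assert (arr10_compatible_with_unknown ());
  assert (!unknown_compatible_with_ptr ());
  assert (func_arr10_compatible_with_func_ptr ());
}
```

Note that int[10] and int[] are compatible here even though their mangled names (A10_i and A_i) differ, which may be the kind of difference being asked about.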

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
  2025-09-18 16:59       ` Qing Zhao
@ 2025-09-18 18:20         ` Kees Cook
  2025-09-18 18:48           ` Qing Zhao
  0 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2025-09-18 18:20 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Andrew Pinski, Jakub Jelinek, Martin Uecker, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org

On Thu, Sep 18, 2025 at 04:59:56PM +0000, Qing Zhao wrote:
> 
> 
> > On Sep 17, 2025, at 17:09, Kees Cook <kees@kernel.org> wrote:
> > 
> > On Wed, Sep 17, 2025 at 01:42:32PM +0000, Qing Zhao wrote:
> >> This version of the middle-end change is much simpler and cleaner-:).
> > 
> > Thanks! I think it's getting closer (hopefully). :)
> > 
> >>> On Sep 13, 2025, at 19:23, Kees Cook <kees@kernel.org> wrote:
> [...]
> The above examples explain the whole picture very well.
> It might be a good idea to include them as comments of the routine “kcfi_prepare_alignment_nops”. 

I've expanded this much more now for the future v4.

> >>> +- External functions that are address-taken have a weak __kcfi_typeid_$func
> >>> +  symbol added with the typeid value available so that the typeid can be
> >>> +  referenced from assembly linkages, etc, where the typeid values cannot be
> >>> +  calculated (i.e where C type information is missing):
> >>> +
> >>> +    .weak   __kcfi_typeid_$func
> >>> +    .set    __kcfi_typeid_$func, $typeid
> >>> +
> >> 
> >> From my previous understanding, the above weak symbol is emitted for external functions
> >> that are address-taken AND does not have a definition in the compilation. So the weak symbols
> >> Is emitted at the declaration site of the external function, is this true?
> >> 
> >> If so, could you please clarify this in the above?
> > 
> > Yes, this happens via assemble_external_real, which can be called under
> > a few conditions in gcc/varasm.cc.
> 
> Okay. Please clarify this in the design doc. 

I mention it later in the "behavioral" section:

- assemble_external_real calls kcfi_emit_typeid_symbol to add the
  __kcfi_typeid_$func symbols.

I had left off implementation details (i.e. "called from
assemble_external_real") in the "constraints" section. How would you
like this arranged?

> >>> +- Keep indirect calls from being merged (see earlier example) by
> >>> +  checking the KCFI insn's typeid for equality.
> >> 
> >> Is this resolved by the following code:
> >> 
> >> rtlanal.cc
> >> index 63a1d08c46cf..5016fe93ccac 100644
> >> --- a/gcc/rtlanal.cc
> >> +++ b/gcc/rtlanal.cc
> >> @@ -1177,6 +1177,11 @@ reg_referenced_p (const_rtx x, const_rtx body)
> >>    case IF_THEN_ELSE:
> >>      return reg_overlap_mentioned_p (x, body);
> >> 
> >> +    case KCFI:
> >> +      /* For KCFI wrapper, check both the wrapped call and the type ID.  */
> >> +      return (reg_overlap_mentioned_p (x, XEXP (body, 0))
> >> +      || reg_overlap_mentioned_p (x, XEXP (body, 1)));
> >> +
> > 
> > The above is needed for accurate register "liveness" checking. When the
> > above code is removed, the kcfi-move-preservation.c regression test
> > fails (since it doesn't see the clobbers).
> Okay.  I see. 
> > 
> > AFAICT, simply making it a new type of RTL (the DEF_RTL_EXPR), made it
> > unmergeable.
> 
> Then is it possible some legal merging might not work anymore with this change? 

Perhaps? I will see if I can construct a case where there should be a
"merged" call (when the typeid matches).

> 
> > I assume this is because whatever was doing the call
> > merging was looking strictly for "CALL" types, but I honestly don't know
> > where that was happening.
> > 
> >>> +/* Common helper for RTL patterns to emit .kcfi_traps section entry.  */
> >> 
> >> I noticed that you didn’t explain each parameter of the function in all the comments for the functions.
> >> This need to be updated for all the new functions.
> > 
> > For externs like these, should the parameter documentation go in the .h
> > file, or the .cc file?
> 
> My understanding is that the parameter documentation goes in the .cc file (I just double-checked some GCC files to confirm this) -:)

Okay, thanks! I will update these.

> > 
> >>> +void
> >>> +kcfi_emit_traps_section (FILE *file, rtx trap_label_sym)
> >>> +{
> >>> +  /* Generate entry label internally and get its number.  */
> >>> +  rtx entry_label = gen_label_rtx ();
> >>> +  int entry_labelno = CODE_LABEL_NUMBER (entry_label);
> >> 
> >> Is the only usage of the new RTX “entry_label” is to generate a label_number? 
> >> If so, the entry_label is not needed at all.  You can get a distinct labelno for each
> >> Lkcfi_entry, for example, the function id for the current function.
> > 
> > It is, yes. I can't use the function id because it's only incremented per
> > function and a given function may have multiple kcfi call sites within
> > it.
> 
> Okay.  I see. 
> 
> So, you need a unique labelno for each Lkcfi_entryN? Any other requirements?

Right, I need unique labels for each of trap, call, and entry. But they
are all associated together, so they could use a single counter.
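
A minimal sketch of the single-counter idea, assuming the counter is bumped once per call site at output time. Only the Lkcfi_entry prefix appears in this thread; the trap and call prefixes, the counter name, and the helper name are hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical file-scope counter, bumped once per KCFI call site when
   the insn is output (not at expansion time, where a call that is later
   duplicated would end up reusing the same number).  */
static unsigned int kcfi_labelno;

/* Format the three associated label names for one call site into the
   caller's buffers (each LEN bytes) and return the number they share.  */
static unsigned int
kcfi_next_labels (char *trap, char *call, char *entry, size_t len)
{
  unsigned int n = kcfi_labelno++;

  assert (trap && call && entry && len > 0);
  snprintf (trap, len, ".Lkcfi_trap%u", n);
  snprintf (call, len, ".Lkcfi_call%u", n);
  snprintf (entry, len, ".Lkcfi_entry%u", n);
  return n;
}
```

Because the three labels of one call site share a number, the trap table entry can refer to its trap and call labels without any extra bookkeeping.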

> > I did have a version of this logic that used a kcfi-specific global
> > counter but (at the time) I was having trouble with it
> 
> What kind of issue? 
> 
> > and had seen that
> > other "custom label" examples in the code base used this style, so I
> > switched to that.
> 
> My concern is that the newly generated RTX "entry_label" is not used at all. Will there be any memory leak from this?
> 
> 
> > 
> > I have since figured out why the global counter wasn't working (I was using
> > it during expansion and not during insn output, so I had cases where a
> > call was getting duplicated and I had a repeated label). If it's
> > preferred, I could try switching back to the global counter to avoid
> > these "useless" gen_label_rtx calls?
> 
> Yes, global counter approach is better. 

Okay, I will switch to that.

> 
> > 
> >>> +static uint32_t
> >>> +kcfi_get_type_id (tree fn_type)
> >>> +{
> >>> +  uint32_t type_id;
> >>> +
> >>> +  /* Cache the attribute identifier.  */
> >>> +  if (!kcfi_type_id_attr)
> >>> +    kcfi_type_id_attr = get_identifier ("kcfi_type_id");
> >>> +
> >>> +  tree attr = lookup_attribute (IDENTIFIER_POINTER (kcfi_type_id_attr),
> >>> + TYPE_ATTRIBUTES (fn_type));
> >> 
> >> The above can be simplified as:
> >> +  tree attr = lookup_attribute (“kcfi_type_id”, TYPE_ATTRIBUTES (fn_type));
> > 
> > Ugh, I totally misunderstood the examples I saw of this. I thought they
> > were caching the string lookup, but now that I look more closely, I see:
> > 
> > #define IDENTIFIER_POINTER(NODE) \
> >  ((const char *) IDENTIFIER_NODE_CHECK (NODE)->identifier.id.str)
> > 
> > it's just returning the string!
> > 
> > I will throw away the "caching" I was doing. I thought it would actually
> > look up the attribute using the tree returned by get_identifier, but I
> > see there is no overloaded lookup_attribute that takes a tree argument.
> > 
> > *face palm*
> 
> -:)

Okay, so I tried to remove this and remembered that it's actually cached
not for lookup_attribute, but for the build_tree_list call:

      tree attr = build_tree_list (kcfi_type_id_attr, type_id_tree);

      TYPE_ATTRIBUTES (fn_type) = chainon (TYPE_ATTRIBUTES (fn_type), attr);

For _that_, I need a "tree" argument. So instead of building it each
time, I have it built already, and I can get at its string for
lookup_attribute too. So I think this code is good as-is.

Thanks!

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/7] typeinfo: Introduce KCFI typeinfo mangling API
  2025-09-18 18:09       ` Kees Cook
@ 2025-09-18 18:40         ` Martin Uecker
  0 siblings, 0 replies; 28+ messages in thread
From: Martin Uecker @ 2025-09-18 18:40 UTC (permalink / raw)
  To: Kees Cook
  Cc: Qing Zhao, Andrew Pinski, Jakub Jelinek, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org

Am Donnerstag, dem 18.09.2025 um 11:09 -0700 schrieb Kees Cook:
> On Thu, Sep 18, 2025 at 09:20:52AM +0200, Martin Uecker wrote:
> > Am Mittwoch, dem 17.09.2025 um 17:56 +0000 schrieb Qing Zhao:
> > > Hi, 
> > > 
> > > > On Sep 13, 2025, at 19:23, Kees Cook <kees@kernel.org> wrote:
> > > > 
> > > > To support the KCFI typeid and future type-based allocators,
> > 
> > What I find problematic though is that this is not based on GNU / ISO C
> > rules but on stricter Linux kernel rules.   I think such builtin should
> > have two versions.  
> > 
> > So maybe
> > 
> > __builtin_typeinfo_hash_strict // strict
> > __builtin_typeinfo_hash_canonical // standard
> > 
> > or similar, or maybe instead have a flag argument so that we can
> > add other options which may turn out to be important in the future
> > (such as ignoring qualifiers or supporting newer language features).
> 
> Can you send me a patch to gcc/testsuite/gcc.dg/builtin-typeinfo.c
> that shows what differences you mean? 

I can look at this in the next few days.

> Because AFAICT, this C version
> matches the C++ typeinfo implementation. 

> There isn't a need for these
> hashes to be comparable in a way that they could be used to, for
> example, reimplement __builtin_types_compatible_p. It's called
> "typeinfo" and that has a specific meaning currently...

I would want the hashes for types which are compatible
according to ISO C to be identical.

What I want to avoid is that this is used to implement some
run-time type checking (which is what KCFI does) and the run-time
check can fail even when a compile-time check according to the
usual rules of the language passes.  Even if this is OK for the
Linux kernel, I think this would be surprising in general.
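
The concern can be made concrete with a small sketch. It hashes two of the mangled strings from Kees's listing with 32-bit FNV-1a, which is only a stand-in here (the hash actually used by the series is not shown in this thread), and toy_typeinfo_hash is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in hash (32-bit FNV-1a) over a mangled type name.  This is NOT
   the hash used by the patch series; any string hash shows the same
   effect.  */
static uint32_t
toy_typeinfo_hash (const char *mangled)
{
  uint32_t h = 2166136261u;

  assert (mangled != NULL);
  for (; *mangled; mangled++)
    {
      h ^= (unsigned char) *mangled;
      h *= 16777619u;
    }
  return h;
}

/* int[10] and int[] are compatible in ISO C, yet they mangle as A10_i
   and A_i, so a hash of the mangled name tells them apart.  */
static int
compatible_arrays_hash_equal (void)
{
  return toy_typeinfo_hash ("A10_i") == toy_typeinfo_hash ("A_i");
}
```

A run-time check built on such a hash would reject a pairing that a compile-time compatibility check accepts.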

Martin


> 
> Given:
> 
>     typedef int arr10[10];
>     typedef int arr_unknown[];
>     typedef int *arr;
>     typedef struct named { int a; int b; } named_t;
>     typedef struct { int a; int b; } nameless_t;
>     typedef void (*func_arr10)(int[10]);
>     typedef void (*func_arr_unknown)(int[]);
>     typedef void (*func_ptr)(int*);
>     typedef void (*func_named)(named_t*);
>     typedef void (*func_nameless)(nameless_t*);
> 
> C++ typeid(...).name() shows:
> 
>   int[10]:		A10_i
>   int[]:		A_i
>   int *:		Pi
>   named_t:		5named
>   nameless_t:		10nameless_t
>   void(*)(int[10]):	PFvPiE
>   void(*)(int[]):	PFvPiE
>   void(*)(int*):	PFvPiE
>   void(*)(named_t*):	PFvP5namedE
>   void(*)(nameless_t*):	PFvP10nameless_tE
> 
> This __builtin_typeinfo_name(...) shows:
> 
>   int[10]:		A10_i
>   int[]:		A_i
>   int *:		Pi
>   __builtin_types_compatible_p(int[10], int[]): true
>   __builtin_types_compatible_p(int[], int*):	false
>   named_t:		5named
>   nameless_t:		10nameless_t
>   void(*)(int[10]):	PFvPiE
>   void(*)(int[]):	PFvPiE
>   void(*)(int*):	PFvPiE
>   void(*)(named_t*):	PFvP5namedE
>   void(*)(nameless_t*):	PFvP10nameless_tE
> 
> What would you want the "Strict ISO C" builtin to do instead?
> 
> -Kees

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
  2025-09-18 18:20         ` Kees Cook
@ 2025-09-18 18:48           ` Qing Zhao
  2025-09-18 19:20             ` Kees Cook
  0 siblings, 1 reply; 28+ messages in thread
From: Qing Zhao @ 2025-09-18 18:48 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Pinski, Jakub Jelinek, Martin Uecker, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org



> On Sep 18, 2025, at 14:20, Kees Cook <kees@kernel.org> wrote:
> 
>>>>> +- External functions that are address-taken have a weak __kcfi_typeid_$func
>>>>> +  symbol added with the typeid value available so that the typeid can be
>>>>> +  referenced from assembly linkages, etc, where the typeid values cannot be
>>>>> +  calculated (i.e where C type information is missing):
>>>>> +
>>>>> +    .weak   __kcfi_typeid_$func
>>>>> +    .set    __kcfi_typeid_$func, $typeid
>>>>> +
>>>> 
>>>> From my previous understanding, the above weak symbol is emitted for external functions
>>>> that are address-taken AND does not have a definition in the compilation. So the weak symbols
>>>> Is emitted at the declaration site of the external function, is this true?
>>>> 
>>>> If so, could you please clarify this in the above?
>>> 
>>> Yes, this happens via assemble_external_real, which can be called under
>>> a few conditions in gcc/varasm.cc.
>> 
>> Okay. Please clarify this in the design doc.
> 
> I mention it later in the "behavioral" section:
> 
> - assemble_external_real calls kcfi_emit_typeid_symbol to add the
>  __kcfi_typeid_$func symbols.
> 
> I had left off implementation details (i.e. "called from
> assemble_external_real") in the "constraints" section. How would you
> like this arranged?

The original arrangement is good. -:)

I guess I didn't make myself clear in the beginning. The following is a modified version of
your previous paragraph:

+- An external function that is address-taken but does not have a definition has
+  a weak __kcfi_typeid_$func symbol added at the declaration site. This weak
+  symbol has the typeid value available so that the typeid can be
+  referenced from assembly linkages, etc., where the typeid values cannot be
+  calculated (i.e. where C type information is missing):
+
+    .weak   __kcfi_typeid_$func
+    .set    __kcfi_typeid_$func, $typeid
+

Is the above the correct understanding? 

>>> 
>>>>> +static uint32_t
>>>>> +kcfi_get_type_id (tree fn_type)
>>>>> +{
>>>>> +  uint32_t type_id;
>>>>> +
>>>>> +  /* Cache the attribute identifier.  */
>>>>> +  if (!kcfi_type_id_attr)
>>>>> +    kcfi_type_id_attr = get_identifier ("kcfi_type_id");
>>>>> +
>>>>> +  tree attr = lookup_attribute (IDENTIFIER_POINTER (kcfi_type_id_attr),
>>>>> + TYPE_ATTRIBUTES (fn_type));
>>>> 
>>>> The above can be simplified as:
>>>> +  tree attr = lookup_attribute (“kcfi_type_id”, TYPE_ATTRIBUTES (fn_type));
>>> 
>>> Ugh, I totally misunderstood the examples I saw of this. I thought they
>>> were caching the string lookup, but now that I look more closely, I see:
>>> 
>>> #define IDENTIFIER_POINTER(NODE) \
>>> ((const char *) IDENTIFIER_NODE_CHECK (NODE)->identifier.id.str)
>>> 
>>> it's just returning the string!
>>> 
>>> I will throw away the "caching" I was doing. I thought it would actually
>>> look up the attribute using the tree returned by get_identifier, but I
>>> see there is no overloaded lookup_attribute that takes a tree argument.
>>> 
>>> *face palm*
>> 
>> -:)
> 
> Okay, so I tried to remove this and remembered that it's actually cached
> not for lookup_attribute, but for build_tree_list call case:
> 
>      tree attr = build_tree_list (kcfi_type_id_attr, type_id_tree);
> 
>      TYPE_ATTRIBUTES (fn_type) = chainon (TYPE_ATTRIBUTES (fn_type), attr);
> 
> For _that_, I need a "tree" argument. So instead of building it each
> time, I have it built already, and I can get at its string for
> lookup_attribute too. So I think this code is good as-is.

Right, kcfi_type_id_attr is still needed for building the new type_id attribute.

But, for the following

> +  tree attr = lookup_attribute (IDENTIFIER_POINTER (kcfi_type_id_attr),
> + TYPE_ATTRIBUTES (fn_type));

The above can be simplified as:
+  tree attr = lookup_attribute ("kcfi_type_id", TYPE_ATTRIBUTES (fn_type));

No need to call IDENTIFIER_POINTER (kcfi_type_id_attr) as the first argument for the above call.

Hope this is clear.

Qing


> 
> Thanks!
> 
> -Kees
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
  2025-09-18 18:48           ` Qing Zhao
@ 2025-09-18 19:20             ` Kees Cook
  0 siblings, 0 replies; 28+ messages in thread
From: Kees Cook @ 2025-09-18 19:20 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Andrew Pinski, Jakub Jelinek, Martin Uecker, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org

On Thu, Sep 18, 2025 at 06:48:03PM +0000, Qing Zhao wrote:
> 
> 
> > On Sep 18, 2025, at 14:20, Kees Cook <kees@kernel.org> wrote:
> > 
> >>>>> +- External functions that are address-taken have a weak __kcfi_typeid_$func
> >>>>> +  symbol added with the typeid value available so that the typeid can be
> >>>>> +  referenced from assembly linkages, etc, where the typeid values cannot be
> >>>>> +  calculated (i.e where C type information is missing):
> >>>>> +
> >>>>> +    .weak   __kcfi_typeid_$func
> >>>>> +    .set    __kcfi_typeid_$func, $typeid
> >>>>> +
> >>>> 
> >>>> From my previous understanding, the above weak symbol is emitted for external functions
> >>>> that are address-taken AND does not have a definition in the compilation. So the weak symbols
> >>>> Is emitted at the declaration site of the external function, is this true?
> >>>> 
> >>>> If so, could you please clarify this in the above?
> >>> 
> >>> Yes, this happens via assemble_external_real, which can be called under
> >>> a few conditions in gcc/varasm.cc.
> >> 
> >> Okay. Please clarify this in the design doc.
> > 
> > I mention it later in the "behavioral" section:
> > 
> > - assemble_external_real calls kcfi_emit_typeid_symbol to add the
> >  __kcfi_typeid_$func symbols.
> > 
> > I had left off implementation details (i.e. "called from
> > assemble_external_real") in the "constraints" section. How would you
> > like this arranged?
> 
> The original arrangement is good. -:)
> 
> I guess that I didn’t make myself clear in the beginning, the following is a modified version of 
> your previous paragraph:
> 
> +- An external function that is address-taken but does not have a definition has
> +  a weak __kcfi_typeid_$func symbol added at the declaration site. This weak
> +  symbol has  the typeid value available so that the typeid can be
> +  referenced from assembly linkages, etc, where the typeid values cannot be
> +  calculated (i.e where C type information is missing):
> +
> +    .weak   __kcfi_typeid_$func
> +    .set    __kcfi_typeid_$func, $typeid
> +
> 
> Is the above the correct understanding? 

Ah! I see, yes, that's correct. I will update it. :)
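
A rough sketch of what emitting those two directives could look like. This is not the series' kcfi_emit_typeid_symbol; format_typeid_symbol is a hypothetical helper, and printing the value in hexadecimal is an assumption:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: format the .weak/.set pair for function NAME
   with type ID TYPE_ID into BUF (LEN bytes).  Returns the snprintf
   result, i.e. the length the full text needs.  */
static int
format_typeid_symbol (char *buf, size_t len, const char *name,
		      uint32_t type_id)
{
  assert (buf != NULL && name != NULL && len > 0);
  return snprintf (buf, len,
		   "\t.weak\t__kcfi_typeid_%s\n"
		   "\t.set\t__kcfi_typeid_%s, 0x%08" PRIx32 "\n",
		   name, name, type_id);
}
```

Assembly code that takes the function's address can then reference __kcfi_typeid_$func symbolically instead of recomputing the value without C type information.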

> 
> >>> 
> >>>>> +static uint32_t
> >>>>> +kcfi_get_type_id (tree fn_type)
> >>>>> +{
> >>>>> +  uint32_t type_id;
> >>>>> +
> >>>>> +  /* Cache the attribute identifier.  */
> >>>>> +  if (!kcfi_type_id_attr)
> >>>>> +    kcfi_type_id_attr = get_identifier ("kcfi_type_id");
> >>>>> +
> >>>>> +  tree attr = lookup_attribute (IDENTIFIER_POINTER (kcfi_type_id_attr),
> >>>>> + TYPE_ATTRIBUTES (fn_type));
> >>>> 
> >>>> The above can be simplified as:
> >>>> +  tree attr = lookup_attribute (“kcfi_type_id”, TYPE_ATTRIBUTES (fn_type));
> >>> 
> >>> Ugh, I totally misunderstood the examples I saw of this. I thought they
> >>> were caching the string lookup, but now that I look more closely, I see:
> >>> 
> >>> #define IDENTIFIER_POINTER(NODE) \
> >>> ((const char *) IDENTIFIER_NODE_CHECK (NODE)->identifier.id.str)
> >>> 
> >>> it's just returning the string!
> >>> 
> >>> I will throw away the "caching" I was doing. I thought it would actually
> >>> look up the attribute using the tree returned by get_identifier, but I
> >>> see there is no overloaded lookup_attribute that takes a tree argument.
> >>> 
> >>> *face palm*
> >> 
> >> -:)
> > 
> > Okay, so I tried to remove this and remembered that it's actually cached
> > not for lookup_attribute, but for build_tree_list call case:
> > 
> >      tree attr = build_tree_list (kcfi_type_id_attr, type_id_tree);
> > 
> >      TYPE_ATTRIBUTES (fn_type) = chainon (TYPE_ATTRIBUTES (fn_type), attr);
> > 
> > For _that_, I need a "tree" argument. So instead of building it each
> > time, I have it built already, and I can get at its string for
> > lookup_attribute too. So I think this code is good as-is.
> 
> Right, the kcfi_type_id_attr is still needed for the purpose of new type_id attribute.
> 
> But, for the following
> 
> > +  tree attr = lookup_attribute (IDENTIFIER_POINTER (kcfi_type_id_attr),
> > + TYPE_ATTRIBUTES (fn_type));
> 
> The above can be simplified as:
> +  tree attr = lookup_attribute (“kcfi_type_id”, TYPE_ATTRIBUTES (fn_type));
> 
> No need to call IDENTIFIER_POINTER (kcfi_type_id_attr) as the first argument for the above call.
> 
> Hope this is clear.

Right, I did this because it seemed weird to me to open-code the same
literal string twice.
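
One possible middle ground, sketched with toy types rather than GCC's tree API: spell the literal once in a macro and let both the cached identifier and any string-based lookup use it. KCFI_TYPE_ID_ATTR_NAME, struct toy_identifier, and both helpers are hypothetical names, not from the series:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Single definition of the attribute name, so the literal is written
   once whether a caller wants the string or the cached identifier.  */
#define KCFI_TYPE_ID_ATTR_NAME "kcfi_type_id"

/* Toy stand-in for a GCC identifier node.  */
struct toy_identifier
{
  const char *str;
};

static struct toy_identifier *cached_attr;

/* Return the cached identifier, creating it on first use.  */
static struct toy_identifier *
toy_get_attr_identifier (void)
{
  static struct toy_identifier storage;

  if (!cached_attr)
    {
      storage.str = KCFI_TYPE_ID_ATTR_NAME;
      cached_attr = &storage;
    }
  return cached_attr;
}

/* String-based lookup path: compare against the same macro.  */
static int
toy_is_kcfi_attr (const char *name)
{
  assert (name != NULL);
  return strcmp (name, KCFI_TYPE_ID_ATTR_NAME) == 0;
}
```

Whether this is any clearer than passing IDENTIFIER_POINTER (kcfi_type_id_attr) is a matter of taste; it only avoids repeating the literal.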

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
  2025-09-17 21:09     ` Kees Cook
  2025-09-18 16:59       ` Qing Zhao
@ 2025-09-18 19:39       ` Kees Cook
  2025-09-18 20:14         ` Qing Zhao
  1 sibling, 1 reply; 28+ messages in thread
From: Kees Cook @ 2025-09-18 19:39 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Andrew Pinski, Jakub Jelinek, Martin Uecker, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org

On Wed, Sep 17, 2025 at 02:09:54PM -0700, Kees Cook wrote:
> On Wed, Sep 17, 2025 at 01:42:32PM +0000, Qing Zhao wrote:
> > > On Sep 13, 2025, at 19:23, Kees Cook <kees@kernel.org> wrote:
> > > +- Keep indirect calls from being merged (see earlier example) by
> > > +  checking the KCFI insn's typeid for equality.
> > 
> > Is this resolved by the following code:
> > 
> > rtlanal.cc
> > index 63a1d08c46cf..5016fe93ccac 100644
> > --- a/gcc/rtlanal.cc
> > +++ b/gcc/rtlanal.cc
> > @@ -1177,6 +1177,11 @@ reg_referenced_p (const_rtx x, const_rtx body)
> >     case IF_THEN_ELSE:
> >       return reg_overlap_mentioned_p (x, body);
> > 
> > +    case KCFI:
> > +      /* For KCFI wrapper, check both the wrapped call and the type ID.  */
> > +      return (reg_overlap_mentioned_p (x, XEXP (body, 0))
> > +      || reg_overlap_mentioned_p (x, XEXP (body, 1)));
> > +
> 
> The above is needed for accurate register "liveness" checking. When the
> above code is removed, the kcfi-move-preservation.c regression test
> fails (since it doesn't see the clobbers).
> 
> AFAICT, simply making it a new type of RTL (the DEF_RTL_EXPR), made it
> unmergeable. I assume this is because whatever was doing the call
> merging was looking strictly for "CALL" types, but I honestly don't know
> where that was happening.

Okay, I've found this. The pass that merged the regression test's calls
is jump2. Specifically, the jump2 pass calls old_insns_match_p() which
compares instruction patterns using rtx_equal_p(), and that is doing it
naturally based on the RTL expression, i.e. matching RTL codes for KCFI,
and then matching format (KCFI defines itself as "ee" format, i.e. 2
expressions):

  code = GET_CODE (x);
  /* Rtx's of different codes cannot be equal.  */
  if (code != GET_CODE (y))
    return false;
...
  fmt = GET_RTX_FORMAT (code);
  for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
    {
      switch (fmt[i])
        {
...
        case 'e':
          if (!rtx_equal_p (XEXP (x, i), XEXP (y, i), cb))
            return false;
          break;

So if it's the same call and the same typeid, it'll get merged, otherwise
it won't. And I've validated this now with an addition to the regression
test. It now makes 3 calls, once with typeid A, and then 2 calls with
typeid B, and the typeid B calls get merged.

So there was no special handling for CALL, it's just that CALL didn't have
the typeid associated with it, and KCFI does. RTL working as intended. ;)
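
To illustrate the point outside of GCC, here is a minimal toy sketch of
the same kind of structural comparison. None of this is GCC code:
struct toy_rtx, toy_nops() and toy_equal_p() are made-up names, the
typeid values are arbitrary, and the real rtx_equal_p() handles many
more operand formats than the plain sub-expressions modeled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; these are NOT GCC's definitions.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_CALL, TOY_KCFI };

struct toy_rtx
{
  enum toy_code code;
  long value;                  /* Payload for TOY_REG/TOY_CONST leaves.  */
  const struct toy_rtx *op[2]; /* Sub-expression operands.  */
};

/* Number of sub-expression operands per code: a toy CALL wraps its
   target, a toy KCFI wraps a call plus a typeid (think "ee").  */
int
toy_nops (enum toy_code code)
{
  switch (code)
    {
    case TOY_CALL:
      return 1;
    case TOY_KCFI:
      return 2;
    default:
      return 0;
    }
}

/* Structural equality: same code, then every operand equal.  */
bool
toy_equal_p (const struct toy_rtx *x, const struct toy_rtx *y)
{
  if (x == y)
    return true;
  if (x == NULL || y == NULL)
    return false;
  /* Expressions of different codes cannot be equal.  */
  if (x->code != y->code)
    return false;

  int n = toy_nops (x->code);
  if (n == 0)
    return x->value == y->value;

  for (int i = n - 1; i >= 0; i--)
    if (!toy_equal_p (x->op[i], y->op[i]))
      return false;
  return true;
}

/* Two separately built indirect calls through the same register.  */
const struct toy_rtx target_1 = { TOY_REG, 11, { NULL, NULL } };
const struct toy_rtx target_2 = { TOY_REG, 11, { NULL, NULL } };
const struct toy_rtx call_1 = { TOY_CALL, 0, { &target_1, NULL } };
const struct toy_rtx call_2 = { TOY_CALL, 0, { &target_2, NULL } };

/* Arbitrary example typeids "A" and "B".  */
const struct toy_rtx typeid_a = { TOY_CONST, 0x1111, { NULL, NULL } };
const struct toy_rtx typeid_b1 = { TOY_CONST, 0x2222, { NULL, NULL } };
const struct toy_rtx typeid_b2 = { TOY_CONST, 0x2222, { NULL, NULL } };

/* One wrapped call with typeid A, two with typeid B.  */
const struct toy_rtx kcfi_a = { TOY_KCFI, 0, { &call_1, &typeid_a } };
const struct toy_rtx kcfi_b1 = { TOY_KCFI, 0, { &call_1, &typeid_b1 } };
const struct toy_rtx kcfi_b2 = { TOY_KCFI, 0, { &call_2, &typeid_b2 } };
```

In this sketch toy_equal_p (&kcfi_b1, &kcfi_b2) is true while
toy_equal_p (&kcfi_a, &kcfi_b1) is false, and the bare call_1/call_2
nodes compare equal because nothing in them carries a typeid.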

(But my new mystery is why this matching-typeid KCFI merging happens
on all backends _except_ arm... I will investigate that.)

-Kees

-- 
Kees Cook


* Re: [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure
  2025-09-18 19:39       ` Kees Cook
@ 2025-09-18 20:14         ` Qing Zhao
  0 siblings, 0 replies; 28+ messages in thread
From: Qing Zhao @ 2025-09-18 20:14 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Pinski, Jakub Jelinek, Martin Uecker, Richard Biener,
	Joseph Myers, Peter Zijlstra, Jan Hubicka, Richard Earnshaw,
	Richard Sandiford, Marcus Shawcroft, Kyrylo Tkachov, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Jim Wilson, Dan Li,
	Sami Tolvanen, Ramon de C Valle, Joao Moreira, Nathan Chancellor,
	Bill Wendling, gcc-patches@gcc.gnu.org,
	linux-hardening@vger.kernel.org



> On Sep 18, 2025, at 15:39, Kees Cook <kees@kernel.org> wrote:
> 
> On Wed, Sep 17, 2025 at 02:09:54PM -0700, Kees Cook wrote:
>> On Wed, Sep 17, 2025 at 01:42:32PM +0000, Qing Zhao wrote:
>>>> On Sep 13, 2025, at 19:23, Kees Cook <kees@kernel.org> wrote:
>>>> +- Keep indirect calls from being merged (see earlier example) by
>>>> +  checking the KCFI insn's typeid for equality.
>>> 
>>> Is this resolved by the following code:
>>> 
>>> rtlanal.cc
>>> index 63a1d08c46cf..5016fe93ccac 100644
>>> --- a/gcc/rtlanal.cc
>>> +++ b/gcc/rtlanal.cc
>>> @@ -1177,6 +1177,11 @@ reg_referenced_p (const_rtx x, const_rtx body)
>>>    case IF_THEN_ELSE:
>>>      return reg_overlap_mentioned_p (x, body);
>>> 
>>> +    case KCFI:
>>> +      /* For KCFI wrapper, check both the wrapped call and the type ID.  */
>>> +      return (reg_overlap_mentioned_p (x, XEXP (body, 0))
>>> +      || reg_overlap_mentioned_p (x, XEXP (body, 1)));
>>> +
>> 
>> The above is needed for accurate register "liveness" checking. When the
>> above code is removed, the kcfi-move-preservation.c regression test
>> fails (since it doesn't see the clobbers).
>> 
>> AFAICT, simply making it a new type of RTL (the DEF_RTL_EXPR), made it
>> unmergeable. I assume this is because whatever was doing the call
>> merging was looking strictly for "CALL" types, but I honestly don't know
>> where that was happening.
> 
> Okay, I've found this. The pass that merged the regression test's calls
> is jump2. Specifically, the jump2 pass calls old_insns_match_p() which
> compares instruction patterns using rtx_equal_p(), and that is doing it
> naturally based on the RTL expression, i.e. matching RTL codes for KCFI,
> and then matching format (KCFI defines itself as "ee" format, i.e. 2
> expressions):
> 
>  code = GET_CODE (x);
>  /* Rtx's of different codes cannot be equal.  */
>  if (code != GET_CODE (y))
>    return false;
> ...
>  fmt = GET_RTX_FORMAT (code);
>  for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
>    {
>      switch (fmt[i])
>        {
> ...
>        case 'e':
>          if (!rtx_equal_p (XEXP (x, i), XEXP (y, i), cb))
>            return false;
>          break;
> 
> So if it's the same call and the same typeid, it'll get merged, otherwise
> it won't. And I've validated this now with an addition to the regression
> test. It now makes 3 calls, once with typeid A, and then 2 calls with
> typeid B, and the typeid B calls get merged.
> 
> So there was no special handling for CALL, it's just that CALL didn't have
> the typeid associated with it, and KCFI does. RTL working as intended. ;)

Yeah, this sounds very natural and reasonable now. Nice!
> 
> (But my new mystery is why this matching-typeid KCFI merging happens
> on all backends _except_ arm... I will investigate that.)

Have fun. -:)

Qing
> 
> -Kees
> 
> -- 
> Kees Cook



end of thread, other threads:[~2025-09-18 20:16 UTC | newest]

Thread overview: 28+ messages
2025-09-13 23:23 [PATCH v3 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048] Kees Cook
2025-09-13 23:23 ` [PATCH v3 1/7] typeinfo: Introduce KCFI typeinfo mangling API Kees Cook
2025-09-17 17:56   ` Qing Zhao
2025-09-17 21:20     ` Kees Cook
2025-09-18  7:20     ` Martin Uecker
2025-09-18 18:09       ` Kees Cook
2025-09-18 18:40         ` Martin Uecker
2025-09-13 23:23 ` [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure Kees Cook
2025-09-17 13:42   ` Qing Zhao
2025-09-17 21:09     ` Kees Cook
2025-09-18 16:59       ` Qing Zhao
2025-09-18 18:20         ` Kees Cook
2025-09-18 18:48           ` Qing Zhao
2025-09-18 19:20             ` Kees Cook
2025-09-18 19:39       ` Kees Cook
2025-09-18 20:14         ` Qing Zhao
2025-09-13 23:23 ` [PATCH v3 3/7] x86: Add x86_64 Kernel Control Flow Integrity implementation Kees Cook
2025-09-13 23:24 ` [PATCH v3 4/7] aarch64: Add AArch64 " Kees Cook
2025-09-13 23:43   ` Andrew Pinski
2025-09-14 19:45     ` Kees Cook
2025-09-14 19:52       ` Andrew Pinski
2025-09-17 20:01     ` Kees Cook
2025-09-13 23:24 ` [PATCH v3 5/7] arm: Add ARM 32-bit " Kees Cook
2025-09-13 23:24 ` [PATCH v3 6/7] riscv: Add RISC-V " Kees Cook
2025-09-13 23:24 ` [PATCH v3 7/7] kcfi: Add regression test suite Kees Cook
2025-09-13 23:51   ` Andrew Pinski
2025-09-17 19:51     ` Kees Cook
2025-09-13 23:58   ` Andrew Pinski
