* [PATCH v3 1/9] ARM: Support CLANG CFI
2024-03-11 9:15 [PATCH v3 0/9] CFI for ARM32 using LLVM Linus Walleij
@ 2024-03-11 9:15 ` Linus Walleij
2024-03-11 10:24 ` Ard Biesheuvel
2024-03-11 9:15 ` [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines Linus Walleij
` (7 subsequent siblings)
8 siblings, 1 reply; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 9:15 UTC (permalink / raw)
To: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Ard Biesheuvel, Arnd Bergmann
Cc: linux-arm-kernel, llvm, Linus Walleij
Support Control Flow Integrity (CFI) when compiling with
CLANG.
In the as-of-writing LLVM CLANG implementation (v17)
the 32-bit ARM platform is supported by the generic CFI
implementation, which isn't tailored specifically for ARM32
but works well enough to enable the feature.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
arch/arm/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 0af6709570d1..1216656a40bc 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -34,6 +34,7 @@ config ARM
select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT if CPU_V7
select ARCH_SUPPORTS_ATOMIC_RMW
+ select ARCH_SUPPORTS_CFI_CLANG
select ARCH_SUPPORTS_HUGETLBFS if ARM_LPAE
select ARCH_SUPPORTS_PER_VMA_LOCK
select ARCH_USE_BUILTIN_BSWAP
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread* Re: [PATCH v3 1/9] ARM: Support CLANG CFI
2024-03-11 9:15 ` [PATCH v3 1/9] ARM: Support CLANG CFI Linus Walleij
@ 2024-03-11 10:24 ` Ard Biesheuvel
0 siblings, 0 replies; 25+ messages in thread
From: Ard Biesheuvel @ 2024-03-11 10:24 UTC (permalink / raw)
To: Linus Walleij
Cc: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Arnd Bergmann, linux-arm-kernel, llvm
On Mon, 11 Mar 2024 at 10:15, Linus Walleij <linus.walleij@linaro.org> wrote:
>
> Support Control Flow Integrity (CFI) when compiling with
> CLANG.
>
> In the as-of-writing LLVM CLANG implementation (v17)
> the 32-bit ARM platform is supported by the generic CFI
> implementation, which isn't tailored specifically for ARM32
> but works well enough to enable the feature.
>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
> arch/arm/Kconfig | 1 +
> 1 file changed, 1 insertion(+)
>
This patch should go last no?
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 0af6709570d1..1216656a40bc 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -34,6 +34,7 @@ config ARM
> select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
> select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT if CPU_V7
> select ARCH_SUPPORTS_ATOMIC_RMW
> + select ARCH_SUPPORTS_CFI_CLANG
> select ARCH_SUPPORTS_HUGETLBFS if ARM_LPAE
> select ARCH_SUPPORTS_PER_VMA_LOCK
> select ARCH_USE_BUILTIN_BSWAP
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines
2024-03-11 9:15 [PATCH v3 0/9] CFI for ARM32 using LLVM Linus Walleij
2024-03-11 9:15 ` [PATCH v3 1/9] ARM: Support CLANG CFI Linus Walleij
@ 2024-03-11 9:15 ` Linus Walleij
2024-03-11 9:39 ` Russell King (Oracle)
2024-03-11 9:15 ` [PATCH v3 3/9] ARM: bugs: Check in the vtable instead of defined aliases Linus Walleij
` (6 subsequent siblings)
8 siblings, 1 reply; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 9:15 UTC (permalink / raw)
To: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Ard Biesheuvel, Arnd Bergmann
Cc: linux-arm-kernel, llvm, Linus Walleij
Instead of just using defines to define the TLB flush functions,
use static inlines.
This has the upside that we can tag those as __nocfi so we can
execute a CFI-enabled kernel.
Move the variables around a bit so the functions can find their
global variable cpu_tlb.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
arch/arm/include/asm/tlbflush.h | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index 38c6e4a2a0b6..7340518ee0e9 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -210,13 +210,23 @@ struct cpu_tlb_fns {
unsigned long tlb_flags;
};
+extern struct cpu_tlb_fns cpu_tlb;
+
+#define __cpu_tlb_flags cpu_tlb.tlb_flags
+
/*
* Select the calling method
*/
#ifdef MULTI_TLB
-#define __cpu_flush_user_tlb_range cpu_tlb.flush_user_range
-#define __cpu_flush_kern_tlb_range cpu_tlb.flush_kern_range
+static inline void __nocfi __cpu_flush_user_tlb_range(unsigned long s, unsigned long e, struct vm_area_struct *vma)
+{
+ cpu_tlb.flush_user_range(s, e, vma);
+}
+static inline void __nocfi __cpu_flush_kern_tlb_range(unsigned long s, unsigned long e)
+{
+ cpu_tlb.flush_kern_range(s, e);
+}
#else
@@ -228,10 +238,6 @@ extern void __cpu_flush_kern_tlb_range(unsigned long, unsigned long);
#endif
-extern struct cpu_tlb_fns cpu_tlb;
-
-#define __cpu_tlb_flags cpu_tlb.tlb_flags
-
/*
* TLB Management
* ==============
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread* Re: [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines
2024-03-11 9:15 ` [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines Linus Walleij
@ 2024-03-11 9:39 ` Russell King (Oracle)
2024-03-11 10:03 ` Ard Biesheuvel
0 siblings, 1 reply; 25+ messages in thread
From: Russell King (Oracle) @ 2024-03-11 9:39 UTC (permalink / raw)
To: Linus Walleij
Cc: Sami Tolvanen, Kees Cook, Nathan Chancellor, Nick Desaulniers,
Ard Biesheuvel, Arnd Bergmann, linux-arm-kernel, llvm
On Mon, Mar 11, 2024 at 10:15:39AM +0100, Linus Walleij wrote:
> Instead of just using defines to define the TLB flush functions,
> use static inlines.
>
> This has the upside that we can tag those as __nocfi so we can
> execute a CFI-enabled kernel.
Why? This seems to be brain dead.
Why can't CLANG cope with directly calling e.g.
cpu_tlb.flush_user_range? Why does it need a static function to do
exactly the same as the macro does?
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines
2024-03-11 9:39 ` Russell King (Oracle)
@ 2024-03-11 10:03 ` Ard Biesheuvel
2024-03-11 15:34 ` Sami Tolvanen
0 siblings, 1 reply; 25+ messages in thread
From: Ard Biesheuvel @ 2024-03-11 10:03 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Linus Walleij, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Arnd Bergmann, linux-arm-kernel, llvm
On Mon, 11 Mar 2024 at 10:39, Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Mon, Mar 11, 2024 at 10:15:39AM +0100, Linus Walleij wrote:
> > Instead of just using defines to define the TLB flush functions,
> > use static inlines.
> >
> > This has the upside that we can tag those as __nocfi so we can
> > execute a CFI-enabled kernel.
>
> Why? This seems to be brain dead.
>
> Why can't CLANG cope with directly calling e.g.
> cpu_tlb.flush_user_range? Why does it need a static function to do
> exactly the same as the macro does?
>
I had the same question, so I played around a bit with the code.
What I think would be better is if we could add the __nocfi annotation
to the type, i.e.,
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -205,8 +205,8 @@
#include <linux/sched.h>
struct cpu_tlb_fns {
- void (*flush_user_range)(unsigned long, unsigned long, ...);
- void (*flush_kern_range)(unsigned long, unsigned long);
+ void (__nocfi *flush_user_range)(unsigned long, unsigned long, ...);
+ void (__nocfi *flush_kern_range)(unsigned long, unsigned long);
unsigned long tlb_flags;
};
This works for some function attributes (e.g., __efiapi is used like
this), but the attribute specifier to which __nocfi resolves does not
appear to be usable in the same manner.
Best would be to annotate the asm code using
SYM_TYPED_FUNC_START/_END, so that the CFI machinery is invoked at the
call site to validate the function type of the destination.
^ permalink raw reply [flat|nested] 25+ messages in thread* Re: [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines
2024-03-11 10:03 ` Ard Biesheuvel
@ 2024-03-11 15:34 ` Sami Tolvanen
2024-03-11 19:50 ` Linus Walleij
0 siblings, 1 reply; 25+ messages in thread
From: Sami Tolvanen @ 2024-03-11 15:34 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Russell King (Oracle), Linus Walleij, Kees Cook,
Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
linux-arm-kernel, llvm
On Mon, Mar 11, 2024 at 3:04 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 11 Mar 2024 at 10:39, Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Mon, Mar 11, 2024 at 10:15:39AM +0100, Linus Walleij wrote:
> > > Instead of just using defines to define the TLB flush functions,
> > > use static inlines.
> > >
> > > This has the upside that we can tag those as __nocfi so we can
> > > execute a CFI-enabled kernel.
> >
> > Why? This seems to be brain dead.
> >
> > Why can't CLANG cope with directly calling e.g.
> > cpu_tlb.flush_user_range? Why does it need a static function to do
> > exactly the same as the macro does?
> >
>
> I had the same question, so I played around a bit with the code.
>
> What I think would be better is if we could add the __nocfi annotation
> to the type, i.e.,
>
> --- a/arch/arm/include/asm/tlbflush.h
> +++ b/arch/arm/include/asm/tlbflush.h
> @@ -205,8 +205,8 @@
> #include <linux/sched.h>
>
> struct cpu_tlb_fns {
> - void (*flush_user_range)(unsigned long, unsigned long, ...);
> - void (*flush_kern_range)(unsigned long, unsigned long);
> + void (__nocfi *flush_user_range)(unsigned long, unsigned long, ...);
> + void (__nocfi *flush_kern_range)(unsigned long, unsigned long);
> unsigned long tlb_flags;
> };
>
> This works for some function attributes (e.g., __efiapi is used like
> this), but the attribute specifier to which __nocfi resolves does not
> appear to be usable in the same manner.
>
> Best would be to annotate the asm code using
> SYM_TYPED_FUNC_START/_END, so that the CFI machinery is invoked at the
> call site to validate the function type of the destination.
Agreed, ideally we would annotate indirectly called assembly functions
with CFI types and avoid __nocfi wrappers.
Sami
^ permalink raw reply [flat|nested] 25+ messages in thread* Re: [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines
2024-03-11 15:34 ` Sami Tolvanen
@ 2024-03-11 19:50 ` Linus Walleij
2024-03-11 21:36 ` Sami Tolvanen
0 siblings, 1 reply; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 19:50 UTC (permalink / raw)
To: Sami Tolvanen
Cc: Ard Biesheuvel, Russell King (Oracle), Kees Cook,
Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
linux-arm-kernel, llvm
On Mon, Mar 11, 2024 at 4:35 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> On Mon, Mar 11, 2024 at 3:04 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>>
> > This works for some function attributes (e.g., __efiapi is used like
> > this), but the attribute specifier to which __nocfi resolves does not
> > appear to be usable in the same manner.
> >
> > Best would be to annotate the asm code using
> > SYM_TYPED_FUNC_START/_END, so that the CFI machinery is invoked at the
> > call site to validate the function type of the destination.
>
> Agreed, ideally we would annotate indirectly called assembly functions
> with CFI types and avoid __nocfi wrappers.
I'm taking a stab at SYM_TYPED_FUNC_* for ARM, as we don't have them
yet.
Yours,
Linus Walleij
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines
2024-03-11 19:50 ` Linus Walleij
@ 2024-03-11 21:36 ` Sami Tolvanen
2024-03-11 22:17 ` Linus Walleij
0 siblings, 1 reply; 25+ messages in thread
From: Sami Tolvanen @ 2024-03-11 21:36 UTC (permalink / raw)
To: Linus Walleij
Cc: Ard Biesheuvel, Russell King (Oracle), Kees Cook,
Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
linux-arm-kernel, llvm
On Mon, Mar 11, 2024 at 7:51 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Mon, Mar 11, 2024 at 4:35 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> > On Mon, Mar 11, 2024 at 3:04 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> >>
> > > This works for some function attributes (e.g., __efiapi is used like
> > > this), but the attribute specifier to which __nocfi resolves does not
> > > appear to be usable in the same manner.
> > >
> > > Best would be to annotate the asm code using
> > > SYM_TYPED_FUNC_START/_END, so that the CFI machinery is invoked at the
> > > call site to validate the function type of the destination.
> >
> > Agreed, ideally we would annotate indirectly called assembly functions
> > with CFI types and avoid __nocfi wrappers.
>
> I'm taking a stab at SYM_TYPED_FUNC_* for ARM, as we don't have them
> yet.
Does the default implementation in include/linux/cfi_types.h not work
on arm for some reason?
Sami
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines
2024-03-11 21:36 ` Sami Tolvanen
@ 2024-03-11 22:17 ` Linus Walleij
2024-03-11 22:28 ` Sami Tolvanen
0 siblings, 1 reply; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 22:17 UTC (permalink / raw)
To: Sami Tolvanen
Cc: Ard Biesheuvel, Russell King (Oracle), Kees Cook,
Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
linux-arm-kernel, llvm
On Mon, Mar 11, 2024 at 10:37 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> On Mon, Mar 11, 2024 at 7:51 PM Linus Walleij <linus.walleij@linaro.org> wrote:
> > On Mon, Mar 11, 2024 at 4:35 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> > > On Mon, Mar 11, 2024 at 3:04 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >>
> > > > This works for some function attributes (e.g., __efiapi is used like
> > > > this), but the attribute specifier to which __nocfi resolves does not
> > > > appear to be usable in the same manner.
> > > >
> > > > Best would be to annotate the asm code using
> > > > SYM_TYPED_FUNC_START/_END, so that the CFI machinery is invoked at the
> > > > call site to validate the function type of the destination.
> > >
> > > Agreed, ideally we would annotate indirectly called assembly functions
> > > with CFI types and avoid __nocfi wrappers.
> >
> > I'm taking a stab at SYM_TYPED_FUNC_* for ARM, as we don't have them
> > yet.
>
> Does the default implementation in include/linux/cfi_types.h not work
> on arm for some reason?
For example I try to switch over the TLB symbols like this:
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index e43f6d716b4b..bbe47ca32e55 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -341,7 +341,7 @@ ENTRY(\name\()_cache_fns)
.macro define_tlb_functions name:req, flags_up:req, flags_smp
.type \name\()_tlb_fns, #object
.align 2
-ENTRY(\name\()_tlb_fns)
+SYM_TYPED_FUNC_START(\name\()_tlb_fns)
.long \name\()_flush_user_tlb_range
.long \name\()_flush_kern_tlb_range
.ifnb \flags_smp
diff --git a/arch/arm/mm/tlb-v7.S b/arch/arm/mm/tlb-v7.S
index 35fd6d4f0d03..aff9d884c30d 100644
--- a/arch/arm/mm/tlb-v7.S
+++ b/arch/arm/mm/tlb-v7.S
@@ -10,6 +10,7 @@
*/
#include <linux/init.h>
#include <linux/linkage.h>
+#include <linux/cfi_types.h>
#include <asm/assembler.h>
#include <asm/asm-offsets.h>
#include <asm/page.h>
@@ -31,7 +32,7 @@
* - the "Invalidate single entry" instruction will invalidate
* both the I and the D TLBs on Harvard-style TLBs
*/
-ENTRY(v7wbi_flush_user_tlb_range)
+SYM_TYPED_FUNC_START(v7wbi_flush_user_tlb_range)
vma_vm_mm r3, r2 @ get vma->vm_mm
mmid r3, r3 @ get vm_mm->context.id
dsb ish
@@ -57,7 +58,7 @@ ENTRY(v7wbi_flush_user_tlb_range)
blo 1b
dsb ish
ret lr
-ENDPROC(v7wbi_flush_user_tlb_range)
+SYM_FUNC_END(v7wbi_flush_user_tlb_range)
/*
* v7wbi_flush_kern_tlb_range(start,end)
@@ -67,7 +68,7 @@ ENDPROC(v7wbi_flush_user_tlb_range)
* - start - start address (may not be aligned)
* - end - end address (exclusive, may not be aligned)
*/
-ENTRY(v7wbi_flush_kern_tlb_range)
+SYM_TYPED_FUNC_START(v7wbi_flush_kern_tlb_range)
dsb ish
mov r0, r0, lsr #PAGE_SHIFT @ align address
mov r1, r1, lsr #PAGE_SHIFT
@@ -86,7 +87,7 @@ ENTRY(v7wbi_flush_kern_tlb_range)
dsb ish
isb
ret lr
-ENDPROC(v7wbi_flush_kern_tlb_range)
+SYM_FUNC_END(v7wbi_flush_kern_tlb_range)
__INIT
Compiling results in:
AR vmlinux.a
LD vmlinux.o
OBJCOPY modules.builtin.modinfo
GEN modules.builtin
MODPOST vmlinux.symvers
UPD include/generated/utsversion.h
CC init/version-timestamp.o
LD .tmp_vmlinux.kallsyms1
ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_flush_user_tlb_range
>>> referenced by arch/arm/mm/tlb-v7.o:(.text+0x0) in archive vmlinux.a
ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_flush_kern_tlb_range
>>> referenced by tlb-v7.S:60 (/mnt/storage/linus/linux-integrator/build-vexpress/../arch/arm/mm/tlb-v7.S:60)
>>> arch/arm/mm/tlb-v7.o:(.text+0x40) in archive vmlinux.a
ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_tlb_fns
>>> referenced by arch/arm/mm/tlb-v7.o:(.init.text+0x0) in archive vmlinux.a
Yours,
Linus Walleij
^ permalink raw reply related [flat|nested] 25+ messages in thread* Re: [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines
2024-03-11 22:17 ` Linus Walleij
@ 2024-03-11 22:28 ` Sami Tolvanen
2024-03-11 23:56 ` Linus Walleij
0 siblings, 1 reply; 25+ messages in thread
From: Sami Tolvanen @ 2024-03-11 22:28 UTC (permalink / raw)
To: Linus Walleij
Cc: Ard Biesheuvel, Russell King (Oracle), Kees Cook,
Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
linux-arm-kernel, llvm
On Mon, Mar 11, 2024 at 3:17 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> LD .tmp_vmlinux.kallsyms1
> ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_flush_user_tlb_range
> >>> referenced by arch/arm/mm/tlb-v7.o:(.text+0x0) in archive vmlinux.a
>
> ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_flush_kern_tlb_range
> >>> referenced by tlb-v7.S:60 (/mnt/storage/linus/linux-integrator/build-vexpress/../arch/arm/mm/tlb-v7.S:60)
> >>> arch/arm/mm/tlb-v7.o:(.text+0x40) in archive vmlinux.a
>
> ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_tlb_fns
> >>> referenced by arch/arm/mm/tlb-v7.o:(.init.text+0x0) in archive vmlinux.a
Clang only emits __kcfi_typeid symbols for functions that are
address-taken in C code. You need to add __ADDRESSABLE(function)
references to a C file somewhere for functions that otherwise are not
address-taken.
Sami
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines
2024-03-11 22:28 ` Sami Tolvanen
@ 2024-03-11 23:56 ` Linus Walleij
2024-03-12 7:24 ` Ard Biesheuvel
0 siblings, 1 reply; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 23:56 UTC (permalink / raw)
To: Sami Tolvanen
Cc: Ard Biesheuvel, Russell King (Oracle), Kees Cook,
Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
linux-arm-kernel, llvm
On Mon, Mar 11, 2024 at 11:28 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> On Mon, Mar 11, 2024 at 3:17 PM Linus Walleij <linus.walleij@linaro.org> wrote:
> >
> > LD .tmp_vmlinux.kallsyms1
> > ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_flush_user_tlb_range
> > >>> referenced by arch/arm/mm/tlb-v7.o:(.text+0x0) in archive vmlinux.a
> >
> > ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_flush_kern_tlb_range
> > >>> referenced by tlb-v7.S:60 (/mnt/storage/linus/linux-integrator/build-vexpress/../arch/arm/mm/tlb-v7.S:60)
> > >>> arch/arm/mm/tlb-v7.o:(.text+0x40) in archive vmlinux.a
> >
> > ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_tlb_fns
> > >>> referenced by arch/arm/mm/tlb-v7.o:(.init.text+0x0) in archive vmlinux.a
>
> Clang only emits __kcfi_typeid symbols for functions that are
> address-taken in C code. You need to add __ADDRESSABLE(function)
> references to a C file somewhere for functions that otherwise are not
> address-taken.
Hey it works. So for example if for these functions I also add:
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index d19d140a10c7..23eb0f9358cb 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -18,6 +18,11 @@
#include "mm.h"
+void v7wbi_flush_user_tlb_range(unsigned long, unsigned long, struct
vm_area_struct *);
+void v7wbi_flush_kern_tlb_range(unsigned long, unsigned long);
+__ADDRESSABLE(v7wbi_flush_user_tlb_range);
+__ADDRESSABLE(v7wbi_flush_kern_tlb_range);
Then that works.
The problem is that I also have to define all these function signatures that
are never used in C and there are quite a few of them, if I start listing them
all and #ifdefining them for selected CPUs it's not going to be pretty.
It can be done and they can be in a cfi-defs.c file though.
And it's better than __nocfi.
The complexity comes from the fact that arm can boot a kernel
with support for several different CPU:s.
The different CPU management functions are put in a list of supported
processors by the linker, and then e.g. the tlb maintenance functions
are dereferenced directly from *list->tlb in setup_processor()
in arch/arm/kernel/setup.c.
Yours,
Linus Walleij
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines
2024-03-11 23:56 ` Linus Walleij
@ 2024-03-12 7:24 ` Ard Biesheuvel
2024-03-12 8:14 ` Linus Walleij
0 siblings, 1 reply; 25+ messages in thread
From: Ard Biesheuvel @ 2024-03-12 7:24 UTC (permalink / raw)
To: Linus Walleij
Cc: Sami Tolvanen, Russell King (Oracle), Kees Cook,
Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
linux-arm-kernel, llvm
On Tue, 12 Mar 2024 at 00:56, Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Mon, Mar 11, 2024 at 11:28 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> > On Mon, Mar 11, 2024 at 3:17 PM Linus Walleij <linus.walleij@linaro.org> wrote:
> > >
> > > LD .tmp_vmlinux.kallsyms1
> > > ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_flush_user_tlb_range
> > > >>> referenced by arch/arm/mm/tlb-v7.o:(.text+0x0) in archive vmlinux.a
> > >
> > > ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_flush_kern_tlb_range
> > > >>> referenced by tlb-v7.S:60 (/mnt/storage/linus/linux-integrator/build-vexpress/../arch/arm/mm/tlb-v7.S:60)
> > > >>> arch/arm/mm/tlb-v7.o:(.text+0x40) in archive vmlinux.a
> > >
> > > ld.lld: error: undefined symbol: __kcfi_typeid_v7wbi_tlb_fns
> > > >>> referenced by arch/arm/mm/tlb-v7.o:(.init.text+0x0) in archive vmlinux.a
> >
> > Clang only emits __kcfi_typeid symbols for functions that are
> > address-taken in C code. You need to add __ADDRESSABLE(function)
> > references to a C file somewhere for functions that otherwise are not
> > address-taken.
>
> Hey it works. So for example if for these functions I also add:
>
> diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> index d19d140a10c7..23eb0f9358cb 100644
> --- a/arch/arm/mm/flush.c
> +++ b/arch/arm/mm/flush.c
> @@ -18,6 +18,11 @@
>
> #include "mm.h"
>
> +void v7wbi_flush_user_tlb_range(unsigned long, unsigned long, struct
> vm_area_struct *);
> +void v7wbi_flush_kern_tlb_range(unsigned long, unsigned long);
> +__ADDRESSABLE(v7wbi_flush_user_tlb_range);
> +__ADDRESSABLE(v7wbi_flush_kern_tlb_range);
>
> Then that works.
>
> The problem is that I also have to define all these function signatures that
> are never used in C and there are quite a few of them, if I start listing them
> all and #ifdefining them for selected CPUs it's not going to be pretty.
>
> It can be done and they can be in a cfi-defs.c file though.
> And it's better than __nocfi.
>
> The complexity comes from the fact that arm can boot a kernel
> with support for several different CPU:s.
>
> The different CPU management functions are put in a list of supported
> processors by the linker, and then e.g. the tlb maintenance functions
> are dereferenced directly from *list->tlb in setup_processor()
> in arch/arm/kernel/setup.c.
>
Another option is to move the struct definitions to C entirely. For
example, the branch below implements this for the tlbflush code:
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-cfi
However, doing the same will be tricky for the proc_info structs, as
they have a member that contains a place-relative offset, and those
cannot be easily emitted in C (similar to the SMP_ON_UP hack in the
TLB code above).
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines
2024-03-12 7:24 ` Ard Biesheuvel
@ 2024-03-12 8:14 ` Linus Walleij
0 siblings, 0 replies; 25+ messages in thread
From: Linus Walleij @ 2024-03-12 8:14 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Sami Tolvanen, Russell King (Oracle), Kees Cook,
Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
linux-arm-kernel, llvm
On Tue, Mar 12, 2024 at 8:25 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> On Tue, 12 Mar 2024 at 00:56, Linus Walleij <linus.walleij@linaro.org> wrote:
> > On Mon, Mar 11, 2024 at 11:28 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> > > On Mon, Mar 11, 2024 at 3:17 PM Linus Walleij <linus.walleij@linaro.org> wrote:
> > +void v7wbi_flush_user_tlb_range(unsigned long, unsigned long, struct
> > vm_area_struct *);
> > +void v7wbi_flush_kern_tlb_range(unsigned long, unsigned long);
> > +__ADDRESSABLE(v7wbi_flush_user_tlb_range);
> > +__ADDRESSABLE(v7wbi_flush_kern_tlb_range);
> >
> > Then that works.
> >
> > The problem is that I also have to define all these function signatures that
> > are never used in C and there are quite a few of them, if I start listing them
> > all and #ifdefining them for selected CPUs it's not going to be pretty.
> >
> > It can be done and they can be in a cfi-defs.c file though.
> > And it's better than __nocfi.
>
> Another option is to move the struct definitions to C entirely. For
> example, the branch below implements this for the tlbflush code:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-cfi
Hm that looks very nice IMO, I think we could go for that for the
tlb functions for sure, I will just steal your patch :D
The same can probably be done for some other vtables.
> However, doing the same will be tricky for the proc_info structs, as
> they have a member that contains a place-relative offset, and those
> cannot be easily emitted in C (similar to the SMP_ON_UP hack in the
> TLB code above).
Hm hm.
I might have to do the first approach and list all functions in a file for
those, it will probably be the lesser evil.
I'll take a stab at it and see where we end up.
Yours,
Linus Walleij
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 3/9] ARM: bugs: Check in the vtable instead of defined aliases
2024-03-11 9:15 [PATCH v3 0/9] CFI for ARM32 using LLVM Linus Walleij
2024-03-11 9:15 ` [PATCH v3 1/9] ARM: Support CLANG CFI Linus Walleij
2024-03-11 9:15 ` [PATCH v3 2/9] ARM: tlbflush: Make TLB flushes into static inlines Linus Walleij
@ 2024-03-11 9:15 ` Linus Walleij
2024-03-11 9:15 ` [PATCH v3 4/9] ARM: proc: Use inlines instead of defines Linus Walleij
` (5 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 9:15 UTC (permalink / raw)
To: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Ard Biesheuvel, Arnd Bergmann
Cc: linux-arm-kernel, llvm, Linus Walleij
Instead of checking if cpu_check_bugs() exist, check for this
callback directly in the CPU vtable: this is better because the
function is just a define to the vtable entry and this is why
the code works. But we want to be able to specify a proper
function for cpu_check_bugs() so look into the vtable instead.
In bugs.c assign PROC_VTABLE(switch_mm) instead of
assigning cpu_do_switch_mm where again this is just a define
into the vtable: this makes it possible to make
cpu_do_switch_mm() into a real function.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
arch/arm/kernel/bugs.c | 2 +-
arch/arm/mm/proc-v7-bugs.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/arm/kernel/bugs.c b/arch/arm/kernel/bugs.c
index 087bce6ec8e9..35d39efb51ed 100644
--- a/arch/arm/kernel/bugs.c
+++ b/arch/arm/kernel/bugs.c
@@ -7,7 +7,7 @@
void check_other_bugs(void)
{
#ifdef MULTI_CPU
- if (cpu_check_bugs)
+ if (PROC_VTABLE(check_bugs))
cpu_check_bugs();
#endif
}
diff --git a/arch/arm/mm/proc-v7-bugs.c b/arch/arm/mm/proc-v7-bugs.c
index 8bc7a2d6d6c7..ea3ee2bd7b56 100644
--- a/arch/arm/mm/proc-v7-bugs.c
+++ b/arch/arm/mm/proc-v7-bugs.c
@@ -87,14 +87,14 @@ static unsigned int spectre_v2_install_workaround(unsigned int method)
case SPECTRE_V2_METHOD_HVC:
per_cpu(harden_branch_predictor_fn, cpu) =
call_hvc_arch_workaround_1;
- cpu_do_switch_mm = cpu_v7_hvc_switch_mm;
+ PROC_VTABLE(switch_mm) = cpu_v7_hvc_switch_mm;
spectre_v2_method = "hypervisor";
break;
case SPECTRE_V2_METHOD_SMC:
per_cpu(harden_branch_predictor_fn, cpu) =
call_smc_arch_workaround_1;
- cpu_do_switch_mm = cpu_v7_smc_switch_mm;
+ PROC_VTABLE(switch_mm) = cpu_v7_smc_switch_mm;
spectre_v2_method = "firmware";
break;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread* [PATCH v3 4/9] ARM: proc: Use inlines instead of defines
2024-03-11 9:15 [PATCH v3 0/9] CFI for ARM32 using LLVM Linus Walleij
` (2 preceding siblings ...)
2024-03-11 9:15 ` [PATCH v3 3/9] ARM: bugs: Check in the vtable instead of defined aliases Linus Walleij
@ 2024-03-11 9:15 ` Linus Walleij
2024-03-11 9:15 ` [PATCH v3 5/9] ARM: delay: Turn delay functions into static inlines Linus Walleij
` (4 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 9:15 UTC (permalink / raw)
To: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Ard Biesheuvel, Arnd Bergmann
Cc: linux-arm-kernel, llvm, Linus Walleij
We currently access the per-cpu vtable with defines such
as:
define cpu_proc_init PROC_VTABLE(_proc_init)
Convert all of these instances to static inlines instead:
static inline __nocfi void cpu_proc_init(void)
{
PROC_VTABLE(_proc_init)();
}
This has the upside that we can add the __nocfi tag to
the inline function so CFI can skip over this and work,
and we can simplify some platform code that was looking
into the symbol table to be able to call cpu_reset(),
now we can just call it instead.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
arch/arm/common/mcpm_entry.c | 10 ++------
arch/arm/include/asm/proc-fns.h | 57 +++++++++++++++++++++++++++++++++--------
arch/arm/mach-sunxi/mc_smp.c | 7 +----
3 files changed, 50 insertions(+), 24 deletions(-)
diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c
index e013ff1168d3..3e19f246caff 100644
--- a/arch/arm/common/mcpm_entry.c
+++ b/arch/arm/common/mcpm_entry.c
@@ -234,13 +234,10 @@ int mcpm_cpu_power_up(unsigned int cpu, unsigned int cluster)
return ret;
}
-typedef typeof(cpu_reset) phys_reset_t;
-
void mcpm_cpu_power_down(void)
{
unsigned int mpidr, cpu, cluster;
bool cpu_going_down, last_man;
- phys_reset_t phys_reset;
mpidr = read_cpuid_mpidr();
cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
@@ -298,8 +295,7 @@ void mcpm_cpu_power_down(void)
* the kernel as if the power_up method just had deasserted reset
* on the CPU.
*/
- phys_reset = (phys_reset_t)(unsigned long)__pa_symbol(cpu_reset);
- phys_reset(__pa_symbol(mcpm_entry_point), false);
+ cpu_reset(__pa_symbol(mcpm_entry_point), false);
/* should never get here */
BUG();
@@ -376,7 +372,6 @@ static int __init nocache_trampoline(unsigned long _arg)
unsigned int mpidr = read_cpuid_mpidr();
unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
- phys_reset_t phys_reset;
mcpm_set_entry_vector(cpu, cluster, cpu_resume_no_hyp);
setup_mm_for_reboot();
@@ -387,8 +382,7 @@ static int __init nocache_trampoline(unsigned long _arg)
__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
__mcpm_cpu_down(cpu, cluster);
- phys_reset = (phys_reset_t)(unsigned long)__pa_symbol(cpu_reset);
- phys_reset(__pa_symbol(mcpm_entry_point), false);
+ cpu_reset(__pa_symbol(mcpm_entry_point), false);
BUG();
}
diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
index 280396483f5d..9bd6bf5f901a 100644
--- a/arch/arm/include/asm/proc-fns.h
+++ b/arch/arm/include/asm/proc-fns.h
@@ -131,18 +131,55 @@ static inline void init_proc_vtable(const struct processor *p)
}
#endif
-#define cpu_proc_init PROC_VTABLE(_proc_init)
-#define cpu_check_bugs PROC_VTABLE(check_bugs)
-#define cpu_proc_fin PROC_VTABLE(_proc_fin)
-#define cpu_reset PROC_VTABLE(reset)
-#define cpu_do_idle PROC_VTABLE(_do_idle)
-#define cpu_dcache_clean_area PROC_TABLE(dcache_clean_area)
-#define cpu_set_pte_ext PROC_TABLE(set_pte_ext)
-#define cpu_do_switch_mm PROC_VTABLE(switch_mm)
+static inline void __nocfi cpu_proc_init(void)
+{
+ PROC_VTABLE(_proc_init)();
+}
+static inline void __nocfi cpu_check_bugs(void)
+{
+ PROC_VTABLE(check_bugs)();
+}
+static inline void __nocfi cpu_proc_fin(void)
+{
+ PROC_VTABLE(_proc_fin)();
+}
+static inline void __nocfi cpu_reset(unsigned long addr, bool hvc)
+{
+ PROC_VTABLE(reset)(addr, hvc);
+}
+static inline int __nocfi cpu_do_idle(void)
+{
+ return PROC_VTABLE(_do_idle)();
+}
+static inline void __nocfi cpu_dcache_clean_area(void *addr, int size)
+{
+ PROC_TABLE(dcache_clean_area)(addr, size);
+}
+#ifdef CONFIG_ARM_LPAE
+static inline void __nocfi cpu_set_pte_ext(pte_t *ptep, pte_t pte)
+{
+ PROC_TABLE(set_pte_ext)(ptep, pte);
+}
+#else
+static inline void __nocfi cpu_set_pte_ext(pte_t *ptep, pte_t pte, unsigned int ext)
+{
+ PROC_TABLE(set_pte_ext)(ptep, pte, ext);
+}
+#endif
+static inline void __nocfi cpu_do_switch_mm(phys_addr_t pgd_phys, struct mm_struct *mm)
+{
+ PROC_VTABLE(switch_mm)(pgd_phys, mm);
+}
/* These two are private to arch/arm/kernel/suspend.c */
-#define cpu_do_suspend PROC_VTABLE(do_suspend)
-#define cpu_do_resume PROC_VTABLE(do_resume)
+static inline void __nocfi cpu_do_suspend(void *p)
+{
+ PROC_VTABLE(do_suspend)(p);
+}
+static inline void __nocfi cpu_do_resume(void *p)
+{
+ PROC_VTABLE(do_resume)(p);
+}
#endif
extern void cpu_resume(void);
diff --git a/arch/arm/mach-sunxi/mc_smp.c b/arch/arm/mach-sunxi/mc_smp.c
index 277f6aa8e6c2..791eabb7d433 100644
--- a/arch/arm/mach-sunxi/mc_smp.c
+++ b/arch/arm/mach-sunxi/mc_smp.c
@@ -646,17 +646,12 @@ static bool __init sunxi_mc_smp_cpu_table_init(void)
*
* We need the trampoline code to enable CCI-400 on the first cluster
*/
-typedef typeof(cpu_reset) phys_reset_t;
-
static int __init nocache_trampoline(unsigned long __unused)
{
- phys_reset_t phys_reset;
-
setup_mm_for_reboot();
sunxi_cluster_cache_disable_without_axi();
- phys_reset = (phys_reset_t)(unsigned long)__pa_symbol(cpu_reset);
- phys_reset(__pa_symbol(sunxi_mc_smp_resume), false);
+ cpu_reset(__pa_symbol(sunxi_mc_smp_resume), false);
BUG();
}
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread* [PATCH v3 5/9] ARM: delay: Turn delay functions into static inlines
2024-03-11 9:15 [PATCH v3 0/9] CFI for ARM32 using LLVM Linus Walleij
` (3 preceding siblings ...)
2024-03-11 9:15 ` [PATCH v3 4/9] ARM: proc: Use inlines instead of defines Linus Walleij
@ 2024-03-11 9:15 ` Linus Walleij
2024-03-11 12:26 ` Ard Biesheuvel
2024-03-11 9:15 ` [PATCH v3 6/9] ARM: turn CPU cache flush " Linus Walleij
` (3 subsequent siblings)
8 siblings, 1 reply; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 9:15 UTC (permalink / raw)
To: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Ard Biesheuvel, Arnd Bergmann
Cc: linux-arm-kernel, llvm, Linus Walleij
The members of the vector table arm_delay_ops are called
directly using defines, but this is really confusing for
KCFI. Wrap the calls in static inlines and tag them with
__nocfi so things start to work.
Without this patch, platforms without a delay timer will
not boot (sticks in calibrating loop etc).
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
arch/arm/include/asm/delay.h | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/arch/arm/include/asm/delay.h b/arch/arm/include/asm/delay.h
index 1d069e558d8d..7d611b810b6c 100644
--- a/arch/arm/include/asm/delay.h
+++ b/arch/arm/include/asm/delay.h
@@ -55,7 +55,10 @@ extern struct arm_delay_ops {
unsigned long ticks_per_jiffy;
} arm_delay_ops;
-#define __delay(n) arm_delay_ops.delay(n)
+static inline void __nocfi __delay(unsigned long n)
+{
+ arm_delay_ops.delay(n);
+}
/*
* This function intentionally does not exist; if you see references to
@@ -76,8 +79,15 @@ extern void __bad_udelay(void);
* first constant multiplications gets optimized away if the delay is
* a constant)
*/
-#define __udelay(n) arm_delay_ops.udelay(n)
-#define __const_udelay(n) arm_delay_ops.const_udelay(n)
+static inline void __nocfi __udelay(unsigned long n)
+{
+ arm_delay_ops.udelay(n);
+}
+
+static inline void __nocfi __const_udelay(unsigned long n)
+{
+ arm_delay_ops.const_udelay(n);
+}
#define udelay(n) \
(__builtin_constant_p(n) ? \
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread* Re: [PATCH v3 5/9] ARM: delay: Turn delay functions into static inlines
2024-03-11 9:15 ` [PATCH v3 5/9] ARM: delay: Turn delay functions into static inlines Linus Walleij
@ 2024-03-11 12:26 ` Ard Biesheuvel
0 siblings, 0 replies; 25+ messages in thread
From: Ard Biesheuvel @ 2024-03-11 12:26 UTC (permalink / raw)
To: Linus Walleij
Cc: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Arnd Bergmann, linux-arm-kernel, llvm
On Mon, 11 Mar 2024 at 10:15, Linus Walleij <linus.walleij@linaro.org> wrote:
>
> The members of the vector table arm_delay_ops are called
> directly using defines, but this is really confusing for
> KCFI. Wrap the calls in static inlines and tag them with
> __nocfi so things start to work.
>
> Without this patch, platforms without a delay timer will
> not boot (sticks in calibrating loop etc).
>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
> arch/arm/include/asm/delay.h | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
So what goes wrong without this patch? Is it really that easy to confuse kCFI?
As far as I can tell (and I tried reverting it too), this patch should
not be needed - in general, when all function pointer variables and
the functions themselves are defined in the C domain, the compiler
should be able to figure it out.
Also, these are functions that are used on every system and called
often and in a predictable manner, so they are prime targets for the
kind of attacks that CFI is intended to harden against, so wrapping
them in __nocfi is probably something we should try to avoid.
> diff --git a/arch/arm/include/asm/delay.h b/arch/arm/include/asm/delay.h
> index 1d069e558d8d..7d611b810b6c 100644
> --- a/arch/arm/include/asm/delay.h
> +++ b/arch/arm/include/asm/delay.h
> @@ -55,7 +55,10 @@ extern struct arm_delay_ops {
> unsigned long ticks_per_jiffy;
> } arm_delay_ops;
>
> -#define __delay(n) arm_delay_ops.delay(n)
> +static inline void __nocfi __delay(unsigned long n)
> +{
> + arm_delay_ops.delay(n);
> +}
>
> /*
> * This function intentionally does not exist; if you see references to
> @@ -76,8 +79,15 @@ extern void __bad_udelay(void);
> * first constant multiplications gets optimized away if the delay is
> * a constant)
> */
> -#define __udelay(n) arm_delay_ops.udelay(n)
> -#define __const_udelay(n) arm_delay_ops.const_udelay(n)
> +static inline void __nocfi __udelay(unsigned long n)
> +{
> + arm_delay_ops.udelay(n);
> +}
> +
> +static inline void __nocfi __const_udelay(unsigned long n)
> +{
> + arm_delay_ops.const_udelay(n);
> +}
>
> #define udelay(n) \
> (__builtin_constant_p(n) ? \
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 6/9] ARM: turn CPU cache flush functions into static inlines
2024-03-11 9:15 [PATCH v3 0/9] CFI for ARM32 using LLVM Linus Walleij
` (4 preceding siblings ...)
2024-03-11 9:15 ` [PATCH v3 5/9] ARM: delay: Turn delay functions into static inlines Linus Walleij
@ 2024-03-11 9:15 ` Linus Walleij
2024-03-11 9:15 ` [PATCH v3 7/9] ARM: page: Turn highpage accesses " Linus Walleij
` (2 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 9:15 UTC (permalink / raw)
To: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Ard Biesheuvel, Arnd Bergmann
Cc: linux-arm-kernel, llvm, Linus Walleij
The members of the vector table struct cpu_cache_fns cpu_cache
are called directly using defines, but this is really confusing
for KCFI. Wrap the calls in static inlines and tag them with
__nocfi so things start to work.
Conversely a similar approach is used for the __glue() helpers
which define their way into an assembly ENTRY(symbol) for respective
CPU variant. We wrap these into static inlines and prefix them
with __nocfi as well. (This happens on !MULTI_CACHE systems.)
For this case we also need to invoke the __glue() macro to
provide a proper function prototype for the inner function.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
arch/arm/include/asm/cacheflush.h | 45 +++++++++++++++++++++++++++++++--------
arch/arm/mm/dma.h | 28 ++++++++++++++++++------
2 files changed, 58 insertions(+), 15 deletions(-)
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 1075534b0a2e..76fb665162a4 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -122,14 +122,38 @@ struct cpu_cache_fns {
extern struct cpu_cache_fns cpu_cache;
-#define __cpuc_flush_icache_all cpu_cache.flush_icache_all
-#define __cpuc_flush_kern_all cpu_cache.flush_kern_all
-#define __cpuc_flush_kern_louis cpu_cache.flush_kern_louis
-#define __cpuc_flush_user_all cpu_cache.flush_user_all
-#define __cpuc_flush_user_range cpu_cache.flush_user_range
-#define __cpuc_coherent_kern_range cpu_cache.coherent_kern_range
-#define __cpuc_coherent_user_range cpu_cache.coherent_user_range
-#define __cpuc_flush_dcache_area cpu_cache.flush_kern_dcache_area
+static inline void __nocfi __cpuc_flush_icache_all(void)
+{
+ cpu_cache.flush_icache_all();
+}
+static inline void __nocfi __cpuc_flush_kern_all(void)
+{
+ cpu_cache.flush_icache_all();
+}
+static inline void __nocfi __cpuc_flush_kern_louis(void)
+{
+ cpu_cache.flush_kern_louis();
+}
+static inline void __nocfi __cpuc_flush_user_all(void)
+{
+ cpu_cache.flush_user_all();
+}
+static inline void __nocfi __cpuc_flush_user_range(unsigned long start, unsigned long end, unsigned int flags)
+{
+ cpu_cache.flush_user_range(start, end, flags);
+}
+static inline void __nocfi __cpuc_coherent_kern_range(unsigned long start, unsigned long end)
+{
+ cpu_cache.coherent_kern_range(start, end);
+}
+static inline int __nocfi __cpuc_coherent_user_range(unsigned long start, unsigned long end)
+{
+ return cpu_cache.coherent_user_range(start, end);
+}
+static inline void __nocfi __cpuc_flush_dcache_area(void *kaddr, size_t sz)
+{
+ cpu_cache.flush_kern_dcache_area(kaddr, sz);
+}
/*
* These are private to the dma-mapping API. Do not use directly.
@@ -137,7 +161,10 @@ extern struct cpu_cache_fns cpu_cache;
* is visible to DMA, or data written by DMA to system memory is
* visible to the CPU.
*/
-#define dmac_flush_range cpu_cache.dma_flush_range
+static inline void __nocfi dmac_flush_range(const void *start, const void *end)
+{
+ cpu_cache.dma_flush_range(start, end);
+}
#else
diff --git a/arch/arm/mm/dma.h b/arch/arm/mm/dma.h
index aaef64b7f177..251b8a9fffc1 100644
--- a/arch/arm/mm/dma.h
+++ b/arch/arm/mm/dma.h
@@ -5,8 +5,6 @@
#include <asm/glue-cache.h>
#ifndef MULTI_CACHE
-#define dmac_map_area __glue(_CACHE,_dma_map_area)
-#define dmac_unmap_area __glue(_CACHE,_dma_unmap_area)
/*
* These are private to the dma-mapping API. Do not use directly.
@@ -14,8 +12,20 @@
* is visible to DMA, or data written by DMA to system memory is
* visible to the CPU.
*/
-extern void dmac_map_area(const void *, size_t, int);
-extern void dmac_unmap_area(const void *, size_t, int);
+
+/* These turn into function declarations for each per-CPU glue function */
+void __glue(_CACHE,_dma_map_area)(const void *, size_t, int);
+void __glue(_CACHE,_dma_unmap_area)(const void *, size_t, int);
+
+static inline void __nocfi dmac_map_area(const void *start, size_t sz, int flags)
+{
+ __glue(_CACHE,_dma_map_area)(start, sz, flags);
+}
+
+static inline void __nocfi dmac_unmap_area(const void *start, size_t sz, int flags)
+{
+ __glue(_CACHE,_dma_unmap_area)(start, sz, flags);
+}
#else
@@ -25,8 +35,14 @@ extern void dmac_unmap_area(const void *, size_t, int);
* is visible to DMA, or data written by DMA to system memory is
* visible to the CPU.
*/
-#define dmac_map_area cpu_cache.dma_map_area
-#define dmac_unmap_area cpu_cache.dma_unmap_area
+static inline void __nocfi dmac_map_area(const void *start, size_t sz, int flags)
+{
+ cpu_cache.dma_map_area(start, sz, flags);
+}
+static inline void __nocfi dmac_unmap_area(const void *start, size_t sz, int flags)
+{
+ cpu_cache.dma_unmap_area(start, sz, flags);
+}
#endif
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread* [PATCH v3 7/9] ARM: page: Turn highpage accesses into static inlines
2024-03-11 9:15 [PATCH v3 0/9] CFI for ARM32 using LLVM Linus Walleij
` (5 preceding siblings ...)
2024-03-11 9:15 ` [PATCH v3 6/9] ARM: turn CPU cache flush " Linus Walleij
@ 2024-03-11 9:15 ` Linus Walleij
2024-03-11 12:15 ` Ard Biesheuvel
2024-03-11 9:15 ` [PATCH v3 8/9] ARM: ftrace: Define ftrace_stub_graph Linus Walleij
2024-03-11 9:15 ` [PATCH v3 9/9] ARM: KCFI: Allow permissive CFI mode Linus Walleij
8 siblings, 1 reply; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 9:15 UTC (permalink / raw)
To: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Ard Biesheuvel, Arnd Bergmann
Cc: linux-arm-kernel, llvm, Linus Walleij
Clearing and copying pages in highmem uses either the cpu_user
vtable or the __glue() assembler stubs to call into per-CPU
versions of these functions.
This is all really confusing for KCFI so wrap these into static
inlines and prefix each inline function with __nocfi.
__cpu_clear_user_highpage() and __cpu_copy_user_highpage() are
exported in arch/arm/mm/proc-syms.c which causes a problem with
using static inlines, but it turns out that these exports are
completely unused, so we can just delete them.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
arch/arm/include/asm/page.h | 36 +++++++++++++++++++++++++++++-------
arch/arm/mm/proc-syms.c | 7 +------
2 files changed, 30 insertions(+), 13 deletions(-)
diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 119aa85d1feb..8bf297228627 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -138,17 +138,39 @@ void xscale_mc_clear_user_highpage(struct page *page, unsigned long vaddr);
#ifdef MULTI_USER
extern struct cpu_user_fns cpu_user;
-#define __cpu_clear_user_highpage cpu_user.cpu_clear_user_highpage
-#define __cpu_copy_user_highpage cpu_user.cpu_copy_user_highpage
+static inline void __nocfi __cpu_clear_user_highpage(struct page *page,
+ unsigned long vaddr)
+{
+ cpu_user.cpu_clear_user_highpage(page, vaddr);
+}
+
+static inline void __nocfi __cpu_copy_user_highpage(struct page *to,
+ struct page *from, unsigned long vaddr,
+ struct vm_area_struct *vma)
+{
+ cpu_user.cpu_copy_user_highpage(to, from, vaddr, vma);
+}
#else
-#define __cpu_clear_user_highpage __glue(_USER,_clear_user_highpage)
-#define __cpu_copy_user_highpage __glue(_USER,_copy_user_highpage)
+/* These turn into function declarations for each per-CPU glue function */
+void __glue(_USER,_clear_user_highpage)(struct page *page, unsigned long vaddr);
+void __glue(_USER,_copy_user_highpage)(struct page *to, struct page *from,
+ unsigned long vaddr, struct vm_area_struct *vma);
+
+static inline void __nocfi __cpu_clear_user_highpage(struct page *page,
+ unsigned long vaddr)
+{
+ __glue(_USER,_clear_user_highpage)(page, vaddr);
+}
+
+static inline void __nocfi __cpu_copy_user_highpage(struct page *to,
+ struct page *from, unsigned long vaddr,
+ struct vm_area_struct *vma)
+{
+ __glue(_USER,_copy_user_highpage)(to, from, vaddr, vma);
+}
-extern void __cpu_clear_user_highpage(struct page *page, unsigned long vaddr);
-extern void __cpu_copy_user_highpage(struct page *to, struct page *from,
- unsigned long vaddr, struct vm_area_struct *vma);
#endif
#define clear_user_highpage(page,vaddr) \
diff --git a/arch/arm/mm/proc-syms.c b/arch/arm/mm/proc-syms.c
index e21249548e9f..c93fec38d9f4 100644
--- a/arch/arm/mm/proc-syms.c
+++ b/arch/arm/mm/proc-syms.c
@@ -31,14 +31,9 @@ EXPORT_SYMBOL(__cpuc_flush_dcache_area);
EXPORT_SYMBOL(cpu_cache);
#endif
-#ifdef CONFIG_MMU
-#ifndef MULTI_USER
-EXPORT_SYMBOL(__cpu_clear_user_highpage);
-EXPORT_SYMBOL(__cpu_copy_user_highpage);
-#else
+#if defined(CONFIG_MMU) && defined(MULTI_USER)
EXPORT_SYMBOL(cpu_user);
#endif
-#endif
/*
* No module should need to touch the TLB (and currently
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread* Re: [PATCH v3 7/9] ARM: page: Turn highpage accesses into static inlines
2024-03-11 9:15 ` [PATCH v3 7/9] ARM: page: Turn highpage accesses " Linus Walleij
@ 2024-03-11 12:15 ` Ard Biesheuvel
2024-03-28 8:18 ` Linus Walleij
0 siblings, 1 reply; 25+ messages in thread
From: Ard Biesheuvel @ 2024-03-11 12:15 UTC (permalink / raw)
To: Linus Walleij
Cc: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Arnd Bergmann, linux-arm-kernel, llvm
On Mon, 11 Mar 2024 at 10:15, Linus Walleij <linus.walleij@linaro.org> wrote:
>
> Clearing and copying pages in highmem uses either the cpu_user
> vtable or the __glue() assembler stubs to call into per-CPU
> versions of these functions.
>
> This is all really confusing for KCFI so wrap these into static
> inlines and prefix each inline function with __nocfi.
>
> __cpu_clear_user_highpage() and __cpu_copy_user_highpage() are
> exported in arch/arm/mm/proc-syms.c which causes a problem with
> using static inlines, but it turns out that these exports are
> completely unused, so we can just delete them.
>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
> arch/arm/include/asm/page.h | 36 +++++++++++++++++++++++++++++-------
> arch/arm/mm/proc-syms.c | 7 +------
> 2 files changed, 30 insertions(+), 13 deletions(-)
>
Are you sure this patch is needed?
> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
> index 119aa85d1feb..8bf297228627 100644
> --- a/arch/arm/include/asm/page.h
> +++ b/arch/arm/include/asm/page.h
> @@ -138,17 +138,39 @@ void xscale_mc_clear_user_highpage(struct page *page, unsigned long vaddr);
> #ifdef MULTI_USER
> extern struct cpu_user_fns cpu_user;
>
> -#define __cpu_clear_user_highpage cpu_user.cpu_clear_user_highpage
> -#define __cpu_copy_user_highpage cpu_user.cpu_copy_user_highpage
> +static inline void __nocfi __cpu_clear_user_highpage(struct page *page,
> + unsigned long vaddr)
> +{
> + cpu_user.cpu_clear_user_highpage(page, vaddr);
> +}
> +
> +static inline void __nocfi __cpu_copy_user_highpage(struct page *to,
> + struct page *from, unsigned long vaddr,
> + struct vm_area_struct *vma)
> +{
> + cpu_user.cpu_copy_user_highpage(to, from, vaddr, vma);
> +}
>
> #else
>
> -#define __cpu_clear_user_highpage __glue(_USER,_clear_user_highpage)
> -#define __cpu_copy_user_highpage __glue(_USER,_copy_user_highpage)
> +/* These turn into function declarations for each per-CPU glue function */
> +void __glue(_USER,_clear_user_highpage)(struct page *page, unsigned long vaddr);
> +void __glue(_USER,_copy_user_highpage)(struct page *to, struct page *from,
> + unsigned long vaddr, struct vm_area_struct *vma);
> +
> +static inline void __nocfi __cpu_clear_user_highpage(struct page *page,
> + unsigned long vaddr)
> +{
> + __glue(_USER,_clear_user_highpage)(page, vaddr);
> +}
> +
> +static inline void __nocfi __cpu_copy_user_highpage(struct page *to,
> + struct page *from, unsigned long vaddr,
> + struct vm_area_struct *vma)
> +{
> + __glue(_USER,_copy_user_highpage)(to, from, vaddr, vma);
> +}
>
> -extern void __cpu_clear_user_highpage(struct page *page, unsigned long vaddr);
> -extern void __cpu_copy_user_highpage(struct page *to, struct page *from,
> - unsigned long vaddr, struct vm_area_struct *vma);
> #endif
>
> #define clear_user_highpage(page,vaddr) \
> diff --git a/arch/arm/mm/proc-syms.c b/arch/arm/mm/proc-syms.c
> index e21249548e9f..c93fec38d9f4 100644
> --- a/arch/arm/mm/proc-syms.c
> +++ b/arch/arm/mm/proc-syms.c
> @@ -31,14 +31,9 @@ EXPORT_SYMBOL(__cpuc_flush_dcache_area);
> EXPORT_SYMBOL(cpu_cache);
> #endif
>
> -#ifdef CONFIG_MMU
> -#ifndef MULTI_USER
> -EXPORT_SYMBOL(__cpu_clear_user_highpage);
> -EXPORT_SYMBOL(__cpu_copy_user_highpage);
> -#else
> +#if defined(CONFIG_MMU) && defined(MULTI_USER)
> EXPORT_SYMBOL(cpu_user);
> #endif
> -#endif
>
> /*
> * No module should need to touch the TLB (and currently
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 25+ messages in thread* Re: [PATCH v3 7/9] ARM: page: Turn highpage accesses into static inlines
2024-03-11 12:15 ` Ard Biesheuvel
@ 2024-03-28 8:18 ` Linus Walleij
0 siblings, 0 replies; 25+ messages in thread
From: Linus Walleij @ 2024-03-28 8:18 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Arnd Bergmann, linux-arm-kernel, llvm
On Mon, Mar 11, 2024 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> On Mon, 11 Mar 2024 at 10:15, Linus Walleij <linus.walleij@linaro.org> wrote:
> >
> > Clearing and copying pages in highmem uses either the cpu_user
> > vtable or the __glue() assembler stubs to call into per-CPU
> > versions of these functions.
> >
> > This is all really confusing for KCFI so wrap these into static
> > inlines and prefix each inline function with __nocfi.
> >
> > __cpu_clear_user_highpage() and __cpu_copy_user_highpage() are
> > exported in arch/arm/mm/proc-syms.c which causes a problem with
> > using static inlines, but it turns out that these exports are
> > completely unused, so we can just delete them.
> >
> > Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> > ---
> > arch/arm/include/asm/page.h | 36 +++++++++++++++++++++++++++++-------
> > arch/arm/mm/proc-syms.c | 7 +------
> > 2 files changed, 30 insertions(+), 13 deletions(-)
> >
>
> Are you sure this patch is needed?
It's not needed, it was a development artifact from fixing the highmem access
before fixing cache and TLB flush. The highmem code did a bunch of that.
Posting v4!
Yours,
Linus Walleij
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 8/9] ARM: ftrace: Define ftrace_stub_graph
2024-03-11 9:15 [PATCH v3 0/9] CFI for ARM32 using LLVM Linus Walleij
` (6 preceding siblings ...)
2024-03-11 9:15 ` [PATCH v3 7/9] ARM: page: Turn highpage accesses " Linus Walleij
@ 2024-03-11 9:15 ` Linus Walleij
2024-03-11 9:15 ` [PATCH v3 9/9] ARM: KCFI: Allow permissive CFI mode Linus Walleij
8 siblings, 0 replies; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 9:15 UTC (permalink / raw)
To: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Ard Biesheuvel, Arnd Bergmann
Cc: linux-arm-kernel, llvm, Linus Walleij
Several architectures defines this stub for the graph tracer,
and it is needed for CFI, as it needs a separate symbol for it.
The trick from include/asm-generic/vmlinux.lds.h to define
ftrace_stub_graph to ftrace_stub isn't working when using CFI.
Commit 883bbbffa5a4 contains the details.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
arch/arm/kernel/entry-ftrace.S | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index 3e7bcaca5e07..bc598e3d8dd2 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -271,6 +271,10 @@ ENTRY(ftrace_stub)
ret lr
ENDPROC(ftrace_stub)
+ENTRY(ftrace_stub_graph)
+ ret lr
+ENDPROC(ftrace_stub_graph)
+
#ifdef CONFIG_DYNAMIC_FTRACE
__INIT
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread* [PATCH v3 9/9] ARM: KCFI: Allow permissive CFI mode
2024-03-11 9:15 [PATCH v3 0/9] CFI for ARM32 using LLVM Linus Walleij
` (7 preceding siblings ...)
2024-03-11 9:15 ` [PATCH v3 8/9] ARM: ftrace: Define ftrace_stub_graph Linus Walleij
@ 2024-03-11 9:15 ` Linus Walleij
2024-03-11 22:03 ` Kees Cook
8 siblings, 1 reply; 25+ messages in thread
From: Linus Walleij @ 2024-03-11 9:15 UTC (permalink / raw)
To: Russell King, Sami Tolvanen, Kees Cook, Nathan Chancellor,
Nick Desaulniers, Ard Biesheuvel, Arnd Bergmann
Cc: linux-arm-kernel, llvm, Linus Walleij
This registers a breakpoint handler for the new breakpoint type
(0x03) inserted by LLVM CLANG for CFI breakpoints.
If we are in permissive mode, just print a backtrace and continue.
Example with CONFIG_CFI_PERMISSIVE enabled:
> echo CFI_FORWARD_PROTO > /sys/kernel/debug/provoke-crash/DIRECT
lkdtm: Performing direct entry CFI_FORWARD_PROTO
lkdtm: Calling matched prototype ...
lkdtm: Calling mismatched prototype ...
CFI failure at lkdtm_indirect_call+0x40/0x4c (target: 0x0; expected type: 0x00000000)
WARNING: CPU: 1 PID: 112 at lkdtm_indirect_call+0x40/0x4c
CPU: 1 PID: 112 Comm: sh Not tainted 6.8.0-rc1+ #150
Hardware name: ARM-Versatile Express
(...)
lkdtm: FAIL: survived mismatched prototype function call!
lkdtm: Unexpected! This kernel (6.8.0-rc1+ armv7l) was built with CONFIG_CFI_CLANG=y
As you can see the LKDTM test fails, but I expect that this would be
expected behaviour in the permissive mode.
We are currently not implementing target and type for the CFI
breakpoint as this requires additional operand bundling compiler
extensions.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
arch/arm/include/asm/hw_breakpoint.h | 1 +
arch/arm/kernel/hw_breakpoint.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 31 insertions(+)
diff --git a/arch/arm/include/asm/hw_breakpoint.h b/arch/arm/include/asm/hw_breakpoint.h
index 62358d3ca0a8..e7f9961c53b2 100644
--- a/arch/arm/include/asm/hw_breakpoint.h
+++ b/arch/arm/include/asm/hw_breakpoint.h
@@ -84,6 +84,7 @@ static inline void decode_ctrl_reg(u32 reg,
#define ARM_DSCR_MOE(x) ((x >> 2) & 0xf)
#define ARM_ENTRY_BREAKPOINT 0x1
#define ARM_ENTRY_ASYNC_WATCHPOINT 0x2
+#define ARM_ENTRY_CFI_BREAKPOINT 0x3
#define ARM_ENTRY_SYNC_WATCHPOINT 0xa
/* DSCR monitor/halting bits. */
diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index dc0fb7a81371..61a984b83bfe 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -17,6 +17,7 @@
#include <linux/perf_event.h>
#include <linux/hw_breakpoint.h>
#include <linux/smp.h>
+#include <linux/cfi.h>
#include <linux/cpu_pm.h>
#include <linux/coresight.h>
@@ -903,6 +904,32 @@ static void breakpoint_handler(unsigned long unknown, struct pt_regs *regs)
watchpoint_single_step_handler(addr);
}
+#ifdef CONFIG_CFI_CLANG
+static void hw_breakpoint_cfi_handler(struct pt_regs *regs)
+{
+ /* TODO: implementing target and type requires compiler work */
+ unsigned long target = 0;
+ u32 type = 0;
+
+ switch (report_cfi_failure(regs, instruction_pointer(regs), &target, type)) {
+ case BUG_TRAP_TYPE_BUG:
+ die("Oops - CFI", regs, 0);
+ break;
+ case BUG_TRAP_TYPE_WARN:
+ /* Skip the breaking instruction */
+ instruction_pointer(regs) += 4;
+ break;
+ default:
+ pr_crit("Unknown CFI error\n");
+ break;
+ }
+}
+#else
+static void hw_breakpoint_cfi_handler(struct pt_regs *regs)
+{
+}
+#endif
+
/*
* Called from either the Data Abort Handler [watchpoint] or the
* Prefetch Abort Handler [breakpoint] with interrupts disabled.
@@ -932,6 +959,9 @@ static int hw_breakpoint_pending(unsigned long addr, unsigned int fsr,
case ARM_ENTRY_SYNC_WATCHPOINT:
watchpoint_handler(addr, fsr, regs);
break;
+ case ARM_ENTRY_CFI_BREAKPOINT:
+ hw_breakpoint_cfi_handler(regs);
+ break;
default:
ret = 1; /* Unhandled fault. */
}
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread* Re: [PATCH v3 9/9] ARM: KCFI: Allow permissive CFI mode
2024-03-11 9:15 ` [PATCH v3 9/9] ARM: KCFI: Allow permissive CFI mode Linus Walleij
@ 2024-03-11 22:03 ` Kees Cook
0 siblings, 0 replies; 25+ messages in thread
From: Kees Cook @ 2024-03-11 22:03 UTC (permalink / raw)
To: Linus Walleij
Cc: Russell King, Sami Tolvanen, Nathan Chancellor, Nick Desaulniers,
Ard Biesheuvel, Arnd Bergmann, linux-arm-kernel, llvm
On Mon, Mar 11, 2024 at 10:15:46AM +0100, Linus Walleij wrote:
> This registers a breakpoint handler for the new breakpoint type
> (0x03) inserted by LLVM CLANG for CFI breakpoints.
>
> If we are in permissive mode, just print a backtrace and continue.
>
> Example with CONFIG_CFI_PERMISSIVE enabled:
>
> > echo CFI_FORWARD_PROTO > /sys/kernel/debug/provoke-crash/DIRECT
> lkdtm: Performing direct entry CFI_FORWARD_PROTO
> lkdtm: Calling matched prototype ...
> lkdtm: Calling mismatched prototype ...
> CFI failure at lkdtm_indirect_call+0x40/0x4c (target: 0x0; expected type: 0x00000000)
> WARNING: CPU: 1 PID: 112 at lkdtm_indirect_call+0x40/0x4c
> CPU: 1 PID: 112 Comm: sh Not tainted 6.8.0-rc1+ #150
> Hardware name: ARM-Versatile Express
> (...)
> lkdtm: FAIL: survived mismatched prototype function call!
> lkdtm: Unexpected! This kernel (6.8.0-rc1+ armv7l) was built with CONFIG_CFI_CLANG=y
>
> As you can see the LKDTM test fails, but I expect that this would be
> expected behaviour in the permissive mode.
>
> We are currently not implementing target and type for the CFI
> breakpoint as this requires additional operand bundling compiler
> extensions.
>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
> arch/arm/include/asm/hw_breakpoint.h | 1 +
> arch/arm/kernel/hw_breakpoint.c | 30 ++++++++++++++++++++++++++++++
> 2 files changed, 31 insertions(+)
>
> diff --git a/arch/arm/include/asm/hw_breakpoint.h b/arch/arm/include/asm/hw_breakpoint.h
> index 62358d3ca0a8..e7f9961c53b2 100644
> --- a/arch/arm/include/asm/hw_breakpoint.h
> +++ b/arch/arm/include/asm/hw_breakpoint.h
> @@ -84,6 +84,7 @@ static inline void decode_ctrl_reg(u32 reg,
> #define ARM_DSCR_MOE(x) ((x >> 2) & 0xf)
> #define ARM_ENTRY_BREAKPOINT 0x1
> #define ARM_ENTRY_ASYNC_WATCHPOINT 0x2
> +#define ARM_ENTRY_CFI_BREAKPOINT 0x3
> #define ARM_ENTRY_SYNC_WATCHPOINT 0xa
>
> /* DSCR monitor/halting bits. */
> diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
> index dc0fb7a81371..61a984b83bfe 100644
> --- a/arch/arm/kernel/hw_breakpoint.c
> +++ b/arch/arm/kernel/hw_breakpoint.c
> @@ -17,6 +17,7 @@
> #include <linux/perf_event.h>
> #include <linux/hw_breakpoint.h>
> #include <linux/smp.h>
> +#include <linux/cfi.h>
> #include <linux/cpu_pm.h>
> #include <linux/coresight.h>
>
> @@ -903,6 +904,32 @@ static void breakpoint_handler(unsigned long unknown, struct pt_regs *regs)
> watchpoint_single_step_handler(addr);
> }
>
> +#ifdef CONFIG_CFI_CLANG
> +static void hw_breakpoint_cfi_handler(struct pt_regs *regs)
> +{
> + /* TODO: implementing target and type requires compiler work */
> + unsigned long target = 0;
> + u32 type = 0;
> +
> + switch (report_cfi_failure(regs, instruction_pointer(regs), &target, type)) {
> + case BUG_TRAP_TYPE_BUG:
> + die("Oops - CFI", regs, 0);
> + break;
> + case BUG_TRAP_TYPE_WARN:
> + /* Skip the breaking instruction */
> + instruction_pointer(regs) += 4;
> + break;
This looks much better; thanks!
> + default:
> + pr_crit("Unknown CFI error\n");
> + break;
For something like CFI, I think it would be better to fail closed. i.e.:
die("Unknown CFI error", regs, 0);
> + }
> +}
> +#else
> +static void hw_breakpoint_cfi_handler(struct pt_regs *regs)
> +{
> +}
> +#endif
> +
> /*
> * Called from either the Data Abort Handler [watchpoint] or the
> * Prefetch Abort Handler [breakpoint] with interrupts disabled.
> @@ -932,6 +959,9 @@ static int hw_breakpoint_pending(unsigned long addr, unsigned int fsr,
> case ARM_ENTRY_SYNC_WATCHPOINT:
> watchpoint_handler(addr, fsr, regs);
> break;
> + case ARM_ENTRY_CFI_BREAKPOINT:
> + hw_breakpoint_cfi_handler(regs);
> + break;
> default:
> ret = 1; /* Unhandled fault. */
> }
>
> --
> 2.34.1
>
--
Kees Cook
^ permalink raw reply [flat|nested] 25+ messages in thread