Building the Linux kernel with Clang and LLVM
* [PATCH v4 01/12] riscv: ftrace: support fastcc in Clang for WITH_ARGS
@ 2025-04-07 18:08 Andy Chiu
  2025-06-02 22:12 ` patchwork-bot+linux-riscv
       [not found] ` <20250407180838.42877-10-andybnac@gmail.com>
  0 siblings, 2 replies; 9+ messages in thread
From: Andy Chiu @ 2025-04-07 18:08 UTC (permalink / raw)
  To: linux-riscv, alexghiti, palmer
  Cc: Andy Chiu, Evgenii Shatokhin, Nathan Chancellor,
	Björn Töpel, Palmer Dabbelt, Puranjay Mohan,
	linux-kernel, linux-trace-kernel, llvm, Mark Rutland,
	Alexandre Ghiti, Nick Desaulniers, Bill Wendling, Justin Stitt,
	puranjay12, paul.walmsley, greentime.hu, nick.hu, nylon.chen,
	eric.lin, vicent.chen, zong.li, yongxuan.wang, samuel.holland,
	olivia.chu, c2232430

From: Andy Chiu <andy.chiu@sifive.com>

Some caller-saved registers which are not defined as function arguments
in the ABI can still be passed as arguments when the kernel is compiled
with Clang [1]. As a result, we must save and restore those registers to
prevent ftrace from clobbering them.

- [1]: https://reviews.llvm.org/D68559

Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Closes: https://lore.kernel.org/linux-riscv/7e7c7914-445d-426d-89a0-59a9199c45b1@yadro.com/
Fixes: 7caa9765465f ("ftrace: riscv: move from REGS to ARGS")
Acked-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>

---
Changelogs v4:
 - Add a fix tag (Björn, Evgenii)
---
 arch/riscv/include/asm/ftrace.h |  7 +++++++
 arch/riscv/kernel/asm-offsets.c |  7 +++++++
 arch/riscv/kernel/mcount-dyn.S  | 16 ++++++++++++++--
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index d627f63ee289..d8b2138bd9c6 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -146,6 +146,13 @@ struct __arch_ftrace_regs {
 			unsigned long a5;
 			unsigned long a6;
 			unsigned long a7;
+#ifdef CONFIG_CC_IS_CLANG
+			unsigned long t2;
+			unsigned long t3;
+			unsigned long t4;
+			unsigned long t5;
+			unsigned long t6;
+#endif
 		};
 	};
 };
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index 16490755304e..7c43c8e26ae7 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -501,6 +501,13 @@ void asm_offsets(void)
 	DEFINE(FREGS_SP,	    offsetof(struct __arch_ftrace_regs, sp));
 	DEFINE(FREGS_S0,	    offsetof(struct __arch_ftrace_regs, s0));
 	DEFINE(FREGS_T1,	    offsetof(struct __arch_ftrace_regs, t1));
+#ifdef CONFIG_CC_IS_CLANG
+	DEFINE(FREGS_T2,	    offsetof(struct __arch_ftrace_regs, t2));
+	DEFINE(FREGS_T3,	    offsetof(struct __arch_ftrace_regs, t3));
+	DEFINE(FREGS_T4,	    offsetof(struct __arch_ftrace_regs, t4));
+	DEFINE(FREGS_T5,	    offsetof(struct __arch_ftrace_regs, t5));
+	DEFINE(FREGS_T6,	    offsetof(struct __arch_ftrace_regs, t6));
+#endif
 	DEFINE(FREGS_A0,	    offsetof(struct __arch_ftrace_regs, a0));
 	DEFINE(FREGS_A1,	    offsetof(struct __arch_ftrace_regs, a1));
 	DEFINE(FREGS_A2,	    offsetof(struct __arch_ftrace_regs, a2));
diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S
index 745dd4c4a69c..e988bd26b28b 100644
--- a/arch/riscv/kernel/mcount-dyn.S
+++ b/arch/riscv/kernel/mcount-dyn.S
@@ -96,7 +96,13 @@
 	REG_S	x8,  FREGS_S0(sp)
 #endif
 	REG_S	x6,  FREGS_T1(sp)
-
+#ifdef CONFIG_CC_IS_CLANG
+	REG_S	x7,  FREGS_T2(sp)
+	REG_S	x28, FREGS_T3(sp)
+	REG_S	x29, FREGS_T4(sp)
+	REG_S	x30, FREGS_T5(sp)
+	REG_S	x31, FREGS_T6(sp)
+#endif
 	// save the arguments
 	REG_S	x10, FREGS_A0(sp)
 	REG_S	x11, FREGS_A1(sp)
@@ -115,7 +121,13 @@
 	REG_L	x8, FREGS_S0(sp)
 #endif
 	REG_L	x6,  FREGS_T1(sp)
-
+#ifdef CONFIG_CC_IS_CLANG
+	REG_L	x7,  FREGS_T2(sp)
+	REG_L	x28, FREGS_T3(sp)
+	REG_L	x29, FREGS_T4(sp)
+	REG_L	x30, FREGS_T5(sp)
+	REG_L	x31, FREGS_T6(sp)
+#endif
 	// restore the arguments
 	REG_L	x10, FREGS_A0(sp)
 	REG_L	x11, FREGS_A1(sp)
-- 
2.39.3 (Apple Git-145)
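
For readers following along, here is a minimal user-space sketch of the
layout change in the patch above. It assumes a simplified field order
(epc, ra, sp, s0, t1, a0-a7) rather than the kernel's real
struct __arch_ftrace_regs, and SKETCH_CC_IS_CLANG is a hypothetical
stand-in for CONFIG_CC_IS_CLANG. Every sketch_* name is illustrative only.

```c
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_CC_IS_CLANG; 1 mimics a Clang build. */
#ifndef SKETCH_CC_IS_CLANG
#define SKETCH_CC_IS_CLANG 1
#endif

/*
 * Simplified model of the register frame that the ftrace trampoline saves.
 * The field order before a0 is an assumption, not the kernel's definition.
 */
struct sketch_ftrace_regs {
	unsigned long epc;
	unsigned long ra;
	unsigned long sp;
	unsigned long s0;
	unsigned long t1;
	unsigned long a0, a1, a2, a3, a4, a5, a6, a7;
#if SKETCH_CC_IS_CLANG
	/* Extra temporaries that Clang's fastcc may use to pass arguments. */
	unsigned long t2, t3, t4, t5, t6;
#endif
};

/* Number of machine words the trampoline has to save and restore. */
static size_t sketch_frame_words(void)
{
	return sizeof(struct sketch_ftrace_regs) / sizeof(unsigned long);
}

#if SKETCH_CC_IS_CLANG
/* Returns 1 when t2 sits directly after a7, as in the struct hunk above. */
static int sketch_t2_follows_a7(void)
{
	return offsetof(struct sketch_ftrace_regs, t2) ==
	       offsetof(struct sketch_ftrace_regs, a7) + sizeof(unsigned long);
}
#endif
```

Under these assumptions the frame is 18 words with SKETCH_CC_IS_CLANG set
to 1 and 13 words with it set to 0, which matches the FREGS_T2..FREGS_T6
offsets only being defined for Clang builds.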



* Re: [PATCH v4 01/12] riscv: ftrace: support fastcc in Clang for WITH_ARGS
  2025-04-07 18:08 [PATCH v4 01/12] riscv: ftrace: support fastcc in Clang for WITH_ARGS Andy Chiu
@ 2025-06-02 22:12 ` patchwork-bot+linux-riscv
       [not found] ` <20250407180838.42877-10-andybnac@gmail.com>
  1 sibling, 0 replies; 9+ messages in thread
From: patchwork-bot+linux-riscv @ 2025-06-02 22:12 UTC (permalink / raw)
  To: Andy Chiu
  Cc: linux-riscv, alexghiti, palmer, andy.chiu, e.shatokhin, nathan,
	bjorn, palmer, puranjay, linux-kernel, linux-trace-kernel, llvm,
	mark.rutland, alex, nick.desaulniers+lkml, morbo, justinstitt,
	puranjay12, paul.walmsley, greentime.hu, nick.hu, nylon.chen,
	eric.lin, vicent.chen, zong.li, yongxuan.wang, samuel.holland,
	olivia.chu, c2232430

Hello:

This series was applied to riscv/linux.git (for-next)
by Alexandre Ghiti <alexghiti@rivosinc.com>:

On Tue,  8 Apr 2025 02:08:25 +0800 you wrote:
> From: Andy Chiu <andy.chiu@sifive.com>
> 
> Some caller-saved registers which are not defined as function arguments
> in the ABI can still be passed as arguments when the kernel is compiled
> with Clang. As a result, we must save and restore those registers to
> prevent ftrace from clobbering them.
> 
> [...]

Here is the summary with links:
  - [v4,01/12] riscv: ftrace: support fastcc in Clang for WITH_ARGS
    https://git.kernel.org/riscv/c/7cecf4f30c33
  - [v4,02/12] riscv: ftrace factor out code defined by !WITH_ARG
    https://git.kernel.org/riscv/c/2efa234f5e0c
  - [v4,03/12] riscv: ftrace: align patchable functions to 4 Byte boundary
    https://git.kernel.org/riscv/c/cced570c2c0c
  - [v4,04/12] kernel: ftrace: export ftrace_sync_ipi
    (no matching commit)
  - [v4,05/12] riscv: ftrace: prepare ftrace for atomic code patching
    (no matching commit)
  - [v4,06/12] riscv: ftrace: do not use stop_machine to update code
    (no matching commit)
  - [v4,07/12] riscv: vector: Support calling schedule() for preemptible Vector
    https://git.kernel.org/riscv/c/e2a8cbdbe932
  - [v4,08/12] riscv: add a data fence for CMODX in the kernel mode
    https://git.kernel.org/riscv/c/29b59e3bbb6e
  - [v4,09/12] riscv: ftrace: support PREEMPT
    https://git.kernel.org/riscv/c/f48ba55bb8a8
  - [v4,10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
    (no matching commit)
  - [v4,11/12] riscv: ftrace: support direct call using call_ops
    https://git.kernel.org/riscv/c/7ef9ae7457c0
  - [v4,12/12] riscv: Documentation: add a description about dynamic ftrace
    https://git.kernel.org/riscv/c/0e07200b2af6

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
       [not found] ` <20250407180838.42877-10-andybnac@gmail.com>
@ 2026-02-21 12:15   ` Conor Dooley
  2026-02-23 15:18     ` Puranjay Mohan
  0 siblings, 1 reply; 9+ messages in thread
From: Conor Dooley @ 2026-02-21 12:15 UTC (permalink / raw)
  To: Andy Chiu
  Cc: linux-riscv, alexghiti, palmer, Puranjay Mohan,
	Björn Töpel, linux-kernel, linux-trace-kernel,
	Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
	nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
	yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
	Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw


Hey,

On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> From: Puranjay Mohan <puranjay12@gmail.com>
> 
> This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> This allows each ftrace callsite to provide an ftrace_ops to the common
> ftrace trampoline, allowing each callsite to invoke distinct tracer
> functions without the need to fall back to list processing or to
> allocate custom trampolines for each callsite. This significantly speeds
> up cases where multiple distinct trace functions are used and callsites
> are mostly traced by a single tracer.
> 
> The idea and most of the implementation is taken from the ARM64's
> implementation of the same feature. The idea is to place a pointer to
> the ftrace_ops as a literal at a fixed offset from the function entry
> point, which can be recovered by the common ftrace trampoline.
> 
> We use -fpatchable-function-entry to reserve 8 bytes above the function
> entry by emitting two 4-byte or four 2-byte nops, depending on the presence
> of CONFIG_RISCV_ISA_C. These 8 bytes are patched at runtime with a pointer
> to the associated ftrace_ops for that callsite. Functions are aligned to
> 8 bytes to make sure that the accesses to this literal are atomic.
> 
> This approach allows for directly invoking ftrace_ops::func even for
> ftrace_ops which are dynamically-allocated (or part of a module),
> without going via ftrace_ops_list_func.
> 
> We've benchmarked this with the ftrace_ops sample module on Spacemit K1
> Jupiter:
> 
> Without this patch:
> 
> baseline (Linux rivos 6.14.0-09584-g7d06015d936c #3 SMP Sat Mar 29
> +-----------------------+-----------------+----------------------------+
> |  Number of tracers    | Total time (ns) | Per-call average time      |
> |-----------------------+-----------------+----------------------------|
> | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> |----------+------------+-----------------+------------+---------------|
> |        0 |          0 |         1357958 |         13 |             - |
> |        0 |          1 |         1302375 |         13 |             - |
> |        0 |          2 |         1302375 |         13 |             - |
> |        0 |         10 |         1379084 |         13 |             - |
> |        0 |        100 |         1302458 |         13 |             - |
> |        0 |        200 |         1302333 |         13 |             - |
> |----------+------------+-----------------+------------+---------------|
> |        1 |          0 |        13677833 |        136 |           123 |
> |        1 |          1 |        18500916 |        185 |           172 |
> |        1 |          2 |        22856459 |        228 |           215 |
> |        1 |         10 |        58824709 |        588 |           575 |
> |        1 |        100 |       505141584 |       5051 |          5038 |
> |        1 |        200 |      1580473126 |      15804 |         15791 |
> |----------+------------+-----------------+------------+---------------|
> |        1 |          0 |        13561000 |        135 |           122 |
> |        2 |          0 |        19707292 |        197 |           184 |
> |       10 |          0 |        67774750 |        677 |           664 |
> |      100 |          0 |       714123125 |       7141 |          7128 |
> |      200 |          0 |      1918065668 |      19180 |         19167 |
> +----------+------------+-----------------+------------+---------------+
> 
> Note: per-call overhead is estimated relative to the baseline case with
> 0 relevant tracers and 0 irrelevant tracers.
> 
> With this patch:
> 
> v4-rc4 (Linux rivos 6.14.0-09598-gd75747611c93 #4 SMP Sat Mar 29
> +-----------------------+-----------------+----------------------------+
> |  Number of tracers    | Total time (ns) | Per-call average time      |
> |-----------------------+-----------------+----------------------------|
> | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> |----------+------------+-----------------+------------+---------------|
> |        0 |          0 |         1459917 |         14 |             - |
> |        0 |          1 |         1408000 |         14 |             - |
> |        0 |          2 |         1383792 |         13 |             - |
> |        0 |         10 |         1430709 |         14 |             - |
> |        0 |        100 |         1383791 |         13 |             - |
> |        0 |        200 |         1383750 |         13 |             - |
> |----------+------------+-----------------+------------+---------------|
> |        1 |          0 |         5238041 |         52 |            38 |
> |        1 |          1 |         5228542 |         52 |            38 |
> |        1 |          2 |         5325917 |         53 |            40 |
> |        1 |         10 |         5299667 |         52 |            38 |
> |        1 |        100 |         5245250 |         52 |            39 |
> |        1 |        200 |         5238459 |         52 |            39 |
> |----------+------------+-----------------+------------+---------------|
> |        1 |          0 |         5239083 |         52 |            38 |
> |        2 |          0 |        19449417 |        194 |           181 |
> |       10 |          0 |        67718584 |        677 |           663 |
> |      100 |          0 |       709840708 |       7098 |          7085 |
> |      200 |          0 |      2203580626 |      22035 |         22022 |
> +----------+------------+-----------------+------------+---------------+
> 
> Note: per-call overhead is estimated relative to the baseline case with
> 0 relevant tracers and 0 irrelevant tracers.
> 
> As can be seen from the above:
> 
>  a) Whenever there is a single relevant tracer function associated with a
>     tracee, the overhead of invoking the tracer is constant, and does not
>     scale with the number of tracers which are *not* associated with that
>     tracee.
> 
>  b) The overhead for a single relevant tracer has dropped to ~1/3 of the
>     overhead prior to this series (from 122ns to 38ns). This is largely
>     due to permitting calls to dynamically-allocated ftrace_ops without
>     going through ftrace_ops_list_func.
> 
> Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> 
> [update kconfig, asm, refactor]
> 
> Signed-off-by: Andy Chiu <andybnac@gmail.com>
> Tested-by: Björn Töpel <bjorn@rivosinc.com>

I bisected a boot failure to this commit [c217157bcd1df ("riscv:
Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
to be affecting all LLVM versions that I currently have installed. From
some initial testing of Kconfig options, it looks like the issue is
CFI_CLANG related because when I disable CFI_CLANG things work once
more. Since this option depends on !CFI_CLANG, but is def_bool y, I
modified Kconfig to force disable it at all times and tested
!DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.

I dunno anything about what's going on in this patch, but so little in
it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
figure out that the problem is -fpatchable-function-entry=8,4

FWIW, if anyone checks out this commit directly, you'll need to
cherry-pick commit e9d86b8e17e72 ("scripts: Do not strip .rela.dyn
section"), as the base of the branch that c217157bcd1df is on is
v6.15-rc3, which is in itself broken in turn by the issue fixed by
e9d86b8e17e72. Probably not something anyone will do, but made for an
awful time trying to figure out what commit was at fault!

Cheers,
Conor.
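
For readers unfamiliar with the mechanism in the quoted commit message,
here is a minimal user-space sketch of "a pointer literal at a fixed offset
before the function entry". It is only a model: the byte buffer stands in
for kernel text, struct sketch_ops is a hypothetical stand-in for
struct ftrace_ops, and memcpy replaces the aligned atomic store and the
assembly trampoline load that the kernel would use.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct ftrace_ops; only an id for illustration. */
struct sketch_ops {
	int id;
};

/* Bytes reserved in front of the entry point, per the quoted description. */
#define SKETCH_LITERAL_BYTES 8

/* Patch the 8 bytes just before 'entry' with a pointer to 'ops'. */
static void sketch_set_ops(unsigned char *entry, const struct sketch_ops *ops)
{
	uint64_t literal = (uint64_t)(uintptr_t)ops;

	memcpy(entry - SKETCH_LITERAL_BYTES, &literal, sizeof(literal));
}

/* What a common trampoline would do: recover the ops pointer from the entry. */
static const struct sketch_ops *sketch_get_ops(const unsigned char *entry)
{
	uint64_t literal;

	memcpy(&literal, entry - SKETCH_LITERAL_BYTES, sizeof(literal));
	return (const struct sketch_ops *)(uintptr_t)literal;
}

/* Round trip on a fake function: 8 pre-entry bytes followed by a "body". */
static int sketch_roundtrip(void)
{
	_Alignas(8) unsigned char text[32] = { 0 };
	unsigned char *entry = text + SKETCH_LITERAL_BYTES;
	struct sketch_ops ops = { .id = 42 };

	sketch_set_ops(entry, &ops);
	return sketch_get_ops(entry)->id;
}
```

The 8-byte alignment of the buffer mirrors the commit message's point that
functions are aligned to 8 bytes so that accesses to the literal are atomic.
The sketch itself does not demonstrate atomicity.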



* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
  2026-02-21 12:15   ` [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS Conor Dooley
@ 2026-02-23 15:18     ` Puranjay Mohan
  2026-02-23 15:27       ` Conor Dooley
  0 siblings, 1 reply; 9+ messages in thread
From: Puranjay Mohan @ 2026-02-23 15:18 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Andy Chiu, linux-riscv, alexghiti, palmer, Björn Töpel,
	linux-kernel, linux-trace-kernel, Alexandre Ghiti, Mark Rutland,
	paul.walmsley, greentime.hu, nick.hu, nylon.chen, eric.lin,
	vicent.chen, zong.li, yongxuan.wang, samuel.holland, olivia.chu,
	c2232430, arnd, Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm,
	pjw

On Sat, Feb 21, 2026 at 12:15 PM Conor Dooley <conor@kernel.org> wrote:
>
> Hey,
>
> On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> > From: Puranjay Mohan <puranjay12@gmail.com>
> >
> > This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> > This allows each ftrace callsite to provide an ftrace_ops to the common
> > ftrace trampoline, allowing each callsite to invoke distinct tracer
> > functions without the need to fall back to list processing or to
> > allocate custom trampolines for each callsite. This significantly speeds
> > up cases where multiple distinct trace functions are used and callsites
> > are mostly traced by a single tracer.
> >
> > The idea and most of the implementation is taken from the ARM64's
> > implementation of the same feature. The idea is to place a pointer to
> > the ftrace_ops as a literal at a fixed offset from the function entry
> > point, which can be recovered by the common ftrace trampoline.
> >
> > We use -fpatchable-function-entry to reserve 8 bytes above the function
> > entry by emitting two 4-byte or four 2-byte nops, depending on the presence
> > of CONFIG_RISCV_ISA_C. These 8 bytes are patched at runtime with a pointer
> > to the associated ftrace_ops for that callsite. Functions are aligned to
> > 8 bytes to make sure that the accesses to this literal are atomic.
> >
> > This approach allows for directly invoking ftrace_ops::func even for
> > ftrace_ops which are dynamically-allocated (or part of a module),
> > without going via ftrace_ops_list_func.
> >
> > We've benchmarked this with the ftrace_ops sample module on Spacemit K1
> > Jupiter:
> >
> > Without this patch:
> >
> > baseline (Linux rivos 6.14.0-09584-g7d06015d936c #3 SMP Sat Mar 29
> > +-----------------------+-----------------+----------------------------+
> > |  Number of tracers    | Total time (ns) | Per-call average time      |
> > |-----------------------+-----------------+----------------------------|
> > | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> > |----------+------------+-----------------+------------+---------------|
> > |        0 |          0 |         1357958 |         13 |             - |
> > |        0 |          1 |         1302375 |         13 |             - |
> > |        0 |          2 |         1302375 |         13 |             - |
> > |        0 |         10 |         1379084 |         13 |             - |
> > |        0 |        100 |         1302458 |         13 |             - |
> > |        0 |        200 |         1302333 |         13 |             - |
> > |----------+------------+-----------------+------------+---------------|
> > |        1 |          0 |        13677833 |        136 |           123 |
> > |        1 |          1 |        18500916 |        185 |           172 |
> > |        1 |          2 |        22856459 |        228 |           215 |
> > |        1 |         10 |        58824709 |        588 |           575 |
> > |        1 |        100 |       505141584 |       5051 |          5038 |
> > |        1 |        200 |      1580473126 |      15804 |         15791 |
> > |----------+------------+-----------------+------------+---------------|
> > |        1 |          0 |        13561000 |        135 |           122 |
> > |        2 |          0 |        19707292 |        197 |           184 |
> > |       10 |          0 |        67774750 |        677 |           664 |
> > |      100 |          0 |       714123125 |       7141 |          7128 |
> > |      200 |          0 |      1918065668 |      19180 |         19167 |
> > +----------+------------+-----------------+------------+---------------+
> >
> > Note: per-call overhead is estimated relative to the baseline case with
> > 0 relevant tracers and 0 irrelevant tracers.
> >
> > With this patch:
> >
> > v4-rc4 (Linux rivos 6.14.0-09598-gd75747611c93 #4 SMP Sat Mar 29
> > +-----------------------+-----------------+----------------------------+
> > |  Number of tracers    | Total time (ns) | Per-call average time      |
> > |-----------------------+-----------------+----------------------------|
> > | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> > |----------+------------+-----------------+------------+---------------|
> > |        0 |          0 |         1459917 |         14 |             - |
> > |        0 |          1 |         1408000 |         14 |             - |
> > |        0 |          2 |         1383792 |         13 |             - |
> > |        0 |         10 |         1430709 |         14 |             - |
> > |        0 |        100 |         1383791 |         13 |             - |
> > |        0 |        200 |         1383750 |         13 |             - |
> > |----------+------------+-----------------+------------+---------------|
> > |        1 |          0 |         5238041 |         52 |            38 |
> > |        1 |          1 |         5228542 |         52 |            38 |
> > |        1 |          2 |         5325917 |         53 |            40 |
> > |        1 |         10 |         5299667 |         52 |            38 |
> > |        1 |        100 |         5245250 |         52 |            39 |
> > |        1 |        200 |         5238459 |         52 |            39 |
> > |----------+------------+-----------------+------------+---------------|
> > |        1 |          0 |         5239083 |         52 |            38 |
> > |        2 |          0 |        19449417 |        194 |           181 |
> > |       10 |          0 |        67718584 |        677 |           663 |
> > |      100 |          0 |       709840708 |       7098 |          7085 |
> > |      200 |          0 |      2203580626 |      22035 |         22022 |
> > +----------+------------+-----------------+------------+---------------+
> >
> > Note: per-call overhead is estimated relative to the baseline case with
> > 0 relevant tracers and 0 irrelevant tracers.
> >
> > As can be seen from the above:
> >
> >  a) Whenever there is a single relevant tracer function associated with a
> >     tracee, the overhead of invoking the tracer is constant, and does not
> >     scale with the number of tracers which are *not* associated with that
> >     tracee.
> >
> >  b) The overhead for a single relevant tracer has dropped to ~1/3 of the
> >     overhead prior to this series (from 122ns to 38ns). This is largely
> >     due to permitting calls to dynamically-allocated ftrace_ops without
> >     going through ftrace_ops_list_func.
> >
> > Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> >
> > [update kconfig, asm, refactor]
> >
> > Signed-off-by: Andy Chiu <andybnac@gmail.com>
> > Tested-by: Björn Töpel <bjorn@rivosinc.com>
>
> I bisected a boot failure to this commit [c217157bcd1df ("riscv:
> Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
> to be affecting all LLVM versions that I currently have installed. From
> some initial testing of Kconfig options, it looks like the issue is
> CFI_CLANG related because when I disable CFI_CLANG things work once
> more. Since this option depends on !CFI_CLANG, but is def_bool y, I
> modified Kconfig to force disable it at all times and tested
> !DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.
>
> I dunno anything about what's going on in this patch, but so little in
> it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
> figure out that the problem is -fpatchable-function-entry=8,4
>

DYNAMIC_FTRACE_WITH_CALL_OPS can't work together with CFI_CLANG.

arm64 has:

select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
   (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))

would need something similar for riscv if not already done.

Thanks,
Puranjay


* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
  2026-02-23 15:18     ` Puranjay Mohan
@ 2026-02-23 15:27       ` Conor Dooley
  2026-02-23 15:41         ` Puranjay Mohan
  0 siblings, 1 reply; 9+ messages in thread
From: Conor Dooley @ 2026-02-23 15:27 UTC (permalink / raw)
  To: Puranjay Mohan
  Cc: Conor Dooley, Andy Chiu, linux-riscv, alexghiti, palmer,
	Björn Töpel, linux-kernel, linux-trace-kernel,
	Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
	nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
	yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
	Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw


On Mon, Feb 23, 2026 at 03:18:17PM +0000, Puranjay Mohan wrote:
> On Sat, Feb 21, 2026 at 12:15 PM Conor Dooley <conor@kernel.org> wrote:
> >
> > Hey,
> >
> > On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> > > From: Puranjay Mohan <puranjay12@gmail.com>
> > >
> > > This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> > > This allows each ftrace callsite to provide an ftrace_ops to the common
> > > ftrace trampoline, allowing each callsite to invoke distinct tracer
> > > functions without the need to fall back to list processing or to
> > > allocate custom trampolines for each callsite. This significantly speeds
> > > up cases where multiple distinct trace functions are used and callsites
> > > are mostly traced by a single tracer.
> > >
> > > The idea and most of the implementation is taken from the ARM64's
> > > implementation of the same feature. The idea is to place a pointer to
> > > the ftrace_ops as a literal at a fixed offset from the function entry
> > > point, which can be recovered by the common ftrace trampoline.
> > >
> > > We use -fpatchable-function-entry to reserve 8 bytes above the function
> > > entry by emitting two 4-byte or four 2-byte nops, depending on the presence
> > > of CONFIG_RISCV_ISA_C. These 8 bytes are patched at runtime with a pointer
> > > to the associated ftrace_ops for that callsite. Functions are aligned to
> > > 8 bytes to make sure that the accesses to this literal are atomic.
> > >
> > > This approach allows for directly invoking ftrace_ops::func even for
> > > ftrace_ops which are dynamically-allocated (or part of a module),
> > > without going via ftrace_ops_list_func.
> > >
> > > We've benchmarked this with the ftrace_ops sample module on Spacemit K1
> > > Jupiter:
> > >
> > > Without this patch:
> > >
> > > baseline (Linux rivos 6.14.0-09584-g7d06015d936c #3 SMP Sat Mar 29
> > > +-----------------------+-----------------+----------------------------+
> > > |  Number of tracers    | Total time (ns) | Per-call average time      |
> > > |-----------------------+-----------------+----------------------------|
> > > | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        0 |          0 |         1357958 |         13 |             - |
> > > |        0 |          1 |         1302375 |         13 |             - |
> > > |        0 |          2 |         1302375 |         13 |             - |
> > > |        0 |         10 |         1379084 |         13 |             - |
> > > |        0 |        100 |         1302458 |         13 |             - |
> > > |        0 |        200 |         1302333 |         13 |             - |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        1 |          0 |        13677833 |        136 |           123 |
> > > |        1 |          1 |        18500916 |        185 |           172 |
> > > |        1 |          2 |        22856459 |        228 |           215 |
> > > |        1 |         10 |        58824709 |        588 |           575 |
> > > |        1 |        100 |       505141584 |       5051 |          5038 |
> > > |        1 |        200 |      1580473126 |      15804 |         15791 |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        1 |          0 |        13561000 |        135 |           122 |
> > > |        2 |          0 |        19707292 |        197 |           184 |
> > > |       10 |          0 |        67774750 |        677 |           664 |
> > > |      100 |          0 |       714123125 |       7141 |          7128 |
> > > |      200 |          0 |      1918065668 |      19180 |         19167 |
> > > +----------+------------+-----------------+------------+---------------+
> > >
> > > Note: per-call overhead is estimated relative to the baseline case with
> > > 0 relevant tracers and 0 irrelevant tracers.
> > >
> > > With this patch:
> > >
> > > v4-rc4 (Linux rivos 6.14.0-09598-gd75747611c93 #4 SMP Sat Mar 29
> > > +-----------------------+-----------------+----------------------------+
> > > |  Number of tracers    | Total time (ns) | Per-call average time      |
> > > |-----------------------+-----------------+----------------------------|
> > > | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        0 |          0 |         1459917 |         14 |             - |
> > > |        0 |          1 |         1408000 |         14 |             - |
> > > |        0 |          2 |         1383792 |         13 |             - |
> > > |        0 |         10 |         1430709 |         14 |             - |
> > > |        0 |        100 |         1383791 |         13 |             - |
> > > |        0 |        200 |         1383750 |         13 |             - |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        1 |          0 |         5238041 |         52 |            38 |
> > > |        1 |          1 |         5228542 |         52 |            38 |
> > > |        1 |          2 |         5325917 |         53 |            40 |
> > > |        1 |         10 |         5299667 |         52 |            38 |
> > > |        1 |        100 |         5245250 |         52 |            39 |
> > > |        1 |        200 |         5238459 |         52 |            39 |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        1 |          0 |         5239083 |         52 |            38 |
> > > |        2 |          0 |        19449417 |        194 |           181 |
> > > |       10 |          0 |        67718584 |        677 |           663 |
> > > |      100 |          0 |       709840708 |       7098 |          7085 |
> > > |      200 |          0 |      2203580626 |      22035 |         22022 |
> > > +----------+------------+-----------------+------------+---------------+
> > >
> > > Note: per-call overhead is estimated relative to the baseline case with
> > > 0 relevant tracers and 0 irrelevant tracers.
> > >
> > > As can be seen from the above:
> > >
> > >  a) Whenever there is a single relevant tracer function associated with a
> > >     tracee, the overhead of invoking the tracer is constant, and does not
> > >     scale with the number of tracers which are *not* associated with that
> > >     tracee.
> > >
> > >  b) The overhead for a single relevant tracer has dropped to ~1/3 of the
> > >     overhead prior to this series (from 122ns to 38ns). This is largely
> > >     due to permitting calls to dynamically-allocated ftrace_ops without
> > >     going through ftrace_ops_list_func.
> > >
> > > Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> > >
> > > [update kconfig, asm, refactor]
> > >
> > > Signed-off-by: Andy Chiu <andybnac@gmail.com>
> > > Tested-by: Björn Töpel <bjorn@rivosinc.com>
> >
> > I bisected a boot failure to this commit [c217157bcd1df ("riscv:
> > Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
> > to be affecting all LLVM versions that I currently have installed. From
> > some initial testing of Kconfig options, it looks like the issue is
> > CFI_CLANG related because when I disable CFI_CLANG things work once
> > more. Since this option depends on !CFI_CLANG, but is def_bool y, I
> > modified Kconfig to force disable it at all times and tested
> > > !DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.
> >
> > I dunno anything about what's going on in this patch, but so little in
> > it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
> > figure out that the problem is -fpatchable-function-entry=8,4
> >
> 
> DYNAMIC_FTRACE_WITH_CALL_OPS can't work together with CFI_CLANG.
> 
> arm64 has:
> 
> select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
> if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
>    (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))
> 
> would need something similar for riscv if not already done.


I think you've misunderstood my email. We already have:

	select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS if (DYNAMIC_FTRACE_WITH_ARGS && !CFI)

The problem is that the patch broke using CFI_CLANG, due to the
-fpatchable-function-entry change.

Cheers,
Conor.

* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
  2026-02-23 15:27       ` Conor Dooley
@ 2026-02-23 15:41         ` Puranjay Mohan
  2026-02-23 16:29           ` Conor Dooley
  0 siblings, 1 reply; 9+ messages in thread
From: Puranjay Mohan @ 2026-02-23 15:41 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Conor Dooley, Andy Chiu, linux-riscv, alexghiti, palmer,
	Björn Töpel, linux-kernel, linux-trace-kernel,
	Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
	nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
	yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
	Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw

On Mon, Feb 23, 2026 at 3:28 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> On Mon, Feb 23, 2026 at 03:18:17PM +0000, Puranjay Mohan wrote:
> > On Sat, Feb 21, 2026 at 12:15 PM Conor Dooley <conor@kernel.org> wrote:
> > >
> > > Hey,
> > >
> > > On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> > > > From: Puranjay Mohan <puranjay12@gmail.com>
> > > >
> > > > This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> > > > This allows each ftrace callsite to provide an ftrace_ops to the common
> > > > ftrace trampoline, allowing each callsite to invoke distinct tracer
> > > > functions without the need to fall back to list processing or to
> > > > allocate custom trampolines for each callsite. This significantly speeds
> > > > up cases where multiple distinct trace functions are used and callsites
> > > > are mostly traced by a single tracer.
> > > >
> > > > > The idea and most of the implementation are taken from the arm64
> > > > > implementation of the same feature. The idea is to place a pointer to
> > > > the ftrace_ops as a literal at a fixed offset from the function entry
> > > > point, which can be recovered by the common ftrace trampoline.
> > > >
> > > > We use -fpatchable-function-entry to reserve 8 bytes above the function
> > > > > entry by emitting two 4-byte or four 2-byte nops depending on the presence of
> > > > CONFIG_RISCV_ISA_C. These 8 bytes are patched at runtime with a pointer
> > > > to the associated ftrace_ops for that callsite. Functions are aligned to
> > > > 8 bytes to make sure that the accesses to this literal are atomic.
> > > >
> > > > This approach allows for directly invoking ftrace_ops::func even for
> > > > ftrace_ops which are dynamically-allocated (or part of a module),
> > > > without going via ftrace_ops_list_func.
> > > >
> > > > > We've benchmarked this with the ftrace_ops sample module on Spacemit K1
> > > > Jupiter:
> > > >
> > > > Without this patch:
> > > >
> > > > baseline (Linux rivos 6.14.0-09584-g7d06015d936c #3 SMP Sat Mar 29
> > > > +-----------------------+-----------------+----------------------------+
> > > > |  Number of tracers    | Total time (ns) | Per-call average time      |
> > > > |-----------------------+-----------------+----------------------------|
> > > > | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        0 |          0 |        1357958 |          13 |             - |
> > > > |        0 |          1 |        1302375 |          13 |             - |
> > > > |        0 |          2 |        1302375 |          13 |             - |
> > > > |        0 |         10 |        1379084 |          13 |             - |
> > > > |        0 |        100 |        1302458 |          13 |             - |
> > > > |        0 |        200 |        1302333 |          13 |             - |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        1 |          0 |       13677833 |         136 |           123 |
> > > > |        1 |          1 |       18500916 |         185 |           172 |
> > > > |        1 |          2 |       22856459 |         228 |           215 |
> > > > |        1 |         10 |       58824709 |         588 |           575 |
> > > > |        1 |        100 |      505141584 |        5051 |          5038 |
> > > > |        1 |        200 |     1580473126 |       15804 |         15791 |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        1 |          0 |       13561000 |         135 |           122 |
> > > > |        2 |          0 |       19707292 |         197 |           184 |
> > > > |       10 |          0 |       67774750 |         677 |           664 |
> > > > |      100 |          0 |      714123125 |        7141 |          7128 |
> > > > |      200 |          0 |     1918065668 |       19180 |         19167 |
> > > > +----------+------------+-----------------+------------+---------------+
> > > >
> > > > Note: per-call overhead is estimated relative to the baseline case with
> > > > 0 relevant tracers and 0 irrelevant tracers.
> > > >
> > > > With this patch:
> > > >
> > > > v4-rc4 (Linux rivos 6.14.0-09598-gd75747611c93 #4 SMP Sat Mar 29
> > > > +-----------------------+-----------------+----------------------------+
> > > > |  Number of tracers    | Total time (ns) | Per-call average time      |
> > > > |-----------------------+-----------------+----------------------------|
> > > > | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        0 |          0 |         1459917 |         14 |             - |
> > > > |        0 |          1 |         1408000 |         14 |             - |
> > > > |        0 |          2 |         1383792 |         13 |             - |
> > > > |        0 |         10 |         1430709 |         14 |             - |
> > > > |        0 |        100 |         1383791 |         13 |             - |
> > > > |        0 |        200 |         1383750 |         13 |             - |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        1 |          0 |         5238041 |         52 |            38 |
> > > > |        1 |          1 |         5228542 |         52 |            38 |
> > > > |        1 |          2 |         5325917 |         53 |            40 |
> > > > |        1 |         10 |         5299667 |         52 |            38 |
> > > > |        1 |        100 |         5245250 |         52 |            39 |
> > > > |        1 |        200 |         5238459 |         52 |            39 |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        1 |          0 |         5239083 |         52 |            38 |
> > > > |        2 |          0 |        19449417 |        194 |           181 |
> > > > |       10 |          0 |        67718584 |        677 |           663 |
> > > > |      100 |          0 |       709840708 |       7098 |          7085 |
> > > > |      200 |          0 |      2203580626 |      22035 |         22022 |
> > > > +----------+------------+-----------------+------------+---------------+
> > > >
> > > > Note: per-call overhead is estimated relative to the baseline case with
> > > > 0 relevant tracers and 0 irrelevant tracers.
> > > >
> > > > As can be seen from the above:
> > > >
> > > >  a) Whenever there is a single relevant tracer function associated with a
> > > >     tracee, the overhead of invoking the tracer is constant, and does not
> > > >     scale with the number of tracers which are *not* associated with that
> > > >     tracee.
> > > >
> > > >  b) The overhead for a single relevant tracer has dropped to ~1/3 of the
> > > >     overhead prior to this series (from 122ns to 38ns). This is largely
> > > >     due to permitting calls to dynamically-allocated ftrace_ops without
> > > >     going through ftrace_ops_list_func.
> > > >
> > > > Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> > > >
> > > > [update kconfig, asm, refactor]
> > > >
> > > > Signed-off-by: Andy Chiu <andybnac@gmail.com>
> > > > Tested-by: Björn Töpel <bjorn@rivosinc.com>
> > >
> > > I bisected a boot failure to this commit [c217157bcd1df ("riscv:
> > > Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
> > > to be affecting all LLVM versions that I currently have installed. From
> > > some initial testing of Kconfig options, it looks like the issue is
> > > CFI_CLANG related because when I disable CFI_CLANG things work once
> > > more. Since this option depends on !CFI_CLANG, but is def_bool y, I
> > > modified Kconfig to force disable it at all times and tested
> > > > !DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.
> > >
> > > I dunno anything about what's going on in this patch, but so little in
> > > it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
> > > figure out that the problem is -fpatchable-function-entry=8,4
> > >
> >
> > DYNAMIC_FTRACE_WITH_CALL_OPS can't work together with CFI_CLANG.
> >
> > arm64 has:
> >
> > select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
> > if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
> >    (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))
> >
> > would need something similar for riscv if not already done.
>
>
> I think you've misunderstood my email. We already have:
>
>         select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS if (DYNAMIC_FTRACE_WITH_ARGS && !CFI)
>
> The problem is that the patch broke using CFI_CLANG, due to the
> > -fpatchable-function-entry change.


Yeah, sorry, I did not see the patch.
The original one I sent had:

+ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS), y)
+ifeq ($(CONFIG_RISCV_ISA_C),y)
+ CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4
+else
+ CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
+endif
+else


The basic idea is that we can't put nops before the function entry
when using CFI_CLANG, because the two interfere with each other.

The fix should be something like:

-- >8 --

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 371da75a47f9..94100810a6a4 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -14,11 +14,19 @@ endif
 ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
        LDFLAGS_vmlinux += --no-relax
        KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
+ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS),y)
 ifeq ($(CONFIG_RISCV_ISA_C),y)
        CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4
 else
        CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
 endif
+else
+ifeq ($(CONFIG_RISCV_ISA_C),y)
+       CC_FLAGS_FTRACE := -fpatchable-function-entry=4
+else
+       CC_FLAGS_FTRACE := -fpatchable-function-entry=2
+endif
+endif
 endif

 ifeq ($(CONFIG_CMODEL_MEDLOW),y)

-- 8< --
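
For reference, the flag selection in the diff above can be modelled with a
small Python sketch. It is only an illustration, not kernel code: the function
names are made up for this example, and it assumes (as the patch description
states) that the compiler emits 2-byte nops when CONFIG_RISCV_ISA_C is set and
4-byte nops otherwise. It shows that only the CALL_OPS variants reserve space
before the function entry.

```python
from typing import NamedTuple


class PatchableEntry(NamedTuple):
    """Arguments of -fpatchable-function-entry=N[,M] (hypothetical helper type)."""
    total_nops: int          # N: nops emitted in total
    nops_before_entry: int   # M: how many of them sit before the entry point


def nop_size(isa_c: bool) -> int:
    # Assumption: 2-byte compressed nops with the C extension, 4-byte otherwise.
    return 2 if isa_c else 4


def select_layout(call_ops: bool, isa_c: bool) -> PatchableEntry:
    # Every variant in the diff keeps 8 bytes of nops after the entry point.
    after = 8 // nop_size(isa_c)
    # Only CALL_OPS adds another 8 bytes before the entry (the ftrace_ops literal).
    before = after if call_ops else 0
    return PatchableEntry(after + before, before)


def cc_flag(layout: PatchableEntry) -> str:
    if layout.nops_before_entry:
        return (f"-fpatchable-function-entry="
                f"{layout.total_nops},{layout.nops_before_entry}")
    return f"-fpatchable-function-entry={layout.total_nops}"


def bytes_before_entry(layout: PatchableEntry, isa_c: bool) -> int:
    return layout.nops_before_entry * nop_size(isa_c)


if __name__ == "__main__":
    for call_ops in (True, False):
        for isa_c in (True, False):
            layout = select_layout(call_ops, isa_c)
            print(call_ops, isa_c, cc_flag(layout),
                  bytes_before_entry(layout, isa_c))
```

Since HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS is only selected when CFI is off, a
CFI_CLANG build would take the second branch and get no nops before the
function entry.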


Thanks,
Puranjay

* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
  2026-02-23 15:41         ` Puranjay Mohan
@ 2026-02-23 16:29           ` Conor Dooley
  2026-02-23 17:36             ` Puranjay Mohan
  0 siblings, 1 reply; 9+ messages in thread
From: Conor Dooley @ 2026-02-23 16:29 UTC (permalink / raw)
  To: Puranjay Mohan
  Cc: Conor Dooley, Andy Chiu, linux-riscv, alexghiti, palmer,
	Björn Töpel, linux-kernel, linux-trace-kernel,
	Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
	nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
	yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
	Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw

On Mon, Feb 23, 2026 at 03:41:26PM +0000, Puranjay Mohan wrote:
> On Mon, Feb 23, 2026 at 3:28 PM Conor Dooley <conor.dooley@microchip.com> wrote:
> >
> > On Mon, Feb 23, 2026 at 03:18:17PM +0000, Puranjay Mohan wrote:
> > > On Sat, Feb 21, 2026 at 12:15 PM Conor Dooley <conor@kernel.org> wrote:
> > > >
> > > > Hey,
> > > >
> > > > On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> > > > > From: Puranjay Mohan <puranjay12@gmail.com>
> > > > >
> > > > > This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> > > > > This allows each ftrace callsite to provide an ftrace_ops to the common
> > > > > ftrace trampoline, allowing each callsite to invoke distinct tracer
> > > > > functions without the need to fall back to list processing or to
> > > > > allocate custom trampolines for each callsite. This significantly speeds
> > > > > up cases where multiple distinct trace functions are used and callsites
> > > > > are mostly traced by a single tracer.

> > > > >
> > > > > Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> > > > >
> > > > > [update kconfig, asm, refactor]
> > > > >
> > > > > Signed-off-by: Andy Chiu <andybnac@gmail.com>
> > > > > Tested-by: Björn Töpel <bjorn@rivosinc.com>
> > > >
> > > > I bisected a boot failure to this commit [c217157bcd1df ("riscv:
> > > > Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
> > > > to be affecting all LLVM versions that I currently have installed. From
> > > > some initial testing of Kconfig options, it looks like the issue is
> > > > CFI_CLANG related because when I disable CFI_CLANG things work once
> > > > more. Since this option depends on !CFI_CLANG, but is def_bool y, I
> > > > modified Kconfig to force disable it at all times and tested
> > > > > !DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.
> > > >
> > > > I dunno anything about what's going on in this patch, but so little in
> > > > it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
> > > > figure out that the problem is -fpatchable-function-entry=8,4
> > > >
> > >
> > > DYNAMIC_FTRACE_WITH_CALL_OPS can't work together with CFI_CLANG.
> > >
> > > arm64 has:
> > >
> > > select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
> > > if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
> > >    (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))
> > >
> > > would need something similar for riscv if not already done.
> >
> >
> > I think you've misunderstood my email. We already have:
> >
> >         select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS if (DYNAMIC_FTRACE_WITH_ARGS && !CFI)
> >
> > The problem is that the patch broke using CFI_CLANG, due to the
> > > -fpatchable-function-entry change.
> 
> 
> Yeah, sorry, I did not see the patch.
> The original one I sent had:
> 
> +ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS), y)
> +ifeq ($(CONFIG_RISCV_ISA_C),y)
> + CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4
> +else
> + CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
> +endif
> +else
> 
> 
> The basic idea is that we can't put nops before the function entry
> when using CFI_CLANG, because the two interfere with each other.
> 
> The fix should be something like:

Ye, this is what Nathan and I both did locally, give or take. I just
wasn't sure if this was actually correct to do or if it was just
papering over an issue with our CFI support. Do you want to send this as
a patch?

> 
> -- >8 --
> 
> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> index 371da75a47f9..94100810a6a4 100644
> --- a/arch/riscv/Makefile
> +++ b/arch/riscv/Makefile
> @@ -14,11 +14,19 @@ endif
>  ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
>         LDFLAGS_vmlinux += --no-relax
>         KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> +ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS),y)
>  ifeq ($(CONFIG_RISCV_ISA_C),y)
>         CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4
>  else
>         CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
>  endif
> +else
> +ifeq ($(CONFIG_RISCV_ISA_C),y)
> +       CC_FLAGS_FTRACE := -fpatchable-function-entry=4
> +else
> +       CC_FLAGS_FTRACE := -fpatchable-function-entry=2
> +endif
> +endif
>  endif
> 
>  ifeq ($(CONFIG_CMODEL_MEDLOW),y)
> 
> -- 8< --
> 
> 
> Thanks,
> Puranjay

* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
  2026-02-23 16:29           ` Conor Dooley
@ 2026-02-23 17:36             ` Puranjay Mohan
  2026-02-23 17:41               ` Conor Dooley
  0 siblings, 1 reply; 9+ messages in thread
From: Puranjay Mohan @ 2026-02-23 17:36 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Conor Dooley, Andy Chiu, linux-riscv, alexghiti, palmer,
	Björn Töpel, linux-kernel, linux-trace-kernel,
	Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
	nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
	yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
	Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw

On Mon, Feb 23, 2026 at 4:29 PM Conor Dooley <conor@kernel.org> wrote:
>
> On Mon, Feb 23, 2026 at 03:41:26PM +0000, Puranjay Mohan wrote:
> > On Mon, Feb 23, 2026 at 3:28 PM Conor Dooley <conor.dooley@microchip.com> wrote:
> > >
> > > On Mon, Feb 23, 2026 at 03:18:17PM +0000, Puranjay Mohan wrote:
> > > > On Sat, Feb 21, 2026 at 12:15 PM Conor Dooley <conor@kernel.org> wrote:
> > > > >
> > > > > Hey,
> > > > >
> > > > > On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> > > > > > From: Puranjay Mohan <puranjay12@gmail.com>
> > > > > >
> > > > > > This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> > > > > > This allows each ftrace callsite to provide an ftrace_ops to the common
> > > > > > ftrace trampoline, allowing each callsite to invoke distinct tracer
> > > > > > functions without the need to fall back to list processing or to
> > > > > > allocate custom trampolines for each callsite. This significantly speeds
> > > > > > up cases where multiple distinct trace functions are used and callsites
> > > > > > are mostly traced by a single tracer.
>
> > > > > >
> > > > > > Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> > > > > >
> > > > > > [update kconfig, asm, refactor]
> > > > > >
> > > > > > Signed-off-by: Andy Chiu <andybnac@gmail.com>
> > > > > > Tested-by: Björn Töpel <bjorn@rivosinc.com>
> > > > >
> > > > > I bisected a boot failure to this commit [c217157bcd1df ("riscv:
> > > > > Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
> > > > > to be affecting all LLVM versions that I currently have installed. From
> > > > > some initial testing of Kconfig options, it looks like the issue is
> > > > > CFI_CLANG related because when I disable CFI_CLANG things work once
> > > > > more. Since this option depends on !CFI_CLANG, but is def_bool y, I
> > > > > modified Kconfig to force disable it at all times and tested
> > > > > > !DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.
> > > > >
> > > > > I dunno anything about what's going on in this patch, but so little in
> > > > > it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
> > > > > figure out that the problem is -fpatchable-function-entry=8,4
> > > > >
> > > >
> > > > DYNAMIC_FTRACE_WITH_CALL_OPS can't work together with CFI_CLANG.
> > > >
> > > > arm64 has:
> > > >
> > > > select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
> > > > if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
> > > >    (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))
> > > >
> > > > would need something similar for riscv if not already done.
> > >
> > >
> > > I think you've misunderstood my email. We already have:
> > >
> > >         select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS if (DYNAMIC_FTRACE_WITH_ARGS && !CFI)
> > >
> > > The problem is that the patch broke using CFI_CLANG, due to the
> > > > -fpatchable-function-entry change.
> >
> >
> > Yeah, sorry, I did not see the patch.
> > The original one I sent had:
> >
> > +ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS), y)
> > +ifeq ($(CONFIG_RISCV_ISA_C),y)
> > + CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4
> > +else
> > + CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
> > +endif
> > +else
> >
> >
> > The basic idea is that we can't put nops before the function entry
> > when using CFI_CLANG, because the two interfere with each other.
> >
> > The fix should be something like:
>
> Ye, this is what Nathan and I both did locally, give or take. I just
> wasn't sure if this was actually correct to do or if it was just
> papering over an issue with our CFI support. Do you want to send this as
> a patch?

Yes, I will send a patch with fixes tag.

Thanks,
Puranjay

* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
  2026-02-23 17:36             ` Puranjay Mohan
@ 2026-02-23 17:41               ` Conor Dooley
  0 siblings, 0 replies; 9+ messages in thread
From: Conor Dooley @ 2026-02-23 17:41 UTC (permalink / raw)
  To: Puranjay Mohan
  Cc: Conor Dooley, Andy Chiu, linux-riscv, alexghiti, palmer,
	Björn Töpel, linux-kernel, linux-trace-kernel,
	Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
	nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
	yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
	Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw

On Mon, Feb 23, 2026 at 05:36:24PM +0000, Puranjay Mohan wrote:
> > Ye, this is what Nathan and I both did locally, give or take. I just
> > wasn't sure if this was actually correct to do or if it was just
> > papering over an issue with our CFI support. Do you want to send this as
> > a patch?
> 
> Yes, I will send a patch with fixes tag.


Great, thanks! Add a cc: stable too, while you're at it ;)

Thread overview: 9+ messages
2025-04-07 18:08 [PATCH v4 01/12] riscv: ftrace: support fastcc in Clang for WITH_ARGS Andy Chiu
2025-06-02 22:12 ` patchwork-bot+linux-riscv
     [not found] ` <20250407180838.42877-10-andybnac@gmail.com>
2026-02-21 12:15   ` [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS Conor Dooley
2026-02-23 15:18     ` Puranjay Mohan
2026-02-23 15:27       ` Conor Dooley
2026-02-23 15:41         ` Puranjay Mohan
2026-02-23 16:29           ` Conor Dooley
2026-02-23 17:36             ` Puranjay Mohan
2026-02-23 17:41               ` Conor Dooley
