* [PATCH v4 01/12] riscv: ftrace: support fastcc in Clang for WITH_ARGS
@ 2025-04-07 18:08 Andy Chiu
2025-06-02 22:12 ` patchwork-bot+linux-riscv
[not found] ` <20250407180838.42877-10-andybnac@gmail.com>
0 siblings, 2 replies; 9+ messages in thread
From: Andy Chiu @ 2025-04-07 18:08 UTC (permalink / raw)
To: linux-riscv, alexghiti, palmer
Cc: Andy Chiu, Evgenii Shatokhin, Nathan Chancellor,
Björn Töpel, Palmer Dabbelt, Puranjay Mohan,
linux-kernel, linux-trace-kernel, llvm, Mark Rutland,
Alexandre Ghiti, Nick Desaulniers, Bill Wendling, Justin Stitt,
puranjay12, paul.walmsley, greentime.hu, nick.hu, nylon.chen,
eric.lin, vicent.chen, zong.li, yongxuan.wang, samuel.holland,
olivia.chu, c2232430
From: Andy Chiu <andy.chiu@sifive.com>
Some caller-saved registers which are not defined as function arguments
in the ABI can still be passed as arguments when the kernel is compiled
with Clang. As a result, we must save and restore those registers to
prevent ftrace from clobbering them.
- [1]: https://reviews.llvm.org/D68559
Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Closes: https://lore.kernel.org/linux-riscv/7e7c7914-445d-426d-89a0-59a9199c45b1@yadro.com/
Fixes: 7caa9765465f ("ftrace: riscv: move from REGS to ARGS")
Acked-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
---
Changelogs v4:
- Add a fix tag (Björn, Evgenii)
---
arch/riscv/include/asm/ftrace.h | 7 +++++++
arch/riscv/kernel/asm-offsets.c | 7 +++++++
arch/riscv/kernel/mcount-dyn.S | 16 ++++++++++++++--
3 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index d627f63ee289..d8b2138bd9c6 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -146,6 +146,13 @@ struct __arch_ftrace_regs {
unsigned long a5;
unsigned long a6;
unsigned long a7;
+#ifdef CONFIG_CC_IS_CLANG
+ unsigned long t2;
+ unsigned long t3;
+ unsigned long t4;
+ unsigned long t5;
+ unsigned long t6;
+#endif
};
};
};
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index 16490755304e..7c43c8e26ae7 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -501,6 +501,13 @@ void asm_offsets(void)
DEFINE(FREGS_SP, offsetof(struct __arch_ftrace_regs, sp));
DEFINE(FREGS_S0, offsetof(struct __arch_ftrace_regs, s0));
DEFINE(FREGS_T1, offsetof(struct __arch_ftrace_regs, t1));
+#ifdef CONFIG_CC_IS_CLANG
+ DEFINE(FREGS_T2, offsetof(struct __arch_ftrace_regs, t2));
+ DEFINE(FREGS_T3, offsetof(struct __arch_ftrace_regs, t3));
+ DEFINE(FREGS_T4, offsetof(struct __arch_ftrace_regs, t4));
+ DEFINE(FREGS_T5, offsetof(struct __arch_ftrace_regs, t5));
+ DEFINE(FREGS_T6, offsetof(struct __arch_ftrace_regs, t6));
+#endif
DEFINE(FREGS_A0, offsetof(struct __arch_ftrace_regs, a0));
DEFINE(FREGS_A1, offsetof(struct __arch_ftrace_regs, a1));
DEFINE(FREGS_A2, offsetof(struct __arch_ftrace_regs, a2));
diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S
index 745dd4c4a69c..e988bd26b28b 100644
--- a/arch/riscv/kernel/mcount-dyn.S
+++ b/arch/riscv/kernel/mcount-dyn.S
@@ -96,7 +96,13 @@
REG_S x8, FREGS_S0(sp)
#endif
REG_S x6, FREGS_T1(sp)
-
+#ifdef CONFIG_CC_IS_CLANG
+ REG_S x7, FREGS_T2(sp)
+ REG_S x28, FREGS_T3(sp)
+ REG_S x29, FREGS_T4(sp)
+ REG_S x30, FREGS_T5(sp)
+ REG_S x31, FREGS_T6(sp)
+#endif
// save the arguments
REG_S x10, FREGS_A0(sp)
REG_S x11, FREGS_A1(sp)
@@ -115,7 +121,13 @@
REG_L x8, FREGS_S0(sp)
#endif
REG_L x6, FREGS_T1(sp)
-
+#ifdef CONFIG_CC_IS_CLANG
+ REG_L x7, FREGS_T2(sp)
+ REG_L x28, FREGS_T3(sp)
+ REG_L x29, FREGS_T4(sp)
+ REG_L x30, FREGS_T5(sp)
+ REG_L x31, FREGS_T6(sp)
+#endif
// restore the arguments
REG_L x10, FREGS_A0(sp)
REG_L x11, FREGS_A1(sp)
--
2.39.3 (Apple Git-145)
* Re: [PATCH v4 01/12] riscv: ftrace: support fastcc in Clang for WITH_ARGS
2025-04-07 18:08 [PATCH v4 01/12] riscv: ftrace: support fastcc in Clang for WITH_ARGS Andy Chiu
@ 2025-06-02 22:12 ` patchwork-bot+linux-riscv
[not found] ` <20250407180838.42877-10-andybnac@gmail.com>
1 sibling, 0 replies; 9+ messages in thread
From: patchwork-bot+linux-riscv @ 2025-06-02 22:12 UTC (permalink / raw)
To: Andy Chiu
Cc: linux-riscv, alexghiti, palmer, andy.chiu, e.shatokhin, nathan,
bjorn, palmer, puranjay, linux-kernel, linux-trace-kernel, llvm,
mark.rutland, alex, nick.desaulniers+lkml, morbo, justinstitt,
puranjay12, paul.walmsley, greentime.hu, nick.hu, nylon.chen,
eric.lin, vicent.chen, zong.li, yongxuan.wang, samuel.holland,
olivia.chu, c2232430
Hello:
This series was applied to riscv/linux.git (for-next)
by Alexandre Ghiti <alexghiti@rivosinc.com>:
On Tue, 8 Apr 2025 02:08:25 +0800 you wrote:
> From: Andy Chiu <andy.chiu@sifive.com>
>
> Some caller-saved registers which are not defined as function arguments
> in the ABI can still be passed as arguments when the kernel is compiled
> with Clang. As a result, we must save and restore those registers to
> prevent ftrace from clobbering them.
>
> [...]
Here is the summary with links:
- [v4,01/12] riscv: ftrace: support fastcc in Clang for WITH_ARGS
https://git.kernel.org/riscv/c/7cecf4f30c33
- [v4,02/12] riscv: ftrace factor out code defined by !WITH_ARG
https://git.kernel.org/riscv/c/2efa234f5e0c
- [v4,03/12] riscv: ftrace: align patchable functions to 4 Byte boundary
https://git.kernel.org/riscv/c/cced570c2c0c
- [v4,04/12] kernel: ftrace: export ftrace_sync_ipi
(no matching commit)
- [v4,05/12] riscv: ftrace: prepare ftrace for atomic code patching
(no matching commit)
- [v4,06/12] riscv: ftrace: do not use stop_machine to update code
(no matching commit)
- [v4,07/12] riscv: vector: Support calling schedule() for preemptible Vector
https://git.kernel.org/riscv/c/e2a8cbdbe932
- [v4,08/12] riscv: add a data fence for CMODX in the kernel mode
https://git.kernel.org/riscv/c/29b59e3bbb6e
- [v4,09/12] riscv: ftrace: support PREEMPT
https://git.kernel.org/riscv/c/f48ba55bb8a8
- [v4,10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
(no matching commit)
- [v4,11/12] riscv: ftrace: support direct call using call_ops
https://git.kernel.org/riscv/c/7ef9ae7457c0
- [v4,12/12] riscv: Documentation: add a description about dynamic ftrace
https://git.kernel.org/riscv/c/0e07200b2af6
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
[parent not found: <20250407180838.42877-10-andybnac@gmail.com>]
* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
[not found] ` <20250407180838.42877-10-andybnac@gmail.com>
@ 2026-02-21 12:15 ` Conor Dooley
2026-02-23 15:18 ` Puranjay Mohan
0 siblings, 1 reply; 9+ messages in thread
From: Conor Dooley @ 2026-02-21 12:15 UTC (permalink / raw)
To: Andy Chiu
Cc: linux-riscv, alexghiti, palmer, Puranjay Mohan,
Björn Töpel, linux-kernel, linux-trace-kernel,
Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw
Hey,
On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> From: Puranjay Mohan <puranjay12@gmail.com>
>
> This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> This allows each ftrace callsite to provide an ftrace_ops to the common
> ftrace trampoline, allowing each callsite to invoke distinct tracer
> functions without the need to fall back to list processing or to
> allocate custom trampolines for each callsite. This significantly speeds
> up cases where multiple distinct trace functions are used and callsites
> are mostly traced by a single tracer.
>
> The idea and most of the implementation is taken from the ARM64's
> implementation of the same feature. The idea is to place a pointer to
> the ftrace_ops as a literal at a fixed offset from the function entry
> point, which can be recovered by the common ftrace trampoline.
>
> We use -fpatchable-function-entry to reserve 8 bytes above the function
> entry by emitting two 4-byte or four 2-byte nops, depending on the presence of
> CONFIG_RISCV_ISA_C. These 8 bytes are patched at runtime with a pointer
> to the associated ftrace_ops for that callsite. Functions are aligned to
> 8 bytes to make sure that the accesses to this literal are atomic.
>
> This approach allows for directly invoking ftrace_ops::func even for
> ftrace_ops which are dynamically-allocated (or part of a module),
> without going via ftrace_ops_list_func.
>
> We've benchmarked this with the ftrace_ops sample module on Spacemit K1
> Jupiter:
>
> Without this patch:
>
> baseline (Linux rivos 6.14.0-09584-g7d06015d936c #3 SMP Sat Mar 29
> +-----------------------+-----------------+----------------------------+
> | Number of tracers | Total time (ns) | Per-call average time |
> |-----------------------+-----------------+----------------------------|
> | Relevant | Irrelevant | 100000 calls | Total (ns) | Overhead (ns) |
> |----------+------------+-----------------+------------+---------------|
> | 0 | 0 | 1357958 | 13 | - |
> | 0 | 1 | 1302375 | 13 | - |
> | 0 | 2 | 1302375 | 13 | - |
> | 0 | 10 | 1379084 | 13 | - |
> | 0 | 100 | 1302458 | 13 | - |
> | 0 | 200 | 1302333 | 13 | - |
> |----------+------------+-----------------+------------+---------------|
> | 1 | 0 | 13677833 | 136 | 123 |
> | 1 | 1 | 18500916 | 185 | 172 |
> | 1 | 2 | 22856459 | 228 | 215 |
> | 1 | 10 | 58824709 | 588 | 575 |
> | 1 | 100 | 505141584 | 5051 | 5038 |
> | 1 | 200 | 1580473126 | 15804 | 15791 |
> |----------+------------+-----------------+------------+---------------|
> | 1 | 0 | 13561000 | 135 | 122 |
> | 2 | 0 | 19707292 | 197 | 184 |
> | 10 | 0 | 67774750 | 677 | 664 |
> | 100 | 0 | 714123125 | 7141 | 7128 |
> | 200 | 0 | 1918065668 | 19180 | 19167 |
> +----------+------------+-----------------+------------+---------------+
>
> Note: per-call overhead is estimated relative to the baseline case with
> 0 relevant tracers and 0 irrelevant tracers.
>
> With this patch:
>
> v4-rc4 (Linux rivos 6.14.0-09598-gd75747611c93 #4 SMP Sat Mar 29
> +-----------------------+-----------------+----------------------------+
> | Number of tracers | Total time (ns) | Per-call average time |
> |-----------------------+-----------------+----------------------------|
> | Relevant | Irrelevant | 100000 calls | Total (ns) | Overhead (ns) |
> |----------+------------+-----------------+------------+---------------|
> | 0 | 0 | 1459917 | 14 | - |
> | 0 | 1 | 1408000 | 14 | - |
> | 0 | 2 | 1383792 | 13 | - |
> | 0 | 10 | 1430709 | 14 | - |
> | 0 | 100 | 1383791 | 13 | - |
> | 0 | 200 | 1383750 | 13 | - |
> |----------+------------+-----------------+------------+---------------|
> | 1 | 0 | 5238041 | 52 | 38 |
> | 1 | 1 | 5228542 | 52 | 38 |
> | 1 | 2 | 5325917 | 53 | 40 |
> | 1 | 10 | 5299667 | 52 | 38 |
> | 1 | 100 | 5245250 | 52 | 39 |
> | 1 | 200 | 5238459 | 52 | 39 |
> |----------+------------+-----------------+------------+---------------|
> | 1 | 0 | 5239083 | 52 | 38 |
> | 2 | 0 | 19449417 | 194 | 181 |
> | 10 | 0 | 67718584 | 677 | 663 |
> | 100 | 0 | 709840708 | 7098 | 7085 |
> | 200 | 0 | 2203580626 | 22035 | 22022 |
> +----------+------------+-----------------+------------+---------------+
>
> Note: per-call overhead is estimated relative to the baseline case with
> 0 relevant tracers and 0 irrelevant tracers.
>
> As can be seen from the above:
>
> a) Whenever there is a single relevant tracer function associated with a
> tracee, the overhead of invoking the tracer is constant, and does not
> scale with the number of tracers which are *not* associated with that
> tracee.
>
> b) The overhead for a single relevant tracer has dropped to ~1/3 of the
> overhead prior to this series (from 122ns to 38ns). This is largely
> due to permitting calls to dynamically-allocated ftrace_ops without
> going through ftrace_ops_list_func.
>
> Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
>
> [update kconfig, asm, refactor]
>
> Signed-off-by: Andy Chiu <andybnac@gmail.com>
> Tested-by: Björn Töpel <bjorn@rivosinc.com>
I bisected a boot failure to this commit [c217157bcd1df ("riscv:
Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
to be affecting all LLVM versions that I currently have installed. From
some initial testing of Kconfig options, it looks like the issue is
CFI_CLANG related because when I disable CFI_CLANG things work once
more. Since this option depends on !CFI_CLANG, but is def_bool y, I
modified Kconfig to force disable it at all times and tested
!DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.
I dunno anything about what's going on in this patch, but so little in
it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
figure out that the problem is -fpatchable-function-entry=8,4
FWIW, if anyone checks out this commit directly, you'll need to
cherry-pick commit e9d86b8e17e72 ("scripts: Do not strip .rela.dyn
section"), as the base of the branch that c217157bcd1df is on is
v6.15-rc3, which is in itself broken in turn by the issue fixed by
e9d86b8e17e72. Probably not something anyone will do, but made for an
awful time trying to figure out what commit was at fault!
Cheers,
Conor.
* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
2026-02-21 12:15 ` [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS Conor Dooley
@ 2026-02-23 15:18 ` Puranjay Mohan
2026-02-23 15:27 ` Conor Dooley
0 siblings, 1 reply; 9+ messages in thread
From: Puranjay Mohan @ 2026-02-23 15:18 UTC (permalink / raw)
To: Conor Dooley
Cc: Andy Chiu, linux-riscv, alexghiti, palmer, Björn Töpel,
linux-kernel, linux-trace-kernel, Alexandre Ghiti, Mark Rutland,
paul.walmsley, greentime.hu, nick.hu, nylon.chen, eric.lin,
vicent.chen, zong.li, yongxuan.wang, samuel.holland, olivia.chu,
c2232430, arnd, Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm,
pjw
On Sat, Feb 21, 2026 at 12:15 PM Conor Dooley <conor@kernel.org> wrote:
>
> Hey,
>
> On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> > [...]
>
> I bisected a boot failure to this commit [c217157bcd1df ("riscv:
> Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
> to be affecting all LLVM versions that I currently have installed. From
> some initial testing of Kconfig options, it looks like the issue is
> CFI_CLANG related because when I disable CFI_CLANG things work once
> more. Since this option depends on !CFI_CLANG, but is def_bool y, I
> modified Kconfig to force disable it at all times and tested
> !DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.
>
> I dunno anything about what's going on in this patch, but so little in
> it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
> figure out that the problem is -fpatchable-function-entry=8,4
>
DYNAMIC_FTRACE_WITH_CALL_OPS can't work together with CFI_CLANG.
arm64 has:
select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
(CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))
would need something similar for riscv if not already done.
Thanks,
Puranjay
* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
2026-02-23 15:18 ` Puranjay Mohan
@ 2026-02-23 15:27 ` Conor Dooley
2026-02-23 15:41 ` Puranjay Mohan
0 siblings, 1 reply; 9+ messages in thread
From: Conor Dooley @ 2026-02-23 15:27 UTC (permalink / raw)
To: Puranjay Mohan
Cc: Conor Dooley, Andy Chiu, linux-riscv, alexghiti, palmer,
Björn Töpel, linux-kernel, linux-trace-kernel,
Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw
On Mon, Feb 23, 2026 at 03:18:17PM +0000, Puranjay Mohan wrote:
> On Sat, Feb 21, 2026 at 12:15 PM Conor Dooley <conor@kernel.org> wrote:
> >
> > Hey,
> >
> > On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> > > [...]
> >
> > I bisected a boot failure to this commit [c217157bcd1df ("riscv:
> > Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
> > to be affecting all LLVM versions that I currently have installed. From
> > some initial testing of Kconfig options, it looks like the issue is
> > CFI_CLANG related because when I disable CFI_CLANG things work once
> > more. Since this option depends on !CFI_CLANG, but is def_bool y, I
> > modified Kconfig to force disable it at all times and tested
> > !DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.
> >
> > I dunno anything about what's going on in this patch, but so little in
> > it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
> > figure out that the problem is -fpatchable-function-entry=8,4
> >
>
> DYNAMIC_FTRACE_WITH_CALL_OPS can't work together with CFI_CLANG.
>
> arm64 has:
>
> select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
> if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
> (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))
>
> would need something similar for riscv if not already done.
I think you've misunderstood my email. We already have:
select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS if (DYNAMIC_FTRACE_WITH_ARGS && !CFI)
The problem is that the patch broke using CFI_CLANG, due to the
-fpatchable-function-entry change.
Cheers,
Conor.
* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
2026-02-23 15:27 ` Conor Dooley
@ 2026-02-23 15:41 ` Puranjay Mohan
2026-02-23 16:29 ` Conor Dooley
0 siblings, 1 reply; 9+ messages in thread
From: Puranjay Mohan @ 2026-02-23 15:41 UTC (permalink / raw)
To: Conor Dooley
Cc: Conor Dooley, Andy Chiu, linux-riscv, alexghiti, palmer,
Björn Töpel, linux-kernel, linux-trace-kernel,
Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw
On Mon, Feb 23, 2026 at 3:28 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> On Mon, Feb 23, 2026 at 03:18:17PM +0000, Puranjay Mohan wrote:
> > On Sat, Feb 21, 2026 at 12:15 PM Conor Dooley <conor@kernel.org> wrote:
> > >
> > > Hey,
> > >
> > > On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> > > > From: Puranjay Mohan <puranjay12@gmail.com>
> > > >
> > > > This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> > > > This allows each ftrace callsite to provide an ftrace_ops to the common
> > > > ftrace trampoline, allowing each callsite to invoke distinct tracer
> > > > functions without the need to fall back to list processing or to
> > > > allocate custom trampolines for each callsite. This significantly speeds
> > > > up cases where multiple distinct trace functions are used and callsites
> > > > are mostly traced by a single tracer.
> > > >
> > > > The idea and most of the implementation is taken from the ARM64's
> > > > implementation of the same feature. The idea is to place a pointer to
> > > > the ftrace_ops as a literal at a fixed offset from the function entry
> > > > point, which can be recovered by the common ftrace trampoline.
> > > >
> > > > We use -fpatchable-function-entry to reserve 8 bytes above the function
> > > > entry by emitting two 4-byte or four 2-byte nops, depending on the presence of
> > > > CONFIG_RISCV_ISA_C. These 8 bytes are patched at runtime with a pointer
> > > > to the associated ftrace_ops for that callsite. Functions are aligned to
> > > > 8 bytes to make sure that the accesses to this literal are atomic.
> > > >
> > > > This approach allows for directly invoking ftrace_ops::func even for
> > > > ftrace_ops which are dynamically-allocated (or part of a module),
> > > > without going via ftrace_ops_list_func.
> > > >
> > > > We've benchmarked this with the ftrace_ops sample module on Spacemit K1
> > > > Jupiter:
> > > >
> > > > Without this patch:
> > > >
> > > > baseline (Linux rivos 6.14.0-09584-g7d06015d936c #3 SMP Sat Mar 29
> > > > +-----------------------+-----------------+----------------------------+
> > > > |   Number of tracers   | Total time (ns) |   Per-call average time    |
> > > > |-----------------------+-----------------+----------------------------|
> > > > | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        0 |          0 |         1357958 |         13 |             - |
> > > > |        0 |          1 |         1302375 |         13 |             - |
> > > > |        0 |          2 |         1302375 |         13 |             - |
> > > > |        0 |         10 |         1379084 |         13 |             - |
> > > > |        0 |        100 |         1302458 |         13 |             - |
> > > > |        0 |        200 |         1302333 |         13 |             - |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        1 |          0 |        13677833 |        136 |           123 |
> > > > |        1 |          1 |        18500916 |        185 |           172 |
> > > > |        1 |          2 |        22856459 |        228 |           215 |
> > > > |        1 |         10 |        58824709 |        588 |           575 |
> > > > |        1 |        100 |       505141584 |       5051 |          5038 |
> > > > |        1 |        200 |      1580473126 |      15804 |         15791 |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        1 |          0 |        13561000 |        135 |           122 |
> > > > |        2 |          0 |        19707292 |        197 |           184 |
> > > > |       10 |          0 |        67774750 |        677 |           664 |
> > > > |      100 |          0 |       714123125 |       7141 |          7128 |
> > > > |      200 |          0 |      1918065668 |      19180 |         19167 |
> > > > +----------+------------+-----------------+------------+---------------+
> > > >
> > > > Note: per-call overhead is estimated relative to the baseline case with
> > > > 0 relevant tracers and 0 irrelevant tracers.
> > > >
> > > > With this patch:
> > > >
> > > > v4-rc4 (Linux rivos 6.14.0-09598-gd75747611c93 #4 SMP Sat Mar 29
> > > > +-----------------------+-----------------+----------------------------+
> > > > |   Number of tracers   | Total time (ns) |   Per-call average time    |
> > > > |-----------------------+-----------------+----------------------------|
> > > > | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        0 |          0 |         1459917 |         14 |             - |
> > > > |        0 |          1 |         1408000 |         14 |             - |
> > > > |        0 |          2 |         1383792 |         13 |             - |
> > > > |        0 |         10 |         1430709 |         14 |             - |
> > > > |        0 |        100 |         1383791 |         13 |             - |
> > > > |        0 |        200 |         1383750 |         13 |             - |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        1 |          0 |         5238041 |         52 |            38 |
> > > > |        1 |          1 |         5228542 |         52 |            38 |
> > > > |        1 |          2 |         5325917 |         53 |            40 |
> > > > |        1 |         10 |         5299667 |         52 |            38 |
> > > > |        1 |        100 |         5245250 |         52 |            39 |
> > > > |        1 |        200 |         5238459 |         52 |            39 |
> > > > |----------+------------+-----------------+------------+---------------|
> > > > |        1 |          0 |         5239083 |         52 |            38 |
> > > > |        2 |          0 |        19449417 |        194 |           181 |
> > > > |       10 |          0 |        67718584 |        677 |           663 |
> > > > |      100 |          0 |       709840708 |       7098 |          7085 |
> > > > |      200 |          0 |      2203580626 |      22035 |         22022 |
> > > > +----------+------------+-----------------+------------+---------------+
> > > >
> > > > Note: per-call overhead is estimated relative to the baseline case with
> > > > 0 relevant tracers and 0 irrelevant tracers.
> > > >
> > > > As can be seen from the above:
> > > >
> > > > a) Whenever there is a single relevant tracer function associated with a
> > > > tracee, the overhead of invoking the tracer is constant, and does not
> > > > scale with the number of tracers which are *not* associated with that
> > > > tracee.
> > > >
> > > > b) The overhead for a single relevant tracer has dropped to ~1/3 of the
> > > > overhead prior to this series (from 122ns to 38ns). This is largely
> > > > due to permitting calls to dynamically-allocated ftrace_ops without
> > > > going through ftrace_ops_list_func.
> > > >
> > > > Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> > > >
> > > > [update kconfig, asm, refactor]
> > > >
> > > > Signed-off-by: Andy Chiu <andybnac@gmail.com>
> > > > Tested-by: Björn Töpel <bjorn@rivosinc.com>
> > >
> > > I bisected a boot failure to this commit [c217157bcd1df ("riscv:
> > > Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
> > > to be affecting all LLVM versions that I currently have installed. From
> > > some initial testing of Kconfig options, it looks like the issue is
> > > CFI_CLANG related because when I disable CFI_CLANG things work once
> > > more. Since this option depends on !CFI_CLANG, but is def_bool y, I
> > > modified Kconfig to force disable it at all times and tested
> > > !DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.
> > >
> > > I dunno anything about what's going on in this patch, but so little in
> > > it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
> > > figure out that the problem is -fpatchable-function-entry=8,4
> > >
> >
> > DYNAMIC_FTRACE_WITH_CALL_OPS can't work together with CFI_CLANG.
> >
> > arm64 has:
> >
> > select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
> > if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
> > (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))
> >
> > would need something similar for riscv if not already done.
>
>
> I think you've misunderstood my email. We already have:
>
> select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS if (DYNAMIC_FTRACE_WITH_ARGS && !CFI)
>
> The problem is that the patch broke using CFI_CLANG, due to the
> fpatchable-function-entry change.
Yeah, sorry, I did not see the patch.
The original one I sent had:

+ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS), y)
+ifeq ($(CONFIG_RISCV_ISA_C),y)
+	CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4
+else
+	CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
+endif
+else

The basic idea is that we can't put nops before the function entry
when using CFI_CLANG, because the two interfere with each other.

The fix should be something like:
-- >8 --

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 371da75a47f9..94100810a6a4 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -14,11 +14,19 @@ endif
 ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
 	LDFLAGS_vmlinux += --no-relax
 	KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
+ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS),y)
 ifeq ($(CONFIG_RISCV_ISA_C),y)
 	CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4
 else
 	CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
 endif
+else
+ifeq ($(CONFIG_RISCV_ISA_C),y)
+	CC_FLAGS_FTRACE := -fpatchable-function-entry=4
+else
+	CC_FLAGS_FTRACE := -fpatchable-function-entry=2
+endif
+endif
 endif
 
 ifeq ($(CONFIG_CMODEL_MEDLOW),y)

-- 8< --

Thanks,
Puranjay
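
For readers following the nop arithmetic behind the flags in the Makefile
hunk above, here is a minimal userspace sketch in C. It assumes the
documented meaning of -fpatchable-function-entry=N,M (N nops in total, M of
them placed before the function entry point, M defaulting to 0 when omitted)
and the nop sizes given in the commit message (2 bytes with
CONFIG_RISCV_ISA_C, 4 bytes without). struct patch_layout, compute_layout()
and check_layouts() are hypothetical names used only for illustration; none
of this is kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration type; not a kernel structure. */
struct patch_layout {
	size_t bytes_before_entry;	/* nop bytes placed above the entry point */
	size_t bytes_after_entry;	/* nop bytes placed at the entry point */
};

/*
 * Assumed flag semantics: -fpatchable-function-entry=N,M emits N nops in
 * total, M of them before the function entry point; M is 0 when omitted.
 */
struct patch_layout compute_layout(size_t total_nops, size_t nops_before,
				   size_t nop_size)
{
	struct patch_layout l;

	l.bytes_before_entry = nops_before * nop_size;
	l.bytes_after_entry = (total_nops - nops_before) * nop_size;
	return l;
}

/* Check the four flag values that appear in the Makefile hunk. */
void check_layouts(void)
{
	/* CALL_OPS, RISCV_ISA_C=y: "8,4" with 2-byte nops */
	struct patch_layout a = compute_layout(8, 4, 2);
	/* CALL_OPS, RISCV_ISA_C=n: "4,2" with 4-byte nops */
	struct patch_layout b = compute_layout(4, 2, 4);
	/* no CALL_OPS, RISCV_ISA_C=y: "4" with 2-byte nops */
	struct patch_layout c = compute_layout(4, 0, 2);
	/* no CALL_OPS, RISCV_ISA_C=n: "2" with 4-byte nops */
	struct patch_layout d = compute_layout(2, 0, 4);

	assert(a.bytes_before_entry == 8 && a.bytes_after_entry == 8);
	assert(b.bytes_before_entry == 8 && b.bytes_after_entry == 8);
	assert(c.bytes_before_entry == 0 && c.bytes_after_entry == 8);
	assert(d.bytes_before_entry == 0 && d.bytes_after_entry == 8);
}
```

Under these assumptions every configuration leaves 8 patchable bytes at the
entry point, and only the CALL_OPS variants reserve a further 8 bytes above
it, which is the region reported in this thread to interfere with CFI_CLANG.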
* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
2026-02-23 15:41 ` Puranjay Mohan
@ 2026-02-23 16:29 ` Conor Dooley
2026-02-23 17:36 ` Puranjay Mohan
0 siblings, 1 reply; 9+ messages in thread
From: Conor Dooley @ 2026-02-23 16:29 UTC (permalink / raw)
To: Puranjay Mohan
Cc: Conor Dooley, Andy Chiu, linux-riscv, alexghiti, palmer,
Björn Töpel, linux-kernel, linux-trace-kernel,
Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw
On Mon, Feb 23, 2026 at 03:41:26PM +0000, Puranjay Mohan wrote:
> On Mon, Feb 23, 2026 at 3:28 PM Conor Dooley <conor.dooley@microchip.com> wrote:
> >
> > On Mon, Feb 23, 2026 at 03:18:17PM +0000, Puranjay Mohan wrote:
> > > On Sat, Feb 21, 2026 at 12:15 PM Conor Dooley <conor@kernel.org> wrote:
> > > >
> > > > Hey,
> > > >
> > > > On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> > > > > From: Puranjay Mohan <puranjay12@gmail.com>
> > > > >
> > > > > This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> > > > > This allows each ftrace callsite to provide an ftrace_ops to the common
> > > > > ftrace trampoline, allowing each callsite to invoke distinct tracer
> > > > > functions without the need to fall back to list processing or to
> > > > > allocate custom trampolines for each callsite. This significantly speeds
> > > > > up cases where multiple distinct trace functions are used and callsites
> > > > > are mostly traced by a single tracer.
> > > > >
> > > > > Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> > > > >
> > > > > [update kconfig, asm, refactor]
> > > > >
> > > > > Signed-off-by: Andy Chiu <andybnac@gmail.com>
> > > > > Tested-by: Björn Töpel <bjorn@rivosinc.com>
> > > >
> > > > I bisected a boot failure to this commit [c217157bcd1df ("riscv:
> > > > Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
> > > > to be affecting all LLVM versions that I currently have installed. From
> > > > some initial testing of Kconfig options, it looks like the issue is
> > > > CFI_CLANG related because when I disable CFI_CLANG things work once
> > > > more. Since this option depends on !CFI_CLANG, but is def_bool y, I
> > > > modified Kconfig to force disable it at all times and tested
> > > > !DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.
> > > >
> > > > I dunno anything about what's going on in this patch, but so little in
> > > > it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
> > > > figure out that the problem is -fpatchable-function-entry=8,4
> > > >
> > >
> > > DYNAMIC_FTRACE_WITH_CALL_OPS can't work together with CFI_CLANG.
> > >
> > > arm64 has:
> > >
> > > select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
> > > if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
> > > (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))
> > >
> > > would need something similar for riscv if not already done.
> >
> >
> > I think you've misunderstood my email. We already have:
> >
> > select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS if (DYNAMIC_FTRACE_WITH_ARGS && !CFI)
> >
> > The problem is that the patch broke using CFI_CLANG, due to the
> > fpatchable-function-entry change.
>
>
> Yeah, sorry I did not see the patch,
> the original one I sent had:
>
> +ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS), y)
> +ifeq ($(CONFIG_RISCV_ISA_C),y)
> + CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4
> +else
> + CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
> +endif
> +else
>
>
> The basic idea is that we can't put nops before the function entry
> when using CFI_CLANG, because the two interfere with each other.
>
> the fix should be something like:
Ye, this is what Nathan and I both did locally, give or take. I just
wasn't sure if this was actually correct to do or if it was just
papering over an issue with our CFI support. Do you want to send this as
a patch?
>
> -- >8 --
>
> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> index 371da75a47f9..94100810a6a4 100644
> --- a/arch/riscv/Makefile
> +++ b/arch/riscv/Makefile
> @@ -14,11 +14,19 @@ endif
> ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
> LDFLAGS_vmlinux += --no-relax
> KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> +ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS),y)
> ifeq ($(CONFIG_RISCV_ISA_C),y)
> CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4
> else
> CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
> endif
> +else
> +ifeq ($(CONFIG_RISCV_ISA_C),y)
> + CC_FLAGS_FTRACE := -fpatchable-function-entry=4
> +else
> + CC_FLAGS_FTRACE := -fpatchable-function-entry=2
> +endif
> +endif
> endif
>
> ifeq ($(CONFIG_CMODEL_MEDLOW),y)
>
> -- 8< --
>
>
> Thanks,
> Puranjay
* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
2026-02-23 16:29 ` Conor Dooley
@ 2026-02-23 17:36 ` Puranjay Mohan
2026-02-23 17:41 ` Conor Dooley
0 siblings, 1 reply; 9+ messages in thread
From: Puranjay Mohan @ 2026-02-23 17:36 UTC (permalink / raw)
To: Conor Dooley
Cc: Conor Dooley, Andy Chiu, linux-riscv, alexghiti, palmer,
Björn Töpel, linux-kernel, linux-trace-kernel,
Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw
On Mon, Feb 23, 2026 at 4:29 PM Conor Dooley <conor@kernel.org> wrote:
>
> On Mon, Feb 23, 2026 at 03:41:26PM +0000, Puranjay Mohan wrote:
> > On Mon, Feb 23, 2026 at 3:28 PM Conor Dooley <conor.dooley@microchip.com> wrote:
> > >
> > > On Mon, Feb 23, 2026 at 03:18:17PM +0000, Puranjay Mohan wrote:
> > > > On Sat, Feb 21, 2026 at 12:15 PM Conor Dooley <conor@kernel.org> wrote:
> > > > >
> > > > > Hey,
> > > > >
> > > > > On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> > > > > > From: Puranjay Mohan <puranjay12@gmail.com>
> > > > > >
> > > > > > This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> > > > > > This allows each ftrace callsite to provide an ftrace_ops to the common
> > > > > > ftrace trampoline, allowing each callsite to invoke distinct tracer
> > > > > > functions without the need to fall back to list processing or to
> > > > > > allocate custom trampolines for each callsite. This significantly speeds
> > > > > > up cases where multiple distinct trace functions are used and callsites
> > > > > > are mostly traced by a single tracer.
>
> > > > > >
> > > > > > Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> > > > > >
> > > > > > [update kconfig, asm, refactor]
> > > > > >
> > > > > > Signed-off-by: Andy Chiu <andybnac@gmail.com>
> > > > > > Tested-by: Björn Töpel <bjorn@rivosinc.com>
> > > > >
> > > > > I bisected a boot failure to this commit [c217157bcd1df ("riscv:
> > > > > Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
> > > > > to be affecting all LLVM versions that I currently have installed. From
> > > > > some initial testing of Kconfig options, it looks like the issue is
> > > > > CFI_CLANG related because when I disable CFI_CLANG things work once
> > > > > more. Since this option depends on !CFI_CLANG, but is def_bool y, I
> > > > > modified Kconfig to force disable it at all times and tested
> > > > > !DYNAMIC_FTRACE_WITH_CALL_OPS && !CFI_CLANG, which did boot.
> > > > >
> > > > > I dunno anything about what's going on in this patch, but so little in
> > > > > it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
> > > > > figure out that the problem is -fpatchable-function-entry=8,4
> > > > >
> > > >
> > > > DYNAMIC_FTRACE_WITH_CALL_OPS can't work together with CFI_CLANG.
> > > >
> > > > arm64 has:
> > > >
> > > > select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
> > > > if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
> > > > (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))
> > > >
> > > > would need something similar for riscv if not already done.
> > >
> > >
> > > I think you've misunderstood my email. We already have:
> > >
> > > select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS if (DYNAMIC_FTRACE_WITH_ARGS && !CFI)
> > >
> > > The problem is that the patch broke using CFI_CLANG, due to the
> > > fpatchable-function-entry change.
> >
> >
> > Yeah, sorry I did not see the patch,
> > the original one I sent had:
> >
> > +ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS), y)
> > +ifeq ($(CONFIG_RISCV_ISA_C),y)
> > + CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4
> > +else
> > + CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
> > +endif
> > +else
> >
> >
> > The basic idea is that we can't put nops before the function entry
> > when using CFI_CLANG, because the two interfere with each other.
> >
> > the fix should be something like:
>
> Ye, this is what Nathan and I both did locally, give or take. I just
> wasn't sure if this was actually correct to do or if it was just
> papering over an issue with our CFI support. Do you want to send this as
> a patch?
Yes, I will send a patch with a Fixes tag.
Thanks,
Puranjay
* Re: [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
2026-02-23 17:36 ` Puranjay Mohan
@ 2026-02-23 17:41 ` Conor Dooley
0 siblings, 0 replies; 9+ messages in thread
From: Conor Dooley @ 2026-02-23 17:41 UTC (permalink / raw)
To: Puranjay Mohan
Cc: Conor Dooley, Andy Chiu, linux-riscv, alexghiti, palmer,
Björn Töpel, linux-kernel, linux-trace-kernel,
Alexandre Ghiti, Mark Rutland, paul.walmsley, greentime.hu,
nick.hu, nylon.chen, eric.lin, vicent.chen, zong.li,
yongxuan.wang, samuel.holland, olivia.chu, c2232430, arnd,
Sami Tolvanen, Kees Cook, Nathan Chancellor, llvm, pjw
On Mon, Feb 23, 2026 at 05:36:24PM +0000, Puranjay Mohan wrote:
> > Ye, this is what Nathan and I both did locally, give or take. I just
> > wasn't sure if this was actually correct to do or if it was just
> > papering over an issue with our CFI support. Do you want to send this as
> > a patch?
>
> Yes, I will send a patch with fixes tag.
Great, thanks! Add a cc: stable too, while you're at it ;)
Thread overview: 9+ messages
2025-04-07 18:08 [PATCH v4 01/12] riscv: ftrace: support fastcc in Clang for WITH_ARGS Andy Chiu
2025-06-02 22:12 ` patchwork-bot+linux-riscv
[not found] ` <20250407180838.42877-10-andybnac@gmail.com>
2026-02-21 12:15 ` [PATCH v4 10/12] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS Conor Dooley
2026-02-23 15:18 ` Puranjay Mohan
2026-02-23 15:27 ` Conor Dooley
2026-02-23 15:41 ` Puranjay Mohan
2026-02-23 16:29 ` Conor Dooley
2026-02-23 17:36 ` Puranjay Mohan
2026-02-23 17:41 ` Conor Dooley