* [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application
@ 2025-04-30 11:02 Nam Cao
2025-04-30 11:02 ` [PATCH v6 17/22] arm64: mm: Add page fault trace points Nam Cao
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Nam Cao @ 2025-04-30 11:02 UTC (permalink / raw)
To: Steven Rostedt, Gabriele Monaco, linux-trace-kernel, linux-kernel
Cc: john.ogness, Nam Cao, Petr Mladek, Sergey Senozhatsky,
Ingo Molnar, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
H . Peter Anvin, Andy Lutomirski, Peter Zijlstra, Catalin Marinas,
linux-arm-kernel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Alexandre Ghiti, linux-riscv
Real-time applications may have design flaws causing them to have
unexpected latency. For example, the applications may raise page faults, or
may be blocked trying to take a mutex without priority inheritance.
However, while attempting to implement DA monitors for these real-time
rules, deterministic automaton is found to be inappropriate as the
specification language. The automaton is complicated, hard to understand,
and error-prone.
For these cases, linear temporal logic is found to be more suitable. The
LTL is more concise and intuitive.
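To make the comparison concrete, a rule of the shape "always (RT imply not PAGEFAULT)" can be checked one step at a time. The following is a minimal userspace C sketch, not code from this series: the rule text, the proposition names and first_violation() are assumptions made for illustration, and it evaluates the rule over a finite recorded trace rather than online in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical atomic propositions sampled at each step (not kernel names). */
struct step {
	bool rt;        /* the task is a real-time task */
	bool pagefault; /* the task raised a page fault at this step */
};

/*
 * Check the illustrative rule "always (rt imply not pagefault)" on a
 * finite trace. Returns the index of the first violating step, or -1 if
 * every step satisfies the rule.
 */
static long first_violation(const struct step *trace, size_t len)
{
	assert(trace != NULL || len == 0);

	for (size_t i = 0; i < len; i++) {
		/* "a imply b" is equivalent to "!a || b" */
		bool holds = !trace[i].rt || !trace[i].pagefault;

		if (!holds)
			return (long)i;
	}
	return -1;
}
```

The whole rule fits in one boolean expression per step, whereas an automaton for the same property needs explicit states and transitions.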
This series adds support for LTL RV monitors, and uses it to implement two
monitors for reporting problems with real-time tasks.
Patches 1-12 clean up and prepare the RV code for the integration of LTL
monitors.
Patch 13 adds support for LTL monitors.
Patch 14 adds the container monitor "rtapp". This encapsulates the
sub-monitors for real-time.
Patches 15-18 prepare the pagefault tracepoints, so that patch 19 can add
the monitor which watches real-time tasks doing page faults.
Patch 20 adds the "sleep" monitor: it detects potential undesirable latency
with real-time threads.
Patch 21 adds documentation on the new monitors.
Patch 22 allows the number of per-task monitors to be configurable, so that
the two new monitors can be enabled simultaneously.
v5->v6 https://lore.kernel.org/lkml/cover.1745926331.git.namcao@linutronix.de
- sleep monitor: Drop the block_on_rt_mutex tracepoints. The contention
tracepoints are sufficient.
v4->v5 https://lore.kernel.org/lkml/cover.1745390829.git.namcao@linutronix.de
- sleep monitor: Fix a false positive due to a race with waking and
scheduling.
- sleep monitor: Add block_on_rt_mutex tracepoints and use them for
BLOCK_ON_RT_MUTEX, instead of trace_sched_pi_setprio
- sleep monitor: tighten the rule on nanosleep: only clock_nanosleep()
with TIMER_ABSTIME and CLOCK_MONOTONIC is allowed
- add comments explaining why it is correct to treat PI-boosted tasks as
real-time tasks.
It should be noted that due to the changes in v5, 'perf' does not work
as well as before, because sometimes the errors happen out of the
real-time tasks' contexts. Fixing this is left for future work.
stress-ng is also far noisier in v5, because the rule on nanosleep is
tightened.
v3->v4 https://lore.kernel.org/lkml/cover.1744785335.git.namcao@linutronix.de
- support deadline tasks
- rtapp_sleep: use sched_pi_setprio tracepoint instead of contention
tracepoints for BLOCK_ON_RT_MUTEX, so that proxy lock is covered.
- fix the scripts generating a "slightly" incorrect verification automaton
- make the rtapp monitor depend on RV_PER_TASK_MONITORS >= 2
- make the event tracepoint output a bit more readable
- some documentation format fixes
v2->v3 https://lore.kernel.org/lkml/cover.1744355018.git.namcao@linutronix.de/
- fix a problem with sleep monitor's specification (around
KTHREAD_SHOULD_STOP)
- merge the patches that move the dot2k/rvgen scripts around
- pull panic/printk changes into separate patches
- fixup some build errors
- fixup monitor's init function return code
- fix some flake8 warnings with the scripts
- add some references to LTL documentation
- fixup some mistakes with rtapp documentation
- fixup capitalization mistake with monitor_synthesis.rst
- remove the now-redundant macro RV_PER_TASK_MONITORS
v1->v2 https://lore.kernel.org/lkml/cover.1741708239.git.namcao@linutronix.de/
- Integrate the LTL scripts into the existing dot2k tool, taking
advantage of the existing monitor generation scripts.
- Switch the struct ltl_monitor to use bitmap instead of an array, to
optimize memory usage.
- Correct the generated code to be a non-deterministic state machine,
instead of a deterministic state machine
- Put common code for all LTL monitors into a single file
(include/rv/ltl_monitor.h), reducing code duplication
- Change the LTL monitors to make use of container. Add a bug fix to
container while at it.
- Make the number of per-task monitors configurable
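The bitmap change listed under v1->v2 can be pictured with a small userspace sketch. It does not reproduce the layout of struct ltl_monitor: NUM_ATOMS, the two struct names and the atom_set()/atom_test() helpers are hypothetical, and the kernel's own bitmap API is deliberately not used so that the example builds on its own.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_ATOMS 20 /* hypothetical number of atomic propositions */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define WORDS_FOR(n) (((n) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bool per proposition: simple, but at least one byte each. */
struct atoms_array {
	bool value[NUM_ATOMS];
};

/* One bit per proposition, packed into unsigned long words. */
struct atoms_bitmap {
	unsigned long bits[WORDS_FOR(NUM_ATOMS)];
};

/* Set or clear proposition n in the bitmap. */
static void atom_set(struct atoms_bitmap *a, size_t n, bool v)
{
	unsigned long mask = 1UL << (n % BITS_PER_WORD);

	assert(n < NUM_ATOMS);
	if (v)
		a->bits[n / BITS_PER_WORD] |= mask;
	else
		a->bits[n / BITS_PER_WORD] &= ~mask;
}

/* Read proposition n from the bitmap. */
static bool atom_test(const struct atoms_bitmap *a, size_t n)
{
	assert(n < NUM_ATOMS);
	return (a->bits[n / BITS_PER_WORD] >> (n % BITS_PER_WORD)) & 1UL;
}
```

With 20 propositions the array form takes 20 bytes while the bitmap fits in a single unsigned long, which matters when the structure is stored per task.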
Cc: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: linux-riscv@lists.infradead.org
Nam Cao (22):
rv: Add #undef TRACE_INCLUDE_FILE
printk: Make vprintk_deferred() public
panic: Add vpanic()
rv: Let the reactors take care of buffers
verification/dot2k: Make a separate dot2k_templates/Kconfig_container
verification/dot2k: Remove __buff_to_string()
verification/dot2k: Replace is_container() hack with subparsers
rv: rename CONFIG_DA_MON_EVENTS to CONFIG_RV_MON_EVENTS
verification/dot2k: Prepare the frontend for LTL inclusion
Documentation/rv: Prepare monitor synthesis document for LTL inclusion
verification/rvgen: Restructure the templates files
verification/rvgen: Restructure the classes to prepare for LTL
inclusion
rv: Add support for LTL monitors
rv: Add rtapp container monitor
x86/tracing: Remove redundant trace_pagefault_key
x86/tracing: Move page fault trace points to generic
arm64: mm: Add page fault trace points
riscv: mm: Add page fault trace points
rv: Add rtapp_pagefault monitor
rv: Add rtapp_sleep monitor
rv: Add documentation for rtapp monitor
rv: Allow to configure the number of per-task monitor
.../trace/rv/da_monitor_synthesis.rst | 147 -----
Documentation/trace/rv/index.rst | 4 +-
.../trace/rv/linear_temporal_logic.rst | 122 ++++
Documentation/trace/rv/monitor_rtapp.rst | 116 ++++
Documentation/trace/rv/monitor_synthesis.rst | 256 ++++++++
arch/arm64/mm/fault.c | 8 +
arch/riscv/mm/fault.c | 8 +
arch/x86/include/asm/trace/common.h | 12 -
arch/x86/include/asm/trace/irq_vectors.h | 1 -
arch/x86/kernel/Makefile | 1 -
arch/x86/kernel/tracepoint.c | 21 -
arch/x86/mm/fault.c | 5 +-
include/linux/panic.h | 3 +
include/linux/printk.h | 5 +
include/linux/rv.h | 74 ++-
include/linux/sched.h | 8 +-
include/rv/da_monitor.h | 45 +-
include/rv/ltl_monitor.h | 184 ++++++
.../trace/events}/exceptions.h | 27 +-
kernel/fork.c | 5 +-
kernel/panic.c | 17 +-
kernel/printk/internal.h | 1 -
kernel/trace/rv/Kconfig | 27 +-
kernel/trace/rv/Makefile | 3 +
kernel/trace/rv/monitors/pagefault/Kconfig | 11 +
.../trace/rv/monitors/pagefault/pagefault.c | 87 +++
.../trace/rv/monitors/pagefault/pagefault.h | 57 ++
.../rv/monitors/pagefault/pagefault_trace.h | 14 +
kernel/trace/rv/monitors/rtapp/Kconfig | 7 +
kernel/trace/rv/monitors/rtapp/rtapp.c | 33 ++
kernel/trace/rv/monitors/rtapp/rtapp.h | 3 +
kernel/trace/rv/monitors/sleep/Kconfig | 13 +
kernel/trace/rv/monitors/sleep/sleep.c | 227 +++++++
kernel/trace/rv/monitors/sleep/sleep.h | 238 ++++++++
kernel/trace/rv/monitors/sleep/sleep_trace.h | 14 +
kernel/trace/rv/reactor_panic.c | 8 +-
kernel/trace/rv/reactor_printk.c | 8 +-
kernel/trace/rv/rv.c | 10 +-
kernel/trace/rv/rv_reactors.c | 2 +-
kernel/trace/rv/rv_trace.h | 52 +-
tools/verification/dot2/Makefile | 26 -
tools/verification/dot2/dot2k | 53 --
tools/verification/models/rtapp/pagefault.ltl | 1 +
tools/verification/models/rtapp/sleep.ltl | 21 +
tools/verification/rvgen/.gitignore | 3 +
tools/verification/rvgen/Makefile | 27 +
tools/verification/rvgen/__main__.py | 67 +++
tools/verification/{dot2 => rvgen}/dot2c | 2 +-
.../{dot2 => rvgen/rvgen}/automata.py | 0
tools/verification/rvgen/rvgen/container.py | 22 +
.../{dot2 => rvgen/rvgen}/dot2c.py | 2 +-
tools/verification/rvgen/rvgen/dot2k.py | 129 ++++
.../dot2k.py => rvgen/rvgen/generator.py} | 249 ++------
tools/verification/rvgen/rvgen/ltl2ba.py | 558 ++++++++++++++++++
tools/verification/rvgen/rvgen/ltl2k.py | 245 ++++++++
.../rvgen/templates}/Kconfig | 0
.../rvgen/rvgen/templates/container/Kconfig | 5 +
.../rvgen/templates/container/main.c} | 0
.../rvgen/templates/container/main.h} | 0
.../rvgen/templates/dot2k}/main.c | 0
.../rvgen/templates/dot2k}/trace.h | 0
.../rvgen/rvgen/templates/ltl2k/main.c | 102 ++++
.../rvgen/rvgen/templates/ltl2k/trace.h | 14 +
63 files changed, 2860 insertions(+), 550 deletions(-)
delete mode 100644 Documentation/trace/rv/da_monitor_synthesis.rst
create mode 100644 Documentation/trace/rv/linear_temporal_logic.rst
create mode 100644 Documentation/trace/rv/monitor_rtapp.rst
create mode 100644 Documentation/trace/rv/monitor_synthesis.rst
delete mode 100644 arch/x86/include/asm/trace/common.h
delete mode 100644 arch/x86/kernel/tracepoint.c
create mode 100644 include/rv/ltl_monitor.h
rename {arch/x86/include/asm/trace => include/trace/events}/exceptions.h (55%)
create mode 100644 kernel/trace/rv/monitors/pagefault/Kconfig
create mode 100644 kernel/trace/rv/monitors/pagefault/pagefault.c
create mode 100644 kernel/trace/rv/monitors/pagefault/pagefault.h
create mode 100644 kernel/trace/rv/monitors/pagefault/pagefault_trace.h
create mode 100644 kernel/trace/rv/monitors/rtapp/Kconfig
create mode 100644 kernel/trace/rv/monitors/rtapp/rtapp.c
create mode 100644 kernel/trace/rv/monitors/rtapp/rtapp.h
create mode 100644 kernel/trace/rv/monitors/sleep/Kconfig
create mode 100644 kernel/trace/rv/monitors/sleep/sleep.c
create mode 100644 kernel/trace/rv/monitors/sleep/sleep.h
create mode 100644 kernel/trace/rv/monitors/sleep/sleep_trace.h
delete mode 100644 tools/verification/dot2/Makefile
delete mode 100644 tools/verification/dot2/dot2k
create mode 100644 tools/verification/models/rtapp/pagefault.ltl
create mode 100644 tools/verification/models/rtapp/sleep.ltl
create mode 100644 tools/verification/rvgen/.gitignore
create mode 100644 tools/verification/rvgen/Makefile
create mode 100644 tools/verification/rvgen/__main__.py
rename tools/verification/{dot2 => rvgen}/dot2c (97%)
rename tools/verification/{dot2 => rvgen/rvgen}/automata.py (100%)
create mode 100644 tools/verification/rvgen/rvgen/container.py
rename tools/verification/{dot2 => rvgen/rvgen}/dot2c.py (99%)
create mode 100644 tools/verification/rvgen/rvgen/dot2k.py
rename tools/verification/{dot2/dot2k.py => rvgen/rvgen/generator.py} (52%)
create mode 100644 tools/verification/rvgen/rvgen/ltl2ba.py
create mode 100644 tools/verification/rvgen/rvgen/ltl2k.py
rename tools/verification/{dot2/dot2k_templates => rvgen/rvgen/templates}/Kconfig (100%)
create mode 100644 tools/verification/rvgen/rvgen/templates/container/Kconfig
rename tools/verification/{dot2/dot2k_templates/main_container.c => rvgen/rvgen/templates/container/main.c} (100%)
rename tools/verification/{dot2/dot2k_templates/main_container.h => rvgen/rvgen/templates/container/main.h} (100%)
rename tools/verification/{dot2/dot2k_templates => rvgen/rvgen/templates/dot2k}/main.c (100%)
rename tools/verification/{dot2/dot2k_templates => rvgen/rvgen/templates/dot2k}/trace.h (100%)
create mode 100644 tools/verification/rvgen/rvgen/templates/ltl2k/main.c
create mode 100644 tools/verification/rvgen/rvgen/templates/ltl2k/trace.h
--
2.39.5
* [PATCH v6 17/22] arm64: mm: Add page fault trace points
2025-04-30 11:02 [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Nam Cao
@ 2025-04-30 11:02 ` Nam Cao
2025-05-07 21:23 ` Steven Rostedt
2025-05-16 14:04 ` Will Deacon
2025-04-30 12:17 ` [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Gabriele Monaco
2025-08-10 21:12 ` patchwork-bot+linux-riscv
2 siblings, 2 replies; 14+ messages in thread
From: Nam Cao @ 2025-04-30 11:02 UTC (permalink / raw)
To: Steven Rostedt, Gabriele Monaco, linux-trace-kernel, linux-kernel
Cc: john.ogness, Nam Cao, Catalin Marinas, Will Deacon,
linux-arm-kernel
Add page fault trace points, which are useful for implementing an RV monitor
that watches page faults.
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
---
arch/arm64/mm/fault.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index ef63651099a9..e3f096b0dffd 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -44,6 +44,9 @@
#include <asm/tlbflush.h>
#include <asm/traps.h>
+#define CREATE_TRACE_POINTS
+#include <trace/events/exceptions.h>
+
struct fault_info {
int (*fn)(unsigned long far, unsigned long esr,
struct pt_regs *regs);
@@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
if (kprobe_page_fault(regs, esr))
return 0;
+ if (user_mode(regs))
+ trace_page_fault_user(addr, regs, esr);
+ else
+ trace_page_fault_kernel(addr, regs, esr);
+
/*
* If we're in an interrupt or have no user context, we must not take
* the fault.
--
2.39.5
* Re: [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application
2025-04-30 11:02 [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Nam Cao
2025-04-30 11:02 ` [PATCH v6 17/22] arm64: mm: Add page fault trace points Nam Cao
@ 2025-04-30 12:17 ` Gabriele Monaco
2025-04-30 19:18 ` Steven Rostedt
2025-08-10 21:12 ` patchwork-bot+linux-riscv
2 siblings, 1 reply; 14+ messages in thread
From: Gabriele Monaco @ 2025-04-30 12:17 UTC (permalink / raw)
To: Steven Rostedt
Cc: Nam Cao, john.ogness, Petr Mladek, Sergey Senozhatsky,
Ingo Molnar, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
H . Peter Anvin, Andy Lutomirski, Peter Zijlstra, Catalin Marinas,
linux-arm-kernel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Alexandre Ghiti, linux-riscv, linux-trace-kernel, linux-kernel
On Wed, 2025-04-30 at 13:02 +0200, Nam Cao wrote:
> Real-time applications may have design flaws causing them to have
> unexpected latency. For example, the applications may raise page
> faults, or
> may be blocked trying to take a mutex without priority inheritance.
>
> However, while attempting to implement DA monitors for these real-
> time
> rules, deterministic automaton is found to be inappropriate as the
> specification language. The automaton is complicated, hard to
> understand,
> and error-prone.
>
> For these cases, linear temporal logic is found to be more suitable.
> The
> LTL is more concise and intuitive.
>
> This series adds support for LTL RV monitor, and use it to implement
> two
> monitors for reporting problems with real-time tasks.
>
Steve,
From my point of view this series is ready for inclusion, what do you
think?
We may still need Acks from the x86 and arm64 maintainers regarding the
tracepoints changes, though.
Thanks,
Gabriele
> Patch 1-12 cleanup and prepare the RV code for the integration of LTL
> monitors.
>
> Patch 13 adds support for LTL monitors.
>
> Patch 14 adds the container monitor "rtapp". This encapsulates the
> sub-monitors for real-time.
>
> Patch 15-18 prepares the pagefault tracepoints, so that patch 19 can
> add
> the monitor which watches real-time tasks doing page faults.
>
> Patch 20 adds the "sleep" monitor: it detects potential undesirable
> latency
> with real-time threads.
>
> Patch 21 adds documentation on the new monitors.
>
> Patch 22 allows the number of per-task monitors to be configurable,
> so that
> the two new monitors can be enabled simultaneously.
>
> v5->v6
> https://lore.kernel.org/lkml/cover.1745926331.git.namcao@linutronix.de
> - sleep monitor: Drop the block_on_rt_mutex tracepoints. The
> contention
> tracepoints are sufficient.
>
> v4->v5
> https://lore.kernel.org/lkml/cover.1745390829.git.namcao@linutronix.de
> - sleep monitor: Fix a false positive due to a race with waking and
> scheduling.
> - sleep monitor: Add block_on_rt_mutex tracepoints and use them for
> BLOCK_ON_RT_MUTEX, instead of trace_sched_pi_setprio
> - sleep monitor: tighten the rule on nanosleep: only
> clock_nanosleep()
> with TIMER_ABSTIME and CLOCK_MONOTONIC is allowed
> - add comments explaining why it is correct to treat PI-boosted
> tasks as
> real-time tasks.
>
> It should be noted that due to the changes in v5, 'perf' does not
> work
> as well as before, because sometimes the errors happen out of the
> real-time tasks' contexts. Fixing this is left for future work.
>
> stress-ng is also far noisier in v5, because the rule on
> nanosleep is
> tightened.
>
> v3->v4
> https://lore.kernel.org/lkml/cover.1744785335.git.namcao@linutronix.de
> - support deadline tasks
> - rtapp_sleep: use sched_pi_setprio tracepoint instead of
> contention
> tracepoints for BLOCK_ON_RT_MUTEX, so that proxy lock is covered.
> - fix the scripts generating an "slightly" incorrect verification
> automaton
> - makes rtapp monitor depends on RV_PER_TASK_MONITORS >= 2
> - make the event tracepoint output a bit more readable
> - some documentation's format fixes
>
> v2->v3
> https://lore.kernel.org/lkml/cover.1744355018.git.namcao@linutronix.de/
> - fix a problem with sleep monitor's specification (around
> KTHREAD_SHOULD_STOP)
> - merge the patches that move the dot2k/rvgen scripts around
> - pull panic/printk changes into separate patches
> - fixup some build errors
> - fixup monitor's init function return code
> - fix some flake8 warnings with the scripts
> - add some references to LTL documentation
> - fixup some mistakes with rtapp documentation
> - fixup capitalization mistake with monitor_synthesis.rst
> - remove the now-redundant macro RV_PER_TASK_MONITORS
>
> v1->v2
> https://lore.kernel.org/lkml/cover.1741708239.git.namcao@linutronix.de/
> - Integrate the LTL scripts into the existing dot2k tool, taking
> advantage of the existing monitor generation scripts.
> - Switch the struct ltl_monitor to use bitmap instead of an array,
> to
> optimize memory usage.
> - Correct the generated code to be non-deterministic state machine,
> instead of deterministic state machine
> - Put common code for all LTL monitors into a single file
> (include/rv/ltl_monitor.h), reducing code duplication
> - Change the LTL monitors to make user of container. Add a bug fix
> to
> container while at it.
> - Make the number of per-task monitor configurable
* Re: [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application
2025-04-30 12:17 ` [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Gabriele Monaco
@ 2025-04-30 19:18 ` Steven Rostedt
0 siblings, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2025-04-30 19:18 UTC (permalink / raw)
To: Gabriele Monaco
Cc: Nam Cao, john.ogness, Petr Mladek, Sergey Senozhatsky,
Ingo Molnar, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
H . Peter Anvin, Andy Lutomirski, Peter Zijlstra, Catalin Marinas,
linux-arm-kernel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Alexandre Ghiti, linux-riscv, linux-trace-kernel, linux-kernel
On Wed, 30 Apr 2025 14:17:30 +0200
Gabriele Monaco <gmonaco@redhat.com> wrote:
> Steve,
>
> From my point of view this series is ready for inclusion, what do you
> think?
I haven't had a chance to look at it yet. I'm finishing up some deferred
unwinding work, and then hopefully I can take a deeper look.
>
> We may still need Acks from the x86 and arm64 maintainers regarding the
> tracepoints changes, though.
Yeah, probably want to start pinging them.
-- Steve
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points
2025-04-30 11:02 ` [PATCH v6 17/22] arm64: mm: Add page fault trace points Nam Cao
@ 2025-05-07 21:23 ` Steven Rostedt
2025-05-16 14:04 ` Will Deacon
1 sibling, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2025-05-07 21:23 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon
Cc: Nam Cao, Gabriele Monaco, linux-trace-kernel, linux-kernel,
john.ogness, linux-arm-kernel
Can I get an Acked-by from the ARM64 maintainers?
Thanks,
-- Steve
On Wed, 30 Apr 2025 13:02:32 +0200
Nam Cao <namcao@linutronix.de> wrote:
> Add page fault trace points, which are useful to implement RV monitor which
> watches page faults.
>
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> ---
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> ---
> arch/arm64/mm/fault.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index ef63651099a9..e3f096b0dffd 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -44,6 +44,9 @@
> #include <asm/tlbflush.h>
> #include <asm/traps.h>
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/exceptions.h>
> +
> struct fault_info {
> int (*fn)(unsigned long far, unsigned long esr,
> struct pt_regs *regs);
> @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> if (kprobe_page_fault(regs, esr))
> return 0;
>
> + if (user_mode(regs))
> + trace_page_fault_user(addr, regs, esr);
> + else
> + trace_page_fault_kernel(addr, regs, esr);
> +
> /*
> * If we're in an interrupt or have no user context, we must not take
> * the fault.
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points
2025-04-30 11:02 ` [PATCH v6 17/22] arm64: mm: Add page fault trace points Nam Cao
2025-05-07 21:23 ` Steven Rostedt
@ 2025-05-16 14:04 ` Will Deacon
2025-05-16 14:42 ` Steven Rostedt
` (2 more replies)
1 sibling, 3 replies; 14+ messages in thread
From: Will Deacon @ 2025-05-16 14:04 UTC (permalink / raw)
To: Nam Cao
Cc: Steven Rostedt, Gabriele Monaco, linux-trace-kernel, linux-kernel,
john.ogness, Catalin Marinas, linux-arm-kernel
On Wed, Apr 30, 2025 at 01:02:32PM +0200, Nam Cao wrote:
> Add page fault trace points, which are useful to implement RV monitor which
> watches page faults.
>
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> ---
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> ---
> arch/arm64/mm/fault.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index ef63651099a9..e3f096b0dffd 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -44,6 +44,9 @@
> #include <asm/tlbflush.h>
> #include <asm/traps.h>
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/exceptions.h>
> +
> struct fault_info {
> int (*fn)(unsigned long far, unsigned long esr,
> struct pt_regs *regs);
> @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> if (kprobe_page_fault(regs, esr))
> return 0;
>
> + if (user_mode(regs))
> + trace_page_fault_user(addr, regs, esr);
> + else
> + trace_page_fault_kernel(addr, regs, esr);
Why is this after kprobe_page_fault()?
It's also a shame that the RV monitor can't hook into perf, as we
already have a sw event for page faults that you could use instead of
adding something new.
Will
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points
2025-05-16 14:04 ` Will Deacon
@ 2025-05-16 14:42 ` Steven Rostedt
2025-05-19 15:12 ` Will Deacon
2025-05-16 15:09 ` Nam Cao
2025-05-19 16:17 ` Mark Rutland
2 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2025-05-16 14:42 UTC (permalink / raw)
To: Will Deacon, Nam Cao
Cc: Gabriele Monaco, linux-trace-kernel, linux-kernel, john.ogness,
Catalin Marinas, linux-arm-kernel
On May 16, 2025 10:04:50 AM EDT, Will Deacon <will@kernel.org> wrote:
>
>> + if (user_mode(regs))
>> + trace_page_fault_user(addr, regs, esr);
>> + else
>> + trace_page_fault_kernel(addr, regs, esr);
>
>Why is this after kprobe_page_fault()?
>
>It's also a shame that the RV monitor can't hook into perf, as we
>already have a sw event for page faults that you could use instead of
>adding something new.
>
Perf events work for perf only. My question is why isn't this a tracepoint that perf could hook into?
Tracepoints are made to be generic, whereas perf events are not.
-- Steve
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points
2025-05-16 14:04 ` Will Deacon
2025-05-16 14:42 ` Steven Rostedt
@ 2025-05-16 15:09 ` Nam Cao
2025-05-19 16:17 ` Mark Rutland
2 siblings, 0 replies; 14+ messages in thread
From: Nam Cao @ 2025-05-16 15:09 UTC (permalink / raw)
To: Will Deacon
Cc: Steven Rostedt, Gabriele Monaco, linux-trace-kernel, linux-kernel,
john.ogness, Catalin Marinas, linux-arm-kernel
On Fri, May 16, 2025 at 03:04:50PM +0100, Will Deacon wrote:
> On Wed, Apr 30, 2025 at 01:02:32PM +0200, Nam Cao wrote:
> > Add page fault trace points, which are useful to implement RV monitor which
> > watches page faults.
> >
> > Signed-off-by: Nam Cao <namcao@linutronix.de>
> > ---
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: linux-arm-kernel@lists.infradead.org
> > ---
> > arch/arm64/mm/fault.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > index ef63651099a9..e3f096b0dffd 100644
> > --- a/arch/arm64/mm/fault.c
> > +++ b/arch/arm64/mm/fault.c
> > @@ -44,6 +44,9 @@
> > #include <asm/tlbflush.h>
> > #include <asm/traps.h>
> >
> > +#define CREATE_TRACE_POINTS
> > +#include <trace/events/exceptions.h>
> > +
> > struct fault_info {
> > int (*fn)(unsigned long far, unsigned long esr,
> > struct pt_regs *regs);
> > @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> > if (kprobe_page_fault(regs, esr))
> > return 0;
> >
> > + if (user_mode(regs))
> > + trace_page_fault_user(addr, regs, esr);
> > + else
> > + trace_page_fault_kernel(addr, regs, esr);
>
> Why is this after kprobe_page_fault()?
This is me being incompetent, sorry about that. It is more logical to put
them at the beginning.
Best regards,
Nam
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points
2025-05-16 14:42 ` Steven Rostedt
@ 2025-05-19 15:12 ` Will Deacon
2025-05-19 16:08 ` Steven Rostedt
0 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2025-05-19 15:12 UTC (permalink / raw)
To: Steven Rostedt
Cc: Nam Cao, Gabriele Monaco, linux-trace-kernel, linux-kernel,
john.ogness, Catalin Marinas, linux-arm-kernel
On Fri, May 16, 2025 at 10:42:48AM -0400, Steven Rostedt wrote:
>
>
> On May 16, 2025 10:04:50 AM EDT, Will Deacon <will@kernel.org> wrote:
> >
> >> + if (user_mode(regs))
> >> + trace_page_fault_user(addr, regs, esr);
> >> + else
> >> + trace_page_fault_kernel(addr, regs, esr);
> >
> >Why is this after kprobe_page_fault()?
> >
> >It's also a shame that the RV monitor can't hook into perf, as we
> >already have a sw event for page faults that you could use instead of
> >adding something new.
> >
>
> Perf events work for perf only. My question is why isn't this a tracepoint
> that perf could hook into?
Well, the perf event came first in this case, so we're stuck with it :/
I was hoping we could settle for a generic helper that could emit both
the trace event and the perf event (so that the ordering of the two is
portable across architectures) but, judging by Nam's reply, the trace
event is needed before kprobes gets a look in.
Will
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points
2025-05-19 15:12 ` Will Deacon
@ 2025-05-19 16:08 ` Steven Rostedt
2025-05-20 14:04 ` Will Deacon
0 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2025-05-19 16:08 UTC (permalink / raw)
To: Will Deacon
Cc: Nam Cao, Gabriele Monaco, linux-trace-kernel, linux-kernel,
john.ogness, Catalin Marinas, linux-arm-kernel
On Mon, 19 May 2025 16:12:39 +0100
Will Deacon <will@kernel.org> wrote:
> > Perf events work for perf only. My question is why isn't this a tracepoint
> > that perf could hook into?
>
> Well, the perf event came first in this case, so we're stuck with it :/
I wonder what effort it will take to convert perf events to tracepoints ;-)
Note, I'm talking about tracepoints and not trace events, where the
latter is exposed to tracefs and the former is not.
>
> I was hoping we could settle for a generic helper that could emit both
> the trace event and the perf event (so that the ordering of the two is
> portable across architectures) but, judging by Nam's reply, the trace
> event is needed before kprobes gets a look in.
Perhaps we could add a helper function that does both (perf and
tracepoint) and hide the implementation from the code that calls it?
But I'm currently still on PTO so I haven't looked at the details yet.
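A minimal userspace sketch of such a helper follows. Everything in it is hypothetical: struct fault_stats, the mock_*() functions and report_page_fault() are stand-ins invented for illustration, not existing kernel functions, and the order chosen inside the helper is only an example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counters standing in for the real consumers (illustrative only). */
struct fault_stats {
	unsigned int perf_events;
	unsigned int trace_user;
	unsigned int trace_kernel;
};

/* Hypothetical stand-in for the perf software event for page faults. */
static void mock_perf_page_fault(struct fault_stats *st)
{
	st->perf_events++;
}

/* Hypothetical stand-in for the user and kernel page fault tracepoints. */
static void mock_trace_page_fault(struct fault_stats *st, bool from_user)
{
	if (from_user)
		st->trace_user++;
	else
		st->trace_kernel++;
}

/*
 * Hypothetical helper: architecture code would call only this, so the
 * relative order of the tracepoint and the perf event is decided in one
 * place instead of in every fault handler.
 */
static void report_page_fault(struct fault_stats *st, bool from_user)
{
	assert(st != NULL);

	mock_trace_page_fault(st, from_user);
	mock_perf_page_fault(st);
}
```

The caller passes only what it knows (here, whether the fault came from user mode), and the details of which consumers are notified stay hidden behind the helper.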
-- Steve
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points
2025-05-16 14:04 ` Will Deacon
2025-05-16 14:42 ` Steven Rostedt
2025-05-16 15:09 ` Nam Cao
@ 2025-05-19 16:17 ` Mark Rutland
2025-05-20 12:32 ` Will Deacon
2 siblings, 1 reply; 14+ messages in thread
From: Mark Rutland @ 2025-05-19 16:17 UTC (permalink / raw)
To: Will Deacon
Cc: Nam Cao, Steven Rostedt, Gabriele Monaco, linux-trace-kernel,
linux-kernel, john.ogness, Catalin Marinas, linux-arm-kernel
On Fri, May 16, 2025 at 03:04:50PM +0100, Will Deacon wrote:
> On Wed, Apr 30, 2025 at 01:02:32PM +0200, Nam Cao wrote:
> > Add page fault trace points, which are useful to implement RV monitor which
> > watches page faults.
> >
> > Signed-off-by: Nam Cao <namcao@linutronix.de>
> > ---
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: linux-arm-kernel@lists.infradead.org
> > ---
> > arch/arm64/mm/fault.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > index ef63651099a9..e3f096b0dffd 100644
> > --- a/arch/arm64/mm/fault.c
> > +++ b/arch/arm64/mm/fault.c
> > @@ -44,6 +44,9 @@
> > #include <asm/tlbflush.h>
> > #include <asm/traps.h>
> >
> > +#define CREATE_TRACE_POINTS
> > +#include <trace/events/exceptions.h>
> > +
> > struct fault_info {
> > int (*fn)(unsigned long far, unsigned long esr,
> > struct pt_regs *regs);
> > @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> > if (kprobe_page_fault(regs, esr))
> > return 0;
> >
> > + if (user_mode(regs))
> > + trace_page_fault_user(addr, regs, esr);
> > + else
> > + trace_page_fault_kernel(addr, regs, esr);
>
> Why is this after kprobe_page_fault()?
The kprobe_page_fault() gunk is doing something quite different, and is
poorly named. That's trying to fixup the PC (and some other state) to
hide kprobe details from the fault handling logic when an out-of-line
copy of an instruction somehow triggers a fault.
Logically, that *should* happen before the tracepoints, and shouldn't be
moved later. For other reasons it needs to be even earlier in the fault
handling flow, and is currently far too late, but that only ends up
mattering in the presence of other kernel bugs. For now I think it
should stay where it is.
More details below, for the curious and/or deranged.
The kprobe_page_fault() gunk is trying to fix up the case where an
instruction has been kprobed, an out-of-line copy of that instruction is
being stepped, and the out-of-line instruction has triggered a fault.
When that happens, kprobe_page_fault() tries to reset the faulting PC
and DAIF such that it looks like the fault was taken from the original
PC of the probed instruction.
The real logic for that happens in kprobe_fault_handler(), which adjusts
the values in pt_regs, but does not handle the live DAIF value. It also
doesn't handle the PMR when pNMI is in use. Due to this, the fault
handler can run with DAIF bits masked unexpectedly, and a subsequent
exception return *could* go wrong.
Luckily all code with an extable entry has been blacklisted for kprobes
since commit:
888b3c8720e0a403 ("arm64: Treat all entry code as non-kprobe-able")
... so we should only get here if there's another kernel bug that causes
an unmarked dereference of a faulting address, in which case we're
likely to BUG() anyway.
The real fix would be to hoist this out to the arm64 entry code (and
handle similar for other EL1 exceptions), and get rid of all the
__kprobes annotations inthe fault code.
Mark.
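
The ordering argument above (kprobe fixup first, tracepoints second) can
be illustrated with a small userspace model. This is only a sketch, not
kernel code: struct regs_model, struct kprobe_model, struct trace_record
and every *_model() function are hypothetical names invented for the
illustration, and the model deliberately ignores DAIF and the PMR. It
assumes the fixup rewrites the saved PC and then lets normal fault
handling continue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct pt_regs. */
struct regs_model {
	unsigned long pc;	/* PC saved when the fault was taken */
	bool user;		/* true if the fault came from user mode */
};

/* Hypothetical state of a kprobe stepping an out-of-line copy. */
struct kprobe_model {
	bool stepping;		/* an out-of-line copy is being stepped */
	unsigned long orig_pc;	/* address of the probed instruction */
};

/* What a tracepoint consumer would observe (illustration only). */
struct trace_record {
	bool fired;
	bool user;
	unsigned long pc;
};

/*
 * Model of the fixup: if the fault hit the out-of-line copy, rewrite the
 * saved PC so the fault appears to come from the probed instruction.
 * Returns 0 so that normal fault handling continues.
 */
int kprobe_fixup_model(struct regs_model *regs, struct kprobe_model *kp)
{
	assert(regs != NULL && kp != NULL);
	if (kp->stepping) {
		regs->pc = kp->orig_pc;
		kp->stepping = false;
	}
	return 0;
}

/* Model of the tracepoint: records the state it is handed. */
void trace_fault_model(const struct regs_model *regs, struct trace_record *rec)
{
	assert(regs != NULL && rec != NULL);
	rec->fired = true;
	rec->user = regs->user;
	rec->pc = regs->pc;
}

/* Fixup first, tracepoint second: the same order as the patch hunk. */
int do_page_fault_model(struct regs_model *regs, struct kprobe_model *kp,
			struct trace_record *rec)
{
	if (kprobe_fixup_model(regs, kp))
		return 0;
	trace_fault_model(regs, rec);
	return 0;
}
```

In this model a tracepoint consumer sees the PC of the probed
instruction rather than the address of the out-of-line copy. Swapping
the two calls in do_page_fault_model() would expose the copy's address
instead.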
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points
2025-05-19 16:17 ` Mark Rutland
@ 2025-05-20 12:32 ` Will Deacon
0 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2025-05-20 12:32 UTC (permalink / raw)
To: Mark Rutland
Cc: Nam Cao, Steven Rostedt, Gabriele Monaco, linux-trace-kernel,
linux-kernel, john.ogness, Catalin Marinas, linux-arm-kernel
On Mon, May 19, 2025 at 05:17:02PM +0100, Mark Rutland wrote:
> On Fri, May 16, 2025 at 03:04:50PM +0100, Will Deacon wrote:
> > On Wed, Apr 30, 2025 at 01:02:32PM +0200, Nam Cao wrote:
> > > Add page fault trace points, which are useful to implement RV monitor which
> > > watches page faults.
> > >
> > > Signed-off-by: Nam Cao <namcao@linutronix.de>
> > > ---
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > Cc: linux-arm-kernel@lists.infradead.org
> > > ---
> > > arch/arm64/mm/fault.c | 8 ++++++++
> > > 1 file changed, 8 insertions(+)
> > >
> > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > > index ef63651099a9..e3f096b0dffd 100644
> > > --- a/arch/arm64/mm/fault.c
> > > +++ b/arch/arm64/mm/fault.c
> > > @@ -44,6 +44,9 @@
> > > #include <asm/tlbflush.h>
> > > #include <asm/traps.h>
> > >
> > > +#define CREATE_TRACE_POINTS
> > > +#include <trace/events/exceptions.h>
> > > +
> > > struct fault_info {
> > > int (*fn)(unsigned long far, unsigned long esr,
> > > struct pt_regs *regs);
> > > @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> > > if (kprobe_page_fault(regs, esr))
> > > return 0;
> > >
> > > + if (user_mode(regs))
> > > + trace_page_fault_user(addr, regs, esr);
> > > + else
> > > + trace_page_fault_kernel(addr, regs, esr);
> >
> > Why is this after kprobe_page_fault()?
>
> The kprobe_page_fault() gunk is doing something quite different, and is
> poorly named. That's trying to fixup the PC (and some other state) to
> hide kprobe details from the fault handling logic when an out-of-line
> copy of an instruction somehow triggers a fault.
>
> Logically, that *should* happen before the tracepoints, and shouldn't be
> moved later. For other reasons it needs to be even earlier in the fault
> handling flow, and is currently far too late, but that only ends up
> mattering in the presence of other kernel bugs. For now I think it
> should stay where it is.

I thought these tracepoints were intended to be used by RV, in which
case I'd have thought we'd want as much coverage as possible to reason
about what the kernel is actually doing.

> More details below, for the curious and/or deranged.
>
> The kprobe_page_fault() gunk is trying to fix up the case where an
> instruction has been kprobed, an out-of-line copy of that instruction is
> being stepped, and the out-of-line instruction has triggered a fault.
> When that happens, kprobe_page_fault() tries to reset the faulting PC
> and DAIF such that it looks like the fault was taken from the original
> PC of the probed instruction.
>
> The real logic for that happens in kprobe_fault_handler(), which adjusts
> the values in pt_regs, but does not handle the live DAIF value. It also
> doesn't handle the PMR when pNMI is in use. Due to this, the fault
> handler can run with DAIF bits masked unexpectedly, and a subsequent
> exception return *could* go wrong.
>
> Luckily all code with an extable entry has been blacklisted for kprobes
> since commit:
>
> 888b3c8720e0a403 ("arm64: Treat all entry code as non-kprobe-able")
>
> ... so we should only get here if there's another kernel bug that causes
> an unmarked dereference of a faulting address, in which case we're
> likely to BUG() anyway.
>
> The real fix would be to hoist this out to the arm64 entry code (and
> handle similar for other EL1 exceptions), and get rid of all the
> __kprobes annotations in the fault code.

This seems to be an argument for removing kprobe_page_fault() entirely,
which is fine, but while it exists it's not obvious to me how it's
supposed to interact with RV.

I suppose the pragmatic thing to do would be to align as closely as
possible with x86, but any documentation/guidance/tests to help us
maintain that would be really helpful. Otherwise, this feels like we're
going to have a repeat of the syscall entry mess where the interaction
with ptrace, audit, seccomp etc was perpetually broken in user-visible
ways.

Will
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points
2025-05-19 16:08 ` Steven Rostedt
@ 2025-05-20 14:04 ` Will Deacon
0 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2025-05-20 14:04 UTC (permalink / raw)
To: Steven Rostedt
Cc: Nam Cao, Gabriele Monaco, linux-trace-kernel, linux-kernel,
john.ogness, Catalin Marinas, linux-arm-kernel
On Mon, May 19, 2025 at 12:08:37PM -0400, Steven Rostedt wrote:
> On Mon, 19 May 2025 16:12:39 +0100
> Will Deacon <will@kernel.org> wrote:
>
> > > Perf events work for perf only. My question is why isn't this a tracepoint
> > > that perf could hook into?
> >
> > Well, the perf event came first in this case, so we're stuck with it :/
>
> I wonder what effort it will take to convert perf events to tracepoints ;-)
>
> Note, I'm talking about tracepoints and not trace events, where the
> latter is exposed to tracefs and the former is not.
>
> >
> > I was hoping we could settle for a generic helper that could emit both
> > the trace event and the perf event (so that the ordering of the two is
> > portable across architectures) but, judging by Nam's reply, the trace
> > event is needed before kprobes gets a look in.
>
> Perhaps we could add a helper function that does both (perf and
> tracepoint) and hide the implementation from the code that calls it?

Something like that sounds like a good idea, yes.

> But I'm currently still on PTO so I haven't looked at the details yet.

Enjoy!

Will
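
The helper idea discussed above (one function that emits both the
tracepoint and the perf event, so the relative order is fixed in a
single place) might look roughly like the userspace sketch below. It is
a model only: enum fault_event, struct event_log and all *_model()
functions are hypothetical names, no real perf or tracepoint API is
called, and emitting the tracepoint first is an assumption made for the
example.

```c
#include <assert.h>
#include <stddef.h>

/* Kinds of events the model can record. */
enum fault_event { EV_NONE = 0, EV_TRACEPOINT, EV_PERF };

#define LOG_CAP 8

/* Ordered record of what was emitted, standing in for real consumers. */
struct event_log {
	enum fault_event ev[LOG_CAP];
	size_t len;
};

/* Append one event; silently drops events once the log is full. */
void log_event_model(struct event_log *elog, enum fault_event ev)
{
	assert(elog != NULL);
	if (elog->len < LOG_CAP)
		elog->ev[elog->len++] = ev;
}

/* Stand-in for firing the page fault tracepoint. */
void emit_tracepoint_model(struct event_log *elog)
{
	log_event_model(elog, EV_TRACEPOINT);
}

/* Stand-in for reporting the page fault perf event. */
void emit_perf_event_model(struct event_log *elog)
{
	log_event_model(elog, EV_PERF);
}

/*
 * Hypothetical generic helper: callers make one call, and the order of
 * the two events is decided here rather than in each architecture.
 */
void report_page_fault_model(struct event_log *elog)
{
	emit_tracepoint_model(elog);
	emit_perf_event_model(elog);
}
```

With a helper of this shape, each architecture's fault handler would
make a single call, and changing the order of the two events would mean
editing one function rather than every call site.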
* Re: [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application
2025-04-30 11:02 [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Nam Cao
2025-04-30 11:02 ` [PATCH v6 17/22] arm64: mm: Add page fault trace points Nam Cao
2025-04-30 12:17 ` [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Gabriele Monaco
@ 2025-08-10 21:12 ` patchwork-bot+linux-riscv
2 siblings, 0 replies; 14+ messages in thread
From: patchwork-bot+linux-riscv @ 2025-08-10 21:12 UTC (permalink / raw)
To: Nam Cao
Cc: linux-riscv, rostedt, gmonaco, linux-trace-kernel, linux-kernel,
john.ogness, pmladek, senozhatsky, mingo, tglx, bp, dave.hansen,
x86, hpa, luto, peterz, catalin.marinas, linux-arm-kernel,
paul.walmsley, palmer, aou, alex
Hello:

This patch was applied to riscv/linux.git (fixes)
by Steven Rostedt (Google) <rostedt@goodmis.org>:

On Wed, 30 Apr 2025 13:02:15 +0200 you wrote:
> Real-time applications may have design flaws causing them to have
> unexpected latency. For example, the applications may raise page faults, or
> may be blocked trying to take a mutex without priority inheritance.
>
> However, while attempting to implement DA monitors for these real-time
> rules, deterministic automaton is found to be inappropriate as the
> specification language. The automaton is complicated, hard to understand,
> and error-prone.
>
> [...]

Here is the summary with links:
  - [v6,18/22] riscv: mm: Add page fault trace points
    https://git.kernel.org/riscv/c/a37c71ca412d

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html