* [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application
@ 2025-04-30 11:02 Nam Cao
  2025-04-30 11:02 ` [PATCH v6 17/22] arm64: mm: Add page fault trace points Nam Cao
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Nam Cao @ 2025-04-30 11:02 UTC (permalink / raw)
  To: Steven Rostedt, Gabriele Monaco, linux-trace-kernel, linux-kernel
  Cc: john.ogness, Nam Cao, Petr Mladek, Sergey Senozhatsky, Ingo Molnar,
	Thomas Gleixner, Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, Catalin Marinas, linux-arm-kernel,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti, linux-riscv

Real-time applications may have design flaws causing them to have
unexpected latency. For example, the applications may raise page faults, or
may be blocked trying to take a mutex without priority inheritance.

However, while attempting to implement DA monitors for these real-time
rules, the deterministic automaton was found to be inappropriate as the
specification language. The automaton is complicated, hard to understand,
and error-prone.

For these cases, linear temporal logic (LTL) was found to be more suitable.
LTL is more concise and intuitive.

This series adds support for LTL RV monitors, and uses it to implement two
monitors for reporting problems with real-time tasks.

Patches 1-12 clean up and prepare the RV code for the integration of LTL
monitors.

Patch 13 adds support for LTL monitors.

Patch 14 adds the container monitor "rtapp". This encapsulates the
sub-monitors for real-time applications.

Patches 15-18 prepare the pagefault tracepoints, so that patch 19 can add
the monitor which watches real-time tasks doing page faults.

Patch 20 adds the "sleep" monitor: it detects potential undesirable latency
with real-time threads.

Patch 21 adds documentation on the new monitors.

Patch 22 allows the number of per-task monitors to be configurable, so that
the two new monitors can be enabled simultaneously.

v5->v6 https://lore.kernel.org/lkml/cover.1745926331.git.namcao@linutronix.de
  - sleep monitor: Drop the block_on_rt_mutex tracepoints. The contention
    tracepoints are sufficient.

v4->v5 https://lore.kernel.org/lkml/cover.1745390829.git.namcao@linutronix.de
  - sleep monitor: Fix a false positive due to a race with waking and
    scheduling.
  - sleep monitor: Add block_on_rt_mutex tracepoints and use them for
    BLOCK_ON_RT_MUTEX, instead of trace_sched_pi_setprio
  - sleep monitor: tighten the rule on nanosleep: only clock_nanosleep()
    with TIMER_ABSTIME and CLOCK_MONOTONIC is allowed
  - add comments explaining why it is correct to treat PI-boosted tasks as
    real-time tasks.

It should be noted that due to the changes in v5, 'perf' does not work as
well as before, because sometimes the errors happen out of the real-time
tasks' contexts. Fixing this is left for future work. stress-ng is also far
noisier in v5, because the rule on nanosleep is tightened.

v3->v4 https://lore.kernel.org/lkml/cover.1744785335.git.namcao@linutronix.de
  - support deadline tasks
  - rtapp_sleep: use sched_pi_setprio tracepoint instead of contention
    tracepoints for BLOCK_ON_RT_MUTEX, so that proxy lock is covered.
  - fix the scripts generating a "slightly" incorrect verification automaton
  - make rtapp monitor depend on RV_PER_TASK_MONITORS >= 2
  - make the event tracepoint output a bit more readable
  - some documentation format fixes

v2->v3 https://lore.kernel.org/lkml/cover.1744355018.git.namcao@linutronix.de/
  - fix a problem with sleep monitor's specification (around
    KTHREAD_SHOULD_STOP)
  - merge the patches that move the dot2k/rvgen scripts around
  - pull panic/printk changes into separate patches
  - fixup some build errors
  - fixup monitor's init function return code
  - fix some flake8 warnings with the scripts
  - add some references to LTL documentation
  - fixup some mistakes with rtapp documentation
  - fixup capitalization mistake with monitor_synthesis.rst
  - remove the now-redundant macro RV_PER_TASK_MONITORS

v1->v2 https://lore.kernel.org/lkml/cover.1741708239.git.namcao@linutronix.de/
  - Integrate the LTL scripts into the existing dot2k tool, taking
    advantage of the existing monitor generation scripts.
  - Switch the struct ltl_monitor to use bitmap instead of an array, to
    optimize memory usage.
  - Correct the generated code to be a non-deterministic state machine,
    instead of a deterministic state machine
  - Put common code for all LTL monitors into a single file
    (include/rv/ltl_monitor.h), reducing code duplication
  - Change the LTL monitors to make use of container. Add a bug fix to
    container while at it.
  - Make the number of per-task monitors configurable

Cc: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: H.
Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: linux-riscv@lists.infradead.org Nam Cao (22): rv: Add #undef TRACE_INCLUDE_FILE printk: Make vprintk_deferred() public panic: Add vpanic() rv: Let the reactors take care of buffers verification/dot2k: Make a separate dot2k_templates/Kconfig_container verification/dot2k: Remove __buff_to_string() verification/dot2k: Replace is_container() hack with subparsers rv: rename CONFIG_DA_MON_EVENTS to CONFIG_RV_MON_EVENTS verification/dot2k: Prepare the frontend for LTL inclusion Documentation/rv: Prepare monitor synthesis document for LTL inclusion verification/rvgen: Restructure the templates files verification/rvgen: Restructure the classes to prepare for LTL inclusion rv: Add support for LTL monitors rv: Add rtapp container monitor x86/tracing: Remove redundant trace_pagefault_key x86/tracing: Move page fault trace points to generic arm64: mm: Add page fault trace points riscv: mm: Add page fault trace points rv: Add rtapp_pagefault monitor rv: Add rtapp_sleep monitor rv: Add documentation for rtapp monitor rv: Allow to configure the number of per-task monitor .../trace/rv/da_monitor_synthesis.rst | 147 ----- Documentation/trace/rv/index.rst | 4 +- .../trace/rv/linear_temporal_logic.rst | 122 ++++ Documentation/trace/rv/monitor_rtapp.rst | 116 ++++ Documentation/trace/rv/monitor_synthesis.rst | 256 ++++++++ arch/arm64/mm/fault.c | 8 + arch/riscv/mm/fault.c | 8 + arch/x86/include/asm/trace/common.h | 12 - arch/x86/include/asm/trace/irq_vectors.h | 1 - arch/x86/kernel/Makefile | 1 - arch/x86/kernel/tracepoint.c | 21 - arch/x86/mm/fault.c | 5 +- include/linux/panic.h | 3 + include/linux/printk.h | 5 + include/linux/rv.h | 
74 ++- include/linux/sched.h | 8 +- include/rv/da_monitor.h | 45 +- include/rv/ltl_monitor.h | 184 ++++++ .../trace/events}/exceptions.h | 27 +- kernel/fork.c | 5 +- kernel/panic.c | 17 +- kernel/printk/internal.h | 1 - kernel/trace/rv/Kconfig | 27 +- kernel/trace/rv/Makefile | 3 + kernel/trace/rv/monitors/pagefault/Kconfig | 11 + .../trace/rv/monitors/pagefault/pagefault.c | 87 +++ .../trace/rv/monitors/pagefault/pagefault.h | 57 ++ .../rv/monitors/pagefault/pagefault_trace.h | 14 + kernel/trace/rv/monitors/rtapp/Kconfig | 7 + kernel/trace/rv/monitors/rtapp/rtapp.c | 33 ++ kernel/trace/rv/monitors/rtapp/rtapp.h | 3 + kernel/trace/rv/monitors/sleep/Kconfig | 13 + kernel/trace/rv/monitors/sleep/sleep.c | 227 +++++++ kernel/trace/rv/monitors/sleep/sleep.h | 238 ++++++++ kernel/trace/rv/monitors/sleep/sleep_trace.h | 14 + kernel/trace/rv/reactor_panic.c | 8 +- kernel/trace/rv/reactor_printk.c | 8 +- kernel/trace/rv/rv.c | 10 +- kernel/trace/rv/rv_reactors.c | 2 +- kernel/trace/rv/rv_trace.h | 52 +- tools/verification/dot2/Makefile | 26 - tools/verification/dot2/dot2k | 53 -- tools/verification/models/rtapp/pagefault.ltl | 1 + tools/verification/models/rtapp/sleep.ltl | 21 + tools/verification/rvgen/.gitignore | 3 + tools/verification/rvgen/Makefile | 27 + tools/verification/rvgen/__main__.py | 67 +++ tools/verification/{dot2 => rvgen}/dot2c | 2 +- .../{dot2 => rvgen/rvgen}/automata.py | 0 tools/verification/rvgen/rvgen/container.py | 22 + .../{dot2 => rvgen/rvgen}/dot2c.py | 2 +- tools/verification/rvgen/rvgen/dot2k.py | 129 ++++ .../dot2k.py => rvgen/rvgen/generator.py} | 249 ++------ tools/verification/rvgen/rvgen/ltl2ba.py | 558 ++++++++++++++++++ tools/verification/rvgen/rvgen/ltl2k.py | 245 ++++++++ .../rvgen/templates}/Kconfig | 0 .../rvgen/rvgen/templates/container/Kconfig | 5 + .../rvgen/templates/container/main.c} | 0 .../rvgen/templates/container/main.h} | 0 .../rvgen/templates/dot2k}/main.c | 0 .../rvgen/templates/dot2k}/trace.h | 0 
.../rvgen/rvgen/templates/ltl2k/main.c | 102 ++++ .../rvgen/rvgen/templates/ltl2k/trace.h | 14 + 63 files changed, 2860 insertions(+), 550 deletions(-) delete mode 100644 Documentation/trace/rv/da_monitor_synthesis.rst create mode 100644 Documentation/trace/rv/linear_temporal_logic.rst create mode 100644 Documentation/trace/rv/monitor_rtapp.rst create mode 100644 Documentation/trace/rv/monitor_synthesis.rst delete mode 100644 arch/x86/include/asm/trace/common.h delete mode 100644 arch/x86/kernel/tracepoint.c create mode 100644 include/rv/ltl_monitor.h rename {arch/x86/include/asm/trace => include/trace/events}/exceptions.h (55%) create mode 100644 kernel/trace/rv/monitors/pagefault/Kconfig create mode 100644 kernel/trace/rv/monitors/pagefault/pagefault.c create mode 100644 kernel/trace/rv/monitors/pagefault/pagefault.h create mode 100644 kernel/trace/rv/monitors/pagefault/pagefault_trace.h create mode 100644 kernel/trace/rv/monitors/rtapp/Kconfig create mode 100644 kernel/trace/rv/monitors/rtapp/rtapp.c create mode 100644 kernel/trace/rv/monitors/rtapp/rtapp.h create mode 100644 kernel/trace/rv/monitors/sleep/Kconfig create mode 100644 kernel/trace/rv/monitors/sleep/sleep.c create mode 100644 kernel/trace/rv/monitors/sleep/sleep.h create mode 100644 kernel/trace/rv/monitors/sleep/sleep_trace.h delete mode 100644 tools/verification/dot2/Makefile delete mode 100644 tools/verification/dot2/dot2k create mode 100644 tools/verification/models/rtapp/pagefault.ltl create mode 100644 tools/verification/models/rtapp/sleep.ltl create mode 100644 tools/verification/rvgen/.gitignore create mode 100644 tools/verification/rvgen/Makefile create mode 100644 tools/verification/rvgen/__main__.py rename tools/verification/{dot2 => rvgen}/dot2c (97%) rename tools/verification/{dot2 => rvgen/rvgen}/automata.py (100%) create mode 100644 tools/verification/rvgen/rvgen/container.py rename tools/verification/{dot2 => rvgen/rvgen}/dot2c.py (99%) create mode 100644 
tools/verification/rvgen/rvgen/dot2k.py rename tools/verification/{dot2/dot2k.py => rvgen/rvgen/generator.py} (52%) create mode 100644 tools/verification/rvgen/rvgen/ltl2ba.py create mode 100644 tools/verification/rvgen/rvgen/ltl2k.py rename tools/verification/{dot2/dot2k_templates => rvgen/rvgen/templates}/Kconfig (100%) create mode 100644 tools/verification/rvgen/rvgen/templates/container/Kconfig rename tools/verification/{dot2/dot2k_templates/main_container.c => rvgen/rvgen/templates/container/main.c} (100%) rename tools/verification/{dot2/dot2k_templates/main_container.h => rvgen/rvgen/templates/container/main.h} (100%) rename tools/verification/{dot2/dot2k_templates => rvgen/rvgen/templates/dot2k}/main.c (100%) rename tools/verification/{dot2/dot2k_templates => rvgen/rvgen/templates/dot2k}/trace.h (100%) create mode 100644 tools/verification/rvgen/rvgen/templates/ltl2k/main.c create mode 100644 tools/verification/rvgen/rvgen/templates/ltl2k/trace.h -- 2.39.5 ^ permalink raw reply [flat|nested] 14+ messages in thread
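To illustrate why the cover letter calls an LTL rule concise, here is a
minimal userspace sketch in C. It assumes a rule of the form
"always (RT imply not PAGEFAULT)"; the actual contents of pagefault.ltl are
not shown in this message, so the rule text, struct step and
first_violation() are all assumptions made for illustration. The sketch
checks the rule step by step over a finite trace and is not the code that
rvgen generates.

```c
#include <stdbool.h>
#include <stddef.h>

/* One observation step: which atomic propositions hold (hypothetical names). */
struct step {
	bool rt;        /* the task has a real-time scheduling policy */
	bool pagefault; /* the task raised a page fault in this step */
};

/*
 * Check the assumed rule "always (RT imply not PAGEFAULT)" over a finite
 * trace. Returns the index of the first violating step, or -1 if every
 * step satisfies the rule.
 */
static int first_violation(const struct step *trace, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		/* "A imply B" is equivalent to "(not A) or B". */
		bool holds = !trace[i].rt || !trace[i].pagefault;

		if (!holds)
			return (int)i;
	}
	return -1;
}
```

A rule like this one is a plain safety property without nested temporal
operators, so a single stateless pass is enough. Rules that relate events
at different points in time would need state carried between steps, which
is where a generated automaton comes in.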
* [PATCH v6 17/22] arm64: mm: Add page fault trace points
  2025-04-30 11:02 [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Nam Cao
@ 2025-04-30 11:02 ` Nam Cao
  2025-05-07 21:23   ` Steven Rostedt
  2025-05-16 14:04   ` Will Deacon
  2025-04-30 12:17 ` [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Gabriele Monaco
  2025-08-10 21:12 ` patchwork-bot+linux-riscv
  2 siblings, 2 replies; 14+ messages in thread
From: Nam Cao @ 2025-04-30 11:02 UTC (permalink / raw)
  To: Steven Rostedt, Gabriele Monaco, linux-trace-kernel, linux-kernel
  Cc: john.ogness, Nam Cao, Catalin Marinas, Will Deacon, linux-arm-kernel

Add page fault trace points, which are useful to implement RV monitor which
watches page faults.

Signed-off-by: Nam Cao <namcao@linutronix.de>
---
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
---
 arch/arm64/mm/fault.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index ef63651099a9..e3f096b0dffd 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -44,6 +44,9 @@
 #include <asm/tlbflush.h>
 #include <asm/traps.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/exceptions.h>
+
 struct fault_info {
 	int (*fn)(unsigned long far, unsigned long esr,
 		  struct pt_regs *regs);
@@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 	if (kprobe_page_fault(regs, esr))
 		return 0;
 
+	if (user_mode(regs))
+		trace_page_fault_user(addr, regs, esr);
+	else
+		trace_page_fault_kernel(addr, regs, esr);
+
 	/*
 	 * If we're in an interrupt or have no user context, we must not take
 	 * the fault.
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points 2025-04-30 11:02 ` [PATCH v6 17/22] arm64: mm: Add page fault trace points Nam Cao @ 2025-05-07 21:23 ` Steven Rostedt 2025-05-16 14:04 ` Will Deacon 1 sibling, 0 replies; 14+ messages in thread From: Steven Rostedt @ 2025-05-07 21:23 UTC (permalink / raw) To: Catalin Marinas, Will Deacon Cc: Nam Cao, Gabriele Monaco, linux-trace-kernel, linux-kernel, john.ogness, linux-arm-kernel Can I get an Acked-by from the ARM64 maintainers? Thanks, -- Steve On Wed, 30 Apr 2025 13:02:32 +0200 Nam Cao <namcao@linutronix.de> wrote: > Add page fault trace points, which are useful to implement RV monitor which > watches page faults. > > Signed-off-by: Nam Cao <namcao@linutronix.de> > --- > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: Will Deacon <will@kernel.org> > Cc: linux-arm-kernel@lists.infradead.org > --- > arch/arm64/mm/fault.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c > index ef63651099a9..e3f096b0dffd 100644 > --- a/arch/arm64/mm/fault.c > +++ b/arch/arm64/mm/fault.c > @@ -44,6 +44,9 @@ > #include <asm/tlbflush.h> > #include <asm/traps.h> > > +#define CREATE_TRACE_POINTS > +#include <trace/events/exceptions.h> > + > struct fault_info { > int (*fn)(unsigned long far, unsigned long esr, > struct pt_regs *regs); > @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr, > if (kprobe_page_fault(regs, esr)) > return 0; > > + if (user_mode(regs)) > + trace_page_fault_user(addr, regs, esr); > + else > + trace_page_fault_kernel(addr, regs, esr); > + > /* > * If we're in an interrupt or have no user context, we must not take > * the fault. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points 2025-04-30 11:02 ` [PATCH v6 17/22] arm64: mm: Add page fault trace points Nam Cao 2025-05-07 21:23 ` Steven Rostedt @ 2025-05-16 14:04 ` Will Deacon 2025-05-16 14:42 ` Steven Rostedt ` (2 more replies) 1 sibling, 3 replies; 14+ messages in thread From: Will Deacon @ 2025-05-16 14:04 UTC (permalink / raw) To: Nam Cao Cc: Steven Rostedt, Gabriele Monaco, linux-trace-kernel, linux-kernel, john.ogness, Catalin Marinas, linux-arm-kernel On Wed, Apr 30, 2025 at 01:02:32PM +0200, Nam Cao wrote: > Add page fault trace points, which are useful to implement RV monitor which > watches page faults. > > Signed-off-by: Nam Cao <namcao@linutronix.de> > --- > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: Will Deacon <will@kernel.org> > Cc: linux-arm-kernel@lists.infradead.org > --- > arch/arm64/mm/fault.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c > index ef63651099a9..e3f096b0dffd 100644 > --- a/arch/arm64/mm/fault.c > +++ b/arch/arm64/mm/fault.c > @@ -44,6 +44,9 @@ > #include <asm/tlbflush.h> > #include <asm/traps.h> > > +#define CREATE_TRACE_POINTS > +#include <trace/events/exceptions.h> > + > struct fault_info { > int (*fn)(unsigned long far, unsigned long esr, > struct pt_regs *regs); > @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr, > if (kprobe_page_fault(regs, esr)) > return 0; > > + if (user_mode(regs)) > + trace_page_fault_user(addr, regs, esr); > + else > + trace_page_fault_kernel(addr, regs, esr); Why is this after kprobe_page_fault()? It's also a shame that the RV monitor can't hook into perf, as we already have a sw event for page faults that you could use instead of adding something new. Will ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points 2025-05-16 14:04 ` Will Deacon @ 2025-05-16 14:42 ` Steven Rostedt 2025-05-19 15:12 ` Will Deacon 2025-05-16 15:09 ` Nam Cao 2025-05-19 16:17 ` Mark Rutland 2 siblings, 1 reply; 14+ messages in thread From: Steven Rostedt @ 2025-05-16 14:42 UTC (permalink / raw) To: Will Deacon, Nam Cao Cc: Gabriele Monaco, linux-trace-kernel, linux-kernel, john.ogness, Catalin Marinas, linux-arm-kernel On May 16, 2025 10:04:50 AM EDT, Will Deacon <will@kernel.org> wrote: > >> + if (user_mode(regs)) >> + trace_page_fault_user(addr, regs, esr); >> + else >> + trace_page_fault_kernel(addr, regs, esr); > >Why is this after kprobe_page_fault()? > >It's also a shame that the RV monitor can't hook into perf, as we >already have a sw event for page faults that you could use instead of >adding something new. > Perf events work for perf only. My question is why isn't this a tracepoint that perf could hook into? Tracepoints are made to be generic, whereas perf events are not. -- Steve ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points 2025-05-16 14:42 ` Steven Rostedt @ 2025-05-19 15:12 ` Will Deacon 2025-05-19 16:08 ` Steven Rostedt 0 siblings, 1 reply; 14+ messages in thread From: Will Deacon @ 2025-05-19 15:12 UTC (permalink / raw) To: Steven Rostedt Cc: Nam Cao, Gabriele Monaco, linux-trace-kernel, linux-kernel, john.ogness, Catalin Marinas, linux-arm-kernel On Fri, May 16, 2025 at 10:42:48AM -0400, Steven Rostedt wrote: > > > On May 16, 2025 10:04:50 AM EDT, Will Deacon <will@kernel.org> wrote: > > > >> + if (user_mode(regs)) > >> + trace_page_fault_user(addr, regs, esr); > >> + else > >> + trace_page_fault_kernel(addr, regs, esr); > > > >Why is this after kprobe_page_fault()? > > > >It's also a shame that the RV monitor can't hook into perf, as we > >already have a sw event for page faults that you could use instead of > >adding something new. > > > > Perf events work for perf only. My question is why isn't this a tracepoint > that perf could hook into? Well, the perf event came first in this case, so we're stuck with it :/ I was hoping we could settle for a generic helper that could emit both the trace event and the perf event (so that the ordering of the two is portable across architectures) but, judging by Nam's reply, the trace event is needed before kprobes gets a look in. Will ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points 2025-05-19 15:12 ` Will Deacon @ 2025-05-19 16:08 ` Steven Rostedt 2025-05-20 14:04 ` Will Deacon 0 siblings, 1 reply; 14+ messages in thread From: Steven Rostedt @ 2025-05-19 16:08 UTC (permalink / raw) To: Will Deacon Cc: Nam Cao, Gabriele Monaco, linux-trace-kernel, linux-kernel, john.ogness, Catalin Marinas, linux-arm-kernel On Mon, 19 May 2025 16:12:39 +0100 Will Deacon <will@kernel.org> wrote: > > Perf events work for perf only. My question is why isn't this a tracepoint > > that perf could hook into? > > Well, the perf event came first in this case, so we're stuck with it :/ I wonder what effort it will take to convert perf events to tracepoints ;-) Note, I'm talking about tracepoints and not trace events, where the latter is exposed to tracefs and the former is not. > > I was hoping we could settle for a generic helper that could emit both > the trace event and the perf event (so that the ordering of the two is > portable across architectures) but, judging by Nam's reply, the trace > event is needed before kprobes gets a look in. Perhaps we could add a helper function that does both (perf and tracepoint) and hide the implementation from the code that calls it? But I'm currently still on PTO so I haven't looked at the details yet. -- Steve ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points 2025-05-19 16:08 ` Steven Rostedt @ 2025-05-20 14:04 ` Will Deacon 0 siblings, 0 replies; 14+ messages in thread From: Will Deacon @ 2025-05-20 14:04 UTC (permalink / raw) To: Steven Rostedt Cc: Nam Cao, Gabriele Monaco, linux-trace-kernel, linux-kernel, john.ogness, Catalin Marinas, linux-arm-kernel On Mon, May 19, 2025 at 12:08:37PM -0400, Steven Rostedt wrote: > On Mon, 19 May 2025 16:12:39 +0100 > Will Deacon <will@kernel.org> wrote: > > > > Perf events work for perf only. My question is why isn't this a tracepoint > > > that perf could hook into? > > > > Well, the perf event came first in this case, so we're stuck with it :/ > > I wonder what effort it will take to convert perf events to tracepoints ;-) > > Note, I'm talking about tracepoints and not trace events, where the > latter is exposed to tracefs and the former is not. > > > > > I was hoping we could settle for a generic helper that could emit both > > the trace event and the perf event (so that the ordering of the two is > > portable across architectures) but, judging by Nam's reply, the trace > > event is needed before kprobes gets a look in. > > Perhaps we could add a helper function that does both (perf and > tracepoint) and hide the implementation from the code that calls it? Something like that sounds like a good idea, yes. > But I'm currently still on PTO so I haven't looked at the details yet. Enjoy! Will ^ permalink raw reply [flat|nested] 14+ messages in thread
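The helper discussed in the exchange above (one function that emits both
the tracepoint and the perf event, so callers do not see the
implementation) could be sketched roughly as follows. This is a userspace
mock, not kernel code: report_page_fault(), the stub_*() functions and the
call log are hypothetical stand-ins, and the order chosen here (tracepoint
first, then perf) is arbitrary, since the thread does not settle it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Which stand-in facility was invoked; all names here are hypothetical. */
enum sink { SINK_TRACE_USER, SINK_TRACE_KERNEL, SINK_PERF };

/* Log of calls, so the emission order can be inspected. */
static enum sink call_log[8];
static size_t call_count;

static void record(enum sink s)
{
	if (call_count < sizeof(call_log) / sizeof(call_log[0]))
		call_log[call_count++] = s;
}

/* Stubs standing in for the user/kernel tracepoints and the perf sw event. */
static void stub_trace_page_fault_user(unsigned long addr)
{
	(void)addr;
	record(SINK_TRACE_USER);
}

static void stub_trace_page_fault_kernel(unsigned long addr)
{
	(void)addr;
	record(SINK_TRACE_KERNEL);
}

static void stub_perf_page_fault(unsigned long addr)
{
	(void)addr;
	record(SINK_PERF);
}

/*
 * Single entry point for a fault handler: emits the tracepoint and then
 * the perf event, so the relative order is the same for every caller.
 */
static void report_page_fault(unsigned long addr, bool user)
{
	if (user)
		stub_trace_page_fault_user(addr);
	else
		stub_trace_page_fault_kernel(addr);
	stub_perf_page_fault(addr);
}
```

With a helper of this shape, the ordering question raised in the thread
becomes a property of one function rather than of each architecture's
fault handler.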
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points 2025-05-16 14:04 ` Will Deacon 2025-05-16 14:42 ` Steven Rostedt @ 2025-05-16 15:09 ` Nam Cao 2025-05-19 16:17 ` Mark Rutland 2 siblings, 0 replies; 14+ messages in thread From: Nam Cao @ 2025-05-16 15:09 UTC (permalink / raw) To: Will Deacon Cc: Steven Rostedt, Gabriele Monaco, linux-trace-kernel, linux-kernel, john.ogness, Catalin Marinas, linux-arm-kernel On Fri, May 16, 2025 at 03:04:50PM +0100, Will Deacon wrote: > On Wed, Apr 30, 2025 at 01:02:32PM +0200, Nam Cao wrote: > > Add page fault trace points, which are useful to implement RV monitor which > > watches page faults. > > > > Signed-off-by: Nam Cao <namcao@linutronix.de> > > --- > > Cc: Catalin Marinas <catalin.marinas@arm.com> > > Cc: Will Deacon <will@kernel.org> > > Cc: linux-arm-kernel@lists.infradead.org > > --- > > arch/arm64/mm/fault.c | 8 ++++++++ > > 1 file changed, 8 insertions(+) > > > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c > > index ef63651099a9..e3f096b0dffd 100644 > > --- a/arch/arm64/mm/fault.c > > +++ b/arch/arm64/mm/fault.c > > @@ -44,6 +44,9 @@ > > #include <asm/tlbflush.h> > > #include <asm/traps.h> > > > > +#define CREATE_TRACE_POINTS > > +#include <trace/events/exceptions.h> > > + > > struct fault_info { > > int (*fn)(unsigned long far, unsigned long esr, > > struct pt_regs *regs); > > @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr, > > if (kprobe_page_fault(regs, esr)) > > return 0; > > > > + if (user_mode(regs)) > > + trace_page_fault_user(addr, regs, esr); > > + else > > + trace_page_fault_kernel(addr, regs, esr); > > Why is this after kprobe_page_fault()? This is me being incompetent, sorry about that. It is more logical to put them at the beginning. Best regards, Nam ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points 2025-05-16 14:04 ` Will Deacon 2025-05-16 14:42 ` Steven Rostedt 2025-05-16 15:09 ` Nam Cao @ 2025-05-19 16:17 ` Mark Rutland 2025-05-20 12:32 ` Will Deacon 2 siblings, 1 reply; 14+ messages in thread From: Mark Rutland @ 2025-05-19 16:17 UTC (permalink / raw) To: Will Deacon Cc: Nam Cao, Steven Rostedt, Gabriele Monaco, linux-trace-kernel, linux-kernel, john.ogness, Catalin Marinas, linux-arm-kernel On Fri, May 16, 2025 at 03:04:50PM +0100, Will Deacon wrote: > On Wed, Apr 30, 2025 at 01:02:32PM +0200, Nam Cao wrote: > > Add page fault trace points, which are useful to implement RV monitor which > > watches page faults. > > > > Signed-off-by: Nam Cao <namcao@linutronix.de> > > --- > > Cc: Catalin Marinas <catalin.marinas@arm.com> > > Cc: Will Deacon <will@kernel.org> > > Cc: linux-arm-kernel@lists.infradead.org > > --- > > arch/arm64/mm/fault.c | 8 ++++++++ > > 1 file changed, 8 insertions(+) > > > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c > > index ef63651099a9..e3f096b0dffd 100644 > > --- a/arch/arm64/mm/fault.c > > +++ b/arch/arm64/mm/fault.c > > @@ -44,6 +44,9 @@ > > #include <asm/tlbflush.h> > > #include <asm/traps.h> > > > > +#define CREATE_TRACE_POINTS > > +#include <trace/events/exceptions.h> > > + > > struct fault_info { > > int (*fn)(unsigned long far, unsigned long esr, > > struct pt_regs *regs); > > @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr, > > if (kprobe_page_fault(regs, esr)) > > return 0; > > > > + if (user_mode(regs)) > > + trace_page_fault_user(addr, regs, esr); > > + else > > + trace_page_fault_kernel(addr, regs, esr); > > Why is this after kprobe_page_fault()? The kprobe_page_fault() gunk is doing something quite different, and is poorly named. 
That's trying to fix up the PC (and some other state) to
hide kprobe details from the fault handling logic when an out-of-line
copy of an instruction somehow triggers a fault.

Logically, that *should* happen before the tracepoints, and shouldn't be
moved later. For other reasons it needs to be even earlier in the fault
handling flow, and is currently far too late, but that only ends up
mattering in the presence of other kernel bugs. For now I think it
should stay where it is.

More details below, for the curious and/or deranged.

The kprobe_page_fault() gunk is trying to fix up the case where an
instruction has been kprobed, an out-of-line copy of that instruction is
being stepped, and the out-of-line instruction has triggered a fault.
When that happens, kprobe_page_fault() tries to reset the faulting PC
and DAIF such that it looks like the fault was taken from the original
PC of the probed instruction.

The real logic for that happens in kprobe_fault_handler(), which adjusts
the values in pt_regs, but does not handle the live DAIF value. It also
doesn't handle the PMR when pNMI is in use. Due to this, the fault
handler can run with DAIF bits masked unexpectedly, and a subsequent
exception return *could* go wrong.

Luckily all code with an extable entry has been blacklisted for kprobes
since commit:

  888b3c8720e0a403 ("arm64: Treat all entry code as non-kprobe-able")

... so we should only get here if there's another kernel bug that causes
an unmarked dereference of a faulting address, in which case we're
likely to BUG() anyway.

The real fix would be to hoist this out to the arm64 entry code (and
handle similar for other EL1 exceptions), and get rid of all the
__kprobes annotations in the fault code.

Mark.

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH v6 17/22] arm64: mm: Add page fault trace points 2025-05-19 16:17 ` Mark Rutland @ 2025-05-20 12:32 ` Will Deacon 0 siblings, 0 replies; 14+ messages in thread From: Will Deacon @ 2025-05-20 12:32 UTC (permalink / raw) To: Mark Rutland Cc: Nam Cao, Steven Rostedt, Gabriele Monaco, linux-trace-kernel, linux-kernel, john.ogness, Catalin Marinas, linux-arm-kernel On Mon, May 19, 2025 at 05:17:02PM +0100, Mark Rutland wrote: > On Fri, May 16, 2025 at 03:04:50PM +0100, Will Deacon wrote: > > On Wed, Apr 30, 2025 at 01:02:32PM +0200, Nam Cao wrote: > > > Add page fault trace points, which are useful to implement RV monitor which > > > watches page faults. > > > > > > Signed-off-by: Nam Cao <namcao@linutronix.de> > > > --- > > > Cc: Catalin Marinas <catalin.marinas@arm.com> > > > Cc: Will Deacon <will@kernel.org> > > > Cc: linux-arm-kernel@lists.infradead.org > > > --- > > > arch/arm64/mm/fault.c | 8 ++++++++ > > > 1 file changed, 8 insertions(+) > > > > > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c > > > index ef63651099a9..e3f096b0dffd 100644 > > > --- a/arch/arm64/mm/fault.c > > > +++ b/arch/arm64/mm/fault.c > > > @@ -44,6 +44,9 @@ > > > #include <asm/tlbflush.h> > > > #include <asm/traps.h> > > > > > > +#define CREATE_TRACE_POINTS > > > +#include <trace/events/exceptions.h> > > > + > > > struct fault_info { > > > int (*fn)(unsigned long far, unsigned long esr, > > > struct pt_regs *regs); > > > @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr, > > > if (kprobe_page_fault(regs, esr)) > > > return 0; > > > > > > + if (user_mode(regs)) > > > + trace_page_fault_user(addr, regs, esr); > > > + else > > > + trace_page_fault_kernel(addr, regs, esr); > > > > Why is this after kprobe_page_fault()? > > The kprobe_page_fault() gunk is doing something quite different, and is > poorly named. 
That's trying to fix up the PC (and some other state) to
> hide kprobe details from the fault handling logic when an out-of-line
> copy of an instruction somehow triggers a fault.
>
> Logically, that *should* happen before the tracepoints, and shouldn't be
> moved later. For other reasons it needs to be even earlier in the fault
> handling flow, and is currently far too late, but that only ends up
> mattering in the presence of other kernel bugs. For now I think it
> should stay where it is.

I thought these tracepoints were intended to be used by RV, in which case
I'd have thought we'd want as much coverage as possible to reason about
what the kernel is actually doing.

> More details below, for the curious and/or deranged.
>
> The kprobe_page_fault() gunk is trying to fix up the case where an
> instruction has been kprobed, an out-of-line copy of that instruction is
> being stepped, and the out-of-line instruction has triggered a fault.
> When that happens, kprobe_page_fault() tries to reset the faulting PC
> and DAIF such that it looks like the fault was taken from the original
> PC of the probed instruction.
>
> The real logic for that happens in kprobe_fault_handler(), which adjusts
> the values in pt_regs, but does not handle the live DAIF value. It also
> doesn't handle the PMR when pNMI is in use. Due to this, the fault
> handler can run with DAIF bits masked unexpectedly, and a subsequent
> exception return *could* go wrong.
>
> Luckily all code with an extable entry has been blacklisted for kprobes
> since commit:
>
>   888b3c8720e0a403 ("arm64: Treat all entry code as non-kprobe-able")
>
> ... so we should only get here if there's another kernel bug that causes
> an unmarked dereference of a faulting address, in which case we're
> likely to BUG() anyway.
>
> The real fix would be to hoist this out to the arm64 entry code (and
> handle similar for other EL1 exceptions), and get rid of all the
> __kprobes annotations in the fault code.

This seems to be an argument for removing kprobe_page_fault() entirely,
which is fine, but while it exists it's not obvious to me how it's
supposed to interact with RV.

I suppose the pragmatic thing to do would be to align as closely as
possible with x86, but any documentation/guidance/tests to help us
maintain that would be really helpful. Otherwise, this feels like we're
going to have a repeat of the syscall entry mess where the interaction
with ptrace, audit, seccomp etc was perpetually broken in user-visible
ways.

Will

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application
  2025-04-30 11:02 [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Nam Cao
  2025-04-30 11:02 ` [PATCH v6 17/22] arm64: mm: Add page fault trace points Nam Cao
@ 2025-04-30 12:17 ` Gabriele Monaco
  2025-04-30 19:18 ` Steven Rostedt
  2025-08-10 21:12 ` patchwork-bot+linux-riscv
  2 siblings, 1 reply; 14+ messages in thread
From: Gabriele Monaco @ 2025-04-30 12:17 UTC
  To: Steven Rostedt
  Cc: Nam Cao, john.ogness, Petr Mladek, Sergey Senozhatsky, Ingo Molnar,
	Thomas Gleixner, Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, Catalin Marinas, linux-arm-kernel,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	linux-riscv, linux-trace-kernel, linux-kernel

On Wed, 2025-04-30 at 13:02 +0200, Nam Cao wrote:
> Real-time applications may have design flaws causing them to have
> unexpected latency. For example, the applications may raise page
> faults, or may be blocked trying to take a mutex without priority
> inheritance.
>
> However, while attempting to implement DA monitors for these
> real-time rules, deterministic automaton is found to be inappropriate
> as the specification language. The automaton is complicated, hard to
> understand, and error-prone.
>
> For these cases, linear temporal logic is found to be more suitable.
> The LTL is more concise and intuitive.
>
> This series adds support for LTL RV monitor, and uses it to implement
> two monitors for reporting problems with real-time tasks.
>

Steve,

From my point of view this series is ready for inclusion, what do you
think?

We may still need Acks from the x86 and arm64 maintainers regarding the
tracepoints changes, though.

Thanks,
Gabriele

> Patch 1-12 clean up and prepare the RV code for the integration of
> LTL monitors.
>
> Patch 13 adds support for LTL monitors.
>
> Patch 14 adds the container monitor "rtapp". This encapsulates the
> sub-monitors for real-time.
>
> Patch 15-18 prepare the pagefault tracepoints, so that patch 19 can
> add the monitor which watches real-time tasks doing page faults.
>
> Patch 20 adds the "sleep" monitor: it detects potential undesirable
> latency with real-time threads.
>
> Patch 21 adds documentation on the new monitors.
>
> Patch 22 allows the number of per-task monitors to be configurable,
> so that the two new monitors can be enabled simultaneously.
>
> v5->v6
> https://lore.kernel.org/lkml/cover.1745926331.git.namcao@linutronix.de
> - sleep monitor: Drop the block_on_rt_mutex tracepoints. The
>   contention tracepoints are sufficient.
>
> v4->v5
> https://lore.kernel.org/lkml/cover.1745390829.git.namcao@linutronix.de
> - sleep monitor: Fix a false positive due to a race with waking and
>   scheduling.
> - sleep monitor: Add block_on_rt_mutex tracepoints and use them for
>   BLOCK_ON_RT_MUTEX, instead of trace_sched_pi_setprio
> - sleep monitor: tighten the rule on nanosleep: only
>   clock_nanosleep() with TIMER_ABSTIME and CLOCK_MONOTONIC is allowed
> - add comments explaining why it is correct to treat PI-boosted
>   tasks as real-time tasks.
>
> It should be noted that due to the changes in v5, 'perf' does not
> work as well as before, because sometimes the errors happen out of
> the real-time tasks' contexts. Fixing this is left for future work.
>
> stress-ng is also far noisier in v5, because the rule on nanosleep is
> tightened.
>
> v3->v4
> https://lore.kernel.org/lkml/cover.1744785335.git.namcao@linutronix.de
> - support deadline tasks
> - rtapp_sleep: use sched_pi_setprio tracepoint instead of contention
>   tracepoints for BLOCK_ON_RT_MUTEX, so that proxy lock is covered.
> - fix the scripts generating a "slightly" incorrect verification
>   automaton
> - make the rtapp monitor depend on RV_PER_TASK_MONITORS >= 2
> - make the event tracepoint output a bit more readable
> - some documentation format fixes
>
> v2->v3
> https://lore.kernel.org/lkml/cover.1744355018.git.namcao@linutronix.de/
> - fix a problem with sleep monitor's specification (around
>   KTHREAD_SHOULD_STOP)
> - merge the patches that move the dot2k/rvgen scripts around
> - pull panic/printk changes into separate patches
> - fixup some build errors
> - fixup monitor's init function return code
> - fix some flake8 warnings with the scripts
> - add some references to LTL documentation
> - fixup some mistakes with rtapp documentation
> - fixup capitalization mistake with monitor_synthesis.rst
> - remove the now-redundant macro RV_PER_TASK_MONITORS
>
> v1->v2
> https://lore.kernel.org/lkml/cover.1741708239.git.namcao@linutronix.de/
> - Integrate the LTL scripts into the existing dot2k tool, taking
>   advantage of the existing monitor generation scripts.
> - Switch the struct ltl_monitor to use bitmap instead of an array, to
>   optimize memory usage.
> - Correct the generated code to be a non-deterministic state machine,
>   instead of a deterministic state machine
> - Put common code for all LTL monitors into a single file
>   (include/rv/ltl_monitor.h), reducing code duplication
> - Change the LTL monitors to make use of container. Add a bug fix to
>   container while at it.
> - Make the number of per-task monitors configurable
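For readers following the v4->v5 changelog quoted above: it says the sleep monitor only allows clock_nanosleep() with TIMER_ABSTIME and CLOCK_MONOTONIC. Below is a minimal user-space sketch of a periodic loop that sleeps only in that form. It is not taken from the series; the helper names (timespec_add_ns, sleep_until, run_periodic) are made up for illustration, and the period is assumed to be below one second.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute deadline by period_ns (assumed 0 <= period_ns < 1s). */
static void timespec_add_ns(struct timespec *ts, long period_ns)
{
	assert(period_ns >= 0 && period_ns < NSEC_PER_SEC);

	ts->tv_nsec += period_ns;
	if (ts->tv_nsec >= NSEC_PER_SEC) {
		ts->tv_nsec -= NSEC_PER_SEC;
		ts->tv_sec++;
	}
}

/*
 * Sleep until an absolute CLOCK_MONOTONIC deadline. clock_nanosleep()
 * returns the error number directly, so retry while it reports EINTR.
 */
static int sleep_until(const struct timespec *deadline)
{
	int ret;

	do {
		ret = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
				      deadline, NULL);
	} while (ret == EINTR);

	return ret;
}

/* Run 'cycles' periods; returns 0 on success or an error number. */
static int run_periodic(long period_ns, int cycles)
{
	struct timespec next;

	if (clock_gettime(CLOCK_MONOTONIC, &next))
		return errno;

	for (int i = 0; i < cycles; i++) {
		timespec_add_ns(&next, period_ns);

		int ret = sleep_until(&next);
		if (ret)
			return ret;

		/* The periodic real-time work would go here. */
	}

	return 0;
}
```

Keeping the deadline absolute means a late wakeup does not shift later periods, which a relative sleep would do. A caller would invoke, for example, run_periodic(1000000L, 1000) for a 1 ms period.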
* Re: [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application
  2025-04-30 12:17 ` [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Gabriele Monaco
@ 2025-04-30 19:18 ` Steven Rostedt
  0 siblings, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2025-04-30 19:18 UTC
  To: Gabriele Monaco
  Cc: Nam Cao, john.ogness, Petr Mladek, Sergey Senozhatsky, Ingo Molnar,
	Thomas Gleixner, Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, Catalin Marinas, linux-arm-kernel,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	linux-riscv, linux-trace-kernel, linux-kernel

On Wed, 30 Apr 2025 14:17:30 +0200
Gabriele Monaco <gmonaco@redhat.com> wrote:

> Steve,
>
> From my point of view this series is ready for inclusion, what do you
> think?

I haven't had a chance to look at it yet. I'm finishing up some deferred
unwinding work, and then hopefully I can take a deeper look.

>
> We may still need Acks from the x86 and arm64 maintainers regarding the
> tracepoints changes, though.

Yeah, probably want to start pinging them.

-- Steve
* Re: [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application
  2025-04-30 11:02 [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Nam Cao
  2025-04-30 11:02 ` [PATCH v6 17/22] arm64: mm: Add page fault trace points Nam Cao
  2025-04-30 12:17 ` [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Gabriele Monaco
@ 2025-08-10 21:12 ` patchwork-bot+linux-riscv
  2 siblings, 0 replies; 14+ messages in thread
From: patchwork-bot+linux-riscv @ 2025-08-10 21:12 UTC
  To: Nam Cao
  Cc: linux-riscv, rostedt, gmonaco, linux-trace-kernel, linux-kernel,
	john.ogness, pmladek, senozhatsky, mingo, tglx, bp, dave.hansen,
	x86, hpa, luto, peterz, catalin.marinas, linux-arm-kernel,
	paul.walmsley, palmer, aou, alex

Hello:

This patch was applied to riscv/linux.git (fixes)
by Steven Rostedt (Google) <rostedt@goodmis.org>:

On Wed, 30 Apr 2025 13:02:15 +0200 you wrote:
> Real-time applications may have design flaws causing them to have
> unexpected latency. For example, the applications may raise page faults, or
> may be blocked trying to take a mutex without priority inheritance.
>
> However, while attempting to implement DA monitors for these real-time
> rules, deterministic automaton is found to be inappropriate as the
> specification language. The automaton is complicated, hard to understand,
> and error-prone.
>
> [...]

Here is the summary with links:
  - [v6,18/22] riscv: mm: Add page fault trace points
    https://git.kernel.org/riscv/c/a37c71ca412d

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
end of thread, other threads:[~2025-08-10 21:35 UTC | newest]

Thread overview: 14+ messages
2025-04-30 11:02 [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Nam Cao
2025-04-30 11:02 ` [PATCH v6 17/22] arm64: mm: Add page fault trace points Nam Cao
2025-05-07 21:23 ` Steven Rostedt
2025-05-16 14:04 ` Will Deacon
2025-05-16 14:42 ` Steven Rostedt
2025-05-19 15:12 ` Will Deacon
2025-05-19 16:08 ` Steven Rostedt
2025-05-20 14:04 ` Will Deacon
2025-05-16 15:09 ` Nam Cao
2025-05-19 16:17 ` Mark Rutland
2025-05-20 12:32 ` Will Deacon
2025-04-30 12:17 ` [PATCH v6 00/22] RV: Linear temporal logic monitors for RT application Gabriele Monaco
2025-04-30 19:18 ` Steven Rostedt
2025-08-10 21:12 ` patchwork-bot+linux-riscv